    mastra-ai/e2e-tests-studio
    DevOps
    20,883
    6 installs

    About

    REQUIRED when modifying any file in packages/playground-ui or packages/playground...

    SKILL.md

    E2E Behavior Validation for Frontend Modifications

    Core Principle: Test Product Behavior, Not UI States

    CRITICAL: Tests must verify that product features WORK correctly, not just that UI elements render.

    What NOT to test (UI States):

    • ❌ "Dropdown opens when clicked"
    • ❌ "Modal appears after button click"
    • ❌ "Loading spinner shows during request"
    • ❌ "Form fields are visible"
    • ❌ "Sidebar collapses"

    What TO test (Product Behavior):

    • ✅ "Selecting an LLM provider configures the agent to use that provider"
    • ✅ "Creating a new agent persists it and shows in the agents list"
    • ✅ "Running a tool with parameters returns the expected output"
    • ✅ "Chat messages stream correctly and maintain conversation context"
    • ✅ "Workflow execution triggers tools in the correct order"

    Prerequisites

    Requires Playwright MCP server. If the browser_navigate tool is unavailable, instruct the user to add it:

    claude mcp add playwright -- npx @playwright/mcp@latest
    

    Step 1: Understand the Feature Intent

    Before writing ANY test, answer these questions:

    1. What user problem does this feature solve?
    2. What is the expected outcome when the feature works correctly?
    3. What data flows through the system? (user input → API → state → UI)
    4. What should persist after page reload?
    5. What downstream effects should this action have?

    Document these answers as comments in your test file.
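
One way to keep the Step 1 answers consistent across test files is to capture them in a small typed object and render the comment from it. The sketch below is illustrative only: `FeatureIntent`, `renderIntentComment`, and the sample values are hypothetical names and data, not part of the Mastra codebase.

```typescript
// Hypothetical shape for the five Step 1 answers (not a Mastra API).
interface FeatureIntent {
  feature: string;
  userProblem: string;
  expectedOutcome: string;
  dataFlow: string[]; // ordered hops, e.g. user input, API, state, UI
  persistsAfterReload: string[];
  downstreamEffects: string[];
}

// Render the answers as a JSDoc-style block to paste at the top of a test file.
function renderIntentComment(intent: FeatureIntent): string {
  const list = (items: string[]): string =>
    items.length > 0 ? items.join(", ") : "none";
  return [
    "/**",
    ` * FEATURE: ${intent.feature}`,
    ` * USER PROBLEM: ${intent.userProblem}`,
    ` * EXPECTED OUTCOME: ${intent.expectedOutcome}`,
    ` * DATA FLOW: ${intent.dataFlow.join(" → ")}`,
    ` * PERSISTS AFTER RELOAD: ${list(intent.persistsAfterReload)}`,
    ` * DOWNSTREAM EFFECTS: ${list(intent.downstreamEffects)}`,
    " */",
  ].join("\n");
}

// Illustrative values only.
const providerSelection: FeatureIntent = {
  feature: "LLM provider selection",
  userProblem: "Users need the agent to run on the provider they choose",
  expectedOutcome: "Chat requests are sent with the selected provider",
  dataFlow: ["user input", "API", "state", "UI"],
  persistsAfterReload: ["selected provider"],
  downstreamEffects: [],
};

console.log(renderIntentComment(providerSelection));
```

An empty list is rendered as "none" so that an unanswered question stays visible in the comment instead of silently disappearing.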

    Step 2: Build and Start

    pnpm build:cli
    cd packages/playground/e2e/kitchen-sink && pnpm dev
    

    Verify server at http://localhost:4111
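
The server check can also be scripted rather than done by hand. The following is a minimal sketch, assuming a runtime with a global `fetch` (Node 18+ or a browser); `waitForServer`, `httpProbe`, and the retry defaults are hypothetical and not part of the repository.

```typescript
// A probe reports whether the server at `url` answered at all.
type Probe = (url: string) => Promise<boolean>;

// Default probe: any HTTP response counts as "up"; a network error counts as "down".
const httpProbe: Probe = async (url) => {
  try {
    await fetch(url);
    return true;
  } catch {
    return false;
  }
};

// Poll until the probe succeeds or the attempts run out.
async function waitForServer(
  url: string,
  probe: Probe = httpProbe,
  attempts = 30,
  delayMs = 1000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    if (await probe(url)) return true;
    // Sleep between attempts, but not after the final one.
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Usage (not executed here):
// waitForServer("http://localhost:4111").then((up) => console.log(up ? "ready" : "not reachable"));
```

The probe is injectable so the polling logic can be exercised without a running server.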

    Step 3: Map Feature to Behavior Tests

    Feature-to-Test Mapping Guide

    | Feature Category | What to Test | Example Assertion |
    |---|---|---|
    | Agent Configuration | Config changes affect agent behavior | Send message → verify response uses selected model |
    | LLM Provider Selection | Selected provider is used in requests | Intercept API call → verify provider in request payload |
    | Tool Execution | Tool runs with correct params & returns result | Execute tool → verify output matches expected transformation |
    | Workflow Execution | Steps execute in order, data flows between steps | Run workflow → verify each step's output feeds next step |
    | Chat/Streaming | Messages persist, context maintained across turns | Multi-turn conversation → verify context awareness |
    | MCP Server Tools | Server tools are callable and return data | Call MCP tool → verify response structure and content |
    | Memory/Persistence | Data survives page reload | Create item → reload → verify item exists |
    | Error Handling | Errors surface correctly to user | Trigger error condition → verify error message + recovery |

    Step 4: Write Behavior-Focused Tests

    Test Structure Template

    import { test, expect, Page } from '@playwright/test';
    import { resetStorage } from '../__utils__/reset-storage';
    import { selectFixture } from '../__utils__/select-fixture';
    import { nanoid } from 'nanoid';
    
    /**
     * FEATURE: [Name of feature]
     * USER STORY: As a user, I want to [action] so that [outcome]
     * BEHAVIOR UNDER TEST: [Specific behavior being validated]
     */
    
    test.describe('[Feature Name] - Behavior Tests', () => {
      let page: Page;
    
      test.beforeEach(async ({ browser }) => {
        const context = await browser.newContext();
        page = await context.newPage();
      });
    
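      test.afterEach(async () => {
        await resetStorage(page);
        // Close the context created in beforeEach so contexts do not accumulate across tests
        await page.context().close();
      });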
    
      test('should [verb describing behavior] when [trigger condition]', async () => {
        // ARRANGE: Set up preconditions
        // - Navigate to the feature
        // - Configure any required state
        // ACT: Perform the user action that triggers the behavior
        // ASSERT: Verify the OUTCOME, not the UI state
        // - Check data persistence
        // - Verify downstream effects
        // - Confirm API calls made correctly
      });
    });
    

    Behavior Test Patterns

    Pattern 1: Configuration Affects Behavior

    test('selecting LLM provider should use that provider for agent responses', async () => {
      // ARRANGE
      await page.goto('/agents/my-agent/chat');
    
      // Intercept API to verify provider
      let capturedProvider: string | null = null;
      await page.route('**/api/chat', async route => {
        const body = JSON.parse(route.request().postData() || '{}');
        capturedProvider = body.provider ?? null;
        // Await the continuation so a failure surfaces in the test instead of being dropped
        await route.continue();
      });
    
      // ACT: Select a different provider
      await page.getByTestId('provider-selector').click();
      await page.getByRole('option', { name: 'OpenAI' }).click();
    
      // Send a message to trigger the agent
      await page.getByTestId('chat-input').fill('Hello');
      await page.getByTestId('send-button').click();
    
      // ASSERT: Verify the selected provider was used
      await expect.poll(() => capturedProvider).toBe('openai');
    });
    

    Pattern 2: Data Persistence

    test('created agent should persist after page reload', async () => {
      // ARRANGE
      await page.goto('/agents');
      const agentName = `Test Agent ${nanoid()}`;
    
      // ACT: Create new agent
      await page.getByTestId('create-agent-button').click();
      await page.getByTestId('agent-name-input').fill(agentName);
      await page.getByTestId('save-agent-button').click();
    
      // Wait for creation to complete
      await expect(page.getByText(agentName)).toBeVisible();
    
      // ASSERT: Verify persistence
      await page.reload();
      await expect(page.getByText(agentName)).toBeVisible({ timeout: 10000 });
    });
    

    Pattern 3: Tool Execution Produces Correct Output

    test('weather tool should return formatted weather data', async () => {
      // ARRANGE
      await selectFixture(page, 'weather-success');
      await page.goto('/tools/weather-tool');
    
      // ACT: Execute tool with parameters
      await page.getByTestId('param-city').fill('San Francisco');
      await page.getByTestId('execute-tool-button').click();
    
      // ASSERT: Verify OUTPUT content, not just that output appears
      const output = page.getByTestId('tool-output');
      await expect(output).toContainText('temperature');
      await expect(output).toContainText('San Francisco');
    
      // Verify structured data if applicable
      const outputText = await output.textContent();
      const outputData = JSON.parse(outputText || '{}');
      expect(outputData).toHaveProperty('temperature');
      expect(outputData).toHaveProperty('conditions');
    });
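
In Pattern 3 the `|| '{}'` fallback means an empty output only fails later, as a missing-property assertion. A stricter helper can report the actual cause. This is a sketch under assumptions: `parseToolOutput` is a hypothetical helper, and it assumes the tool output element contains a single JSON object.

```typescript
// Parse tool output text and fail with a specific message for each way it can be wrong.
function parseToolOutput(
  text: string | null,
  requiredKeys: string[],
): Record<string, unknown> {
  if (text === null || text.trim() === "") {
    throw new Error("tool output is empty");
  }
  const data: unknown = JSON.parse(text); // throws SyntaxError on malformed JSON
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("tool output is not a JSON object");
  }
  const record = data as Record<string, unknown>;
  const missing = requiredKeys.filter((key) => !(key in record));
  if (missing.length > 0) {
    throw new Error(`tool output is missing: ${missing.join(", ")}`);
  }
  return record;
}

// Example with an illustrative payload.
const weather = parseToolOutput(
  '{"temperature": 18, "conditions": "fog"}',
  ["temperature", "conditions"],
);
console.log(weather.temperature);
```

In a test, the text passed in would come from `await output.textContent()`, as in the pattern above.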
    

    Pattern 4: Workflow Step Chaining

    test('workflow should pass data between steps correctly', async () => {
      // ARRANGE
      await selectFixture(page, 'workflow-multi-step');
      const sessionId = nanoid();
      await page.goto(`/workflows/data-pipeline?session=${sessionId}`);
    
      // ACT: Trigger workflow execution
      await page.getByTestId('workflow-input').fill('test input data');
      await page.getByTestId('run-workflow-button').click();
    
      // ASSERT: Verify each step received correct input from previous step
      // Wait for completion
      await expect(page.getByTestId('workflow-status')).toHaveText('completed', { timeout: 30000 });
    
      // Check step outputs show data transformation chain
      const step1Output = await page.getByTestId('step-1-output').textContent();
      const step2Output = await page.getByTestId('step-2-output').textContent();

      // Guard against an empty step 1 output, which would make the containment check pass trivially
      expect(step1Output).toBeTruthy();

      // Verify step 2 received step 1's output as input
      expect(step2Output).toContain(step1Output!);
    });
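
For workflows with more than two steps, the same chaining check can be applied to every adjacent pair. The sketch below assumes each step's input and output have already been collected as strings (for example from the UI or an API response); `StepRecord` and `findBrokenLink` are hypothetical names.

```typescript
// Hypothetical record of what one workflow step received and produced.
interface StepRecord {
  id: string;
  input: string;
  output: string;
}

// Return the id of the first step whose input does not contain the previous
// step's output, or null when the whole chain is intact.
function findBrokenLink(steps: StepRecord[]): string | null {
  for (let i = 1; i < steps.length; i++) {
    const previousOutput = steps[i - 1].output;
    // An empty previous output would make includes() pass trivially, so treat it as broken.
    if (previousOutput === "" || !steps[i].input.includes(previousOutput)) {
      return steps[i].id;
    }
  }
  return null;
}

// Illustrative data only.
const intactChain: StepRecord[] = [
  { id: "step-1", input: "test input data", output: "TEST INPUT DATA" },
  { id: "step-2", input: "TEST INPUT DATA", output: "3 words" },
  { id: "step-3", input: "summary: 3 words", output: "done" },
];
console.log(findBrokenLink(intactChain));
```

Returning the failing step id, rather than a boolean, gives the test failure message something specific to report.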
    

    Pattern 5: Streaming Chat with Context

    test('chat should maintain conversation context across messages', async () => {
      // ARRANGE
      await selectFixture(page, 'contextual-chat');
      const chatId = nanoid();
      await page.goto(`/agents/assistant/chat/${chatId}`);
    
      // ACT: Multi-turn conversation
      await page.getByTestId('chat-input').fill('My name is Alice');
      await page.getByTestId('send-button').click();
      await expect(page.getByTestId('assistant-message').last()).toBeVisible({ timeout: 20000 });
    
      await page.getByTestId('chat-input').fill('What is my name?');
      await page.getByTestId('send-button').click();
    
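      // ASSERT: Verify context was maintained
      // Wait for the second reply first; otherwise .last() could still match the first
      // reply, which may itself mention "Alice"
      await expect(page.getByTestId('assistant-message')).toHaveCount(2, { timeout: 20000 });
      const response = page.getByTestId('assistant-message').last();
      await expect(response).toContainText('Alice', { timeout: 20000 });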
    });
    

    Pattern 6: Error Recovery

    test('should show actionable error and allow retry when API fails', async () => {
      // ARRANGE: Set up failure fixture
      await selectFixture(page, 'api-failure');
      await page.goto('/tools/flaky-tool');
    
      // ACT: Trigger the error
      await page.getByTestId('execute-tool-button').click();
    
      // ASSERT: Error is shown with recovery option
      await expect(page.getByTestId('error-message')).toContainText('failed');
      await expect(page.getByTestId('retry-button')).toBeVisible();
    
      // Switch to success fixture and retry
      await selectFixture(page, 'api-success');
      await page.getByTestId('retry-button').click();
    
      // Verify recovery worked
      await expect(page.getByTestId('tool-output')).toBeVisible({ timeout: 10000 });
      await expect(page.getByTestId('error-message')).not.toBeVisible();
    });
    

    Step 5: Update Existing Tests

    When a test file already exists:

    1. Read the existing tests to understand current coverage
    2. Identify if tests are UI-focused or behavior-focused
    3. Refactor UI-focused tests to verify behavior instead:

    Refactoring Example

    BEFORE (UI-focused):

    test('dropdown opens when clicked', async () => {
      await page.getByTestId('model-dropdown').click();
      await expect(page.getByRole('listbox')).toBeVisible();
    });
    

    AFTER (Behavior-focused):

    test('selecting model from dropdown updates agent configuration', async () => {
      // Open dropdown and select model
      await page.getByTestId('model-dropdown').click();
      await page.getByRole('option', { name: 'GPT-4' }).click();
    
      // Verify the selection persists and affects behavior
      await page.reload();
      await expect(page.getByTestId('model-dropdown')).toHaveText('GPT-4');
    
      // Optionally: verify the model is used in actual requests
      // (via request interception or checking response metadata)
    });
    

    Step 6: Kitchen-Sink Fixtures for Behavior Testing

    Fixtures should represent realistic scenarios, not just mock data:

    Fixture Naming Convention

    <feature>-<scenario>.fixture.ts
    
    Examples:
    - agent-with-tools.fixture.ts
    - chat-multi-turn-context.fixture.ts
    - workflow-parallel-execution.fixture.ts
    - tool-validation-error.fixture.ts
    - mcp-server-timeout.fixture.ts
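
The naming convention can be checked mechanically. The sketch below assumes that both `<feature>` and `<scenario>` are lowercase kebab-case, as in the examples above; the regular expression and `isValidFixtureName` are hypothetical, not an existing repository check.

```typescript
// At least two hyphen-separated lowercase segments (feature + scenario),
// followed by the ".fixture.ts" suffix.
const FIXTURE_NAME = /^[a-z0-9]+(?:-[a-z0-9]+)+\.fixture\.ts$/;

function isValidFixtureName(fileName: string): boolean {
  return FIXTURE_NAME.test(fileName);
}

const candidates = [
  "agent-with-tools.fixture.ts",
  "mcp-server-timeout.fixture.ts",
  "agent.fixture.ts", // no scenario segment
  "AgentTools.fixture.ts", // not kebab-case
  "tool-validation-error.ts", // missing .fixture suffix
];

for (const name of candidates) {
  console.log(`${name}: ${isValidFixtureName(name) ? "ok" : "rejected"}`);
}
```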
    

    Fixture Content Requirements

    Each fixture must define:

    1. Scenario description (what behavior it enables testing)
    2. Expected outcomes (what assertions should pass)
    3. Edge cases covered (error states, empty states, etc.)

    // fixtures/agent-provider-switch.fixture.ts
    export const agentProviderSwitch = {
      name: 'agent-provider-switch',
      description: 'Tests that switching LLM providers changes agent behavior',
    
      // Mock responses for different providers
      responses: {
        openai: { content: 'Response from OpenAI', model: 'gpt-4' },
        anthropic: { content: 'Response from Anthropic', model: 'claude-3' },
      },
    
      expectedBehavior: {
        // When provider is switched, subsequent messages use new provider
        providerSwitchAffectsNextMessage: true,
        // Provider selection persists across page reload
        providerPersistsOnReload: true,
      },
    };
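
The three content requirements can be expressed as a type plus a small completeness check. This is a sketch under assumptions: the `BehaviorFixture` interface and the `edgeCases` field name are hypothetical (the example fixture above does not list its edge cases), and `missingFixtureFields` is not an existing helper.

```typescript
// Hypothetical fixture shape covering the three requirements.
interface BehaviorFixture {
  name: string;
  description: string; // 1. scenario description
  expectedBehavior: Record<string, boolean>; // 2. expected outcomes
  edgeCases: string[]; // 3. edge cases covered (field name is an assumption)
}

// Return the names of required fields that are absent or empty.
function missingFixtureFields(fixture: Partial<BehaviorFixture>): string[] {
  const problems: string[] = [];
  if (!fixture.description || fixture.description.trim() === "") {
    problems.push("description");
  }
  if (!fixture.expectedBehavior || Object.keys(fixture.expectedBehavior).length === 0) {
    problems.push("expectedBehavior");
  }
  if (!fixture.edgeCases || fixture.edgeCases.length === 0) {
    problems.push("edgeCases");
  }
  return problems;
}

// A fixture with a description and expected outcomes but no edge cases listed.
const draftFixture: Partial<BehaviorFixture> = {
  name: "agent-provider-switch",
  description: "Tests that switching LLM providers changes agent behavior",
  expectedBehavior: {
    providerSwitchAffectsNextMessage: true,
    providerPersistsOnReload: true,
  },
};

console.log(missingFixtureFields(draftFixture));
```

Run against a fixture shaped like the example above, the check would report the edge-case list as the one missing item.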
    

    Step 7: Run and Validate

    cd packages/playground && pnpm test:e2e
    

    Test Quality Checklist

    Before considering tests complete, verify:

    • Each test has a clear user story comment
    • Tests verify OUTCOMES, not intermediate UI states
    • Tests would FAIL if the feature broke (not just if UI changed)
    • Persistence is verified via page.reload() where applicable
    • Error scenarios are covered
    • Tests use appropriate timeouts for async operations
    • Fixtures represent realistic usage scenarios

    Quick Reference

    | Step | Command/Action |
    |---|---|
    | Build | pnpm build:cli |
    | Start | cd packages/playground/e2e/kitchen-sink && pnpm dev |
    | App URL | http://localhost:4111 |
    | Routes | @packages/playground/src/App.tsx |
    | Run tests | cd packages/playground && pnpm test:e2e |
    | Test dir | packages/playground/e2e/tests/ |
    | Fixtures | packages/playground/e2e/kitchen-sink/fixtures/ |

    Anti-Patterns to Avoid

    | ❌ Don't | ✅ Do Instead |
    |---|---|
    | Test that modal opens | Test that modal action completes and persists |
    | Test that button is clickable | Test that clicking button produces expected result |
    | Test loading spinner appears | Test that loaded data is correct |
    | Test form validation message shows | Test that invalid form cannot submit AND valid form succeeds |
    | Test dropdown has options | Test that selecting option changes system behavior |
    | Test sidebar navigation works | Test that navigated page has correct data/functionality |
    | Assert element is visible | Assert element contains expected data/state |
    Repository
    mastra-ai/mastra