Smithery Logo
MCPsSkillsDocsPricing
Login
Smithery Logo

Accelerating the Agent Economy

Resources

DocumentationPrivacy PolicySystem Status

Company

PricingAboutBlog

Connect

© 2026 Smithery. All rights reserved.

    erichowens

    playwright-screenshot-inspector

    erichowens/playwright-screenshot-inspector
    AI & ML
    21
    4 installs

    About

    SKILL.md

    Install

    Install via Skills CLI

    or add to your agent
    • Claude Code
      Claude Code
    • Codex
      Codex
    • OpenClaw
      OpenClaw
    • Cursor
      Cursor
    • Amp
      Amp
    • GitHub Copilot
      GitHub Copilot
    • Gemini CLI
      Gemini CLI
    • Kilo Code
      Kilo Code
    • Junie
      Junie
    • Replit
      Replit
    • Windsurf
      Windsurf
    • Cline
      Cline
    • Continue
      Continue
    • OpenCode
      OpenCode
    • OpenHands
      OpenHands
    • Roo Code
      Roo Code
    • Augment
      Augment
    • Goose
      Goose
    • Trae
      Trae
    • Zencoder
      Zencoder
    • Antigravity
      Antigravity
    ├─
    ├─
    └─

    About

    LLM-powered visual testing expert for automated screenshot capture, analysis, and UI verification using Playwright with multimodal AI inspection.

    SKILL.md

    Playwright Screenshot Inspector

    LLM-powered visual testing expert for automated screenshot capture, analysis, and UI verification using Playwright with multimodal AI inspection.

    Activation Triggers

    Activate on:

    • "screenshot test", "visual test", "screenshot inspection"
    • "playwright headless", "playwright screenshot"
    • "UI verification", "visual regression"
    • "theme compliance test", "dark mode test", "light mode test"
    • "automated screenshot", "capture and analyze"
    • "compare screenshots", "visual diff"

    NOT for:

    • Simple one-off screenshots (use browser DevTools)
    • Pixel-perfect comparison without AI (use native Playwright toHaveScreenshot)
    • Non-web UI testing (use platform-specific tools)
    • Performance testing (use Lighthouse/WebPageTest)

    Core Philosophy

    Traditional visual testing compares pixels. LLM-powered visual testing understands semantics.

    Instead of "these 50 pixels changed", LLM inspection answers:

    • "Is the content actually rendered?"
    • "Does the theme switch correctly?"
    • "Are interactive elements visible and properly styled?"
    • "What's broken vs. what's just different?"

    The Screenshot Inspection Loop

    ┌─────────────────────────────────────────────────────────────┐
    │                    LLM SCREENSHOT INSPECTION                │
    ├─────────────────────────────────────────────────────────────┤
    │                                                             │
    │  1. CAPTURE (Playwright)                                    │
    │     └─► Wait for React hydration, not just network          │
    │                                                             │
    │  2. READ (Claude vision)                                    │
    │     └─► Pass screenshot to LLM with specific questions      │
    │                                                             │
    │  3. ANALYZE (Structured response)                           │
    │     └─► Extract: content present? theme correct? errors?    │
    │                                                             │
    │  4. ACT (Conditional logic)                                 │
    │     └─► Pass/fail based on semantic understanding           │
    │                                                             │
    └─────────────────────────────────────────────────────────────┘
    

    Critical: Waiting for React Content

    The #1 failure mode: Taking screenshots before React hydrates.

    Anti-Pattern: Network Idle Alone

    # ❌ WRONG - React may not have rendered yet
    page.goto(url)
    page.wait_for_load_state('networkidle')
    page.screenshot(path='broken.png')  # Often blank!
    

    Correct Pattern: Wait for Actual Content

    # ✅ CORRECT - Wait for React to mount
    page.goto(url, wait_until='domcontentloaded')
    page.wait_for_load_state('networkidle')
    
    # Give React time to hydrate
    import time
    time.sleep(0.5)
    
    # Wait for actual content selector
    page.wait_for_selector('.main-content, h1, [data-testid="app"]',
                           state='visible',
                           timeout=10000)
    
    # Verify content exists
    body_text = page.locator('body').inner_text()
    if len(body_text) < 50:
        time.sleep(2)  # Extra wait for slow hydration
    
    page.screenshot(path='good.png', full_page=True)
    

    Content Verification Function

    def wait_for_react_content(page, selectors, timeout=10000):
        """Wait for React to hydrate by checking for actual content."""
        page.wait_for_load_state('domcontentloaded')
        page.wait_for_load_state('networkidle')
        time.sleep(0.5)  # React hydration buffer
    
        for selector in selectors.split(','):
            try:
                locator = page.locator(selector.strip())
                if locator.count() > 0:
                    locator.first.wait_for(state='visible', timeout=timeout)
                    return True
            except:
                continue
    
        # Fallback: wait for substantial body content
        try:
            page.wait_for_function(
                'document.body.innerText.length > 100',
                timeout=timeout
            )
            return True
        except:
            return False
    

    Headless Mode: Preventing Window Spam

    Always use headless=True to prevent browser windows from spawning:

    from playwright.sync_api import sync_playwright
    
    with sync_playwright() as p:
        # CRITICAL: headless=True prevents visible browser windows
        browser = p.chromium.launch(headless=True)
    
        context = browser.new_context(
            viewport={'width': 1280, 'height': 800},
            color_scheme='dark'  # Initial theme
        )
        page = context.new_page()
    
        # ... your test logic ...
    
        browser.close()  # Always clean up
    

    Theme Testing Pattern

    # Dark mode screenshot
    page.emulate_media(color_scheme='dark')  # Note: on PAGE, not context
    page.goto(url)
    wait_for_react_content(page, '.app-container, main, h1')
    page.screenshot(path='dark.png', full_page=True)
    
    # Light mode screenshot
    page.emulate_media(color_scheme='light')
    page.reload()
    wait_for_react_content(page, '.app-container, main, h1')
    page.screenshot(path='light.png', full_page=True)
    

    LLM Screenshot Analysis Patterns

    Pattern 1: Content Verification

    Prompt: "Analyze this screenshot. Answer:
    1. Is the main content rendered (not blank/loading)?
    2. What major UI elements are visible?
    3. Are there any error states or broken layouts?
    4. Rate content completeness: FULL / PARTIAL / EMPTY"
    

    Pattern 2: Theme Compliance

    Prompt: "This is a {dark/light} mode screenshot. Verify:
    1. Background color matches expected theme (dark bg for dark mode)
    2. Text has sufficient contrast against background
    3. Interactive elements are visible and styled correctly
    4. No theme leakage (dark elements on light bg or vice versa)"
    

    Pattern 3: Comparison Analysis

    Prompt: "Compare these two screenshots (before/after). Identify:
    1. What changed between them?
    2. Are changes intentional (theme switch) or bugs?
    3. Is any content missing in the 'after' version?
    4. Rate similarity: IDENTICAL / MINOR_DIFF / MAJOR_DIFF / BROKEN"
    

    Pattern 4: Accessibility Check

    Prompt: "Evaluate this screenshot for visual accessibility:
    1. Is text readable (sufficient size and contrast)?
    2. Are interactive elements clearly identifiable?
    3. Is there visual hierarchy (headings, sections)?
    4. Any elements that would fail WCAG contrast requirements?"
    

    Complete Test Script Template

    #!/usr/bin/env python3
    """
    LLM-Powered Screenshot Test Suite
    Captures screenshots and uses Claude vision for semantic analysis.
    """
    
    from playwright.sync_api import sync_playwright
    import os
    import time
    
    PAGES_TO_TEST = [
        # (path, name, content_selectors)
        ('/', 'Home', '.hero, main, h1'),
        ('/about', 'About', '.about-content, main, h1'),
        ('/dashboard', 'Dashboard', '.dashboard, .stats, h1'),
    ]
    
    BASE_URL = 'http://localhost:5173'
    SCREENSHOT_DIR = '/tmp/visual-tests'
    
    
    def wait_for_content(page, selectors, timeout=10000):
        """Wait for React/Vue/Svelte to hydrate."""
        page.wait_for_load_state('domcontentloaded')
        page.wait_for_load_state('networkidle')
        time.sleep(0.5)
    
        for selector in selectors.split(','):
            try:
                loc = page.locator(selector.strip())
                if loc.count() > 0:
                    loc.first.wait_for(state='visible', timeout=timeout)
                    return True
            except:
                continue
    
        try:
            page.wait_for_function('document.body.innerText.length > 100', timeout=timeout)
            return True
        except:
            return False
    
    
    def capture_themed_screenshots(page, url, name, selectors):
        """Capture both dark and light mode screenshots."""
        safe_name = name.lower().replace(' ', '-')
        results = {'name': name, 'url': url}
    
        for theme in ['dark', 'light']:
            page.emulate_media(color_scheme=theme)
    
            if theme == 'dark':
                page.goto(url, wait_until='domcontentloaded')
            else:
                page.reload(wait_until='domcontentloaded')
    
            content_loaded = wait_for_content(page, selectors)
    
            if not content_loaded:
                print(f"  ⚠️  {theme} mode: Content slow to load, waiting...")
                time.sleep(2)
    
            screenshot_path = f'{SCREENSHOT_DIR}/{safe_name}-{theme}.png'
            page.screenshot(path=screenshot_path, full_page=True)
    
            # Check content length
            body_text = page.locator('body').inner_text().strip()
            results[f'{theme}_screenshot'] = screenshot_path
            results[f'{theme}_content_length'] = len(body_text)
            results[f'{theme}_has_content'] = len(body_text) > 50
    
            print(f"  {theme}: {'✅' if results[f'{theme}_has_content'] else '❌'} ({len(body_text)} chars)")
    
        return results
    
    
    def run_tests():
        """Run visual tests on all pages."""
        os.makedirs(SCREENSHOT_DIR, exist_ok=True)
    
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            context = browser.new_context(
                viewport={'width': 1280, 'height': 800},
                color_scheme='dark'
            )
            page = context.new_page()
    
            # Capture console errors
            errors = []
            page.on('console', lambda m: errors.append(m.text) if m.type == 'error' else None)
    
            results = []
    
            for path, name, selectors in PAGES_TO_TEST:
                print(f"Testing {name}...")
                url = f'{BASE_URL}{path}'
                result = capture_themed_screenshots(page, url, name, selectors)
                result['errors'] = list(errors)
                errors.clear()
                results.append(result)
    
            browser.close()
    
            # Summary
            print("\n" + "=" * 50)
            print("VISUAL TEST SUMMARY")
            print("=" * 50)
    
            passed = sum(1 for r in results
                         if r.get('dark_has_content') and r.get('light_has_content'))
            print(f"\nPassed: {passed}/{len(results)}")
            print(f"Screenshots: {SCREENSHOT_DIR}")
    
            return results
    
    
    if __name__ == '__main__':
        run_tests()
    

    MCP vs Native Playwright Decision Tree

    What are you doing?
    │
    ├─ Interactive debugging / exploring
    │  └─► Playwright MCP (see live browser)
    │
    ├─ Automated test suite
    │  └─► Native Python Playwright (headless)
    │
    ├─ CI/CD pipeline
    │  └─► Native Python Playwright (headless)
    │
    ├─ Screenshot capture for LLM analysis
    │  └─► Native Python Playwright (headless)
    │
    └─ One-off inspection
       └─► Either works, MCP is convenient
    

    Common Failures and Fixes

    Failure: Blank Screenshots

    Cause: Screenshot taken before React hydrates Fix: Wait for content selectors, add hydration buffer

    Failure: "Reconnecting..." Badge Visible

    Cause: HMR/WebSocket not connected (cosmetic in tests) Fix: This is often fine - focus on actual content

    Failure: Theme Not Applied

    Cause: emulate_media called on context instead of page Fix: Use page.emulate_media(color_scheme='dark')

    Failure: Browser Windows Spawning

    Cause: headless=False or using MCP instead of native Fix: Use p.chromium.launch(headless=True)

    Failure: Timeout on Content

    Cause: Wrong selectors or page actually broken Fix: Verify selectors exist, check console errors


    Integration with Claude Code

    When Claude reads screenshots captured by this pattern:

    1. Request specific analysis: Don't just show screenshot - ask targeted questions
    2. Provide context: "This should be dark mode" or "This is the login page"
    3. Compare systematically: Before/after, dark/light, desktop/mobile
    4. Trust semantic analysis: LLM can tell "blank page" from "content loaded"

    References

    Research Papers

    • Using Vision LLMs For UI Testing - University of Washington
    • Vision-driven Automated Mobile GUI Testing - Multimodal LLM approach
    • ScreenLLM: Stateful Screen Schema - UI understanding framework

    Tools & Integrations

    • Building an AI QA Engineer with Claude + Playwright
    • AI-Powered Visual Testing in Playwright
    • Playwright Visual Regression Testing Guide

    Official Documentation

    • Playwright Visual Comparisons

    Version History

    • 2026-01-23: Initial skill creation
      • Researched multimodal LLM screenshot analysis best practices
      • Documented React hydration waiting patterns
      • Added headless mode requirements
      • Created complete test script template

    Core Insight: The difference between useless and useful screenshot tests is waiting for content, not just network. LLMs can analyze semantics, but only if there's actually content to analyze.

    Recommended Servers
    Browserbase
    Browserbase
    Browser tool
    Browser tool
    Gemini
    Gemini
    Repository
    erichowens/some_claude_skills
    Files