Smithery Logo
MCPsSkillsDocsPricing
Login
Smithery Logo

Accelerating the Agent Economy

Resources

DocumentationPrivacy PolicySystem Status

Company

PricingAboutBlog

Connect

© 2026 Smithery. All rights reserved.

    obra

    browsing

    obra/browsing
    Productivity
    149

    About

    SKILL.md

    Install

    Install via Skills CLI

    or add to your agent
    • Claude Code
      Claude Code
    • Codex
      Codex
    • OpenClaw
      OpenClaw
    • Cursor
      Cursor
    • Amp
      Amp
    • GitHub Copilot
      GitHub Copilot
    • Gemini CLI
      Gemini CLI
    • Kilo Code
      Kilo Code
    • Junie
      Junie
    • Replit
      Replit
    • Windsurf
      Windsurf
    • Cline
      Cline
    • Continue
      Continue
    • OpenCode
      OpenCode
    • OpenHands
      OpenHands
    • Roo Code
      Roo Code
    • Augment
      Augment
    • Goose
      Goose
    • Trae
      Trae
    • Zencoder
      Zencoder
    • Antigravity
      Antigravity
    ├─
    ├─
    └─

    About

    Use when you need direct browser control - teaches Chrome DevTools Protocol for controlling existing browser sessions, multi-tab management, form automation, and content extraction via use_browser...

    SKILL.md

    Browsing with Chrome Direct

    Overview

    Control Chrome via DevTools Protocol using the use_browser MCP tool. Single unified interface with auto-starting Chrome.

    Announce: "I'm using the browsing skill to control Chrome."

    When to Use

    Use this when:

    • Controlling authenticated sessions
    • Managing multiple tabs in running browser
    • Playwright MCP unavailable or excessive

    Use Playwright MCP when:

    • Need fresh browser instances
    • Generating screenshots/PDFs
    • Prefer higher-level abstractions

    Auto-Capture

    Every DOM action (navigate, click, type, select, eval, keyboard_press) automatically saves:

    • {prefix}.png — viewport screenshot
    • {prefix}.md — page content as structured markdown
    • {prefix}.html — full rendered DOM
    • {prefix}-console.txt — browser console messages

    Files are saved to the session directory with sequential prefixes (001-navigate, 002-click, etc.). You must check these before using extract or screenshot actions.

    The use_browser Tool

    Single MCP tool with action-based interface. Chrome auto-starts on first use.

    Parameters:

    • action (required): Operation to perform
    • tab_index (optional): Tab to operate on (default: 0)
    • selector (optional): CSS selector for element operations
    • payload (optional): Action-specific data
    • timeout (optional): Timeout in ms for await operations (default: 5000)

    Actions Reference

    Navigation

    • navigate: Navigate to URL

      • payload: URL string
      • Example: {action: "navigate", payload: "https://example.com"}
    • await_element: Wait for element to appear

      • selector: CSS selector
      • timeout: Max wait time in ms
      • Example: {action: "await_element", selector: ".loaded", timeout: 10000}
    • await_text: Wait for text to appear

      • payload: Text to wait for
      • Example: {action: "await_text", payload: "Welcome"}

    Interaction

    • click: Click element

      • selector: CSS selector
      • Example: {action: "click", selector: "button.submit"}
    • type: Type text into input (append \n to submit)

      • selector: CSS selector
      • payload: Text to type
      • Example: {action: "type", selector: "#email", payload: "user@example.com\n"}
    • select: Select dropdown option

      • selector: CSS selector
      • payload: Option value(s)
      • Example: {action: "select", selector: "select[name=state]", payload: "CA"}

    Extraction

    • extract: Get page content

      • payload: Format ('markdown'|'text'|'html')
      • selector: Optional - limit to element
      • Example: {action: "extract", payload: "markdown"}
      • Example: {action: "extract", payload: "text", selector: "h1"}
    • attr: Get element attribute

      • selector: CSS selector
      • payload: Attribute name
      • Example: {action: "attr", selector: "a.download", payload: "href"}
    • eval: Execute JavaScript

      • payload: JavaScript code
      • Example: {action: "eval", payload: "document.title"}

    Export

    • screenshot: Capture screenshot of a specific element
      • payload: Filename
      • selector: Optional - screenshot specific element
      • Viewport screenshots are auto-captured after every DOM action. Use this only when you need a specific element.
      • Example: {action: "screenshot", payload: "/tmp/chart.png", selector: ".chart"}

    Tab Management

    • list_tabs: List all open tabs

      • Example: {action: "list_tabs"}
    • new_tab: Create new tab

      • Example: {action: "new_tab"}
    • close_tab: Close tab

      • tab_index: Tab to close
      • Example: {action: "close_tab", tab_index: 2}

    Browser Mode Control

    • show_browser: Make browser window visible (headed mode)

      • Example: {action: "show_browser"}
      • ⚠️ WARNING: Restarts Chrome, reloads pages via GET, loses POST state
    • hide_browser: Switch to headless mode (invisible browser)

      • Example: {action: "hide_browser"}
      • ⚠️ WARNING: Restarts Chrome, reloads pages via GET, loses POST state
    • browser_mode: Check current browser mode, port, and profile

      • Example: {action: "browser_mode"}
      • Returns: {"headless": true|false, "mode": "headless"|"headed", "running": true|false, "port": 9222, "profile": "name", "profileDir": "/path"}

    Profile Management

    • set_profile: Change Chrome profile (must kill Chrome first)

      • Example: {action: "set_profile", "payload": "browser-user"}
      • ⚠️ WARNING: Chrome must be stopped first
    • get_profile: Get current profile name and directory

      • Example: {action: "get_profile"}
      • Returns: {"profile": "name", "profileDir": "/path"}

    Default behavior: Chrome starts in headless mode with "superpowers-chrome" profile on a dynamically allocated port (range 9222-12111). Override with CHROME_WS_PORT env var or MCP --port=N flag.

    Critical caveats when toggling modes:

    1. Chrome must restart - Cannot switch headless/headed mode on running Chrome
    2. Pages reload via GET - All open tabs are reopened with GET requests
    3. POST state is lost - Form submissions, POST results, and POST-based navigation will be lost
    4. Session state is lost - Any client-side state (JavaScript variables, etc.) is cleared
    5. Cookies/auth may persist - Uses same user data directory, so logged-in sessions may survive

    When to use headed mode:

    • Debugging visual rendering issues
    • Demonstrating browser behavior to user
    • Testing features that only work with visible browser
    • Debugging issues that don't reproduce in headless mode

    When to stay in headless mode (default):

    • All other cases - faster, cleaner, less intrusive
    • Screenshots work perfectly in headless mode
    • Most automation works identically in both modes

    Profile management: Profiles store persistent browser data (cookies, localStorage, extensions, auth sessions).

    Profile locations:

    • macOS: ~/Library/Caches/superpowers/browser-profiles/{name}/
    • Linux: ~/.cache/superpowers/browser-profiles/{name}/
    • Windows: %LOCALAPPDATA%/superpowers/browser-profiles/{name}/

    When to use separate profiles:

    • Default profile ("superpowers-chrome"): General automation, shared sessions
    • Agent-specific profiles: Isolate different agents' browser state
      • Example: browser-user agent uses "browser-user" profile
    • Task-specific profiles: Testing with different user contexts
      • Example: "test-logged-in" vs "test-logged-out"

    Profile data persists across:

    • Chrome restarts
    • Mode toggles (headless ↔ headed)
    • System reboots (data is in cache directory)

    To use a different profile:

    1. Kill Chrome if running: await chromeLib.killChrome()
    2. Set profile: {action: "set_profile", "payload": "my-profile"}
    3. Start Chrome: Next navigate/action will use new profile

    Quick Start Pattern

    Navigate and extract:
    {action: "navigate", payload: "https://example.com"}
    {action: "await_element", selector: "h1"}
    {action: "extract", payload: "text", selector: "h1"}
    

    Common Patterns

    Fill and Submit Form

    {action: "navigate", payload: "https://example.com/login"}
    {action: "await_element", selector: "input[name=email]"}
    {action: "type", selector: "input[name=email]", payload: "user@example.com"}
    {action: "type", selector: "input[name=password]", payload: "pass123\n"}
    {action: "await_text", payload: "Welcome"}
    

    The \n at the end of the password submits the form.

    Multi-Tab Workflow

    {action: "list_tabs"}
    {action: "click", tab_index: 2, selector: "a.email"}
    {action: "await_element", tab_index: 2, selector: ".content"}
    {action: "extract", tab_index: 2, payload: "text", selector: ".amount"}
    

    Dynamic Content

    {action: "navigate", payload: "https://example.com"}
    {action: "type", selector: "input[name=q]", payload: "query"}
    {action: "click", selector: "button.search"}
    {action: "await_element", selector: ".results"}
    {action: "extract", payload: "text", selector: ".result-title"}
    

    Get Link Attribute

    {action: "navigate", payload: "https://example.com"}
    {action: "await_element", selector: "a.download"}
    {action: "attr", selector: "a.download", payload: "href"}
    

    Execute JavaScript

    {action: "eval", payload: "document.querySelectorAll('a').length"}
    {action: "eval", payload: "Array.from(document.querySelectorAll('a')).map(a => a.href)"}
    

    Resize Viewport (Responsive Testing)

    Use eval to resize the browser window for testing responsive layouts:

    {action: "eval", payload: "window.resizeTo(375, 812); 'Resized to mobile'"}
    {action: "eval", payload: "window.resizeTo(768, 1024); 'Resized to tablet'"}
    {action: "eval", payload: "window.resizeTo(1920, 1080); 'Resized to desktop'"}
    

    Note: This resizes the window, not device emulation. It won't change:

    • Device pixel ratio (retina displays)
    • Touch events
    • User-Agent string

    For most responsive testing, window resize is sufficient.

    Clear Cookies

    Use eval to clear cookies accessible to JavaScript:

    {action: "eval", payload: "document.cookie.split(';').forEach(c => { document.cookie = c.trim().split('=')[0] + '=;expires=Thu, 01 Jan 1970 00:00:00 GMT;path=/'; }); 'Cookies cleared'"}
    

    Note: This clears cookies accessible to JavaScript. It won't clear:

    • httpOnly cookies (server-side only)
    • Cookies from other domains

    For most logout/reset scenarios, this is sufficient.

    Scroll Page

    {action: "eval", payload: "window.scrollTo(0, document.body.scrollHeight); 'Scrolled to bottom'"}
    {action: "eval", payload: "window.scrollTo(0, 0); 'Scrolled to top'"}
    {action: "eval", payload: "document.querySelector('.target').scrollIntoView(); 'Scrolled to element'"}
    

    Tips

    Always wait before interaction: Don't click or fill immediately after navigate - pages need time to load.

    // BAD - might fail if page slow
    {action: "navigate", payload: "https://example.com"}
    {action: "click", selector: "button"}  // May fail!
    
    // GOOD - wait first
    {action: "navigate", payload: "https://example.com"}
    {action: "await_element", selector: "button"}
    {action: "click", selector: "button"}
    

    Use specific selectors: Avoid generic selectors that match multiple elements.

    // BAD - matches first button
    {action: "click", selector: "button"}
    
    // GOOD - specific
    {action: "click", selector: "button[type=submit]"}
    {action: "click", selector: "#login-button"}
    

    Submit forms with \n: Append newline to text to submit forms automatically.

    {action: "type", selector: "#search", payload: "query\n"}
    

    Check content first: Extract page content to verify selectors before building workflow.

    {action: "extract", payload: "html"}
    

    Troubleshooting

    Element not found:

    • Use await_element before interaction
    • Verify selector with extract action using 'html' format

    Timeout errors:

    • Increase timeout: {timeout: 30000} for slow pages
    • Wait for specific element instead of text

    Tab index out of range:

    • Use list_tabs to get current indices
    • Tab indices change when tabs close

    eval returns [object Object]:

    • Use JSON.stringify() for complex objects: {action: "eval", payload: "JSON.stringify({name: 'test'})"}
    • For async functions: {action: "eval", payload: "JSON.stringify(await yourAsyncFunction())"}

    Test Automation (Advanced)

    Click to expand test automation guidance

    When building test automation, you have two approaches:

    Approach 1: use_browser MCP (Simple Tests)

    Best for: Single-step tests, direct Claude control during conversation

    {"action": "navigate", "payload": "https://app.com"}
    {"action": "click", "selector": "#test-button"}
    {"action": "eval", "payload": "JSON.stringify({passed: document.querySelector('.success') !== null})"}
    

    Approach 2: chrome-ws CLI (Complex Tests)

    Best for: Multi-step test suites, standalone automation scripts

    Key insight: chrome-ws is the reference implementation showing proper Chrome DevTools Protocol usage. When use_browser doesn't work as expected, examine how chrome-ws handles the same operation.

    # Example: Automated form testing
    ./chrome-ws navigate 0 "https://app.com/form"
    ./chrome-ws fill 0 "#email" "test@example.com"
    ./chrome-ws click 0 "button[type=submit]"
    ./chrome-ws wait-text 0 "Success"
    

    When use_browser Fails

    1. Check chrome-ws source code - It shows the correct CDP pattern
    2. Use chrome-ws to verify - Test the same operation via CLI
    3. Adapt the pattern - Apply the working CDP approach to use_browser

    Common Test Automation Patterns

    • Form validation: Fill forms, check error states
    • UI state testing: Click elements, verify DOM changes
    • Performance testing: Measure load times, capture metrics
    • Screenshot comparison: Capture before/after states

    Advanced Usage

    For command-line usage outside Claude Code, see COMMANDLINE-USAGE.md.

    For detailed examples, see EXAMPLES.md.

    Protocol Reference

    Full CDP documentation: https://chromedevtools.github.io/devtools-protocol/

    Recommended Servers
    Browserbase
    Browserbase
    Repository
    obra/superpowers-chrome
    Files