    vm0-ai/scrapeninja

    About

    High-performance web scraping API with Chrome TLS fingerprint and JS rendering

    SKILL.md

    Troubleshooting

    If requests fail, run zero doctor check-connector --env-name SCRAPENINJA_TOKEN or zero doctor check-connector --url https://scrapeninja.p.rapidapi.com/scrape --method POST

    How to Use

    1. Basic Scrape (Non-JS, Fast)

    High-performance scraping with Chrome TLS fingerprint, no JavaScript:

    Write to /tmp/scrapeninja_request.json:

    {
      "url": "https://example.com"
    }
    

    Then run:

    curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq '{status: .info.statusCode, url: .info.finalUrl, bodyLength: (.body | length)}'
    

    With custom headers and retries:

    Write to /tmp/scrapeninja_request.json:

    {
      "url": "https://example.com",
      "headers": ["Accept-Language: en-US"],
      "retryNum": 3,
      "timeout": 15
    }
    

    Then run:

    curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json
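
The same request can also be sent from a script. The following is a minimal Python sketch of the curl call above, using only the standard library. The helper names build_request and scrape are illustrative and not part of ScrapeNinja, the 60-second client timeout is an arbitrary assumption, and SCRAPENINJA_TOKEN is expected in the environment as in the curl examples.

```python
import json
import os
import urllib.request

API_BASE = "https://scrapeninja.p.rapidapi.com"  # same host as the curl examples


def build_request(endpoint: str, payload: dict, token: str) -> urllib.request.Request:
    """Build a POST request mirroring the curl command (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{API_BASE}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-RapidAPI-Key": token,
        },
        method="POST",
    )


def scrape(payload: dict, endpoint: str = "/scrape") -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    token = os.environ["SCRAPENINJA_TOKEN"]
    request = build_request(endpoint, payload, token)
    # Client-side timeout is an assumption; it should exceed retryNum * timeout.
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


# Same payload as the "custom headers and retries" example.
payload = {
    "url": "https://example.com",
    "headers": ["Accept-Language: en-US"],
    "retryNum": 3,
    "timeout": 15,
}
# Building the request does not touch the network; scrape(payload) would.
request = build_request("/scrape", payload, token="dummy-token")
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before spending an API call.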
    

    2. Scrape with JavaScript Rendering

    For JavaScript-heavy sites (React, Vue, etc.):

    Write to /tmp/scrapeninja_request.json:

    {
      "url": "https://example.com",
      "waitForSelector": "h1",
      "timeout": 20
    }
    

    Then run:

    curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq '{status: .info.statusCode, bodyLength: (.body | length)}'
    

    With screenshot:

    Write to /tmp/scrapeninja_request.json:

    {
      "url": "https://example.com",
      "screenshot": true
    }
    

    Then run:

    # Print the screenshot value from the response (.info.screenshot)
    curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq -r '.info.screenshot'
    

    3. Geo-Based Proxy Selection

    Use proxies from specific regions:

    Write to /tmp/scrapeninja_request.json:

    {
      "url": "https://example.com",
      "geo": "eu"
    }
    

    Then run:

    curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq .info
    

    Available geos: us, eu, br (Brazil), fr (France), de (Germany), 4g-eu
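
A small guard can catch a mistyped geo before a request is sent. This is a sketch only: with_geo is a hypothetical helper, and the allowed set is copied from the list above, so it may lag behind the service if new regions are added.

```python
# Geo codes as listed above; assumed to be the complete set.
ALLOWED_GEOS = {"us", "eu", "br", "fr", "de", "4g-eu"}


def with_geo(payload: dict, geo: str) -> dict:
    """Return a copy of the payload with a validated geo (hypothetical helper)."""
    if geo not in ALLOWED_GEOS:
        raise ValueError(f"unknown geo {geo!r}; expected one of {sorted(ALLOWED_GEOS)}")
    return {**payload, "geo": geo}


payload = with_geo({"url": "https://example.com"}, "eu")
```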

    4. Smart Retries

    Retry on specific HTTP status codes or text patterns:

    Write to /tmp/scrapeninja_request.json:

    {
      "url": "https://example.com",
      "retryNum": 3,
      "statusNotExpected": [403, 429, 503],
      "textNotExpected": ["captcha", "Access Denied"]
    }
    

    Then run:

    curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json
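
To reason about which responses would trigger a retry, the rule can be modelled locally. The sketch below is a client-side illustration of the behaviour described above, not the service's implementation; in particular, treating textNotExpected as a case-sensitive substring match is an assumption.

```python
from typing import Sequence


def should_retry(
    status_code: int,
    body: str,
    status_not_expected: Sequence[int],
    text_not_expected: Sequence[str],
) -> bool:
    """Return True if a response matches any retry rule (illustrative model)."""
    if status_code in status_not_expected:
        return True
    # Assumption: plain, case-sensitive substring matching.
    return any(pattern in body for pattern in text_not_expected)


statuses = [403, 429, 503]
patterns = ["captcha", "Access Denied"]

blocked_by_status = should_retry(429, "", statuses, patterns)
blocked_by_text = should_retry(200, "<p>Please solve the captcha</p>", statuses, patterns)
clean_response = should_retry(200, "<html>ok</html>", statuses, patterns)
```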
    

    5. Extract Data with Cheerio

    Extract structured JSON using Cheerio extractor functions:

    Write to /tmp/scrapeninja_request.json:

    {
      "url": "https://news.ycombinator.com",
      "extractor": "function(input, cheerio) { let $ = cheerio.load(input); return $(\".titleline > a\").slice(0,5).map((i,el) => ({title: $(el).text(), url: $(el).attr(\"href\")})).get(); }"
    }
    

    Then run:

    curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq '.extractor'
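
Escaping the extractor's quotes by hand is error-prone. One option is to keep the function as a readable multi-line string and let a JSON serializer do the escaping when writing the request file. The sketch below assumes the API accepts newlines inside the function source; if it does not, collapse the function to a single line first. It writes to the system temp directory, which is /tmp on most Linux systems.

```python
import json
import os
import tempfile

# Same extractor as above, written without manual escaping.
EXTRACTOR_JS = """function(input, cheerio) {
  let $ = cheerio.load(input);
  return $(".titleline > a").slice(0, 5).map((i, el) => ({
    title: $(el).text(),
    url: $(el).attr("href")
  })).get();
}"""

payload = {"url": "https://news.ycombinator.com", "extractor": EXTRACTOR_JS}

request_path = os.path.join(tempfile.gettempdir(), "scrapeninja_request.json")
with open(request_path, "w", encoding="utf-8") as fh:
    # json.dump escapes the embedded double quotes and newlines.
    json.dump(payload, fh, indent=2)
```

The resulting file can be passed to curl with -d @ exactly as in the command above.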
    

    6. Intercept AJAX Requests

    Capture XHR/fetch responses:

    Write to /tmp/scrapeninja_request.json:

    {
      "url": "https://example.com",
      "catchAjaxHeadersUrlMask": "api/data"
    }
    

    Then run:

    curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json | jq '.info.catchedAjax'
    

    7. Block Resources for Speed

    Speed up JS rendering by blocking images (blockImages) and CSS/fonts (blockMedia):

    Write to /tmp/scrapeninja_request.json:

    {
      "url": "https://example.com",
      "blockImages": true,
      "blockMedia": true
    }
    

    Then run:

    curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: $SCRAPENINJA_TOKEN" -d @/tmp/scrapeninja_request.json
    

    API Endpoints

    | Endpoint | Description |
    |---|---|
    | /scrape | Fast non-JS scraping with Chrome TLS fingerprint |
    | /scrape-js | Full Chrome browser with JS rendering |
    | /v2/scrape-js | Enhanced JS rendering for protected sites (APIRoad only) |

    Request Parameters

    Common Parameters (all endpoints)

    | Parameter | Type | Default | Description |
    |---|---|---|---|
    | url | string | required | URL to scrape |
    | headers | string[] | - | Custom HTTP headers |
    | retryNum | int | 1 | Number of retry attempts |
    | geo | string | us | Proxy geo: us, eu, br, fr, de, 4g-eu |
    | proxy | string | - | Custom proxy URL (overrides geo) |
    | timeout | int | 10/16 | Timeout per attempt in seconds |
    | textNotExpected | string[] | - | Text patterns that trigger retry |
    | statusNotExpected | int[] | [403, 502] | HTTP status codes that trigger retry |
    | extractor | string | - | Cheerio extractor function |

    JS Rendering Parameters (/scrape-js, /v2/scrape-js)

    | Parameter | Type | Default | Description |
    |---|---|---|---|
    | waitForSelector | string | - | CSS selector to wait for |
    | postWaitTime | int | - | Extra wait time after load (1-12s) |
    | screenshot | bool | true | Take page screenshot |
    | blockImages | bool | false | Block image loading |
    | blockMedia | bool | false | Block CSS/fonts loading |
    | catchAjaxHeadersUrlMask | string | - | URL pattern to intercept AJAX |
    | viewport | object | 1920x1080 | Custom viewport size |
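
The JS rendering options can be assembled and range-checked before they are merged into a request. The following is a sketch under the defaults and limits listed in the table above; js_options is a hypothetical helper, and optional fields are simply omitted when unset.

```python
from typing import Optional


def js_options(
    wait_for_selector: Optional[str] = None,
    post_wait_time: Optional[int] = None,
    screenshot: bool = True,
    block_images: bool = False,
    block_media: bool = False,
) -> dict:
    """Build the JS-rendering part of a request body (hypothetical helper)."""
    options = {
        "screenshot": screenshot,
        "blockImages": block_images,
        "blockMedia": block_media,
    }
    if wait_for_selector is not None:
        options["waitForSelector"] = wait_for_selector
    if post_wait_time is not None:
        # The table gives 1-12 seconds as the accepted range.
        if not 1 <= post_wait_time <= 12:
            raise ValueError("postWaitTime must be between 1 and 12 seconds")
        options["postWaitTime"] = post_wait_time
    return options


payload = {
    "url": "https://example.com",
    **js_options(wait_for_selector="h1", post_wait_time=5, screenshot=False),
}
```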

    Response Format

    {
      "info": {
        "statusCode": 200,
        "finalUrl": "https://example.com",
        "headers": ["content-type: text/html"],
        "screenshot": "base64-encoded-png",
        "catchedAjax": {
          "url": "https://example.com/api/data",
          "method": "GET",
          "body": "...",
          "status": 200
        }
      },
      "body": "<html>...</html>",
      "extractor": { "extracted": "data" }
    }
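
A response in this shape can be reduced to the same summary the jq filters in the examples produce. The sketch below parses the sample response shown above rather than calling the API; summarize is a hypothetical helper, and missing fields are treated as absent instead of raising, which is an assumption about how incomplete responses should be handled.

```python
import json

# The sample response from this section, used as offline test data.
SAMPLE_RESPONSE = """
{
  "info": {
    "statusCode": 200,
    "finalUrl": "https://example.com",
    "headers": ["content-type: text/html"],
    "screenshot": "base64-encoded-png",
    "catchedAjax": {
      "url": "https://example.com/api/data",
      "method": "GET",
      "body": "...",
      "status": 200
    }
  },
  "body": "<html>...</html>",
  "extractor": { "extracted": "data" }
}
"""


def summarize(response: dict) -> dict:
    """Mirror the jq summary: status, final URL, and body length."""
    info = response.get("info") or {}
    return {
        "status": info.get("statusCode"),
        "url": info.get("finalUrl"),
        "bodyLength": len(response.get("body") or ""),
    }


response = json.loads(SAMPLE_RESPONSE)
summary = summarize(response)
ajax_url = (response["info"].get("catchedAjax") or {}).get("url")
```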
    

    Guidelines

    1. Start with /scrape: Use the fast non-JS endpoint first, and switch to /scrape-js only if the page needs JavaScript rendering
    2. Retries: Set retryNum to 2-3 for unreliable sites
    3. Geo Selection: Use eu for European sites, us for American sites
    4. Extractors: Test extractors at https://scrapeninja.net/cheerio-sandbox/
    5. Blocked Sites: For Cloudflare/Datadome protected sites, use /v2/scrape-js via APIRoad
    6. Screenshots: Set screenshot: false to speed up JS rendering
    7. Rate Limits: Check your plan limits on the RapidAPI or APIRoad dashboard
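
Guidelines 1, 2, and 6 can be combined into a simple fallback: try /scrape first and move to /scrape-js only when the result looks unusable. The sketch below takes the HTTP call as a parameter so the logic can be exercised without network access. The "body too short" heuristic and its 500-character threshold are assumptions for illustration; a real check would more likely look for a selector or a text marker specific to the target page.

```python
from typing import Callable, Tuple

MIN_BODY_LENGTH = 500  # hypothetical threshold, tune per target site

Fetch = Callable[[str, dict], dict]  # (endpoint, payload) -> decoded response


def looks_empty(response: dict) -> bool:
    """Heuristic: a very short body suggests the page needs JS rendering."""
    return len(response.get("body") or "") < MIN_BODY_LENGTH


def scrape_with_fallback(
    url: str,
    fetch: Fetch,
    needs_js: Callable[[dict], bool] = looks_empty,
) -> Tuple[str, dict]:
    """Try /scrape first; fall back to /scrape-js only if needed."""
    response = fetch("/scrape", {"url": url, "retryNum": 2})
    if not needs_js(response):
        return "/scrape", response
    # Screenshots are disabled to keep the JS render fast (guideline 6).
    js_payload = {"url": url, "retryNum": 2, "screenshot": False}
    return "/scrape-js", fetch("/scrape-js", js_payload)


def fake_fetch(endpoint: str, payload: dict) -> dict:
    """Stand-in for a real API call: only the JS endpoint returns content."""
    if endpoint == "/scrape-js":
        body = "<html>" + "x" * 1000 + "</html>"
    else:
        body = "<html></html>"
    return {"info": {"statusCode": 200}, "body": body}


endpoint, response = scrape_with_fallback("https://example.com", fake_fetch)
```

In real use, fetch would be a function that posts the payload to the given endpoint with the X-RapidAPI-Key header, as in the curl examples.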

    Tools

    • Playground: https://scrapeninja.net/scraper-sandbox
    • Cheerio Sandbox: https://scrapeninja.net/cheerio-sandbox
    • cURL Converter: https://scrapeninja.net/curl-to-scraper
    Repository
    vm0-ai/vm0-skills