# crawlforge-mcp

CrawlForge MCP is a production-ready MCP server that gives AI agents the power to scrape websites, extract structured data, run deep research, bypass anti-bot detection, and process documents. It pac…

## Quick Start

```bash
# Connect this server (installs CLI if needed)
npx -y @smithery/cli@latest mcp add CrawlForgeDEV/crawlforge-mcp

# Browse available tools
npx -y @smithery/cli@latest tool list CrawlForgeDEV/crawlforge-mcp

# Get full schema for a tool
npx -y @smithery/cli@latest tool get CrawlForgeDEV/crawlforge-mcp fetch_url

# Call a tool (arguments are passed as a JSON string; '{}' is an empty argument object)
npx -y @smithery/cli@latest tool call CrawlForgeDEV/crawlforge-mcp fetch_url '{}'
```
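The `tool call` example above passes an empty argument object. In practice a tool will usually need real arguments. Below is a minimal sketch of building that JSON string in the shell, assuming `fetch_url` accepts a `url` field. That field name is a guess, not something this README confirms, so check it with `tool get` first. `build_fetch_args` is a hypothetical helper.

```bash
# Hypothetical helper: build the JSON argument string for fetch_url.
# ASSUMPTION: the tool takes a "url" field; confirm with `tool get` before use.
# Note: this does no JSON escaping, so it only suits URLs without quotes or backslashes.
build_fetch_args() {
  printf '{"url":"%s"}' "$1"
}

args="$(build_fetch_args 'https://example.com')"
echo "$args"

# Uncomment to make the real call (needs Node.js and network access):
# npx -y @smithery/cli@latest tool call CrawlForgeDEV/crawlforge-mcp fetch_url "$args"
```

Keeping the payload in a variable makes it easy to inspect before it is sent, and the single-quoted format string avoids most shell quoting mistakes.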

## Direct MCP Connection

Endpoint: `https://crawlforge-mcp--crawlforgedev.run.tools`
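
For clients that do not go through the Smithery CLI, the first message of an MCP session is a JSON-RPC 2.0 `initialize` request. The sketch below builds that request and shows one way it might be posted to the endpoint above. It assumes the server speaks the Streamable HTTP transport and needs no extra authentication headers. This README confirms neither, and the protocol version, client name, and both function names are illustrative placeholders.

```bash
# Endpoint taken from this README; everything else here is an assumption to verify.
ENDPOINT="https://crawlforge-mcp--crawlforgedev.run.tools"

# Build a JSON-RPC 2.0 "initialize" request in the shape the MCP specification describes.
# $1 = protocol version, $2 = client name (both placeholders chosen by the caller).
build_initialize() {
  printf '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"%s","capabilities":{},"clientInfo":{"name":"%s","version":"0.1.0"}}}' "$1" "$2"
}

# Hypothetical sender (defined but not called here).
# ASSUMPTION: Streamable HTTP transport, no API key or OAuth token required.
send_initialize() {
  curl -sS -X POST "$ENDPOINT" \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json, text/event-stream' \
    -d "$(build_initialize '2025-03-26' 'example-client')"
}

# Print the payload so it can be inspected before anything is sent.
build_initialize '2025-03-26' 'example-client'
echo
```

If the server rejects the request, the likely causes are a missing credential or a different protocol version; an MCP client library normally handles both.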

## Tools (20)

- `fetch_url` — Fetch content from a URL with optional headers and timeout
- `extract_text` — Extract clean text content from a webpage
- `extract_links` — Extract all links from a webpage with optional filtering
- `extract_metadata` — Extract metadata from a webpage (title, description, keywords, etc.)
- `scrape_structured` — Extract structured data from a webpage using CSS selectors
- `search_web` — Search the web using Google Search API (proxied through CrawlForge)
- `crawl_deep` — Crawl websites deeply using breadth-first search
- `map_site` — Discover and map website structure
- `extract_content` — Extract and analyze main content from web pages with enhanced readability detection
- `process_document` — Process documents from multiple sources and formats including PDFs and web pages
- `summarize_content` — Generate intelligent summaries of text content with configurable options
- `analyze_content` — Perform comprehensive content analysis including language detection and topic extraction
- `extract_structured` — Extract structured data from a webpage using LLM-powered analysis and a JSON Schema. Falls back to CSS selector extract…
- `batch_scrape` — Process multiple URLs simultaneously with support for async job management and webhook notifications
- `scrape_with_actions` — Execute browser action chains before scraping content, with form auto-fill and intermediate state capture
- `deep_research` — Conduct comprehensive multi-stage research with intelligent query expansion, source verification, and conflict detection
- `track_changes` — Enhanced content change tracking with baseline capture, comparison, scheduled monitoring, advanced comparison engine, a…
- `generate_llms_txt` — Analyze websites and generate standard-compliant LLMs.txt and LLMs-full.txt files defining AI model interaction guideli…
- `stealth_mode` — Advanced anti-detection browser management with stealth features, fingerprint randomization, and human behavior simulat…
- `localization` — Multi-language and geo-location management with country-specific settings, browser locale emulation, timezone spoofing,…

```bash
# Get full input/output schema for a tool
npx -y @smithery/cli@latest tool get CrawlForgeDEV/crawlforge-mcp <tool-name>
```
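
To inspect several tools at once, the same command can be generated in a loop. This is a dry-run sketch: it only prints the commands, and the `schemas` directory and `.txt` extension are hypothetical choices, since the output format of `tool get` is not documented here.

```bash
# Tool names come from the list above; out_dir is a hypothetical location.
tools="fetch_url extract_text extract_links"
out_dir="schemas"

# Print the CLI invocation for one tool without running it.
schema_cmd() {
  printf 'npx -y @smithery/cli@latest tool get CrawlForgeDEV/crawlforge-mcp %s' "$1"
}

# Emit one redirect-to-file command per tool.
for tool in $tools; do
  echo "$(schema_cmd "$tool") > $out_dir/$tool.txt"
done
```

Review the printed lines, create the output directory, and pipe them to `sh` to run them for real.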

## Prompts (1)

- `getting-started`
