Firecrawl web scraping API via curl. Use this skill to scrape webpages, crawl websites, discover URLs, search the web, or extract structured data.
If requests fail, run `zero doctor check-connector --env-name FIRECRAWL_TOKEN` or `zero doctor check-connector --url https://api.firecrawl.dev/v1/scrape --method POST`.
All examples below assume you have FIRECRAWL_TOKEN set.
Base URL: https://api.firecrawl.dev/v1
Extract content from a single webpage.
Write to /tmp/firecrawl_request.json:
{
"url": "https://example.com",
"formats": ["markdown"]
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json
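Every response carries a success field, so it is worth checking before reading the data. A minimal sketch of that guard with jq, using a made-up sample response in place of a live call (the sample JSON is illustrative, not a real API reply):

```shell
# Made-up sample standing in for a live /scrape response.
RESPONSE='{"success": true, "data": {"markdown": "# Page Title", "metadata": {"title": "Page Title"}}}'

# Guard on the success field before reading .data.
if [ "$(printf '%s' "$RESPONSE" | jq -r '.success')" = "true" ]; then
  printf '%s' "$RESPONSE" | jq -r '.data.markdown'
else
  # Fall back to the error message if present (field name assumed).
  printf '%s' "$RESPONSE" | jq -r '.error // "unknown error"' >&2
fi
```

The same guard works for crawl, map, search, and extract responses.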
Write to /tmp/firecrawl_request.json:
{
"url": "https://docs.example.com/api",
"formats": ["markdown"],
"onlyMainContent": true,
"timeout": 30000
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data.markdown'
Write to /tmp/firecrawl_request.json:
{
"url": "https://example.com",
"formats": ["html"]
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data.html'
Write to /tmp/firecrawl_request.json:
{
"url": "https://example.com",
"formats": ["screenshot"]
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data.screenshot'
Scrape Parameters:
| Parameter | Type | Description |
|---|---|---|
| url | string | URL to scrape (required) |
| formats | array | markdown, html, rawHtml, screenshot, links |
| onlyMainContent | boolean | Skip headers/footers |
| timeout | number | Timeout in milliseconds |
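The parameters above can be combined in a single request body. A sketch that writes the file and sanity-checks it with jq before sending (example.com is a placeholder URL):

```shell
# Build a request body combining several scrape parameters.
cat > /tmp/firecrawl_request.json <<'EOF'
{
  "url": "https://example.com",
  "formats": ["markdown", "links"],
  "onlyMainContent": true,
  "timeout": 30000
}
EOF

# Validate the body is well-formed JSON with the required fields before sending.
jq -e '.url and (.formats | length > 0)' /tmp/firecrawl_request.json
```

Validating locally catches malformed JSON before it costs an API call.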
Crawl all pages of a website (async operation).
Write to /tmp/firecrawl_request.json:
{
"url": "https://example.com",
"limit": 50,
"maxDepth": 2
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/crawl" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json
Response:
{
"success": true,
"id": "crawl-job-id-here"
}
Replace <job-id> with the actual job ID returned from the crawl request:
curl -s "https://api.firecrawl.dev/v1/crawl/<job-id>" -H "Authorization: Bearer $FIRECRAWL_TOKEN" | jq '{status, completed, total}'
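The status payload reports completed and total counts, so a progress percentage can be computed locally. A small jq sketch, using a made-up status response in place of a live call:

```shell
# Made-up sample standing in for a live /crawl/<job-id> status response.
STATUS_JSON='{"success": true, "status": "scraping", "completed": 25, "total": 50}'

# Integer progress percentage; guard against total = 0 while the job warms up.
PCT="$(printf '%s' "$STATUS_JSON" | jq -r 'if .total > 0 then (.completed * 100 / .total | floor) else 0 end')"
echo "Progress: ${PCT}%"
```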
Replace <job-id> with the actual job ID to retrieve the crawled pages:
curl -s "https://api.firecrawl.dev/v1/crawl/<job-id>" -H "Authorization: Bearer $FIRECRAWL_TOKEN" | jq '.data[] | {url: .metadata.url, title: .metadata.title}'
Write to /tmp/firecrawl_request.json:
{
"url": "https://blog.example.com",
"limit": 20,
"maxDepth": 3,
"includePaths": ["/posts/*"],
"excludePaths": ["/admin/*", "/login"]
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/crawl" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json
Crawl Parameters:
| Parameter | Type | Description |
|---|---|---|
| url | string | Starting URL (required) |
| limit | number | Max pages to crawl (default: 100) |
| maxDepth | number | Max crawl depth (default: 3) |
| includePaths | array | Paths to include (e.g., /blog/*) |
| excludePaths | array | Paths to exclude |
Get all URLs from a website quickly.
Write to /tmp/firecrawl_request.json:
{
"url": "https://example.com"
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/map" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.links[:10]'
Write to /tmp/firecrawl_request.json:
{
"url": "https://shop.example.com",
"search": "product",
"limit": 500
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/map" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.links'
Map Parameters:
| Parameter | Type | Description |
|---|---|---|
| url | string | Website URL (required) |
| search | string | Filter URLs containing keyword |
| limit | number | Max URLs to return (default: 1000) |
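The search parameter filters server-side, but the returned links array can also be narrowed locally with jq. A sketch using a made-up map response in place of a live call:

```shell
# Made-up sample standing in for a live /map response.
LINKS='{"success": true, "links": ["https://example.com/", "https://example.com/blog/a", "https://example.com/blog/b"]}'

# Keep only URLs whose path matches a regex.
printf '%s' "$LINKS" | jq -r '.links[] | select(test("/blog/"))'
```

Local filtering avoids extra API calls when you need several different slices of the same link list.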
Search the web and get full page content.
Write to /tmp/firecrawl_request.json:
{
"query": "AI news 2024",
"limit": 5
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/search" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data[] | {title: .metadata.title, url: .url}'
Write to /tmp/firecrawl_request.json:
{
"query": "machine learning tutorials",
"limit": 3,
"scrapeOptions": {
"formats": ["markdown"]
}
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/search" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data[] | {title: .metadata.title, content: .markdown[:500]}'
Search Parameters:
| Parameter | Type | Description |
|---|---|---|
| query | string | Search query (required) |
| limit | number | Number of results (default: 10) |
| scrapeOptions | object | Options for scraping results |
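Search results arrive as an array under data, which jq can flatten into a readable table. A sketch using a made-up search response in place of a live call:

```shell
# Made-up sample standing in for a live /search response.
SEARCH='{"success": true, "data": [{"url": "https://a.example", "metadata": {"title": "A"}}, {"url": "https://b.example", "metadata": {"title": "B"}}]}'

# One tab-separated "title<TAB>url" line per result.
printf '%s' "$SEARCH" | jq -r '.data[] | [.metadata.title, .url] | @tsv'
```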
Extract structured data from pages using AI.
Write to /tmp/firecrawl_request.json:
{
"urls": ["https://example.com/product/123"],
"prompt": "Extract the product name, price, and description"
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'
Write to /tmp/firecrawl_request.json:
{
"urls": ["https://example.com/product/123"],
"prompt": "Extract product information",
"schema": {
"type": "object",
"properties": {
"name": {"type": "string"},
"price": {"type": "number"},
"currency": {"type": "string"},
"inStock": {"type": "boolean"}
}
}
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'
Write to /tmp/firecrawl_request.json:
{
"urls": [
"https://example.com/product/1",
"https://example.com/product/2"
],
"prompt": "Extract product name and price"
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'
Extract Parameters:
| Parameter | Type | Description |
|---|---|---|
| urls | array | URLs to extract from (required) |
| prompt | string | Description of data to extract (required) |
| schema | object | JSON schema for structured output |
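When a schema is supplied, the extracted fields come back as typed JSON that is easy to post-process. A jq sketch flattening the fields from the product schema above, using a made-up extract response in place of a live call:

```shell
# Made-up sample standing in for a live /extract response using the product schema above.
EXTRACT='{"success": true, "data": {"name": "Widget", "price": 19.99, "currency": "USD", "inStock": true}}'

# Flatten the structured fields into one summary line.
printf '%s' "$EXTRACT" | jq -r '"\(.data.name): \(.data.price) \(.data.currency)"'
```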
Write to /tmp/firecrawl_request.json:
{
"url": "https://docs.python.org/3/tutorial/",
"formats": ["markdown"],
"onlyMainContent": true
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq -r '.data.markdown' > python-tutorial.md
Write to /tmp/firecrawl_request.json:
{
"url": "https://blog.example.com",
"search": "post"
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/map" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq -r '.links[]'
Write to /tmp/firecrawl_request.json:
{
"query": "best practices REST API design 2024",
"limit": 5,
"scrapeOptions": {"formats": ["markdown"]}
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/search" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data[] | {title: .metadata.title, url: .url}'
Write to /tmp/firecrawl_request.json:
{
"urls": ["https://example.com/pricing"],
"prompt": "Extract all pricing tiers with name, price, and features"
}
Then run:
curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $FIRECRAWL_TOKEN" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'
Replace <job-id> with the actual job ID:
while true; do
  STATUS="$(curl -s "https://api.firecrawl.dev/v1/crawl/<job-id>" -H "Authorization: Bearer $FIRECRAWL_TOKEN" | jq -r '.status')"
  echo "Status: $STATUS"
  case "$STATUS" in
    completed|failed|cancelled) break ;;  # stop on any terminal status, not just success
  esac
  sleep 5
done
Scrape response format:
{
"success": true,
"data": {
"markdown": "# Page Title\n\nContent...",
"metadata": {
"title": "Page Title",
"description": "...",
"url": "https://..."
}
}
}
Crawl status response format:
{
"success": true,
"status": "completed",
"completed": 50,
"total": 50,
"data": [...]
}
Tips:
- Use limit values to control API usage
- Set onlyMainContent: true for cleaner output
- Poll /crawl/{id} for crawl job status
- Check the success field in responses