exa-search

benjaminjackson/exa-search

Research

9 installs

About

Search the web for content matching a query with AI-powered semantic search...

SKILL.md

Exa Search

Token-efficient strategies for web search using exa-ai.

Use --help to see available commands and verify usage before running:

exa-ai <command> --help

Critical Requirements

MUST follow these rules when using exa-ai search:

Shared Requirements

This skill inherits requirements from Common Requirements:

Schema design patterns → All schema operations
Output format selection → All output operations

MUST NOT Rules

Avoid --text flag: Prefer structured output with schemas over raw text extraction for better token efficiency

Cost Optimization

Pricing

1-25 results: $0.005 per search
26-100 results: $0.025 per search (5x more expensive)

Cost strategy:

Default to 1-25 results: 5x cheaper, sufficient for most queries
Need 50+ results? Run multiple targeted searches: Two 25-result searches with different angles beat one 50-result search (better quality, more control)
Use 26-100 results sparingly: Only when you need comprehensive coverage that multiple targeted searches would miss
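
The cost comparison in this strategy can be checked with a little arithmetic. The following is a minimal sketch that assumes only the two price tiers quoted above; `search_cost_mills` and `format_usd` are hypothetical helper names, not part of exa-ai.

```shell
#!/usr/bin/env bash
# Prices are the two tiers quoted above, held in tenths of a cent so that
# integer arithmetic stays exact: $0.005 -> 5, $0.025 -> 25.
search_cost_mills() {
  if [ "$1" -le 25 ]; then echo 5; else echo 25; fi
}

# Print an amount given in tenths of a cent as dollars (amounts under $1 only).
format_usd() {
  printf '$0.%03d\n' "$1"
}

small=$(search_cost_mills 25)
two_small=$(( small * 2 ))        # two 25-result searches
one_large=$(search_cost_mills 50) # one 50-result search

format_usd "$two_small"   # prints $0.010
format_usd "$one_large"   # prints $0.025
```

Under these prices, two targeted 25-result searches cost less than half of a single 50-result search.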

Token Optimization

Apply these strategies:

Use toon format: --output-format toon for 40% fewer tokens than JSON (use when reading output directly)
Use JSON + jq: Extract only needed fields with jq (use when piping/processing output)
Use --summary: Get AI-generated summaries instead of full page text
Use schemas: Extract structured data with --summary-schema (always pipe to jq)
Limit results: Use --num-results N to get only what you need

IMPORTANT: Choose one approach and do not mix them:

Approach 1: toon only - Compact YAML-like output for direct reading
Approach 2: JSON + jq - Extract specific fields programmatically
Approach 3: Schemas + jq - Get structured data, always use JSON output (default) and pipe to jq

Examples:

# ❌ High token usage
exa-ai search "AI news" --num-results 10

# ✅ Approach 1: toon format for direct reading (~40% fewer tokens than JSON, more with fewer results)
exa-ai search "AI news" --num-results 3 --output-format toon

# ✅ Approach 2: JSON + jq for field extraction (~80-90% reduction)
exa-ai search "AI news" --num-results 3 | jq -r '.results[].title'

# ❌ Don't mix toon with jq (toon is YAML-like, not JSON)
exa-ai search "AI news" --output-format toon | jq -r '.results[].title'

Quick Start

Basic Search

exa-ai search "Anthropic Claude new features" --num-results 5 --output-format toon

Search with Category Filter

exa-ai search "machine learning architectures" --category "research paper" --num-results 10

Extract Structured Data

exa-ai search "AI safety research papers 2024" \
  --summary \
  --summary-schema '{"type":"object","properties":{"title":{"type":"string"},"key_finding":{"type":"string"}}}' \
  --num-results 3 | jq -r '.results[].summary | fromjson | "- \(.title): \(.key_finding)"'
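
The `fromjson` step in the filter above suggests that each `summary` arrives as a string containing JSON, which has to be decoded before its fields can be read. The sketch below runs the same filter against a made-up response; the sample shape is an assumption inferred from the filter, `format_findings` is a hypothetical helper name, and jq must be installed.

```shell
#!/usr/bin/env bash
# Hypothetical response shape, inferred from the jq filter above. Real output
# will likely carry more fields per result.
sample='{"results":[{"summary":"{\"title\":\"Paper A\",\"key_finding\":\"Finding B\"}"}]}'

# .summary is a string holding JSON, so fromjson decodes it before the
# title and key_finding fields are interpolated into one line per result.
format_findings() {
  jq -r '.results[].summary | fromjson | "- \(.title): \(.key_finding)"'
}

printf '%s\n' "$sample" | format_findings   # prints: - Paper A: Finding B
```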

LinkedIn Search

exa-ai search "Anthropic" --linkedin company
exa-ai search "Dario Amodei" --linkedin person

Detailed Reference

For complete options, examples, and advanced usage, consult REFERENCE.md.

Shared Requirements

Schema Design

MUST: Use object wrapper for schemas

Applies to: answer, search, find-similar, get-contents

When using schema parameters (--output-schema or --summary-schema), always wrap properties in an object:

{"type":"object","properties":{"field_name":{"type":"string"}}}

DO NOT use bare properties without the object wrapper:

{"properties":{"field_name":{"type":"string"}}}  // ❌ Missing "type":"object"

Why: The Exa API requires a valid JSON Schema with an object type at the root level. Omitting this causes validation errors.

Examples:

# ✅ CORRECT - object wrapper included
exa-ai search "AI news" \
  --summary-schema '{"type":"object","properties":{"headline":{"type":"string"}}}'

# ❌ WRONG - missing object wrapper
exa-ai search "AI news" \
  --summary-schema '{"properties":{"headline":{"type":"string"}}}'
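
A schema can be checked locally before it is sent. This is a minimal sketch of such a pre-flight check, assuming jq is installed; `schema_has_object_root` is a hypothetical helper, not an exa-ai command, and it tests only the root `type`, not full JSON Schema validity.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: succeed only when the schema parses as JSON
# and its root declares "type":"object".
schema_has_object_root() {
  printf '%s' "$1" | jq -e '.type == "object"' >/dev/null 2>&1
}

schema='{"type":"object","properties":{"headline":{"type":"string"}}}'

if schema_has_object_root "$schema"; then
  echo "schema ok"
else
  echo 'schema rejected: root must declare "type":"object"' >&2
fi
```

Running the check as its own step keeps a malformed schema from reaching the API call at all.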

Output Format Selection

MUST NOT: Mix toon format with jq

Applies to: answer, context, search, find-similar, get-contents

toon format produces YAML-like output, not JSON. DO NOT pipe toon output to jq for parsing:

# ❌ WRONG - toon is not JSON
exa-ai search "query" --output-format toon | jq -r '.results'

# ✅ CORRECT - use JSON (default) with jq
exa-ai search "query" | jq -r '.results[].title'

# ✅ CORRECT - use toon for direct reading only
exa-ai search "query" --output-format toon

Why: jq expects valid JSON input. toon format is designed for human readability and produces YAML-like output that jq cannot parse.
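
The failure can be reproduced without calling the API. In this sketch the YAML-like text is a stand-in written for illustration, not real toon output, and `parses_as_json` is a hypothetical helper that relies on jq being installed.

```shell
#!/usr/bin/env bash
# Stand-in inputs: the first is NOT real exa-ai output, only a YAML-like shape.
yaml_like='results:
  - title: Example'
json='{"results":[{"title":"Example"}]}'

# jq exits with a non-zero status when its input is not valid JSON.
parses_as_json() {
  printf '%s\n' "$1" | jq empty >/dev/null 2>&1
}

if parses_as_json "$json"; then
  echo "JSON input: jq can parse it"
fi
if ! parses_as_json "$yaml_like"; then
  echo "YAML-like input: jq cannot parse it"
fi
```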

SHOULD: Choose one output approach

Applies to: answer, context, search, find-similar, get-contents

Pick one strategy and stick with it throughout your workflow:

Approach 1: toon only - Compact YAML-like output for direct reading
- Use when: Reading output directly, no further processing needed
- Token savings: ~40% reduction vs JSON
- Example: exa-ai search "query" --output-format toon

Approach 2: JSON + jq - Extract specific fields programmatically
- Use when: Need to extract specific fields or pipe to other commands
- Token savings: ~80-90% reduction (extracts only needed fields)
- Example: exa-ai search "query" | jq -r '.results[].title'

Approach 3: Schemas + jq - Structured data extraction with validation
- Use when: Need consistent structured output across multiple queries
- Token savings: ~85% reduction + consistent schema
- Example: exa-ai search "query" --summary-schema '{...}' | jq -r '.results[].summary | fromjson'

Why: Mixing approaches increases complexity and token usage. Choosing one approach optimizes for your use case.

Shell Command Best Practices

MUST: Run commands directly, parse separately

Applies to: monitor, search (websets), research, and all skills using complex commands

When using the Bash tool with complex shell syntax, run commands directly and parse output in separate steps:

# ❌ WRONG - command and parsing combined in one substitution
webset_id=$(exa-ai webset-create --search '{"query":"..."}' | jq -r '.webset_id')

# ✅ CORRECT - run directly and save the output, then parse
exa-ai webset-create --search '{"query":"..."}' > output.json
# Then in a follow-up command:
webset_id=$(jq -r '.webset_id' output.json)

Why: Complex nested $(...) command substitutions can fail unpredictably in shell environments. Running commands directly and parsing separately improves reliability and makes debugging easier.
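
One way to make the separate steps checkable is to verify each one before moving on. The sketch below uses a stand-in function in place of the real CLI call so that it runs anywhere; `create_webset`, the JSON shape, and the id value are all made up for illustration, and the parse step assumes jq is installed.

```shell
#!/usr/bin/env bash
# Stand-in for the real CLI call; the JSON shape and id value are made up.
create_webset() {
  printf '{"webset_id":"ws_example"}\n'
}

out=$(mktemp)

# Step 1: run the command on its own and keep its output in a file.
if ! create_webset > "$out"; then
  echo "command failed" >&2
  exit 1
fi

# Step 2: confirm there is something to parse before parsing it.
if [ ! -s "$out" ]; then
  echo "no output captured" >&2
  exit 1
fi

# Step 3: parse in a separate step (requires jq).
webset_id=$(jq -r '.webset_id' "$out")
echo "$webset_id"

rm -f "$out"
```

When a step fails, the message points at that step rather than at one long combined command.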

MUST NOT: Use nested command substitutions

Applies to: All skills when using complex multi-step operations

Avoid nesting multiple levels of command substitution:

# ❌ WRONG - deeply nested
result=$(exa-ai search "$(cat query.txt | tr '\n' ' ')" --num-results $(cat config.json | jq -r '.count'))

# ✅ CORRECT - sequential steps
query=$(tr '\n' ' ' < query.txt)
count=$(jq -r '.count' config.json)
exa-ai search "$query" --num-results "$count"

Why: Nested command substitutions are fragile and hard to debug when they fail. Sequential steps make each operation explicit and easier to troubleshoot.
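
Sequential steps also leave room to validate each value before it reaches the search command. The following is a rough sketch: `valid_count` is a hypothetical helper, the 1 to 100 range is an assumption taken from the pricing tiers in this document, and the final command is only printed, not run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: accept only whole numbers from 1 to 100.
valid_count() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 100 ]
}

workdir=$(mktemp -d)
printf 'AI safety\nresearch papers\n' > "$workdir/query.txt"

# Step 1: flatten the query onto one line and drop the trailing space.
query=$(tr '\n' ' ' < "$workdir/query.txt")
query=${query% }

# Step 2: validate the count on its own.
count=10
if ! valid_count "$count"; then
  echo "invalid result count: $count" >&2
  exit 1
fi

# Step 3: show the command that would run.
echo "exa-ai search \"$query\" --num-results $count"

rm -rf "$workdir"
```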

SHOULD: Break complex commands into sequential steps

Applies to: All skills when working with multi-step workflows

For readability and reliability, break complex operations into clear sequential steps:

# ❌ Less maintainable - everything in one line
exa-ai webset-create --search '{"query":"startups","count":1}' | jq -r '.webset_id' | xargs -I {} exa-ai webset-search-create {} --query "AI" --behavior override

# ✅ More maintainable - clear steps
exa-ai webset-create --search '{"query":"startups","count":1}' > output.json
webset_id=$(jq -r '.webset_id' < output.json)
exa-ai webset-search-create "$webset_id" --query "AI" --behavior override

Why: Sequential steps are easier to understand, debug, and modify. Each step can be verified independently.
