
    context-engine-ai/context-engine


    About

    Hybrid semantic/lexical code search with neural reranking via MCP tools...

    SKILL.md

    Context-Engine

    Search and retrieve code context from any codebase using hybrid vector search (semantic + lexical) with neural reranking.

    Client- or provider-specific wrapper files should stay thin and defer to this document for shared MCP tool-selection and search guidance.

    Quickstart

    1. Start with search for most codebase questions.
    2. Use symbol_graph first for direct symbol relationships such as callers, definitions, importers, subclasses, and base classes.
    3. Use graph_query only if that tool is available and you need transitive impact, dependency, or cycle analysis; otherwise combine symbol_graph with targeted search.
    4. Prefer MCP tools for exploration. Narrow grep/file-open use is still fine for confirming exact literals or file paths, or for opening a file you have already identified for editing.
    5. Use cross_repo_search for multi-repo questions. For the public V1 context_search, treat include_memories=true as compatibility-only: it preserves the response shape but keeps results code-only and may add a memory_note.

    Decision Tree: Choosing the Right Tool

    What do you need?
        |
        +-- UNSURE or GENERAL QUERY --> search (RECOMMENDED DEFAULT)
        |       |
        |       +-- Auto-detects intent and routes to the best tool
        |       +-- Handles: code search, Q&A, tests, config, symbols, imports
        |       +-- Use this when you don't know which specialized tool to pick
        |
    +-- Find code locations/implementations
    |       |
    |       +-- Unsure what tool to use --> search (DEFAULT - routes to repo_search if needed)
    |       +-- Speed-critical or complex filters --> repo_search (skip routing overhead)
    |       +-- Want LLM explanation --> context_answer
        |
        +-- Understand how something works
        |       |
        |       +-- Want LLM explanation --> search OR context_answer
        |       +-- Just code snippets --> search OR repo_search with include_snippet=true
        |
        +-- Find similar code patterns (retry loops, error handling, etc.)
        |       |
        |       +-- Have code example --> pattern_search with code snippet (if enabled)
        |       +-- Describe pattern --> pattern_search with natural language (if enabled)
        |
        +-- Find specific file types
        |       |
        |       +-- Test files --> search OR search_tests_for
        |       +-- Config files --> search OR search_config_for
        |
        +-- Find relationships
        |       |
        |       +-- Direct callers/defs/importers/inheritance --> search OR symbol_graph
        |       +-- Multi-hop callers --> symbol_graph (depth=2+)
        |       +-- Deep impact/dependencies/cycles --> graph_query (if available) OR symbol_graph + targeted search
        |
        +-- Git history
        |       |
        |       +-- Find commits --> search_commits_for
        |       +-- Predict co-changing files --> search_commits_for with predict_related=true
        |
        +-- Store/recall knowledge --> memory_store, memory_find
        |
        +-- Preserve public search shape while accepting memory flags --> context_search with include_memories=true (compatibility-only in V1)
        |
        +-- Multiple independent searches at once
                |
                +-- batch_search (runs N repo_search calls in one invocation, ~75% token savings)
    

    Standard Parameters Reference

    All SaaS-exposed tools organize parameters into families with consistent naming and behavior.

    Family 1: Code Search Tools

    Applies to: search, repo_search, code_search, batch_search, info_request, context_search

    Standard Parameters:

    | Parameter | Type | Required? | Default | Purpose |
    |---|---|---|---|---|
    | query | string or string[] | YES | — | Single query OR array of queries for fusion |
    | language | string | optional | (auto-detect) | Filter by language: "python", "typescript", "go", etc. |
    | under | string | optional | (root) | Path prefix filter, e.g., "src/api/" or "tests/" |
    | path_glob | string[] | optional | (all) | Include patterns: ["**/*.ts", "lib/**"] |
    | not_glob | string[] | optional | (none) | Exclude patterns: ["**/test_*", "**/*_test.*"] |
    | symbol | string | optional | (all) | Filter by symbol name (function, class, variable) |
    | kind | string | optional | (all) | AST node type: "function", "class", "method", "variable" |
    | ext | string | optional | (all) | File extension: "py", "ts", "go" (alias for language) |
    | repo | string or string[] | optional | (default) | Repository filter: single repo, a list, or "*" for all |
    | limit | int | optional | 10 | Max results to return (1-100) |
    | include_snippet | bool | optional | true | Include code snippets in results |
    | compact | bool | optional | false | Strip verbose fields from the response |
    | output_format | string | optional | "json" | "json" (structured) or "toon" (token-efficient) |
    | rerank_enabled | bool | optional | true | Enable neural reranking (default ON) |
    | case | string | optional | (insensitive) | "sensitive" for case-sensitive matching |
    | context_lines | int | optional | 2 | Lines of context around matches |
    | per_path | int | optional | 2 | Max results per file |

    Standard Constraints:

    • limit max 100 (higher values slow queries)
    • query max 400 characters / 50 words
    • language must be a valid code language; invalid values fail silently rather than raising an error
    • path_glob / not_glob support glob patterns (*, **, ?)
    • Multiple query terms are fused via Reciprocal Rank Fusion (RRF) for better recall
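    The RRF fusion mentioned above can be sketched as follows. This is a minimal illustration of the technique only, not the server's implementation; the constant k=60 is the conventional choice from the RRF literature and is an assumption here.

    ```python
    # Minimal Reciprocal Rank Fusion sketch (illustrative only).
    # Each document's fused score is the sum of 1/(k + rank) over the
    # ranked lists it appears in, so items ranked highly in several
    # lists float to the top.
    def rrf_fuse(ranked_lists, k=60):
        scores = {}
        for results in ranked_lists:
            for rank, doc in enumerate(results, start=1):
                scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
        return sorted(scores, key=scores.get, reverse=True)

    # A file found by all three query phrasings outranks files found by one:
    fused = rrf_fuse([
        ["auth.py", "middleware.py"],
        ["auth.py", "login.py"],
        ["auth.py", "session.py"],
    ])
    # fused[0] == "auth.py"
    ```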

    Family 2: Symbol Graph Tools

    Applies to: symbol_graph, batch_symbol_graph, graph_query, batch_graph_query

    Standard Parameters:

    | Parameter | Type | Required? | Default | Purpose |
    |---|---|---|---|---|
    | symbol | string | YES | — | Symbol name to analyze (e.g., "authenticate", "UserService.get_user") |
    | query_type | string | optional | "callers" | "callers", "callees", "definition", "importers", "subclasses", "base_classes", "impact", "cycles", "transitive_callers", "transitive_callees", "dependencies" |
    | depth | int | optional | 1 | Traversal depth: 1=direct, 2=callers of callers, etc. (symbol_graph: max 3; graph_query: max 5+) |
    | language | string | optional | (auto) | Filter by language for multi-language codebases |
    | under | string | optional | (all) | Path prefix filter |
    | limit | int | optional | 20 | Max results to return |
    | include_paths | bool | optional | false | Include full traversal paths (graph_query only) |
    | output_format | string | optional | "json" | "json" or "toon" |
    | repo | string | optional | (default) | Repository filter |
    | collection | string | optional | (session) | Target collection (use session defaults) |

    Standard Constraints:

    • symbol is matched exactly, with a fuzzy fallback when no exact match is found
    • query_type is case-sensitive
    • depth > 3 may be slow on large graphs
    • Results are auto-hydrated with code snippets

    Family 3: Specialized Search Tools

    Applies to: search_tests_for, search_config_for, search_callers_for, search_importers_for, search_commits_for

    Standard Parameters:

    | Parameter | Type | Required? | Default | Purpose |
    |---|---|---|---|---|
    | query | string | YES | — | Natural language or symbol name |
    | limit | int | optional | 10 | Max results to return |
    | language | string | optional | (auto) | Filter by language |
    | under | string | optional | (all) | Path prefix filter |

    Additional Parameters:

    | Tool | Extra Parameters |
    |---|---|
    | search_commits_for | path (optional), predict_related (bool, default false) |
    | All others | (inherit the code search family) |

    Family 4: Memory Tools

    Applies to: memory_store, memory_find

    Standard Parameters:

    | Parameter | Type | Required? | Default | Purpose |
    |---|---|---|---|---|
    | information | string | YES (store) | — | Knowledge to persist (clear, self-contained) |
    | query | string | YES (find) | — | Search for stored knowledge by similarity |
    | metadata | dict | optional (store) | {} | Structured metadata: kind, topic, priority (1-5), tags, author |
    | kind | string | optional (find) | (all) | Filter by kind: "memory", "note", "decision", "convention", "gotcha", "policy" |
    | topic | string | optional (find) | (all) | Filter by topic: "auth", "database", "api", "caching", etc. |
    | tags | string or string[] | optional (find) | (all) | Filter by tags: ["security", "sql", ...] |
    | priority_min | int | optional (find) | 1 | Minimum priority threshold (1-5) |
    | limit | int | optional | 10 | Max results to return |
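    A hypothetical memory_find payload combining these filters (the query text is illustrative):

    ```json
    {
      "query": "sql injection mitigations",
      "kind": "gotcha",
      "tags": ["security"],
      "priority_min": 3,
      "limit": 5
    }
    ```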

    Family 5: Batch Tools

    Applies to: batch_search, batch_symbol_graph, batch_graph_query

    Standard Parameters (Shared across all queries):

    | Parameter | Type | Purpose |
    |---|---|---|
    | searches / queries | array | Array of individual search/query specs (max 10 items) |
    | collection | string | Shared default collection for all queries |
    | language | string | Shared default language filter |
    | under | string | Shared default path prefix |
    | limit | int | Shared default result limit |
    | output_format | string | "json" or "toon" for all results |

    Per-Search Overrides: Each item in searches / queries can override ANY shared parameter.

    Example: searches[0] has different limit than searches[1]
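    A hypothetical payload illustrating that override (the queries are placeholders):

    ```json
    {
      "searches": [
        {"query": "token refresh flow", "limit": 3},
        {"query": "session storage"}
      ],
      "limit": 10
    }
    ```

    Here the first search returns at most 3 results, while the second inherits the shared limit of 10.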

    Family 6: Cross-Repo & Admin Tools

    Applies to: cross_repo_search, qdrant_status, qdrant_list, set_session_defaults

    Standard Parameters:

    | Tool | Parameters |
    |---|---|
    | cross_repo_search | query, collection, target_repos, discover, trace_boundary, boundary_key |
    | qdrant_status / qdrant_list | (no parameters) |
    | set_session_defaults | collection, language, under, output_format, limit |

    Unified Search: search (RECOMMENDED DEFAULT)

    Use search as your PRIMARY tool. It auto-detects query intent and routes to the best specialized tool. No need to choose between 15+ tools.

    {
      "query": "authentication middleware"
    }
    

    Returns:

    {
      "ok": true,
      "intent": "search",
      "confidence": 0.92,
      "tool": "repo_search",
      "result": {
        "results": [...],
        "total": 8
      },
      "plan": ["detect_intent", "dispatch_repo_search"],
      "execution_time_ms": 245
    }
    

    What it handles automatically:

    • Code search ("find auth middleware") -> routes to repo_search
    • Q&A ("how does caching work?") -> routes to context_answer
    • Test discovery ("tests for payment") -> routes to search_tests_for
    • Config lookup ("database settings") -> routes to search_config_for
    • Symbol queries ("who calls authenticate") -> routes to symbol_graph
    • Import tracing ("what imports CacheManager") -> routes to search_importers_for

    Override parameters (all optional):

    {
      "query": "error handling patterns",
      "limit": 5,
      "language": "python",
      "under": "src/api/",
      "include_snippet": true
    }
    

    When to use search:

    • You're unsure which specialized tool to use
    • You want intent auto-detection (routing to repo_search, context_answer, symbol_graph, tests, etc.)
    • Acceptable latency overhead: ~50-100ms for routing + tool execution
    • You're doing exploratory queries where routing overhead is negligible

    When NOT to use search:

    • You know you need raw code results (use repo_search directly)
    • Time is critical (<100ms target) and routing overhead matters
    • You're in a tight loop doing 10+ sequential searches (use batch_search instead)

    Routing Performance:

    • Intent detection: ~10-20ms
    • Tool dispatch: ~5-10ms
    • Total routing overhead: ~20-40ms typical, up to ~100ms worst-case
    • For time-critical loops: skip routing with repo_search directly

    When to use specialized tools instead:

    • Cross-repo search -> cross_repo_search
    • Multiple independent searches -> batch_search (N searches in one call, ~75% token savings)
    • Memory storage/retrieval -> memory_store, memory_find
    • Admin/diagnostics -> qdrant_status, qdrant_list
    • Pattern matching (structural) -> pattern_search

    When to use repo_search instead of search:

    • Full control over filters: You know exactly what you're searching for and want to apply specific language, path, or symbol filters without auto-detection overhead
      • Example: "In a polyglot repo, I need Python code only" → use repo_search with language="python" to avoid search's auto-detected language=javascript
      • Example: "Find only test files matching a pattern" → use repo_search with path_glob="**/test_*.py" directly
    • Speed-critical queries (<100ms target): You can't afford the ~20-40ms routing overhead
      • Example: Time-sensitive tool loops where each query must complete in <50ms
    • Complex filter combinations: You need language + under + not_glob together, not guessed by auto-detection
    • Guaranteed exact behavior: You want reproducible results without routing confidence variations (search routing confidence varies 0.6-0.95)
    • Known tool type: You already know you need code results (not Q&A, tests, configs, or symbols) so routing is wasted

    Example: When search guesses wrong:

    SEARCH (auto-routes, may detect wrong intent):
    query: "authenticate in FastAPI"
    confidence: 0.75
    intent: "Q&A - what does authenticate do in FastAPI?"
    → routes to context_answer, returns explanation instead of code
    
    REPO_SEARCH (explicit, predictable):
    query: "authenticate"
    language: "python"
    under: "src/auth/"
    → returns code implementations in src/auth/ only, no routing overhead
    

    Routing Overhead: When It Matters

    Latency Impact of Using search vs repo_search directly:

    | Scenario | search Latency | repo_search Latency | Routing Cost | Use search? |
    |---|---|---|---|---|
    | One exploratory query | ~150-200ms | ~80-100ms | ~70-100ms | YES (worth it for auto-routing) |
    | 3 independent queries, sequential | ~450-600ms | ~240-300ms | ~210-300ms | NO (use batch_search instead) |
    | Time-critical query (<50ms) | Can miss deadline | ~80-100ms | ❌ Unacceptable | NO (use repo_search) |
    | Tight loop (20+ queries) | ~3000-4000ms | ~1600-2000ms | ~1400-2000ms | NO (use batch_search) |

    Decision Criteria:

    • Use search when: One-off query, exploratory, unsure which tool, latency <200ms is acceptable
    • Use repo_search when: Speed <100ms required, complex filter combo needed, tight loop (use batch_search if >2 queries), know you need code (not Q&A)
    • Use batch_search when: 2+ independent code searches to reduce routing overhead by 75-85% per batch

    Real-world example - Interactive AI assistant loop:

    Bad (repeated routing overhead):
    for query in user_queries:  # 5 queries
        result = search(query)  # ~70-100ms routing × 5 = 350-500ms wasted
    
    Good (one batch call):
    results = batch_search([query1, query2, query3, query4, query5])  # Routing once, ~25ms × 5 = 125ms
    # Saves ~300ms+ per iteration
    

    Primary Search: repo_search

    Use repo_search (or its alias code_search) for direct code lookups when you need full control. Reranking is ON by default.

    {
      "query": "database connection handling",
      "limit": 10,
      "include_snippet": true,
      "context_lines": 3
    }
    

    Returns:

    {
      "results": [
        {"score": 3.2, "path": "src/db/pool.py", "symbol": "ConnectionPool", "start_line": 45, "end_line": 78, "snippet": "..."}
      ],
      "total": 8,
      "used_rerank": true
    }
    

    Multi-query for better recall - pass a list to fuse results:

    {
      "query": ["auth middleware", "authentication handler", "login validation"]
    }
    

    Apply filters to narrow results:

    {
      "query": "error handling",
      "language": "python",
      "under": "src/api/",
      "not_glob": ["**/test_*", "**/*_test.*"]
    }
    

    Search across repos (same collection):

    {
      "query": "shared types",
      "repo": ["frontend", "backend"]
    }
    

    Use repo: "*" to search all indexed repos.
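    For example (the query text is illustrative):

    ```json
    {
      "query": "feature flag checks",
      "repo": "*"
    }
    ```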

    Search across repos (separate collections — use cross_repo_search):

    // cross_repo_search
    {"query": "shared types", "target_repos": ["frontend", "backend"]}
    // With boundary tracing for cross-repo flow discovery
    {"query": "login submit", "trace_boundary": true}
    

    Available Filters

    • language - Filter by programming language
    • under - Path prefix (e.g., "src/api/")
    • path_glob - Include patterns (e.g., ["**/*.ts", "lib/**"])
    • not_glob - Exclude patterns (e.g., ["**/test_*"])
    • symbol - Symbol name match
    • kind - AST node type (function, class, etc.)
    • ext - File extension
    • repo - Repository filter for multi-repo setups
    • case - Case-sensitive matching

    Batch Search: batch_search

    Run N independent repo_search calls in a single MCP tool invocation. Reduces token overhead by ~75-85% compared to sequential calls.

    Token Savings & Latency Metrics:

    | N Searches | Token Savings | Sequential Latency | Batch Latency | Worth Batching? |
    |---|---|---|---|---|
    | 1 | 0% | ~100ms | N/A | N/A |
    | 2 | ~40% | ~180-200ms | ~150-160ms | ✅ YES (save 30-40ms, 40% tokens) |
    | 3 | ~55% | ~270-300ms | ~180-200ms | ✅ YES (save 90-100ms, 55% tokens) |
    | 5 | ~70% | ~450-500ms | ~220-250ms | ✅ YES (save 250ms, 70% tokens) |
    | 10 | ~75% | ~900-1000ms | ~300-350ms | ✅ YES (save 600ms, 75% tokens) |

    Decision Rule: Always use batch_search when you have 2+ independent code searches. The latency savings alone (30-100ms faster) justify batching, plus you save ~40-75% tokens.

    {
      "searches": [
        {"query": "authentication middleware", "limit": 5},
        {"query": "rate limiting implementation", "limit": 5},
        {"query": "error handling patterns"}
      ],
      "compact": true,
      "output_format": "toon"
    }
    

    Returns:

    {
      "ok": true,
      "batch_results": [result_set_0, result_set_1, result_set_2],
      "count": 3,
      "elapsed_ms": 245
    }
    

    Each result_set has the same schema as repo_search output.

    Shared parameters (applied to all searches unless overridden per-search):

    • collection, output_format, compact, limit, language, under, repo, include_snippet, rerank_enabled

    Per-search overrides: Each entry in searches can include any repo_search parameter to override the shared defaults.

    Limits: Maximum 10 searches per batch.

    When to use batch_search vs multiple search calls:

    • Use batch_search when you have 2+ independent code searches and want to minimize token usage and round-trips
    • Use individual search calls when you need intent routing (Q&A, symbol graph, etc.) or when searches depend on each other's results

    Simple Lookup: info_request

    Use info_request for natural language queries with minimal parameters:

    {
      "info_request": "how does user authentication work"
    }
    

    Add explanations:

    {
      "info_request": "database connection pooling",
      "include_explanation": true
    }
    

    Q&A with Citations: context_answer

    Use context_answer when you need an LLM-generated explanation grounded in code:

    {
      "query": "How does the caching layer invalidate entries?",
      "budget_tokens": 2000
    }
    

    Returns an answer with file/line citations. Use expand: true to generate query variations for better retrieval.
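    A sketch of the same query with expansion enabled:

    ```json
    {
      "query": "How does the caching layer invalidate entries?",
      "expand": true,
      "budget_tokens": 2000
    }
    ```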

    Pattern Search: pattern_search (Optional)

    Note: This tool may not be available in all deployments. If pattern detection is disabled, calls return {"ok": false, "error": "Pattern search module not available"}.

    Find structurally similar code patterns across all languages. Accepts either code examples or natural language descriptions—auto-detects which.

    Code example query - find similar control flow:

    {
      "query": "for i in range(3): try: ... except: time.sleep(2**i)",
      "limit": 10,
      "include_snippet": true
    }
    

    Natural language query - describe the pattern:

    {
      "query": "retry with exponential backoff",
      "limit": 10,
      "include_snippet": true
    }
    

    Cross-language search - Python pattern finds Go/Rust/Java equivalents:

    {
      "query": "if err != nil { return err }",
      "language": "go",
      "limit": 10
    }
    

    Explicit mode override - force code or description mode:

    {
      "query": "error handling",
      "query_mode": "description",
      "limit": 10
    }
    

    Key parameters:

    • query - Code snippet OR natural language description
    • query_mode - "code", "description", or "auto" (default)
    • language - Language hint for code examples (python, go, rust, etc.)
    • limit - Max results (default 10)
    • min_score - Minimum similarity threshold (default 0.3)
    • include_snippet - Include code snippets in results
    • context_lines - Lines of context around matches
    • aroma_rerank - Enable AROMA structural reranking (default true)
    • aroma_alpha - Weight for AROMA vs original score (default 0.6)
    • target_languages - Filter results to specific languages

    Returns:

    {
      "ok": true,
      "results": [...],
      "total": 5,
      "query_signature": "L2_2_B0_T2_M0",
      "query_mode": "code",
      "search_mode": "aroma"
    }
    

    The query_signature encodes control flow: L (loops), B (branches), T (try/except), M (match).
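    An illustrative call combining several of these parameters (the values are arbitrary, not recommended defaults):

    ```json
    {
      "query": "retry with exponential backoff",
      "query_mode": "description",
      "min_score": 0.5,
      "target_languages": ["python", "go"],
      "limit": 10
    }
    ```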

    Specialized Search Tools

    search_tests_for - Find test files:

    {"query": "UserService", "limit": 10}
    

    search_config_for - Find config files:

    {"query": "database connection", "limit": 5}
    

    search_callers_for - Find callers of a symbol:

    {"query": "processPayment", "language": "typescript"}
    

    search_importers_for - Find importers:

    {"query": "utils/helpers", "limit": 10}
    

    symbol_graph - Symbol graph navigation (callers / callees / definition / importers / subclasses / base classes):

    Query types:

    | Type | Description |
    |---|---|
    | callers | Who calls this symbol? |
    | callees | What does this symbol call? |
    | definition | Where is this symbol defined? |
    | importers | Who imports this module/symbol? |
    | subclasses | What classes inherit from this symbol? |
    | base_classes | What classes does this symbol inherit from? |

    Examples:

    {"symbol": "ASTAnalyzer", "query_type": "definition", "limit": 10}
    
    {"symbol": "get_embedding_model", "query_type": "callers", "under": "scripts/", "limit": 10}
    
    {"symbol": "qdrant_client", "query_type": "importers", "limit": 10}
    
    {"symbol": "authenticate", "query_type": "callees", "limit": 10}
    
    {"symbol": "BaseModel", "query_type": "subclasses", "limit": 20}
    
    {"symbol": "MyService", "query_type": "base_classes"}
    
    • Supports language, under, depth, and output_format like other tools.
    • Use depth=2 or depth=3 for multi-hop traversals (callers of callers).
    • If there are no graph hits, it falls back to semantic search.
    • Note: Results are "hydrated" with ~500-char source snippets for immediate context.
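    For multi-hop traversal, depth combines with any query type; a hypothetical two-hop caller query:

    ```json
    {"symbol": "authenticate", "query_type": "callers", "depth": 2, "limit": 20}
    ```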

    graph_query - Advanced graph traversals and impact analysis (available to all SaaS users):

    Query types:

    | Type | Description |
    |---|---|
    | callers | Direct callers of this symbol |
    | callees | Direct callees of this symbol |
    | transitive_callers | Multi-hop callers (up to depth) |
    | transitive_callees | Multi-hop callees (up to depth) |
    | impact | What would break if I change this symbol? |
    | dependencies | Combined calls + imports |
    | definition | Where is this symbol defined? |
    | cycles | Detect circular dependencies involving this symbol |

    Examples:

    {"symbol": "UserService", "query_type": "impact", "depth": 3}
    
    {"symbol": "auth_module", "query_type": "cycles"}
    
    {"symbol": "processPayment", "query_type": "transitive_callers", "depth": 2, "limit": 20}
    
    • Supports language, under, depth, limit, include_paths, and output_format.
    • Use include_paths: true to get full traversal paths in results.
    • Use depth to control how many hops to traverse (default varies by query type).
    • Note: symbol_graph is always available (Qdrant-backed). graph_query provides advanced Memgraph-backed traversals and is available to all SaaS users.

    Comparison: symbol_graph vs graph_query

    | Feature | symbol_graph | graph_query |
    |---|---|---|
    | Availability | Always (Qdrant-backed) | SaaS/Enterprise (Memgraph-backed) |
    | Performance | ~2-5ms per query | ~50-200ms per query |
    | Supported Relationships | callers, callees, definition, importers, subclasses, base_classes | All symbol_graph types plus impact, cycles, transitive_* |
    | Max Depth | up to 3 | up to 5+ |
    | Best For | Direct relationships, exploratory queries | Impact analysis, dependency chains, circular-dependency detection |
    | Fallback When Unavailable | Falls back to semantic search | N/A (use symbol_graph instead) |
    | Latency-Critical Loops | ✅ YES (fast) | ❌ NO (slower) |

    Decision Guide:

    • Use symbol_graph for: direct callers/callees/definitions, inheritance queries, when you need speed, always as first stop
    • Use graph_query for: impact analysis ("what breaks?"), cycle detection, transitive chains, when available and you need depth >3

    search_commits_for - Search git history:

    {"query": "fixed authentication bug", "limit": 10}
    

    Predict co-changing files (predict_related mode):

    {"path": "src/api/auth.py", "predict_related": true, "limit": 10}
    

    Returns ranked files that historically co-change with the given path, along with the most relevant commit message explaining why.

    change_history_for_path - File change summary:

    {"path": "src/api/auth.py", "include_commits": true}
    

    Memory: Store and Recall Knowledge

    Memory tools allow you to persist team knowledge, architectural decisions, and findings for later retrieval across sessions.

    Memory Workflow: Store → Retrieve → Reuse

    Phase 1: During Exploration (Session 1) As you discover important patterns, decisions, or findings, store them for future reference:

    {
      "memory_store": {
        "information": "Auth service uses JWT tokens with 24h expiry. Refresh tokens last 7 days. Stored in Redis with LRU eviction.",
        "metadata": {
          "kind": "decision",
          "topic": "auth",
          "priority": 5,
          "tags": ["security", "jwt", "session-management"]
        }
      }
    }
    

    Phase 2: In Later Sessions Retrieve and reuse stored knowledge by similarity:

    {
      "memory_find": {
        "query": "token expiration policy",
        "topic": "auth",
        "limit": 5
      }
    }
    

    Returns the exact note stored in Phase 1, plus any other auth-related memories.

    Phase 3: Blend Code + Memory When you want BOTH code search results AND stored team knowledge:

    {
      "context_search": {
        "query": "authentication flow",
        "include_memories": true,
        "per_source_limits": {"code": 6, "memory": 3}
      }
    }
    

    Returns: 6 code snippets + 3 memory notes, all ranked by relevance.

    Timeline and Persistence

    | Property | Behavior |
    |---|---|
    | Searchability | Memories are searchable immediately after memory_store (indexing is instant) |
    | Persistence | Memories persist across sessions indefinitely (durable storage) |
    | Scope | Org/workspace scoped: one team's memories don't leak to another |
    | Latency | ~100ms per memory_find query (same as code search) |
    | Storage | Embedded in the same Qdrant collection as code, but logically isolated |

    Real-World Example: Session Continuity

    Session 1 (Day 1) - Discovery:

    Context: Investigating why JWT refresh tokens sometimes expire unexpectedly
    
    → memory_store(
        information="Found: RefreshTokenManager.py line 89 uses session.expire_in instead of constants.REFRESH_TTL. This was a bug introduced in PR #1234 where the constant was 7 days but the session value was hardcoded to 3 days. The mismatch causes premature expiration.",
        metadata={"kind": "gotcha", "topic": "auth", "tags": ["bug", "jwt"], "priority": 4}
      )
    

    Session 2 (Day 5) - Troubleshooting a Similar Issue:

    → memory_find(query="refresh token expiration problem", topic="auth")
    
    Response: Found Session 1's note about the RefreshTokenManager bug, plus similar findings about token TTL misconfigurations.
    
    → User goes directly to line 89 of RefreshTokenManager.py and verifies the fix status.
    

    Result: Problem solved in 2 minutes instead of 30 minutes of debugging.

    When to Store What

    | Memory Kind | Use Case | Example |
    |---|---|---|
    | decision | Architectural choices and their rationale | "We chose JWT over sessions for stateless scaling" |
    | gotcha | Subtle bugs or trap conditions | "RefreshTokenManager line 89 has TTL mismatch" |
    | convention | Team patterns and standards | "All API responses use envelope pattern with status/data/errors" |
    | note | General findings or context | "Auth service was moved to separate repo last month" |
    | policy | Compliance or operational rules | "Session tokens must be rotated every 24h per SOC2" |

    Integration with Code Search

    Pattern 1: Pure Code Search

    {"search": "authentication validation"}
    

    Returns: code snippets only. Fast, no memory overhead.

    Pattern 2: Code + Memory Blend

    {
      "context_search": {
        "query": "authentication validation",
        "include_memories": true,
        "per_source_limits": {"code": 5, "memory": 2}
      }
    }
    

    Returns: 5 code snippets + 2 relevant memory notes (team insights about auth validation patterns).

    Pattern 3: Memory Only

    {"memory_find": {"query": "authentication patterns", "limit": 10}}
    

    Returns: stored team knowledge about auth, useful for onboarding or architecture review.

    Common Patterns

    Team Onboarding:

    • New engineer joins → memory_find(query="project architecture", topic="architecture")
    • Retrieves all stored architectural decisions in one place
    • Much faster than reading scattered code comments

    Incident Response:

    • Production auth bug occurs
    • → memory_find(query="auth failures", priority_min=3)
    • Retrieves gotchas, prior incidents, and known traps
    • Faster root-cause diagnosis

    Code Review Efficiency:

    • Reviewer checks PR modifying auth module
    • → context_search(query="authentication standards", include_memories=true)
    • Sees both current code AND team conventions/policies
    • Makes more informed review decisions

    Error Cases and Recovery

    | Error | Cause | Recovery |
    |---|---|---|
    | No results from memory_find | Query too specific or memories not yet stored | Broaden the query; check metadata filters (topic, tags, kind) |
    | Memory not found in next session | Wrong workspace/collection or stale cache | Verify the workspace matches; run qdrant_list to confirm the collection |
    | include_memories=true returns only code | Memory store empty for this workspace | Start storing with memory_store; the next session will have memories |
    | Duplicate memories with same info | Same finding discovered twice | Use memory_find with a topic/tags filter, consolidate via a note |

    Admin and Diagnostics

    qdrant_status - Check index health:

    {}
    

    qdrant_list - List all collections:

    {}
    

    embedding_pipeline_stats - Get cache efficiency, bloom filter stats, pipeline performance:

    {}
    

    set_session_defaults - Set defaults for session:

    {"collection": "my-project", "language": "python"}
    

    Deployment Mode Capabilities

    SaaS Mode: In SaaS deployments, indexing is handled automatically by the VS Code extension upload service. The tools below marked "Self-Hosted Only" are not available in SaaS mode. All search, symbol graph, memory, and session tools work normally.

    Self-Hosted Only Tools (not available in SaaS):

    | Tool | Purpose | When to Use |
    | --- | --- | --- |
    | qdrant_index_root | Index entire workspace | Initial indexing or after major codebase reorg |
    | qdrant_index | Index subdirectory | Incremental indexing of specific folders |
    | qdrant_prune | Remove stale entries | Clean up entries from deleted files |

    Tool Availability Matrix

    Which tools are available in which deployment modes:

    | Tool Category | Tool | SaaS | Self-Hosted | Enterprise |
    | --- | --- | --- | --- | --- |
    | Search | search | ✅ | ✅ | ✅ |
    | | repo_search / code_search | ✅ | ✅ | ✅ |
    | | cross_repo_search | ✅ | ✅ | ✅ |
    | | batch_search | ✅ | ✅ | ✅ |
    | Search (Specialized) | info_request | ✅ | ✅ | ✅ |
    | | context_answer | ✅ | ✅ | ✅ |
    | | search_tests_for | ✅ | ✅ | ✅ |
    | | search_config_for | ✅ | ✅ | ✅ |
    | | search_callers_for | ✅ | ✅ | ✅ |
    | | search_importers_for | ✅ | ✅ | ✅ |
    | | search_commits_for | ✅ | ✅ | ✅ |
    | | change_history_for_path | ✅ | ✅ | ✅ |
    | | pattern_search (if enabled) | ✅* | ✅* | ✅* |
    | Symbol Graph | symbol_graph | ✅ | ✅ | ✅ |
    | | batch_symbol_graph | ✅ | ✅ | ✅ |
    | | graph_query | ✅ (limited)** | ✅ | ✅ |
    | | batch_graph_query | ✅ (limited)** | ✅ | ✅ |
    | Memory | memory_store | ✅ | ✅ | ✅ |
    | | memory_find | ✅ | ✅ | ✅ |
    | | context_search | ✅ | ✅ | ✅ |
    | Session | set_session_defaults | ✅ | ✅ | ✅ |
    | | expand_query | ✅ | ✅ | ✅ |
    | Admin | qdrant_status | ✅ | ✅ | ✅ |
    | | qdrant_list | ✅ | ✅ | ✅ |
    | | embedding_pipeline_stats | ✅ | ✅ | ✅ |
    | | qdrant_index_root | ❌ | ✅ | ✅ |
    | | qdrant_index | ❌ | ✅ | ✅ |
    | | qdrant_prune | ❌ | ✅ | ✅ |

    Legend:

    • ✅ = Available
    • ❌ = Not available
    • ✅* = Pattern search available only if enabled during deployment
    • ✅ (limited)** = SaaS graph_query has limited depth/performance vs Enterprise with dedicated Memgraph

    Choosing Your Deployment Mode

    | Requirement | Best Fit |
    | --- | --- |
    | Automatic indexing via VS Code | SaaS |
    | Manual control over indexing pipeline | Self-Hosted |
    | Advanced graph queries (cycles, impact analysis) | Self-Hosted or Enterprise |
    | High-performance graph traversal | Enterprise (dedicated Memgraph) |
    | Cost-sensitive small team | SaaS (pay per upload) |
    | Large codebase with frequent indexing | Self-Hosted (unlimited reindex) |

    Error Handling and Recovery

    Tools return structured errors via an error field or an ok: false flag. Common errors and recovery steps are listed below by category.

    Search Tools (search, repo_search, batch_search, info_request)

    | Error | HTTP 400? | Cause | Recovery Steps |
    | --- | --- | --- | --- |
    | "Collection not found" | Yes | Collection doesn't exist, workspace hasn't been indexed, or collection was deleted | 1. Run qdrant_list() to verify available collections. 2. Check the workspace name in config matches the indexed name. 3. If missing: re-upload the workspace to the indexing service. 4. If the collection exists but is stale: wait for background refresh or trigger a reindex. |
    | "Invalid language filter" | Yes | language parameter has an invalid value | Use only valid language codes: "python", "typescript", "go", "rust", "java", etc. Check qdrant_status for supported languages. |
    | "Timeout during rerank" | No (504) | Reranking took too long (default 5s timeout) | Set rerank_enabled: false to skip reranking, set rerank_timeout_ms: 10000 for a longer timeout, or reduce limit to speed up reranking. |
    | "Empty results" | No (200) | Query too specific, collection not fully indexed, or no matches exist | 1. Broaden the query (remove filters, use more general terms). 2. Check the language filter is correct. 3. Run qdrant_status to see the point count. 4. If points=0: indexing is incomplete; wait and retry. |
    | "Query too long" | Yes | Query exceeds 400 chars or 50 words | Shorten the query or split it into multiple searches. |
    | "Syntax error in path_glob" | Yes | Invalid glob pattern in path_glob or not_glob | Check glob syntax: valid wildcards are * (any), ** (any directories), ? (any single char). |

    Silent Failure Watches:

    • Empty results when expecting matches → check under path filter (may be excluding files)
    • Results from wrong language → verify language parameter is set correctly
    • Reranking disabled silently → check rerank_timeout_ms if you set custom timeout
    • Wrong collection queried → session defaults may not match workspace (use set_session_defaults to "cd" into correct collection)

    Symbol Graph Tools (symbol_graph, batch_symbol_graph, graph_query)

    | Error | Cause | Recovery Steps |
    | --- | --- | --- |
    | "Symbol not found" | Symbol doesn't exist, wrong name, or graph not indexed | 1. Verify the exact symbol name using repo_search(symbol="..."). 2. Check spelling and case sensitivity. 3. If the graph is unavailable: use repo_search instead. 4. For imported symbols: search with the full module path. |
    | "Graph unavailable / not ready" | Memgraph backend not initialized (graph_query only) | Fall back to symbol_graph (always available). graph_query requires a SaaS/Enterprise plan with Neo4j/Memgraph. |
    | "Depth too high" | depth parameter exceeds the max for this tool | Reduce depth: symbol_graph max 3, graph_query max 5+. For deeper chains, use multiple queries with results as input. |
    | "Timeout during graph traversal" | Graph query took too long | Reduce depth, reduce limit, or use a smaller under path filter. |

    Silent Failure Watches:

    • No callers found when method is clearly called → fuzzy fallback may have triggered, use repo_search to verify method exists
    • Symbol seems undefined but code uses it → cross-module imports may not be resolved in graph yet

    Context Answer (LLM-powered explanation)

    | Error | Cause | Recovery Steps |
    | --- | --- | --- |
    | "Insufficient context" | Retrieved code wasn't enough to answer the question | 1. Rephrase the question more specifically. 2. Use expand: true to generate query variations. 3. Increase budget_tokens for deeper retrieval. 4. Use repo_search first to verify the code exists. |
    | "Timeout during retrieval or generation" | LLM generation or retrieval took >60s | Set rerank_enabled: false to skip reranking, reduce budget_tokens for faster shallow retrieval, or ask a simpler question requiring less context. |
    | "Budget exceeded" | Generated answer would use >budget_tokens | Increase budget_tokens or ask a more focused question. |

    Memory Tools (memory_store, memory_find)

    | Error | Cause | Recovery Steps |
    | --- | --- | --- |
    | "Memory not found" | No memories match the query or metadata filters | 1. Broaden the query (more general terms). 2. Remove metadata filters (topic, kind, priority_min). 3. Check if memories exist: memory_find(query="*"). 4. Verify workspace/collection (memories are org-scoped). |
    | "Storage failure" | Backend couldn't persist the memory | Retry memory_store (likely transient); check qdrant_status for cluster health. |
    | "Duplicate memory detected" (warning) | Similar memory already exists with higher priority | Review existing memories first: memory_find(query="...", topic="..."); consolidate if it is the same information. |

    Batch Tools (batch_search, batch_symbol_graph)

    | Error | Cause | Recovery Steps |
    | --- | --- | --- |
    | "Too many searches" | searches / queries array > 10 items | Split into multiple batch calls (max 10 per call). If independent: use sequential calls (lower token savings but more granular). If dependent: they must be sequential anyway. |
    | "Mixed error in batch" | Some queries succeeded, others failed | Check the individual batch_results array for per-query ok: false. Failed queries return error details in batch_results[i].error; successful queries still have results in batch_results[i].results. |
    | "Timeout on any query" | One query in the batch timed out | Set rerank_enabled: false in that query's override, reduce limit for slow queries, or consider running that query separately. |
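    A per-query override for a slow batch entry might look like this (query strings are illustrative, and this assumes each entry in the searches array accepts the same parameters as repo_search):

    // batch_search — disable reranking only for the slow query
    {"searches": [
      {"query": "auth middleware"},
      {"query": "token refresh", "rerank_enabled": false}
    ]}
    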

    Cross-Repo & Discovery

    | Error | Cause | Recovery Steps |
    | --- | --- | --- |
    | "No collections found" | No indexed repositories available, or discover mode="never" | 1. Run qdrant_list() manually to see available collections. 2. Try discover: "always" in cross_repo_search. 3. Verify the workspace is indexed. 4. If nothing is indexed: use upload_service to index the workspace. |
    | "Multiple ambiguous collections" | Query matched multiple repos but the target is unclear | Use target_repos: [...] to explicitly specify repos, use boundary_key to search with the exact interface name, or do two separate targeted searches. |
    | "Boundary key not found" | boundary_key doesn't exist in the other repo | Verify boundary_key is the exact string (routes, event names, type names). It may be named slightly differently in the other repo (check similar names), or try a broader search instead of boundary tracing. |

    Logging and Diagnostics

    When errors persist:

    1. Check cluster health: qdrant_status() shows point counts, last indexed time, scanned_points
    2. List available collections: qdrant_list() with include_status=true shows health per collection
    3. Check embedding stats: embedding_pipeline_stats() shows cache hit rate, dedup efficiency
    4. Verify auth: If authentication errors, check workspace/org identity matches request
    5. Review recent changes: If started failing recently, check change_history_for_path() for relevant commits

    Multi-Repo Navigation (CRITICAL)

    When multiple repositories are indexed, you MUST discover and explicitly target collections.

    Discovery (Lazy — only when needed)

    Don't discover at every session start. Trigger when: search returns no/irrelevant results, user asks a cross-repo question, or you're unsure which collection to target.

    // qdrant_list — discover available collections
    {}
    

    Context Switching (Session Defaults = cd)

    Treat set_session_defaults like cd — it scopes ALL subsequent searches:

    // "cd" into backend repo — all searches now target this collection
    // set_session_defaults
    {"collection": "backend-api-abc123"}
    
    // One-off peek at another repo (does NOT change session default)
    // search (or repo_search)
    {"query": "login form", "collection": "frontend-app-def456"}
    

    For unified collections: use "repo": "*" or "repo": ["frontend", "backend"]

    Cross-Repo Flow Tracing (Boundary-Driven)

    NEVER search both repos with the same vague query. Find the interface boundary in Repo A, extract the hard key, then search Repo B with that specific key.

    Pattern 1 — Interface Handshake (API/RPC):

    // 1. Find client call in frontend
    // search
    {"query": "login API call", "collection": "frontend-col"}
    // → Found: axios.post('/auth/v1/login', ...)
    
    // 2. Search backend for that exact route
    // search
    {"query": "'/auth/v1/login'", "collection": "backend-col"}
    

    Pattern 2 — Shared Contract (Types/Schemas):

    // 1. Find type usage in consumer
    // symbol_graph
    {"symbol": "UserProfile", "query_type": "importers", "collection": "frontend-col"}
    
    // 2. Find definition in source
    // search
    {"query": "interface UserProfile", "collection": "shared-lib-col"}
    

    Pattern 3 — Event Relay (Pub/Sub):

    // 1. Find producer → extract event name
    // search
    {"query": "publish event", "collection": "service-a-col"}
    // → Found: bus.publish("USER_CREATED", payload)
    
    // 2. Find consumer with exact event name
    // search
    {"query": "'USER_CREATED'", "collection": "service-b-col"}
    

    Automated Cross-Repo Search (PRIMARY for Multi-Repo)

    cross_repo_search is the PRIMARY tool for multi-repo scenarios. Use it BEFORE manual qdrant_list + repo_search chains.

    Discovery Modes:

    | Mode | Behavior | When to Use |
    | --- | --- | --- |
    | "auto" (default) | Discovers only if results are empty or nothing is targeted | Normal usage |
    | "always" | Always runs discovery before search | First search in a session, exploring a new codebase |
    | "never" | Skips discovery, uses the explicit collection | When you know the exact collection, speed-critical |
    
    // Search across all repos at once (auto-discovers collections)
    // cross_repo_search
    {"query": "authentication flow", "discover": "auto"}
    
    // Target specific repos by name
    // cross_repo_search
    {"query": "login handler", "target_repos": ["frontend", "backend"]}
    
    // Boundary tracing — auto-extracts routes/events/types from results
    // cross_repo_search
    {"query": "login submit", "trace_boundary": true}
    // → Returns boundary_keys: ["/api/auth/login"] + trace_hint for next search
    
    // Follow boundary key to another repo
    // cross_repo_search
    {"boundary_key": "/api/auth/login", "collection": "backend-col"}
    

    Use cross_repo_search when you need breadth across repos. Use search (or repo_search) with explicit collection when you need depth in one repo.

    Multi-Repo Anti-Patterns

    • DON'T search both repos with the same vague query (noisy, confusing)
    • DON'T assume the default collection is correct — verify with qdrant_list
    • DON'T forget to "cd back" after cross-referencing another repo
    • DO extract exact strings (route paths, event names, type names) as search anchors

    Query Expansion

    expand_query - Generate query variations for better recall:

    {"query": "auth flow", "max_new": 2}
    

    Output Formats

    • json (default) - Structured output
    • toon - Token-efficient compressed format

    Set via output_format parameter.
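    For example, requesting the compressed format on an exploratory search:

    // search — TOON output for token-efficient discovery
    {"query": "auth flow", "output_format": "toon"}
    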

    Tool Aliases and Compatibility

    Tool Aliases

    These tools have alternate names that work identically:

    | Primary Name | Alias(es) | When to Use | Note |
    | --- | --- | --- | --- |
    | repo_search | code_search | Either name works identically | Both names are equivalent; use whichever is familiar |
    | memory_store | (none) | Standard name | Part of the memory server, no aliases |
    | memory_find | (none) | Standard name | Part of the memory server, no aliases |
    | search | (none) | Standard name | Auto-routing search, no aliases |
    | symbol_graph | (none) | Standard name | Direct symbol queries, no aliases |

    Compatibility Wrappers

    These wrappers provide backward compatibility for legacy clients by accepting alternate parameter names:

    | Wrapper | Primary Tool | Alternate Parameter Names | When to Use |
    | --- | --- | --- | --- |
    | repo_search_compat | repo_search | Accepts q, text (instead of query) and top_k (instead of limit) | Legacy clients that don't support standard parameter names |
    | context_answer_compat | context_answer | Accepts q, text (instead of query) | Legacy clients using old parameter names |
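    An equivalent legacy-style call, using only the alternate parameter names documented above (the query string is illustrative):

    // repo_search_compat — legacy parameter names (q, top_k)
    {"q": "token validation", "top_k": 5}
    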

    Preference: Use primary tools and standard parameter names whenever possible. Compat wrappers exist only for legacy client support and may have slower adoption of new features.

    Cross-Server Tools

    These tools are provided by separate MCP servers:

    | Tool | Server | Purpose |
    | --- | --- | --- |
    | memory_store | Memory server | Persist team knowledge for later retrieval |
    | memory_find | Memory server | Search stored memories by similarity |
    | All search/symbol tools | Context server | Primary code search and analysis |

    All tools are transparently integrated into the unified search interface.

    Best Practices

    1. Use search as your default tool - It auto-routes to the best specialized tool. Only use specific tools when you need precise control or features search doesn't handle (cross-repo, memory, admin).
    2. Prefer MCP over Read/grep for exploration - Use MCP tools (search, repo_search, symbol_graph, context_answer) for discovery and cross-file understanding. Narrow file/grep use is still fine for exact literal confirmation, exact path/line confirmation, or opening a file you already identified for editing.
    3. Use symbol_graph first for symbol relationships - It handles callers, callees, definitions, importers, subclasses, and base classes. Use graph_query only when available and you need deeper impact/dependency traversal.
    4. Start broad, then filter - Begin with search or a semantic query, add filters if too many results
    5. Use multi-query - Pass 2-3 query variations for better recall on complex searches
    6. Include snippets - Set include_snippet: true to see code context in results
    7. Store decisions - Use memory_store to save architectural decisions and context for later
    8. Check index health - Run qdrant_status if searches return unexpected results
    9. Use pattern_search for structural matching - When looking for code with similar control flow (retry loops, error handling), use pattern_search instead of repo_search (if enabled)
    10. Describe patterns in natural language - pattern_search understands "retry with backoff" just as well as actual code examples (if enabled)
    11. Fire independent searches in parallel - Call multiple search, repo_search, symbol_graph, etc. in the same message block for 2-3x speedup. Alternatively, use batch_search to run N repo_search calls in a single invocation with ~75% token savings
    12. Use TOON format for discovery - Set output_format: "toon" for 60-80% token reduction on exploratory queries
    13. Bootstrap sessions with defaults - Call set_session_defaults(output_format="toon", compact=true) early to avoid repeating params
    14. Two-phase search - Discovery first (limit=3, compact=true), then deep dive (limit=5-8, include_snippet=true) on targets
    15. Use fallback chains - If context_answer times out, fall back to search or repo_search + info_request(include_explanation=true)
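    The two-phase pattern from practice 14, sketched as two calls (the query and path filter are illustrative):

    // Phase 1: cheap discovery pass
    // search
    {"query": "rate limiting", "limit": 3, "compact": true}
    // → Identify the most promising file or symbol
    
    // Phase 2: deep dive on the target
    // search
    {"query": "rate limiting", "limit": 5, "include_snippet": true, "under": "src/middleware"}
    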

    Return Shapes Reference

    Every tool returns a consistent envelope. Understanding the response structure helps you parse results correctly and detect errors.

    Universal Response Envelope

    All tools return, at minimum:

    {
      "ok": boolean,
      "error": "string (only if ok=false)"
    }
    
    • ok: true = success (may have zero results but no error)
    • ok: false = error (details in error field)

    Search Family Return Shape

    Applies to: search, repo_search, batch_search, info_request, context_search

    {
      "ok": true,
      "results": [
        {
          "score": 0.85,                  // Relevance score (0-1+, higher=better)
          "path": "src/auth.py",          // File path
          "symbol": "authenticate",        // Symbol name (optional)
          "start_line": 42,               // Start line number
          "end_line": 67,                 // End line number
          "snippet": "def authenticate...",// Code snippet (if include_snippet=true)
          "language": "python"            // Programming language
        }
      ],
      "total": 5,                         // Total results found
      "used_rerank": true,                // Whether reranking was applied
      "execution_time_ms": 245            // Query execution time
    }
    

    Symbol Graph Return Shape

    Applies to: symbol_graph, batch_symbol_graph, graph_query, batch_graph_query

    {
      "ok": true,
      "results": [
        {
          "path": "src/api/handlers.py",  // File path
          "start_line": 142,              // Start line
          "end_line": 145,                // End line
          "symbol": "handle_login",       // Symbol at this location
          "symbol_path": "handlers.handle_login", // Qualified symbol
          "language": "python",           // Programming language
          "snippet": "result = authenticate(username, password)", // Code snippet
          "hop": 1                        // For depth>1: which hop found this
        }
      ],
      "symbol": "authenticate",           // Symbol queried
      "query_type": "callers",            // Type of query
      "count": 12,                        // Total results
      "depth": 1,                         // Traversal depth used
      "used_graph": true,                 // Whether graph backend was used
      "suggestions": [...]                // Fuzzy matches if exact symbol not found
    }
    

    Unified Search Return Shape

    Applies to: search (auto-routing wrapper)

    {
      "ok": true,
      "intent": "search",                 // Detected intent (search, qa, tests, config, symbols, etc.)
      "confidence": 0.92,                 // Intent detection confidence (0-1)
      "tool": "repo_search",              // Tool used for routing
      "result": {                         // Result from dispatched tool
        "results": [...],
        "total": 8,
        "used_rerank": true,
        "execution_time_ms": 245
      },
      "plan": ["detect_intent", "dispatch_repo_search"], // Steps taken
      "execution_time_ms": 245            // Total time
    }
    

    Context Answer Return Shape

    Applies to: context_answer

    {
      "ok": true,
      "answer": "The authentication system validates tokens by first checking the JWT signature using the secret from config [1], then verifying expiration time [2]...", // LLM-generated answer with citations [1], [2]...
      "citations": [
        {
          "id": 1,                        // Citation number
          "path": "src/auth/jwt.py",      // File path
          "start_line": 45,               // Start line
          "end_line": 52,                 // End line
          "snippet": "def verify_token(token):..." // Optional code snippet
        }
        // ... more citations
      ],
      "query": ["How does authentication validate tokens"], // Original query
      "used": {
        "spans": 5,                       // Code spans retrieved
        "tokens": 1842                    // Tokens used for answer
      }
    }
    

    Memory Tools Return Shape

    Applies to: memory_store, memory_find

    {
      "ok": true,
      "id": "abc123...",                  // Unique ID (memory_store only)
      "message": "Successfully stored information", // Status message
      "collection": "codebase",           // Collection name
      "vector": "bge-base-en-v1-5"        // Embedding model used
    }
    

    memory_find results:

    {
      "ok": true,
      "results": [
        {
          "id": "abc123...",              // Memory ID
          "information": "JWT tokens expire after 24h...", // Stored knowledge
          "metadata": {                   // Structured metadata
            "kind": "decision",
            "topic": "auth",
            "created_at": "2024-01-15T10:30:00Z",
            "tags": ["security", "architecture"]
          },
          "score": 0.85,                  // Similarity score
          "highlights": [...]             // Query term matches in context
        }
      ],
      "total": 3,
      "count": 3,
      "query": "authentication decisions"
    }
    

    Error Response Shape

    All tools on error:

    {
      "ok": false,
      "error": "Collection not found",    // Error message
      "error_code": "COLLECTION_NOT_FOUND" // Optional error code
    }
    

    Errors may also surface as HTTP-level status codes (504, 400, etc.) with a structured response body.

    Batch Tool Return Shape

    Applies to: batch_search, batch_symbol_graph, batch_graph_query

    {
      "ok": true,
      "batch_results": [
        { /* result from search/query 0 */ },
        { /* result from search/query 1 */ },
        { /* result from search/query 2 */ }
      ],
      "count": 3,                         // Number of results
      "elapsed_ms": 123.4                 // Total execution time
    }
    

    Each item in batch_results has the same schema as the individual tool (repo_search, symbol_graph, etc.).

    Admin Tools Return Shape

    Applies to: qdrant_status, qdrant_list, set_session_defaults

    {
      "ok": true,
      "collections": [
        {
          "name": "frontend-abc123",
          "count": 1234,                  // Point count
          "last_ingested_at": {
            "unix": 1704067800,
            "iso": "2024-01-01T10:30:00Z"
          }
        }
      ]
    }
    
