    llama-farm/rag-skills
    About

    RAG-specific best practices for LlamaIndex, ChromaDB, and Celery workers. Covers ingestion, retrieval, embeddings, and performance.

    SKILL.md

    RAG Skills for LlamaFarm

    Framework-specific patterns and code review checklists for the RAG component.

    Extends: python-skills - All Python best practices apply here.

    Component Overview

    Aspect               Technology               Version
    Python               Python                   3.11+
    Document Processing  LlamaIndex               0.13+
    Vector Storage       ChromaDB                 1.0+
    Task Queue           Celery                   5.5+
    Embeddings           Universal/Ollama/OpenAI  Multiple

    Directory Structure

    rag/
    ├── api.py                 # Search and database APIs
    ├── celery_app.py          # Celery configuration
    ├── main.py                # Entry point
    ├── core/
    │   ├── base.py            # Document, Component, Pipeline ABCs
    │   ├── factories.py       # Component factories
    │   ├── ingest_handler.py  # File ingestion with safety checks
    │   ├── blob_processor.py  # Binary file processing
    │   ├── settings.py        # Pydantic settings
    │   └── logging.py         # RAGStructLogger
    ├── components/
    │   ├── embedders/         # Embedding providers
    │   ├── extractors/        # Metadata extractors
    │   ├── parsers/           # Document parsers (LlamaIndex)
    │   ├── retrievers/        # Retrieval strategies
    │   └── stores/            # Vector stores (ChromaDB, FAISS)
    ├── tasks/                 # Celery tasks
    │   ├── ingest_tasks.py    # File ingestion
    │   ├── search_tasks.py    # Database search
    │   ├── query_tasks.py     # Complex queries
    │   ├── health_tasks.py    # Health checks
    │   └── stats_tasks.py     # Statistics
    └── utils/
        └── embedding_safety.py  # Circuit breaker, validation
    

    Quick Reference

    Topic        File            Key Points
    LlamaIndex   llamaindex.md   Document parsing, chunking, node conversion
    ChromaDB     chromadb.md     Collections, embeddings, distance metrics
    Celery       celery.md       Task routing, error handling, worker config
    Performance  performance.md  Batching, caching, deduplication

    Core Patterns

    Document Dataclass

    import uuid
    from dataclasses import dataclass, field
    from typing import Any
    
    @dataclass
    class Document:
        content: str  # raw text of the document or chunk
        metadata: dict[str, Any] = field(default_factory=dict)
        id: str = field(default_factory=lambda: str(uuid.uuid4()))  # unique per instance
        source: str | None = None
        embeddings: list[float] | None = None  # filled in after embedding
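
As a small illustration of why the dataclass uses `default_factory`, the sketch below repeats the `Document` definition so it runs on its own and shows that each instance gets its own metadata dict and its own UUID. The sample content and the `notes.txt` source are hypothetical values, not taken from the project.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Document:
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    source: str | None = None
    embeddings: list[float] | None = None


# Hypothetical sample values, for illustration only.
first = Document(content="LlamaFarm ingests files.", source="notes.txt")
second = Document(content="ChromaDB stores vectors.")

# default_factory builds a fresh dict and a fresh UUID per instance,
# so mutating one document's metadata never leaks into another.
first.metadata["page"] = 1
fields_are_independent = second.metadata == {} and first.id != second.id
```

A plain `metadata: dict = {}` default would be rejected by `dataclasses` precisely because it would be shared across instances.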
    

    Component Abstract Base Class

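    from abc import ABC, abstractmethod
    from pathlib import Path
    from typing import Any
    
    # RAGStructLogger (core/logging.py), Document (core/base.py), and
    # ProcessingResult come from the rag package itself.
    
    class Component(ABC):
        def __init__(
            self,
            name: str | None = None,
            config: dict[str, Any] | None = None,
            project_dir: Path | None = None,
        ):
            # Fall back to the class name and an empty config when not given
            self.name = name or self.__class__.__name__
            self.config = config or {}
            self.logger = RAGStructLogger(__name__).bind(name=self.name)
            self.project_dir = project_dir
    
        @abstractmethod
        def process(self, documents: list[Document]) -> ProcessingResult:
            pass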
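
To show how a concrete component might plug into this base class, here is a minimal sketch. `WordCountExtractor`, the `min_words` config key, and the fields of `ProcessingResult` are all hypothetical; the standard-library logger stands in for `RAGStructLogger`, and `Document` is trimmed to the two fields the example needs.

```python
from __future__ import annotations

import logging
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any


@dataclass
class Document:  # trimmed stand-in for the project's Document dataclass
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class ProcessingResult:  # hypothetical shape; the real class is not shown here
    documents: list[Document]
    errors: list[str] = field(default_factory=list)


class Component(ABC):
    def __init__(
        self,
        name: str | None = None,
        config: dict[str, Any] | None = None,
        project_dir: Path | None = None,
    ):
        self.name = name or self.__class__.__name__
        self.config = config or {}
        # Standard-library logger standing in for RAGStructLogger.
        self.logger = logging.getLogger(self.name)
        self.project_dir = project_dir

    @abstractmethod
    def process(self, documents: list[Document]) -> ProcessingResult:
        ...


class WordCountExtractor(Component):
    """Hypothetical extractor that stores a word count in each document's metadata."""

    def process(self, documents: list[Document]) -> ProcessingResult:
        min_words = self.config.get("min_words", 1)
        kept: list[Document] = []
        errors: list[str] = []
        for doc in documents:
            count = len(doc.content.split())
            if count < min_words:
                # Report the skip instead of silently dropping the document.
                errors.append(f"skipped document with {count} words")
                continue
            doc.metadata["word_count"] = count
            kept.append(doc)
        return ProcessingResult(documents=kept, errors=errors)


extractor = WordCountExtractor(config={"min_words": 2})
result = extractor.process([Document("hello world"), Document("hi")])
```

Because `process` is abstract, instantiating `Component` directly raises `TypeError`, which forces every component to define its own processing step.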
    

    Retrieval Strategy Pattern

    class RetrievalStrategy(Component, ABC):
        @abstractmethod
        def retrieve(
            self,
            query_embedding: list[float],
            vector_store,
            top_k: int = 5,
            **kwargs
        ) -> RetrievalResult:
            pass
    
        @abstractmethod
        def supports_vector_store(self, vector_store_type: str) -> bool:
            pass
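
One way to picture a concrete strategy is a brute-force cosine top-k search. The sketch below is an assumption-laden toy: `InMemoryStore`, `CosineTopK`, the `"memory"` store type, and the fields of `RetrievalResult` are hypothetical, and the `Component` base is omitted so the block runs standalone.

```python
from __future__ import annotations

import math
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class RetrievalResult:  # hypothetical shape; the real class is not shown here
    ids: list[str]
    scores: list[float]


class InMemoryStore:
    """Hypothetical stand-in for a vector store: maps document id to vector."""

    def __init__(self, vectors: dict[str, list[float]]):
        self.vectors = vectors


class RetrievalStrategy(ABC):  # Component base omitted to keep the sketch standalone
    @abstractmethod
    def retrieve(self, query_embedding, vector_store, top_k=5, **kwargs) -> RetrievalResult:
        ...

    @abstractmethod
    def supports_vector_store(self, vector_store_type: str) -> bool:
        ...


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # zero vectors match nothing


class CosineTopK(RetrievalStrategy):
    def retrieve(self, query_embedding, vector_store, top_k=5, **kwargs) -> RetrievalResult:
        # Score every stored vector, then keep the top_k highest scores.
        scored = sorted(
            ((cosine_similarity(query_embedding, vec), doc_id)
             for doc_id, vec in vector_store.vectors.items()),
            reverse=True,
        )[:top_k]
        return RetrievalResult(
            ids=[doc_id for _, doc_id in scored],
            scores=[score for score, _ in scored],
        )

    def supports_vector_store(self, vector_store_type: str) -> bool:
        return vector_store_type == "memory"


store = InMemoryStore({"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]})
hits = CosineTopK().retrieve([1.0, 0.0], store, top_k=2)
```

Checking `supports_vector_store` before calling `retrieve` lets a caller reject a strategy and store combination up front rather than failing mid-query.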
    

    Embedder with Circuit Breaker

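    class Embedder(Component):
        DEFAULT_FAILURE_THRESHOLD = 5
        DEFAULT_RESET_TIMEOUT = 60.0
    
        def __init__(self, *args, **kwargs):
            # Arguments are forwarded unchanged to Component.__init__,
            # which always sets self.config to a dict.
            super().__init__(*args, **kwargs)
            self._circuit_breaker = CircuitBreaker(
                failure_threshold=self.config.get(
                    "failure_threshold", self.DEFAULT_FAILURE_THRESHOLD
                ),
                reset_timeout=self.config.get(
                    "reset_timeout", self.DEFAULT_RESET_TIMEOUT
                ),
            )
            self._fail_fast = self.config.get("fail_fast", True)
    
        def embed_text(self, text: str) -> list[float]:
            # Check the breaker before calling the embedding provider
            self.check_circuit_breaker()
            try:
                embedding = self._call_embedding_api(text)
                self.record_success()
                return embedding
            except Exception as e:
                self.record_failure(e)
                if self._fail_fast:
                    raise EmbedderUnavailableError(str(e)) from e
                # Non-fail-fast mode: fall back to a zero vector of the expected dimension
                return [0.0] * self.get_embedding_dimension()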
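
The `CircuitBreaker` class itself is not shown in this file. The sketch below is one hypothetical way such a breaker could behave, assuming only the two constructor parameters used above; the `clock` parameter, the `is_open` property, and the method bodies are illustrative additions, and the project's real class in utils/embedding_safety.py may differ.

```python
from __future__ import annotations

import time
from typing import Callable


class CircuitBreaker:
    """Hypothetical sketch of a failure-count circuit breaker."""

    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 60.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: float | None = None

    @property
    def is_open(self) -> bool:
        if self._opened_at is None:
            return False
        # Once reset_timeout has elapsed, a trial call is allowed through.
        return self._clock() - self._opened_at < self.reset_timeout

    def record_success(self) -> None:
        # Any success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # Open (or re-open) the breaker and restart the timeout window.
            self._opened_at = self._clock()


# Drive the breaker with a fake clock so no real waiting is needed.
now = [0.0]
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])
breaker.record_failure()
open_after_one_failure = breaker.is_open
breaker.record_failure()
open_after_two_failures = breaker.is_open
now[0] = 11.0
open_after_timeout = breaker.is_open
```

Injecting the clock keeps the timeout logic testable; in production the default `time.monotonic` avoids jumps from wall-clock adjustments.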
    

    Review Checklist Summary

    When reviewing RAG code:

    1. LlamaIndex (Medium priority)

      • Proper chunking configuration
      • Metadata preservation during parsing
      • Error handling for unsupported formats
    2. ChromaDB (High priority)

      • Thread-safe client access
      • Proper distance metric selection
      • Metadata type compatibility
    3. Celery (High priority)

      • Task routing to correct queue
      • Error logging with context
      • Proper serialization
    4. Performance (Medium priority)

      • Batch processing for embeddings
      • Deduplication enabled
      • Appropriate caching
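
The batching and deduplication items under Performance can be sketched with the standard library alone. Everything here is hypothetical: `embed_all`, `fake_embed_batch`, and the choice of SHA-256 over whitespace-stripped text are assumptions for illustration, not the project's implementation.

```python
from __future__ import annotations

import hashlib
from typing import Callable, Iterator


def content_hash(text: str) -> str:
    # Strip surrounding whitespace so trivially different copies hash the same.
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()


def deduplicate(texts: list[str]) -> list[str]:
    """Keep the first occurrence of each distinct text, preserving order."""
    seen: set[str] = set()
    unique: list[str] = []
    for text in texts:
        digest = content_hash(text)
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


def batched(items: list[str], batch_size: int) -> Iterator[list[str]]:
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_all(
    texts: list[str],
    embed_batch: Callable[[list[str]], list[list[float]]],
    batch_size: int = 32,
) -> list[list[float]]:
    """Embed unique texts in batches; one provider call per batch."""
    vectors: list[list[float]] = []
    for batch in batched(deduplicate(texts), batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


calls: list[int] = []


def fake_embed_batch(batch: list[str]) -> list[list[float]]:
    # Stand-in for a real embedding provider; records each batch size.
    calls.append(len(batch))
    return [[float(len(text))] for text in batch]


vectors = embed_all(["a", "bb", "a ", "ccc", "bb"], fake_embed_batch, batch_size=2)
```

Note that the returned vectors line up with the deduplicated texts, not the original input, so a caller that needs per-input vectors would have to keep a mapping from hash to vector.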

    See individual topic files for detailed checklists with grep patterns.
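
For the "Metadata type compatibility" item, a common approach is to normalize metadata before it reaches the vector store. The sketch below assumes the store accepts only scalar values (strings, integers, floats, booleans), which matches ChromaDB's documented metadata types as far as this editor is aware; `sanitize_metadata` and its drop-`None` policy are hypothetical choices.

```python
from __future__ import annotations

import json
from typing import Any

# Assumption: the vector store accepts only these scalar metadata types.
SCALAR_TYPES = (str, int, float, bool)


def sanitize_metadata(metadata: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of metadata containing only scalar values."""
    clean: dict[str, Any] = {}
    for key, value in metadata.items():
        if value is None:
            continue  # drop missing values rather than store a placeholder
        if isinstance(value, SCALAR_TYPES):
            clean[key] = value
        else:
            # Lists, dicts, and other objects are stored as JSON strings.
            clean[key] = json.dumps(value, default=str, sort_keys=True)
    return clean


raw = {"page": 3, "tags": ["a", "b"], "author": None, "score": 0.5}
clean = sanitize_metadata(raw)
```

Serializing nested values to JSON keeps them retrievable, at the cost of not being filterable as structured data in the store.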

    Repository
    llama-farm/llamafarm