Smithery Logo
MCPsSkillsDocsPricing
Login
Smithery Logo

Give agents more agency

Resources

DocumentationPrivacy PolicySystem Status

Company

PricingAboutBlog

Connect

© 2026 Smithery. All rights reserved.

    siti34

    ai-dev-guidelines

    siti34/ai-dev-guidelines
    AI & ML

    About

    SKILL.md

    Install

    Install via Skills CLI

    or add to your agent
    • Claude Code
      Claude Code
    • Codex
      Codex
    • OpenClaw
      OpenClaw
    • Cursor
      Cursor
    • Amp
      Amp
    • GitHub Copilot
      GitHub Copilot
    • Gemini CLI
      Gemini CLI
    • Kilo Code
      Kilo Code
    • Junie
      Junie
    • Replit
      Replit
    • Windsurf
      Windsurf
    • Cline
      Cline
    • Continue
      Continue
    • OpenCode
      OpenCode
    • OpenHands
      OpenHands
    • Roo Code
      Roo Code
    • Augment
      Augment
    • Goose
      Goose
    • Trae
      Trae
    • Zencoder
      Zencoder
    • Antigravity
      Antigravity
    ├─
    ├─
    └─

    About

    Comprehensive AI/ML development guide for LangChain, LangGraph, and ML model integration in FastAPI...

    SKILL.md

    AI/ML Development Guidelines (LangChain/LangGraph/FastAPI)

    Purpose

    Establish best practices for integrating AI/ML capabilities into FastAPI applications, with focus on LangChain, LangGraph, and ML model deployment.

    When to Use This Skill

    Automatically activates when working on:

    • LangChain chains, agents, tools
    • LangGraph workflows and state machines
    • RAG (Retrieval Augmented Generation)
    • Prompt engineering and templates
    • Vector stores and embeddings
    • ML model integration (sentiment analysis, aspect-based analysis, etc.)
    • Streaming LLM responses
    • AI service layers
    • Model deployment and optimization

    Quick Start

    New AI Feature Checklist

    • Model Selection: Choose appropriate model/API (OpenAI, Anthropic, local)
    • Prompt Design: Create prompt templates
    • Chain/Graph: Build LangChain chain or LangGraph workflow
    • API Endpoint: FastAPI route with streaming support
    • Error Handling: Retry logic, fallbacks, timeouts
    • Validation: Input/output validation with Pydantic
    • Monitoring: Log tokens, latency, errors
    • Testing: Unit tests with mocked LLM responses
    • Documentation: Document prompts and expected behavior

    Architecture for AI Features

    Layered AI Architecture

    HTTP Request
        ↓
    FastAPI Route (streaming setup)
        ↓
    AI Service (chain/graph orchestration)
        ↓
    LangChain/LangGraph (LLM calls)
        ↓
    Model/API (OpenAI, Anthropic, HuggingFace, local)
    

    Current Project ML Structure:

    backend/app/
    ├── routes/
    │   └── ML_Routes.py        # ML API endpoints
    ├── machine_learning/
    │   └── mlendpoint.py       # Sentiment analysis functions
    └── services/               # TO BE CREATED for LangChain/LangGraph
        └── ai_services/
            ├── sentiment.py    # Refactored sentiment analysis
            ├── chains.py       # LangChain chains
            ├── graphs.py       # LangGraph workflows
            └── prompts.py      # Prompt templates
    

    Package Management

    This project uses uv for Python package management.

    # Add LangChain dependencies
    uv add langchain langchain-openai langchain-anthropic langgraph
    
    # Add vector store dependencies (when needed)
    uv add faiss-cpu chromadb
    
    # Add other AI dependencies
    uv add tiktoken sentence-transformers
    
    # Run Python scripts with uv
    uv run python script.py
    

    ❌ NEVER use pip install - Always use uv add instead.


    Core Principles for AI Development

    1. Separate Prompts from Code

    # ❌ NEVER: Hardcoded prompts in code
    def analyze_text(text: str):
        response = llm.invoke(f"Analyze sentiment of: {text}")
        return response
    
    # ✅ ALWAYS: Templated prompts
    from langchain.prompts import ChatPromptTemplate
    
    SENTIMENT_PROMPT = ChatPromptTemplate.from_messages([
        ("system", "You are a sentiment analysis expert."),
        ("user", "Analyze the sentiment of the following text:\n\n{text}\n\nProvide: sentiment (positive/neutral/negative) and confidence (0-1).")
    ])
    
    def analyze_text(text: str):
        chain = SENTIMENT_PROMPT | llm | output_parser
        return chain.invoke({"text": text})
    

    2. Use Pydantic for LLM Output Validation

    from pydantic import BaseModel, Field
    from langchain.output_parsers import PydanticOutputParser
    
    class SentimentResult(BaseModel):
        sentiment: str = Field(description="positive, neutral, or negative")
        confidence: float = Field(ge=0.0, le=1.0)
        reasoning: str = Field(description="Brief explanation")
    
    # Create parser
    parser = PydanticOutputParser(pydantic_object=SentimentResult)
    
    # Add format instructions to prompt
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a sentiment expert."),
        ("user", "{text}\n\n{format_instructions}")
    ])
    
    # Use in chain
    chain = prompt | llm | parser
    
    result: SentimentResult = chain.invoke({
        "text": "I love this product!",
        "format_instructions": parser.get_format_instructions()
    })
    

    3. Handle Streaming for Better UX

    from fastapi import FastAPI
    from fastapi.responses import StreamingResponse
    from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
    
    # Setup streaming LLM
    from langchain_openai import ChatOpenAI
    
    llm = ChatOpenAI(
        model="gpt-4",
        streaming=True,
        callbacks=[StreamingStdOutCallbackHandler()]
    )
    
    @app.post("/ai/generate-stream")
    async def generate_stream(request: GenerateRequest):
        """Stream LLM responses for real-time feedback"""
    
        async def event_generator():
            async for chunk in chain.astream({"input": request.text}):
                if chunk:
                    yield f"data: {chunk}\n\n"
    
        return StreamingResponse(
            event_generator(),
            media_type="text/event-stream"
        )
    

    4. Implement Retry Logic and Fallbacks

    from tenacity import retry, stop_after_attempt, wait_exponential
    from langchain.chat_models import ChatOpenAI, ChatAnthropic
    
    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10)
    )
    async def call_llm_with_retry(prompt: str):
        try:
            # Try primary model
            response = await primary_llm.ainvoke(prompt)
            return response
        except Exception as e:
            # Log and fallback
            logger.error(f"Primary LLM failed: {e}")
            # Try fallback model
            response = await fallback_llm.ainvoke(prompt)
            return response
    

    5. Use Environment Variables for API Keys

    # ❌ NEVER: Hardcoded API keys
    llm = ChatOpenAI(api_key="sk-...")
    
    # ✅ ALWAYS: From environment
    import os
    from pydantic_settings import BaseSettings
    
    class AISettings(BaseSettings):
        openai_api_key: str
        anthropic_api_key: str
        model_name: str = "gpt-4"
        max_tokens: int = 1000
        temperature: float = 0.7
    
        class Config:
            env_file = ".env"
    
    settings = AISettings()
    
    llm = ChatOpenAI(
        api_key=settings.openai_api_key,
        model=settings.model_name,
        max_tokens=settings.max_tokens
    )
    

    6. Track Token Usage and Costs

    from langchain.callbacks import get_openai_callback
    
    @app.post("/ai/analyze")
    async def analyze_with_tracking(text: str):
        with get_openai_callback() as cb:
            result = chain.invoke({"text": text})
    
            # Log metrics
            logger.info(f"""
            Tokens used: {cb.total_tokens}
            Prompt tokens: {cb.prompt_tokens}
            Completion tokens: {cb.completion_tokens}
            Total cost: ${cb.total_cost}
            """)
    
        return {
            "result": result,
            "metadata": {
                "tokens": cb.total_tokens,
                "cost": cb.total_cost
            }
        }
    

    7. Implement Proper Error Boundaries

    from fastapi import HTTPException
    from langchain.schema import LLMResult
    from openai.error import RateLimitError, APIError
    
    @app.post("/ai/process")
    async def process_with_ai(request: AIRequest):
        try:
            result = await ai_service.process(request.text)
            return result
    
        except RateLimitError:
            raise HTTPException(
                status_code=429,
                detail="Rate limit exceeded. Please try again later."
            )
    
        except APIError as e:
            logger.error(f"LLM API error: {e}")
            raise HTTPException(
                status_code=503,
                detail="AI service temporarily unavailable"
            )
    
        except Exception as e:
            logger.error(f"Unexpected error in AI processing: {e}")
            raise HTTPException(
                status_code=500,
                detail="Error processing request"
            )
    

    LangChain Patterns

    Basic Chain Pattern

    from langchain_openai import ChatOpenAI
    from langchain.prompts import ChatPromptTemplate
    from langchain.output_parsers import StrOutputParser
    
    # Components
    llm = ChatOpenAI(model="gpt-4")
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a helpful assistant."),
        ("user", "{input}")
    ])
    output_parser = StrOutputParser()
    
    # Chain
    chain = prompt | llm | output_parser
    
    # Invoke
    result = chain.invoke({"input": "Hello!"})
    

    Chain with Memory

    from langchain.memory import ConversationBufferMemory
    from langchain.chains import ConversationChain
    
    memory = ConversationBufferMemory()
    
    conversation = ConversationChain(
        llm=llm,
        memory=memory,
        verbose=True
    )
    
    # First interaction
    response1 = conversation.predict(input="Hi, I'm Aaron")
    
    # Memory persists
    response2 = conversation.predict(input="What's my name?")
    # Will respond: "Your name is Aaron"
    

    RAG Chain Pattern

    from langchain.vectorstores import FAISS
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.chains import RetrievalQA
    
    # 1. Prepare documents
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200
    )
    texts = text_splitter.split_documents(documents)
    
    # 2. Create vector store
    embeddings = OpenAIEmbeddings()
    vectorstore = FAISS.from_documents(texts, embeddings)
    
    # 3. Create retrieval chain
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=vectorstore.as_retriever(search_kwargs={"k": 3})
    )
    
    # 4. Query
    result = qa_chain.invoke({"query": "What are the main findings?"})
    

    Tool-Using Agent

    from langchain.agents import create_openai_functions_agent, AgentExecutor
    from langchain.tools import Tool
    
    # Define tools
    def search_projects(query: str) -> str:
        """Search projects in database"""
        # Your search logic
        return f"Found 3 projects matching {query}"
    
    def get_sentiment(text: str) -> str:
        """Analyze sentiment of text"""
        # Your sentiment analysis
        return "positive"
    
    tools = [
        Tool(
            name="search_projects",
            func=search_projects,
            description="Search for urban planning projects by name or city"
        ),
        Tool(
            name="analyze_sentiment",
            func=get_sentiment,
            description="Analyze sentiment of text (positive/neutral/negative)"
        )
    ]
    
    # Create agent
    agent = create_openai_functions_agent(llm, tools, prompt)
    agent_executor = AgentExecutor(agent=agent, tools=tools)
    
    # Execute
    result = agent_executor.invoke({
        "input": "Find projects in Mexico City and analyze sentiment"
    })
    

    LangGraph Patterns

    Simple State Machine

    from langgraph.graph import StateGraph, END
    from typing import TypedDict, Annotated
    from operator import add
    
    # Define state
    class AgentState(TypedDict):
        input: str
        analysis: str
        sentiment: str
        final_output: str
    
    # Define nodes
    def analyze_node(state: AgentState):
        # Perform analysis
        analysis = llm.invoke(f"Analyze: {state['input']}")
        return {"analysis": analysis}
    
    def sentiment_node(state: AgentState):
        # Extract sentiment
        sentiment = extract_sentiment(state['analysis'])
        return {"sentiment": sentiment}
    
    def format_node(state: AgentState):
        # Format output
        output = f"Analysis: {state['analysis']}\nSentiment: {state['sentiment']}"
        return {"final_output": output}
    
    # Build graph
    workflow = StateGraph(AgentState)
    
    workflow.add_node("analyze", analyze_node)
    workflow.add_node("sentiment", sentiment_node)
    workflow.add_node("format", format_node)
    
    workflow.set_entry_point("analyze")
    workflow.add_edge("analyze", "sentiment")
    workflow.add_edge("sentiment", "format")
    workflow.add_edge("format", END)
    
    app = workflow.compile()
    
    # Run
    result = app.invoke({"input": "This project is amazing!"})
    

    Conditional Routing

    def route_by_sentiment(state: AgentState):
        """Route based on sentiment"""
        if state["sentiment"] == "negative":
            return "handle_negative"
        elif state["sentiment"] == "positive":
            return "handle_positive"
        else:
            return "handle_neutral"
    
    # Add conditional edges
    workflow.add_conditional_edges(
        "sentiment",
        route_by_sentiment,
        {
            "handle_negative": "negative_handler",
            "handle_positive": "positive_handler",
            "handle_neutral": "neutral_handler"
        }
    )
    

    Detailed Guides

    LangChain Patterns

    • Chain composition
    • Memory and context
    • Tools and agents
    • RAG implementation

    LangGraph Workflows

    • State machines
    • Conditional routing
    • Multi-agent systems
    • Complex orchestration

    Prompt Engineering

    • Effective prompt design
    • Few-shot learning
    • Chain-of-thought prompting
    • Prompt templates

    Model Deployment

    • Local model serving
    • API integration
    • Optimization and caching
    • Cost management

    Testing AI Systems

    • Unit testing with mocks
    • Integration testing
    • Prompt testing
    • Evaluation metrics

    Quick Reference

    Common LangChain Imports

    from langchain_openai import ChatOpenAI, OpenAI
    from langchain_anthropic import ChatAnthropic
    from langchain.prompts import ChatPromptTemplate, PromptTemplate
    from langchain.output_parsers import PydanticOutputParser, StrOutputParser
    from langchain.chains import LLMChain, SequentialChain
    from langchain.memory import ConversationBufferMemory
    from langchain.agents import create_openai_functions_agent, AgentExecutor
    from langchain.tools import Tool
    from langchain.callbacks import get_openai_callback
    

    Common LangGraph Imports

    from langgraph.graph import StateGraph, END
    from typing import TypedDict, Annotated
    from operator import add
    

    FastAPI + Streaming

    from fastapi.responses import StreamingResponse
    
    async def stream_response():
        async for chunk in chain.astream(input):
            yield f"data: {chunk}\n\n"
    
    return StreamingResponse(stream_response(), media_type="text/event-stream")
    

    Resources

    • LangChain Docs
    • LangGraph Docs
    • OpenAI Cookbook
    • Anthropic Docs
    • Prompt Engineering Guide

    Remember: AI features require careful prompt design, error handling, and monitoring. Always validate LLM outputs, implement retry logic, and track costs!

    Recommended Servers
    ScrapeGraph AI Integration Server
    ScrapeGraph AI Integration Server
    InfraNodus Knowledge Graphs & Text Analysis
    InfraNodus Knowledge Graphs & Text Analysis
    GENESIS ProofRelay MCP Verifier
    GENESIS ProofRelay MCP Verifier
    Repository
    siti34/spds-ura
    Files