    ancoleman/using-graph-databases

    Graph database implementation for relationship-heavy data models. Use when building social networks, recommendation engines, knowledge graphs, or fraud detection...

    Graph Databases

    Purpose

    This skill guides the selection and implementation of graph databases for applications where relationships between entities are first-class citizens. Unlike relational databases, which model relationships through foreign keys and joins, graph databases store connections natively as relationships (edges), which makes traversal-heavy queries efficient.

    When to Use This Skill

    Use graph databases for:

    • Deep relationship traversals (4+ hops): "Friends of friends of friends"
    • Variable/evolving relationships: Schema changes don't break existing queries
    • Path finding: Shortest route, network analysis, dependency chains
    • Pattern matching: Fraud detection, recommendation engines, access control

    Do NOT use graph databases for:

    • Fixed schema with shallow joins (2-3 tables) → Use PostgreSQL
    • Primarily aggregations/analytics → Use columnar databases
    • Key-value lookups only → Use Redis/DynamoDB

    Quick Decision Framework

    DATA CHARACTERISTICS?
    ├── Fixed schema, shallow joins (≤3 hops)
    │   └─ PostgreSQL (relational)
    │
    ├── Already on PostgreSQL + simple graphs
    │   └─ Apache AGE (PostgreSQL extension)
    │
    ├── Deep traversals (4+ hops) + general purpose
    │   └─ Neo4j (battle-tested, largest ecosystem)
    │
    ├── Multi-model (documents + graph)
    │   └─ ArangoDB
    │
    ├── AWS-native, serverless
    │   └─ Amazon Neptune
    │
    └── Real-time streaming, in-memory
        └─ Memgraph
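
The decision tree above can be sketched as a small function. This is only an illustration: the `Workload` fields, the reading of "simple graphs" as three hops or fewer, and the order in which branches are checked are all assumptions, since the tree itself does not rank its branches.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; field names are illustrative."""
    max_hops: int                     # deepest traversal the queries need
    fixed_schema: bool = False
    on_postgres: bool = False
    needs_documents: bool = False     # multi-model: documents + graph
    aws_native: bool = False
    realtime_streaming: bool = False


def choose_database(w: Workload) -> str:
    """Check the branches top to bottom; the first match wins (assumed order)."""
    if w.fixed_schema and w.max_hops <= 3:
        return "PostgreSQL"
    if w.on_postgres and w.max_hops <= 3:
        return "Apache AGE"
    if w.needs_documents:
        return "ArangoDB"
    if w.aws_native:
        return "Amazon Neptune"
    if w.realtime_streaming:
        return "Memgraph"
    # Deep traversals with no special constraint: the general-purpose choice.
    return "Neo4j"


print(choose_database(Workload(max_hops=5)))
```

In practice several branches can apply at once (for example, AWS-native and multi-model), so treat the result as a starting point rather than a verdict.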
    

    Core Concepts

    Property Graph Model

    Graph databases store data as:

    • Nodes (vertices): Entities with labels and properties
    • Relationships (edges): Typed connections with properties
    • Properties: Key-value pairs on nodes and relationships
    (Person {name: "Alice", age: 28})-[:FRIEND {since: "2020-01-15"}]->(Person {name: "Bob"})
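
To make the three building blocks concrete, here is a minimal in-memory sketch of the same Alice/Bob example. The `Node` and `Relationship` classes are hypothetical and only mirror the model; they are not how any graph database stores data.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    labels: set[str]                                   # e.g. {"Person"}
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    type: str                                          # e.g. "FRIEND"
    start: Node                                        # relationships are directed
    end: Node
    properties: dict[str, Any] = field(default_factory=dict)


alice = Node({"Person"}, {"name": "Alice", "age": 28})
bob = Node({"Person"}, {"name": "Bob"})
friend = Relationship("FRIEND", alice, bob, {"since": "2020-01-15"})

print(friend.start.properties["name"], "->", friend.end.properties["name"])
```

Note that the relationship carries its own properties and holds direct references to both endpoints, which is what lets a traversal follow it without a join.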
    

    Query Languages

    | Language | Databases | Readability | Best For |
    |----------|-----------|-------------|----------|
    | Cypher | Neo4j, Memgraph, AGE | ⭐⭐⭐⭐⭐ SQL-like | General purpose |
    | Gremlin | Neptune, JanusGraph | ⭐⭐⭐ Functional | Cross-database |
    | AQL | ArangoDB | ⭐⭐⭐⭐ SQL-like | Multi-model |
    | SPARQL | Neptune, RDF stores | ⭐⭐⭐ W3C standard | Semantic web |

    Common Cypher Patterns

    See references/cypher-patterns.md for comprehensive examples.

    Pattern 1: Basic Matching

    // Find all users at a company
    MATCH (u:User)-[:WORKS_AT]->(c:Company {name: 'Acme Corp'})
    RETURN u.name, u.title
    

    Pattern 2: Variable-Length Paths

    // Find friends up to 3 degrees away
    MATCH (u:User {name: 'Alice'})-[:FRIEND*1..3]->(friend)
    WHERE u <> friend
    RETURN DISTINCT friend.name
    LIMIT 100
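
Conceptually, the set of distinct nodes returned by a `*1..3` pattern is everything within three outgoing hops of the start node. A minimal sketch of that idea over a plain adjacency dict (the `friends_within` helper and the sample data are hypothetical, and this is not how a database engine executes the query):

```python
from collections import deque


def friends_within(graph: dict[str, list[str]], start: str, max_hops: int) -> set[str]:
    """Return every node reachable from start in 1..max_hops outgoing hops."""
    seen = {start}
    found: set[str] = set()
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # depth bound reached; do not expand this node
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                found.add(neighbour)
                queue.append((neighbour, depth + 1))
    return found


graph = {"Alice": ["Bob"], "Bob": ["Carol"], "Carol": ["Dave"], "Dave": ["Erin"]}
print(sorted(friends_within(graph, "Alice", 3)))
```

Erin sits four hops from Alice, so the bound excludes her; raising `max_hops` widens the frontier, which is why the bound matters for cost.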
    

    Pattern 3: Shortest Path

    // Find shortest connection between two users
    // (the 15-hop cap is an arbitrary bound that keeps the search from running away)
    MATCH path = shortestPath(
      (a:User {name: 'Alice'})-[*..15]-(b:User {name: 'Bob'})
    )
    RETURN path, length(path) AS distance
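
The idea behind an unweighted shortest path is breadth-first search: expand outward one hop at a time and stop at the first arrival. A minimal sketch, assuming an undirected edge list; the `shortest_path` helper and sample data are hypothetical and say nothing about how Neo4j implements `shortestPath` internally.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Breadth-first search over undirected edges; returns a node list or None."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    parents = {source: None}          # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:   # walk parent links back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in adjacency.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


edges = [("Alice", "Carol"), ("Carol", "Bob"),
         ("Alice", "Dan"), ("Dan", "Eve"), ("Eve", "Bob")]
path = shortest_path(edges, "Alice", "Bob")
print(path, "distance:", len(path) - 1)
```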
    

    Pattern 4: Recommendations

    // Collaborative filtering: Products liked by similar users
    MATCH (u:User {id: $userId})-[:PURCHASED]->(p:Product)<-[:PURCHASED]-(similar)
    MATCH (similar)-[:PURCHASED]->(rec:Product)
    WHERE NOT exists((u)-[:PURCHASED]->(rec))
    RETURN rec.name, count(*) AS score
    ORDER BY score DESC
    LIMIT 10
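
The scoring in this query can be traced by hand with a small in-memory version. In this sketch, `purchases` maps a user ID to the set of products bought; the `recommend` helper and the sample data are hypothetical. Each candidate product scores one point per (shared product, similar user) pair, which is what `count(*)` counts in the query above.

```python
from collections import Counter


def recommend(purchases: dict[str, set[str]], user_id: str, limit: int = 10):
    """Rank products bought by users who share purchases with user_id."""
    mine = purchases.get(user_id, set())
    scores = Counter()
    for other, theirs in purchases.items():
        if other == user_id:
            continue
        shared = len(mine & theirs)       # products both users purchased
        if shared == 0:
            continue                      # not a "similar" user
        for product in theirs - mine:     # skip what the user already owns
            scores[product] += shared
    return scores.most_common(limit)


purchases = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},   # shares two products with u1
    "u3": {"a", "d"},        # shares one product with u1
    "u4": {"z", "c"},        # shares nothing, so it is ignored
}
print(recommend(purchases, "u1"))
```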
    

    Pattern 5: Fraud Detection

    // Detect circular money flows
    MATCH path = (a:Account)-[:SENT*3..6]->(a)
    WHERE all(r IN relationships(path) WHERE r.amount > 1000)
    RETURN path, [r IN relationships(path) | r.amount] AS amounts
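
To see what this pattern matches, here is a minimal sketch that enumerates the same cycles over a list of transfers. The `circular_flows` helper and the sample data are hypothetical. Like the Cypher pattern, it uses each transfer at most once per path and reports a cycle once per starting account.

```python
def circular_flows(transfers, min_len=3, max_len=6, min_amount=1000):
    """transfers: list of (sender, receiver, amount) tuples.

    Returns (start_account, amounts) for every cycle of min_len..max_len
    transfers in which each amount exceeds min_amount.
    """
    outgoing = {}
    for index, (sender, receiver, amount) in enumerate(transfers):
        if amount > min_amount:  # small transfers can never be part of a match
            outgoing.setdefault(sender, []).append((index, receiver, amount))

    cycles = []

    def walk(start, node, used, amounts):
        for index, receiver, amount in outgoing.get(node, []):
            if index in used:
                continue  # a transfer may appear only once per path
            next_amounts = amounts + [amount]
            if receiver == start and len(next_amounts) >= min_len:
                cycles.append((start, next_amounts))
            if len(next_amounts) < max_len:
                walk(start, receiver, used | {index}, next_amounts)

    for account in list(outgoing):
        walk(account, account, frozenset(), [])
    return cycles


transfers = [("A", "B", 5000), ("B", "C", 4000), ("C", "A", 3000), ("C", "D", 50)]
for start, amounts in circular_flows(transfers):
    print(start, amounts)
```

The upper bound on path length is what keeps this enumeration tractable; without it the number of candidate paths grows very quickly on dense graphs.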
    

    Database Selection Guide

    Neo4j (Primary Recommendation)

    Use for: General-purpose graph applications

    Strengths:

    • Most mature (2007), largest community (2M+ developers)
    • 65+ graph algorithms (GDS library): PageRank, Louvain, Dijkstra
    • Best tooling: Neo4j Browser, Bloom visualization
    • Comprehensive Cypher support

    Installation:

    # Python driver
    pip install neo4j
    
    # TypeScript driver
    npm install neo4j-driver
    
    # Rust driver
    cargo add neo4rs
    

    Reference: references/neo4j.md

    ArangoDB

    Use for: Multi-model applications (documents + graph)

    Strengths:

    • Store documents AND graph in one database
    • AQL combines document and graph queries
    • Schema flexibility with relationships

    Reference: references/arangodb.md

    Apache AGE

    Use for: Adding graph capabilities to existing PostgreSQL

    Strengths:

    • Extend PostgreSQL with graph queries
    • No new infrastructure needed
    • Query both relational and graph data

    Reference: Implementation details in examples/

    Amazon Neptune

    Use for: AWS-native, serverless deployments

    Strengths:

    • Fully managed, auto-scaling
    • Supports Gremlin, openCypher, and SPARQL
    • AWS ecosystem integration

    Graph Data Modeling Patterns

    See references/graph-modeling.md for comprehensive patterns.

    Best Practice 1: Relationships as First-Class Citizens

    Anti-pattern (storing relationships in node properties):

    // BAD
    (:Person {name: 'Alice', friend_ids: ['b123', 'c456']})
    

    Pattern (explicit relationships):

    // GOOD
    (:Person {name: 'Alice'})-[:FRIEND]->(:Person {id: 'b123'})
    (:Person {name: 'Alice'})-[:FRIEND]->(:Person {id: 'c456'})
    

    Best Practice 2: Relationship Properties for Metadata

    // Track interaction details on relationships
    (:Person)-[:FRIEND {
      since: '2020-01-15',
      strength: 0.85,
      last_interaction: datetime()
    }]->(:Person)
    

    Best Practice 3: Bounded Traversals for Performance

    // SLOW: Unbounded traversal
    MATCH (a)-[:FRIEND*]->(distant)
    RETURN distant
    
    // FAST: Bounded depth, starting from an indexed lookup
    MATCH (a:User {id: $userId})-[:FRIEND*1..4]->(distant)
    WHERE distant.active = true
    RETURN distant
    LIMIT 100
    

    Best Practice 4: Avoid Supernodes

    Problem: Nodes with thousands of relationships slow traversals.

    Solution: Intermediate aggregation nodes

    // Instead of: (:User)-[:POSTED]->(:Post) [1M relationships]
    
    // Use time partitioning:
    (:User)-[:POSTED_IN]->(:Year {year: 2025})
           -[:HAS_MONTH]->(:Month {month: 12})
           -[:HAS_POST]->(:Post)
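
Before restructuring, it helps to know which nodes are actually supernodes. A minimal sketch that counts relationships per node from an exported edge list; the `find_supernodes` helper is hypothetical, and the default threshold of 1000 is an assumption based on the "thousands of relationships" rule of thumb above.

```python
from collections import Counter


def find_supernodes(edges, threshold=1000):
    """edges: iterable of (source, target) pairs. Returns {node: degree}
    for every node whose total degree meets the threshold."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return {node: count for node, count in degree.items() if count >= threshold}


# Tiny demo with a lowered threshold so the effect is visible.
edges = [("hub", f"post{i}") for i in range(5)] + [("u1", "u2")]
print(find_supernodes(edges, threshold=5))
```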
    

    Use Case Examples

    Social Network

    Schema and implementation in examples/social-graph/

    Key features:

    • Friend recommendations (friends-of-friends)
    • Mutual connections
    • News feed generation
    • Influence metrics

    Knowledge Graph for AI/RAG

    Integration example in examples/knowledge-graph/

    Key features:

    • Hybrid vector + graph search
    • Entity relationship mapping
    • Context expansion for LLM prompts
    • Semantic relationship traversal

    Integration with Vector Databases:

    # Step 1: Vector search in Qdrant/pgvector
    # (qdrant is a QdrantClient; embedding is the query vector, computed elsewhere)
    vector_results = qdrant.search(collection_name="concepts", query_vector=embedding)
    
    # Step 2: Expand with graph relationships
    # (session is an open Neo4j driver session)
    concept_ids = [r.id for r in vector_results]
    graph_context = session.run("""
      MATCH (c:Concept) WHERE c.id IN $ids
      MATCH path = (c)-[:RELATED_TO|IS_A*1..2]-(related)
      RETURN c, related, relationships(path) AS rels
    """, ids=concept_ids)
    

    Recommendation Engine

    Examples in examples/social-graph/

    Strategies:

    1. Collaborative filtering: "Users who bought X also bought Y"
    2. Content-based: "Products similar to what you like"
    3. Session-based: "Recently viewed items"

    Fraud Detection

    Pattern detection in examples/

    Detection patterns:

    • Circular money flows
    • Shared devices across accounts
    • Rapid transaction chains
    • Connection pattern anomalies

    Performance Optimization

    See references/cypher-patterns.md for detailed optimization.

    Indexing

    // Single-property index
    CREATE INDEX user_email FOR (u:User) ON (u.email)
    
    // Composite index (syntax shown requires Neo4j 4.x+)
    CREATE INDEX user_name_location FOR (u:User) ON (u.name, u.location)
    
    // Full-text search
    CREATE FULLTEXT INDEX product_search FOR (p:Product) ON EACH [p.name, p.description]
    

    Caching Expensive Aggregations

    // Materialize friend count as a property (re-run or update on write to keep it fresh)
    MATCH (u:User)-[:FRIEND]->(f)
    WITH u, count(f) AS friendCount
    SET u.friend_count = friendCount
    
    // Query now reads a stored property instead of counting relationships
    MATCH (u:User) WHERE u.friend_count > 100
    RETURN u.name, u.friend_count
    

    Scaling Strategies

    | Scale | Strategy | Implementation |
    |-------|----------|----------------|
    | Vertical | Add RAM/CPU | In-memory caching, larger instances |
    | Horizontal (Read) | Read replicas | Neo4j Cluster, ArangoDB Cluster |
    | Horizontal (Write) | Sharding | ArangoDB SmartGraphs, JanusGraph |
    | Caching | App-level cache | Redis for hot paths |

    Language Integration

    Python (Neo4j)

    Complete example in examples/social-graph/python-neo4j/

    from neo4j import GraphDatabase
    
    class GraphDB:
        def __init__(self, uri: str, user: str, password: str):
            self.driver = GraphDatabase.driver(uri, auth=(user, password))
    
        def close(self):
            self.driver.close()
    
        def find_friends_of_friends(self, user_id: str, max_depth: int = 2):
            # Cypher does not accept parameters as variable-length bounds,
            # so validate the depth and inline it into the query text.
            # The upper limit of 5 is an arbitrary safety cap.
            depth = int(max_depth)
            if not 1 <= depth <= 5:
                raise ValueError("max_depth must be between 1 and 5")
            query = f"""
            MATCH (u:User {{id: $userId}})-[:FRIEND*1..{depth}]->(fof)
            WHERE u <> fof
            RETURN DISTINCT fof.id AS id, fof.name AS name
            LIMIT 100
            """
            with self.driver.session() as session:
                result = session.run(query, userId=user_id)
                return [record.data() for record in result]
    
    # Usage
    db = GraphDB("bolt://localhost:7687", "neo4j", "password")
    friends = db.find_friends_of_friends("u123", max_depth=3)
    db.close()
    

    TypeScript (Neo4j)

    Complete example in examples/social-graph/typescript-neo4j/

    import neo4j, { Driver } from 'neo4j-driver'
    
    class Neo4jService {
      private driver: Driver
    
      constructor(uri: string, username: string, password: string) {
        this.driver = neo4j.driver(uri, neo4j.auth.basic(username, password))
      }
    
      async findFriendsOfFriends(userId: string, maxDepth: number = 2) {
        // Cypher does not accept parameters as variable-length bounds, so
        // validate the depth and inline it. The cap of 5 is arbitrary.
        if (!Number.isInteger(maxDepth) || maxDepth < 1 || maxDepth > 5) {
          throw new RangeError('maxDepth must be an integer between 1 and 5')
        }
        const session = this.driver.session()
        try {
          const result = await session.run(
            `MATCH (u:User {id: $userId})-[:FRIEND*1..${maxDepth}]->(fof)
             WHERE u <> fof
             RETURN DISTINCT fof.id AS id, fof.name AS name
             LIMIT 100`,
            { userId }
          )
          return result.records.map(r => r.toObject())
        } finally {
          await session.close()
        }
      }
    }
    

    Go (ArangoDB)

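    import (
        "context"
        "github.com/arangodb/go-driver"
    )
    
    // User mirrors a vertex document; the fields shown are illustrative.
    type User struct {
        Key  string `json:"_key"`
        Name string `json:"name"`
    }
    
    func findFriendsOfFriends(ctx context.Context, db driver.Database, userID string, maxDepth int) ([]User, error) {
        // AQL requires RETURN to come last, so deduplicate in a subquery and LIMIT outside it.
        query := `
            FOR friend IN (
                FOR vertex IN 1..@maxDepth OUTBOUND @startVertex GRAPH 'socialGraph'
                    FILTER vertex._id != @startVertex
                    RETURN DISTINCT vertex
            )
                LIMIT 100
                RETURN friend
        `
        cursor, err := db.Query(ctx, query, map[string]interface{}{
            "startVertex": userID, // full document ID: "<collection>/<key>"
            "maxDepth":    maxDepth,
        })
        if err != nil {
            return nil, err
        }
        defer cursor.Close()
        var users []User
        for cursor.HasMore() {
            var u User
            if _, err := cursor.ReadDocument(ctx, &u); err != nil {
                return nil, err
            }
            users = append(users, u)
        }
        return users, nil
    }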
    

    Schema Validation

    Use scripts/validate_graph_schema.py to check for:

    • Unbounded traversals (missing depth limits)
    • Missing indexes on frequently queried properties
    • Supernodes (nodes with excessive relationships)
    • Relationship property consistency

    Run validation:

    python scripts/validate_graph_schema.py --database neo4j://localhost:7687
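
The first check in the list, unbounded traversals, can be approximated with a regular expression over query text. The sketch below is an illustration of the idea only and is not the contents of scripts/validate_graph_schema.py; the `unbounded_traversals` helper is hypothetical, and the pattern is a heuristic that can misfire on other bracketed expressions containing `*`.

```python
import re

# A variable-length relationship is a '*' inside [...], optionally followed
# by bounds: [:FRIEND*], [*1..3], [:SENT*3..], [r:KNOWS*..5], [*2].
VAR_LENGTH = re.compile(r"\[[^\]]*?\*\s*(\d*)\s*(\.\.)?\s*(\d*)[^\]]*\]")


def unbounded_traversals(cypher: str) -> list[str]:
    """Return the relationship patterns that have no upper depth bound."""
    hits = []
    for match in VAR_LENGTH.finditer(cypher):
        lower, dots, upper = match.groups()
        # Bounded if an upper limit is given, or if a single fixed length is.
        bounded = bool(upper) or (bool(lower) and not dots)
        if not bounded:
            hits.append(match.group(0))
    return hits


query = (
    "MATCH (a)-[:FRIEND*]->(b) "
    "MATCH (a)-[:FRIEND*1..4]->(c) "
    "MATCH (x)-[:SENT*3..]->(y) RETURN b"
)
print(unbounded_traversals(query))
```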
    

    Integration with Other Skills

    With databases-vector (Hybrid Search)

    Combine vector similarity with graph context for AI/RAG applications. See examples/knowledge-graph/

    With search-filter

    Implement relationship-based queries: "Find all users within 3 degrees of connection"

    With ai-chat

    Use knowledge graphs to enrich LLM context with structured relationships.

    With auth-security (ReBAC)

    Implement relationship-based access control: "Can user X access resource Y through relation Z?"
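
A relationship-based access check reduces to a bounded reachability question: is there a path from the user to the resource that uses only permitted relation types? A minimal in-memory sketch follows; the `can_access` helper, the relation names, and the default hop limit of 4 are all hypothetical.

```python
from collections import deque


def can_access(tuples, user, resource, allowed_relations, max_hops=4):
    """tuples: iterable of (subject, relation, object).

    True if resource is reachable from user within max_hops, following
    only relations listed in allowed_relations.
    """
    outgoing = {}
    for subject, relation, obj in tuples:
        if relation in allowed_relations:
            outgoing.setdefault(subject, []).append(obj)

    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        node, depth = queue.popleft()
        if node == resource:
            return True
        if depth == max_hops:
            continue  # do not search beyond the hop limit
        for nxt in outgoing.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return False


tuples = [
    ("alice", "MEMBER_OF", "eng"),
    ("eng", "OWNS", "repo"),
    ("repo", "CONTAINS", "doc"),
    ("bob", "MEMBER_OF", "sales"),
]
relations = {"MEMBER_OF", "OWNS", "CONTAINS"}
print(can_access(tuples, "alice", "doc", relations))
print(can_access(tuples, "bob", "doc", relations))
```

In a graph database the same check becomes a bounded variable-length match between the user node and the resource node.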

    Common Schema Patterns

    Star Schema (Hub and Spokes)

    (:User)-[:PURCHASED]->(:Product)
    (:User)-[:VIEWED]->(:Product)
    (:User)-[:RATED]->(:Product)
    

    Hierarchical Schema (Trees)

    (:CEO)-[:MANAGES]->(:VP)-[:MANAGES]->(:Director)
    

    Temporal Schema (Event Sequences)

    (:Event {timestamp})-[:NEXT]->(:Event {timestamp})
    

    Getting Started

    1. Choose database: Use decision framework above
    2. Design schema: Reference references/graph-modeling.md
    3. Implement queries: Use patterns from references/cypher-patterns.md
    4. Validate: Run scripts/validate_graph_schema.py
    5. Optimize: Add indexes, bound traversals, cache aggregations

    Further Reading

    • references/neo4j.md - Neo4j setup, drivers, GDS algorithms
    • references/arangodb.md - ArangoDB multi-model patterns
    • references/cypher-patterns.md - Comprehensive Cypher query library
    • references/graph-modeling.md - Data modeling best practices
    • examples/social-graph/ - Complete social network implementation
    • examples/knowledge-graph/ - Hybrid vector + graph for AI/RAG
    Repository
    ancoleman/ai-design-components