    lifangda/bioservices
    About

    Primary Python tool for 40+ bioinformatics services. Preferred for multi-database workflows: UniProt, KEGG, ChEMBL, PubChem, Reactome, QuickGO. Unified API for queries, ID mapping, pathway analysis...

    SKILL.md

    BioServices

    Overview

    BioServices is a Python package providing programmatic access to approximately 40 bioinformatics web services and databases. Retrieve biological data, perform cross-database queries, map identifiers, analyze sequences, and integrate multiple biological resources in Python workflows. The package handles both REST and SOAP/WSDL protocols transparently.

    When to Use This Skill

    This skill should be used when:

    • Retrieving protein sequences, annotations, or structures from UniProt, PDB, Pfam
    • Analyzing metabolic pathways and gene functions via KEGG or Reactome
    • Searching compound databases (ChEBI, ChEMBL, PubChem) for chemical information
    • Converting identifiers between different biological databases (KEGG↔UniProt, compound IDs)
    • Running sequence similarity searches and multiple sequence alignments (BLAST, MUSCLE)
    • Querying gene ontology terms (QuickGO, GO annotations)
    • Accessing protein-protein interaction data (PSICQUIC, IntactComplex)
    • Mining genomic data (BioMart, ArrayExpress, ENA)
    • Integrating data from multiple bioinformatics resources in a single workflow

    Core Capabilities

    1. Protein Analysis

    Retrieve protein information, sequences, and functional annotations:

    from bioservices import UniProt
    
    u = UniProt(verbose=False)
    
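    # Search for a protein by entry name
    # ("tsv" and these column names follow the current UniProt REST API; older releases used "tab")
    results = u.search("ZAP70_HUMAN", frmt="tsv", columns="accession,gene_names,organism_name")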
    
    # Retrieve FASTA sequence
    sequence = u.retrieve("P43403", "fasta")
    
    # Map identifiers between databases
    kegg_ids = u.mapping(fr="UniProtKB_AC-ID", to="KEGG", query="P43403")
    

    Key methods:

    • search(): Query UniProt with flexible search terms
    • retrieve(): Get protein entries in various formats (FASTA, XML, tab)
    • mapping(): Convert identifiers between databases

    Reference: references/services_reference.md for complete UniProt API details.
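
The text returned by retrieve(..., "fasta") is a plain FASTA record. A minimal sketch of splitting such text into header and sequence with the standard library is shown below. parse_fasta is a hypothetical helper, not part of BioServices, and the sample record uses a toy sequence rather than the real P43403 sequence.

```python
def parse_fasta(text):
    """Split FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Toy record for illustration only (not the real ZAP70 sequence).
sample = ">sp|P43403|ZAP70_HUMAN toy record\nMKTAYIAK\nQRQISFVK\n"
records = parse_fasta(sample)
header, sequence = records[0]
print(header.split("|")[1], len(sequence))
```

For anything beyond a quick split, BioPython's SeqIO parser is the more robust choice.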

    2. Pathway Discovery and Analysis

    Access KEGG pathway information for genes and organisms:

    from bioservices import KEGG
    
    k = KEGG()
    k.organism = "hsa"  # Set to human
    
    # Search for organisms
    k.lookfor_organism("droso")  # Find Drosophila species
    
    # Find pathways by name
    k.lookfor_pathway("B cell")  # Returns matching pathway IDs
    
    # Get pathways containing specific genes
    pathways = k.get_pathway_by_gene("7535", "hsa")  # ZAP70 gene
    
    # Retrieve and parse pathway data
    data = k.get("hsa04660")
    parsed = k.parse(data)
    
    # Extract pathway interactions
    interactions = k.parse_kgml_pathway("hsa04660")
    relations = interactions['relations']  # Protein-protein interactions
    
    # Convert to Simple Interaction Format
    sif_data = k.pathway2sif("hsa04660")
    

    Key methods:

    • lookfor_organism(), lookfor_pathway(): Search by name
    • get_pathway_by_gene(): Find pathways containing genes
    • parse_kgml_pathway(): Extract structured pathway data
    • pathway2sif(): Get protein interaction networks

    Reference: references/workflow_patterns.md for complete pathway analysis workflows.
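
Once the relations are extracted, a common next step is tallying interaction types and building an edge list. The sketch below assumes each relation is a dict with "entry1", "entry2" and "name" keys; that layout is an assumption about what parse_kgml_pathway() returns, so check it against real output first. summarize_relations is a hypothetical helper.

```python
from collections import Counter


def summarize_relations(relations):
    """Count relation types and build (source, target, type) edges."""
    type_counts = Counter(rel.get("name", "unknown") for rel in relations)
    edges = [
        (rel["entry1"], rel["entry2"], rel.get("name", "unknown"))
        for rel in relations
        if "entry1" in rel and "entry2" in rel
    ]
    return type_counts, edges


# Hand-written stand-in for interactions["relations"]; the keys are assumed.
sample_relations = [
    {"entry1": "61", "entry2": "63", "name": "activation"},
    {"entry1": "61", "entry2": "64", "name": "inhibition"},
    {"entry1": "63", "entry2": "64", "name": "activation"},
]
type_counts, edges = summarize_relations(sample_relations)
print(type_counts.most_common())
```

The edge tuples can be handed to a graph library such as NetworkX for further analysis.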

    3. Compound Database Searches

    Search and cross-reference compounds across multiple databases:

    from bioservices import KEGG, UniChem
    
    k = KEGG()
    
    # Search compounds by name
    results = k.find("compound", "Geldanamycin")  # Returns cpd:C11222
    
    # Get compound information with database links
    compound_info = k.get("cpd:C11222")  # Includes ChEBI links
    
    # Cross-reference KEGG → ChEMBL using UniChem
    u = UniChem()
    chembl_id = u.get_compound_id_from_kegg("C11222")  # Returns CHEMBL278315
    

    Common workflow:

    1. Search compound by name in KEGG
    2. Extract KEGG compound ID
    3. Use UniChem for KEGG → ChEMBL mapping
    4. Read ChEBI IDs from the KEGG entry itself, where they are often listed

    Reference: references/identifier_mapping.md for complete cross-database mapping guide.
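
Step 2 of the workflow above (extracting the KEGG compound ID) can be sketched as follows. The sketch assumes find() returns one tab-separated line per hit, with the prefixed ID in the first column; extract_kegg_ids is a hypothetical helper and the sample string is a hand-written stand-in for a real response.

```python
def extract_kegg_ids(find_output, prefix="cpd:"):
    """Return bare KEGG IDs from tab-separated find() output."""
    ids = []
    for line in find_output.splitlines():
        if not line.strip():
            continue
        # First column holds the identifier, e.g. "cpd:C11222".
        entry_id = line.split("\t", 1)[0].strip()
        if entry_id.startswith(prefix):
            entry_id = entry_id[len(prefix):]
        ids.append(entry_id)
    return ids


# Stand-in for k.find("compound", "Geldanamycin"); the layout is assumed.
sample_output = "cpd:C11222\tGeldanamycin\n"
print(extract_kegg_ids(sample_output))
```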

    4. Sequence Analysis

    Run BLAST searches and sequence alignments:

    from bioservices import NCBIblast
    
    s = NCBIblast(verbose=False)
    
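    # protein_sequence must hold the query as a plain amino-acid string,
    # e.g. the sequence lines of a FASTA record retrieved from UniProt
    # Run BLASTP against UniProtKB
    jobid = s.run(
        program="blastp",
        sequence=protein_sequence,
        stype="protein",
        database="uniprotkb",
        email="your.email@example.com"  # Required by the EMBL-EBI service that hosts this BLAST
    )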
    
    # Check job status and retrieve results
    s.getStatus(jobid)
    results = s.getResult(jobid, "out")
    

    Note: BLAST jobs are asynchronous. Check status before retrieving results.
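
Because jobs are asynchronous, the status check is usually wrapped in a polling loop. The sketch below takes the status function as a parameter, so it runs offline with a fake; with BioServices one would pass s.getStatus. The status strings ("RUNNING", "PENDING", "QUEUED", "FINISHED") are assumptions about the job service and should be checked against real responses. wait_for_job is a hypothetical helper.

```python
import time


def wait_for_job(get_status, jobid, interval=5.0, max_wait=600.0, sleep=time.sleep):
    """Poll get_status(jobid) until the job leaves the waiting states."""
    waited = 0.0
    while True:
        status = get_status(jobid)
        if status not in ("RUNNING", "PENDING", "QUEUED"):
            return status  # finished, failed, or unknown: caller decides
        if waited >= max_wait:
            raise TimeoutError(f"Job {jobid} still {status} after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Fake status source so the sketch runs without network access.
_fake_statuses = iter(["RUNNING", "RUNNING", "FINISHED"])
final = wait_for_job(lambda jobid: next(_fake_statuses), "job-1", interval=0.0)
print(final)
```

Results should be fetched only when the returned status indicates success. Any other value is better logged and handled than passed on to getResult().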

    5. Identifier Mapping

    Convert identifiers between different biological databases:

    from bioservices import UniProt, KEGG
    
    # UniProt mapping (many database pairs supported)
    u = UniProt()
    results = u.mapping(
        fr="UniProtKB_AC-ID",  # Source database
        to="KEGG",              # Target database
        query="P43403"          # Identifier(s) to convert
    )
    
    # KEGG gene ID → UniProt (as a target, the database is named "UniProtKB")
    kegg_to_uniprot = u.mapping(fr="KEGG", to="UniProtKB", query="hsa:7535")
    
    # For compounds, use UniChem (separate variable so the UniProt client stays usable)
    from bioservices import UniChem
    uc = UniChem()
    chembl_from_kegg = uc.get_compound_id_from_kegg("C11222")
    

    Supported mappings (UniProt):

    • UniProtKB ↔ KEGG
    • UniProtKB ↔ Ensembl
    • UniProtKB ↔ PDB
    • UniProtKB ↔ RefSeq
    • And many more (see references/identifier_mapping.md)
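
Mapping responses often need regrouping before use, because one source ID can map to several targets and some IDs fail to map at all. The sketch below assumes the response is a dict with a "results" list of {"from": ..., "to": ...} pairs and an optional "failedIds" list. That layout is an assumption and may differ between BioServices versions. group_mapping is a hypothetical helper.

```python
from collections import defaultdict


def group_mapping(result):
    """Turn a mapping response into {source_id: [target_ids]} plus failures."""
    grouped = defaultdict(list)
    for pair in result.get("results", []):
        grouped[pair["from"]].append(pair["to"])
    return dict(grouped), list(result.get("failedIds", []))


# Hand-written stand-in for u.mapping(...); the dict layout is assumed.
sample_result = {
    "results": [{"from": "P43403", "to": "hsa:7535"}],
    "failedIds": ["NOT_AN_ID"],
}
mapped, failed = group_mapping(sample_result)
print(mapped, failed)
```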

    6. Gene Ontology Queries

    Access GO terms and annotations:

    from bioservices import QuickGO
    
    g = QuickGO(verbose=False)
    
    # Retrieve GO term information
    term_info = g.Term("GO:0003824", frmt="obo")
    
    # Search annotations
    annotations = g.Annotation(protein="P43403", format="tsv")
    

    7. Protein-Protein Interactions

    Query interaction databases via PSICQUIC:

    from bioservices import PSICQUIC
    
    s = PSICQUIC(verbose=False)
    
    # Query specific database (e.g., MINT)
    interactions = s.query("mint", "ZAP70 AND species:9606")
    
    # List available interaction databases
    databases = s.activeDBs
    

    Available databases: MINT, IntAct, BioGRID, DIP, and 30+ others.

    Multi-Service Integration Workflows

    BioServices excels at combining multiple services for comprehensive analysis. Common integration patterns:

    Complete Protein Analysis Pipeline

    Execute a full protein characterization workflow:

    python scripts/protein_analysis_workflow.py ZAP70_HUMAN your.email@example.com
    

    This script demonstrates:

    1. UniProt search for protein entry
    2. FASTA sequence retrieval
    3. BLAST similarity search
    4. KEGG pathway discovery
    5. PSICQUIC interaction mapping

    Pathway Network Analysis

    Analyze all pathways for an organism:

    python scripts/pathway_analysis.py hsa output_directory/
    

    Extracts and analyzes:

    • All pathway IDs for organism
    • Protein-protein interactions per pathway
    • Interaction type distributions
    • Exports to CSV/SIF formats

    Cross-Database Compound Search

    Map compound identifiers across databases:

    python scripts/compound_cross_reference.py Geldanamycin
    

    Retrieves:

    • KEGG compound ID
    • ChEBI identifier
    • ChEMBL identifier
    • Basic compound properties

    Batch Identifier Conversion

    Convert multiple identifiers at once:

    python scripts/batch_id_converter.py input_ids.txt --from UniProtKB_AC-ID --to KEGG
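
The script itself is not reproduced here. As a rough sketch of the batching idea only, the code below cleans up an ID list and splits it into fixed-size batches. Each batch would then be submitted as one mapping query. read_ids and chunked are hypothetical helpers, and the batch size is arbitrary.

```python
def read_ids(lines):
    """Drop blanks, comment lines and duplicates while keeping input order."""
    seen, ids = set(), []
    for line in lines:
        ident = line.strip()
        if ident and not ident.startswith("#") and ident not in seen:
            seen.add(ident)
            ids.append(ident)
    return ids


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# In practice the lines would come from open("input_ids.txt").
ids = read_ids(["P43403\n", "\n", "P43403\n", "# comment\n", "P29375\n", "P12345\n"])
batches = list(chunked(ids, 2))
print(batches)
```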
    

    Best Practices

    Output Format Handling

    Different services return data in various formats:

    • XML: parse with BeautifulSoup (returned by most SOAP services)
    • Tab-separated (TSV): load into pandas DataFrames for tabular work
    • Dictionary/JSON: manipulate directly as Python objects
    • FASTA: pass to BioPython for sequence analysis
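
For the tab-separated case, a dependency-free sketch using the standard csv module is shown below. tsv_to_rows is a hypothetical helper, and the sample text is hand-written in the shape of a tabular search response, not captured from a live service.

```python
import csv
import io


def tsv_to_rows(text):
    """Parse tab-separated text with a header line into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


# Hand-written sample; real column headers depend on the requested columns.
sample_tsv = "Entry\tEntry Name\tOrganism\nP43403\tZAP70_HUMAN\tHomo sapiens (Human)\n"
rows = tsv_to_rows(sample_tsv)
print(rows[0]["Entry Name"])
```

With pandas installed, pandas.read_csv(io.StringIO(text), sep="\t") yields a DataFrame from the same text.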

    Rate Limiting and Verbosity

    Control API request behavior:

    from bioservices import KEGG
    
    k = KEGG(verbose=False)  # Suppress HTTP request details
    k.TIMEOUT = 30  # Adjust timeout for slow connections
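
The settings above control timeout and logging but do not space out requests. A minimal client-side throttle, sketched below, enforces a minimum interval between calls. Throttle is a hypothetical helper, not a BioServices feature, and the interval that suits a given service is an assumption to verify against that service's usage policy.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(min_interval=0.01)
for _ in range(3):
    throttle.wait()
    # A real loop would issue one service request here, e.g. k.get(...).
```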
    

    Error Handling

    Wrap service calls in try-except blocks:

    from bioservices import UniProt
    
    u = UniProt(verbose=False)
    
    try:
        results = u.search("ambiguous_query")
        if results:
            # Process results
            pass
    except Exception as e:
        print(f"Search failed: {e}")
    

    Organism Codes

    Use standard organism abbreviations:

    • hsa: Homo sapiens (human)
    • mmu: Mus musculus (mouse)
    • dme: Drosophila melanogaster
    • sce: Saccharomyces cerevisiae (yeast)

    List all organisms: k.list("organism") or k.organismIds

    Integration with Other Tools

    BioServices works well with:

    • BioPython: Sequence analysis on retrieved FASTA data
    • Pandas: Tabular data manipulation
    • PyMOL: 3D structure visualization (retrieve PDB IDs)
    • NetworkX: Network analysis of pathway interactions
    • Galaxy: Custom tool wrappers for workflow platforms

    Resources

    scripts/

    Executable Python scripts demonstrating complete workflows:

    • protein_analysis_workflow.py: End-to-end protein characterization
    • pathway_analysis.py: KEGG pathway discovery and network extraction
    • compound_cross_reference.py: Multi-database compound searching
    • batch_id_converter.py: Bulk identifier mapping utility

    Scripts can be executed directly or adapted for specific use cases.

    references/

    Detailed documentation loaded as needed:

    • services_reference.md: Comprehensive list of all 40+ services with methods
    • workflow_patterns.md: Detailed multi-step analysis workflows
    • identifier_mapping.md: Complete guide to cross-database ID conversion

    Load references when working with specific services or complex integration tasks.

    Installation

    pip install bioservices
    

    pip installs the required dependencies automatically. The package is tested on Python 3.9-3.12.

    Additional Information

    For detailed API documentation and advanced features, refer to:

    • Official documentation: https://bioservices.readthedocs.io/
    • Source code: https://github.com/cokelaer/bioservices
    • Service-specific references in references/services_reference.md
    Repository
    lifangda/claude-plugins