    anthropics/scvi-tools

    About

    Deep learning for single-cell analysis using scvi-tools...

    SKILL.md

    scvi-tools Deep Learning Skill

    This skill provides guidance for deep learning-based single-cell analysis with scvi-tools, a widely used framework for probabilistic models in single-cell genomics.

    How to Use This Skill

    1. Identify the appropriate workflow from the model/workflow tables below
    2. Read the corresponding reference file for detailed steps and code
    3. Use scripts in scripts/ to avoid rewriting common code
    4. For installation or GPU issues, consult references/environment_setup.md
    5. For debugging, consult references/troubleshooting.md

    When to Use This Skill

    • When scvi-tools, scVI, scANVI, or related models are mentioned
    • When deep learning-based batch correction or integration is needed
    • When working with multi-modal data (CITE-seq, multiome)
    • When reference mapping or label transfer is required
    • When analyzing ATAC-seq or spatial transcriptomics data
    • When learning latent representations of single-cell data

    Model Selection Guide

    | Data Type                 | Model   | Primary Use Case                            |
    |---------------------------|---------|---------------------------------------------|
    | scRNA-seq                 | scVI    | Unsupervised integration, DE, imputation    |
    | scRNA-seq + labels        | scANVI  | Label transfer, semi-supervised integration |
    | CITE-seq (RNA+protein)    | totalVI | Multi-modal integration, protein denoising  |
    | scATAC-seq                | PeakVI  | Chromatin accessibility analysis            |
    | Multiome (RNA+ATAC)       | MultiVI | Joint modality analysis                     |
    | Spatial + scRNA reference | DestVI  | Cell type deconvolution                     |
    | RNA velocity              | veloVI  | Transcriptional dynamics                    |
    | Cross-technology          | sysVI   | System-level batch correction               |

    Workflow Reference Files

    | Workflow              | Reference File                         | Description                     |
    |-----------------------|----------------------------------------|---------------------------------|
    | Environment Setup     | references/environment_setup.md        | Installation, GPU, version info |
    | Data Preparation      | references/data_preparation.md         | Formatting data for any model   |
    | scRNA Integration     | references/scrna_integration.md        | scVI/scANVI batch correction    |
    | ATAC-seq Analysis     | references/atac_peakvi.md              | PeakVI for accessibility        |
    | CITE-seq Analysis     | references/citeseq_totalvi.md          | totalVI for protein+RNA         |
    | Multiome Analysis     | references/multiome_multivi.md         | MultiVI for RNA+ATAC            |
    | Spatial Deconvolution | references/spatial_deconvolution.md    | DestVI spatial analysis         |
    | Label Transfer        | references/label_transfer.md           | scANVI reference mapping        |
    | scArches Mapping      | references/scarches_mapping.md         | Query-to-reference mapping      |
    | Batch Correction      | references/batch_correction_sysvi.md   | Advanced batch methods          |
    | RNA Velocity          | references/rna_velocity_velovi.md      | veloVI dynamics                 |
    | Troubleshooting       | references/troubleshooting.md          | Common issues and solutions     |

    CLI Scripts

    Modular scripts for common workflows. Chain them together or modify them as needed.

    Pipeline Scripts

    | Script                     | Purpose                       | Usage |
    |----------------------------|-------------------------------|-------|
    | prepare_data.py            | QC, filter, HVG selection     | python scripts/prepare_data.py raw.h5ad prepared.h5ad --batch-key batch |
    | train_model.py             | Train any scvi-tools model    | python scripts/train_model.py prepared.h5ad results/ --model scvi |
    | cluster_embed.py           | Neighbors, UMAP, Leiden       | python scripts/cluster_embed.py adata.h5ad results/ |
    | differential_expression.py | DE analysis                   | python scripts/differential_expression.py model/ adata.h5ad de.csv --groupby leiden |
    | transfer_labels.py         | Label transfer with scANVI    | python scripts/transfer_labels.py ref_model/ query.h5ad results/ |
    | integrate_datasets.py      | Multi-dataset integration     | python scripts/integrate_datasets.py results/ data1.h5ad data2.h5ad |
    | validate_adata.py          | Check data compatibility      | python scripts/validate_adata.py data.h5ad --batch-key batch |

    Example Workflow

    # 1. Validate input data
    python scripts/validate_adata.py raw.h5ad --batch-key batch --suggest
    
    # 2. Prepare data (QC, HVG selection)
    python scripts/prepare_data.py raw.h5ad prepared.h5ad --batch-key batch --n-hvgs 2000
    
    # 3. Train model
    python scripts/train_model.py prepared.h5ad results/ --model scvi --batch-key batch
    
    # 4. Cluster and visualize
    python scripts/cluster_embed.py results/adata_trained.h5ad results/ --resolution 0.8
    
    # 5. Differential expression
    python scripts/differential_expression.py results/model results/adata_clustered.h5ad results/de.csv --groupby leiden
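
The five steps above can also be driven from Python. The sketch below only assembles the command lines from the example workflow and, optionally, runs them in order. The helper names `build_pipeline` and `run_pipeline` are hypothetical and not part of the skill. The intermediate file names (`adata_trained.h5ad`, `adata_clustered.h5ad`, `model`) are copied from the example, on the assumption that the scripts write them to the output directory.

```python
import subprocess
import sys


def build_pipeline(raw, prepared="prepared.h5ad", out_dir="results",
                   batch_key="batch", n_hvgs=2000, resolution=0.8):
    """Return the example workflow as a list of argv lists (nothing is executed)."""
    out = out_dir.rstrip("/")
    py = sys.executable  # run the scripts with the current interpreter
    return [
        # 1. Validate input data
        [py, "scripts/validate_adata.py", raw, "--batch-key", batch_key, "--suggest"],
        # 2. Prepare data (QC, HVG selection)
        [py, "scripts/prepare_data.py", raw, prepared,
         "--batch-key", batch_key, "--n-hvgs", str(n_hvgs)],
        # 3. Train model
        [py, "scripts/train_model.py", prepared, f"{out}/",
         "--model", "scvi", "--batch-key", batch_key],
        # 4. Cluster and visualize
        [py, "scripts/cluster_embed.py", f"{out}/adata_trained.h5ad", f"{out}/",
         "--resolution", str(resolution)],
        # 5. Differential expression
        [py, "scripts/differential_expression.py", f"{out}/model",
         f"{out}/adata_clustered.h5ad", f"{out}/de.csv", "--groupby", "leiden"],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # check=True stops at the first failing step


if __name__ == "__main__":
    run_pipeline(build_pipeline("raw.h5ad"))  # dry run: prints the five commands
```

Keeping command construction separate from execution makes it easy to inspect or log the pipeline before committing to a long training run.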
    

    Python Utilities

    scripts/model_utils.py provides importable functions for custom workflows:

    | Function               | Purpose                                 |
    |------------------------|-----------------------------------------|
    | prepare_adata()        | Data preparation (QC, HVG, layer setup) |
    | train_scvi()           | Train scVI or scANVI                    |
    | evaluate_integration() | Compute integration metrics             |
    | get_marker_genes()     | Extract DE markers                      |
    | save_results()         | Save model, data, plots                 |
    | auto_select_model()    | Suggest best model                      |
    | quick_clustering()     | Neighbors + UMAP + Leiden               |

    Critical Requirements

    1. Raw counts required: scvi-tools models require integer count data

      import scvi

      adata.layers["counts"] = adata.X.copy()  # Store raw counts before any normalization
      scvi.model.SCVI.setup_anndata(adata, layer="counts")
      
    2. HVG selection: Use 2000-4000 highly variable genes

      import scanpy as sc
      # flavor="seurat_v3" expects raw counts, so it is pointed at the counts layer
      sc.pp.highly_variable_genes(adata, n_top_genes=2000, batch_key="batch", layer="counts", flavor="seurat_v3")
      adata = adata[:, adata.var['highly_variable']].copy()
      
    3. Batch information: Specify batch_key for integration

      scvi.model.SCVI.setup_anndata(adata, layer="counts", batch_key="batch")
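
Because the first requirement is easy to violate by accident (for example, by copying `adata.X` after normalization), a quick sanity check before `setup_anndata` can help. The function below is a minimal heuristic sketch, not part of scvi-tools or of this skill's scripts. The name `looks_like_raw_counts` and the `sample_size` parameter are hypothetical.

```python
import numpy as np


def looks_like_raw_counts(matrix, sample_size=100_000):
    """Heuristic: True if the inspected values are non-negative whole numbers."""
    if hasattr(matrix, "tocsr"):
        # scipy sparse input: look only at the explicitly stored values
        values = matrix.tocsr().data
    else:
        # dense input (NumPy array or nested lists)
        values = np.asarray(matrix).ravel()
    if values.size == 0:
        return True  # nothing stored, so nothing contradicts count data
    values = values[:sample_size]  # a prefix keeps the check cheap on large matrices
    return bool(np.all(values >= 0) and np.all(np.mod(values, 1) == 0))
```

A typical use would be `assert looks_like_raw_counts(adata.layers["counts"])` just before the `setup_anndata` call. Since only a prefix of the values is inspected, a passing check is evidence of count data, not proof.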
      

    Quick Decision Tree

    Need to integrate scRNA-seq data?
    ├── Have cell type labels? → scANVI (references/label_transfer.md)
    └── No labels? → scVI (references/scrna_integration.md)
    
    Have multi-modal data?
    ├── CITE-seq (RNA + protein)? → totalVI (references/citeseq_totalvi.md)
    ├── Multiome (RNA + ATAC)? → MultiVI (references/multiome_multivi.md)
    └── scATAC-seq only? → PeakVI (references/atac_peakvi.md)
    
    Have spatial data?
    └── Need cell type deconvolution? → DestVI (references/spatial_deconvolution.md)
    
    Have pre-trained reference model?
    └── Map query to reference? → scArches (references/scarches_mapping.md)
    
    Need RNA velocity?
    └── veloVI (references/rna_velocity_velovi.md)
    
    Strong cross-technology batch effects?
    └── sysVI (references/batch_correction_sysvi.md)
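
The tree above can be written down as a small lookup. This is an illustrative sketch only: the task keys (such as `"citeseq"` or `"cross_technology"`) and the names `Recommendation` and `recommend` are hypothetical, and the function is not the skill's `auto_select_model()`, whose signature is not shown here.

```python
from typing import NamedTuple


class Recommendation(NamedTuple):
    model: str
    reference: str


# One entry per leaf of the decision tree that does not depend on labels.
_BY_TASK = {
    "citeseq": Recommendation("totalVI", "references/citeseq_totalvi.md"),
    "multiome": Recommendation("MultiVI", "references/multiome_multivi.md"),
    "atac": Recommendation("PeakVI", "references/atac_peakvi.md"),
    "spatial_deconvolution": Recommendation("DestVI", "references/spatial_deconvolution.md"),
    "reference_mapping": Recommendation("scArches", "references/scarches_mapping.md"),
    "rna_velocity": Recommendation("veloVI", "references/rna_velocity_velovi.md"),
    "cross_technology": Recommendation("sysVI", "references/batch_correction_sysvi.md"),
}


def recommend(task, has_labels=False):
    """Map a task key to the model and reference file named in the decision tree."""
    if task == "scrna_integration":
        # The only branch that depends on whether cell type labels exist.
        if has_labels:
            return Recommendation("scANVI", "references/label_transfer.md")
        return Recommendation("scVI", "references/scrna_integration.md")
    if task not in _BY_TASK:
        known = sorted(_BY_TASK) + ["scrna_integration"]
        raise ValueError(f"unknown task {task!r}; expected one of {known}")
    return _BY_TASK[task]
```

For example, `recommend("scrna_integration", has_labels=True)` yields scANVI together with `references/label_transfer.md`, matching the first branch of the tree.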
    

    Key Resources

    • scvi-tools Documentation
    • scvi-tools Tutorials
    • Model Hub
    • GitHub Issues
    Repository
    anthropics/knowledge-work-plugins