Smithery Logo
MCPsSkillsDocsPricing
Login
Smithery Logo

Accelerating the Agent Economy

Resources

DocumentationPrivacy PolicySystem Status

Company

PricingAboutBlog

Connect

© 2026 Smithery. All rights reserved.

    Starlitnightly

    data-export-excel

    Starlitnightly/data-export-excel
    Productivity
    834

    About

    SKILL.md

    Install

    Install via Skills CLI

    or add to your agent
    • Claude Code
      Claude Code
    • Codex
      Codex
    • OpenClaw
      OpenClaw
    • Cursor
      Cursor
    • Amp
      Amp
    • GitHub Copilot
      GitHub Copilot
    • Gemini CLI
      Gemini CLI
    • Kilo Code
      Kilo Code
    • Junie
      Junie
    • Replit
      Replit
    • Windsurf
      Windsurf
    • Cline
      Cline
    • Continue
      Continue
    • OpenCode
      OpenCode
    • OpenHands
      OpenHands
    • Roo Code
      Roo Code
    • Augment
      Augment
    • Goose
      Goose
    • Trae
      Trae
    • Zencoder
      Zencoder
    • Antigravity
      Antigravity
    ├─
    ├─
    └─

    About

    Export analysis results, data tables, and formatted spreadsheets to Excel files using openpyxl. Works with ANY LLM provider (GPT, Gemini, Claude, etc.).

    SKILL.md

    Excel Data Export (Universal)

    Overview

    This skill enables you to export bioinformatics data, analysis results, and formatted tables to professional Excel spreadsheets. Unlike cloud-hosted solutions, this skill uses the openpyxl Python library and executes locally in your environment, making it compatible with ALL LLM providers including GPT, Gemini, Claude, DeepSeek, and Qwen.

    When to Use This Skill

    • Export AnnData observations (.obs) or variables (.var) to Excel
    • Save DEG analysis results with formatting
    • Create multi-sheet workbooks with different data types
    • Generate formatted Excel reports with cell styling
    • Export cluster annotations, cell type assignments, or quality control metrics

    How to Use

    Step 1: Import Required Libraries

    import openpyxl
    from openpyxl import Workbook
    from openpyxl.styles import Font, PatternFill, Alignment, Border, Side
    from openpyxl.utils.dataframe import dataframe_to_rows
    import pandas as pd
    import numpy as np
    

    Step 2: Prepare Your Data

    Convert your data to pandas DataFrame format:

    # Example: Export AnnData observations
    df = adata.obs.copy()
    
    # Example: Export DEG results
    deg_df = pd.DataFrame({
        'gene': gene_names,
        'log2FC': log2_fold_changes,
        'pvalue': pvalues,
        'qvalue': qvalues
    })
    
    # Example: Export cluster statistics
    cluster_stats = adata.obs.groupby('clusters').size().reset_index(name='count')
    

    Step 3: Create Excel Workbook

    # Create new workbook
    wb = Workbook()
    ws = wb.active
    ws.title = "Sheet Name"
    
    # Write DataFrame to worksheet
    for r in dataframe_to_rows(df, index=False, header=True):
        ws.append(r)
    

    Step 4: Add Formatting (Optional)

    # Style header row
    header_fill = PatternFill(start_color="4472C4", end_color="4472C4", fill_type="solid")
    header_font = Font(bold=True, color="FFFFFF")
    
    for cell in ws[1]:
        cell.fill = header_fill
        cell.font = header_font
        cell.alignment = Alignment(horizontal='center')
    
    # Auto-adjust column widths
    for column in ws.columns:
        max_length = 0
        column_letter = column[0].column_letter
        for cell in column:
            try:
                if len(str(cell.value)) > max_length:
                    max_length = len(str(cell.value))
            except:
                pass
        adjusted_width = min(max_length + 2, 50)
        ws.column_dimensions[column_letter].width = adjusted_width
    
    # Add borders
    thin_border = Border(
        left=Side(style='thin'),
        right=Side(style='thin'),
        top=Side(style='thin'),
        bottom=Side(style='thin')
    )
    for row in ws.iter_rows(min_row=1, max_row=ws.max_row, min_col=1, max_col=ws.max_column):
        for cell in row:
            cell.border = thin_border
    

    Step 5: Save the Workbook

    # Save to file
    output_path = "analysis_results.xlsx"
    wb.save(output_path)
    print(f"✅ Excel file saved to: {output_path}")
    

    Multi-Sheet Workbooks

    Create workbooks with multiple sheets for different data types:

    wb = Workbook()
    
    # Sheet 1: Cell metadata
    ws1 = wb.active
    ws1.title = "Cell Metadata"
    for r in dataframe_to_rows(adata.obs, index=True, header=True):
        ws1.append(r)
    
    # Sheet 2: Gene metadata
    ws2 = wb.create_sheet("Gene Metadata")
    for r in dataframe_to_rows(adata.var, index=True, header=True):
        ws2.append(r)
    
    # Sheet 3: DEG results
    ws3 = wb.create_sheet("DEG Results")
    for r in dataframe_to_rows(deg_df, index=False, header=True):
        ws3.append(r)
    
    wb.save("multi_sheet_analysis.xlsx")
    

    Best Practices

    1. Column Headers: Always include column headers in the first row
    2. Data Types: Convert numpy arrays to lists before writing
    3. Large Datasets: For datasets >100K rows, consider CSV export instead
    4. File Paths: Use absolute paths or ensure output directory exists
    5. Formatting: Apply formatting sparingly to reduce file size
    6. Index: Decide whether to include DataFrame index (set index=True/False in dataframe_to_rows)

    Common Use Cases

    Export Quality Control Metrics

    qc_metrics = adata.obs[['n_genes', 'n_counts', 'percent_mito', 'clusters']].copy()
    
    wb = Workbook()
    ws = wb.active
    ws.title = "QC Metrics"
    
    for r in dataframe_to_rows(qc_metrics, index=False, header=True):
        ws.append(r)
    
    # Highlight cells with high mitochondrial content
    for row in range(2, ws.max_row + 1):
        if ws.cell(row, 3).value > 0.2:  # percent_mito > 20%
            ws.cell(row, 3).fill = PatternFill(start_color="FFC7CE", end_color="FFC7CE", fill_type="solid")
    
    wb.save("qc_metrics.xlsx")
    

    Export Marker Genes by Cluster

    # Assuming you have marker genes for each cluster
    marker_dict = {
        'Cluster_0': ['CD3D', 'CD3E', 'CD8A'],
        'Cluster_1': ['CD79A', 'MS4A1', 'CD19'],
        'Cluster_2': ['LYZ', 'S100A9', 'CD14']
    }
    
    wb = Workbook()
    
    for cluster_name, genes in marker_dict.items():
        ws = wb.create_sheet(cluster_name)
        ws.append(['Marker Gene'])
        for gene in genes:
            ws.append([gene])
    
    # Remove default sheet
    if 'Sheet' in wb.sheetnames:
        wb.remove(wb['Sheet'])
    
    wb.save("marker_genes.xlsx")
    

    Export DEG Analysis with Conditional Formatting

    wb = Workbook()
    ws = wb.active
    ws.title = "DEG Analysis"
    
    # Write DEG results
    for r in dataframe_to_rows(deg_df, index=False, header=True):
        ws.append(r)
    
    # Color code by fold change
    for row in range(2, ws.max_row + 1):
        log2fc = ws.cell(row, 2).value  # Assuming log2FC in column 2
        if log2fc > 1:  # Upregulated
            ws.cell(row, 2).fill = PatternFill(start_color="C6EFCE", end_color="C6EFCE", fill_type="solid")
        elif log2fc < -1:  # Downregulated
            ws.cell(row, 2).fill = PatternFill(start_color="FFC7CE", end_color="FFC7CE", fill_type="solid")
    
    wb.save("deg_results_formatted.xlsx")
    

    Troubleshooting

    Issue: "openpyxl not found"

    Solution: Install the library:

    import subprocess
    subprocess.check_call(['pip', 'install', 'openpyxl'])
    

    Issue: "Invalid data type for cell"

    Solution: Convert numpy/pandas types to native Python types:

    # Convert numpy types
    df = df.astype(object).where(pd.notnull(df), None)
    
    # Or convert specific columns
    df['column_name'] = df['column_name'].astype(str)
    

    Issue: "Memory error with large datasets"

    Solution: Export in chunks or use CSV format instead:

    # Fallback to CSV for large data
    df.to_csv('large_dataset.csv', index=False)
    print("Dataset too large for Excel, saved as CSV instead")
    

    Technical Notes

    • Library: Uses openpyxl (pure Python, no external dependencies)
    • Execution: Runs locally in the agent's sandbox
    • Compatibility: Works with ALL LLM providers (GPT, Gemini, Claude, DeepSeek, Qwen, etc.)
    • File Limits: Excel has a 1,048,576 row limit (use CSV for larger datasets)
    • Performance: Writing ~10K rows takes 1-2 seconds; 100K rows takes 10-20 seconds

    References

    • openpyxl documentation: https://openpyxl.readthedocs.io/
    • pandas DataFrame export: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html
    Recommended Servers
    Excel
    Excel
    Google BigQuery
    Google BigQuery
    Google Sheets
    Google Sheets
    Repository
    starlitnightly/omicverse
    Files