Xenium benchmarking local documentation snapshot (full HTML), including the complete module documentation
Comprehensive assistance with Xenium spatial transcriptomics benchmarking, generated from official documentation.
This skill should be triggered when:
# Import required modules
import os
import numpy as np
import pandas as pd
import scanpy as sc
from xb.formatting import *
from xb.plotting import *
from xb.preprocessing import *
from xb.Spage_main import *
from xb.calculating import *
# Format Xenium output to AnnData
files=['./data/output-XETG00047__0011146__1886C__20231102__180733',
'./data/output-XETG00047__0011146__1886P__20231102__180733']
output_path=r'../pipeline_output/'
max_nucleus_distance=10
min_quality=0
adata=format_to_adata(files=files,
output_path=output_path,
max_nucleus_distance=max_nucleus_distance,
min_quality=min_quality,
save=True)
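The two filters passed to format_to_adata() can be pictured as a per-transcript test. The sketch below is not xb's implementation. It assumes each transcript carries a quality value `qv` and a `nucleus_distance` (column names as found in the Xenium transcripts table), that both thresholds are inclusive, and the helper `keep_transcript` is hypothetical.

```python
# Hypothetical illustration of the max_nucleus_distance / min_quality filters.
# NOT the xb implementation: it only shows the assumed per-transcript test.

def keep_transcript(record, max_nucleus_distance=10, min_quality=0):
    """Return True if a transcript passes both (assumed inclusive) filters."""
    return (record["qv"] >= min_quality
            and record["nucleus_distance"] <= max_nucleus_distance)

# Toy transcripts; the values are invented for illustration only.
transcripts = [
    {"feature_name": "EPCAM", "qv": 40.0, "nucleus_distance": 2.5},
    {"feature_name": "CD3E", "qv": 12.0, "nucleus_distance": 14.0},  # too far from a nucleus
    {"feature_name": "KRT8", "qv": 5.0, "nucleus_distance": 0.0},
]

kept = [t["feature_name"] for t in transcripts if keep_transcript(t)]
kept_strict = [t["feature_name"] for t in transcripts
               if keep_transcript(t, min_quality=20)]
print(kept)
print(kept_strict)
```

Raising min_quality removes low-confidence transcripts, while lowering max_nucleus_distance keeps only transcripts close to a segmented nucleus.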
# Define clustering parameters
clustering_params={
'normalization_target_sum':100,
'min_counts_x_cell':40,
'min_genes_x_cell':15,
'scale':False,
'clustering_alg':'louvain',
'resolutions':[0.2,0.5,1.1],
'n_neighbors':15,
'umap_min_dist':0.1,
'n_pcs':0
}
# Read and preprocess data
adata=sc.read(output_path+'combined_filtered.h5ad')
adata=preprocess_adata(adata, save=True,
clustering_params=clustering_params,
output_path=output_path)
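To make the roles of 'min_counts_x_cell', 'min_genes_x_cell' and 'normalization_target_sum' concrete, here is a minimal pure-Python sketch of count-based cell filtering followed by target-sum normalization. It assumes inclusive thresholds and per-cell scaling, as in scanpy's filter_cells and normalize_total; whether preprocess_adata() does exactly this is an assumption, and `filter_and_normalize` is a hypothetical helper.

```python
# Sketch of cell filtering and target-sum normalization (assumed behaviour,
# not taken from the xb source).

def filter_and_normalize(counts, min_counts=40, min_genes=15, target_sum=100):
    """counts: dict mapping cell id -> list of per-gene raw counts."""
    result = {}
    for cell, row in counts.items():
        total = sum(row)
        n_genes = sum(1 for c in row if c > 0)
        if total < min_counts or n_genes < min_genes:
            continue  # drop cells with too few counts or detected genes
        # rescale so that every kept cell sums to target_sum
        result[cell] = [c * target_sum / total for c in row]
    return result

toy = {
    "cell_a": [10, 30, 10],  # 50 counts, 3 genes detected
    "cell_b": [1, 0, 2],     # 3 counts: removed by min_counts
}
# min_genes is lowered here only because the toy panel has three genes.
normalized = filter_and_normalize(toy, min_counts=40, min_genes=2, target_sum=100)
print(normalized)
```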
# Banksy parameters for spatial domain identification
banksy_params={
'resolutions':[.9],
'pca_dims':[20],
'lambda_list':[.8],
'k_geom':15,
'max_m':1,
'nbr_weight_decay':"scaled_gaussian",
'cluster_algorithm':'leiden'
}
# Run Banksy domain identification (plot_path is not defined above; set it before this call)
adata, adata_banksy = domains_by_banksy(adata,
plot_path=plot_path,
banksy_params=banksy_params)
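The 'lambda_list', 'k_geom' and 'nbr_weight_decay' entries control how much of each cell's neighborhood is mixed into its features. The following is a simplified sketch of that idea, not Banksy's code. It assumes own expression is scaled by sqrt(1 - lambda) and a Gaussian-weighted neighbor mean by sqrt(lambda), and it omits the gradient term enabled by 'max_m'. The function `augment_with_neighbors` and the `sigma` parameter are hypothetical.

```python
import math

def augment_with_neighbors(coords, expr, k=2, lam=0.8, sigma=1.0):
    """Concatenate each cell's expression with a Gaussian-weighted mean of its
    k nearest neighbors (simplified neighbor-augmentation sketch)."""
    augmented = []
    for i, (xi, yi) in enumerate(coords):
        # k nearest neighbors of cell i as (distance, index) pairs
        nearest = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords) if j != i
        )[:k]
        weights = [math.exp(-(d / sigma) ** 2) for d, _ in nearest]
        total = sum(weights)
        nbr_mean = [
            sum(w * expr[j][g] for w, (_, j) in zip(weights, nearest)) / total
            for g in range(len(expr[i]))
        ]
        own_part = [math.sqrt(1 - lam) * v for v in expr[i]]
        nbr_part = [math.sqrt(lam) * v for v in nbr_mean]
        augmented.append(own_part + nbr_part)
    return augmented

# Three toy cells with two genes each.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
expr = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
aug = augment_with_neighbors(coords, expr, k=2, lam=0.8)
print(aug[0])
```

With lam close to 0 the result reduces to ordinary expression-based clustering; with lam close to 1 the neighborhood dominates, which favors spatial domains over cell types.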
# Prepare Xenium data for Baysor
prep_xenium_data_for_baysor(files[0], output_path,
CROP=True,
COORDS=[1000, 5000, 1000, 5000])
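Cropping before Baysor keeps the segmentation input small. As a hedged illustration of what CROP=True with COORDS might do, the sketch below keeps only transcripts inside a rectangle. The ordering [x_min, x_max, y_min, y_max] is an assumption about COORDS, the column names follow the Xenium transcripts table, and `crop_transcripts` is a hypothetical helper rather than part of xb.

```python
# Hypothetical bounding-box crop; the COORDS ordering is an assumption.

def crop_transcripts(transcripts, coords):
    """Keep transcripts whose position lies inside [x_min, x_max, y_min, y_max]."""
    x_min, x_max, y_min, y_max = coords
    return [
        t for t in transcripts
        if x_min <= t["x_location"] <= x_max
        and y_min <= t["y_location"] <= y_max
    ]

# Toy transcript positions, invented for illustration.
toy_transcripts = [
    {"feature_name": "EPCAM", "x_location": 1200.0, "y_location": 3400.0},
    {"feature_name": "CD3E", "x_location": 800.0, "y_location": 2000.0},   # x too small
    {"feature_name": "KRT8", "x_location": 4999.0, "y_location": 5200.0},  # y too large
]
cropped = crop_transcripts(toy_transcripts, [1000, 5000, 1000, 5000])
print([t["feature_name"] for t in cropped])
```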
# Run Baysor using Docker (command line)
!docker pull louisk92/txsim_baysor:v0.6.2bin
!sudo docker run -it --rm -v /path/to/baysor/input:/data \
-v /path/to/xb/module:/module louisk92/txsim_baysor:v0.6.2bin
!cd /path/to/xb && bash run_baysor.sh "/data"
# Format Baysor output to AnnData
path='/path/to/baysor/output'
adata=format_baysor_output_to_adata(path, output_path)
# Calculate neighborhood enrichment
adata1 = neighborhood_enrichment(adata,
sample_key='sample',
radius=50,
cluster_key='cell_type')
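The radius argument defines which cells count as neighbors. A minimal sketch of the underlying counting step is shown below: it tallies pairs of cell types found within the radius of each other. Enrichment scores normally compare such counts against a shuffled-label baseline, which is omitted here, and `neighbor_pair_counts` is a hypothetical helper, not the xb function.

```python
import math
from collections import Counter
from itertools import combinations

def neighbor_pair_counts(cells, radius=50):
    """cells: list of (x, y, cell_type). Count unordered cell-type pairs
    whose Euclidean distance is at most radius."""
    counts = Counter()
    for (x1, y1, t1), (x2, y2, t2) in combinations(cells, 2):
        if math.hypot(x1 - x2, y1 - y2) <= radius:
            counts[tuple(sorted((t1, t2)))] += 1
    return counts

# Toy coordinates (same units as radius) and labels, invented for illustration.
cells = [
    (0, 0, "T cell"),
    (30, 0, "B cell"),
    (30, 30, "T cell"),
    (500, 500, "B cell"),  # isolated: contributes no pairs
]
pair_counts = neighbor_pair_counts(cells, radius=50)
print(dict(pair_counts))
```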
# Identify spatially variable genes
adata1 = spatially_variable_features(adata1,
sample_key='sample',
radius=50)
# Generate spatial plots
spatial_plot(adata, key='cell_type', clusters='all',
size=10, background='white',
figuresize=(10, 8), save=True)
# Calculate cell density
cell_density_value = cell_density(adata_sp, pipeline_output=True)
# Calculate negative marker purity
purity_score = negative_marker_purity(adata_sp, adata_sc,
key='cell_type',
pipeline_output=True)
# Compute clustering comparisons
fmi = fowlkes_mallows_index(ground_truth, predicted)
nmi = normalized_mutual_info_score(ground_truth, predicted)
vi = variation_of_information(ground_truth, predicted)
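For readers who want to see what two of these scores measure, the sketch below computes the Fowlkes-Mallows index and the variation of information from label lists using their standard definitions (natural logarithm for VI). The origin of fowlkes_mallows_index() and variation_of_information() is not shown in this snapshot, so `fmi_sketch` and `vi_sketch` are hypothetical stand-ins that may differ in detail, for example in log base.

```python
import math
from collections import Counter

def fmi_sketch(labels_true, labels_pred):
    """Fowlkes-Mallows index: geometric mean of pairwise precision and recall."""
    n = len(labels_true)
    tk = sum(c * c for c in Counter(zip(labels_true, labels_pred)).values()) - n
    pk = sum(c * c for c in Counter(labels_true).values()) - n
    qk = sum(c * c for c in Counter(labels_pred).values()) - n
    if pk == 0 or qk == 0:
        return 0.0  # no co-clustered pairs in one of the partitions
    return tk / math.sqrt(pk * qk)

def vi_sketch(labels_true, labels_pred):
    """Variation of information: H(true | pred) + H(pred | true), in nats."""
    n = len(labels_true)
    joint = Counter(zip(labels_true, labels_pred))
    n_true = Counter(labels_true)
    n_pred = Counter(labels_pred)
    vi = 0.0
    for (a, b), c in joint.items():
        r = c / n
        vi -= r * (math.log(r / (n_true[a] / n)) + math.log(r / (n_pred[b] / n)))
    return vi

truth = [0, 0, 1, 1]
same = [0, 0, 1, 1]
crossed = [0, 1, 0, 1]
print(fmi_sketch(truth, same), vi_sketch(truth, same))
print(fmi_sketch(truth, crossed), vi_sketch(truth, crossed))
```

Higher FMI means better agreement, whereas VI is a distance: it is zero for identical partitions and grows as they diverge.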
This skill includes comprehensive documentation in references/:
main_documentation.md - Core module API documentation
pipeline.md - End-to-end workflow documentation
format_to_adata() to process Xenium outputs
max_nucleus_distance and min_quality filters
spatial_plot() to explore cell type distributions
Add helper scripts for:
Add example data for:
use_parquet=True for faster loading
rate_limit parameters based on system resources
(batch_prep_xenium_data_for_baysor) for multiple samples
modify_coords_for_banksy
(pip install xb)
To refresh this skill with updated documentation: