Performs quality control on single-cell RNA-seq data (.h5ad or .h5 files) using scverse best practices with MAD-based filtering and comprehensive visualizations. Use when users request QC analysis, filtering low-quality cells, assessing data quality, or following scverse/scanpy best practices for single-cell analysis.
/plugin marketplace add sksdesignnew/claudepg
/plugin install bio-research@knowledge-work-plugins
This skill inherits all available tools. When active, it can use any tool Claude has access to.
LICENSE.txt
references/scverse_qc_guidelines.md
scripts/qc_analysis.py
scripts/qc_core.py
scripts/qc_plotting.py
Automated QC workflow for single-cell RNA-seq data following scverse best practices.
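Use when users request QC analysis, want to filter low-quality cells, need to assess data quality, or want to follow scverse/scanpy best practices for single-cell analysis.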
Supported input formats:
.h5ad files (AnnData format from scanpy/Python workflows)
.h5 files (10X Genomics Cell Ranger output)
Default recommendation: Use Approach 1 (complete pipeline) unless the user has specific custom requirements or explicitly requests non-standard filtering logic.
For standard QC following scverse best practices, use the convenience script scripts/qc_analysis.py:
python3 scripts/qc_analysis.py input.h5ad
# or for 10X Genomics .h5 files:
python3 scripts/qc_analysis.py raw_feature_bc_matrix.h5
The script automatically detects the file format and loads it appropriately.
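The format detection mentioned above could plausibly work along the following lines. This is a minimal sketch and not the script's actual code: the helper names detect_format and load_adata are hypothetical, and dispatching on the file extension is an assumption. The two reader calls, anndata.read_h5ad and scanpy.read_10x_h5, are the standard loaders for these formats.

```python
from pathlib import Path


def detect_format(path):
    """Classify an input file by its extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".h5ad":
        return "h5ad"
    if suffix == ".h5":
        return "10x_h5"
    raise ValueError(f"Unsupported input format: {path}")


def load_adata(path):
    """Load an AnnData object with the reader that matches the format."""
    fmt = detect_format(path)
    if fmt == "h5ad":
        import anndata as ad  # imported lazily so detect_format works without it
        return ad.read_h5ad(path)
    import scanpy as sc
    return sc.read_10x_h5(path)
```

With this sketch, load_adata('input.h5ad') and load_adata('raw_feature_bc_matrix.h5') would both return an AnnData object, and any other extension raises a ValueError.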
When to use this approach:
Requirements: anndata, scanpy, scipy, matplotlib, seaborn, numpy
Parameters:
Customize filtering thresholds and gene patterns using command-line parameters:
--output-dir - Output directory
--mad-counts, --mad-genes, --mad-mt - MAD thresholds for counts/genes/MT%
--mt-threshold - Hard mitochondrial % cutoff
--min-cells - Gene filtering threshold
--mt-pattern, --ribo-pattern, --hb-pattern - Gene name patterns for different species
Use --help to see current default values.
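To make the MAD thresholds concrete, here is a minimal sketch of MAD-based outlier detection on a plain list of values. It is an illustration only: the function name mad_outlier_mask and the toy counts are hypothetical, and whether qc_core.py transforms the metric (for example with a log) before applying the rule is not stated here. A value is flagged when it lies more than n_mads median absolute deviations from the median.

```python
from statistics import median


def mad_outlier_mask(values, n_mads):
    """Return a list of booleans, True where a value is a MAD outlier."""
    center = median(values)
    # Median absolute deviation: median distance of each value from the median
    mad = median([abs(v - center) for v in values])
    lower = center - n_mads * mad
    upper = center + n_mads * mad
    return [v < lower or v > upper for v in values]


# Toy per-cell total counts (made-up numbers for illustration)
total_counts = [1000, 1100, 950, 1050, 1020, 980, 12000, 40]
mask = mad_outlier_mask(total_counts, n_mads=5)
# Only the two extreme cells (12000 and 40) are flagged
```

A larger n_mads widens the accepted band and keeps more cells. For array data, scipy.stats.median_abs_deviation computes the same deviation statistic.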
Outputs:
All files are saved to <input_basename>_qc_results/ directory by default (or to the directory specified by --output-dir):
qc_metrics_before_filtering.png - Pre-filtering visualizations
qc_filtering_thresholds.png - MAD-based threshold overlays
qc_metrics_after_filtering.png - Post-filtering quality metrics
<input_basename>_filtered.h5ad - Clean, filtered dataset ready for downstream analysis
<input_basename>_with_qc.h5ad - Original data with QC annotations preserved
If copying outputs for user access, copy individual files (not the entire directory) so users can preview them directly.
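The naming scheme above can be sketched with pathlib. This is an illustration under stated assumptions: the helper default_output_paths is hypothetical, <input_basename> is taken to be the file name without its extension, and the results directory is assumed to sit next to the input file, which the script may or may not do.

```python
from pathlib import Path


def default_output_paths(input_path, output_dir=None):
    """Build the expected output locations for a given input file (hypothetical helper)."""
    input_path = Path(input_path)
    basename = input_path.stem  # assumption: basename = file name without extension
    if output_dir is not None:
        out_dir = Path(output_dir)
    else:
        # assumption: results directory is created beside the input file
        out_dir = input_path.parent / f"{basename}_qc_results"
    return {
        "before_plot": out_dir / "qc_metrics_before_filtering.png",
        "thresholds_plot": out_dir / "qc_filtering_thresholds.png",
        "after_plot": out_dir / "qc_metrics_after_filtering.png",
        "filtered": out_dir / f"{basename}_filtered.h5ad",
        "with_qc": out_dir / f"{basename}_with_qc.h5ad",
    }


paths = default_output_paths("data/sample.h5ad")
```

A mapping like this makes it straightforward to copy individual files rather than the whole directory, as recommended above.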
The script performs the following steps:
For custom analysis workflows or non-standard requirements, use the modular utility functions from scripts/qc_core.py and scripts/qc_plotting.py:
# Run from scripts/ directory, or add scripts/ to sys.path if needed
import anndata as ad
from qc_core import calculate_qc_metrics, detect_outliers_mad, filter_cells
from qc_plotting import plot_qc_distributions # Only if visualization needed
adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)
# ... custom analysis logic here
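If the code is not run from the scripts/ directory, the sys.path adjustment mentioned in the comment above might look like the following. This is a small sketch: the helper name add_scripts_to_path and the relative location "scripts" are assumptions about where the skill's files live.

```python
import sys
from pathlib import Path


def add_scripts_to_path(scripts_dir="scripts"):
    """Put the scripts directory on sys.path once so qc_core can be imported."""
    resolved = str(Path(scripts_dir).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


scripts_path = add_scripts_to_path("scripts")
# After this call, `from qc_core import ...` can resolve if scripts/ holds qc_core.py
```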
When to use this approach:
Available utility functions:
From qc_core.py (core QC operations):
calculate_qc_metrics(adata, mt_pattern, ribo_pattern, hb_pattern, inplace=True) - Calculate QC metrics and annotate adata
detect_outliers_mad(adata, metric, n_mads, verbose=True) - MAD-based outlier detection, returns boolean mask
apply_hard_threshold(adata, metric, threshold, operator='>', verbose=True) - Apply hard cutoffs, returns boolean mask
filter_cells(adata, mask, inplace=False) - Apply boolean mask to filter cells
filter_genes(adata, min_cells=20, min_counts=None, inplace=True) - Filter genes by detection
print_qc_summary(adata, label='') - Print summary statistics
From qc_plotting.py (visualization):
plot_qc_distributions(adata, output_path, title) - Generate comprehensive QC plots
plot_filtering_thresholds(adata, outlier_masks, thresholds, output_path) - Visualize filtering thresholds
plot_qc_after_filtering(adata, output_path) - Generate post-filtering plots
Example custom workflows:
Example 1: Only calculate metrics and visualize, don't filter yet
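import anndata as ad
from qc_core import calculate_qc_metrics, print_qc_summary
from qc_plotting import plot_qc_distributions
adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)
plot_qc_distributions(adata, 'qc_before.png', title='Initial QC')
print_qc_summary(adata, label='Before filtering')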
Example 2: Apply only MT% filtering, keep other metrics permissive
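import anndata as ad
from qc_core import calculate_qc_metrics, apply_hard_threshold, filter_cells
adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)
# Flag cells whose mitochondrial percentage exceeds 10%
high_mt = apply_hard_threshold(adata, 'pct_counts_mt', 10, operator='>')
# Keep only the cells that were not flagged
adata_filtered = filter_cells(adata, ~high_mt)
adata_filtered.write('filtered.h5ad')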
Example 3: Different thresholds for different subsets
import anndata as ad
from qc_core import calculate_qc_metrics, apply_hard_threshold
adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)
# Apply type-specific QC (assumes cell_type metadata exists)
neurons = adata.obs['cell_type'] == 'neuron'
other_cells = ~neurons
# Neurons tolerate higher MT%, other cells use stricter threshold
neuron_qc = apply_hard_threshold(adata[neurons], 'pct_counts_mt', 15, operator='>')
other_qc = apply_hard_threshold(adata[other_cells], 'pct_counts_mt', 8, operator='>')
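Example 3 stops at two per-subset masks, each only as long as its subset. One way to merge them back into a single mask over all cells is sketched below with NumPy on toy data. The helper combine_subset_masks and the numbers are hypothetical, and the sketch assumes each subset mask preserves the cell order of its subset.

```python
import numpy as np


def combine_subset_masks(n_cells, subsets):
    """Merge per-subset flags into one full-length boolean mask.

    subsets: list of (selector, subset_mask) pairs, where selector is a
    boolean array over all cells and subset_mask has one entry per selected cell.
    """
    flagged = np.zeros(n_cells, dtype=bool)
    for selector, subset_mask in subsets:
        selector = np.asarray(selector, dtype=bool)
        # Write the subset's flags back at the positions the selector picked
        flagged[selector] = np.asarray(subset_mask, dtype=bool)
    return flagged


# Toy data: 5 cells, two of them neurons (made-up MT percentages)
is_neuron = np.array([True, False, True, False, False])
pct_mt = np.array([12.0, 9.0, 16.0, 5.0, 8.5])
neuron_flag = pct_mt[is_neuron] > 15   # lenient threshold for neurons
other_flag = pct_mt[~is_neuron] > 8    # stricter threshold for other cells

flagged = combine_subset_masks(
    len(pct_mt), [(is_neuron, neuron_flag), (~is_neuron, other_flag)]
)
keep = ~flagged
```

In the workflow of Example 3, neurons.values, neuron_qc, and other_qc would take the place of the toy arrays, and the resulting keep mask could then be passed to filter_cells(adata, keep) as in Example 2.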
For detailed QC methodology, parameter rationale, and troubleshooting guidance, see references/scverse_qc_guidelines.md. This reference provides:
Load this reference when users need deeper understanding of the methodology or when troubleshooting QC issues.
Typical downstream analysis steps: