From mims-harvard-tooluniverse
Processes epigenomics data: methylation arrays (CpG filtering, differential analysis), ChIP-seq peaks (calling, motifs), ATAC-seq accessibility, multi-omics integration using Python and annotation tools.
npx claudepluginhub joshuarweaver/cascade-data-analytics --plugin mims-harvard-tooluniverse
This skill uses the workspace's default tool permissions.
Production-ready skill combining Python computation (pandas, scipy, numpy, pysam, statsmodels) with ToolUniverse annotation tools for epigenomics analysis.
When uncertain about any scientific fact, SEARCH databases first.
Methylation data, ChIP-seq peaks, ATAC-seq, multi-omics integration, genome-wide epigenomic statistics. Keywords: methylation, CpG, ChIP-seq, ATAC-seq, histone, chromatin, epigenetic.
NOT for: RNA-seq DEG, variant calling, gene enrichment, protein structure.
Identify data files, specific statistic, thresholds, genome build. Categorize by keywords.
See ANALYSIS_PROCEDURES.md for decision tree.
ENCODE tools:
- ENCODE_search_rnaseq_experiments: assay_type ("total RNA-seq" default; fall back to "polyA plus RNA-seq"), biosample, limit
- ENCODE_search_histone_experiments: target (e.g., "H3K27ac"), cell_type/tissue/biosample, limit

GEO tools: GEO_search_rnaseq_datasets, GEO_search_atacseq_datasets (both accept limit or max_results)
GTEx tools:
- GTEx_get_median_gene_expression: gene_symbol (NOT Ensembl ID)
- GTEx_query_eqtl: gene_symbol, tissue_id (case-sensitive exact, e.g., "Whole_Blood")

Other: ensembl_lookup_gene (requires species='homo_sapiens'), ensembl_get_regulatory_features (NO "chr" prefix), SCREEN_get_regulatory_elements, ChIPAtlas_* (requires operation param), SRA_search_experiments (library_strategy: "ChIP-Seq"/"Bisulfite-Seq"/"ATAC-seq")
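The parameter constraints noted for these tools (gene symbols rather than Ensembl IDs, no "chr" prefix, case-sensitive tissue IDs) can be checked before a call is made. The sketch below is a minimal, standard-library illustration; the helper names are hypothetical and are not part of ToolUniverse, and the tissue set contains only the IDs listed in this document.

```python
import re

# Tissue IDs copied from this document's list; not the full GTEx vocabulary.
KNOWN_TISSUE_IDS = {
    "Whole_Blood", "Liver", "Lung", "Breast_Mammary_Tissue", "Brain_Cortex",
    "Heart_Left_Ventricle", "Kidney_Cortex", "Thyroid",
    "Adipose_Subcutaneous", "Muscle_Skeletal",
}

# Human Ensembl gene IDs: "ENSG" + 11 digits, optionally with a version suffix.
ENSEMBL_GENE_ID = re.compile(r"^ENSG\d{11}(\.\d+)?$")


def strip_chr_prefix(chrom: str) -> str:
    """Drop a leading 'chr' (e.g. 'chr17' -> '17').

    Mitochondrial naming differences (chrM vs MT) are not handled here.
    """
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def looks_like_ensembl_id(gene: str) -> bool:
    """True if the string has the shape of a human Ensembl gene ID."""
    return bool(ENSEMBL_GENE_ID.match(gene))


def check_tissue_id(tissue_id: str) -> str:
    """Return tissue_id unchanged if it matches exactly, else raise ValueError."""
    if tissue_id in KNOWN_TISSUE_IDS:
        return tissue_id
    # Offer the correctly cased ID when only the letter case differs.
    by_lower = {t.lower(): t for t in KNOWN_TISSUE_IDS}
    hint = by_lower.get(tissue_id.lower())
    message = f"Unknown tissue_id {tissue_id!r}"
    if hint:
        message += f"; did you mean {hint!r}?"
    raise ValueError(message)
```

Running such checks first turns a silent empty result from a mistyped tissue ID or a prefixed chromosome name into an explicit error.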
Global mean/median beta, probe variance, chromosome density, DMP counts.
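As a rough illustration of the first two statistics, the sketch below computes global mean and median beta and per-probe sample variance with the standard library only. The probe IDs and beta values are made up, and the function names are hypothetical; a real analysis would more likely run on a pandas DataFrame.

```python
from statistics import mean, median, variance

# Toy beta matrix: probe ID -> beta value per sample (made-up numbers).
betas = {
    "cg0001": [0.10, 0.20, 0.30],
    "cg0002": [0.80, 0.80, 0.80],
    "cg0003": [0.40, 0.50, 0.45],
}


def global_beta_summary(beta_matrix):
    """Mean and median over every beta value in the matrix."""
    values = [b for row in beta_matrix.values() for b in row]
    return {"mean": mean(values), "median": median(values)}


def probe_variances(beta_matrix):
    """Sample variance (n - 1 denominator) of each probe across samples."""
    return {probe: variance(row) for probe, row in beta_matrix.items()}


def most_variable_probes(beta_matrix, top_n):
    """Probe IDs ordered by decreasing variance, truncated to top_n."""
    var = probe_variances(beta_matrix)
    return sorted(var, key=var.get, reverse=True)[:top_n]


summary = global_beta_summary(betas)
top_probes = most_variable_probes(betas, top_n=2)
```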
See CODE_REFERENCE.md for full implementations.
| Pattern | Key Steps |
|---|---|
| Differential methylation | Filter probes → groups → t-test → FDR → threshold |
| Age-related CpG density | Correlate with age → FDR → map to chr → density ratio |
| Multi-omics missing data | Extract IDs → intersect → check NaN → complete case count |
| ChIP-seq annotation | Load peaks → annotate genes → classify regions |
| Methylation-expression | Align samples → correlate → FDR → anti-correlations |
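The "Multi-omics missing data" row can be sketched as follows. This is a minimal standard-library version, assuming each omics layer is held as a mapping from sample ID to a list of feature values; the function name and the toy data are hypothetical.

```python
import math


def complete_case_samples(*layers):
    """Return (shared_ids, complete_ids) across omics layers.

    Each layer maps sample ID -> list of feature values.
    shared_ids: samples present in every layer.
    complete_ids: shared samples with no None/NaN value in any layer.
    """
    if not layers:
        return [], []

    # Extract IDs and intersect them across layers.
    shared = set(layers[0])
    for layer in layers[1:]:
        shared &= set(layer)

    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    # Keep only samples with fully observed values in every layer.
    complete = [
        sample
        for sample in sorted(shared)
        if not any(is_missing(v) for layer in layers for v in layer[sample])
    ]
    return sorted(shared), complete


# Toy data: S4 and S5 each appear in only one layer; S2 and S3 have gaps.
methylation = {
    "S1": [0.1, 0.9],
    "S2": [0.2, float("nan")],
    "S3": [0.3, 0.7],
    "S4": [0.5, 0.5],
}
expression = {"S1": [5.0], "S2": [6.1], "S3": [None], "S5": [2.0]}

shared_ids, complete_ids = complete_case_samples(methylation, expression)
complete_case_count = len(complete_ids)
```

Reporting both the shared and the complete-case counts shows how many samples are lost to missing IDs versus missing values.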
GTEx tissue_id values: Whole_Blood, Liver, Lung, Breast_Mammary_Tissue, Brain_Cortex, Heart_Left_Ventricle, Kidney_Cortex, Thyroid, Adipose_Subcutaneous, Muscle_Skeletal
| Grade | Criteria |
|---|---|
| Strong | padj < 0.01 AND abs(delta-beta) >= 0.2, replicated |
| Moderate | padj < 0.05 AND abs(delta-beta) >= 0.1 |
| Weak | padj < 0.05 but abs(delta-beta) < 0.1 |
| Insufficient | padj >= 0.05 or no replication |
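The grading table can be expressed as a small function. This sketch rests on one loudly stated assumption: the table does not say how an unreplicated result that meets the other Strong thresholds should be graded, so the code treats replication as a requirement for Strong only and lets such a result fall to Moderate. The function name is hypothetical.

```python
def grade_evidence(padj, delta_beta, replicated=False):
    """Map an adjusted p-value and delta-beta to an evidence grade.

    ASSUMPTION: replication is required only for "Strong"; an unreplicated
    result with padj < 0.05 is graded by effect size alone. The source
    table is ambiguous on this point.
    """
    effect = abs(delta_beta)
    if padj >= 0.05:
        return "Insufficient"
    if padj < 0.01 and effect >= 0.2 and replicated:
        return "Strong"
    if effect >= 0.1:
        return "Moderate"
    return "Weak"
```

If the stricter reading is intended (any unreplicated result is Insufficient), an early `if not replicated: return "Insufficient"` check would implement it.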
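An absolute delta-beta of at least 0.2 indicates a strong effect. For ChIP-seq, treat peaks with q < 0.01 and fold enrichment (FE) >= 2 as confident. In ATAC-seq, nucleosome-free region (NFR) fragments shorter than 150 bp mark active regulatory regions. Always apply Benjamini-Hochberg (BH) FDR correction, and verify that all inputs use the same genome build.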
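For reference, the BH adjustment itself is short enough to write out. The sketch below is a dependency-free illustration of the standard step-up procedure, not the skill's own implementation; in practice `statsmodels.stats.multitest.multipletests(..., method="fdr_bh")` performs the same adjustment.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    n = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
# padj is approximately [0.02, 0.04, 0.04, 0.02]
```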
CODE_REFERENCE.md, TOOLS_REFERENCE.md, ANALYSIS_PROCEDURES.md, QUICK_START.md