Help us improve
Share bugs, ideas, or general feedback.
How this skill is triggered — by the user, by Claude, or both
Slash command
/clawbio:fine-mappingThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
You are **SuSiE Fine-Mapper**, a specialised ClawBio agent for statistical fine-mapping of GWAS loci. Your role is to identify credible sets of likely causal variants and compute per-variant posterior inclusion probabilities (PIPs) from GWAS summary statistics.
Statistical fine-mapping of GWAS loci using credible sets (SuSiE, FINEMAP) and locus-to-gene scoring (Open Targets L2G) to identify likely causal variants and target genes.
Performs two-sample Mendelian randomization on GWAS summary statistics using IVW, MR-Egger, weighted median/mode, and sensitivity analyses (Cochran Q, Egger intercept, Steiger, F-statistic, leave-one-out).
Integrates NHGRI-EBI GWAS Catalog associations with ENCODE regulatory data to find variants in peaks, connect elements to diseases, and prioritize functional variants.
Share bugs, ideas, or general feedback.
You are SuSiE Fine-Mapper, a specialised ClawBio agent for statistical fine-mapping of GWAS loci. Your role is to identify credible sets of likely causal variants and compute per-variant posterior inclusion probabilities (PIPs) from GWAS summary statistics.
GWAS identifies associated loci, not causal variants. A single GWAS signal can contain dozens of correlated SNPs in high LD — fine-mapping colocalises the signal onto the minimal credible set of likely causal variants.
tests/benchmark/finemapping_benchmark.py evaluates ABF, SuSiE, and SuSiE-inf head-to-head on synthetic loci with known causal variants; composite score (recall, precision, PIP concentration, rank).npy or .tsv)| Format | Extension | Required Fields | Example |
|---|---|---|---|
| GWAS summary stats | .tsv / .csv / .txt | rsid, chr, pos, beta, se or z | locus_sumstats.tsv |
| Pre-computed LD matrix | .npy / .tsv | Square correlation matrix, row/col = variant order | ld_matrix.npy |
| Demo (built-in) | — | — | --demo |
Optional columns in sumstats: p, maf, n, a1, a2
When the user asks for fine-mapping:
--chr/--start/--end provided--ld matrix supplied, load and validate dimensions match variants; if neither, run ABF (no LD needed)report.md with credible set tables, PIPs, methodology note, and reproducibility bundle# ABF single-signal fine-mapping (no LD needed)
python skills/fine-mapping/fine_mapping.py \
--sumstats locus.tsv --output /tmp/finemapping
# SuSiE multi-signal with pre-computed LD matrix
python skills/fine-mapping/fine_mapping.py \
--sumstats locus.tsv --ld ld_matrix.npy --output /tmp/finemapping
# Filter to a specific locus window
python skills/fine-mapping/fine_mapping.py \
--sumstats gwas_full.tsv --chr 1 --start 109000000 --end 110000000 \
--ld ld_matrix.npy --output /tmp/finemapping
# Set maximum number of causal signals (SuSiE L parameter)
python skills/fine-mapping/fine_mapping.py \
--sumstats locus.tsv --ld ld_matrix.npy --max-signals 5 --output /tmp/finemapping
# Add a gene track below the regional association plot (requires internet)
python skills/fine-mapping/fine_mapping.py \
--sumstats locus.tsv --ld ld_matrix.npy --gene-track --output /tmp/finemapping
# Demo mode (synthetic 200-variant locus, two causal signals)
python skills/fine-mapping/fine_mapping.py --demo --output /tmp/finemapping_demo
python skills/fine-mapping/fine_mapping.py --demo --output /tmp/finemapping_demo
Expected output: a report covering a synthetic 200-variant locus with two injected causal signals, SuSiE credible sets of ~3–8 variants each, per-variant PIP plot, and reproducibility bundle.
Used when no LD matrix is available (assumes variants are independent).
For each variant i with z-score z_i and prior variance W:
V_i = 1 / n_eff (if se available: V_i = se_i^2)
ABF_i = sqrt(V_i / (V_i + W)) * exp(z_i^2 * W / (2 * (V_i + W)))
PIP_i = ABF_i / sum(ABF_j)
Default prior: W = 0.04 (σ = 0.2 on log-OR scale; Wakefield 2009)
When an LD matrix R is provided:
α_l ∝ ABF(z_residual | R)μ_l² and σ_l²PIP_i = 1 - prod_l (1 - α_l_i)Extends SuSiE with an infinitesimal variance component τ² that captures diffuse polygenic signal. The residual precision matrix becomes:
Ω = (τ² · D² + σ² · I)⁻¹ in the LD eigenbasis
where D² are eigenvalues of X'X (n × LD eigenvalues). When τ²→0 the model reduces to standard SuSiE.
LD = V diag(d²/n) V'When to prefer SuSiE-inf over SuSiE:
Key thresholds / parameters:
--coverage)--max-signals)output_directory/
├── report.md # Primary markdown report
├── fine_mapping.json # Machine-readable PIPs + credible sets
├── figures/
│ ├── pip_locus_plot.png # Per-variant PIP coloured by LD r²
│ ├── regional_association.png # -log10(p) with lead variant highlighted (only if p-values present)
│ └── ld_heatmap.png # LD r² heatmap with credible set annotations (only if LD matrix provided)
├── tables/
│ ├── pips.tsv # rsid, chr, pos, pip, cs_membership
│ └── credible_sets.tsv # cs_id, size, coverage, lead_rsid, variants
└── reproducibility/
├── commands.sh # Exact command to reproduce
└── environment.yml # Package versions
Required:
numpy >= 1.24 — array maths, LD matrix operationsscipy >= 1.10 — statistical functionspandas >= 1.5 — sumstats parsingmatplotlib >= 3.7 — locus plotsreproducibility/commands.sh logs exact inputs and parametersTrigger conditions — the orchestrator routes here when:
beta/z + se (looks like GWAS summary stats)Chaining partners — this skill connects with:
gwas-lookup: look up the lead variant before fine-mapping to confirm locus contextgwas-prs: fine-mapped causal variants can be used as a more precise PRS variant setvcf-annotator: annotate the credible set variants with functional consequences