From science-superpowers
Creates concrete analysis plans from approved research questions, covering model specs, confounds, power, and pipeline structure. Use before touching outcome data or fitting models.
npx claudepluginhub k-dense-ai/science-superpowers --plugin science-superpowers
How this skill is triggered: by the user, by Claude, or both
Slash command
/science-superpowers:designing-the-analysis
Write a comprehensive analysis plan assuming the analyst has zero context for this project and questionable statistical taste. Document everything they need: which datasets and variables, how each construct is computed, the exact model or test, the sample size / power justification, which confounds are handled and how, the decision rules, and the planned figures. Give them the whole thing as bite-sized steps. DRY. YAGNI. Pre-register. Validate pipelines on known data. Commit frequently.
Assume they are a capable programmer but know almost nothing about this domain, this dataset, or good statistical design.
Announce at start: "I'm using the designing-the-analysis skill to create the analysis plan."
Save plans to: docs/science-superpowers/plans/YYYY-MM-DD-<topic>.md
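A minimal sketch of building that filename consistently. The directory and the `YYYY-MM-DD-<topic>.md` pattern come from the line above; the `plan_path` helper and its slug rule (lowercase, hyphen-separated) are assumptions for illustration, not part of the skill.

```python
import re
from datetime import date
from pathlib import PurePosixPath

# Directory named in the text above.
PLANS_DIR = PurePosixPath("docs/science-superpowers/plans")


def plan_path(topic: str, on: date) -> PurePosixPath:
    """Build plans/YYYY-MM-DD-<topic>.md. The slug rule is an assumption."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    if not slug:
        raise ValueError("topic must contain at least one letter or digit")
    return PLANS_DIR / f"{on.isoformat()}-{slug}.md"


print(plan_path("Exposure effect on outcome", date(2025, 1, 31)))
```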
If the question document still bundles several independent investigations, stop and split it — one plan per question. Each plan should produce an interpretable, self-contained result.
(docs/science-superpowers/questions/...)
If you don't have a prior effect size, the design must include how you will justify the sample size anyway (smallest effect of interest, precision target, or a sensitivity analysis).
Before defining steps, map the pipeline. Data flows one direction: raw → cleaned → derived → results.
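One way to make the one-direction rule checkable is to write the map down as data and test it. The sketch below assumes a `data/<stage>/<file>` layout and a hypothetical `flow_violations` helper; the stage names come from the line above, everything else (step names, file names) is illustrative.

```python
from pathlib import PurePosixPath

# Stage order from the text: raw -> cleaned -> derived -> results.
STAGES = ["raw", "cleaned", "derived", "results"]


def stage_index(path: str) -> int:
    """Position of a path in the pipeline, assuming a data/<stage>/<file> layout."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3 or parts[0] != "data" or parts[1] not in STAGES:
        raise ValueError(f"{path!r} is not under data/<stage>/")
    return STAGES.index(parts[1])


def flow_violations(steps: dict) -> list:
    """List steps that write into raw or write to a stage earlier than one they read."""
    problems = []
    for name, io in steps.items():
        for out in io["writes"]:
            if stage_index(out) == 0:
                problems.append(f"{name}: writes into raw ({out})")
                continue
            for src in io["reads"]:
                if stage_index(out) < stage_index(src):
                    problems.append(f"{name}: {src} -> {out} flows backwards")
    return problems


# Hypothetical pipeline map; the "patch" step deliberately breaks the rule.
plan = {
    "clean": {"reads": ["data/raw/exact_file.csv"],
              "writes": ["data/cleaned/exact_file.parquet"]},
    "derive": {"reads": ["data/cleaned/exact_file.parquet"],
               "writes": ["data/derived/exact_output.parquet"]},
    "patch": {"reads": ["data/derived/exact_output.parquet"],
              "writes": ["data/cleaned/exact_file.parquet"]},
}
print(flow_violations(plan))
```

Only the `patch` step is reported, because it writes a cleaned file from a derived one.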
For each known confound from the survey, state how it is handled: measured and adjusted for, stratified, matched, design-excluded, or explicitly acknowledged as a limitation. "We'll see" is not a plan.
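A small sketch of enforcing that rule mechanically: every confound must map to one of the handling strategies listed above. The confound names and the `unhandled` helper are hypothetical, and the strategy labels are shorthand for the options in the text.

```python
# Shorthand labels for the handling strategies listed in the text above.
ALLOWED = {
    "adjusted",                 # measured and adjusted for
    "stratified",
    "matched",
    "design-excluded",
    "acknowledged-limitation",
}


def unhandled(confounds: dict) -> list:
    """Return confounds whose handling is not one of the allowed strategies."""
    return sorted(name for name, how in confounds.items() if how not in ALLOWED)


# Hypothetical confound register, for illustration only.
confounds = {"age": "adjusted", "site": "stratified", "season": "we'll see"}
print(unhandled(confounds))
```

Here `season` is flagged because "we'll see" is not a handling strategy.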
State the threats to validity you are accepting and why.
State the target effect size, alpha, desired power, and the resulting required N — or, for a fixed existing sample, the minimum detectable effect at the planned power. If underpowered, say so and decide with your human partner whether to proceed (e.g., reframe as estimation, not a hypothesis test).
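As a rough sketch of both directions of that calculation, the code below uses the normal approximation for a two-sided, two-group comparison of means with a standardized effect size (Cohen's d). The two-group design, the function names, and the example numbers are assumptions; a real plan should use the power method that matches its primary model.

```python
from math import ceil, sqrt
from statistics import NormalDist


def _z_sum(alpha: float, power: float) -> float:
    """z_{1 - alpha/2} + z_{power} for a two-sided test."""
    z = NormalDist().inv_cdf
    return z(1 - alpha / 2) + z(power)


def required_n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Required N per group to detect standardized effect d (normal approximation)."""
    return ceil(2 * (_z_sum(alpha, power) / d) ** 2)


def minimum_detectable_effect(n_per_group: int, alpha: float = 0.05,
                              power: float = 0.80) -> float:
    """Smallest standardized effect detectable with a fixed N per group."""
    return _z_sum(alpha, power) * sqrt(2 / n_per_group)


print(required_n_per_group(0.5))                      # 63
print(round(minimum_detectable_effect(100), 3))       # 0.396
```

The normal approximation runs slightly low compared with an exact t-based calculation, so treat the N it returns as a lower bound.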
Each step is one action (2-5 minutes). Every step that touches data is paired with a validation — the science analog of watching a test fail then pass:
The simulated-data validation step is mandatory for any nontrivial estimator or model: if you never watched your pipeline recover a known signal, you don't know it works.
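A minimal sketch of such a validation, assuming a binary exposure and a simple difference-in-means estimator as a stand-in for the real model. The function names, sample size, seeds, and tolerance are illustrative choices, not values from the skill.

```python
import random


def simulate(n: int, true_effect: float, seed: int) -> list:
    """Generate (exposure, outcome) rows with a known, planted effect."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        exposure = rng.choice([0, 1])
        outcome = 1.0 + true_effect * exposure + rng.gauss(0, 1)
        rows.append((exposure, outcome))
    return rows


def estimate_effect(rows: list) -> float:
    """Difference in mean outcome, exposed minus unexposed (stand-in estimator)."""
    exposed = [y for x, y in rows if x == 1]
    unexposed = [y for x, y in rows if x == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


# "Fail" case: no planted signal, so the estimate should sit near zero.
null_estimate = estimate_effect(simulate(5000, true_effect=0.0, seed=2))
# "Pass" case: planted signal of 0.5, so the estimate should sit near 0.5.
planted_estimate = estimate_effect(simulate(5000, true_effect=0.5, seed=1))

TOLERANCE = 0.15  # about five standard errors at this sample size
assert abs(null_estimate) < TOLERANCE
assert abs(planted_estimate - 0.5) < TOLERANCE
```

Running both cases matters: recovering the planted effect shows the pipeline can find a signal, and the null case shows it does not manufacture one.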
Every plan MUST start with this header:
# [Question] Analysis Plan
> **For agentic workers:** REQUIRED SUB-SKILL: pre-register this plan with science-superpowers:preregistering-analysis BEFORE execution. Then use science-superpowers:subagent-driven-analysis (recommended) or science-superpowers:executing-analysis to run it step-by-step. Steps use checkbox (`- [ ]`) syntax for tracking.
**Question:** [the falsifiable question, one sentence]
**Design:** [observational/experimental; cross-sectional/longitudinal; the comparison]
**Data:** [datasets, sample, unit of analysis]
**Primary analysis:** [the one model/test that answers the question]
**Decision rule:** [exactly what result confirms vs. disconfirms H1]
---
### Task N: [Analysis component]
**Artifacts:**
- Create: `analysis/exact_script.py`
- Reads: `data/raw/exact_file.csv` (immutable)
- Writes: `data/derived/exact_output.parquet`
- [ ] **Step 1: Write the loading/transform code**
```python
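import pandas as pd
df = pd.read_csv("data/raw/exact_file.csv")  # raw file is read-only
clean = df[df["value"].between(0, 100)]  # keep in-range values only (bounds inclusive)
clean.to_parquet("data/derived/exact_output.parquet")  # needs pyarrow or fastparquet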
```
- [ ] **Step 2: Validate the step**
Run: `python analysis/exact_script.py --check`
Expected: `rows in: 10342, rows out: 10298, dropped: 44 (out-of-range)` — dropped count matches the known data-quality issue, not silent loss.
- [ ] **Step 3: Run the primary model exactly as specified**
```python
import statsmodels.formula.api as smf
model = smf.ols("outcome ~ exposure + age + site", data=clean).fit()
```
- [ ] **Step 4: Apply the pre-registered decision rule**
The estimate for `exposure` is interpreted against the rule fixed in the pre-registration — not re-decided here.
- [ ] **Step 5: Commit**
```bash
git add analysis/exact_script.py data/derived/exact_output.parquet
git commit -m "analysis: primary model for exposure effect"
```
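The `--check` run in Step 2 of the template is not specified further, so here is one hedged sketch of the row accounting it could perform. The `row_report` helper and its `expected_dropped` argument are hypothetical; the point is that the dropped count is compared against a number written down in advance, so silent loss fails loudly.

```python
def row_report(values: list, expected_dropped: int,
               low: float = 0, high: float = 100) -> str:
    """Filter to [low, high] and fail if the dropped count is not the expected one."""
    kept = [v for v in values if low <= v <= high]
    dropped = len(values) - len(kept)
    if dropped != expected_dropped:
        raise AssertionError(f"dropped {dropped} rows, expected {expected_dropped}")
    return f"rows in: {len(values)}, rows out: {len(kept)}, dropped: {dropped} (out-of-range)"


# Toy input with two known out-of-range values (101 and -3).
print(row_report([5, 50, 100, 101, -3], expected_dropped=2))
```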
Every step must contain the actual content the analyst needs. These are plan failures — never write them:
After writing the plan, re-read the question document with fresh eyes and check the plan against it:
Fix issues inline. If a question requirement has no task, add the task.
The plan is not ready to execute until its predictions and decision rules are locked.
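To illustrate what a locked decision rule can look like in code, the sketch below stores the rule in a frozen dataclass so it cannot be edited after the fact. The rule itself (compare a confidence interval against a smallest effect size of interest, `sesoi`) is a hypothetical example, not the rule this skill prescribes.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class DecisionRule:
    """Hypothetical rule: H1 predicts an effect of at least `sesoi`."""
    sesoi: float  # smallest effect size of interest, fixed before seeing outcomes

    def verdict(self, ci_low: float, ci_high: float) -> str:
        if ci_low >= self.sesoi:
            return "confirms H1"       # whole interval at or above the threshold
        if ci_high < self.sesoi:
            return "disconfirms H1"    # whole interval below the threshold
        return "inconclusive"          # interval straddles the threshold


rule = DecisionRule(sesoi=0.2)
print(rule.verdict(0.25, 0.60))
print(rule.verdict(-0.05, 0.10))
print(rule.verdict(0.05, 0.40))

try:
    rule.sesoi = 0.1  # moving the goalposts after the fact
except FrozenInstanceError:
    print("rule is locked")
```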
REQUIRED NEXT SKILL: Use science-superpowers:preregistering-analysis to freeze the confirmatory hypotheses, predictions, and decision rules before any outcome is observed. Execution happens only after that.