Help us improve
Share bugs, ideas, or general feedback.
From science-superpowers
Investigates surprising, impossible, or contradictory results in data pipelines, models, or experiments. Guides root-cause analysis before any adjustments.
npx claudepluginhub k-dense-ai/science-superpowers --plugin science-superpowersHow this skill is triggered — by the user, by Claude, or both
Slash command
/science-superpowers:investigating-anomalous-resultsThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Random tweaks waste time and manufacture false findings. Quietly dropping the inconvenient data point, nudging the cutoff, or re-running until it "works" doesn't fix the problem — it fabricates a result.
Scientific bug hunting using falsifiable hypotheses. Forms hypotheses, designs falsifying tests, eliminates candidates systematically, and logs the full investigation trail in a structured debug/ folder. TRIGGER when: user has a bug to investigate scientifically; user wants systematic root-cause analysis; user says "debug", "investigate", "root cause", "why is this failing"; user invokes /autoresearch:debug. DO NOT TRIGGER when: user wants to optimize a metric (use /autoresearch); user wants to fix a known error automatically (use /autoresearch:fix); user just wants a quick one-line answer about what a function does.
Mid-analysis course correction for data science workflows: diagnose wrong results, notebook errors, or data changes before fixing. Loads context, identifies root cause, then applies output-first verification.
Enforces fresh verification of analysis results before making claims. Requires running the analysis from raw data and reading actual output before reporting any finding.
Share bugs, ideas, or general feedback.
Random tweaks waste time and manufacture false findings. Quietly dropping the inconvenient data point, nudging the cutoff, or re-running until it "works" doesn't fix the problem — it fabricates a result.
Core principle: ALWAYS find the root cause before adjusting anything. An adjustment made before you understand the cause is, at best, noise and, at worst, fraud.
Violating the letter of this process is violating the spirit of it.
The first question for any anomaly: is this a code bug, a data issue, or a real finding? You cannot answer by guessing, and you must not "fix" it until you know — because a real finding is not something to fix.
NO ADJUSTMENTS WITHOUT ROOT-CAUSE INVESTIGATION FIRST
No dropping data, changing a test, transforming a variable, re-running with new parameters, or "cleaning" until Phase 1 is complete. If you haven't found the cause, you cannot justify the adjustment.
Use this ESPECIALLY when:
Complete each before the next.
Before any adjustment:
git diff, data provenance, package versions.raw: N=10342, outcome mean=0.31, nulls=0
cleaned: N=10298, outcome mean=0.31, nulls=0 <- 44 dropped, expected
derived: N=10298, feature mean=4e7 <- WRONG: unit blew up here
result: coefficient enormous <- symptom; cause is upstream
Resolve according to the cause you established:
If 3+ adjustments fail: STOP. The pipeline or the design may be wrong — repeated failures that each reveal a new problem indicate a structural issue, not a series of small bugs. Discuss the design with your human partner before another attempt. Re-opening the design or pre-registration must be documented as a deviation.
All of these mean: STOP. Return to Phase 1.
| Excuse | Reality |
|---|---|
| "It's clearly an outlier, just remove it" | "Clearly" is a guess. Find why it's extreme; document if you remove it. |
| "Re-running with a new seed fixed it" | You changed the result by chance, not the cause. Investigate. |
| "The good result is the right one" | Convenient results need MORE scrutiny, not less. |
| "Emergency, no time to investigate" | Investigation is faster than retracting a wrong finding. |
| "Tweak the model, then understand it" | The first tweak sets a false trail. Understand first. |
| "Performance is just high, ship it" | Suspiciously high performance is usually leakage. Check. |
| Phase | Activities | Success Criteria |
|---|---|---|
| 1. Characterize | Read, reproduce, check changes, instrument stages, classify | Know WHERE and WHICH (bug/data/finding) |
| 2. Pattern | Working comparison, reference, list differences | Identify the difference |
| 3. Hypothesis | Single theory, minimal test | Confirmed or new hypothesis |
| 4. Resolution | Fix bug / document data rule / report real finding | Anomaly explained, not just hidden |