From papermill
This skill should be used when the user asks to "design an experiment", "plan my experiments", "set up a benchmark", "how should I test my thesis", "design a computational study", or needs to plan experiments for a research paper. Covers hypothesis formulation, variable identification, methodology selection, and success criteria definition. Produces a structured experiment plan with reproducibility in mind.
npx claudepluginhub queelius/claude-anvil --plugin papermill
How this skill is triggered (by the user, by Claude, or both): slash command
/papermill:experiment
The summary Claude sees in its skill listing, used to decide when to auto-load this skill:
Help the researcher design rigorous experiments or computational studies. Good experiments are hypothesis-driven, reproducible, and have clear success criteria before they are run.
Read .papermill/state.md (Read tool) for the thesis or claim the experiments should support and for any experiments already registered.
If .papermill/state.md does not exist, ask the user what claim the experiments should test. Experiment design can proceed without the state file — suggest running /papermill:init afterward to register experiments persistently.
Scan the repository for existing code (Glob tool) in research/, code/, scripts/, experiments/, or analysis/ directories.
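The repository scan described above can be sketched in Python as follows. This is only an illustration of the idea, not how the Glob tool works internally: the directory names come from the text, while the function name `find_existing_code` and the list of file suffixes are assumptions made for the example.

```python
from pathlib import Path

# Directory names are taken from the text above.
CANDIDATE_DIRS = ("research", "code", "scripts", "experiments", "analysis")
# Assumption: which suffixes count as "code" is a guess for illustration.
CODE_SUFFIXES = {".py", ".R", ".r", ".jl", ".ipynb", ".sh"}


def find_existing_code(repo_root):
    """Return code files found under the candidate directories, sorted by path."""
    root = Path(repo_root)
    found = []
    for name in CANDIDATE_DIRS:
        directory = root / name
        if not directory.is_dir():
            continue  # Missing directories are simply skipped.
        for path in directory.rglob("*"):
            if path.is_file() and path.suffix in CODE_SUFFIXES:
                found.append(path.relative_to(root))
    return sorted(found)
```

Calling `find_existing_code(".")` from the repository root returns paths relative to that root, which makes it easy to see whether an earlier experiment script can be reused rather than rewritten.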
Ask the user: "What specific claim or aspect of your thesis do these experiments need to support?"
Different contribution types need different experimental approaches:
| Contribution | Experimental approach |
|---|---|
| Theorem/proof | Numerical validation of theoretical predictions |
| Algorithm | Runtime/accuracy benchmarks against baselines |
| Statistical method | Monte Carlo simulations with known ground truth |
| Empirical finding | Controlled experiments with statistical tests |
| Framework/model | Case studies demonstrating applicability |
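As one illustration of the "Monte Carlo simulations with known ground truth" row, the sketch below checks how often a normal-approximation 95% confidence interval for a mean covers the true value. Everything here is an assumption chosen for the example (the function name `ci_coverage`, the Gaussian data-generating process, the sample size, and the replication count); a real study would substitute the method under test.

```python
import random
import statistics


def ci_coverage(true_mean=0.0, true_sd=1.0, n=50, replications=2000, seed=1):
    """Fraction of replications whose approximate 95% CI contains true_mean."""
    rng = random.Random(seed)  # Fixed seed so the run is reproducible.
    z = 1.96  # Approximate two-sided 95% normal quantile.
    hits = 0
    for _ in range(replications):
        # The ground truth (true_mean, true_sd) is known because we generate the data.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / n ** 0.5
        if mean - half_width <= true_mean <= mean + half_width:
            hits += 1
    return hits / replications
```

Because the true mean is set by the simulation itself, the empirical coverage returned by `ci_coverage()` can be compared directly with the nominal 95% level, which is the kind of check that is unavailable on real data.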
For each experiment, specify:
- Hypothesis: State the expected outcome in falsifiable terms. "We expect X to be Y under conditions Z", not "we want to show our method works."
- Success criteria: Define before running what constitutes support for the hypothesis. This prevents post-hoc rationalization.
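One way to make "define before running" concrete is to record the thresholds as immutable data before any results exist. The sketch below is a hypothetical structure, not part of papermill: the class names `SuccessCriterion` and `ExperimentPlan`, their fields, and the `evaluate` method are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuccessCriterion:
    """A threshold fixed before the experiment runs (hypothetical structure)."""
    metric: str
    threshold: float
    higher_is_better: bool = True

    def is_met(self, observed):
        if self.higher_is_better:
            return observed >= self.threshold
        return observed <= self.threshold


@dataclass(frozen=True)
class ExperimentPlan:
    """A falsifiable hypothesis plus the criteria that would support it."""
    name: str
    hypothesis: str
    criteria: tuple

    def evaluate(self, results):
        """Map each metric to True/False; a metric with no result counts as unmet."""
        return {
            c.metric: c.metric in results and c.is_met(results[c.metric])
            for c in self.criteria
        }
```

Freezing the dataclasses means the thresholds cannot be quietly adjusted on an existing plan once results arrive; changing a criterion requires constructing a new plan, which leaves a visible trace.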
Check for and warn about:
If .papermill/state.md exists, update it (Edit tool) by adding to the experiments list. If it does not exist, skip registration and suggest running /papermill:init to persist the experiment.
experiments:
  - name: "descriptive-name"
    type: "simulation | benchmark | case-study | ablation"
    hypothesis: "Expected outcome in one sentence"
    status: "planned | running | completed | failed"
    script: "path/to/script.R"
    last_run: null
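Before an entry like the one above is written to the state file, it may help to check it against the template. The sketch below assumes that the `a | b | c` strings in the template mean "pick exactly one of these values" and that all seven keys are required; both are readings of the template, not documented papermill behavior, and `validate_experiment` is a hypothetical helper.

```python
# Assumption: the "a | b | c" strings in the template list the allowed values.
ALLOWED_TYPES = {"simulation", "benchmark", "case-study", "ablation"}
ALLOWED_STATUSES = {"planned", "running", "completed", "failed"}
# Assumption: every key shown in the template is required.
REQUIRED_FIELDS = ("name", "type", "hypothesis", "status", "script", "last_run")


def validate_experiment(entry):
    """Return a list of problems with an entry; an empty list means it passes."""
    problems = [f"missing field: {field}" for field in REQUIRED_FIELDS if field not in entry]
    if "type" in entry and entry["type"] not in ALLOWED_TYPES:
        problems.append(f"unknown type: {entry['type']}")
    if "status" in entry and entry["status"] not in ALLOWED_STATUSES:
        problems.append(f"unknown status: {entry['status']}")
    return problems
```

The function takes a plain dictionary, so it works regardless of how the state file is parsed; a newly planned experiment would carry `"status": "planned"` and `"last_run": None`.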
Append a timestamped note documenting the experiment design.
Based on the experiment type, suggest the most relevant next step:
- /papermill:simulation for detailed simulation design; it covers sample sizes, convergence diagnostics, and result presentation.
- /papermill:proof to verify the theory before running experiments.
- /papermill:review once the results are written up.