From iclr-skills
Audits ICLR experiments for scientific rigor: baselines, ablations, scaling laws, robustness, statistics, benchmarks, human evaluation, and compute reporting. Helps pre-answer reviewer questions and isolate mechanisms with compute-matched controls.
Slash command
/iclr-skills:iclr-experiments
Use this before submission or during a revision pass to stress-test empirical claims. ICLR experiments should answer the scientific question, not merely assemble a leaderboard.
ICLR's empirical culture prizes honest ablations and mechanism over leaderboard position. A clean ablation that explains why a representation works often outscores a larger raw number.
| Claim type | Evidence that convinces ICLR reviewers | Common reject trigger |
|---|---|---|
| New objective helps | Ablate the objective with everything else fixed | Gains confounded with extra tuning |
| Method scales | Several model sizes/tasks with a trend | One large run, no scaling curve |
| Robust representation | Tests across shifts, seeds, prompts | Single-seed peak on one benchmark |
| Beats prior method | Tuned, current, open-source baseline | Stale or under-tuned baseline |
A paper claims that a new self-supervised pretext task yields better linear-probe accuracy. Reviewers ask whether the gain comes from the pretext task or simply from longer pretraining. The author's audit: hold total pretraining compute fixed, swap only the pretext objective, and report linear-probe accuracy with error bars over five seeds. The compute-matched ablation isolates the mechanism and is small enough to post inline during discussion, where the table becomes part of the permanent public record.
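The five-seed comparison above can be summarized with a few lines of standard-library Python. This is a minimal sketch, not a prescribed analysis: the function names `summarize` and `paired_difference` are hypothetical, the accuracy values are placeholders rather than results from any paper, and it assumes both arms were run with the same seeds under the same compute budget so that per-seed differences are meaningful.

```python
import math
import statistics


def summarize(values):
    """Return (mean, sample std, standard error) for a list of per-seed values."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    sem = std / math.sqrt(len(values))
    return mean, std, sem


def paired_difference(treatment, baseline):
    """Summarize per-seed differences; index i must be the same seed in both arms."""
    if len(treatment) != len(baseline):
        raise ValueError("both arms must be run with the same set of seeds")
    diffs = [t - b for t, b in zip(treatment, baseline)]
    return summarize(diffs)


# Placeholder linear-probe accuracies (%), one entry per seed. Illustrative only.
baseline_acc = [70.0, 71.0, 69.0, 70.0, 70.0]   # original pretext objective
new_task_acc = [72.0, 72.0, 71.0, 73.0, 72.0]   # new objective, same compute

for name, accs in [("baseline", baseline_acc), ("new task", new_task_acc)]:
    mean, std, _ = summarize(accs)
    print(f"{name}: {mean:.2f} +/- {std:.2f} (n={len(accs)})")

diff_mean, _, diff_sem = paired_difference(new_task_acc, baseline_acc)
print(f"paired gain: {diff_mean:.2f} (standard error {diff_sem:.2f})")
```

Reporting the paired gain alongside the per-arm means makes it easier for a reviewer to see whether the improvement is larger than seed-to-seed variation; with only five seeds, the standard error is a rough guide rather than a formal test.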
[Claim] <paper claim>
[Experiment evidence] sufficient / needs baseline / needs ablation / needs robustness
[Fairness issue] <compute, tuning, data, prompt, metric>
[Fast fix] <experiment or analysis feasible before deadline>
[Appendix placement] <what can move out of main text>
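If audit entries are kept in a script or notebook, the template above maps naturally onto a small record type. The sketch below is one possible encoding, not part of the skill itself: the `AuditEntry` class, its field names, and the `render` method are hypothetical, and the example values restate the pretext-task scenario for illustration.

```python
from dataclasses import dataclass

# The four verdicts allowed by the template's "Experiment evidence" line.
VERDICTS = ("sufficient", "needs baseline", "needs ablation", "needs robustness")


@dataclass
class AuditEntry:
    """One audited claim, mirroring the five template fields (hypothetical type)."""
    claim: str
    evidence: str
    fairness_issue: str
    fast_fix: str
    appendix_placement: str

    def __post_init__(self):
        # Reject verdicts that are not in the template.
        if self.evidence not in VERDICTS:
            raise ValueError(f"evidence must be one of {VERDICTS}")

    def render(self) -> str:
        """Format the entry in the same bracketed layout as the template."""
        return "\n".join([
            f"[Claim] {self.claim}",
            f"[Experiment evidence] {self.evidence}",
            f"[Fairness issue] {self.fairness_issue}",
            f"[Fast fix] {self.fast_fix}",
            f"[Appendix placement] {self.appendix_placement}",
        ])


# Illustrative entry based on the pretext-task example; values are assumptions.
entry = AuditEntry(
    claim="New pretext task improves linear-probe accuracy",
    evidence="needs ablation",
    fairness_issue="compute",
    fast_fix="Compute-matched ablation swapping only the objective, five seeds",
    appendix_placement="Per-seed accuracy table",
)
print(entry.render())
```

Validating the verdict at construction time keeps a list of entries consistent with the template, so missing baselines or ablations can be filtered programmatically before the deadline.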
npx claudepluginhub brycewang-stanford/awesome-journal-skills --plugin iclr-skills
Strengthens reproducibility for ICLR papers: maps claims to seeds, splits, commands, and compute; writes reproducibility statements and addresses reviewer concerns about verifiability.
Audits ML experiments for ICML submission/rebuttal: baselines, ablations, variance, data leakage, compute disclosure, reproducibility, negative results.
Audits IJCAI/IJCAI-ECAI experiments for baselines, ablations, statistical evidence, hyperparameters, compute, dataset handling, ethics, and reproducibility.