From aistats-skills
Designs and audits AISTATS experiments: simulations, baselines, statistical tests, uncertainty estimates, ablations, and theory-validation checks with claim-to-evidence mapping.
Slash command
/aistats-skills:aistats-experiments
Use this before submission when the empirical or simulation story is not yet locked.
| Theoretical claim | Matching experiment | Reject pattern avoided |
|---|---|---|
| Convergence rate in n | Log-log error versus n with fitted slope | "Rates asserted but never plotted" |
| Confidence-interval coverage | Empirical coverage across many replications | "Nominal 95 percent never verified" |
| Regret bound | Cumulative regret versus horizon, with the bound curve overlaid | "Bound and trajectory never compared" |
| Robustness to misspecification | Violation-severity sweep | "Guarantees hold under assumptions the experiments quietly break" |
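As an illustration of the first row, here is a minimal sketch of a log-log slope check. It assumes a toy estimator (the sample mean of standard normal draws, whose RMSE should decay like n^(-1/2)). The function names, sample sizes, and replication count are hypothetical choices for illustration, not part of the skill.

```python
import math
import random
import statistics

def fitted_loglog_slope(ns, errors):
    """Ordinary least-squares slope of log(error) regressed on log(n)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errors]
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def rmse_of_sample_mean(n, reps, rng):
    """Monte Carlo RMSE of the sample mean of n standard normal draws (true mean 0)."""
    sq_errors = []
    for _ in range(reps):
        draws = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sq_errors.append(statistics.fmean(draws) ** 2)
    return math.sqrt(statistics.fmean(sq_errors))

# Toy setting (an assumption for illustration): theory predicts a slope of -0.5.
rng = random.Random(0)
ns = [50, 100, 200, 400, 800]
errors = [rmse_of_sample_mean(n, reps=400, rng=rng) for n in ns]
slope = fitted_loglog_slope(ns, errors)
print(f"fitted slope: {slope:.3f} (theory predicts -0.5)")
```

In a real submission the toy estimator would be replaced by the paper's method, and the fitted slope would be reported next to the theoretical rate, ideally with an interval across seeds.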
Suppose the paper proves finite-sample type-I error control under a boundedness assumption. A matching plan: simulate under the null at several sample sizes to verify size, sweep dependence strength to trace power curves, then inject heavy-tailed noise that violates boundedness to map how error control degrades. Every panel should tie to a numbered theorem or remark.
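The size-verification and heavy-tail steps of such a plan could be sketched as follows. This is an illustrative stand-in, not the paper's test: it assumes a two-sided Hoeffding-style test of mean zero for data in [-1, 1], and it breaks boundedness by replacing a fraction of draws with standard Cauchy noise, which keeps the distribution symmetric about 0. All function names and parameter values are hypothetical.

```python
import math
import random

def hoeffding_rejects(sample, alpha):
    """Two-sided test of mean zero; level alpha is guaranteed only if values lie in [-1, 1]."""
    n = len(sample)
    threshold = math.sqrt(2.0 * math.log(2.0 / alpha) / n)
    return abs(sum(sample) / n) > threshold

def draw(n, contamination, rng):
    """Uniform[-1, 1] draws; each is swapped for a standard Cauchy draw with
    probability `contamination`, which violates the boundedness assumption."""
    out = []
    for _ in range(n):
        if rng.random() < contamination:
            # Inverse-CDF sampling of a standard Cauchy variate.
            out.append(math.tan(math.pi * (rng.random() - 0.5)))
        else:
            out.append(rng.uniform(-1.0, 1.0))
    return out

def rejection_rate(n, contamination, alpha, reps, rng):
    """Fraction of replications in which the test rejects."""
    rejections = sum(
        hoeffding_rejects(draw(n, contamination, rng), alpha) for _ in range(reps)
    )
    return rejections / reps

rng = random.Random(1)
alpha = 0.05
for n in (50, 200):
    for contamination in (0.0, 0.05, 0.2):
        rate = rejection_rate(n, contamination, alpha, reps=500, rng=rng)
        print(f"n={n:4d} contamination={contamination:.2f} rejection rate={rate:.3f}")
```

With no contamination the rejection rate should stay at or below alpha (Hoeffding is conservative, so it will typically sit well below). As contamination grows, the rate is expected to climb above alpha, which is the degradation the final panel is meant to map.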
[Experiment readiness] strong / adequate / weak
[Claim -> evidence map] <claim: table/figure/simulation>
[Missing statistical evidence] <uncertainty/test/seed/baseline>
[Reproducibility gaps] <hyperparameters/compute/data/code>
[Decision-critical next run] <one experiment or simulation>
npx claudepluginhub brycewang-stanford/awesome-journal-skills --plugin aistats-skills
Strengthens AISTATS reproducibility evidence by mapping claims to verifiable locations, auditing checklists, and ensuring turnkey simulation scripts.
Audits IJCAI/IJCAI-ECAI experiments for baselines, ablations, statistical evidence, hyperparameters, compute, dataset handling, ethics, and reproducibility.
Audits ML experiments for ICML submission/rebuttal: baselines, ablations, variance, data leakage, compute disclosure, reproducibility, negative results.