From mlx
Autonomous time-budget experiment loop. Modify a training script, train for a fixed wall-clock budget, evaluate, record, repeat. Inspired by karpathy/autoresearch. Use for overnight architecture search, systematic hyperparameter sweeps, or any iterative model improvement workflow.
npx claudepluginhub damionrashford/mlx --plugin mlx
Slash command: /mlx:autoexperiment
Model: opus
Run autonomous time-budget experiment loops. Each iteration modifies train.py,
trains for a fixed wall-clock budget, evaluates, records in results.tsv, and repeats.
Before you start:
- Make sure results.tsv exists with a baseline (exp000) before iterating.
- Create EXPERIMENT.md with your goal, baseline, hypothesis, and constraints.
- Invoke the skill: /mlx:autoexperiment path/to/train.py

Each iteration:
1. Read EXPERIMENT.md for the current hypothesis.
2. Read results.tsv for experiment history.
3. Edit train.py with the single change.
4. Run: timeout $BUDGET uv run train.py
5. Record the outcome in results.tsv: KEEP / DISCARD / CRASH.
6. Update the EXPERIMENT.md "Next to try" section.

See references/EXPERIMENT.md.template for the hypothesis file format.
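The run-and-record part of an iteration could be sketched roughly as follows. This is an illustration under assumptions, not the skill's own code: the column names, the helper names (`run_with_budget`, `classify`, `append_result`), the rule that a lower metric is better, and the choice to treat a timeout as a crash are all hypothetical.

```python
import csv
import subprocess
from pathlib import Path

# Hypothetical column layout; the real results.tsv may use different fields.
COLUMNS = ["exp_id", "metric", "status", "note"]


def run_with_budget(cmd, budget_seconds):
    """Run cmd under a hard wall-clock cap; return its exit code, or None on timeout."""
    try:
        return subprocess.run(cmd, timeout=budget_seconds).returncode
    except subprocess.TimeoutExpired:
        return None


def classify(returncode, metric, best_metric):
    """Map a finished run to KEEP / DISCARD / CRASH (assumes lower metric is better)."""
    if returncode != 0 or metric is None:
        return "CRASH"  # non-zero exit, timeout (None), or no metric reported
    return "KEEP" if metric < best_metric else "DISCARD"


def append_result(path, exp_id, metric, status, note=""):
    """Append one row to the TSV log, writing the header if the file is new."""
    path = Path(path)
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        if is_new:
            writer.writerow(COLUMNS)
        metric_text = "" if metric is None else f"{metric:.4f}"
        writer.writerow([exp_id, metric_text, status, note])
```

Appending rather than rewriting keeps the experiment history intact across iterations, so a later run can read the full log before choosing the next change.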
See scripts/time_budget_train.py for a complete training script template with all patterns.
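For orientation, the core idea of a time-budget loop can be sketched in a few lines. This is not the contents of scripts/time_budget_train.py; `train_for_budget` and the `step_fn` callback are hypothetical names, and the sketch assumes one call to `step_fn` performs one optimizer step and returns the loss.

```python
import time


def train_for_budget(step_fn, budget_seconds, clock=time.monotonic):
    """Call step_fn(step) repeatedly until the wall-clock budget is spent.

    Returns (steps_completed, last_loss). step_fn is a hypothetical callback
    that runs one training step and returns the loss as a float.
    """
    start = clock()
    steps, last_loss = 0, None
    # The budget is checked before each step, so the final step may overrun slightly.
    while clock() - start < budget_seconds:
        last_loss = step_fn(steps)
        steps += 1
    return steps, last_loss
```

Because every run gets the same wall-clock budget rather than the same step count, a change that makes each step slower has to earn its keep within the same time. The `clock` parameter is only there so the loop can be tested with a fake clock.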
Bits per byte: total_nats / (math.log(2) * total_bytes), a vocab-independent metric.
Abort diverged runs early: if math.isnan(loss) or loss > 100: sys.exit(1)
See references/autoexperiment-guide.md for full documentation.
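The metric and the divergence check above can be wrapped as small helpers. The function names and the `ceiling` parameter are illustrative assumptions; only the formula and the NaN / `loss > 100` condition come from the text.

```python
import math
import sys


def bits_per_byte(total_nats, total_bytes):
    """Convert summed cross-entropy (in nats) to bits per byte of raw text."""
    return total_nats / (math.log(2) * total_bytes)


def is_diverged(loss, ceiling=100.0):
    """True when the loss is NaN or exceeds the ceiling (100 in the check above)."""
    return math.isnan(loss) or loss > ceiling


def abort_if_diverged(loss):
    """Exit with status 1 so a diverged run stops instead of using up its budget."""
    if is_diverged(loss):
        sys.exit(1)
```

Dividing by bytes rather than tokens is what makes the number comparable across tokenizers: a model with a larger vocabulary emits fewer tokens for the same text, but the byte count of that text does not change.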