From sage
Autonomously optimizes numeric metrics such as bundle size, test coverage, or query time through repeated code modifications, commits, and verify commands. Intended for tasks with deterministic, parseable output.
npx claudepluginhub xoai/sage

This skill uses the workspace's default tool permissions.
Files bundled with the skill:

- examples/bundle-size/README.md
- examples/bundle-size/autoresearch.sh
- examples/bundle-size/brief.md
- examples/prose-readability/README.md
- examples/prose-readability/autoresearch.sh
- examples/prose-readability/brief.md
- examples/test-coverage/README.md
- examples/test-coverage/autoresearch.sh
- examples/test-coverage/brief.md
- references/crash-handling.md
- references/harness-conventions.md
- references/loop-protocol.md
- references/metric-design.md
- references/session-continuity.md
- references/stuck-recovery.md
Autonomous iteration toward a measurable outcome. The agent modifies code, commits, runs a verify command, keeps improvements, reverts regressions — repeating until a target is hit, a budget is exhausted, or the user interrupts.
Core principles (from Karpathy's autoresearch pattern):
Before the loop can start, capture these (skip if already provided):
| Field | Required | Example |
|---|---|---|
| Goal | Yes | "Reduce bundle below 200KB" |
| Metric name | Yes | bundle_kb |
| Direction | Yes | lower or higher |
| Target | Optional | 200 |
| Verify command | Yes | pnpm build && measure.sh |
| Writable scope | Recommended | src/**/*.ts |
| Frozen scope | Recommended | package.json, *.lock |
| Per-run budget | Yes (default 120s) | 120 seconds |
| Max iterations | Optional | 100 |
| Termination | Auto | target if target given, else interrupt |
Present as a brief for user approval:
Sage: Autoresearch session configured.
Goal: [goal statement]
Metric: [name] ([direction]), target: [target or "none — runs until interrupted"]
Verify: [command]
Scope: writable [globs], frozen [globs]
Budget: [seconds]s per run, [max iterations or "unlimited"]
[A] Start — begin autonomous iteration
[R] Revise — change configuration
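The configuration fields and the approval brief above can be modelled as a small data structure. The following is a minimal sketch, not the runtime's actual schema: the `Brief` class, its field names, and the `render` method are hypothetical names chosen for illustration, and only the defaults stated in the table (120 seconds per run, termination derived from the target) are taken from the text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Brief:
    """Hypothetical container for the session configuration (illustrative only)."""
    goal: str
    metric: str
    direction: str                       # "lower" or "higher"
    verify: str
    target: Optional[float] = None
    writable: list[str] = field(default_factory=list)
    frozen: list[str] = field(default_factory=list)
    per_run_seconds: int = 120           # default per-run budget from the table
    max_iterations: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject values the table does not allow.
        if self.direction not in ("lower", "higher"):
            raise ValueError(f"direction must be 'lower' or 'higher', got {self.direction!r}")
        if self.per_run_seconds <= 0:
            raise ValueError("per_run_seconds must be positive")

    @property
    def termination(self) -> str:
        # "target if target given, else interrupt"
        return "target" if self.target is not None else "interrupt"

    def render(self) -> str:
        """Render the body of the brief shown to the user for approval."""
        target = "none" if self.target is None else f"{self.target:g}"
        cap = "unlimited" if self.max_iterations is None else str(self.max_iterations)
        return "\n".join([
            f"Goal: {self.goal}",
            f"Metric: {self.metric} ({self.direction}), target: {target}",
            f"Verify: {self.verify}",
            f"Scope: writable {', '.join(self.writable)}, frozen {', '.join(self.frozen)}",
            f"Budget: {self.per_run_seconds}s per run, {cap}",
        ])
```

Validating the direction and budget at construction time means a malformed brief fails before any iteration starts, rather than partway through a run.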
Each iteration follows 8 phases. Read references/loop-protocol.md
for per-phase detail.
| # | Phase | Actor | What happens |
|---|---|---|---|
| 1 | REVIEW | agent | Read current state, recent history (last 20 iterations from JSONL) |
| 2 | IDEATE | agent | Propose ONE change, ≤1 sentence. If stuck, load references/stuck-recovery.md |
| 3 | MODIFY | agent | Make the change. Stay within writable scope. |
| 4 | COMMIT | runtime | git add -A && git commit on autoresearch/<slug> branch |
| 5 | VERIFY | runtime | Run verify command with wall-clock budget |
| 6 | DECIDE | runtime | Parse METRIC, compare to best → keep / discard / crash |
| 7 | LOG | runtime+agent | Append JSONL, rebuild TSV, agent updates living doc |
| 8 | REPEAT | runtime | Check termination → loop or exit |
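Phase 8 decides whether the loop continues. A minimal sketch of that check follows, assuming that a target counts as reached when the best metric is equal to or past it in the configured direction; the function name `should_stop` and the returned reason strings are hypothetical, and user interruption is left out because it arrives from outside the loop.

```python
from typing import Optional


def should_stop(
    best: Optional[float],
    direction: str,
    target: Optional[float],
    iteration: int,
    max_iterations: Optional[int],
) -> Optional[str]:
    """Return a reason to exit the loop, or None to keep iterating."""
    if target is not None and best is not None:
        # Assumption: reaching the target exactly counts as success.
        reached = best <= target if direction == "lower" else best >= target
        if reached:
            return "target_reached"
    if max_iterations is not None and iteration >= max_iterations:
        return "max_iterations"
    return None
```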
Decision rules (Phase 6):
| Outcome | Action |
|---|---|
| crash | Reset to HEAD |
| keep | Advance the branch |
| discard | Reset |

The Python runtime at core/autoresearch/ handles deterministic phases
(COMMIT, VERIFY, DECIDE, LOG, REPEAT). The agent handles creative
phases (REVIEW, IDEATE, MODIFY).
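The keep / discard / crash outcome of Phase 6 can be sketched as a pure function. This is an illustration under stated assumptions rather than the runtime's code: `decide` is a hypothetical name, a missing metric is treated as a crash, the first successful measurement is kept as the baseline, and a tie with the current best is discarded.

```python
from typing import Optional


def decide(metric: Optional[float], best: Optional[float], direction: str) -> str:
    """Classify one iteration as 'keep', 'discard', or 'crash'."""
    if direction not in ("lower", "higher"):
        raise ValueError(f"unknown direction: {direction!r}")
    if metric is None:
        # Assumption: no parseable metric means the run is unusable.
        return "crash"
    if best is None:
        # Assumption: the first measurement becomes the baseline.
        return "keep"
    improved = metric < best if direction == "lower" else metric > best
    # Assumption: a tie is not an improvement.
    return "keep" if improved else "discard"
```

Keeping the decision free of side effects makes it easy to test separately from the git operations that act on the result.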
Running the runtime:
python -m core.autoresearch run --brief .sage/work/<slug>/brief.md --project .
Harness contract: The verify command must print METRIC name=number
to stdout. See references/harness-conventions.md.
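As an illustration of the harness contract, the sketch below runs a verify command under a wall-clock budget and extracts `METRIC name=number` lines from its stdout. The exact grammar lives in references/harness-conventions.md; here the metric name is assumed to be identifier-like, the last occurrence of a name is assumed to win, and `parse_metrics`, `run_verify`, and `VerifyResult` are hypothetical names.

```python
import re
import subprocess
from dataclasses import dataclass, field
from typing import Optional

# Assumed shape of a metric line: "METRIC <name>=<number>".
METRIC_RE = re.compile(
    r"^METRIC\s+([A-Za-z_][\w.-]*)=([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)$"
)


def parse_metrics(stdout: str) -> dict[str, float]:
    """Collect every METRIC line; a later line overrides an earlier one."""
    metrics: dict[str, float] = {}
    for line in stdout.splitlines():
        match = METRIC_RE.match(line.strip())
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics


@dataclass
class VerifyResult:
    timed_out: bool
    exit_code: Optional[int]
    metrics: dict[str, float] = field(default_factory=dict)


def run_verify(command: str, budget_seconds: float) -> VerifyResult:
    """Run the verify command through the shell with a wall-clock limit."""
    try:
        proc = subprocess.run(
            command,
            shell=True,             # verify commands such as "a && b" need a shell
            capture_output=True,
            text=True,
            timeout=budget_seconds,
        )
    except subprocess.TimeoutExpired:
        return VerifyResult(timed_out=True, exit_code=None)
    return VerifyResult(False, proc.returncode, parse_metrics(proc.stdout))
```

A harness that prints `METRIC bundle_kb=187.4` among its build output would yield `{"bundle_kb": 187.4}`; output without any matching line yields an empty dictionary, which the caller can treat as a failed run.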
All state lives in .sage/work/<YYYYMMDD-slug>/:
| File | Role |
|---|---|
| brief.md | Configuration (goal, metric, scope, budget) |
| autoresearch.md | Living doc — ideas tried, wins, dead ends |
| autoresearch.jsonl | Structured log (one line per iteration) |
| results.tsv | Human-readable view (derived from JSONL) |
| runs/NNNN-*.log | Per-iteration stdout+stderr |
| .autoresearch-state.json | Crash recovery state (not committed) |
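Because results.tsv is derived from autoresearch.jsonl, the LOG phase can be pictured as an append followed by a full rebuild. The sketch below shows that pattern only; the record keys (`iteration`, `idea`, `metric`, `decision`) and the function names are hypothetical, since the real log schema is not given here.

```python
import csv
import json
from pathlib import Path

# Hypothetical column set; the runtime's actual schema may differ.
COLUMNS = ["iteration", "idea", "metric", "decision"]


def append_iteration(jsonl_path: Path, record: dict) -> None:
    """Append one iteration as a single JSON line."""
    with jsonl_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


def rebuild_tsv(jsonl_path: Path, tsv_path: Path, columns: list[str]) -> int:
    """Regenerate the TSV view from the JSONL log; returns the row count."""
    text = jsonl_path.read_text(encoding="utf-8")
    rows = [json.loads(line) for line in text.splitlines() if line.strip()]
    with tsv_path.open("w", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(columns)
        for row in rows:
            # Missing keys become empty cells rather than errors.
            writer.writerow([row.get(col, "") for col in columns])
    return len(rows)
```

Rebuilding the TSV from scratch each time keeps the JSONL file as the single source of truth, so a hand-edited or stale TSV cannot drift from the log.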
On resume (new session, context reset, platform switch):

- autoresearch.md for high-level context
- autoresearch.jsonl for recent history
- git log on the branch

See references/session-continuity.md for full protocol.
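Recovering recent history from the JSONL log, as used on resume and in the REVIEW phase (last 20 iterations), might look like the sketch below. It assumes one JSON object per line and skips lines that fail to parse, on the assumption that a crash can leave a partially written final line; `load_recent` is a hypothetical helper name.

```python
import json
from collections import deque
from pathlib import Path


def load_recent(jsonl_path: Path, limit: int = 20) -> list[dict]:
    """Return up to `limit` of the most recent iteration records, oldest first."""
    if not jsonl_path.exists():
        return []
    recent: deque = deque(maxlen=limit)   # older entries fall off automatically
    with jsonl_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                recent.append(json.loads(line))
            except json.JSONDecodeError:
                # Assumption: tolerate a truncated line left by a crashed run.
                continue
    return list(recent)
```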
Session end: Store a structured summary in sage-memory.
Session start: Search sage-memory for priors on this repo + metric. Inject into IDEATE as "known-good starting points" and "known dead ends."
| Gate | When | Check |
|---|---|---|
| scope | After MODIFY | Changed files ⊆ writable, frozen untouched |
| pre-verify | After COMMIT | git status is clean |
| metric-parseable | After VERIFY | At least one METRIC line in stdout |
| budget | During VERIFY | Wall-clock ≤ per_run_seconds |
Gates are enforced by the runtime, not by prose. The agent cannot bypass them.
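To make the scope gate concrete, the sketch below checks a list of changed paths against writable and frozen globs. It is an illustration, not the runtime's implementation: the glob semantics (`**/` spanning zero or more directories, `*` staying within one path segment) are an assumption, and `glob_to_regex` and `check_scope` are hypothetical names.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob to a regex; assumed semantics, see the note above."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")      # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")         # stays inside one path segment
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def check_scope(changed: list[str], writable: list[str], frozen: list[str]) -> list[tuple[str, str]]:
    """Return (path, reason) pairs for every file that violates the scope."""
    writable_rx = [glob_to_regex(p) for p in writable]
    frozen_rx = [glob_to_regex(p) for p in frozen]
    violations = []
    for path in changed:
        if any(rx.match(path) for rx in frozen_rx):
            violations.append((path, "frozen"))
        elif not any(rx.match(path) for rx in writable_rx):
            violations.append((path, "outside writable scope"))
    return violations
```

An empty result means the gate passes. Checking frozen patterns first means a file that matches both lists is still rejected.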
- references/loop-protocol.md: per-phase inputs, outputs, failure modes
- references/metric-design.md: what makes a good metric
- references/harness-conventions.md: METRIC line contract
- references/stuck-recovery.md: escape local minima
- references/crash-handling.md: retry vs skip decision tree
- references/session-continuity.md: resume protocol