By morphaxl
Define a goal with verifiable rubrics once. ultragoal then works autonomously in a loop until every rubric passes, reports the results, and saves lessons learned into project memory, with independent verification for sign-off.
Lint and compact the ultragoal project memory — find contradictions and stale claims, merge duplicates, refresh the index. Run every ~10 sessions (the session banner nudges when due).
Turn a messy brain dump (raw voice transcript welcome) into a verifiable goal with a checkable rubric, then work autonomously until it is verified and lessons are saved. Use for substantial end-to-end work — build, fix, migrate, investigate — AND for follow-up rounds on a finished or paused goal ("next round", "improve on this"); re-invoke it rather than arming from memory of a previous round.
Distill lessons from the current session or a finished goal into the project memory (.ultragoal/memory/). Used directly, as the final step of a goal loop, and immediately whenever the user corrects you.
Initialize or reconfigure ultragoal in this project — scaffold .ultragoal/, choose preference knobs, wire the CLAUDE.md block. Runs automatically on first /ultragoal:goal.
Show the current ultragoal state — active goal, rubric progress, budget, memory health.
Uses power tools
Uses Bash, Write, or Edit tools
Tell Claude what you want once. It works until the job is verifiably done — and it gets smarter every time.
In a long agent session you do the real work around the typing: you read each output, catch the false claims, remind it what it forgot, and decide when it's done. The model types — you are the quality control, the memory, and the off switch. That's fine for five minutes; it collapses at five hours. Not because the model isn't capable — Fable 5 one-shots most single tasks — but because you don't scale. You can't review two hundred actions an hour or stay awake overnight, and the moment you stop watching, the system has zero verification left. A days-capable model run this way is capped at the speed of your attention.
Loop engineering is the fix. Instead of steering prompt by prompt, you design a small system around the model — a goal, a check, a memory, a stopping rule — and the system does the steering. You build the loop once; the loop does the prompting. The model is the easy part now; writing "done" in a form a command can check is the skill. (The full argument, with the research behind it: docs/loop-engineering.md.)
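The four pieces named above (a goal, a check, a memory, a stopping rule) can be sketched in a few lines. This is a minimal illustration, not ultragoal's actual implementation: `RubricItem`, `Loop`, and `work_one_turn` are hypothetical names, and the sketch assumes "checkable by a command" means "the command exits with status 0".

```python
import subprocess
import sys
from dataclasses import dataclass, field


@dataclass
class RubricItem:
    """One line of "done", written as a command that exits 0 on success."""
    description: str
    command: list[str]

    def passes(self) -> bool:
        # The check is the command's exit status, not anyone's claim about it.
        return subprocess.run(self.command, capture_output=True).returncode == 0


@dataclass
class Loop:
    """Hypothetical loop: rubric = the check, max_turns = the stopping rule."""
    rubric: list[RubricItem]
    max_turns: int
    memory: list[str] = field(default_factory=list)  # lessons carried forward

    def failing(self) -> list[RubricItem]:
        return [item for item in self.rubric if not item.passes()]

    def run(self, work_one_turn) -> bool:
        """Work until every rubric item passes or the turn budget runs out."""
        for turn in range(1, self.max_turns + 1):
            failing = self.failing()
            if not failing:
                return True
            # work_one_turn stands in for one model turn (an assumption here);
            # it may return a lesson string worth remembering.
            lesson = work_one_turn(failing, self.memory)
            if lesson:
                self.memory.append(f"turn {turn}: {lesson}")
        return not self.failing()
```

The design point is that `run` never asks the worker whether it is finished. It re-runs the commands, so "done" is whatever the rubric says it is, and the budget guarantees the loop stops either way.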
ultragoal is that loop, packaged — so a system, not you, does the verifying, remembering, and stopping. You ramble (a messy voice note is fine); it interviews you on the few forks that actually change the outcome, compiles a rubric where every line is checkable by a command, and arms a loop you can walk away from — turn after turn, session after session, until an independent verifier confirms the work holds and the lessons are written down. It's the goal-loop architecture Anthropic's engineers describe using with Fable 5 (the same one Claude Code ships natively as /goal), with everything the published workflow still assumes an expert wires by hand — a checkable rubric, a fresh-eyes verifier, a memory discipline, a goal that survives the session — built into the harness. That's how you actually leverage a model built to run for days: the structure holds the standard, so the model's full range isn't bounded by how long you can watch it. Goals on steroids.
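To make the "independent verifier" idea concrete, here is a rough sketch of what fresh-eyes checking could look like: the verifier ignores what the worker reported and re-runs each rubric command itself. The function name `verify_claims`, the verdict strings, and the rubric lines are all hypothetical. ultragoal's real verifier is a subagent and is not shown in this page.

```python
import subprocess
import sys


def verify_claims(claims: dict[str, bool], checks: dict[str, list[str]]) -> dict[str, str]:
    """Re-run each check and compare the result with what the worker claimed.

    claims: rubric line -> what the worker reported (True means "passes").
    checks: rubric line -> command that exits 0 when the line really passes.
    """
    verdicts = {}
    for line, command in checks.items():
        actual = subprocess.run(command, capture_output=True).returncode == 0
        claimed = claims.get(line, False)
        if actual:
            verdicts[line] = "verified"
        elif claimed:
            verdicts[line] = "false claim"  # reported as passing, but the command fails
        else:
            verdicts[line] = "not done"
    return verdicts


# Stand-in commands: one that succeeds, one that fails.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
bad = [sys.executable, "-c", "raise SystemExit(1)"]

report = verify_claims(
    claims={"tests pass": True, "docs build": True},
    checks={"tests pass": ok, "docs build": bad},
)
print(report)
```

Sign-off in this sketch would require every verdict to be "verified". A single "false claim" sends the work back into the loop.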
Every mechanism in the loop is research-backed — verifier design, evidence ledgers, rubric architecture, memory provenance all trace to published results from Anthropic, DeepSeek, Alibaba, ByteDance, Tencent, and academic agent-systems work. The full mechanism→evidence map lives in docs/research-foundations.md, fed by dated research sweeps in docs/research/.
BRIEF ──► GOAL ──► LOOP ──► VERIFY ──► DISTILL
  │        │        │         │           │
ramble    spec     work   fresh-eyes   memory
(voice)  +rubric   turns  subagent     grows
  ▲                                       │
  └──── consult ◄─────────────────────────┘   next session starts smarter
Four parts keep each other honest:
/clear, restarts, and days away. Goals are per-session: run different goals in different sessions of the same repo at once, each gated independently. Same architecture as Claude Code's built-in /goal, with upgrades (see how the loop works).
npx claudepluginhub morphaxl/ultragoal --plugin ultragoal