Help us improve
Share bugs, ideas, or general feedback.
From goalkeeper
Executes durable, contract-driven goals with checkpoint validation and judge-gated completion. Useful for multi-turn autonomous tasks that need structured progress tracking.
npx claudepluginhub itsuzef/goalkeeper --plugin goalkeeperHow this skill is triggered — by the user, by Claude, or both
Slash command
/goalkeeper:goalThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
You are operating the **goalkeeper** skill — durable, contract-driven goal execution with judge-gated completion. This skill is invoked when the user runs `/goal` or `/goal "<objective>"`.
Runs a linear sequence of goals where a judge gates progression between steps. Invoked via /goal-chain or auto-advanced by judge approval.
Defines goals with measurable success criteria and runs an autonomous convergence loop until goals are met or limits reached.
Manages and maintains AgentOps fitness goals using the `ao goals` CLI for measurement, drift, validation, migration, and reporting.
Share bugs, ideas, or general feedback.
You are operating the goalkeeper skill — durable, contract-driven goal execution with judge-gated completion. This skill is invoked when the user runs /goal or /goal "<objective>".
Goalkeeper goals run in one of two execution modes depending on how the goal was activated:
/goal / /goal "<objective>")Main conversation context runs the full Execution Loop (do work → checkpoint → run validator → invoke judge → branch on verdict). Use ScheduleWakeup to pace iterations across long-running validators. This is the default when the user invokes /goal directly.
/goal-chain)When /goal-chain is the orchestrator, the chain spawns a fresh-context executor subagent per goal. Main context only orchestrates — it does NOT do the per-goal implementation work. The executor subagent reads contract + log, does the work end-to-end, runs validator, and returns a structured summary. Main context then spawns the judge subagent, applies the verdict, and either advances the cursor or re-spawns the executor with the judge's fix-list.
This is the load-bearing change in v0.3 that lets multi-goal chains run autonomously without main context aging out. Main context cost per goal drops from "all the work + judge spawn" (tens of thousands of tokens accumulating per goal) to "executor return + judge spawn" (~10K tokens per goal, flat). A 9-goal chain that previously needed 9 fresh sessions can now complete in one main-context session.
The protocol described in the Execution Loop section below applies in BOTH modes — the difference is who runs it:
Both modes use the same state.json / log.md / active.json files. The only mode-specific divergence is whether the judge is invoked by the loop (inline) or by the chain orchestrator after executor return (subagent). See goal-chain/SKILL.md Steps 6–8 for the subagent-mode orchestration details.
Single source of truth for the JSON files goalkeeper reads and writes. Other skills (goal-clear, goal-judge, goal-chain) MUST conform to these shapes — drift here is the source of cross-skill bugs.
.claude/goals/active.jsonTwo shapes only — active or terminal.
Active (a goal is currently running):
{
"slug": "<slug>",
"activated_at": "<ISO8601>",
"chain": "<chain-name>"
}
The chain field is OPTIONAL — present only when activation was driven by /goal-chain (start mode or advance mode). Standalone goals omit it.
Terminal (no active goal):
{
"slug": null,
"ended_at": "<ISO8601>",
"ended_reason": "done" | "cleared" | "chain_completed" | "aborted",
"previous_slug": "<slug>",
"previous_chain": "<chain-name>"
}
slug, ended_at, and ended_reason are REQUIRED on terminal. previous_slug SHOULD be set when the last activity was a single goal or chain link. previous_chain SHOULD be set when a chain just ended (chain_completed or aborted).
.claude/goals/<slug>/state.json{
"status": "active" | "paused" | "done" | "needs_human",
"rejection_count": <int>,
"started_at": "<ISO8601>",
"started_at_commit": "<git rev-parse HEAD or null>",
"started_at_dirty_paths": ["<paths from git status --porcelain at activation>"],
"chain_step": <int>,
"last_checkpoint_at": "<ISO8601 or null>",
"last_validator_result": "pass" | "fail: <reason>" | null,
"last_judge_verdict": "approve" | "reject" | null,
"approved_at": "<ISO8601>",
"paused_at": "<ISO8601>",
"resumed_at": "<ISO8601>",
"needs_human_at": "<ISO8601>",
"validator_baseline_result": "pass" | "fail" | "not_runnable" | null,
"validator_baseline_failing_paths": ["<path>"]
}
status, rejection_count, started_at, started_at_commit, started_at_dirty_paths are REQUIRED on activation. chain_step SHOULD be present when the goal is part of a chain (denormalized for log clarity; chain.json is the source of truth for cursor). Timestamp fields populate as the goal transitions: approved_at on judge approve, paused_at/resumed_at on /goal-pause//goal-resume, and needs_human_at when rejection_count reaches max_rejections and status flips to needs_human.
validator_baseline_result and validator_baseline_failing_paths are populated by /goal-prep if it ran the validator once at activation baseline (which prep already does to confirm the command is runnable). When set, the judge treats failing_paths as pre-existing — a goal-end validator failure on those same paths is not the goal's fault. When null (validator was not run at prep, or prep is skipped), the judge has no pre-existing baseline to subtract from and treats validator failures as goal-caused.
.claude/goals/chain.json{
"name": "<chain name>",
"slugs": ["<slug>", "..."],
"cursor": <int>,
"status": "active" | "done" | "aborted",
"started_at": "<ISO8601>",
"completed_at": "<ISO8601 or null>",
"source_file": "<absolute path>",
"link_approvals": [
{"slug": "<slug>", "approved_at": "<ISO8601>"}
]
}
link_approvals accumulates one entry per judge-approved link. Provides chain-level visibility independent of per-link state.json files.
.claude/mission.json (v0.2 — supervisor layer){
"name": "<from mission.md heading>",
"status": "active" | "done" | "escalated",
"started_at": "<ISO8601>",
"completed_at": "<ISO8601 or absent>",
"goals_completed": [
{"slug": "<slug>", "result": "approved" | "cleared", "rejection_count": <n>, "ended_at": "<ISO8601>"}
],
"supervisor_verdicts": [
{"at": "<ISO8601>", "prior_slug": "<slug>", "verdict": "proceed" | "done" | "escalate", "next_objective": "<if proceed>", "escalation": "<if escalate>"}
]
}
The mission layer sits ONE LEVEL ABOVE chains. Missions adaptively decide the next goal based on the prior goal's actual output, where chains commit to a linear sequence at chain-start. Companion files: mission.md (user-authored charter — required to invoke /goal-supervisor), mission-log.md (append-only mission-level audit trail), mission-completed.md (final snapshot written when supervisor verdict is done). See skills/goal-supervisor/SKILL.md for the full state machine.
args non-empty → set mode (start or resume a goal)args empty → status mode (report on active goal)Read .claude/goals/active.json. If missing or slug is null, output: No active goal. Run /goal-prep "<rough idea>" or /goal "<objective>" to start one. and stop.
Read .claude/goals/<slug>/contract.md, state.json, and the last 10 entries of log.md.
Print a compact status block:
Goal: <slug>
Objective: <one-line>
Status: <status> Rejections: <n>/<max>
Started: <ISO8601> Elapsed: <human readable>
Validator: <last_validator_result>
Judge: <last_judge_verdict>
Last log: <timestamp> — <one-line summary>
If status == needs_human, surface the latest judge fix-list verbatim and tell the user: fix the listed items, then run /goal-resume to continue or /goal-clear to abandon.
Derive slug: extract or generate a kebab-case slug from the objective (≤64 chars, lowercase, alphanumeric and hyphen only). If the user passed an explicit --slug=<value>, use that.
Contract resolution:
.claude/goals/<slug>/contract.md exists, treat the request as a resume: skip prep, jump to step 3./goal-prep with the same objective. Do not write a thin one-line contract — prep is mandatory because the contract IS the spec. After prep completes and writes the contract, return here.Activate:
.claude/goals/ exists. If creating it for the first time, also create .claude/goals/.gitignore containing * and !shared/ and !.gitignore so personal goals stay private but a shared/ subdir can be opt-in committed.git rev-parse --is-inside-work-tree succeeds), capture git rev-parse HEAD as the goal's diff origin. If the tree is dirty, also capture git status --porcelain as a snapshot of the dirty paths so the judge can distinguish goal work from pre-existing changes. If not in a git repo, set both fields to null..claude/goals/<slug>/state.json:
{
"status": "active",
"rejection_count": 0,
"started_at": "<ISO8601 now>",
"started_at_commit": "<git rev-parse HEAD or null>",
"started_at_dirty_paths": ["<paths from git status --porcelain at activation>"],
"last_checkpoint_at": null,
"last_validator_result": null,
"last_judge_verdict": null
}
.claude/goals/active.json per the canonical active shape. If the activation is driven by /goal-chain, include "chain": "<chain-name>"; otherwise omit the field.log.md:
## <ISO8601> — activated
Starting work on: <objective>
Baseline commit: <short SHA or "no-git">
Begin the execution loop (next section).
This block runs on activation AND on every ScheduleWakeup re-entry.
Read the active contract: .claude/goals/active.json → <slug> → contract.md and state.json.
If state.status != active, stop. Do not schedule another wakeup.
Read recent log entries to know where work left off.
Do real work on the objective for one checkpoint's worth of progress (size per checkpoint_cadence in the contract — e.g. ~5 file edits, ~20 minutes of effort, or one logical sub-task).
Checkpoint: append to log.md:
## <ISO8601> — checkpoint
<one short paragraph: what changed, files touched, decisions made, what's next>
Run validator: execute validator.command from frontmatter. Capture stdout+stderr. Honor timeout_seconds. Interpret per validator.success (default exit_zero). Update state.last_validator_result to one of pass, fail: <short reason>. Update state.last_checkpoint_at.
Branch on validator:
ScheduleWakeup with delay from wakeup_seconds (contract) or pick cache-aware default (see below). Wakeup prompt:
Continue active goalkeeper goal — read .claude/goals/active.json and proceed per the goal skill execution loop.
STATUS: validator_fail + a clear BLOCKERS field (per goal-chain/SKILL.md Step 6 directive). Main context handles the user-surface..claude/goals/<slug>/.judge-pending and immediately running the judge skill in this same turn). Judge will read contract + git diff + log and return approve/reject.STATUS: validator_pass + the structured summary fields. Main context spawns the judge and applies the verdict.Branch on judge verdict (inline mode only — in subagent mode the chain orchestrator handles this section per goal-chain/SKILL.md Step 8):
Judge writes verdict to state.last_judge_verdict and a fix-list to log.md if reject.
.claude/goals/chain.json exists and contains the active slug at the cursor: this should not happen in inline mode, but if it does, defer to the goal-chain skill advance flow.state.status=done, append ## <ISO8601> — done to log, null out active.json ({"slug": null}). Surface a one-paragraph completion summary to the user.## <ISO8601> — judge rejected.state.rejection_count += 1.rejection_count >= max_rejections (default 5): set state.status=needs_human, append ## <ISO8601> — paused (max rejections), stop. Do NOT schedule next wakeup. Surface to user.Continue active goalkeeper goal — judge rejected the last attempt. Read the most recent "judge rejected" block in .claude/goals/<slug>/log.md and address each fix-list item. Then proceed per the goal skill execution loop.
The Anthropic prompt cache has a 5-minute TTL. Pick delaySeconds deliberately:
wakeup_seconds from the contract verbatim if set.contract.md mid-run. If the spec is wrong, stop and ask the user to amend it explicitly via /goal-clear + new prep./goal-clear.