From goal-lock
Enforces a structured PLAN→DO→VERIFY→FINALIZE→OUTPUT loop to keep agents on task, detect false success claims, and prevent scope creep. Use when executing defined coding tasks with measurable completion criteria.
Slash command: `/goal-lock:goal-lock`
> Lock the goal. Run the loop. Ship clean.
Prevents agents from drifting off target, masquerading success, or creeping scope. Quality through enforced loops, not prompt obedience.
Core question: is DONE EVIDENCE verified by actual execution? What the agent says is done and what is actually done can differ. Closing that gap to zero is the purpose of this skill.
`/goal-lock`
`/goal-lock quick` (Quick mode)
- [A] GOAL Input Sheet: fill per task (goal definition)
- [B] Fixed Loop: same for every task (execution discipline)
Missing/contradictory input → STOP. Conflicts → PRIORITY. STOP RULES → halt.
| Mode | Condition | Input Sheet | Loop |
|---|---|---|---|
| Quick | 1 file, clear change, ≤10 lines | 3 fields (GOAL/DONE/SCOPE) | DO→VERIFY only |
| Full | Everything else | All 7 fields | B1~B5 full |
Use Quick mode when the user specifies `/goal-lock quick` or the change fits the Quick criteria. When unsure, use Full.
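The mode rule above can be sketched as a small decision function. This is only an illustration: the `Change` record and `select_mode` are hypothetical names, not part of goal-lock, and the skill itself leaves the judgment of a "clear change" to the agent.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical summary of a requested change (not a goal-lock type)."""
    files_touched: int
    lines_changed: int
    is_clear: bool  # True when the edit is unambiguous


def select_mode(change: Change, user_requested_quick: bool = False) -> str:
    """Return "quick" or "full" following the mode table; fall back to full."""
    if user_requested_quick:
        return "quick"
    fits_quick = (
        change.files_touched == 1
        and change.is_clear
        and change.lines_changed <= 10
    )
    # Anything that does not clearly fit Quick is treated as Full.
    return "quick" if fits_quick else "full"
```

Treating every non-matching case as Full mirrors the "when unsure, use Full" rule.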
## GOAL Input Sheet
### 1. GOAL
[Single measurable goal. No expansion.]
### 2. DONE EVIDENCE
[Completion proof — command to run + expected result. No subjective criteria.]
e.g.: `pytest tests/test_X.py -v` → 5 passed
e.g.: `curl localhost:3000/api/health` → 200 OK
### 3. CONTEXT
[Current state · existing structure · prior decisions · dependencies · known constraints]
### 4. STARTING POINT
[Files/logs/tests to look at first. Start here, no broad exploration.]
### 5. SCOPE
- **Include**: [Editable area + required work]
- **Exclude**: [Out of bounds · unrelated refactors · new features · production behavior changes]
### 6. CONSTRAINTS
- New dependencies: allow/forbid
- Network/API calls: allow/forbid
- Commit/PR/push: allow/forbid
- Migration/DB changes: allow/forbid
- Destructive actions: allow/forbid
### 7. BUDGET
[Time/token/call/cost limits. Follow if given, don't invent if not.]
## GOAL (Quick)
### 1. GOAL
[One-line goal]
### 2. DONE EVIDENCE
[One verification command]
### 3. SCOPE
- **Include**: [Files to modify]
- **Exclude**: [Don't touch]
Fields that can be extracted from the conversation context are auto-filled and shown to the user for confirmation.
After auto-fill: "Input sheet filled. Proceed if correct, or tell me what to change." — never proceed without confirmation.
Any of 7 fields (Quick: 3) missing or contradictory → don't guess, STOP.
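A minimal sketch of this gate, assuming the input sheet is held as a plain dictionary keyed by field name. The names `missing_fields` and `gate` are hypothetical, and the sketch only detects blank fields; spotting contradictory input still needs judgment.

```python
FULL_FIELDS = (
    "GOAL", "DONE EVIDENCE", "CONTEXT", "STARTING POINT",
    "SCOPE", "CONSTRAINTS", "BUDGET",
)
QUICK_FIELDS = ("GOAL", "DONE EVIDENCE", "SCOPE")


def missing_fields(sheet: dict[str, str], mode: str = "full") -> list[str]:
    """List required fields that are absent or blank for the given mode."""
    required = QUICK_FIELDS if mode == "quick" else FULL_FIELDS
    return [name for name in required if not sheet.get(name, "").strip()]


def gate(sheet: dict[str, str], mode: str = "full", user_confirmed: bool = False) -> str:
    """Decide whether work may start: STOP, WAIT for confirmation, or PROCEED."""
    missing = missing_fields(sheet, mode)
    if missing:
        # Never guess a missing field; report exactly what is absent.
        return "STOP: missing " + ", ".join(missing)
    if not user_confirmed:
        return "WAIT: input sheet needs user confirmation"
    return "PROCEED"
```

In this sketch a field counts as filled when it contains any non-blank text, so a BUDGET of "none given" passes; that reading is an assumption.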
Honesty (highest priority):
Success Masquerading Blocklist:
| Pattern | Description |
|---|---|
| Test deletion/disable | Delete failing tests or neutralize with skip/xfail |
| Mock wrapping | Assert only mock return values and report "passed" |
| Threshold relaxation | Lower the bar to pass |
| Benchmark condition change | Change measurement conditions to improve numbers |
| Failure log suppression | Send error output to /dev/null or catch-all |
| Hardcoding | Hardcoded results matching test inputs |
| Requirement reinterpretation | "Actually this was supposed to work like..." |
| Acceptance criteria weakening | Subtly lower DONE EVIDENCE standards |
| Production behavior change | Modify production logic to match tests |
| Goal-lock declaration ignored | Declare "proceeding with goal-lock" then skip the input sheet |
| Structural fix reported as upgrade | Report boilerplate additions as "substantive improvements" |
Language-specific patterns:
- Python: `@pytest.mark.skip`, `@pytest.mark.xfail`, `mock.return_value` abuse
- JavaScript: `test.skip`, `.only` left in, `jest.fn()` chains bypassing real logic
- Go: `t.Skip()`, `//go:build ignore`
- Rust: `#[ignore]`, `#[should_panic]` misuse
PRIORITY: 0 Honesty → 1 Stability → 2 Preserve existing behavior → 3 Verifiability → 4 Performance → 5 Code cleanup
PLAN → DO → VERIFY → FINALIZE → OUTPUT
| Risk | Check |
|---|---|
| Breaking change | Will existing callers break? |
| Race condition | Concurrent access to shared resource? |
| Stale state | Cache/state might not update? |
| Data loss | Irreversible deletion/overwrite? |
| Security | Input validation, permissions, secret exposure? |
| Perf regression | O(n²) introduction? |
| Backward compat | Existing API/interface changing? |
Risk detected → return to PLAN with avoidance strategy.
Actually execute the verification specified in DONE EVIDENCE.
Verification recipes (auto-detect stack):
| Stack | Command |
|---|---|
| Python (pytest) | `pytest -q` + `ruff check` (if available) |
| JavaScript (jest) | `npm test` + `npx eslint .` (if available) |
| TypeScript | `npx tsc --noEmit` + `npm test` |
| Go | `go test ./...` + `go vet ./...` |
| Rust | `cargo test` + `cargo clippy` |
| General | `git diff --stat` (verify change scope) |
Items not verified: NOT RUN: [reason]. Never "it should be fine."
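One way to auto-detect the stack is to look for well-known marker files in the project root. The sketch below assumes that approach; the marker files and the name `detect_commands` are illustrative choices, while the commands themselves come from the recipe table.

```python
from pathlib import Path

# Marker files are an assumption of this sketch; commands follow the table.
# TypeScript is checked before JavaScript because TS projects also have package.json.
RECIPES = [
    (("tsconfig.json",), ["npx tsc --noEmit", "npm test"]),
    (("package.json",), ["npm test", "npx eslint ."]),
    (("go.mod",), ["go test ./...", "go vet ./..."]),
    (("Cargo.toml",), ["cargo test", "cargo clippy"]),
    (("pyproject.toml", "pytest.ini", "setup.py"), ["pytest -q", "ruff check"]),
]
FALLBACK = ["git diff --stat"]


def detect_commands(project_root: str) -> list[str]:
    """Return the verification commands for the first stack whose marker exists."""
    root = Path(project_root)
    for markers, commands in RECIPES:
        if any((root / marker).exists() for marker in markers):
            return commands
    # Unknown stack: at least confirm the scope of the change.
    return FALLBACK
```

The returned commands still have to be executed; anything skipped should be reported as `NOT RUN: [reason]`.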
## Result
**Changed files**: [list]
**Key changes**: [what and why]
**Completion evidence**: [commands run + results]
**Verification**: [passed/failed/not run — each with reason]
**Risks/trade-offs**: [if any]
**Remaining known issues**: [if any]
**Follow-up work**: [if any]
**Final status**: WORKING / PARTIAL / BROKEN
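The final status can be derived from the per-check outcomes in the Verification field. The mapping below is a sketch under stated assumptions: the source only fixes that a single FAIL rules out WORKING; treating "nothing passed" as BROKEN and everything in between as PARTIAL is this example's own reading, and `final_status` is a hypothetical name.

```python
def final_status(results: dict[str, str]) -> str:
    """Map per-check outcomes ("passed", "failed", "not run") to a final status."""
    outcomes = list(results.values())
    if outcomes and all(outcome == "passed" for outcome in outcomes):
        return "WORKING"
    if not any(outcome == "passed" for outcome in outcomes):
        # Assumption: with no passing evidence at all, report BROKEN.
        return "BROKEN"
    # At least one check passed, but one or more failed or were not run.
    return "PARTIAL"
```

An empty result set maps to BROKEN here because no completion evidence exists.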
| # | Condition | Action |
|---|---|---|
| S1 | Goal splits into 2+ independent goals | "Goal is branching. Which one first?" |
| S2 | Input missing/contradictory | Specify exactly what's ambiguous |
| S3 | Need to change SCOPE Exclude area | "Need to modify X but it's Excluded. Allow?" |
| S4 | Destructive / external side effect needed | "DB deletion/API call/push needed. Proceed?" |
| S5 | Insufficient confidence in root cause | "Not sure if cause is A or B" |
| S6 | Same blocker repeated (2+ times) | "Same problem repeating. Need different approach" |
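Rule S6 in the table above depends on noticing that the same blocker has come back. A minimal sketch, assuming blockers are compared by their normalized message text (the `BlockerTracker` class is hypothetical and not part of goal-lock):

```python
from collections import Counter


class BlockerTracker:
    """Counts repeated blockers so the loop can halt on rule S6."""

    def __init__(self, limit: int = 2):
        self.limit = limit  # S6 triggers at 2 or more occurrences
        self.counts = Counter()

    def record(self, blocker: str) -> bool:
        """Record a blocker message; return True when S6 should stop the loop."""
        # Normalize case and whitespace so trivially different messages match.
        key = " ".join(blocker.lower().split())
        self.counts[key] += 1
        return self.counts[key] >= self.limit
```

Usage: call `record()` each time VERIFY fails with the failure message, and stop to ask for a different approach when it returns `True`.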
Checkpoints are saved to `.goal-lock-progress.md` (session crash protection).
GOAL: Fix [symptom]
DONE EVIDENCE: Reproduction test → PASS + all existing tests PASS
SCOPE Exclude: No API signature changes, no new features
GOAL: Implement [feature]
DONE EVIDENCE: N new tests PASS + all existing tests PASS + render/behavior confirmed
SCOPE Exclude: No changes to existing feature behavior
GOAL: Refactor [target] — no behavior change
DONE EVIDENCE: All existing tests PASS (same test count) + before/after diff scope confirmed
SCOPE Exclude: No new features, no API signature changes
| Does | Does NOT |
|---|---|
| Auto-fill input sheet + get user confirmation | Start work without user confirmation |
| Enforce PLAN→DO→VERIFY→FINALIZE→OUTPUT loop | Skip loop steps |
| Halt immediately on STOP RULES | Continue with "probably fine" |
| Detect and block success masquerading patterns | Design code logic (that's the developer/agent's role) |
| Save .goal-lock-progress.md checkpoints | Manage memory/handoff systems |
1. **DONE EVIDENCE must be actually executed**: Run all items before OUTPUT. "It should pass" is not verification. Violation → unverified code reported as "done."
2. **SCOPE Exclude is absolute**: Need to touch Exclude → S3 STOP. Only proceed after user approval. Violation → unintended changes reach production.
3. **Success masquerading detected → OUTPUT BROKEN**: If B1 pattern found in code, mark that verification as FAIL and downgrade OUTPUT to PARTIAL/BROKEN. Violation → false success report.
4. **Incomplete input sheet → no work**: Any of 7 (Quick: 3) fields missing/contradictory → STOP. Don't guess. Violation → unclear goal → rework.
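The SCOPE Exclude rule lends itself to a mechanical check on the list of changed files. The sketch below assumes Exclude entries are written as glob patterns, which the source does not specify; `excluded_touches` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def excluded_touches(changed_files: list[str], exclude_globs: list[str]) -> list[str]:
    """Return the changed files that fall inside the SCOPE Exclude area."""
    return [
        path
        for path in changed_files
        # fnmatchcase's "*" also matches "/", so "migrations/*" covers subfolders.
        if any(fnmatchcase(path, pattern) for pattern in exclude_globs)
    ]
```

A non-empty result corresponds to an S3 STOP: ask the user before continuing, for example after collecting the changed files with `git diff --name-only`.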
| Failure Type | Recovery |
|---|---|
| VERIFY failure | Return to PLAN for root cause analysis → re-DO. Same approach fails twice → S6 STOP |
| Tool failure (Bash/Edit) | 1 retry → report "tool failure" + suggest alternative |
| BUDGET exceeded | Report status + clearly separate done/not-done → user decides |
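The "1 retry" rule for tool failures in the table above can be sketched as a wrapper around any callable. This is only an illustration: `run_with_one_retry` is a hypothetical name, and treating every exception as a tool failure is an assumption of the example.

```python
from typing import Callable, Tuple


def run_with_one_retry(action: Callable[[], object]) -> Tuple[str, object]:
    """Run action; on failure retry once, then report a tool failure."""
    last_error = None
    for _attempt in (1, 2):  # the first try plus exactly one retry
        try:
            return ("ok", action())
        except Exception as exc:  # assumption: any exception is a tool failure
            last_error = exc
    # Both attempts failed: report instead of looping further.
    return ("tool failure", str(last_error))
```

On a `"tool failure"` result the agent would report the error and suggest an alternative instead of trying a third time.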
| Rationalization | Counter |
|---|---|
| "Simple change, don't need the input sheet" | Quick mode exists. Can't fill 3 fields → goal is unclear |
| "Most of VERIFY passed so it's WORKING" | One FAIL = PARTIAL. "Most" ≠ "all" |
| "Test was too strict so I skipped it" | Success masquerading B1 violation. If test is strict, fix the code |
| "Doing the refactor together is more efficient" | SCOPE Exclude violation. Achieve goal first, then separate goal-lock for refactor |
| "This should be fine" | DONE EVIDENCE not executed = Invariant 1 violation |
| "I'll add tests later" | If DONE EVIDENCE includes tests, now. If not, they were never needed |
| "Goal-lock format is overhead, just look at the result" | Format IS discipline. Without the input sheet, scope drift and masquerading detection opportunities vanish. Quick mode is 10 seconds |
npx claudepluginhub alexzio00/claude-code-skills --plugin goal-lock