From superpowers-plus
Enforces evidence-based verification before claiming work complete, fixed, or presenting results. Checks battery sentinel or dispatches battery for code changes.
`npx claudepluginhub bordenet/superpowers-plus --plugin superpowers-plus`

This skill uses the workspace's default tool permissions.
> **Wrong skill?** Pre-commit/push gate → `unified-commit-gate`. Branch done workflow → `finishing-a-development-branch`. Reviewing someone's PR → `providing-code-review`. Presenting a non-code artifact → `progressive-harsh-review`.
The trigger is your INTENT, not your words. The moment you are composing a message to the human that presents results — that is the moment to run this skill. Not after you've written it.
Claiming work is complete without verification is dishonesty, not efficiency.
Core principle: Evidence before claims, always.
Violating the letter of this rule is violating the spirit of this rule.
BEFORE running this skill's gate function, check:
| Task Type | Action |
|---|---|
| Bulk edit, audit, or refactoring | Invoke exhaustive-audit-validation FIRST, then return here |
| Single fix, feature, or bug fix | Continue directly with this skill |
NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
If you haven't run the verification command in this message, you cannot claim it passes.
This skill fires on WHAT YOU ARE ABOUT TO DO, not on what you say.
"Even if there's only a 1% chance you are about to present results — fire this skill."
| Internal state (INTENT) | Required action before writing |
|---|---|
| About to write any response describing implementation results | Stop. Verify evidence first. |
| About to describe what you built or changed | Stop. This IS the trigger. |
| About to share an MR/PR/commit link | Stop. Sentinel must exist for HEAD. |
| About to write a "here's what I did" summary | Stop. Battery must have passed. |
| About to commit or push code changes | Stop. Battery must have passed. |
| About to claim a bug is fixed | Stop. Show test output proving it. |
| About to claim tests pass | Stop. Show the actual test output. |
| Finishing a multi-step task | Stop. Run TODO maintenance first. |
| ANY response that presents results — even without "done" language | STOP. This IS the trigger. |
The output phrase is NOT the trigger. An agent can share an MR link and write a completion summary without using "done", "shipped", or "fixed". That is the exact failure mode this skill exists to prevent.
The trigger is the INTENT TO PRESENT — the moment you begin composing a response to the human that describes results. That moment fires this skill.
BEFORE forming any response that presents results to the human:
0. INTENT CHECK: Am I about to write a response presenting results?
   - YES → continue. NO → this skill doesn't apply yet.
   - Most common false negative: "I'm just sharing a link". That IS presenting results.
1. LOOSE-ENDS RETROSPECTIVE: See "Loose-Ends Retrospective" section below.
   - Scan session for unacted observations and deferred items.
   - Block on any must-address items before proceeding.
2. IDENTIFY: What command proves this claim?
   **Note: `output-verification` pre-satisfies this.** If `output-verification` has already run in this same response step (it fires first, order=0), Steps 2–4 (IDENTIFY, RUN, READ) are already satisfied for artifact-description claims. Continue from Step 5 (VERIFY) for those claims. Still run Steps 2–4 for any non-artifact claims (e.g., "tests pass," "PR created").
3. RUN: Execute the FULL command (fresh, complete)
4. READ: Full output, check exit code, count failures
5. VERIFY: Does output confirm the claim?
   - If NO: State actual status with evidence
   - If YES: State claim WITH evidence
6. CODE REVIEW GATE: If you made code changes, see "Code Review Gate" below.
   Run Step 0 of Code Review Gate BEFORE deciding whether to dispatch.
7. HOUSEKEEPING: If the work spanned multiple steps or used TODO.md, run:
   `~/.codex/superpowers-plus/tools/todo-maintenance.sh`
   Read the summary and resolve any stale-plan/archive surprises before proceeding.
8. ONLY THEN: Write the response

Skip any step = lying, not verifying
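The IDENTIFY, RUN, READ, and VERIFY steps can be sketched as a small shell helper. This is a minimal illustration, not part of superpowers-plus: the name `verify_claim` and its output wording are assumptions made for the example.

```shell
# Hypothetical helper (not part of superpowers-plus): run a verification
# command fresh, then print the claim together with its evidence.
verify_claim() (
  claim=$1
  shift
  log=$(mktemp)
  # RUN: execute the full command and keep its real exit code.
  "$@" >"$log" 2>&1 && status=0 || status=$?
  # VERIFY: state the claim only with the exit code attached.
  if [ "$status" -eq 0 ]; then
    echo "VERIFIED: $claim (exit 0)"
  else
    echo "NOT VERIFIED: $claim (exit $status)"
  fi
  # READ: print the full captured output, not a summary of it.
  cat "$log"
  rm -f "$log"
  exit "$status"
)
```

Called as `verify_claim "tests pass"` followed by the project's own test command, it returns that command's exit code, so a failing run cannot be reported as a pass by accident.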
Purpose: Catch observations noted but not acted on — primary source of shipped bugs and broken links. Defense-in-depth, not a perfect audit.
Scan for:
~/.codex/superpowers-plus/tools/loose-ends.sh check
# Exit 0 = clean. Non-zero = items listed with justification visibility.
The pre-commit hook also runs this automatically at every commit.

Classify each item shown:
| Label | Action |
|---|---|
| `resolved` | Already addressed; proceed |
| `deferred` | Confirm a note/reason line is visible in the output, then proceed. If it is missing, escalate to the human; no retrofit path exists. |
| `must-address` | FIX IT NOW. Do not claim completion until resolved. |
Any must-address item → STOP → fix → restart gate from Step 1.
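The exit-code contract of the check can be wrapped in a small gate. This is a sketch under assumptions: `loose_ends_gate` is a hypothetical name, and the checker is passed in as arguments so that nothing here depends on the tool's output format, which this document does not specify.

```shell
# Hypothetical wrapper: run the loose-ends checker given as arguments and
# gate on its exit code (0 = clean, non-zero = items were listed).
loose_ends_gate() {
  if "$@"; then
    echo "CLEAN: no loose ends reported"
  else
    echo "ITEMS LISTED: classify each one; fix must-address items, then restart from Step 1"
    return 1
  fi
}
```

A typical call would be `loose_ends_gate ~/.codex/superpowers-plus/tools/loose-ends.sh check`. The checker's own output passes through unchanged so each item can still be read and classified.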
If you made code changes, you MUST verify battery evidence before claiming completion.
SENTINEL="$(git rev-parse --show-toplevel 2>/dev/null || echo .)/.code-review-cleared"
cat "$SENTINEL" 2>/dev/null || echo "NO CLEARANCE"
echo "HEAD: $(git rev-parse HEAD 2>/dev/null)"
# Check for uncommitted/staged changes (unreviewed code not yet in HEAD)
git diff --quiet && git diff --cached --quiet && echo "WORKTREE_CLEAN" || echo "WORKTREE_DIRTY"
| Sentinel state | Action |
|---|---|
| `NO CLEARANCE` | Proceed to Step 1 (dispatch battery). |
| Sentinel SHA ≠ HEAD SHA | Proceed to Step 1 (battery is stale — changes were made after last review). |
| Sentinel valid for HEAD but `WORKTREE_DIRTY` | Proceed to Step 1 (staged/unstaged changes exist that were not reviewed; the sentinel covers HEAD, not the current diff). |
| `v1\|SHA\|PASS\|...` or `PASS_WITH_NITS`, SHA matches HEAD, AND `WORKTREE_CLEAN` | Evidence confirmed. Skip Step 1. Note the clearance and proceed to Housekeeping (Step 7 of the gate function). |
| Malformed | Delete .code-review-cleared, proceed to Step 1. |
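The table's decision logic can be sketched as one function over the three values Step 0 prints. This is an illustration, not the hook's actual code. It assumes only the first three `|`-separated fields of the sentinel (`v1`, SHA, verdict), and it treats any verdict other than `PASS` or `PASS_WITH_NITS` as a reason to dispatch. The name `classify_sentinel` and the output strings are hypothetical.

```shell
# Hypothetical classifier for the sentinel states in the table above.
# $1 = sentinel contents (empty if the file is missing)
# $2 = HEAD SHA, $3 = WORKTREE_CLEAN or WORKTREE_DIRTY
classify_sentinel() (
  sentinel=$(printf '%s\n' "$1" | head -n 1)
  head_sha=$2
  worktree=$3
  [ -n "$sentinel" ] || { echo "DISPATCH: no clearance"; exit 0; }
  # cut -s yields an empty field when the line has no '|' delimiter at all.
  version=$(printf '%s\n' "$sentinel" | cut -s -d'|' -f1)
  sha=$(printf '%s\n' "$sentinel" | cut -s -d'|' -f2)
  verdict=$(printf '%s\n' "$sentinel" | cut -s -d'|' -f3)
  if [ "$version" != "v1" ] || [ -z "$sha" ] || [ -z "$verdict" ]; then
    echo "DISPATCH: malformed sentinel"; exit 0
  fi
  case $verdict in
    PASS|PASS_WITH_NITS) ;;
    *) echo "DISPATCH: verdict is not passing"; exit 0 ;;  # assumption
  esac
  [ "$sha" = "$head_sha" ] || { echo "DISPATCH: sentinel is stale"; exit 0; }
  [ "$worktree" = "WORKTREE_CLEAN" ] || { echo "DISPATCH: worktree dirty"; exit 0; }
  echo "CLEARED"
)
```

The function only classifies. On a malformed result the caller would still delete `.code-review-cleared` before dispatching, as the table requires.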
One-per-unit rule (agent self-enforcement): Battery fires at most once per coherent unit of work. If Step 0 confirms evidence, do NOT re-dispatch. This prevents double-dispatch when requesting-code-review and verification-before-completion both apply to the same moment. Note: this rule is expressed in skill prose (agent-layer), not in the runtime. The mechanical enforcement is at the git-hook layer (pre-commit Gate 0, pre-push Gate 1). Both layers are complementary.
Self-review is not review. The implementer cannot objectively evaluate their own work.
| Condition | Action |
|---|---|
| Made code changes (any `.ts`, `.js`, `.py`, `.sh`, etc.) | Dispatch `sub-agent-code-reviewer` with diff context |
| Documentation-only changes | Skip code review (still verify links/content) |
| Config-only changes (env, yaml, toml, json) | Treat as code — dispatch code review (push gate enforces sentinel for these types) |
| Reviewer found issues | Fix issues, re-dispatch reviewer |
| Reviewer approved | Proceed to Housekeeping (Step 7 of the gate function) |
Dispatch template:
Provide the reviewer with:
1. What was implemented (1-2 sentences)
2. Files changed (list with purpose)
3. The actual diff or file contents to review
4. Specific review questions (verifiable, not "is this good?")
5. Request to run tests independently
The reviewer loads providing-code-review automatically. You do not need to tell them how to review.
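The five items of the dispatch template can be assembled into a single request body. This is a sketch: `build_review_request` and the `##` section labels are assumptions chosen for readability, not a format the reviewer is known to require.

```shell
# Hypothetical builder for the reviewer request described above.
# $1 = what was implemented, $2 = files changed (one per line, with purpose)
# $3 = diff or file contents, $4 = specific review questions
build_review_request() {
  cat <<EOF
## What was implemented
$1

## Files changed
$2

## Diff
$3

## Review questions
$4

## Request
Please run the tests independently and report the actual output.
EOF
}
```

The diff argument could come from `git diff` against the base branch. Keeping the function free of git calls lets the same text be built for staged, committed, or pasted changes.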
Why: The 2026-03-23 incident — implementer ran 1,636 tests, self-reviewed, claimed "Fixed". Reviewer found a state leak immediately. Self-review is not review.
| Claim | Requires | Not Sufficient |
|---|---|---|
| Tests pass | Test command output: 0 failures | Previous run, "should pass" |
| Linter clean | Linter output: 0 errors | Partial check, extrapolation |
| Build succeeds | Build command: exit 0 | Linter passing, logs look good |
| Bug fixed | Test original symptom: passes | Code changed, assumed fixed |
| PR created | API response showing PR exists | git push succeeded |
| Shipped | PR merged confirmation | PR created |
| Excuse | Reality |
|---|---|
| "Should work now" | RUN the verification |
| "I'm confident" | Confidence ≠ evidence |
| "Just this once" | No exceptions |
| "Push succeeded" | Push ≠ PR created |
| "Agent said success" | Verify independently |
| "Different words so rule doesn't apply" | Spirit over letter |
PR: ✅ [API returns: state=open, number=17] vs ❌ "Shipped!" after git push
Tests: ✅ [34/34 pass] vs ❌ "Should pass now"
Build: ✅ [exit 0] vs ❌ "Linter passed"
| Date | Violation | Impact |
|---|---|---|
| 2026-03-13 | Said "Shipped! 🚀" after git push before verifying PR created | Trust erosion, required post-hoc verification |
| 2026-03-23 | Claimed "Fixed" without dispatching code reviewer | State leak caught only after user forced review |
| 2026-04-02a | Presented work as "ready to commit and push" without running code review battery | Human caught the gap; unreviewed code nearly shipped |
| 2026-04-02b | Committed, pushed branch, shared MR link + summary — no battery run | Wrong approach shipped to MR. Battery (when forced) found safety regression. Skills had explicit incident history from same session yet gate still failed. Transition from "implementation" to "reporting" was not recognized as trigger. Fix: sentinel file gate in pre-push hook + explicit "writing a summary" trigger. |
No shortcuts. Run the command. Read the output. THEN claim the result. Non-negotiable.