From claude-code-config
Verifies implemented features against frozen acceptance criteria using independent agents post-build, ensuring unbiased confirmation specs are met.
npx claudepluginhub anastasiyaw/claude-code-configThis skill uses the workspace's default tool permissions.
Plan-based verification: freeze acceptance criteria BEFORE building, verify AFTER with independent agents.
Verifies task completion by enforcing fresh automated test runs, runtime evidence review, and spec re-read in /dev workflow Phase 7.
Runs structured verification pass on implemented changes: checks behavior vs design, happy/edge cases, failure recovery, docs alignment, records gaps/owners. Use for feature completion, risky fixes, milestones.
Enforces evidence-based verification by running fresh tests, builds, linters, reviewing outputs before claiming work done, committing, or PRing.
Share bugs, ideas, or general feedback.
Plan-based verification: freeze acceptance criteria BEFORE building, verify AFTER with independent agents.
PHASE 1: PLAN (before any code)
Create .proof/PLAN.md with numbered acceptance criteria
Each AC: testable, specific, has a verification command or check
Plan is FROZEN - no changes during build
PHASE 2: BUILD (normal work)
Implement against the plan
Mark progress in .proof/PROGRESS.md
Builder does NOT self-verify
PHASE 3: VERIFY (after build, independent agent)
Fresh agent reads PLAN.md (never saw the build process)
Walks through each AC, runs verification commands
Writes .proof/VERDICT.md with PASS/FAIL per criterion
If any FAIL → .proof/PROBLEMS.md with specific fixes
PHASE 4: FIX (if needed)
Builder reads PROBLEMS.md, makes minimal fixes
Back to PHASE 3 (re-verify)
Loop until all PASS
Create .proof/PLAN.md in the project root:
# Verification Plan
**Created:** YYYY-MM-DD HH:MM
**Task:** [one-line description]
**Builder:** [session ID or "current"]
**Status:** FROZEN
## Acceptance Criteria
### AC1: [short name]
**Description:** [what must be true]
**Verify:** [exact command or check to run]
**Expected:** [what success looks like]
### AC2: [short name]
**Description:** [what must be true]
**Verify:** [exact command or check to run]
**Expected:** [what success looks like]
### AC3: [short name]
...
## Out of Scope
- [explicitly what this plan does NOT cover]
## Constraints
- [time, resource, or technical constraints]
Rules for good ACs:
Normal implementation. The only additions:
.proof/PROGRESS.md as you work:# Build Progress
### AC1: [name]
- [x] Implemented in `src/foo.py:42`
- Files changed: `src/foo.py`, `tests/test_foo.py`
### AC2: [name]
- [x] Implemented in `src/bar.py:18`
- Files changed: `src/bar.py`
- Note: chose approach B because [reason]
.proof/EVIDENCE.md:# Evidence
### AC1: [name]
**Command:** `pytest tests/test_foo.py -v`
**Output:**
\```
tests/test_foo.py::test_returns_correct PASSED
tests/test_foo.py::test_handles_edge PASSED
\```
**Result:** PASS
### AC2: [name]
**Command:** `grep -c "TODO" src/bar.py`
**Output:** `0`
**Result:** PASS
Builder collects evidence but does NOT write the verdict. That is the verifier's job.
This is the critical phase. The verifier MUST be:
PLAN.md + access to the codebasePROGRESS.md, EVIDENCE.md, or any build contextYou are an independent verifier. Your job is to check whether
the implementation meets the acceptance criteria in .proof/PLAN.md.
Rules:
1. Read .proof/PLAN.md first. This is your ONLY specification.
2. For each AC, run the verification command yourself.
3. Do NOT read .proof/PROGRESS.md or .proof/EVIDENCE.md
(those are the builder's claims - you verify independently).
4. Write your verdict to .proof/VERDICT.md in this format:
# Verification Verdict
**Verifier:** [your session ID]
**Date:** YYYY-MM-DD HH:MM
**Plan hash:** [first 8 chars of md5 of PLAN.md]
## Results
### AC1: [name]
**Status:** PASS | FAIL
**Evidence:** [what you saw when you ran the check]
**Notes:** [any observations]
### AC2: [name]
...
## Summary
- Total: N criteria
- Passed: X
- Failed: Y
- **Overall:** PASS | FAIL
5. If any AC fails, also create .proof/PROBLEMS.md:
# Problems
### AC2: [name]
**Expected:** [from PLAN.md]
**Actual:** [what you found]
**Suggested fix:** [smallest change that would fix it]
**Affected files:** [list]
6. Do NOT fix anything. You are read-only. Report only.
Option A: Subagent (same session)
Agent({
description: "Independent verification against plan",
prompt: "[verifier prompt above]",
mode: "plan" // read-only first
})
Option B: Fresh session (stronger isolation) Write handoff with instruction: "Start by reading .proof/PLAN.md and running verification."
Option C: Multiple verifiers (highest confidence) Spawn 2-3 verifiers independently. If they disagree on any AC, that AC needs investigation.
If VERDICT.md shows any FAIL:
PROBLEMS.mdEVIDENCE.md with new evidence for failed ACsTypical: 1-2 fix rounds. If 3+ rounds on same AC → the AC itself might be wrong. Revisit PLAN.md.
.proof/
PLAN.md # frozen acceptance criteria (Phase 1)
PROGRESS.md # builder's notes (Phase 2)
EVIDENCE.md # builder's evidence (Phase 2)
VERDICT.md # verifier's verdict (Phase 3)
PROBLEMS.md # verifier's findings (Phase 3, if failures)
| Symptom | Cause | Fix |
|---|---|---|
| Verifier passes everything | ACs too vague | Rewrite with specific commands |
| 3+ fix rounds on same AC | AC is wrong or untestable | Revisit PLAN.md |
| Verifier disagrees with builder's evidence | Different env or stale state | Both run from clean state |
| Builder keeps editing PLAN.md | Not frozen | Hash check catches this |