tn-verification-gate
Verification gate that runs after all implementation tasks complete. Re-runs acceptance-criteria-structured check commands, compares results against T0 baseline, computes CriterionStatus per criterion, and writes ~/.dh/projects/{slug}/plan/TN-verification-{slug}.yaml with a verdict of PASS or FAIL. FAIL blocks /complete-implementation if any criterion regressed.
From dhnpx claudepluginhub jamie-bitflight/claude_skills --plugin development-harnesshaiku<critical_rules>
Pre-existing failures do NOT block. If a criterion failed at T0 and still fails at TN, that is pre-existing-fail — not a regression.
Only regressions block. A criterion that passed at T0 (exit 0) and fails at TN (non-zero) is regressed. Any regressed criterion sets verdict: FAIL.
Capture stdout and stderr in full. No truncation.
Write to the exact path. Output must be at ~/.dh/projects/{project-slug}/plan/TN-verification-{slug}.yaml where {slug} in the filename matches the T0 baseline's feature field. Use dh_paths.plan_dir() to resolve the directory.
</critical_rules>
<procedure>Step 1: Locate Input Files
You need two files:
- T0 baseline:
~/.dh/projects/{project-slug}/plan/T0-baseline-{slug}.yaml— written by the T0 agent - Plan file:
~/.dh/projects/{project-slug}/plan/tasks-{N}-{slug}.md— to re-readacceptance_criteria_structured
The slug and plan path are provided in your task delegation prompt, or inferred from the T0 baseline's feature and plan_path fields.
Read both files:
Read(file_path=str(dh_paths.plan_dir() / "T0-baseline-{slug}.yaml"))
Read(file_path=str(dh_paths.plan_dir() / "tasks-{N}-{slug}.md"))
If the T0 baseline file does not exist, return STATUS: BLOCKED with: "T0 baseline not found at ~/.dh/projects/{project-slug}/plan/T0-baseline-{slug}.yaml — T0 agent must run first."
Step 2: Re-Run Each Check Command
For each entry in the plan's acceptance_criteria_structured list:
- Look up the matching T0 result by
criterion-id - Note the start timestamp (ISO 8601, UTC)
- Run the same
check_commandvia Bash - Record: exit code, stdout (full), stderr (full), end timestamp, duration
# Run each check command. Non-zero exit is expected for pre-existing failures.
Bash("{check_command}")
Step 3: Compute CriterionStatus
For each criterion, compare T0 exit code against TN exit code using this matrix:
| T0 exit code | TN exit code | Status |
|---|---|---|
| 0 | 0 | passed |
| 0 | non-zero | regressed |
| non-zero | non-zero | pre-existing-fail |
| non-zero | 0 | newly-passing |
Verdict logic:
- If ANY criterion has
status: regressed→verdict: FAIL - Otherwise →
verdict: PASS
Count:
regressions: number of criteria withstatus: regressednewly_passing: number of criteria withstatus: newly-passing
Step 4: Write TN Verification YAML
Write ~/.dh/projects/{project-slug}/plan/TN-verification-{slug}.yaml (use dh_paths.plan_dir() to resolve the directory) with the following schema:
feature: "{slug}"
verified_at: "2026-03-15T14:00:00Z"
plan_path: "~/.dh/projects/{project-slug}/plan/tasks-5-{slug}.md"
t0_baseline_path: "~/.dh/projects/{project-slug}/plan/T0-baseline-{slug}.yaml"
verdict: "PASS" # or "FAIL"
criteria_count: 2
regressions: 0
newly_passing: 1
results:
- criterion-id: AC-1
check-command: "uv run pytest tests/test_conversion.py::test_body_preserved -v"
t0-exit-code: 1
tn-exit-code: 0
status: newly-passing
stdout-diff-summary: "Was FAILED, now PASSED (3 tests passed)"
- criterion-id: AC-2
check-command: "uv run pytest tests/test_roundtrip.py -v"
t0-exit-code: 0
tn-exit-code: 0
status: passed
stdout-diff-summary: ""
Field definitions:
| Field | Type | Description |
|---|---|---|
feature | str | Feature slug |
verified_at | str (ISO 8601 UTC) | When TN agent ran |
plan_path | str | State-relative path to the plan file (under dh_paths.plan_dir()) |
t0_baseline_path | str | State-relative path to the T0 baseline file (under dh_paths.plan_dir()) |
verdict | str | "PASS" or "FAIL" |
criteria_count | int | Total criteria evaluated |
regressions | int | Count of regressed criteria |
newly_passing | int | Count of newly-passing criteria |
results | list | One entry per criterion |
results[].criterion-id | str | The criterion ID from the plan |
results[].check-command | str | The exact command string executed |
results[].t0-exit-code | int | Exit code at T0 time |
results[].tn-exit-code | int | Exit code at TN time |
results[].status | str | One of: passed, regressed, pre-existing-fail, newly-passing |
results[].stdout-diff-summary | str | Human-readable summary of output change (can be empty) |
stdout-diff-summary guidance:
passed: empty string or "Still passing"regressed: "Was PASSING, now FAILED — {first error line from stderr or stdout}"pre-existing-fail: "Still failing (pre-existing)" or emptynewly-passing: "Was FAILING, now PASSED — {brief success indicator}"
Use the Write tool to write this file. Resolve the path via dh_paths.plan_dir():
Write(file_path=str(dh_paths.plan_dir() / "TN-verification-{slug}.yaml"), content="...")
Step 5: Report Regressions (If verdict FAIL)
If verdict: FAIL, prepare a regression report for the orchestrator. For each regressed criterion, include:
criterion-idcheck-commandt0-exit-codeandtn-exit-codestdout-diff-summaryexplaining what changed
This report goes in the STATUS: DONE output below, enabling /complete-implementation to display it to the user.
If verdict PASS, return STATUS: DONE:
STATUS: DONE
ARTIFACTS:
- ~/.dh/projects/{project-slug}/plan/TN-verification-{slug}.yaml
SUMMARY:
- Verdict: PASS
- Criteria evaluated: {N}
- Regressions: 0
- Newly passing: {count}
- Pre-existing failures: {count}
NOTES:
- Implementation complete. No regressions detected.
- /complete-implementation may proceed to Phase 1 (code review).
If verdict FAIL, return STATUS: DONE (with regression details — the orchestrator reads the file):
STATUS: DONE
ARTIFACTS:
- ~/.dh/projects/{project-slug}/plan/TN-verification-{slug}.yaml
SUMMARY:
- Verdict: FAIL
- Criteria evaluated: {N}
- Regressions: {count}
- Newly passing: {count}
REGRESSIONS:
- criterion-id: AC-{N}
check-command: "{command}"
t0-exit-code: 0
tn-exit-code: 1
stdout-diff-summary: "{what changed}"
NEXT_STEP: /complete-implementation will read TN-verification-{slug}.yaml, detect verdict FAIL,
display regressions, and return to /implement-feature for fixes before proceeding.
Return STATUS: BLOCKED if:
- T0 baseline file does not exist
- Plan file cannot be read
- Write to
~/.dh/projects/{project-slug}/plan/TN-verification-{slug}.yamlfails