From fakoli-flow
Verify phase — evidence-based validation with sentinel dispatch and pass/fail scorecard
npx claudepluginhub fakoli/fakoli-plugins --plugin fakoli-flow

This skill uses the workspace's default tool permissions.
Verification is not an opinion. It is a command you ran, output you read, and a result you can cite.
Core principle: Every PASS must cite fresh command output from this session. Every FAIL must cite what the output actually showed.
This skill is invoked:
- after /flow:execute completes
- directly via /flow:verify
- after /flow:quick, once the agent finishes

Run the following to determine which verification commands apply:
# Check for language markers in order of specificity
[ -f tsconfig.json ] && echo "TypeScript"
[ -f Cargo.toml ] && echo "Rust"
{ [ -f pyproject.toml ] || [ -f setup.py ]; } && echo "Python"
| Marker file | Language |
|---|---|
| tsconfig.json or package.json | TypeScript |
| Cargo.toml | Rust |
| pyproject.toml or setup.py | Python |
If multiple markers exist, prefer the most specific: tsconfig.json > package.json.
If no marker is found: ask the user before proceeding.
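The detection rules above can be sketched as one function. The marker files are the ones from the table; checking TypeScript markers first encodes the precedence rule, and the "unknown" fallback signals that the caller must ask the user:

```shell
# Sketch of marker detection, most specific language first.
detect_language() {
  if [ -f tsconfig.json ] || [ -f package.json ]; then
    echo "TypeScript"   # tsconfig.json > package.json, both map to TypeScript
  elif [ -f Cargo.toml ]; then
    echo "Rust"
  elif [ -f pyproject.toml ] || [ -f setup.py ]; then
    echo "Python"
  else
    echo "unknown"      # no marker found — ask the user before proceeding
  fi
}
```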
Run the full command for the detected language. Do not split it. Do not skip a step because the previous one passed.
TypeScript:
npx tsc --noEmit && bun test
Python:
ruff check . && mypy . && pytest
Rust:
cargo check && cargo test
Capture the full output of each command. Read exit codes. Count errors explicitly — do not skim.
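One way to honor "capture the full output, read exit codes, count errors" is to wrap each step in a small helper. This is a sketch, not part of the skill's required commands; the `run_step` name and the line-based error count are illustrative:

```shell
# Run one verification step, keep its full output in a log file,
# and report the real exit code plus an explicit error count.
run_step() {
  log="$1"; shift
  "$@" >"$log" 2>&1                    # capture everything the command printed
  status=$?
  errors=$(grep -ci 'error' "$log")    # crude count of lines mentioning "error"
  echo "exit=$status errors=$errors"
  return "$status"
}
```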
If fakoli-crew is installed, dispatch the sentinel agent with the acceptance criteria from the plan.
How to find the plan:
ls docs/plans/ | sort | tail -1
Read the most recent plan file. Extract the acceptance criteria for each task (the **Acceptance criteria:** bullet points under each task heading).
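Assuming the plan layout described above (a `**Acceptance criteria:**` line followed directly by `- ` bullets under each task heading), extraction can be sketched as:

```shell
# Hypothetical extractor for the assumed plan layout: print the "- " bullets
# that immediately follow each "**Acceptance criteria:**" line.
extract_criteria() {
  awk '/\*\*Acceptance criteria:\*\*/ { grab = 1; next }
       grab && /^- /                  { print; next }
       { grab = 0 }' "$1"
}
```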
If multiple plans exist for the current date: Do not guess. Ask the user:
Multiple plans found for today:
- docs/plans/2026-04-04-feature-a.md
- docs/plans/2026-04-04-feature-b.md
Which plan should I verify against?
After /flow:quick (no plan file): Quick mode does not create a plan file. If verify is invoked after a quick session, ask the user for the original task description and verify the modified files against it. Use the same evidence gate — every PASS still requires a command output to cite.
Dispatch:
Agent(
subagent_type="fakoli-crew:sentinel",
prompt="Run verification against the following acceptance criteria. For every criterion, run the exact verify command from the plan, read the full output, and report PASS or FAIL with evidence. Do not claim PASS without a command output from this session to cite.
Acceptance criteria:
<paste criteria from plan>
Plan file: docs/plans/<filename>
Language: <detected language>
"
)
If fakoli-crew is not installed: Skip sentinel dispatch. Run the criteria checks yourself using the verify commands listed in the plan. Apply the same evidence gate.
This is non-negotiable. Every PASS must cite a specific piece of fresh evidence. Every FAIL must state what the output showed.
| Evidence type | Example |
|---|---|
| Exit code 0 from test command | bun test exited 0, 34/34 tests passed |
| Zero errors in typecheck output | npx tsc --noEmit output: (empty) |
| Expected value present in output | pytest output contains 5 passed |
| File exists at expected path | ls src/retry.ts exits 0 |
| Not evidence | Why it fails |
|---|---|
| "Should work" | Expectation is not observation |
| Output from a previous session | Stale — the code may have changed |
| An agent's claim without command output | Agent reports are not verification |
| Partial output ("first 10 lines looked fine") | Partial proves nothing — errors appear at the end |
| "Looks good" | This is an opinion |
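The two tables above reduce to a mechanical gate: a criterion is PASS only when the command exits 0 and the expected value appears in output captured this session. A sketch (the `check` name and message format are illustrative, not part of the skill):

```shell
# Evidence gate: exit code 0 AND expected string present in fresh output,
# otherwise the criterion is a FAIL with the actual output cited.
check() {
  expect="$1"; shift
  out=$("$@" 2>&1)
  status=$?
  if [ "$status" -eq 0 ] && printf '%s\n' "$out" | grep -qF -- "$expect"; then
    echo "PASS — exit 0, output contains \"$expect\""
  else
    echo "FAIL — exit $status, output was: $out"
  fi
}
```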
If a command exits non-zero, or its output contains errors, or an expected value is absent, mark that check FAIL and cite exactly what the output showed.
Report the results in this exact format:
## Verification Scorecard
Language: TypeScript
Plan: docs/plans/2026-04-04-feature-name.md
### Type Check
PASS — `npx tsc --noEmit` exited 0, no output (zero errors)
### Tests
PASS — `bun test` exited 0: 34/34 tests passed, 0 failed
### Acceptance Criteria
- [PASS] Retry function accepts optional timeout parameter
Evidence: `bun test src/retry.test.ts` — "timeout test" passed
- [PASS] Default timeout is 5000ms when not provided
Evidence: test output shows "default timeout: 5000ms"
- [FAIL] Timeout triggers RateLimitError on expiry
Evidence: `bun test` output shows 1 failed test: "expected RateLimitError, got TimeoutError"
---
Result: 2/3 criteria PASS — NOT READY TO SHIP
If all criteria pass:
Result: 3/3 criteria PASS — READY TO SHIP
Present the scorecard. If the result is READY TO SHIP, suggest /flow:finish to ship.

Do not proceed past this skill if any of the following are true. State the problem explicitly and return control to the user:
- Claiming PASS because a prior session looked fine. The code changed. Run the command now.
- Skipping the sentinel because "the tests already cover it." Tests cover implementation. The sentinel checks acceptance criteria. These are different things.
- Marking PASS on a criterion with no verify command. If the plan has no verify command for a criterion, ask how to verify it. Do not assume.
- Stopping after the first failure. Run all checks. A full scorecard is more useful than a partial one. Report everything.