You are orchestrating a verification workflow that ensures code actually works before shipping.
"Evidence before assertions. Always." - Verification philosophy
State file: .claude/verify-state.json
{
  "session_id": "verify-YYYY-MM-DD-HHMM",
  "execute_session_id": "exec-YYYY-MM-DD-HHMM",
  "branch": "execute/task-1-feature",
  "started_at": "ISO timestamp",
  "status": "initialising|verifying|completed|failed",
  "phases": {
    "test_suite": { "status": "pending|PASS|FAIL" },
    "lint": { "status": "pending|PASS|FAIL" },
    "type_check": { "status": "pending|PASS|FAIL" },
    "output_verification": { "status": "pending|PASS|FAIL" },
    "code_simplification": { "status": "pending|PASS|FAIL|skipped" }
  },
  "all_checks_passed": false,
  "human_checkpoint": {
    "prompted_at": null,
    "decision": null,
    "caveats": []
  },
  "unlock_ship": false
}
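For reference, a minimal sketch of how a phase result might be recorded in this file, assuming Node's fs module; the recordPhase helper and the temp-file write are illustrative, not part of the command:

// Illustrative only: record a phase result in .claude/verify-state.json.
import { readFileSync, writeFileSync, renameSync } from "node:fs";

const STATE_PATH = ".claude/verify-state.json";

function recordPhase(phase: string, status: "PASS" | "FAIL" | "skipped"): void {
  const state = JSON.parse(readFileSync(STATE_PATH, "utf8"));
  state.phases[phase] = { status };
  // Write to a temp file and rename so a crash never leaves a half-written state file.
  writeFileSync(`${STATE_PATH}.tmp`, JSON.stringify(state, null, 2));
  renameSync(`${STATE_PATH}.tmp`, STATE_PATH);
}

// e.g. recordPhase("test_suite", "PASS");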
Check for flags first:
--status → Display current verification status, exit
--resume → Load checkpoint, resume from saved state
--list-sessions → Show available sessions, exit
The SessionStart hook validates that a completed claudikins-kernel:execute session exists before verification starts.
On validation failure:
ERROR: claudikins-kernel:execute has not been run
You must run claudikins-kernel:execute before claudikins-kernel:verify.
The verification command requires completed execution state.
Run: claudikins-kernel:execute [plan-file]
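A sketch of what that validation could look like; the execute state path and its status field are assumptions, since the execute command's own state format is not specified here:

// Illustrative only: the execute state path and field names are assumptions.
import { existsSync, readFileSync } from "node:fs";

function executeStateIsComplete(path = ".claude/execute-state.json"): boolean {
  if (!existsSync(path)) return false;
  const state = JSON.parse(readFileSync(path, "utf8"));
  return state.status === "completed";
}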
Detect project type automatically:
if package.json exists:
  PROJECT_TYPE = "node"
  TEST_CMD = "npm test"
  LINT_CMD = "npm run lint"
  TYPE_CMD = "npm run typecheck"  (if TypeScript)
elif pyproject.toml or setup.py:
  PROJECT_TYPE = "python"
  TEST_CMD = "pytest"
  LINT_CMD = "ruff check ."
  TYPE_CMD = "mypy ."
elif Cargo.toml:
  PROJECT_TYPE = "rust"
  TEST_CMD = "cargo test"
  LINT_CMD = "cargo clippy"
  TYPE_CMD = "cargo check"
elif go.mod:
  PROJECT_TYPE = "go"
  TEST_CMD = "go test ./..."
  LINT_CMD = "golangci-lint run"
  TYPE_CMD = "go build ./..."
else:
  PROJECT_TYPE = "unknown"
  Ask the user for commands
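The same detection, expressed as an illustrative script; the tsconfig.json check for deciding whether a Node project uses TypeScript is an assumption:

// Illustrative detection mirroring the table above; file existence checks only.
import { existsSync } from "node:fs";

type Project = { type: string; test: string; lint: string; typecheck: string };

function detectProject(): Project | null {
  if (existsSync("package.json")) {
    // Assumption: treat the presence of tsconfig.json as "this project uses TypeScript".
    const ts = existsSync("tsconfig.json");
    return { type: "node", test: "npm test", lint: "npm run lint",
             typecheck: ts ? "npm run typecheck" : "" };
  }
  if (existsSync("pyproject.toml") || existsSync("setup.py"))
    return { type: "python", test: "pytest", lint: "ruff check .", typecheck: "mypy ." };
  if (existsSync("Cargo.toml"))
    return { type: "rust", test: "cargo test", lint: "cargo clippy", typecheck: "cargo check" };
  if (existsSync("go.mod"))
    return { type: "go", test: "go test ./...", lint: "golangci-lint run", typecheck: "go build ./..." };
  return null; // unknown: ask the user for commands
}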
Run in sequence. STOP on any failure.
# Run tests with timeout
timeout 300 ${TEST_CMD}
On failure:
Tests failed.
[Show test output]
[Fix tests] [Re-run (flaky?)] [Skip tests] [Abort]
Flaky test detection (C-12): if the tests fail, offer a re-run:
Test failure detected. Could be flaky.
[Re-run tests] [Accept failure] [Abort]
If re-run passes:
Tests passed on retry. Likely flaky.
[Accept with flakiness caveat] [Fix tests] [Abort]
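A sketch of that retry logic, using Node's child_process and the 300-second timeout from above; the helper names are illustrative:

// Illustrative only: re-run once and classify the outcome.
import { execSync } from "node:child_process";

function runOnce(cmd: string): boolean {
  try {
    execSync(cmd, { stdio: "inherit", timeout: 300_000 }); // same 300 s budget as above
    return true;
  } catch {
    return false;
  }
}

function classifyTestRun(testCmd: string): "PASS" | "FAIL" | "FLAKY" {
  if (runOnce(testCmd)) return "PASS";
  // A pass on the immediate retry suggests flakiness rather than a real regression.
  return runOnce(testCmd) ? "FLAKY" : "FAIL";
}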
${LINT_CMD}
On failure with --fix-lint:
Lint issues found. Auto-fix available.
[Apply fixes] [Show issues] [Skip lint] [Abort]
After auto-fix, re-run lint to confirm:
${LINT_CMD}
If still failing after fix:
Auto-fix did not resolve all issues.
Remaining issues:
[Show remaining issues]
[Fix manually] [Skip lint] [Abort]
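A sketch of the auto-fix flow; fixCmd is an assumption, since the actual fix command depends on the toolchain (for example, ruff check --fix . for Python):

// Illustrative only: fixCmd varies by toolchain and is assumed, not specified.
import { execSync } from "node:child_process";

function passes(cmd: string): boolean {
  try { execSync(cmd, { stdio: "inherit" }); return true; } catch { return false; }
}

function lintWithAutofix(lintCmd: string, fixCmd: string): "PASS" | "FAIL" {
  if (passes(lintCmd)) return "PASS";
  passes(fixCmd);                            // apply auto-fixes; ignore its exit code
  return passes(lintCmd) ? "PASS" : "FAIL";  // re-run lint to confirm
}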
${TYPE_CMD}
On failure:
Type errors found.
[Show errors]
[Fix errors] [Skip type check] [Abort]
Automated checks complete.
Tests: ${TEST_STATUS}
Lint: ${LINT_STATUS}
Types: ${TYPE_STATUS}
[Continue to Output Verification] [Re-run checks] [Abort]
"Give Claude a tool to see the output of the code." - Boris
This is the feedback loop that makes Claude's code actually work.
Task(catastrophiser, {
  prompt: `
    Verify the implementation WORKS by SEEING its output.
    Project type: ${PROJECT_TYPE}
    Branch: ${BRANCH}
    Use the appropriate verification method:
    - Web: Start server, screenshot key pages
    - API: Curl endpoints, verify responses
    - CLI: Run commands, check output
    - Library: Run examples, verify results
    Capture evidence. Report any issues clearly.
    Output JSON with status and evidence.
  `,
  context: "fork",
  model: "opus",
});
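The prompt only requires JSON with status and evidence; one assumed shape for that output, useful when parsing the agent's report, might be:

// Assumed output shape; only "status" and "evidence" are implied by the prompt.
interface OutputVerificationResult {
  status: "PASS" | "FAIL";
  evidence: Array<{
    kind: "screenshot" | "curl" | "command";
    description: string;
    pathOrOutput: string;
  }>;
  issues: string[];
}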
Fallback cascade for output verification (sketched below):
1. Try the primary method (screenshot/curl/run)
   └─ If it fails → try the secondary method
2. Try tests-only verification
   └─ If it fails → try CLI verification
3. Try code review only
   └─ If it fails → report inability to verify
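A sketch of that cascade; the method functions are placeholders for whatever the agent actually runs:

// Illustrative only: try each method in order, stop at the first that succeeds.
type Verifier = () => Promise<boolean>;

async function verifyWithFallbacks(methods: Verifier[]): Promise<boolean> {
  for (const attempt of methods) {
    try {
      if (await attempt()) return true; // first method that produces evidence wins
    } catch {
      // fall through to the next method
    }
  }
  return false; // unable to verify: report that rather than guessing
}

// e.g. verifyWithFallbacks([screenshotOrCurl, testsOnly, cliRun, codeReviewOnly])
//      where each entry is a placeholder for an actual verification routine.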
Output verification complete.
Agent: catastrophiser
Status: ${VERIFICATION_STATUS}
Evidence: ${EVIDENCE_COUNT} items
[Show evidence] [Accept] [Debug] [Skip] [Abort]
Prerequisite: Phase 2 (catastrophiser) must PASS
If --skip-simplify flag set:
Skipping code simplification.
Proceeding directly to human checkpoint.
Otherwise, ask:
Run cynic for polish pass?
This will:
- Inline single-use helpers
- Remove dead code
- Improve naming clarity
- Flatten nested conditionals
Tests will be re-run after each change.
[Run polish pass] [Skip] [Abort]
Task(cynic, {
  prompt: `
    Polish the implementation while preserving behaviour.
    Rules:
    - Tests MUST still pass after each change
    - One change at a time
    - Revert on test failure
    - Stop after 3 passes or 3 consecutive failures
    Output JSON with simplifications made and test status.
  `,
  context: "fork",
  model: "opus",
});
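A sketch of the polish loop those rules describe; applying, reverting, and testing are placeholders for the cynic's real actions:

// Illustrative only: one change at a time, revert on failure,
// stop after 3 passes or 3 consecutive failures.
type Change = { apply: () => void; revert: () => void };

function polish(proposeChanges: () => Change[], testsPass: () => boolean): void {
  let consecutiveFailures = 0;
  for (let pass = 1; pass <= 3; pass++) {
    for (const change of proposeChanges()) {
      change.apply();
      if (testsPass()) {
        consecutiveFailures = 0;                 // behaviour preserved, keep it
      } else {
        change.revert();                         // behaviour changed, undo immediately
        if (++consecutiveFailures >= 3) return;  // give up after 3 failures in a row
      }
    }
  }
}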
Code simplification complete.
Changes made: ${CHANGES_COUNT}
Changes reverted: ${REVERTED_COUNT}
Lines removed: ${LINES_REMOVED}
Tests still pass: ${TESTS_PASS}
[Show changes] [Accept] [Revert all] [Abort]
If verification keeps failing (3+ attempts), check whether Klaus is available:
# Check for the claudikins-klaus plugin
if mcp__claudikins-klaus available:
  Spawn Klaus for debugging
else:
  [Manual review by human] [Try different approach] [Abort]
Comprehensive verification report:
Verification Report
===================
Session: ${SESSION_ID}
Branch: ${BRANCH}
Execute Session: ${EXECUTE_SESSION_ID}
Phase Results:
Tests: ${TEST_STATUS}
Lint: ${LINT_STATUS}
Types: ${TYPE_STATUS}
Output: ${OUTPUT_STATUS}
Simplification: ${SIMPLIFY_STATUS}
Evidence Summary:
Screenshots: ${SCREENSHOT_COUNT}
API responses: ${CURL_COUNT}
Command outputs: ${CMD_COUNT}
${ISSUES_SUMMARY}
Ready to ship?
[Ready to Ship] [Needs Work] [Accept with Caveats]
Ready to Ship:
  all_checks_passed = true
  human_checkpoint.decision = "ready_to_ship"
  unlock_ship = true
Needs Work:
  human_checkpoint.decision = "needs_work"
  What needs work?
  [Tests failing] [Lint issues] [Type errors] [Output broken] [Code quality] [Other]
  Record the selection in human_checkpoint.issues_noted, then ask: Where should we return to fix this?
  [Phase 1: Re-run automated checks] [Phase 2: Re-verify output] [Phase 3: Run polish pass] [Exit to fix manually]
  If "Exit to fix manually":
    status = "needs_work"
    Display: "Run claudikins-kernel:verify --resume to continue"
  Otherwise: return to the selected phase.
Accept with Caveats:
  human_checkpoint.decision = "ready_to_ship"
  Record the caveats in human_checkpoint.caveats
  unlock_ship = true
On successful completion:
Done! All checks passed.
Tests ${TEST_ICON} Lint ${LINT_ICON} Types ${TYPE_ICON} App works ${OUTPUT_ICON}
Session: ${SESSION_ID}
Verified at: ${TIMESTAMP}
When you're ready:
claudikins-kernel:ship
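For reference, the checkpoint decisions above map onto the state file roughly like this; it is a sketch, and how caveats interact with all_checks_passed is an assumption rather than something the workflow specifies:

// Illustrative only: how the checkpoint decision could be written back to state.
type Decision = "ready_to_ship" | "needs_work";

function applyDecision(state: any, decision: Decision, caveats: string[] = []) {
  state.human_checkpoint.prompted_at = new Date().toISOString();
  state.human_checkpoint.decision = decision;
  state.human_checkpoint.caveats = caveats;
  state.all_checks_passed = decision === "ready_to_ship" && caveats.length === 0; // assumption
  state.unlock_ship = decision === "ready_to_ship";
  if (decision === "needs_work") state.status = "needs_work";
  return state;
}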
On any failure:
Write error details to .claude/errors/
Never lose work. Always checkpoint before risky operations.
On PreCompact event:
  Checkpoint the current phase and status to the state file.
On --resume:
Resuming verification
Last checkpoint: ${CHECKPOINT_ID}
Phase: ${PHASE}
Status: ${STATUS}
[Continue] [Restart] [Abort]
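A sketch of how --resume could pick its starting point from the state file; the phase order mirrors the workflow above:

// Illustrative only: resume at the first phase that has not yet passed.
import { readFileSync } from "node:fs";

const PHASE_ORDER = [
  "test_suite", "lint", "type_check", "output_verification", "code_simplification",
];

function nextPhase(statePath = ".claude/verify-state.json"): string | null {
  const state = JSON.parse(readFileSync(statePath, "utf8"));
  for (const phase of PHASE_ORDER) {
    const status = state.phases[phase]?.status ?? "pending";
    if (status !== "PASS" && status !== "skipped") return phase;
  }
  return null; // every phase passed: go straight to the human checkpoint
}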