Skill

strict-enforcement

Enforces strict verification methodology for code quality, requiring actual evidence of working code before shipping. Used with claudikins-kernel:verify and cross-command gates.

code-quality

developer-tools

npx claudepluginhub espalier-redoubt/claudikins-marketplace --plugin claudikins-kernel

Popularity

Stars

125

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/claudikins-kernel:strict-enforcement

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

ReadGrepGlobBashWebFetchSkillmcp__plugin_claudikins-tool-executor_tool-executor__search_toolsmcp__plugin_claudikins-tool-executor_tool-executor__get_tool_schemamcp__plugin_claudikins-tool-executor_tool-executor__execute_code

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Use this skill when you need to:

Supporting Files

references/advanced-verification.mdreferences/agent-integration.mdreferences/cynic-rollback.mdreferences/lint-fix-validation.mdreferences/red-flags.mdreferences/test-timeout-handling.mdreferences/type-check-confidence.mdreferences/verification-checklist.mdreferences/verification-method-fallback.mdreferences/verify-state-compression.md

SKILL.md

297 lines · ~2.7k tokens

Similar Skills

skill-verify

3.5k

Enforces verification gates: run fresh commands before claiming work is complete, tests pass, or bugs are fixed. Prevents confirmation bias.

1 file

octo

verification-before-completion

Enforces running verification commands before claiming work is complete. Useful for preventing false success claims and ensuring evidence-based completion.

rpikit

verification-before-completion

Enforces evidence-before-claims discipline: requires fresh verification (test, build, lint) before any completion claim. Prevents premature sign-offs.

virtual-team

Stats

LanguageShell

Stars125

Forks7

MaintenanceFair

Last CommitApr 14, 2026

Actions

View Source View Plugin View on GitHub View README

Help us improve

Share bugs, ideas, or general feedback.

Stats

Actions

Help us improve

Share bugs, ideas, or general feedback.

Strict Enforcement Verification Methodology

When to use this skill

Use this skill when you need to:

Run the claudikins-kernel:verify command
Validate implementation before shipping
Decide pass/fail verdicts
Check code integrity after changes
Enforce cross-command gates

Core Philosophy

"Evidence before assertions. Always." - Verification philosophy

Never claim code works without seeing it work. Tests passing is not enough. Claude must SEE the output.

The Three Laws

See it working - Screenshots, curl responses, CLI output. Actual evidence.
Human checkpoint - No auto-shipping. Human reviews evidence and decides.
Exit code 2 gates - Verification failures block claudikins-kernel:ship. No exceptions.

Verification Phases

Phase 1: Automated Quality Checks

Run the automated checks first. Fast feedback.

Check	Command Pattern	What It Catches
Tests	`npm test` / `pytest` / `cargo test`	Logic errors, regressions
Lint	`npm run lint` / `ruff` / `clippy`	Style issues, common bugs
Types	`tsc` / `mypy` / `cargo check`	Type mismatches, interface drift
Build	`npm run build` / `cargo build`	Compilation errors, bundling issues

Flaky Test Detection (C-12):

Test fails?
├── Re-run failed tests
├── Pass 2nd time?
│   └── Yes → STOP: [Accept flakiness] [Fix tests] [Abort]
└── Fail 2nd time?
    ├── Run isolated
    └── Still fail? → STOP: [Fix] [Skip] [Abort]

Phase 2: Output Verification (catastrophiser)

This is the feedback loop that makes Claude's code actually work.

Project Type	Verification Method	Evidence
Web app	Start server, screenshot, test flows	Screenshots, console logs
API	Curl endpoints, check responses	Status codes, response bodies
CLI	Run commands, capture output	stdout, stderr, exit codes
Library	Run examples, check results	Output values, test coverage
Service	Check logs, verify health endpoint	Log patterns, health responses

Fallback Hierarchy (A-3):

If primary method unavailable, fall back:

Start server + screenshot (preferred for web)
Curl endpoints (preferred for API)
Run CLI commands (preferred for CLI)
Run tests only (fallback)
Code review only (last resort)

Timeout: 30 seconds per verification method (CMD-30).

Phase 3: Code Simplification (Optional)

After verification passes, optionally run cynic for polish.

Prerequisites:

Phase 2 (catastrophiser) must PASS
Human approves: "Run cynic for polish pass?"

cynic Rules:

Preserve exact behaviour (tests MUST still pass)
Remove unnecessary abstraction
Improve naming clarity
Delete dead code
Flatten nested conditionals

If tests fail after simplification:

Log failure reasons
Show human
Proceed anyway (A-5) with caveat

See cynic-rollback.md for recovery patterns.

Phase 4: Klaus Escalation

If stuck during verification:

Is mcp__claudikins-klaus available? (E-16)
├── No →
│   Offer: [Manual review] [Ask Claude differently] (E-17)
│   Fallback: [Accept with uncertainty] [Max retries, abort] (E-18)
└── Yes →
    Spawn klaus via SubagentStop hook

Phase 5: Human Checkpoint

The final gate. Present comprehensive evidence.

Verification Report
-------------------
Tests:  ✓ 47/47 passed
Lint:   ✓ 0 issues
Types:  ✓ 0 errors
Build:  ✓ success

Evidence:
- Screenshot: .claude/evidence/login-flow.png
- API test: POST /api/auth → 200 OK
- CLI test: mycli --help → exit 0

[Ready to Ship] [Needs Work] [Accept with Caveats]

Human decides. If approved, set unlock_ship = true.

Rationalizations to Resist

Agents under pressure find excuses. These are all violations:

Excuse	Reality
"Tests pass, that's good enough"	Tests aren't enough. SEE it working. Screenshots, curl, output.
"I'll verify after shipping"	Verify BEFORE ship. That's the whole point.
"The type checker caught everything"	Types don't catch runtime issues. Get evidence.
"Screenshot failed but it probably works"	"Probably" isn't evidence. Fix the screenshot or use fallback.
"Human checkpoint is just a formality"	Human checkpoint is the gate. No auto-shipping.
"Code review is enough for this change"	Code review is last resort fallback. Try harder.
"Tests are flaky, I'll ignore the failure"	Flaky tests hide real failures. Fix or explicitly accept with caveat.
"Exit code 2 is too strict"	Exit code 2 exists to block bad ships. Pass properly.

All of these mean: Get evidence. Human decides. No shortcuts.

Red Flags — STOP and Reassess

If you're thinking any of these, you're about to violate the methodology:

"It should work because..."
"The tests pass so..."
"I'm confident that..."
"It worked before..."
"The types check so..."
"I'll just skip verification this once"
"Human will approve anyway"
"Evidence isn't necessary for this change"

All of these mean: STOP. Get evidence. Present to human. Let them decide.

Exit Code 2 Pattern (CRITICAL)

The verify-gate.sh hook enforces the gate:

# Both conditions MUST be true
ALL_PASSED=$(jq -r '.all_checks_passed' "$STATE")
HUMAN_APPROVED=$(jq -r '.human_checkpoint.decision' "$STATE")

if [ "$ALL_PASSED" != "true" ]; then
  exit 2  # Blocks claudikins-kernel:ship
fi

if [ "$HUMAN_APPROVED" != "ready_to_ship" ]; then
  exit 2  # Blocks claudikins-kernel:ship
fi

File Manifest (C-6):

At verification completion, generate SHA256 hashes of all source files:

find . \( -name '*.ts' -o -name '*.py' -o -name '*.rs' \) \
  | xargs sha256sum > .claude/verify-manifest.txt

This lets claudikins-kernel:ship detect if code was modified after verification.

Cross-Command Gate (C-14)

claudikins-kernel:verify requires claudikins-kernel:execute to have completed:

if [ ! -f "$EXECUTE_STATE" ]; then
  echo "ERROR: claudikins-kernel:execute has not been run"
  exit 2
fi

This enforces the claudikins-kernel:outline → claudikins-kernel:execute → claudikins-kernel:verify → claudikins-kernel:ship flow.

Agent Integration

Agent	Role	When
catastrophiser	See code working	Phase 2: Output verification
cynic	Polish pass	Phase 3: Simplification (optional)

Both agents run with context: fork and background: true.

See agent-integration.md for coordination patterns.

State Tracking

verify-state.json

{
  "session_id": "verify-2026-01-16-1100",
  "execute_session_id": "execute-2026-01-16-1030",
  "branch": "execute/task-1-auth-middleware",
  "phases": {
    "test_suite": { "status": "PASS", "count": 47 },
    "lint": { "status": "PASS", "issues": 0 },
    "type_check": { "status": "PASS", "errors": 0 },
    "output_verification": { "status": "PASS", "agent": "catastrophiser" },
    "code_simplification": { "status": "PASS", "agent": "cynic" }
  },
  "all_checks_passed": true,
  "human_checkpoint": {
    "decision": "ready_to_ship",
    "caveats": []
  },
  "unlock_ship": true,
  "verified_manifest": "sha256:...",
  "verified_commit_sha": "abc123..."
}

Anti-Patterns

Don't do these:

Trusting test results without seeing code run
Skipping output verification because "tests pass"
Auto-approving verification without human checkpoint
Modifying code after verification passes
Ignoring flaky test warnings
Proceeding when lint/type checks fail

Edge Case Handling

Situation	Reference
Tests hang or timeout	test-timeout-handling.md
Auto-fix breaks code	lint-fix-validation.md
Primary verification fails	verification-method-fallback.md
Type-check results unclear	type-check-confidence.md
cynic breaks tests	cynic-rollback.md
Large project state	verify-state-compression.md

References

Full documentation in this skill's references/ folder:

verification-checklist.md - Complete verification checklist
red-flags.md - Common rationalisation patterns
agent-integration.md - How catastrophiser and cynic coordinate
advanced-verification.md - Complex verification scenarios
test-timeout-handling.md - When tests hang (S-13)
lint-fix-validation.md - Validating auto-fix safety (S-14)
verification-method-fallback.md - Fallback strategies (S-15)
type-check-confidence.md - Interpreting type results (S-16)
cynic-rollback.md - Rolling back failed simplifications (S-17)
verify-state-compression.md - State management for large projects (S-18)

strict-enforcement

Popularity

Invocation

Tool Access

Context Preview

Supporting Files

SKILL.md

Similar Skills

Help us improve

Help us improve

Find plugins for your project

strict-enforcement

Popularity

Invocation

Tool Access

Context Preview

Supporting Files

SKILL.md

Strict Enforcement Verification Methodology

When to use this skill

Core Philosophy

The Three Laws

Verification Phases

Phase 1: Automated Quality Checks

Phase 2: Output Verification (catastrophiser)

Phase 3: Code Simplification (Optional)

Phase 4: Klaus Escalation

Phase 5: Human Checkpoint

Rationalizations to Resist

Red Flags — STOP and Reassess

Exit Code 2 Pattern (CRITICAL)

Cross-Command Gate (C-14)

Agent Integration

State Tracking

verify-state.json

Anti-Patterns

Edge Case Handling

References

Similar Skills

Help us improve

Strict Enforcement Verification Methodology

When to use this skill

Core Philosophy

The Three Laws

Verification Phases

Phase 1: Automated Quality Checks

Phase 2: Output Verification (catastrophiser)

Phase 3: Code Simplification (Optional)

Phase 4: Klaus Escalation

Phase 5: Human Checkpoint

Rationalizations to Resist

Red Flags — STOP and Reassess

Exit Code 2 Pattern (CRITICAL)

Cross-Command Gate (C-14)

Agent Integration

State Tracking

verify-state.json

Anti-Patterns

Edge Case Handling

References