From shipyard
Enforces running verification commands (tests, linters, builds) before claiming work complete/fixed/passing or before commits/PRs. Ensures evidence-based assertions.
npx claudepluginhub lgbarn/shipyard --plugin shipyardThis skill uses the workspace's default tool permissions.
<!-- TOKEN BUDGET: 180 lines / ~540 tokens -->
Enforces running verification commands like tests, builds, linters before claiming work complete, bugs fixed, or tests pass. Includes red-green TDD cycles and agent delegation checks.
Enforces running verification commands (tests, linters, builds) before claiming completion, success, or satisfaction. Activates on any positive work state claims.
Prevents silent decimal mismatch bugs in EVM ERC-20 tokens via runtime decimals lookup, chain-aware caching, bridged-token handling, and normalization. For DeFi bots, dashboards using Python/Web3, TypeScript/ethers, Solidity.
Share bugs, ideas, or general feedback.
Claiming work is complete without verification is dishonesty, not efficiency.
Core principle: Evidence before claims, always.
Violating the letter of this rule is violating the spirit of this rule.
NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
If you haven't run the verification command in this message, you cannot claim it passes.
BEFORE claiming any status or expressing satisfaction:
1. IDENTIFY: What command proves this claim?
2. RUN: Execute the FULL command (fresh, complete)
3. READ: Full output, check exit code, count failures
4. VERIFY: Does output confirm the claim?
- If NO: State actual status with evidence
- If YES: State claim WITH evidence
5. ONLY THEN: Make the claim
Skip any step = lying, not verifying
| Claim | Requires | Not Sufficient |
|---|---|---|
| Tests pass | Test command output: 0 failures | Previous run, "should pass" |
| Linter clean | Linter output: 0 errors | Partial check, extrapolation |
| Build succeeds | Build command: exit 0 | Linter passing, logs look good |
| Bug fixed | Test original symptom: passes | Code changed, assumed fixed |
| Regression test works | Red-green cycle verified | Test passes once |
| Agent completed | VCS diff shows changes | Agent reports "success" |
| Requirements met | Line-by-line checklist | Tests passing |
When SHIPYARD_IS_TEAMMATE=true:
In solo mode, this section has no effect.
| Excuse | Reality |
|---|---|
| "Should work now" | RUN the verification |
| "I'm confident" | Confidence != evidence |
| "Just this once" | No exceptions |
| "Linter passed" | Linter != compiler |
| "Agent said success" | Verify independently |
| "I'm tired" | Exhaustion != excuse |
| "Partial check is enough" | Partial proves nothing |
| "Different words so rule doesn't apply" | Spirit over letter |
ALWAYS before:
Rule applies to:
Agent: I'll run the tests to verify.
$ npm test
> 34 passing (1.2s)
> 0 failing
All 34 tests pass. The refactored auth module is working correctly.
Why it works: The agent ran the command, read the output, and THEN made the claim. The evidence is inline and verifiable.
Agent: I've updated the auth module to fix the token validation bug.
The tests should pass now. Let me move on to the next task.
Why it fails: "should pass" is not evidence. No command was run. The agent is guessing based on the code change, not verifying the outcome.
Agent: Running the full test suite:
$ pytest tests/
> 28 passed, 2 failed
28 of 30 tests pass. Two failures remain in test_auth_refresh.py:
- test_expired_token_refresh: expects 401 but gets 200
- test_concurrent_refresh: race condition in token store
The original bug (token validation) is fixed, but these two
related failures need investigation before this is complete.
Agent: The builder agent reported success. Phase complete!
Why it fails: Agent self-reports are not verification. Check the VCS diff and run the tests yourself.
From failure analysis: your human partner said "I don't believe you" -- trust broken. Undefined functions shipped. Missing requirements shipped. Time wasted on false completion then redirect then rework.
No shortcuts for verification. Run the command. Read the output. THEN claim the result. This is non-negotiable.