From 97
Provides rigid checklists for correctness traps in error handling, floating-point math, concurrency, remote calls, singletons/globals, hot-path data structures, and high-volume logging. Invoke when writing or reviewing such code.
npx claudepluginhub oribarilan/97 --plugin 97
Slash command: /97:correctness-traps
Common bugs grouped by domain: floats that won't compare equal, retries that hammer a downed service, singletons that wreck testability, and others. **When you write code in one of these domains, stop and run the matching checks before you commit.**
This is a rigid skill. Jump to the sub-section that matches what you're writing and run that sub-section's checks.
These checks matter most when code will reach real users in production. In MVPs, prototypes, internal dev tools, and one-off scripts where the architecture is still in flux, prefer the simplest thing that works.
Invoke when you're about to:
If the change touches one of these domains even slightly, invoke anyway — the per-domain check is short and the bugs are not.
- Empty catch. try { ... } catch (...) {} silently swallows everything. The same goes for ignoring return codes (printf's return value, write()'s short-write count) and pretending errno doesn't exist. Example: a service-call wrapper swallows every exception and returns null, so every downstream caller has to invent its own theory of what null means. Expose erroneous conditions in your interfaces; if handling errors feels onerous, the interface is wrong. (Goodliffe, 97/26.)
- Comparing floats with ==. 0.1 + 0.2 != 0.3 in IEEE 754 is the canonical demonstration. Compare with a tolerance appropriate to the magnitude of the values involved (≈ ε|x|, where ε is machine epsilon: ~1e-7 for float, ~1e-16 for double).
- Solving x² - 100000x + 1 = 0 directly via the quadratic formula gives a wildly wrong small root because -b - sqrt(b² - 4ac) subtracts two nearly equal values (here 100000 and ≈99999.99998); compute the large root and derive the other from r1 * r2 = c/a. The same shape of error appears in any series with alternating signs of similar magnitude.
- while (!call()) call(); against a downed service hammers it the moment it comes back. Exponential backoff, jitter, and a max-retries ceiling are the minimum; idempotency on the server side is what makes retry safe at all.
- for (i = 0; i < strlen(s); ++i): strlen runs every iteration, scanning the whole string each time, turning O(n) work into O(n²). Hoist the length out. The same shape applies to repeated DB lookups, repeated config parses, and repeated regex compilations inside hot loops. (van Winkel, 97/89.)
- Logger.getInstance() called from every layer means tests can't intercept output, can't run in parallel, and inherit log state from previous tests.

(RI/*) When the call will run under load against a downstream that can fail, the per-call hardening belongs in the first write. These checks matter most in production code.
- … None, "infinity", "many minutes"). Pick a per-call budget based on the downstream's realistic latency plus margin, and cap retries inside that budget. (RI/Timeout.)
- … (RI/CircuitBreaker.)
- … (RI/Bulkhead.)
- … (RI/Backpressure.)
- … (RI/FailFast.)

These thoughts mean STOP. Apply the domain check before committing:
| Thought | Reality |
|---|---|
| "I'll throw the same exception type for both — caller handles either way." | Technical and business exceptions are different contracts. Mixing them means callers can't tell what to guard against beforehand vs. handle after. (97/21) |
| "Empty catch is fine, the error can't happen here." | "Can't happen" is how silent corruption ships. Log, rethrow, or surface the error — never swallow. (97/26) |
| "Nobody on the team knows how this build step works, but it works." | Magic that no one owns is a fault waiting for the day the magic stops. Find the person who knows or document it now. (97/29) |
| "if (a == b) for floats is fine, the values are computed the same way." | 0.1 + 0.2 != 0.3. Use a tolerance scaled to magnitude, or use a decimal type. (97/33) |
| "I'll use double for the price column — it's faster than Decimal." | Floats accumulate roundoff; money does not forgive roundoff. Use fixed-point or decimal for currency. (97/33) |
| "I'll wrap a lock around the shared map — that fixes the race." | Locks around shared mutable state are where deadlocks and lost updates hide. Prefer message passing; lock only when measured and understood. (97/57) |
| "Failed call → just retry in a loop until it works." | A retry loop without backoff and a cap will hammer the service the moment it recovers. Backoff + jitter + ceiling, and require idempotency on the server. (97/41) |
| "I'll lazy-load each related row — it's cleaner." | One page = thousands of sequential round-trips = visibly broken latency. Count IPCs per stimulus; batch, parallelize, or cache. (97/41) |
| "It's just for (i = 0; i < strlen(s); ++i) — looks normal." | strlen runs every iteration; an O(n) loop becomes O(n²). Hoist invariants out of hot loops. (97/89) |
| "Linked list is fine, n won't get that big." | "Won't get that big" is how production timeouts are born. Pick the structure by access pattern and confirm with measurement. (97/89, 97/46) |
| "Singleton — there'll only ever be one." | Single-instance is an assumption that ages badly, and the global access point destroys testability. Hide behind an interface, inject the dependency. (97/73) |
| "I'll let the HTTP client default the timeout — it's fine." | Library defaults are often None (wait forever) or hours. Held connections, threads, and queue slots add up under load. Set an explicit per-call timeout. (RI/Timeout) |
| "The downstream's flaky — I'll just retry." | Retry without a circuit breaker piles load on a service that's already failing. Wrap critical downstreams in a breaker; fail fast locally when open. (RI/CircuitBreaker) |
| "One pool for all downstreams keeps the code simpler." | One slow third party fills the pool and the whole service stops. Bulkhead per downstream; isolate failure domains. (RI/Bulkhead) |
| "I'll buffer events in an in-memory queue — it'll catch up." | Unbounded queues exhaust memory under sustained load. Pick a cap and a reject policy; let backpressure inform callers. (RI/Backpressure) |
| "I'll validate after the DB lookup — saves a branch." | Late failure holds DB connections, locks, and quota for a request that can't succeed. Validate at the entry; fail fast. (RI/FailFast) |
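The retry row above (backoff + jitter + ceiling) can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `retry_with_backoff`, its parameter names, the default delays, and the choice of "full jitter" are all assumptions made for the example, and the `sleep` and `rng` parameters exist only so the behavior can be tested without real waiting.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=0.1, max_delay=5.0,
                       retryable=(ConnectionError, TimeoutError),
                       sleep=time.sleep, rng=random.random):
    """Call `call()` until it succeeds or `max_attempts` is used up.

    Hypothetical helper: the delay doubles on each failed attempt, is capped
    at `max_delay`, and is multiplied by a random fraction ("full jitter") so
    that many clients do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                # Ceiling reached: surface the error instead of swallowing it.
                raise
            capped = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * capped)
```

Only exceptions listed in `retryable` trigger another attempt; anything else propagates immediately. The sketch assumes the wrapped operation is idempotent on the server side, which is what makes retrying safe in the first place.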
You are done when all of the following are true for every domain below that your change touches:
- [ ] No == between floats; tolerances scaled to magnitude; money uses a decimal type; subtractions of near-equal magnitudes have been audited for cancellation.

If any box that applies to your change is unchecked, you are not done. Either finish, or revert and re-plan.
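As a rough illustration of the floating-point items in the checklist, the sketch below shows a magnitude-scaled comparison, a decimal type for money, and a cancellation-free way to get both roots of a quadratic. The helper names `nearly_equal` and `stable_quadratic_roots` are hypothetical, and the 1e-9 relative tolerance is an assumed default rather than a recommendation from the source; pick a tolerance that fits your data.

```python
import math
from decimal import Decimal


def nearly_equal(a, b, rel_tol=1e-9, abs_tol=0.0):
    """Compare floats with a tolerance scaled to the operands' magnitude."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)


def stable_quadratic_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 without subtracting near-equal values.

    Assumes a != 0, a non-negative discriminant, and that b and c are not
    both zero. The first root adds magnitudes of the same sign; the second
    follows from r1 * r2 = c / a.
    """
    root_disc = math.sqrt(b * b - 4 * a * c)
    q = -(b + math.copysign(root_disc, b)) / 2
    return q / a, c / q


# Binary floats: exact equality fails, a scaled tolerance does not.
print(0.1 + 0.2 == 0.3)               # False
print(nearly_equal(0.1 + 0.2, 0.3))   # True

# Money: a decimal type keeps cents exact.
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True

# The quadratic from the text: x^2 - 100000x + 1 = 0.
large, small = stable_quadratic_roots(1.0, -100000.0, 1.0)
print(large, small)
```

The small root is obtained by division rather than by subtracting 99999.99998 from 100000, so it keeps nearly full precision.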
| # | Principle | Author |
|---|---|---|
| 97/21 | Distinguish Business Exceptions from Technical | Dan Bergh Johnsson |
| 97/26 | Don't Ignore That Error! | Pete Goodliffe |
| 97/29 | Don't Rely on "Magic Happens Here" | Alan Griffiths |
| 97/33 | Floating-Point Numbers Aren't Real | Chuck Allison |
| 97/41 | Inter-Process Communication Affects Application Response Time | Randy Stafford |
| 97/46 | Know Your Limits | Greg Colvin |
| 97/57 | Message Passing Leads to Better Scalability in Parallel Systems | Russel Winder |
| 97/73 | Resist the Temptation of the Singleton | Sam Saariste |
| 97/89 | Use the Right Algorithm and Data Structure | Jan Christiaan "JC" van Winkel |
| RI/Timeout | Always Set a Timeout | Michael Nygard |
| RI/CircuitBreaker | Circuit Breaker | Michael Nygard |
| RI/Bulkhead | Bulkhead | Michael Nygard |
| RI/Backpressure | Backpressure / Bounded Queues | Michael Nygard |
| RI/FailFast | Fail Fast | Michael Nygard |
See principles.md for the long-form distillations, citations, and source links.