From code-foundations
Audits defensive programming code for empty catch blocks, missing input validation, assertions with side effects, wrong exception levels, and robustness strategies. Includes crisis triage checklist.
npx claudepluginhub ryanthedev/code-foundations

This skill uses the workspace's default tool permissions.
| Check | Why Critical |
|---|---|
| No executable code in assertions | Code disappears in production builds |
| No empty catch blocks | Silently swallows bugs that compound |
| External input validated | Security vulnerabilities, data corruption |
Production down? Use this prioritized subset:
EOFException from GetEmployee())

Why triage works: These 5 items catch 80% of defensive programming bugs. The full checklist (21 items) is for non-emergency review.
Empty catch blocks cause compounding failures. A suppressed error in one layer cascades into harder-to-diagnose failures in production.
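The contrast can be sketched as follows. This is a minimal illustration, not code from the skill: `parseConfigSilently`, `parseConfig`, and `ConfigError` are hypothetical names introduced only for the example.

```javascript
// Hypothetical domain error, used for illustration only.
class ConfigError extends Error {}

// BAD: the empty catch hides the parse failure. Callers receive undefined
// and fail later, far away from the real cause.
function parseConfigSilently(text) {
  try {
    return JSON.parse(text);
  } catch (e) {
    // swallowed
  }
}

// BETTER: add context and keep the original error attached as `cause`.
function parseConfig(text) {
  try {
    return JSON.parse(text);
  } catch (e) {
    throw new ConfigError('Config is not valid JSON', { cause: e });
  }
}
```

With the second version, the failure surfaces at the layer where it happened, and the original parse error stays available for diagnosis.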
External input is any data not provably controlled by the current code path:
"Internal team API" is still external. If it crosses a network boundary or process boundary, validate it.
Assertion: code used during development that lets a program check itself as it runs. When an assertion is true, the code is operating as expected. When it is false, it has detected an unexpected error (a bug). Use assertions for conditions that should never occur.
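One consequence is that an assertion must not do real work. The sketch below assumes a hypothetical `devAssert` helper whose calls are removed by a build step in production; that stripping step is an assumption for illustration, not something JavaScript does on its own.

```javascript
// Hypothetical development-only assertion. Assume a build step removes
// every call to it from production bundles.
function devAssert(condition, message) {
  if (!condition) throw new Error(`Assertion failed: ${message}`);
}

const queue = ['job-1', 'job-2'];

// BAD (left commented out): shift() is the real work, and it would vanish
// together with the assertion if the call were stripped.
// devAssert(queue.shift() === 'job-1', 'first job should be job-1');

// GOOD: do the work unconditionally, then assert on the result.
const job = queue.shift();
devAssert(job === 'job-1', 'first job should be job-1');
```

The program behaves the same whether or not the `devAssert` line survives the build, which is the property the checklist item asks for.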
Barricade: a damage-containment strategy in which certain interfaces are designated as boundaries to "safe" areas. Data crossing these boundaries is checked for validity.
Limitation: Barricades reduce redundant validation but do NOT replace defense-in-depth for security-critical operations. If barricade validation has a bug, what happens?
SECURITY EXCEPTION: Security-critical code (authentication, authorization, cryptographic operations, PII handling) is NEVER exempt from defensive programming regardless of other factors. When in doubt, validate.
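As a rough sketch of validating data that crosses a network or process boundary, even when the sender is an internal service: `validateUserPayload` and the `id` and `email` fields are assumptions made up for this example.

```javascript
// Hypothetical validator for a payload received from an "internal" service.
// The field names (id, email) are assumptions for illustration.
function validateUserPayload(payload) {
  if (payload === null || typeof payload !== 'object') {
    throw new TypeError('User payload must be an object');
  }
  if (!Number.isInteger(payload.id) || payload.id <= 0) {
    throw new RangeError('User payload has an invalid id');
  }
  if (typeof payload.email !== 'string' || !payload.email.includes('@')) {
    throw new TypeError('User payload has an invalid email');
  }
  // Return only the checked fields so unvalidated extras do not leak inward.
  return { id: payload.id, email: payload.email };
}
```

Returning a fresh object with only the validated fields is one possible design choice; it keeps unexpected properties from travelling deeper into the system.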
These checks are almost always required. Exceptions need explicit justification and documentation:
| Check | Time | Why Critical |
|---|---|---|
| No executable code in assertions | 15 sec | Code disappears in production builds |
| No empty catch blocks | 15 sec | Silently swallows bugs that compound |
| External input validated | 30 sec | Security vulnerabilities, data corruption |
| Assertions for bugs only | 15 sec | Assertions disabled in production; anticipated errors need handling |
Why these four? Violations create silent failures that are nearly impossible to debug later. They don't crash loudly - they corrupt data and hide bugs.
Error handling deferred to a future session is rarely completed. Edge cases are forgotten once the code moves out of active context. If deferral is necessary, create a tracked ticket with specific scope.
BEFORE implementing any error handling, search the codebase:
| Search For | Why |
|---|---|
| Same error type elsewhere | How is it handled? Log? Throw? Return code? |
| Same module's error handling | What's the established pattern here? |
| Barricade/validation patterns | Where are the trust boundaries? |
| Exception hierarchy | What custom exceptions exist? |
Questions to answer:
If pattern found: Follow it. Consistency in error handling is critical for debugging.
If no pattern found: You're establishing one. Document your decision. Consider if this should become the pattern.
See: pattern-reuse-gate.md for full gate protocol.
Purpose: Execute checklists for defensive programming, assertions, exceptions, and error handling
Triggers:
Purpose: Apply defensive programming techniques when implementing error handling and validation
Triggers:
digraph assertion_vs_error {
  rankdir=TB;
  START [label="Handling a potentially\nbad condition" shape=doublecircle];
  never [label="Should this NEVER happen?\n(programmer bug)" shape=diamond];
  assertion [label="Use ASSERTION" shape=box style=filled fillcolor=lightblue];
  anticipated [label="Is it anticipated\nbad input?" shape=diamond];
  errorhandling [label="Use ERROR HANDLING" shape=box style=filled fillcolor=lightgreen];
  robust [label="Building highly\nrobust system?" shape=diamond];
  both [label="Use BOTH\n(assert + handle)" shape=box style=filled fillcolor=lightyellow];
  reconsider [label="Clarify requirements:\nIs it a bug or expected?\nAsk domain expert." shape=box style=filled fillcolor=lightcoral];
  START -> never;
  never -> robust [label="yes"];
  never -> anticipated [label="no"];
  robust -> both [label="yes"];
  robust -> assertion [label="no"];
  anticipated -> errorhandling [label="yes"];
  anticipated -> reconsider [label="no/unsure"];
}
digraph correctness_vs_robustness {
  rankdir=TB;
  START [label="Choose error\nhandling philosophy" shape=doublecircle];
  safety [label="Safety-critical?\n(medical, aviation, nuclear)" shape=diamond];
  correctness [label="Favor CORRECTNESS\nShut down > wrong result" shape=box style=filled fillcolor=lightcoral];
  consumer [label="Consumer app?\n(games, word processors)" shape=diamond];
  robustness [label="Favor ROBUSTNESS\nKeep running > perfect" shape=box style=filled fillcolor=lightgreen];
  analyze [label="ANALYZE DOMAIN\n(see guidance below)" shape=box style=filled fillcolor=lightyellow];
  START -> safety;
  safety -> correctness [label="yes"];
  safety -> consumer [label="no"];
  consumer -> robustness [label="yes"];
  consumer -> analyze [label="no"];
}
Domain Analysis Guidance (for "Analyze domain" path):
| Domain Type | Lean Toward | Key Question |
|---|---|---|
| Enterprise/B2B | Correctness | "Would wrong data cause business decisions based on false info?" |
| SaaS platforms | Balanced | "What's the blast radius of a wrong answer vs unavailability?" |
| Internal tools | Robustness | "Is user technical enough to recover from a crash?" |
| Data pipelines | Correctness | "Does downstream processing assume data integrity?" |
| Real-time systems | Context-dependent | "Is stale data better or worse than no data?" |
digraph debug_code_production {
  rankdir=TB;
  START [label="Should this debug\ncode stay in production?" shape=doublecircle];
  important [label="Checks important errors?\n(calculations, data integrity)" shape=diamond];
  keep1 [label="KEEP" shape=box style=filled fillcolor=lightgreen];
  crash [label="Causes hard crash?\n(no save opportunity)" shape=diamond];
  remove1 [label="REMOVE" shape=box style=filled fillcolor=lightcoral];
  diagnose [label="Helps remote diagnosis?\n(logging, state dumps)" shape=diamond];
  keep2 [label="KEEP as silent log" shape=box style=filled fillcolor=lightgreen];
  remove2 [label="REMOVE or make\nunobtrusive" shape=box style=filled fillcolor=lightyellow];
  START -> important;
  important -> keep1 [label="yes"];
  important -> crash [label="no"];
  crash -> remove1 [label="yes"];
  crash -> diagnose [label="no"];
  diagnose -> keep2 [label="yes"];
  diagnose -> remove2 [label="no"];
}
| Condition Type | Use Assertion | Use Error Handling | Guidance |
|---|---|---|---|
| Should never occur (bug) | Yes | No | Assert documents the impossibility |
| Can occur at runtime | No | Yes | Handle gracefully |
| External input | No | Yes | Always validate external data |
| Internal interface (same module) | Yes | No | Assert for contract violations |
| Internal interface (cross-module) | Yes | Yes, if crossing trust boundary | Validate at module boundaries |
| Precondition violation | Yes | Yes, if public API | Public APIs need graceful errors |
| Security-critical | Both | Both | Defense in depth |
| Highly robust systems | Both | Both | Belt and suspenders |
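The "Both" rows can be sketched as an assertion followed by a runtime check on the same condition. `DEV_MODE`, `assertDev`, and `lineTotal` are hypothetical names for this illustration, and the flag is hard-coded to `false` here to mimic a production build.

```javascript
// Hypothetical switch: true in development builds, false in production.
const DEV_MODE = false;

function assertDev(condition, message) {
  if (DEV_MODE && !condition) throw new Error(`Assertion failed: ${message}`);
}

// Internal helper: callers should never pass a negative quantity (a bug if they do).
function lineTotal(unitPriceCents, quantity) {
  // Assertion: documents the contract and fails loudly in development.
  assertDev(quantity >= 0, 'quantity must be non-negative');
  // Error handling: still refuses to compute a wrong total when assertions are off.
  if (!Number.isInteger(quantity) || quantity < 0) {
    throw new RangeError(`Invalid quantity: ${quantity}`);
  }
  return unitPriceCents * quantity;
}
```

The assertion tells developers the condition is a bug; the runtime check ensures production never silently computes with it.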
MUST be performed in order:
Class-level barricade: Public methods validate and sanitize; private methods within that class can assume data is safe.
Critical caveat: "Trust inside barricade" means reduced redundant validation, NOT zero validation. For security-critical paths (auth, crypto, PII), validate again even inside the barricade. Bugs in barricade validation happen.
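A class-level barricade might look like the sketch below. The `Inventory` class and its methods are hypothetical; the point is only that the public method validates and sanitizes while the private method assumes clean data.

```javascript
// Hypothetical class illustrating a class-level barricade.
class Inventory {
  #stock = new Map();

  // Public method: the barricade. Validates and sanitizes before storing.
  addItem(sku, quantity) {
    if (typeof sku !== 'string' || sku.trim() === '') {
      throw new TypeError('sku must be a non-empty string');
    }
    if (!Number.isInteger(quantity) || quantity <= 0) {
      throw new RangeError('quantity must be a positive integer');
    }
    this.#increase(sku.trim().toUpperCase(), quantity);
  }

  getQuantity(sku) {
    return this.#stock.get(String(sku).trim().toUpperCase()) ?? 0;
  }

  // Private method: inside the barricade, so it assumes clean arguments.
  #increase(cleanSku, quantity) {
    this.#stock.set(cleanSku, (this.#stock.get(cleanSku) ?? 0) + quantity);
  }
}
```

For a security-critical path, the private method would repeat the relevant checks rather than rely on the public one alone.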
Traditional exception propagation assumes synchronous call stacks. Modern patterns need different approaches:
// BAD: Unhandled rejection crashes Node.js
async function fetchUser(id) {
  const response = await fetch(`/api/users/${id}`); // Can reject
  return response.json();
}

// GOOD: Explicit error handling
// (UserNotFoundError and UserServiceError are assumed to be defined elsewhere)
async function fetchUser(id) {
  try {
    const response = await fetch(`/api/users/${id}`);
    if (response.status === 404) {
      throw new UserNotFoundError(id); // Domain-level exception
    }
    if (!response.ok) {
      throw new Error(`Unexpected HTTP status ${response.status}`);
    }
    // await inside the try block so a JSON parsing failure is caught below
    return await response.json();
  } catch (e) {
    if (e instanceof UserNotFoundError) throw e;
    throw new UserServiceError('Failed to fetch user', { cause: e });
  }
}
callback(error, result)

// BAD: First rejection loses other results
const results = await Promise.all(promises);

// GOOD: Collect all results including failures
const results = await Promise.allSettled(promises);
const failures = results.filter(r => r.status === 'rejected');
if (failures.length > 0) {
  logErrors(failures);
  // Decide: fail entirely or continue with partial results?
}
Make errors painful during development so they're found and fixed:
| Technique | Purpose |
|---|---|
| Make asserts abort | Don't let programmers bypass known problems |
| Fill allocated memory | Detect memory allocation errors immediately |
| Fill files/streams completely | Flush out file-format errors early |
| Default/else clauses fail hard | Impossible to overlook unexpected cases |
| Fill objects with junk before deletion | Detect use-after-free immediately |
| Email error logs to yourself | Get notified of errors in the field |
Paradox: During development, make errors noticeable and obnoxious. During production, make errors unobtrusive with graceful recovery.
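The "default/else clauses fail hard" technique from the table can be sketched like this. The order statuses and the `nextAction` function are invented for the example.

```javascript
// Hypothetical order states, for illustration only.
function nextAction(status) {
  switch (status) {
    case 'pending':
      return 'charge';
    case 'paid':
      return 'ship';
    case 'shipped':
      return 'notify';
    default:
      // Fail hard: a new or misspelled status surfaces immediately
      // instead of silently falling through.
      throw new Error(`Unhandled order status: ${String(status)}`);
  }
}
```

If someone later adds a `'refunded'` status and forgets this function, the first call with that value fails loudly during development.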
| Debug Code Type | Action | Rationale |
|---|---|---|
| Checks important errors (calculations, data) | KEEP | Tax calculation errors matter; messy screens don't |
| Checks trivial errors (screen updates) | REMOVE or log silently | Penalty is cosmetic only |
| Causes hard crashes | REMOVE | Users need chance to save work |
| Enables graceful crash with diagnostics | KEEP | Mars Pathfinder diagnosed issues remotely |
| Logging for tech support | KEEP | Convert assertions from halt to log |
| Exposes info to attackers | REMOVE | Error messages shouldn't help attackers |
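Converting assertions from halt to log, as the table suggests, could be approached along these lines. `createAssert` and its `mode` and `log` parameters are hypothetical, and the message format is an assumption.

```javascript
// Hypothetical factory: returns an assertion function whose failure behavior
// depends on the build mode. `log` is injected so output can be redirected.
function createAssert(mode, log = console.error) {
  return function assertThat(condition, message) {
    if (condition) return true;
    if (mode === 'development') {
      // Development: halt so the bug cannot be ignored.
      throw new Error(`Assertion failed: ${message}`);
    }
    // Production: record the failure for support staff and keep running.
    log(`[assertion] ${message}`);
    return false;
  };
}
```

Log messages written this way should avoid internal details that would help an attacker, in line with the last row of the table.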
| Claim | Source | Application |
|---|---|---|
| "Garbage in, garbage out" is obsolete | McConnell p.188 | Production software must validate or reject |
| Assertions especially useful in large/complex programs | McConnell p.189 | More code = more interface mismatches to catch |
| Error handling is architectural decision | McConnell p.197 | Decide at architecture level, enforce consistently |
| Trade speed for debugging aids | McConnell p.205 | Development builds can be slow if they catch bugs |
| Exceptions weaken encapsulation | McConnell p.198 | Callers must know what exceptions called code throws |
| Dead program does less damage than crippled one | Hunt & Thomas | Fail fast, fail loud during development |
| Mars Pathfinder used debug code in production | McConnell p.209 | JPL diagnosed and fixed remotely using left-in debug aids |
| Bugs can cost up to 100x more to fix in production | Widely attributed to IBM Systems Sciences Institute (original data not publicly verifiable) | Supports investment in early defensive programming |
| 15-50% of development time spent on debugging | McConnell, citing multiple studies | Defensive programming reduces this significantly |
| Mars Climate Orbiter lost due to unit mismatch | NASA 1999 | 9 months of success doesn't mean code is safe |
Strategy selection is an architectural decision - be consistent throughout.
| Application Type | Favor | Avoid |
|---|---|---|
| Safety-critical (medical, aviation) | Shut down | Return guessed value |
| Consumer apps (games, word processors) | Keep running | Crash without save |
| Financial/audit | Fail with clear error | Silent substitution |
| Data pipelines | Fail and retry OR quarantine | Silent data loss |
| Real-time systems | Degrade gracefully | Hard crash |
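One way to keep the strategy consistent is to choose it once and route every bad value through the same handler. The sketch below is only an illustration: `createReadingHandler`, the policy names, and the "last known good value" fallback are assumptions, not part of the source.

```javascript
// Hypothetical policy switch chosen once at the architecture level.
function createReadingHandler(policy) {
  let lastGood = null;
  return function acceptReading(value) {
    if (Number.isFinite(value)) {
      lastGood = value;
      return value;
    }
    if (policy === 'correctness') {
      // Correctness: refuse to produce a result rather than guess.
      throw new Error(`Invalid reading: ${String(value)}`);
    }
    // Robustness: keep running on the last known good value (null if none yet).
    return lastGood;
  };
}
```

A safety-critical system would be built with the `'correctness'` policy, while a consumer app might accept the fallback so the user can keep working.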
| After | Next |
|---|---|
| Validation complete | cc-control-flow-quality (CHECKER) |