From claudekit
Guides evidence-driven debugging: state hypothesis, add minimal instrumentation (logs, breakpoints, probes), record observations to confirm or refute theories in async, distributed, or production systems.
npx claudepluginhub duthaho/claudekit --plugin claudekitThis skill uses the workspace's default tool permissions.
The active-debugging companion to `investigate-root-cause`. Where investigate
Structured, hypothesis-driven debugging methodology with ranked hypotheses, git bisect strategy, instrumentation planning, and minimal reproduction design. Use for non-obvious bugs like intermittent failures, performance regressions, or issues without clear stacktraces.
Enforces four-phase root cause analysis for bugs, errors, test failures, unexpected behavior, and production incidents before proposing fixes.
Enforces systematic root cause analysis for bugs, test failures, unexpected behavior, and regressions via five-phase workflow: Understand, Reproduce, Isolate, Fix, Verify.
Share bugs, ideas, or general feedback.
The active-debugging companion to investigate-root-cause. Where investigate
produces a written hypothesis, evidence-driven-debugging is the workflow for
testing that hypothesis with real instrumentation: logs, breakpoints, prints,
debugger sessions, runtime probes. The skill exists because the most common
debugging-phase failure is the engineer who runs through three or four mental
hypotheses without writing anything down, ends up where they started, and can't
reconstruct what they tried. Evidence-driven debugging keeps a paper trail.
Used inside Phase 3 of investigate-root-cause, but invocable directly when an
existing hypothesis just needs runtime testing.
investigate-root-cause and need to test itinvestigate-root-cause Phase 2 firstGoal: Be explicit about what runtime evidence will confirm or refute.
Inputs: A hypothesis (from investigate-root-cause Phase 2 or your own
prior thinking).
Actions:
The bug occurs because [X] causes [Y] when [Z].Output: A test design: If I see <evidence>, hypothesis is confirmed; if I see <other>, hypothesis is refuted; ambiguous = collect more.
Goal: Add the minimum runtime probes to capture the evidence.
Inputs: The test design.
Actions:
[bug-1234]).debug: prefix so
it's easy to revert.Output: Instrumentation in code (or in a debugger config), tagged.
Goal: Run the bug with the instrumentation in place and capture output.
Inputs: Instrumented code + the reproducer from investigate-root-cause
Phase 1.
Actions:
Output: Captured probe output, saved.
Goal: Decide confirm/refute/ambiguous.
Inputs: Captured output + Step 1's test design.
Actions:
investigate-root-cause Phase 2 with the new evidence.Output: A one-line verdict: Hypothesis confirmed | Refuted (return to hypothesis) | Ambiguous (add probes at <location>).
Goal: Remove debug probes when the work is done.
Inputs: A confirmed hypothesis (or a refuted one that led you elsewhere).
Actions:
debug: commits, OR[bug-1234] tag).
These become permanent observability.print / console.log / dbg! lines remain in the
committed code.Output: Clean working tree. Either the debug commits are reverted or formalized.
| Excuse | Why it sounds reasonable | Why it's wrong | What to do instead |
|---|---|---|---|
| "I'll just add some console.logs and figure it out as I go." | Fast, low-overhead, the standard move. | The "figure it out as you go" version usually doesn't write down the hypothesis you're testing or what would refute it. You add prints, see some output, decide it "looks suspicious," add more prints, follow the suspicion, and lose track of which hypothesis you started with. By the time you find the bug (or get stuck), you can't reconstruct what you tried. | Spend 60 seconds on Step 1's test design before adding probes. Even one sentence is enough. The structure forces you to know what you're looking for, which is what makes the probes interpretable when they fire. |
| "The probes don't need tags — there aren't that many logs." | A handful of probes in a quiet system don't need filtering. | "Not that many logs" is true on your dev box. In a production-grade reproducer, the probe lines are intermixed with framework logs, request logs, third-party noise. Untagged probes are findable only by remembering which file you put them in, which is the same memory problem the skill exists to fix. | Tag every probe with the same identifier ([bug-1234], [debug-jane], whatever's unique). Filtering on the tag isolates your evidence in seconds. |
| "I don't need to save the output — I just looked at it." | Looked-at-and-understood is real. | Looked-at-and-understood doesn't survive a context switch. If you finish a debugging session at 6 PM and resume at 9 AM, the captured output is gone, you're reconstructing from the bug-fix patch you didn't write down, and you re-instrument in a slightly different way and lose comparability. | Save the output. Even if it's tail -f log.out > /tmp/bug-1234-run-1.log. The save is the deliverable for Step 3. |
| "Refuted hypothesis means I should fix it anyway — I have a workaround in mind." | Sometimes the workaround is faster than continuing to investigate. | The workaround that ships against a refuted hypothesis is the workaround that doesn't fix the actual bug. The bug recurs in a different shape because the fix addressed a hypothesis the evidence already disagreed with. The workaround is at best a shim and at worst a compounding error. | If refuted, return to investigate-root-cause Phase 2. The evidence you just gathered is input to the next hypothesis; don't waste it by patching against a hypothesis it disproved. |
| "I'll leave the debug logs in — observability is good." | Adding logs is one of the cheapest improvements to a service's observability. | Debug logs left in are not observability. They lack the structure (level, key-value, sampling) of proper logs; they pollute the log stream with bug-1234 tags forever; they confuse the next person who searches for "the right way" to log this thing and finds your debug prints. | If a probe is genuinely useful as long-term observability, convert it: re-write as a structured log with proper level, no tag, in the right place. Otherwise revert. The middle option (leave the debug logs in unchanged) is the worst of both. |
| Checkpoint | Required artifact | What "no evidence" looks like |
|---|---|---|
| End of Step 1 | A test design naming what confirms, refutes, and what's ambiguous | "I'll see what the logs say." |
| End of Step 2 | Instrumented code committed with debug: prefix and a shared tag | "I dropped some console.logs in." |
| End of Step 3 | Captured output saved to a file or PR comment | "I saw the output in my terminal." |
| End of Step 4 | A one-line verdict and the evidence supporting it | "It seems like the cache is the problem." |
| End of Step 5 | Reverted or formalized probes; clean working tree confirmed | "I'll clean up the debug logs later." |