claude-code-debugger
Use when debugging requires deep investigation beyond memory lookup — when the debugging-memory skill returns LIKELY_MATCH, WEAK_SIGNAL, or NO_MATCH, when a fix didn't hold, when the user explicitly asks for root cause analysis, deep investigation, or iterative debugging, or when you suspect the initial diagnosis is superficial. Not for known fixes or trivial issues.
```shell
npx claudepluginhub tyroneross/claude-code-debugger --plugin claude-code-debugger
```

This skill uses the workspace's default tool permissions.
A 7-phase debugging loop: investigate via causal tree analysis, hypothesize root cause, implement targeted fix, verify with evidence, score against criteria, pressure-test via critique agent, and report with transparency markers. Iterates up to 5x on failures.
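The seven phases above can be sketched as a control loop. This is an illustrative Python sketch only: the actual skill orchestrates an agent, and the phase functions and `debug_loop` name here are hypothetical, not part of the skill's interface.

```python
# Hypothetical sketch of the loop's control flow. Each phase is modeled as a
# callable; "critique" returning False corresponds to a CHALLENGED verdict.
MAX_ITERATIONS = 5  # the loop iterates up to 5x on failures

def debug_loop(symptom, phases):
    """phases: dict of callables keyed by phase name (all hypothetical)."""
    for iteration in range(1, MAX_ITERATIONS + 1):
        tree = phases["investigate"](symptom)        # causal tree analysis
        hypothesis = phases["hypothesize"](tree)     # commit to a root cause
        changes = phases["fix"](hypothesis)          # minimal targeted change
        evidence = phases["verify"](changes)         # collect concrete evidence
        scorecard = phases["score"](evidence)        # pass/fail per criterion
        if all(scorecard.values()) and phases["critique"](changes, evidence):
            return phases["report"](changes, evidence)  # transparency markers
    # Convergence rules: escalate after the iteration budget is exhausted
    return phases["report"](None, None)
```

The key structural point is that critique sits after a fully passing scorecard, so a CHALLENGED verdict re-enters the same iteration budget rather than a separate loop.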
Before entering the loop, assess whether it's warranted. The trigger is the verdict category, not a numeric score — research shows LLM-assigned confidence scores are poorly calibrated for open-ended tasks (Tian et al., EMNLP 2023; 49-84% calibration error on open-ended generation).
KNOWN_FIX — apply the fix directly and verify.

Enter the loop when the verdict is LIKELY_MATCH, WEAK_SIGNAL, or NO_MATCH, the user asks for deep investigation, the initial diagnosis feels superficial, or a previous fix attempt didn't hold.

Goal: Understand what's actually failing and why, not just what it looks like.
Use the `search` MCP tool with the symptom. Note any related incidents.

Output: Causal tree (with confirmed and pruned branches), reproduction steps, evidence gathered, research performed
Goal: Commit to a specific, testable hypothesis before writing any fix.
Output: Hypothesis statement, evidence level, prediction test, related symptom predictions
Goal: Make the minimal change that addresses the hypothesized root cause.
Output: List of changes with rationale
Goal: Collect concrete evidence that the fix works.
Every verification step must produce evidence: command output, test results, or observable behavior. "It should work" is not evidence.
Output: Evidence for each verification step
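One way to make "evidence" concrete is to capture the literal command output and attach it to the claim. A minimal sketch, assuming the verification step is a shell command; the `run_with_evidence` helper is illustrative, not part of the skill:

```python
import subprocess

def run_with_evidence(cmd):
    """Run a verification command and return (passed, evidence).

    The evidence is the literal command output, so the report can quote it
    instead of asserting "it should work".
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    evidence = f"$ {' '.join(cmd)}\n{result.stdout}{result.stderr}".strip()
    return result.returncode == 0, evidence
```

A failing command then yields `passed=False` plus the exact stderr to feed back into the next hypothesis, rather than a bare "verification failed".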
Goal: Objective pass/fail assessment with evidence.
Score against these criteria:
| # | Criterion | Method | Pass Condition | Evidence Required |
|---|---|---|---|---|
| 1 | Symptom resolved | Reproduction steps | Symptom no longer occurs | Command output or test result |
| 2 | Tests pass | Test suite | All relevant tests pass | Test runner output |
| 3 | No regressions | Broader test suite | No new failures introduced | Test runner output |
| 4 | Root cause addressed | Code review | Fix targets root cause, not symptom | Diff + reasoning |
| 5 | Hypothesis confirmed | Prediction test | Prediction test passes | Test output |
All criteria must have evidence. No criterion marked PASS without proof.
If any criterion fails → enter the iteration loop (convergence rules apply). If all criteria pass → proceed to critique (Phase 6).
Output: Scorecard with pass/fail per criterion and evidence
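The "no criterion marked PASS without proof" rule can be checked mechanically. A sketch, assuming scorecard entries use the same keys as the state.json format shown later in this document; `validate_scorecard` is an illustrative name:

```python
def validate_scorecard(scorecard):
    """Return the criteria marked PASS without supporting evidence.

    scorecard: list of dicts with "criterion", "result", "evidence" keys.
    An empty return value means the scorecard honors the evidence rule.
    """
    violations = []
    for entry in scorecard:
        if entry["result"] == "PASS" and not entry.get("evidence"):
            violations.append(entry["criterion"])
    return violations
```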
Goal: Challenge the fix before the user relies on it.
The critique agent checks 5 things:
Goal: Clear, honest summary. No overclaiming.
Every item in the report gets one transparency marker: ✅ (verified with evidence), ⚠️ (partially verified or uncertain), or ❓ (unverified).
Store the confirmed fix with the `store` MCP tool for future retrieval, and record the result with the `outcome` MCP tool. Write the scorecard to `.claude-code-debugger/debug-loop/scorecard.md`.

When any criterion fails or the critique verdict is CHALLENGED, iterate:
Load `references/convergence-rules.md` for detailed rules and escalation templates.
Summary:
Write iteration state to `.claude-code-debugger/debug-loop/state.json`:

```json
{
  "symptom": "original symptom",
  "iteration": 1,
  "phase": "VERIFY",
  "hypotheses": [
    {
      "iteration": 1,
      "hypothesis": "description",
      "evidence_level": "strong | moderate | weak",
      "result": "confirmed | disproved | partial",
      "evidence": "what was found"
    }
  ],
  "scorecard": [
    {
      "criterion": "symptom_resolved",
      "result": "PASS | FAIL",
      "evidence": "summary"
    }
  ],
  "critique_verdict": "APPROVED | CHALLENGED | pending",
  "changes_made": ["file:change summary"]
}
```

Create the directory with `mkdir -p .claude-code-debugger/debug-loop/` before writing.
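If the state is written programmatically rather than via the shell, the same two steps (create the directory, then write the file) look like this in Python. A sketch; `write_state` is an illustrative helper, not part of the skill:

```python
import json
from pathlib import Path

STATE_DIR = Path(".claude-code-debugger/debug-loop")

def write_state(state):
    """Persist iteration state, creating the directory first (mkdir -p)."""
    STATE_DIR.mkdir(parents=True, exist_ok=True)
    path = STATE_DIR / "state.json"
    path.write_text(json.dumps(state, indent=2))
    return path
```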
```
MEMORY SEARCH → INVESTIGATE → HYPOTHESIZE → FIX → VERIFY → SCORE
                                                             ↓
                         All pass? ──yes──→ CRITIQUE ──approved──→ REPORT
                             ↓                  ↓
                             no             challenged
                             ↓                  ↓
                          ITERATE ←─────────────┘
                          (up to 5x)
```
| Anti-Pattern | What to Do Instead |
|---|---|
| Accepting the first explanation | Branch first — identify 2+ plausible causes before pursuing any |
| Fixing the symptom | Trace to root cause, fix there |
| "This should fix it" | Run the tests, show the output |
| Retrying the same approach | If it failed once with the same evidence, it'll fail again. Change the hypothesis |
| Declaring victory without evidence | Every claim needs a ✅/⚠️/❓ marker |
| Skipping research when stuck | If you don't know why something behaves this way, search for it |
| Hiding uncertainty | ⚠️ and ❓ are not failures — they're honest. Hiding them is the failure |
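The "retrying the same approach" row can be enforced against the recorded state. A sketch assuming the state.json hypothesis entries shown above; `is_repeat_hypothesis` is an illustrative name:

```python
def is_repeat_hypothesis(new_hypothesis, state):
    """Flag the anti-pattern of re-proposing an already-disproved hypothesis.

    state: dict in the state.json shape, where each entry in "hypotheses"
    carries a "hypothesis" description and a "result".
    """
    previous = {
        h["hypothesis"]
        for h in state.get("hypotheses", [])
        if h.get("result") == "disproved"
    }
    return new_hypothesis in previous
```

A literal string match is deliberately conservative: it only catches exact repeats, so a changed hypothesis always passes and the iteration can proceed.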