Classifies errors into 7 taxonomy categories (syntax/type, logic, design, performance, etc.) and routes to specific resolution strategies. Use for unclear root causes, failed bug fixes, or pattern mismatches.
```bash
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude
```

This skill uses the workspace's default tool permissions.
> Cognitive mode: **diagnostic-investigator**. Classify errors into taxonomy categories and route to deterministic resolution strategies. Evidence first, classification second, action third.
Use when `on_bug_fix` triggers fire and the error does not match a known pattern.

Every error falls into one of 7 categories. Each category has distinct signals and a distinct resolution strategy. Misclassification leads to wasted effort: a Logic error treated as Syntax will never be fixed by reading the compiler output.
### 1. Syntax/Type

Signals: Compilation failures, type errors, parse errors, import resolution failures, "cannot find name", "expected X but got Y" from the compiler or type checker.
Resolution strategy: Read the error message. It tells you exactly what is wrong and where. Fix mechanically — match the type, fix the import, correct the syntax. No investigation needed. Run the type checker to confirm.
Time budget: Under 5 minutes. If it takes longer, you have misclassified.
### 2. Logic

Signals: Tests fail with wrong output values, unexpected behavior at runtime, "expected X but received Y" from test assertions, correct types but incorrect results.
Resolution strategy: Write a failing test FIRST that isolates the incorrect behavior. Then trace the data flow from input to incorrect output. Find the exact line where the actual value diverges from the expected value. Fix that line. Run the test to confirm.
Time budget: 5-30 minutes. If it takes longer, consider reclassifying as Design.
### 3. Design

Signals: Multiple related failures, fixes that break other things, circular dependencies, "everything is tangled", you cannot write a clean test because the code is not testable, the fix requires changing many files.
Resolution strategy: STOP. Do not attempt to fix locally. This is an architectural issue that requires human judgment. Document the symptoms, the attempted fixes, and the structural problem. Escalate to the human architect with a clear summary and 2-3 options.
Time budget: 15 minutes maximum for classification and documentation. Do not spend time attempting fixes.
### 4. Performance

Signals: Slow responses, timeouts, high memory usage, "heap out of memory", operations that used to be fast are now slow, N+1 query patterns.
Resolution strategy: Profile FIRST. Do not guess at the bottleneck. Use the appropriate profiling tool for the runtime (browser devtools, node --prof, py-spy, database EXPLAIN). Identify the actual hotspot. Optimize only that hotspot. Measure again to confirm improvement.
Time budget: 15-60 minutes. Profiling takes time but prevents optimizing the wrong thing.
### 5. Security

Signals: Vulnerability scanner findings, injection possibilities, authentication/authorization failures, exposed secrets, CORS issues, unsafe deserialization.
Resolution strategy: Check OWASP Top 10 for the vulnerability class. Apply the minimal fix that closes the vulnerability without changing unrelated behavior. Do not refactor surrounding code during a security fix — minimize the blast radius. Verify with a test that the vulnerability is closed.
Time budget: Variable, but the fix itself should be minimal. If the fix requires large changes, reclassify as Design and escalate.
Signals: "Module not found" at runtime (not compile time), version mismatch errors, "connection refused", works on one machine but not another, Docker/CI failures that pass locally, missing environment variables.
Resolution strategy: Check versions first — runtime, dependencies, OS. Compare the failing environment to a working environment. Look at: Node/Python/Java version, dependency lock file freshness, environment variables, file permissions, network connectivity. Fix the environment, not the code.
Time budget: 5-30 minutes. Environment issues are usually fast once you compare configurations.
### 7. Flaky

Signals: Test passes sometimes and fails sometimes, "works on retry", timing-dependent failures, failures that disappear when you add logging, race conditions, order-dependent test results.
Resolution strategy: Isolate the timing dependency. Run the failing test in isolation — does it still flake? Run it 20 times in a loop — what is the failure rate? Look for: shared mutable state between tests, missing await on async operations, time-dependent assertions (setTimeout, Date.now()), external service dependencies without mocking. Fix the non-determinism, do not add retries.
Time budget: 15-60 minutes. Flaky tests are deceptively hard. If you cannot isolate the timing dependency in 60 minutes, escalate.
## Phase 1: CLASSIFY

This phase must complete before any fix is attempted.
Capture the current state before any changes:
```bash
# Run type checker (adapt to project language)
npx tsc --noEmit 2>&1 | tail -50

# Run test suite
npm test 2>&1 | tail -100

# Record results
echo "Baseline captured at $(date)" >> .harness/diagnostics/current.md
```
Record exact counts: how many type errors, how many test failures, which tests fail.
Read the ENTIRE error output. Not the first line. Not the summary. The complete message, including the stack trace, error codes, and every file/line reference.
Compare the error signals against the 7 categories above, then write the classification down:

```
CLASSIFICATION: [Category Name]
CONFIDENCE: [High/Medium/Low]
SIGNALS: [List the specific signals that led to this classification]
ALTERNATIVE: [If confidence is Medium or Low, what other category could it be?]
```
**Gate:** Classification must be explicit and written down before proceeding. "I think it's a type error" is not sufficient. State the category, confidence, and signals.
## Phase 2: ROUTE

Follow the resolution strategy for the classified category exactly. Do not mix strategies. Each category has a specific approach because each category has a specific failure mode.
If the category is Design, STOP HERE. Do not proceed to Phase 3. Document and escalate.
For all other categories, execute the resolution strategy as described in the taxonomy above.
## Phase 3: RESOLVE

Implement the fix according to the resolution strategy for the classified category.
Run the same checks from Phase 1 Step 1:
```bash
# Run type checker
npx tsc --noEmit 2>&1 | tail -50

# Run test suite
npm test 2>&1 | tail -100
```
Compare results to the baseline. The error is resolved when: the targeted error no longer appears, the type error and test failure counts are at or below the recorded baseline, and no new failures have been introduced.
## Phase 4: RECORD

This phase triggers only when the initial classification was wrong or the first fix attempt failed.
```markdown
# Diagnostic Record: <brief description>

Date: <timestamp>
Initial Classification: <what you thought it was>
Actual Classification: <what it turned out to be>
Misclassification Reason: <why the signals were misleading>
Resolution: <what actually fixed it>
Anti-Pattern: <what to watch for next time>
```
```bash
# Append to diagnostics log
cat >> .harness/diagnostics/anti-patterns.md << 'ENTRY'
<the record from Step 1>
ENTRY
```
This log accumulates over time and helps improve future classifications.
The log lives at `.harness/anti-patterns.md` per Group B3 convention.

| Rationalization | Why It Is Wrong |
|---|---|
| "I can see the error is a type issue -- let me just fix it without formal classification" | The gate says classification must be explicit and written down before any fix attempt. Implicit classification skips the evidence step. |
| "This looks like a Design issue, but I can probably fix it locally with a small change" | Design category MUST escalate. Local fixes for architectural problems create more architectural problems. |
| "I do not need to run tests before fixing -- I know what the baseline is" | Deterministic checks before AND after is a gate. Without a recorded baseline, you cannot prove the fix helped. |
| "My first fix did not work, but I will try a different approach within the same category" | Reclassify, do not force. If the resolution strategy is not working, the classification is probably wrong. |
### Worked Example 1: Syntax/Type

Phase 1 — CLASSIFY:
```
Error: src/handlers/users.ts(42,15): error TS2345:
Argument of type 'string | undefined' is not assignable to parameter of type 'string'.
```
Baseline: 1 type error, 0 test failures.
```
CLASSIFICATION: Syntax/Type
CONFIDENCE: High
SIGNALS: Compile-time error, TypeScript error code TS2345, exact file and line provided
ALTERNATIVE: None — this is unambiguously a type error
```
Phase 2 — ROUTE: Following Syntax/Type strategy: read the error, fix mechanically.
The function getUserById(id: string) is called with req.params.id which is string | undefined. Add a guard or use non-null assertion after validation.
Phase 3 — RESOLVE:
```typescript
// Fix: add guard before call
const id = req.params.id;
if (!id) {
  return res.status(400).json({ error: 'id parameter is required' });
}
const user = await getUserById(id);
```
Verification: 0 type errors (was 1), 0 test failures (unchanged). Fix confirmed.
### Worked Example 2: Flaky Test

Phase 1 — CLASSIFY:
```
Error: test/integration/queue.test.ts
"should process message within timeout"
Passes 7 out of 10 runs. Fails with: "Timeout — message not processed within 2000ms"
```
Baseline: 0 type errors, 1 flaky test failure (intermittent).
```
CLASSIFICATION: Flaky
CONFIDENCE: High
SIGNALS: Intermittent failure, timing-dependent assertion (2000ms timeout), passes on retry, failure rate ~30%
ALTERNATIVE: Could be Environment if CI has different resource constraints, but it also flakes locally
```
Phase 2 — ROUTE: Following Flaky strategy: isolate the timing dependency.
Run in isolation: still flakes (not order-dependent). Run 20 times: fails 6/20 times (~30% failure rate, consistent). Examine the test: it publishes a message and asserts it is processed within 2000ms. Examine the handler: processing involves a database write that sometimes takes >2000ms under load. Root cause: the timeout is too tight for the actual processing time, and there is no mechanism to signal completion — the test polls on a timer.
Phase 3 — RESOLVE:
```typescript
// Before: a fixed sleep, then a single check
await new Promise((resolve) => setTimeout(resolve, 2000));
const result = await db.query('SELECT * FROM processed WHERE id = ?', [msgId]);
expect(result.rows).toHaveLength(1);
```

```typescript
// After: poll the completion condition itself, with a generous timeout
const result = await waitForCondition(
  () => db.query('SELECT * FROM processed WHERE id = ?', [msgId]).then((r) => r.rows.length > 0),
  { timeout: 10000, interval: 100 }
);
expect(result).toBe(true);
```
Verification: 0 type errors (unchanged), 20/20 test runs pass. Fix confirmed.
Phase 4 — RECORD: Not needed — initial classification was correct and first fix attempt succeeded.