From harness-claude
Executes 4-phase systematic debugging with entropy analysis via 'harness cleanup' and persistent sessions. Enforces Phase 1 investigation before fixes for unclear test failures, context-specific bugs, or vague errors.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude

This skill uses the workspace's default tool permissions.
> 4-phase systematic debugging with entropy analysis and persistent sessions. Phase 1 before ANY fix. "It's probably X" is not a diagnosis.
When the on_bug_fix trigger fires: Phase 1 INVESTIGATE before ANY fix. No exceptions.
If you find yourself writing fix code before completing investigation, STOP. Delete the fix. You are guessing, not debugging. A fix without investigation is a coin flip that creates the illusion of progress.
Before beginning, create a persistent debug session. This survives context resets and tracks state across multiple attempts.
.harness/debug/active/<session-id>.md
Session file format:

```markdown
# Debug Session: <brief-description>
Status: gathering
Started: <timestamp>
Error: <the error message or symptom>

## Investigation Log
(append entries as you go)

## Hypotheses
(track what you have tried)

## Resolution
(filled in when resolved)
```
Status transitions: gathering -> investigating -> fixing -> verifying -> resolved
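The session template and status transitions above can be sketched as a small helper. This is a hypothetical sketch, not part of harness; the names createSession and NEXT_STATUS are illustrative, and it assumes a Node.js environment:

```typescript
import { mkdirSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Allowed status transitions:
// gathering -> investigating -> fixing -> verifying -> resolved
const NEXT_STATUS: Record<string, string | undefined> = {
  gathering: "investigating",
  investigating: "fixing",
  fixing: "verifying",
  verifying: "resolved",
};

// Create a persistent session file following the template above.
function createSession(
  id: string,
  error: string,
  dir = ".harness/debug/active"
): string {
  mkdirSync(dir, { recursive: true });
  const path = join(dir, `${id}.md`);
  const body = [
    `# Debug Session: ${id}`,
    "Status: gathering",
    `Started: ${new Date().toISOString()}`,
    `Error: ${error}`,
    "",
    "## Investigation Log",
    "",
    "## Hypotheses",
    "",
    "## Resolution",
  ].join("\n");
  writeFileSync(path, body);
  return path;
}
```

Because the file lives on disk rather than in conversation context, it survives context resets; each phase appends observations and advances Status one step at a time.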
You must complete Phase 1 before writing ANY fix code. No exceptions.
Read-only constraint: Phase 1 is investigation only. You may read files, run commands, add log statements, and record observations. You may NOT write production code fixes, modify business logic, or commit changes during investigation. If you find yourself writing a fix, you have jumped to Phase 4.
harness cleanup
Review the output. Entropy analysis reveals dead code, pattern violations, and drift near the failure site, any of which may relate to the bug.
Record relevant findings in the session log.
Read the COMPLETE error message. Not just the first line — the entire stack trace, every warning, every note. Errors often contain the answer.
Ask yourself:
Record the answers in the session log.
Run the failing scenario multiple times. Confirm it fails every time with the same error. If it is intermittent, record:
If you cannot reproduce the failure, you cannot debug it. Escalate.
git log --oneline -20
git diff HEAD~5
What changed recently? Many bugs are caused by the most recent change. Compare the failing state to the last known working state.
Use code_outline to get a structural overview of suspect modules (functions, classes, exports) without reading full source. Use code_search to locate symbol usages, error strings, or patterns across the codebase. Use code_unfold to expand a specific symbol to its full implementation with dependency context. These tools let you navigate efficiently without reading entire files.
Start at the error location and trace backward:
Read each function in the call chain completely. Do not skim.
Update the session status to investigating.
When you encounter an unknown during investigation or analysis, classify it immediately:
Do not bury unknowns. An unstated assumption in your investigation leads to fixes that address the wrong root cause.
Search the codebase for similar functionality that WORKS. There is almost always a working example of what you are trying to do.
Look for:
- Other calls to the same function/API that succeed
- Similar features that work correctly
- Test fixtures that exercise the same code path
- Documentation or comments that describe expected behavior
Run detect_anomalies to identify structural irregularities (orphaned files, missing tests, unusual coupling) that may relate to the bug. Anomalies near the failure site often point to the root cause.
When you find a working example, read it in its entirety. Do not cherry-pick lines. Understand:
Compare the working example to the failing code line by line. The bug is in the differences. Common categories:
Record all differences in the session log.
Based on your investigation and analysis, state a specific hypothesis:
"The failure occurs because [specific cause].
If this hypothesis is correct, then [observable prediction].
I can test this by [specific action]."
A good hypothesis is falsifiable — there is a concrete test that would disprove it. "Something is wrong with the configuration" is not a hypothesis. "The database connection string is missing the port number, causing connection timeout" is a hypothesis.
Change exactly ONE thing to test your hypothesis. If you change multiple things, you cannot determine which one had the effect.
Run the failing scenario. Did the behavior change?
If the bug is in a complex system, extract a minimal reproduction:
This serves two purposes: it confirms your understanding of the root cause, and it becomes the basis for a regression test.
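As a sketch of what a minimal reproduction can look like (illustrative names, not from any real codebase), a single function that isolates the failing path is often enough:

```typescript
// Minimal reproduction sketch: the handler dereferences body.email before any
// validation has been checked, so an empty body triggers a TypeError.
function createUserRepro(body: { email?: string }): { email: string } {
  // With body = {}, body.email is undefined and .toLowerCase() throws.
  return { email: (body.email as string).toLowerCase() };
}
```

Calling createUserRepro({}) fails the same way every time, which confirms the understanding of the root cause, and the same call becomes the body of the regression test in Phase 4.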
Update the session status to fixing.
Before writing the fix, write a test that:
This follows harness-tdd discipline. The fix is driven by a failing test.
Write a SINGLE fix that addresses the ROOT CAUSE identified in Phase 3. Not a workaround. Not a symptom suppression. The root cause.
Characteristics of a good fix:
Characteristics of a bad fix (revert immediately):
- An if branch that special-cases the specific failing input
- A cast to any, or a removed type check

After applying the fix, verify:
- harness validate — must PASS
- harness check-deps — must PASS

If a knowledge graph exists at .harness/graph/, refresh it after code changes to keep graph queries accurate:
harness scan [path]
Skipping this step means subsequent graph queries (impact analysis, dependency health, test advisor) may return stale results.
Apply the regression test verification protocol:
If the test passes without the fix, the test does not catch the bug. Rewrite the test.
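The revert-and-fail protocol can be illustrated with a toy pair of handlers (illustrative names; in practice the check is done by reverting the actual fix in your working tree):

```typescript
type Req = { email?: string };
type Res = { status: number };

// Handler with the root-cause fix: validation is checked before processing.
function handlerFixed(req: Req): Res {
  if (!req.email) return { status: 400 };
  return { status: 201 };
}

// The same handler with the fix reverted.
function handlerReverted(req: Req): Res {
  return { status: 201 };
}

// The regression test: an empty body must produce a 400.
function regressionTestPasses(handler: (req: Req) => Res): boolean {
  return handler({}).status === 400;
}
```

Here regressionTestPasses(handlerFixed) is true and regressionTestPasses(handlerReverted) is false. If the test also passed against the reverted handler, it would prove nothing about the bug and must be rewritten.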
Update the debug session:
Status: resolved
Resolved: <timestamp>
## Resolution
Root cause: <what actually caused the bug>
Fix: <what was changed and why>
Regression test: <path to test file>
Learnings: <what to remember for next time>
Move the session file:
mv .harness/debug/active/<session-id>.md .harness/debug/resolved/
Append learnings to .harness/learnings.md if the bug revealed a pattern that should be remembered.
Update the session status to resolved.
- harness cleanup — Run in Phase 1 INVESTIGATE for entropy analysis. Reveals dead code, pattern violations, and drift near the failure site.
- harness validate — Run in Phase 4 VERIFY after applying the fix. Confirms the fix does not break project-wide constraints.
- harness check-deps — Run in Phase 4 VERIFY. Confirms the fix does not introduce dependency violations.
- harness state learn — Run after resolution to capture learnings for future sessions.
- code_outline — Use in Phase 1 INVESTIGATE to get a structural overview of suspect modules without reading full source.
- code_search — Use in Phase 1 INVESTIGATE to locate symbol usages, error strings, or patterns across the codebase.
- code_unfold — Use in Phase 1 INVESTIGATE to expand a specific symbol to its full implementation with dependency context.
- detect_anomalies — Run in Phase 2 ANALYZE to identify structural irregularities (orphaned files, missing tests, unusual coupling) near the failure site.

Session files live in .harness/debug/active/ (in progress) and .harness/debug/resolved/ (completed). These persist across context resets.

| Flag | Corrective Action |
|---|---|
| "It's probably X, let me just fix that" | STOP. "Probably" is a guess, not a diagnosis. Complete Phase 1 INVESTIGATE before writing any fix code. |
| "I'll change a few things and see if the bug goes away" | STOP. One variable at a time. Multiple simultaneous changes mean you cannot determine which one had the effect — or whether you introduced a new bug. |
| "One more fix attempt before I escalate" after 2 failed attempts | STOP. Three failed attempts means your mental model is wrong. Step back, re-read the investigation log, and question your assumptions about how the system works. |
| // temporary workaround or // TODO: real fix later replacing a root-cause fix | STOP. Workarounds are symptom suppression. The root cause remains. Fix it properly or escalate — do not commit workarounds disguised as fixes. |
| Rationalization | Reality |
|---|---|
| "I have a strong hunch about what is wrong, so I will jump straight to fixing it" | Phase 1 INVESTIGATE must be completed before ANY fix code is written. You are guessing, not debugging. |
| "I changed two things and the bug is gone, so the fix must be correct" | One variable at a time is a gate. Changing multiple things simultaneously means you do not know which change fixed it. |
| "This is my third attempt but I feel close, so one more try before escalating" | After 3 failed fix attempts, the gate requires you to question the architecture. The problem is likely not where you think it is. |
| "A try-catch that swallows the error prevents the crash, so the bug is fixed" | Symptom suppression is explicitly listed as a bad fix. Wrapping the failure in a try-catch addresses what the bug did, not why it happened. |
| "The bug only happens in edge cases, so a partial fix is acceptable" | A partial fix means the bug still exists. Either fix the root cause completely or document the remaining scenarios as known issues with tracked tickets. |
| "I can skip the regression test since I understand the root cause well" | Understanding the root cause and proving the fix catches it are different things. The revert-and-fail test is mandatory — it is the only proof the test actually guards against the bug. |
Phase 1 — INVESTIGATE:
harness cleanup: No entropy issues near api/routes/users.ts
Error: "Cannot read properties of undefined (reading 'email')"
Stack trace points to: src/services/user-service.ts:34
Reproduces consistently with POST /users and empty body {}
Recent changes: Added input validation middleware (2 commits ago)
Data flow: request.body -> validate() -> createUser(body.email)
Phase 2 — ANALYZE:
Working example: POST /orders handles empty body correctly
Difference: /orders validates BEFORE destructuring; /users destructures BEFORE validating
The validation middleware runs but its result is not checked
Phase 3 — HYPOTHESIZE:
Hypothesis: The validation middleware sets req.validationErrors but the route
handler does not check it before accessing req.body.email.
Test: Add a log before line 34 to check req.validationErrors.
Result: Confirmed — validationErrors contains "email is required" but handler proceeds.
Phase 4 — FIX:
```typescript
// Regression test
it('returns 400 when request body is empty', async () => {
  const res = await request(app).post('/users').send({});
  expect(res.status).toBe(400);
  expect(res.body.errors).toContain('email is required');
});

// Fix: Check validation result before processing
if (req.validationErrors?.length) {
  return res.status(400).json({ errors: req.validationErrors });
}
```
Revert test: Commenting out the validation check causes the test to fail with 500. Confirmed.