Systematic debugging with hypothesis testing. Persistent across sessions.
Conducts systematic debugging sessions by spawning subagents to test hypotheses and track findings persistently.
npx claudepluginhub sienklogic/towline

This skill is limited to using the following tools:

- templates/continuation-prompt.md.tmpl
- templates/initial-investigation-prompt.md.tmpl

You are running the debug skill. Your job is to run a structured, hypothesis-driven debugging session that persists across conversations. You track every hypothesis, test, and finding in a debug file so work is never lost.
This skill spawns Task(subagent_type: "dev:towline-debugger") for investigation work.
Reference: skills/shared/context-budget.md for the universal orchestrator rules.
Additionally for this skill:
Debug systematically, not randomly. Every investigation step must have a hypothesis, a test, and a recorded result. No "let me just try this" — every action has a reason and is documented.
Load depth profile: Run `node ${CLAUDE_PLUGIN_ROOT}/scripts/towline-tools.js config resolve-depth` to get `debug.max_hypothesis_rounds`. If the command fails (no `config.json` or CLI error), default to 5 rounds. Initialize a round counter at 0. This counter increments each time a continuation debugger is spawned.
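The lookup and fallback can be sketched in shell. The output format of `resolve-depth` is an assumption here (taken to print the round count on stdout); a failing or absent CLI is exactly the case the default covers:

```shell
# Resolve debug.max_hypothesis_rounds; default to 5 if the CLI fails
# (missing config.json, missing script, or node error).
max_rounds=$(node "${CLAUDE_PLUGIN_ROOT:-.}/scripts/towline-tools.js" config resolve-depth 2>/dev/null)
if [ -z "$max_rounds" ]; then
  max_rounds=5
fi
# Round counter, incremented each time a continuation debugger is spawned.
round=0
echo "max rounds: $max_rounds"
```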
Scan .planning/debug/ for existing debug files:
.planning/debug/
{NNN}-{slug}.md # Each debug session is a file
Read each file's frontmatter to check status:
- status: active — session is in progress
- status: resolved — session is complete
- status: stale — session was abandoned

If active sessions found:
Use the debug-session-select pattern from skills/shared/gate-prompts.md:
question: "Found active debug sessions. Which would you like?"
Generate options dynamically from active sessions:
Handle responses:
If no active sessions found:
If $ARGUMENTS is provided and descriptive:
If $ARGUMENTS is empty or minimal:
Symptom gathering questions (ask as plain text — these are freeform, do NOT use AskUserQuestion):
Optional follow-ups (ask if relevant):
Check `.planning/debug/` for existing files, then create `.planning/debug/{NNN}-{slug}.md`:
---
id: "{NNN}"
title: "{issue title}"
status: active
created: "{ISO date}"
updated: "{ISO date}"
severity: "{critical|high|medium|low}"
category: "{runtime|build|test|config|integration|unknown}"
---
# Debug Session: {title}
## Symptoms
**Expected:** {expected behavior}
**Actual:** {actual behavior}
**Reproduction:** {steps}
**Onset:** {when it started}
**Scope:** {affected areas}
## Environment
- OS: {detected or reported}
- Runtime: {node version, python version, etc.}
- Relevant config: {any config that matters}
## Investigation Log
### Round 1 (automated)
{This section is filled by towline-debugger}
## Hypotheses
| # | Hypothesis | Status | Evidence |
|---|-----------|--------|----------|
| 1 | {hypothesis} | {testing/confirmed/rejected} | {evidence} |
## Root Cause
{Filled when found}
## Fix Applied
{Filled when fixed}
## Timeline
- {ISO date}: Session created
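A status scan over files shaped like this template can be sketched as follows (the two sample files are made up for illustration):

```shell
# Create two minimal debug files, then list the ones with status: active.
mkdir -p .planning/debug
printf -- '---\nid: "001"\nstatus: active\n---\n'   > .planning/debug/001-login-timeout.md
printf -- '---\nid: "002"\nstatus: resolved\n---\n' > .planning/debug/002-css-not-loading.md
for f in .planning/debug/*.md; do
  # Read the first status: line out of the frontmatter.
  status=$(awk -F': *' '/^status:/ {print $2; exit}' "$f")
  if [ "$status" = "active" ]; then
    echo "active session: $f"
  fi
done
```

Only the first file is reported, so the orchestrator would offer it for resumption.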
Display to the user: ◐ Spawning debugger...
Spawn Task(subagent_type: "dev:towline-debugger") with the prompt template.
Read skills/debug/templates/initial-investigation-prompt.md.tmpl for the spawn prompt. Fill in the {NNN}, {slug}, and symptom placeholders with values from the debug file created above.
Resuming debug session #{NNN}: {title}
Last state:
- Hypotheses tested: {N}
- Confirmed: {list or "none yet"}
- Rejected: {list}
- Current lead: {most promising hypothesis}
Continuing investigation...
Display to the user: ◐ Spawning debugger (resuming session #{NNN})...
Spawn Task(subagent_type: "dev:towline-debugger") with the continuation prompt template.
Read skills/debug/templates/continuation-prompt.md.tmpl for the spawn prompt. Fill in the {NNN}, {slug}, and {paste investigation log...} placeholders with data from the debug file.
The debugger returns one of four outcomes:
Root cause identified: {cause}
Fix applied: {description}
Commit: {hash}
Actions:
- Update the debug file frontmatter to status: resolved
- Display:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TOWLINE ► BUG RESOLVED ✓
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
**Session #{NNN}:** {title}
**Root cause:** {cause}
**Fix:** {description}
**Commit:** {hash}
───────────────────────────────────────────────────────────────
## ▶ Next Up
**Continue your workflow** — the bug is fixed
`/dev:status`
<sub>`/clear` first → fresh context window</sub>
───────────────────────────────────────────────────────────────
**Also available:**
- `/dev:continue` — execute next logical step
- `/dev:review {N}` — verify the current phase
───────────────────────────────────────────────────────────────
Used when the debugger was invoked with find_root_cause_only or when the fix is too complex for auto-application.
Root cause identified: {cause}
Suggested fix: {approach}
Actions:
- Update the debug file frontmatter to status: resolved
- Display:

───────────────────────────────────────────────────────────────
## ▶ Next Up
**Apply the fix** — root cause identified, fix needed
`/dev:quick {fix description}`
<sub>`/clear` first → fresh context window</sub>
───────────────────────────────────────────────────────────────
**Also available:**
- `/dev:plan` — for complex fixes that need planning
- `/dev:status` — see project status
───────────────────────────────────────────────────────────────
The debugger found something but needs user input or more investigation.
Investigation progress:
- Tested: {hypotheses tested}
- Found: {key finding}
- Need: {what's needed to continue}
Actions:
Use the pattern from skills/shared/gate-prompts.md:
question: "Investigation has reached a checkpoint. How should we proceed?"

Handle responses:
Display ◐ Spawning debugger (continuing investigation)... and spawn another Task(subagent_type: "dev:towline-debugger") with updated context from the debug file.

The debugger exhausted its hypotheses without finding the root cause.
Investigation exhausted:
- Tested: {all hypotheses}
- Rejected: {list}
- Remaining unknowns: {list}
Actions:
The towline-debugger agent follows this protocol internally:
1. OBSERVE: Read error messages, logs, code around the failure point
2. HYPOTHESIZE: "The most likely cause is X because Y"
3. PREDICT: "If X is the cause, then test Z should show W"
4. TEST: Execute test Z
5. EVALUATE:
- Result matches prediction → hypothesis supported → investigate deeper
- Result contradicts → hypothesis rejected → try next hypothesis
- Result is unexpected → new information → form new hypothesis
| Technique | When to Use |
|---|---|
| Stack trace analysis | Error with stack trace available |
| Code path tracing | Logic error, wrong behavior |
| Log injection | Need to see runtime values |
| Binary search | Know it worked before, need to find when it broke |
| Isolation | Complex system, need to narrow scope |
| Comparison | Works in one case, fails in another |
| Dependency audit | Recent dependency changes |
| Config diff | Works in one environment, not another |
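For the binary-search row, `git bisect` is the standard tool. A throwaway-repo sketch (commit contents invented, with the "bug" appearing once `n.txt` reaches 4):

```shell
# Build a tiny history of five commits, c1..c5, where n.txt holds 1..5.
mkdir -p repo && cd repo && git init -q
git config user.email dev@example.com
git config user.name dev
for i in 1 2 3 4 5; do
  echo "$i" > n.txt
  git add n.txt
  git commit -qm "c$i"
done
git bisect start HEAD HEAD~4   # bad: HEAD (c5), good: first commit (c1)
# bisect run: exit 0 marks a revision good, non-zero marks it bad.
git bisect run sh -c 'test "$(cat n.txt)" -lt 4'
git show --no-patch --format=%s bisect/bad   # first bad commit: c4
```

With five commits, bisect needs only two test runs to isolate the breaking commit.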
| Quality | Description | Action |
|---|---|---|
| Strong | Directly proves/disproves hypothesis | Record and move on |
| Moderate | Suggests but doesn't prove | Record, seek corroboration |
| Weak | Tangentially related | Note but don't base decisions on it |
| Misleading | Red herring | Record as eliminated, explain why |
The maximum number of investigation rounds is controlled by the depth profile's debug.max_hypothesis_rounds setting:

- quick: 3 rounds (fast, surface-level investigation)
- standard: 5 rounds (default)
- comprehensive: 10 rounds (deep investigation)

The orchestrator tracks the round count. Before spawning each continuation debugger (Step 3 "CHECKPOINT" -> "Continue"), increment the round counter. If the counter reaches the limit:
Use AskUserQuestion:

question: "Reached {N}-round hypothesis limit. How should we proceed?"
header: "Debug Limit"
options:
  - label: "Extend"
    description: "Allow {N} more rounds (doubles the limit)"
  - label: "Wrap up"
    description: "Record findings so far and close the session"
  - label: "Escalate"
    description: "Save context for manual debugging"
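A minimal sketch of the counter, assuming the standard-profile limit of 5:

```shell
max_rounds=5
round=0
# Increment before each continuation spawn; stop when the limit is hit.
while [ "$round" -lt "$max_rounds" ]; do
  round=$((round + 1))
  echo "round $round: spawning continuation debugger"
done
echo "reached ${max_rounds}-round hypothesis limit; asking the user how to proceed"
```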
When closing without a fix, mark the session stale, record all findings, and suggest next steps.

Status transitions:

- active → resolved (root cause found and fixed)
- active → stale (abandoned — no updates for 7+ days)
- active → active (resumed after pause)
When scanning for active sessions, check the updated date. If it is more than 7 days old, mark the session stale in status.

Old resolved debug files can accumulate. They serve as a knowledge base for similar issues. Do NOT auto-delete them.
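The seven-day check can be sketched as follows (the `updated` value is a made-up example; both the GNU and BSD `date` spellings are tried):

```shell
updated="2024-01-01"
now=$(date -u +%s)
# GNU date parses with -d; BSD/macOS date uses -j -f.
last=$(date -u -d "$updated" +%s 2>/dev/null \
  || date -u -j -f "%Y-%m-%d" "$updated" +%s)
age_days=$(( (now - last) / 86400 ))
if [ "$age_days" -gt 7 ]; then
  echo "no updates for $age_days days: mark session stale"
fi
```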
Update STATE.md Debug Sessions section (create if needed):
### Debug Sessions
| # | Issue | Status | Root Cause |
|---|-------|--------|------------|
| 001 | Login timeout | resolved | DB connection pool exhausted |
| 002 | CSS not loading | active | investigating |
Reference: skills/shared/commit-planning-docs.md for the standard commit pattern.
If planning.commit_docs: true in config.json:
- docs(planning): open debug session {NNN} - {slug}
- docs(planning): resolve debug session {NNN} - {root cause summary}
- fix({scope}): {description}

If the towline-debugger Task() fails or returns an error, display:
╔══════════════════════════════════════════════════════════════╗
║ ERROR ║
╚══════════════════════════════════════════════════════════════╝
Debugger agent failed for session #{NNN}.
**To fix:**
- Check the debug file at `.planning/debug/{NNN}-{slug}.md` for partial findings
- Re-run with `/dev:debug` to resume the session
- If the issue persists, try a fresh session with different symptom details
Do not exceed the debug.max_hypothesis_rounds limit without user confirmation (default: 5 for standard mode).