ACH-based parallel debugging. Spawns multiple hypothesis-investigator agents to investigate competing hypotheses simultaneously. Use when bugs are complex, when single-agent debugging hits 3+ failures, or when root cause is unclear. Also triggers when test-failure-analyst returns LOW confidence during arc Phase 7.7. Keywords: debug, investigate, hypothesis, root cause, parallel debugging, ACH, competing hypotheses, falsify, evidence, multi-agent debug. <example> user: "/rune:debug test suite fails intermittently on auth module" assistant: "The Tarnished initiates the ACH Protocol — triaging, generating hypotheses, and summoning investigators..." </example>
Install: `npx claudepluginhub vinhnxv/rune --plugin rune`
What this command does:
- Debugs complex bugs using competing hypotheses across 6 failure categories, parallel evidence collection with strength ratings, and root cause arbitration.
- Deploys parallel agent investigators to test multiple bug hypotheses simultaneously, gather confirming/disproving evidence, synthesize findings, rank causes, and apply minimal verified fixes.
- Performs root cause analysis for bugs by tracing errors through code, analyzing stack traces, forming and testing hypotheses, then hands off to fix.
Runtime context (preprocessor snapshot):

find tmp -maxdepth 1 -name '.rune-*-*.json' -exec grep -l '"running"' {} + 2>/dev/null | wc -l | tr -d ' '
git branch --show-current 2>/dev/null || echo "unknown"

Implements Analysis of Competing Hypotheses (ACH) methodology for multi-agent debugging. Instead of sequential hypothesis testing (systematic-debugging), this spawns parallel investigators — each assigned ONE hypothesis to confirm or falsify with evidence.
Load skills: systematic-debugging, rune-orchestration, context-weaving, team-sdk, zsh-compat
| Scenario | Use /rune:debug | Use systematic-debugging |
|---|---|---|
| Complex bug, unclear root cause | Yes | No |
| Multiple possible causes | Yes | No |
| Simple deterministic bug | No | Yes |
| Single obvious hypothesis | No | Yes |
| 3+ failures in single-agent debug | Yes (escalation) | No |
| test-failure-analyst LOW confidence | Yes (arc trigger) | No |
debug:
max_investigators: 4 # 1-6, default 4
timeout_ms: 420_000 # 7 min per investigation round
model: sonnet # default investigators model; overridden by cost_tier via resolveModelForAgent()
re_triage_rounds: 1 # max re-triage rounds before escalating to user
echo_on_verdict: true # persist verdict to rune-echoes after resolution
Read config via readTalismanSection("misc") — see read-talisman.md.
Goal: Reproduce, classify, and generate competing hypotheses.
bugDescription = $ARGUMENTS
If $ARGUMENTS is empty, use AskUserQuestion to get the bug description.
Run the failing test or reproduce the error:
testOutput = Bash("{test_command}")
If reproduction fails:
AskUserQuestion
Record recent changes to the affected files:
git log --oneline -10 -- {affected_files}
Use trigger heuristics from ach-methodology.md to classify the failure:
Record primary category and note any secondary matches.
Using templates from hypothesis-templates.md, generate competing hypotheses, each with an ID of the form {H-PREFIX}-{NNN} (e.g., H-REG-001).

Graceful degradation: If <2 hypotheses can be generated (bug is too simple or too unclear), fall back to single-agent systematic-debugging methodology. Do NOT spawn a team for 1 hypothesis.
// readTalismanSection: "misc"
const misc = readTalismanSection("misc")
maxInvestigators = misc?.debug?.max_investigators ?? 4
timeoutMs = misc?.debug?.timeout_ms ?? 420000
investigatorModel = misc?.debug?.model ?? "sonnet"
maxReTriageRounds = misc?.debug?.re_triage_rounds ?? 1
Cap hypotheses at maxInvestigators.
CWD="$(git rev-parse --show-toplevel 2>/dev/null || pwd)"
source "${CWD}/plugins/rune/scripts/lib/workflow-lock.sh"
conflicts=$(rune_check_conflicts "writer")
if echo "$conflicts" | grep -q "CONFLICT"; then
AskUserQuestion({ question: "Active workflow conflict:\n${conflicts}\nProceed anyway?" })
fi
rune_acquire_lock "debug" "writer"
teamName = "rune-debug-{timestamp}"
TeamCreate({ team_name: teamName })
Fallback: If TeamCreate fails, fall back to single-agent systematic-debugging.
For each hypothesis, create a task:
TaskCreate({
subject: "Investigate {hypothesis_id}: {one-line summary}",
description: `
## Assignment
Investigate this ONE hypothesis. Gather confirming AND falsifying evidence.
**Hypothesis ID**: {hypothesis_id}
**Hypothesis**: {full hypothesis statement}
**Category**: {category}
## Bug Context
**Description**: {bugDescription}
**Error output**: {testOutput}
**Affected files**: {file_list}
## Evidence Standards
Classify each evidence item by tier:
- DIRECT (1.0): Uniquely produces/eliminates failure
- CORRELATIONAL (0.6): Associated but alternate explanations exist
- TESTIMONIAL (0.3): Reasoning without direct observation
- ABSENCE (0.8/0.2): Expected evidence not found (0.8 exhaustive, 0.2 shallow)
## Output
Write evidence report to: tmp/debug/{teamName}/{hypothesis_id}.md
Use the format from your agent instructions.
`,
activeForm: "Investigating {hypothesis_id}"
})
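To make the tiers concrete, here is a minimal sketch of what a classified evidence list might look like. The item shape and the specific findings are illustrative assumptions, not the report format itself (that lives in the investigator's agent instructions); only the tier names and weights come from the standards above.

```typescript
// Illustrative evidence items for a hypothetical H-REG-001 investigation.
// Tier names and weights mirror the Evidence Standards; everything else
// (fields, findings, file paths) is assumed for the sake of the example.
type EvidenceTier = "DIRECT" | "CORRELATIONAL" | "TESTIMONIAL" | "ABSENCE"

interface EvidenceItem {
  id: string
  tier: EvidenceTier
  weight: number      // tier weight (ABSENCE is 0.8 exhaustive / 0.2 shallow)
  supports: boolean   // true = confirming, false = refuting
  description: string
  location?: string   // file:line when directly observed
}

const evidence: EvidenceItem[] = [
  { id: "E1", tier: "DIRECT", weight: 1.0, supports: true,
    description: "Reverting the suspect commit makes the failing test pass",
    location: "src/auth/session.ts:42" },
  { id: "E2", tier: "CORRELATIONAL", weight: 0.6, supports: true,
    description: "Failures began in the same CI window as the commit" },
  { id: "E3", tier: "ABSENCE", weight: 0.2, supports: false,
    description: "No matching error found in logs (shallow search only)" },
]
```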
Before spawning investigators, query agent-search MCP for investigation agents:
// MCP-first investigator discovery — enables user-defined investigators
// (e.g., "perf-profiler" for performance-specific debugging)
let investigatorType = "hypothesis-investigator" // default
try {
const candidates = agent_search({
query: "hypothesis investigation debugging root cause analysis evidence",
phase: "goldmask",
category: "investigation",
limit: 5
})
Bash("mkdir -p tmp/.rune-signals && touch tmp/.rune-signals/.agent-search-called")
if (candidates?.results?.length > 0) {
// User-defined investigator can supplement default for specific debug contexts
const userInvestigator = candidates.results.find(c =>
(c.source === "user" || c.source === "project") &&
c.name !== "hypothesis-investigator"
)
if (userInvestigator) investigatorType = userInvestigator.name
}
} catch (e) {
// MCP unavailable — use default hypothesis-investigator
}
For each hypothesis, spawn an investigator agent (all investigators use general-purpose subagent type — agent specialization comes from the prompt, not from the subagent type):
Agent({
subagent_type: "general-purpose",
team_name: teamName,
name: "investigator-{N}",
model: resolveModelForAgent("hypothesis-investigator", talisman), // Cost tier mapping (references/cost-tier-mapping.md)
prompt: "You are assigned hypothesis {hypothesis_id}. Claim your task from the task list, investigate, and report findings to tmp/debug/{teamName}/{hypothesis_id}.md"
})
Use polling loop (see polling-guard skill):
pollIntervalMs = 30000
maxIterations = ceil(timeoutMs / pollIntervalMs)
for iteration in 1..maxIterations:
TaskList() // MUST call TaskList every cycle (POLL-001)
count completed tasks
if all tasks completed: break
if stale (no progress for 3 cycles): warn and continue
Bash("sleep 30")
Read all evidence reports from tmp/debug/{teamName}/:
for each hypothesis_id:
report = Read("tmp/debug/{teamName}/{hypothesis_id}.md")
parse: verdict, confidence, evidence[], cross_signals
Handle missing reports (investigator timeout): proceed with available reports, note gap.
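A hedged sketch of this parsing step, assuming each report carries Verdict: and Confidence: lines in its header (the field names are illustrative; the real format is defined in the investigator's agent instructions):

```typescript
// Parse one investigator's markdown report into structured fields.
// A missing report (investigator timeout) yields null; arbitration
// proceeds with whatever is available and notes the gap.
interface InvestigatorReport {
  hypothesisId: string
  verdict: string       // e.g. "SUPPORTED" | "FALSIFIED" | "INCONCLUSIVE"
  confidenceRaw: number // investigator's self-assessed confidence, 0..1
}

function parseReport(hypothesisId: string, text: string | null): InvestigatorReport | null {
  if (text === null) return null
  const verdict = text.match(/Verdict:\s*(\w+)/i)?.[1] ?? "INCONCLUSIVE"
  const confidenceRaw = Number(text.match(/Confidence:\s*([\d.]+)/i)?.[1] ?? 0.5)
  return { hypothesisId, verdict, confidenceRaw }
}
```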
Execute the deterministic arbitration algorithm from ach-methodology.md.
For each hypothesis with a report:
WES(H) = sum(supporting × tier_weight) - sum(refuting × tier_weight)
penalty = 0.4 × count(DIRECT cross-refutations)
FCS(H) = clamp((WES - penalty) × confidence_raw, 0, 1)
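A minimal sketch of this scoring math, reusing the EvidenceItem shape from the sketch above (thresholds and exact clamp semantics follow ach-methodology.md, which remains authoritative):

```typescript
// Weighted Evidence Score (WES) and Final Confidence Score (FCS)
// for a single hypothesis.
function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x))
}

function scoreHypothesis(
  evidence: EvidenceItem[],        // from the investigator's report
  directCrossRefutations: number,  // DIRECT refutations from OTHER investigators
  confidenceRaw: number,           // investigator's self-assessed 0..1
): number {
  const wes = evidence.reduce(
    (sum, e) => sum + (e.supports ? e.weight : -e.weight), 0)
  const penalty = 0.4 * directCrossRefutations
  return clamp((wes - penalty) * confidenceRaw, 0, 1)   // FCS
}
```

Worked through on the earlier example list: +1.0 (DIRECT) +0.6 (CORRELATIONAL) −0.2 (shallow ABSENCE, refuting) gives WES = 1.4; one DIRECT cross-refutation adds a 0.4 penalty; with confidence_raw = 0.8, FCS = clamp(1.0 × 0.8, 0, 1) = 0.8.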
Apply tiebreaker rules in order (see ach-methodology.md).
If no hypothesis clears the confidence threshold, re-triage:
reTriageCount += 1
if reTriageCount > maxReTriageRounds:
AskUserQuestion("All hypotheses scored below threshold. Here are the findings: {summary}. Can you provide additional context?")
// Use user input to generate new hypotheses or exit
else:
// Extract cross-hypothesis signals from all investigators
// Generate new hypotheses
// Return to Phase 1 (new investigation round)
Write verdict to tmp/debug/{teamName}/verdict.md:
# Debug Verdict — {teamName}
## Primary Hypothesis
**ID**: {winner_id}
**Statement**: {hypothesis}
**Confidence**: {HIGH|MEDIUM|LOW} (FCS: {score})
**Category**: {category}
## Evidence Summary
### Supporting (top 3 by weight)
1. [{evidence_id}] {description} — `{file:line}` (DIRECT, 1.0)
2. ...
### Refuting
- {refuting evidence if any}
## Refuted Alternatives
| Hypothesis | FCS | Reason |
|-----------|-----|--------|
| {H-XXX-NNN} | {score} | {key refuting evidence} |
## Cross-Hypothesis Signals
{signals that emerged during investigation}
## Recommended Fix
{Specific fix approach based on the winning hypothesis}
## Defense-in-Depth Layers
Based on category {category}, apply:
{layers from ach-methodology.md Defense-in-Depth Mapping}
Based on the verdict, implement the fix. Then verify by running the original failing test/reproduction:
Bash("{original_test_command}")
If the fix fails, check the runner-up hypothesis, then escalate (see the fallback table below).
Apply defensive layers per the failure category mapping in ach-methodology.md. Reference defense-in-depth.md for layer implementation details.
If talisman?.debug?.echo_on_verdict: persist the verdict to rune-echoes.
Standard 5-component team cleanup (QUAL-012): dynamic member discovery (fallback: investigator-{1..N}), shutdown broadcast, grace period, retry-with-backoff TeamDelete (4 attempts), process kill + filesystem fallback gated on !cleanupTeamDeleteSucceeded. Releases workflow lock. Then presents final summary (bug, hypothesis, fix, defense layers, rejected alternatives).
See phase-4-cleanup.md for the full cleanup pseudocode.
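phase-4-cleanup.md is authoritative; as an illustration of just the retry-with-backoff portion (assuming TeamDelete accepts the same team_name argument as TeamCreate, and with assumed backoff delays):

```typescript
// Retry TeamDelete up to 4 times with exponential backoff; fall back to
// process kill + filesystem cleanup only if every attempt fails.
let cleanupTeamDeleteSucceeded = false
for (let attempt = 1; attempt <= 4; attempt++) {
  try {
    await TeamDelete({ team_name: teamName })
    cleanupTeamDeleteSucceeded = true
    break
  } catch {
    if (attempt < 4) {
      // Illustrative delays: 1s, 2s, 4s between attempts
      await new Promise(resolve => setTimeout(resolve, 1000 * 2 ** (attempt - 1)))
    }
  }
}
if (!cleanupTeamDeleteSucceeded) {
  // process kill + filesystem fallback, then release the workflow lock
}
```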
When triggered by arc (test-failure-analyst returned LOW confidence):
- Skip the Phase 0 AskUserQuestion — use the test output and failure context from arc
- Write the verdict to tmp/debug/{teamName}/verdict.md for arc consumption
- Signal completion via tmp/.rune-signals/{arc-team}/debug-complete.signal

| Condition | Fallback |
|---|---|
| <2 hypotheses generated | Single-agent systematic-debugging |
| TeamCreate fails | Single-agent systematic-debugging |
| All investigators timeout | Partial arbitration with available reports |
| All hypotheses falsified (after max rounds) | Escalate to user with all evidence collected |
| Fix fails after dominant hypothesis confirmed | Check runner-up hypothesis, then escalate |