From harness-engineering
Runs or continues differential debugging sessions between implementations, traces, captures, or outputs. Records artifact identities, commands, mismatches, findings, validation, and next probes in durable session logs.
npx claudepluginhub alchemiststudiosdotai/harness-engineering
Slash command: /harness-engineering:differential-session-runner
Use this skill when debugging requires a **durable evidence trail** rather than ad hoc notes.
This skill is for workflows where you are comparing:
The goal is not only to investigate. The goal is to leave behind a session artifact another operator or agent can continue.
Use this skill when the user asks to:
Every differential investigation should produce a reusable evidence packet.
A good session artifact lets another operator answer:
If the repo already has a native evidence location, use it.
Examples:
- `docs/.../sessions/`
- `docs/chunks/`
- `analysis/.../reports/`
If the repo does not already have a native convention, write to:
- `memory-bank/evidence/YYYY-MM-DD_HH-MM-SS_<topic>-session.md`
- `memory-bank/evidence/index.md`
Before creating anything, search for:
Read the relevant guidance and continue the repo's existing pattern.
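When the repo has no native convention, the fallback filename can be built mechanically. The sketch below is one way to do it; the function name `session_path` and the assumption that `topic` is already a filesystem-safe slug are illustrative, not part of the skill.

```python
from datetime import datetime
from pathlib import Path


def session_path(topic: str, now: datetime, root: str = "memory-bank/evidence") -> Path:
    """Build <root>/YYYY-MM-DD_HH-MM-SS_<topic>-session.md.

    Assumption: `topic` is already a filesystem-safe slug such as "replay-drift".
    """
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return Path(root) / f"{stamp}_{topic}-session.md"
```

Passing `now` explicitly keeps the function deterministic; a caller would normally supply `datetime.now()`.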
Capture the strongest available identity for the artifact: a content hash (such as sha256), a commit, or a case ID.
If a content hash is possible, record it early and use it as the main session identity.
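A content hash can be computed with the standard library alone. This is a minimal sketch; the helper name `artifact_sha256` and the 1 MiB chunk size are assumptions chosen for illustration.

```python
import hashlib


def artifact_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The file is read in binary mode so that line-ending translation cannot change the identity between platforms.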
Search existing sessions for the artifact identity.
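One way to run that search is a plain substring scan over the evidence directory. The sketch assumes sessions are Markdown files that mention the identity verbatim; `find_sessions` is a hypothetical helper, not something the skill ships.

```python
from pathlib import Path


def find_sessions(identity: str, root: str) -> list[Path]:
    """Return session files under `root` whose text contains the artifact identity."""
    matches = []
    for path in sorted(Path(root).rglob("*.md")):
        # errors="replace" keeps the scan going if a file is not valid UTF-8.
        if identity in path.read_text(encoding="utf-8", errors="replace"):
            matches.append(path)
    return matches
```

If the result is non-empty, continue the most recent matching session rather than starting a new one.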
Capture the exact commands used for the first comparison step.
Examples:
python compare.py --baseline out/a.json --candidate out/b.json
pytest tests/test_replay.py -k case_17
mytool diff trace_a.cdt trace_b.cdt
Never summarize commands loosely. Record them exactly.
Capture the first relevant divergence and, if applicable, how it moved over time.
Examples:
If later fixes move the mismatch frontier, append the new progression rather than deleting the old one.
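For outputs that can be treated as ordered sequences (lines, events, records), the first divergence can be located as sketched below. The helper `first_mismatch` is illustrative and assumes items are comparable with `!=`; a side that ran out of items is reported as `None`.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking "this side has no item at this index"


def first_mismatch(baseline, candidate):
    """Return (index, baseline_item, candidate_item) at the first divergence, or None."""
    pairs = zip_longest(baseline, candidate, fillvalue=_MISSING)
    for index, (left, right) in enumerate(pairs):
        if left != right:
            return (
                index,
                None if left is _MISSING else left,
                None if right is _MISSING else right,
            )
    return None
```

For example, `first_mismatch(["a", "b", "c"], ["a", "x", "c"])` returns `(1, "b", "x")`. Rerunning it after each fix and appending the result to the progression shows how the mismatch frontier moved.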
Write findings as evidence-backed observations, not guesses.
Good findings:
If code changes are made during the investigation, capture them in a separate section:
If no changes were made, state that explicitly.
List the validation commands and their results.
Examples:
Do not write "fixed" without a validation section.
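A validation entry can be produced by running the command and recording it alongside its result. The sketch below assumes the exit code is an acceptable result summary; `run_and_record` is a hypothetical helper, and the line format mirrors the `<command>` → `<result>` shape used in the session template.

```python
import shlex
import subprocess
import sys


def run_and_record(argv: list[str]) -> str:
    """Run `argv` and return a log line with the exact command and its exit code."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    # shlex.join reproduces the command with shell-safe quoting, so it can be re-run as written.
    return f"- `{shlex.join(argv)}` → `exit {completed.returncode}`"
```

A call such as `run_and_record([sys.executable, "-m", "pytest", "tests/test_replay.py", "-k", "case_17"])` yields one line for the Validation section. Captured stdout and stderr are available on the `completed` object if the session needs more than the exit code.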
Every session should end with one of: cleared, a next probe, or a blocked reason.
---
title: "<topic> – Differential Session"
phase: Evidence
date: "YYYY-MM-DD HH:MM:SS"
owner: "<agent_or_user>"
tags: [evidence, differential, <topic>]
---
## Artifact
- Path: `<path>`
- Identity: `<sha256|commit|case-id>`
## Baseline Commands
- `<exact command 1>`
- `<exact command 2>`
## First Mismatch Progression
- baseline: `<first mismatch>`
- after fix 1: `<new frontier>`
- after fix 2: `<cleared|new mismatch>`
## Key Findings
- finding 1
- finding 2
## Landed Changes
- `path/to/file` → change summary
- `tests/...` → validation coverage added
## Validation
- `<command>` → `<result>`
- `<command>` → `<result>`
## Outcome / Next Probe
- `<cleared | next probe | blocked reason>`
A completed session artifact should make handoff possible with no hidden context.
It must include:
After updating the session artifact:
- `plan-phase`
- `execute-phase`