Runs or continues differential debugging sessions between implementations, traces, captures, or outputs. Records artifact identities, commands, mismatches, findings, validation, and next probes in durable session logs.
Use this skill when debugging requires a durable evidence trail rather than ad hoc notes.
This skill is for workflows where you are comparing two implementations, two traces, two captures, or two outputs of the same behavior.
The goal is not only to investigate. The goal is to leave behind a session artifact another operator or agent can continue.
Use this skill when the user asks to compare implementations, diff traces or outputs, or continue a previous differential debugging session.
Every differential investigation should produce a reusable evidence packet.
A good session artifact lets another operator answer: which artifact was examined, which exact commands reproduced the comparison, where the first mismatch was, what was found, what was validated, and what to probe next.
If the repo already has a native evidence location, use it.
Examples:

- `docs/.../sessions/`
- `docs/chunks/analysis/.../reports/`

If the repo does not already have a native convention, write to:

- `memory-bank/evidence/YYYY-MM-DD_HH-MM-SS_<topic>-session.md`
- `memory-bank/evidence/index.md`

Before creating anything, search the repo for existing guidance on where evidence lives.
Read the relevant guidance and continue the repo's existing pattern.
Capture the strongest available identity for the artifact: a content hash (sha256), a commit, or a case ID.
If a content hash is possible, record it early and use it as the main session identity.
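A content hash can be computed in a few lines. This is a minimal sketch; the function name and chunk size are illustrative, and the artifact path is a throwaway example:

```python
import hashlib
import tempfile
from pathlib import Path

def artifact_identity(path: Path) -> str:
    """Hash file contents in chunks so large artifacts stay cheap to identify."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical artifact, created here only so the sketch is runnable.
artifact = Path(tempfile.mkdtemp()) / "out_a.json"
artifact.write_text('{"frames": 17}')
print(artifact_identity(artifact))
```

The hex digest is stable across runs for the same bytes, which is what makes it usable as the main session identity.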
Search existing sessions for the artifact identity.
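The search can be as simple as scanning the evidence directory for the identity string. A sketch, assuming the `memory-bank/evidence` naming convention above; the helper name and file contents are illustrative:

```python
import tempfile
from pathlib import Path

def find_prior_sessions(evidence_dir: Path, identity: str) -> list[Path]:
    """Return existing session logs that already mention this artifact identity."""
    return [f for f in sorted(evidence_dir.glob("*-session.md"))
            if identity in f.read_text()]

# Throwaway evidence directory with one prior session, for illustration.
evidence = Path(tempfile.mkdtemp())
(evidence / "2024-01-02_03-04-05_replay-session.md").write_text("Identity: `abc123`")
print(find_prior_sessions(evidence, "abc123"))
```

If this returns a prior session, continue it rather than starting a new one.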
Capture the exact commands used for the first comparison step.
Examples:

- `python compare.py --baseline out/a.json --candidate out/b.json`
- `pytest tests/test_replay.py -k case_17`
- `mytool diff trace_a.cdt trace_b.cdt`
Never summarize commands loosely. Record them exactly.
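When commands are assembled programmatically, render them exactly rather than paraphrasing. One way to do this, sketched with the standard library's `shlex.join`:

```python
import shlex

def record_command(argv: list[str]) -> str:
    """Render argv as the exact shell-safe command line for the session log."""
    return shlex.join(argv)

line = record_command(
    ["python", "compare.py", "--baseline", "out/a.json", "--candidate", "out/b.json"]
)
print(f"- `{line}`")
```

`shlex.join` quotes any argument that needs it, so the logged line can be pasted back into a shell and reproduce the run byte-for-byte.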
Capture the first relevant divergence and, if applicable, how it moved over time.
Examples:

- baseline: first mismatch at `<location>`
- after fix 1: mismatch moves to `<new frontier>`
- after fix 2: `<cleared|new mismatch>`
If later fixes move the mismatch frontier, append the new progression rather than deleting the old one.
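The append-only rule can be sketched as a small helper; the entry wording and step numbers are hypothetical:

```python
progression = ["baseline: first mismatch at `case_17`, step 42"]

def advance_frontier(progression: list[str], label: str, observation: str) -> list[str]:
    """Append the new frontier instead of rewriting earlier entries."""
    progression.append(f"{label}: {observation}")
    return progression

advance_frontier(progression, "after fix 1", "mismatch moves to step 96")
advance_frontier(progression, "after fix 2", "cleared")
print("\n".join(f"- {entry}" for entry in progression))
```

Keeping the old entries preserves the history of how each fix moved the frontier, which is itself evidence.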
Write findings as evidence-backed observations, not guesses.
Good findings cite the exact command that was run, the output observed, and the location of the divergence.
If code changes are made during the investigation, capture them in a separate Landed Changes section: each file path with a change summary, plus any validation coverage added.
If no changes were made, state that explicitly.
List the validation commands and their results.
Examples:

- `pytest tests/test_replay.py -k case_17` → `passed`
- `python compare.py --baseline out/a.json --candidate out/b.json` → `no mismatches`
Do not write "fixed" without a validation section.
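Validation results can be captured mechanically so the recorded outcome always matches what actually ran. A sketch using `subprocess.run`; the trivial pass/fail commands stand in for real test invocations:

```python
import subprocess
import sys

def run_validation(commands: list[list[str]]) -> list[str]:
    """Run each validation command and record an exact `command -> result` line."""
    lines = []
    for argv in commands:
        proc = subprocess.run(argv, capture_output=True, text=True)
        result = "pass" if proc.returncode == 0 else f"fail (exit {proc.returncode})"
        lines.append(f"- `{' '.join(argv)}` → `{result}`")
    return lines

report = run_validation([
    [sys.executable, "-c", "pass"],
    [sys.executable, "-c", "raise SystemExit(1)"],
])
print("\n".join(report))
```

Because the result string is derived from the exit code, a session log built this way cannot claim "fixed" for a command that failed.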
Every session should end with one of: a cleared outcome, a concrete next probe, or a blocked reason.
```markdown
---
title: "<topic> – Differential Session"
phase: Evidence
date: "YYYY-MM-DD HH:MM:SS"
owner: "<agent_or_user>"
tags: [evidence, differential, <topic>]
---

## Artifact
- Path: `<path>`
- Identity: `<sha256|commit|case-id>`

## Baseline Commands
- `<exact command 1>`
- `<exact command 2>`

## First Mismatch Progression
- baseline: `<first mismatch>`
- after fix 1: `<new frontier>`
- after fix 2: `<cleared|new mismatch>`

## Key Findings
- finding 1
- finding 2

## Landed Changes
- `path/to/file` → change summary
- `tests/...` → validation coverage added

## Validation
- `<command>` → `<result>`
- `<command>` → `<result>`

## Outcome / Next Probe
- `<cleared | next probe | blocked reason>`
```
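Scaffolding a new session file from this template can be automated. A sketch, assuming the `memory-bank/evidence` convention; the `new_session` helper and the topic/owner values are illustrative:

```python
import tempfile
from datetime import datetime
from pathlib import Path

FRONTMATTER = """---
title: "{topic} – Differential Session"
phase: Evidence
date: "{stamp}"
owner: "{owner}"
tags: [evidence, differential, {topic}]
---
"""

def new_session(evidence_dir: Path, topic: str, owner: str) -> Path:
    """Create a timestamped session file with the template frontmatter."""
    now = datetime.now()
    path = evidence_dir / f"{now.strftime('%Y-%m-%d_%H-%M-%S')}_{topic}-session.md"
    path.write_text(FRONTMATTER.format(
        topic=topic,
        stamp=now.strftime("%Y-%m-%d %H:%M:%S"),
        owner=owner,
    ))
    return path

# Throwaway directory, for illustration.
evidence = Path(tempfile.mkdtemp())
session = new_session(evidence, "replay-diff", "agent")
print(session.name)
```

The timestamped filename doubles as a sortable index, so `index.md` can list sessions in creation order.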
A completed session artifact should make handoff possible with no hidden context.
It must include the artifact identity, the exact baseline commands, the mismatch progression, key findings, landed changes, validation results, and an outcome or next probe.
After updating the session artifact, hand off to the next phase skill:

- `plan-phase`
- `execute-phase`