v0.8.1 — codifies the three-tier reporting vocabulary (VERIFIED / ASSUMED / UNTESTED) and explains how to respond when the honest-reporting Stop/PreToolUse hook fires. Activated automatically by brainstorming, writing-plans, subagent-driven-development, autopilot, test-driven-development, and systematic-debugging.
How this skill is triggered — by the user, by Claude, or both
Slash command
/codex-paired-superpowers:honest-reportingThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
The orchestrator (Claude) has a tendency to confidently report success on things it hasn't actually checked — "tests pass" without running them, "install path verified" when only checked locally, "both parts of the release gate cleared" when only one was actually exercised. The v0.8.1 honest-reporting hook surfaces this mechanically: a deterministic scanner over the last assistant turn (or rece...
The orchestrator (Claude) has a tendency to confidently report success on things it hasn't actually checked — "tests pass" without running them, "install path verified" when only checked locally, "both parts of the release gate cleared" when only one was actually exercised. The v0.8.1 honest-reporting hook surfaces this mechanically: a deterministic scanner over the last assistant turn (or recent turns, on git tag / git push / gh release create / npm publish) checks for high-precision claim vocabulary and requires nearby evidence.
This skill defines the vocabulary you should reach for so the hook stays out of the way — and so reports remain trustworthy when the hook isn't active.
VERIFIED — I ran a tool in this turn (or the immediately prior turn) and its output established this. Cite the tool name and the relevant output. Example: "VERIFIED: npm test exited 0 with 88 tests passing (see test output above)."
ASSUMED — based on prior session state or unverified inference. Honest about the lack of evidence in this turn. Example: "ASSUMED stable from prior session: the v0.8.0 install was confirmed at session start; I have not re-checked this turn."
UNTESTED — deferred or known-not-yet-verified. Useful for surfacing coverage gaps. Example: "UNTESTED: cross-platform (Linux/Windows); only macOS was exercised."
The hook activates when <repo-root>/.codex-paired/honest-reporting-active.json is present and its expiresAt is in the future. It fires on:
git tag, git push, gh release create, or npm publish. Scans the last 1–2 assistant turns the same way.The exit-2 message names each unsourced claim and asks you to rewrite the turn citing the tool call OR reclassifying as ASSUMED / UNTESTED.
The scanner strips these before scanning:
``` ... ```)<<<VERDICT>>> ... <<<END>>>)## Machine Result sections> ...)the flagged word "shipped" is a meta-mention,
not a claim (v0.15.0; this is what previously caused re-block loops when a rewrite
discussed the word the hook flagged)So you can quote user text or include verdict blocks without false positives.
Anywhere in the same message as a claim match (v0.15.0 — previously the window was "same paragraph ±200 chars", which punished the honest tool-output-then-summary shape):
Bash, Edit, Read, Write, gh , git , node , npm , pnpm, pytest, jest, etc.dir/file.ext:42)ASSUMED, UNTESTED, not verified, haven't checked, based on prior session, recall from memoryran (e.g., "ran npm test")exit code| Bad (will fire) | Good (will pass) |
|---|---|
| "Tests pass and we're shipped." | "VERIFIED: npm test exited 0; tag pushed via git push --tags." |
| "v0.8.0 shipped cleanly." | "v0.8.0 shipped cleanly — ASSUMED stable from prior session (last release-gate run was 2026-05-10)." |
| "Install verified across platforms." | "VERIFIED on macOS: claude plugin install returned Version 0.8.0. UNTESTED on Linux/Windows." |
| "Everything is deployed." | "Deployed to staging (ASSUMED based on the run earlier; not re-checked this turn). UNTESTED on production." |
The marker file is <repo-root>/.codex-paired/honest-reporting-active.json:
{
"skillName": "autopilot",
"sessionStartedAt": "2026-05-11T14:30:00.000Z",
"expiresAt": "2026-05-11T22:30:00.000Z",
"specPath": "/abs/path/to/spec.md"
}
Default TTL: 8 hours. After expiry, the hook is inactive. Skills GC stale markers (>24h old) on their next entry.
The marker is written by the entry block of each codex-paired skill — they invoke:
node "$CLAUDE_PLUGIN_ROOT/lib/codex-bridge/cli.js" honest-reporting-mark-active --skill <name> [--spec <path>] [--ttl-hours N]
Clearing on completion (v0.15.0). The TTL is the backstop, not the lifecycle. When the
workflow finishes — the loop double-SHIPs, autopilot writes halt_reason: "completed" or halts,
or the skill's work is otherwise wrapped up — clear the marker so the hook stops policing
unrelated work in the same repo for the rest of the TTL window:
node "$CLAUDE_PLUGIN_ROOT/lib/codex-bridge/cli.js" honest-reporting-clear
This mirrors autopilot's anchor-clear. Sessions that die without cleanup are still covered by
TTL expiry. The marker also now resolves slice worktrees (.git-worktrees/slice-N) to the main
repo root, so implementer subagents are policed by the same marker.
One block per stop (v0.15.0). The Stop hook honors stop_hook_active: it blocks a turn at
most once. The rewrite goes through even if imperfect — fix the substance, don't fight the hook.
When rewriting, do NOT quote or discuss the flagged word; state the evidence or reclassify.
The hook respects normal-user freedom. Outside of codex-paired skill sessions, you can use any phrasing; the scanner doesn't run. The discipline is still good practice, but only the codex-paired workflows enforce it.
If a particular session needs to bypass the hook (e.g., demos, deliberate prose writing about past releases), delete or expire the marker:
rm <repo-root>/.codex-paired/honest-reporting-active.json
Re-running any codex-paired skill re-creates it.
npx claudepluginhub mkritter3/codex-paired-superpowers --plugin codex-paired-superpowersProvides behavioral guidelines to reduce common LLM coding mistakes, focusing on simplicity, surgical changes, assumption surfacing, and verifiable success criteria.
Searches, retrieves, and installs Agent Skills from prompts.chat registry using MCP tools like search_skills and get_skill. Activates for finding skills, browsing catalogs, or extending Claude.