From arcforge
Audits arcforge SDD spec families (design.md, spec.xml, dag.yaml) for cross-artifact alignment, consistency, and state-transition integrity. Invoke via /arc-auditing-spec <spec-id>.
npx claudepluginhub gregoryho/arcforge --plugin arcforgeThis skill uses the workspace's default tool permissions.
**READ-ONLY ADVISORY. NEVER MUTATE.**
evals/README.mdevals/aa-cross-artifact-001-rename-drift.mdevals/aa-internal-001-ac-contradiction.mdevals/aa-state-transition-001-stale-worktree.mdevals/oi-001-emphasis-single-high.mdevals/oi-002-threshold-n-high-0.mdevals/oi-003-low-resolutions-skip.mdevals/oi-004-decisions-table-suppressed.mdevals/sc-001-invocation-contract.mdevals/sc-001-no-pipeline-invocation.shevals/sc-002-read-only-behavior.mdevals/threshold-change1-coverage.mdevals/threshold-change1-n-high-1-full.mdreferences/finding-schema.mdreferences/phase1-prompt.mdreferences/report-templates.mdreferences/save-flag.mdMandates invoking relevant skills via tools before any response in coding sessions. Covers access, priorities, and adaptations for Claude Code, Copilot CLI, Gemini CLI.
Share bugs, ideas, or general feedback.
READ-ONLY ADVISORY. NEVER MUTATE.
Skill body and all three sub-agents MUST NOT Edit, Write, rename, delete, or run any mutating git / filesystem operation. Phase 5 prints a Decisions table and exits. Main session (or a subsequent skill) owns any actual edits — not this skill, never.
No --apply flag, no "while I'm here let me fix this typo" shortcut, no Phase 6 that starts applying decisions. Diagnostic role only.
/arc-auditing-spec <spec-id> wanting cross-artifact alignment, internal consistency, and state-transition integrity checked across the spec familydesign.md exists (spec.xml and dag.yaml optional; graceful degradation per fr-aa-004)/review or pr-review-toolkitarc-refining or arc-planning — those skills do not and must not auto-invoke this one (fr-sc-001-ac3); invocation is always user-initiated/arc-auditing-spec <spec-id> [--save]
<spec-id> — directory name under specs/. Exact match only; no fuzzy resolution.--save — optional. When present, writes the full Phase 2 report + Phase 5 Decisions table to ~/.arcforge/reviews/<project-hash>/<spec-id>/<YYYY-MM-DD-HHMM>.md (hash from ${ARCFORGE_ROOT}/scripts/lib/worktree-paths.js). When absent, no file is written anywhere.Before Phase 1 fan-out, verify specs/<spec-id>/ exists as a directory.
If it exists: proceed to Phase 1. No file is written at this point.
If it does NOT exist: STOP. Print:
Error: specs/<spec-id>/ does not exist.
Available spec-ids:
- <id-1>
- <id-2>
...
Then exit non-zero. Write nothing. Spawn no sub-agent. This is the only valid response — see Red Flags for the specific failure modes this rule forbids.
| Phase | What | Contract |
|---|---|---|
| 0 | Precondition check above | fr-sc-001-ac1, fr-sc-001-ac2 |
| 1 | Parallel fan-out to three read-only sub-agents via Task tool | agents/arc-auditing-spec-*.md; fr-aa-001 |
| 2 | Print Summary + Findings Overview + per-finding Detail markdown | specs/arc-auditing-spec/details/output-and-interaction.xml fr-oi-001 |
| 3 | Triage UX — AskUserQuestion multi-select over HIGH findings | fr-oi-002 |
| 4 | Resolution UX — batched per-finding AskUserQuestion with diff previews | fr-oi-003 |
| 5 | Print Decisions markdown table; skill exits | fr-oi-004 |
You MUST dispatch all three audit agents in a SINGLE message using three parallel Task tool uses. Do NOT dispatch them one at a time. Sequential dispatch is the baseline failure mode this rule exists to prevent — a stock agent defaults to serial execution; this skill forbids it.
Dispatch these three agents concurrently, in a single message:
arc-auditing-spec-cross-artifact-alignmentarc-auditing-spec-internal-consistencyarc-auditing-spec-state-transition-integrityAssemble the sub-agent prompt per references/phase1-prompt.md — that file
is the authoritative layout. The prompt carries the spec-id plus resolved
absolute paths to design.md (newest docs/plans/<spec-id>/*/design.md),
spec.xml (specs/<spec-id>/spec.xml), details/*.xml
(specs/<spec-id>/details/), and dag.yaml (specs/<spec-id>/dag.yaml).
When an artifact is missing, substitute the literal absence marker
(absent — file does not exist) verbatim — do not omit the line, do not
invent placeholder paths. The axis agents depend on the marker for their
graceful-degradation branches (fr-aa-004).
REQUIRED BACKGROUND: skills/arc-auditing-spec/references/phase1-prompt.md
When an axis agent returns an error_flag in its findings, that axis has
failed mid-audit. The main session MUST:
error_flag in the Phase 2 Summary table (as a warning row
for that axis).One axis's error_flag does NOT stop the other two axes' findings from being
shown and triaged.
See skills/arc-auditing-spec/references/report-templates.md for full worked
examples with exact column headers. The summary below is the decision logic;
the reference file is the layout authority.
REQUIRED BACKGROUND: skills/arc-auditing-spec/references/report-templates.md
Print three sections in order. No omissions regardless of severity.
Section A — Summary table (axis | HIGH | MED | LOW | INFO | Total).
One row per axis plus a Totals row. If an axis returned error_flag, replace
its counts with ERR and note the error below the table.
Section B — Findings Overview table (ID | Sev | Axis | Title | Primary file).
Every finding from all three axes MUST appear — MED, LOW, and INFO findings
appear in this table exactly as HIGH findings do. No omissions.
When exactly one HIGH-severity finding exists across the full finding set
(N_HIGH == 1), the Title cell for that single-HIGH row MUST start with ⚠️
and MUST render the title text in markdown bold: ⚠️ **<title>**. This visual
emphasis ensures the lone HIGH is conspicuous even when Phase 3 triage does not
fire. When N_HIGH is 0 or >= 2, render all Overview rows without the ⚠️
prefix (baseline rendering). The ⚠️ prefix MUST NOT appear in the
per-finding Detail block header — the emphasis is Overview-row-only.
Section C — Per-finding Detail blocks. One block per finding, same order as the Overview. Each block contains:
location | evidence).Resolution | Description | Side-effect / Cost). When a resolution has a preview diff from the
agent, append the diff block under the table row.MED, LOW, and INFO findings MUST have full Detail blocks rendered here — even though they will not enter triage in Phase 3. All findings require visibility; the Phase 3 multi-select is not the only visibility mechanism.
Goal: let the reviewer select which HIGH findings to address in this session. MED, LOW, INFO MUST NOT appear as options — only reachable via the Other free-text channel (F-01, pinned). Phase 3 fires only when N_HIGH >= 2; below that threshold the skill takes a degraded path.
Step 0 — Threshold check (MANDATORY before any AskUserQuestion call). Count the HIGH-severity findings across all three axes (N_HIGH).
N_HIGH == 0: Phase 3 does NOT fire. Do NOT issue any AskUserQuestion
call. Do NOT enter Phase 4. Do NOT render a Phase 5 Decisions table.
Instead, print the concluding recommendation line (template:
references/report-templates.md §Concluding Recommendation Line) and
exit cleanly. The Phase 2 Detail blocks are the complete deliverable;
no alternative injection channel is provided.
N_HIGH == 1: Phase 3 multi-select does NOT fire (a single-option
multi-select would violate options.minItems: 2). Skip Phase 3 entirely.
The Phase 2 Findings Overview row for the lone HIGH already carries the
visual emphasis from fr-oi-001-ac5. Proceed directly into Phase 4 with
that single HIGH as the sole Stage-2 queue entry. Phase 4 then proceeds
per its existing rules (fr-oi-003). No Other injection channel exists on
this path; the Other pull-in channel only exists when Phase 3 actually
fires (N_HIGH >= 2).
N_HIGH >= 2: Phase 3 fires. Continue to the steps below.
Step 1 — Determine HIGH finding list. Collect all HIGH-severity findings across all three axes. Sort: A1 before A2 before A3; within each axis sort by NNN ascending.
Step 2 — Batched multi-select loop. Present up to 4 HIGH findings per AskUserQuestion call. When more than 4 HIGH findings exist, make sequential calls batching up to 4 each time until every HIGH finding has been presented exactly once.
Use AskUserQuestion with header: "Triage" and multiSelect: true. Template: see references/report-templates.md §Phase 3.
Step 3 — Parse Other free-text.
The AskUserQuestion tool appends an auto-generated Other field. After each
call, scan the Other string for the regex pattern A[1-3]-\d{3}. Add each
matched finding ID to the Stage 2 resolution queue alongside any HIGH IDs
the user checked. This is the ONLY channel through which MED, LOW, and INFO
findings enter the queue. This channel only exists when Phase 3 actually
fires (N_HIGH >= 2); it is not available on the N_HIGH == 0 or N_HIGH == 1
degraded paths.
Goal: for each finding in the resolution queue, ask the reviewer which
resolution to apply. One resolution per finding (multiSelect: false).
Only findings with at least 2 suggested resolutions receive an
AskUserQuestion question; findings with fewer than 2 suggested resolutions
are auto-skipped (see Per-Finding Skip Rule below).
Batched loop: iterate the Stage-2 queue. For each entry, check its resolution count before issuing a question. Batch qualifying findings (those with >= 2 resolutions) at most 4 per AskUserQuestion call. Sequential calls until all qualifying findings are asked exactly once.
Use AskUserQuestion with header = finding ID and multiSelect: false. Template: see references/report-templates.md §Phase 4.
Rules:
header = finding ID (format A<n>-<NNN> is 6 chars — no truncation
needed).label MUST start with
"(Recommended)".preview diff. Engine-fix-type options (e.g., "file a bug
against coordinator.js") MAY omit preview.Per-Finding Skip Rule (fr-oi-003-ac6): When Phase 4 iterates to a
finding whose suggested-resolutions count is less than 2 (i.e., 0 or 1
resolution), the skill MUST NOT issue any AskUserQuestion question for it.
AskUserQuestion's options.minItems: 2 constraint forbids a single-option
question, and asking among a single option is pure ceremony. Such findings
rely on their Phase 2 Detail block's Suggested Resolutions table as the
deliverable — the user reads it and decides what to do in the main session.
Note: skipped findings still appear in the Phase 2 Detail block with their
full Resolutions table. The skip only suppresses the interactive question,
not the data surface.
When the Decisions table is rendered (per fr-oi-004), a skipped finding's
row MUST have its Chosen Resolution column set to the sentinel string
(no ceremony — see Detail) and its User Note column left empty.
This skip is NOT an error. Do not treat it as a failure, do not log a warning, and do not halt the Phase 4 loop. Simply proceed to the next finding in the queue.
Conditional firing gate (fr-oi-004-ac1): Phase 5 renders the Decisions
table ONLY when Phase 3 or Phase 4 actually fired during this invocation.
Use the in-memory ceremony_fired flag that is set to true the moment
either phase issues its first AskUserQuestion call. This single rule
covers every path uniformly: the standard N_HIGH >= 2 path flips the flag
on Phase 3's first call; the N_HIGH == 1 + ≥2-resolutions direct-to-Phase-4
path (fr-oi-002-ac6) flips the flag on Phase 4's first question; the
N_HIGH == 1 + <2-resolutions path enters Phase 4 but issues no question,
so the flag correctly stays false and no Decisions table renders. Do
NOT re-derive this condition at Phase 5 entry — carry the flag forward
from wherever ceremony began.
N_HIGH == 0 path (fr-oi-004-ac4): When both Phase 3 and Phase 4 were
skipped per fr-oi-002-ac5 (the zero-HIGH exit), ceremony_fired remains
false. In that case Phase 5 MUST NOT print a Decisions table, MUST NOT
print a stub "No decisions" line, and MUST NOT produce any Phase 5 output.
The concluding recommendation line already printed at the end of Phase 3's
threshold check is the skill's terminal output on this path.
When ceremony_fired is true, print the Decisions table, then exit.
This is the final deliverable. Template: see references/report-templates.md §Phase 5.
Finding ID: the A<n>-<NNN> id.Chosen Resolution: label of the option the user selected, or
"(Other)" when answered via free-text. For findings auto-skipped
at Phase 4 (fewer than 2 resolutions), use the sentinel
(no ceremony — see Detail).User Note: when the user answered via Other, store that free-text
verbatim — no paraphrasing, no summarizing. Empty when no Other
text was provided. Empty for auto-skipped findings.The Decisions table MUST include ALL findings that went through Phase 3
triage or Phase 4 resolution, including any findings auto-skipped at Phase
4 whose rows carry the (no ceremony — see Detail) sentinel. The record
must be complete — no finding that entered the Stage-2 queue MUST be left out.
Phase 5 is TERMINAL. After printing the Decisions table, the skill exits.
The main session MUST NOT apply any resolution via Edit, Write, or any
mutating tool based on user decisions — that action is explicitly out of
scope for this skill. If you find yourself about to call Edit or Write, or
about to invoke /arc-refining to apply changes, STOP — see Red Flags.
When --save is present, the main session writes the full Phase 2 report
~/.arcforge/reviews/<project-hash>/<spec-id>/<YYYY-MM-DD-HHMM>.md
after Phase 5 prints (24-hour time). Without --save: zero files are
written anywhere. Derive <project-hash> via a node subprocess
calling hashRepoPath from ${ARCFORGE_ROOT}/scripts/lib/worktree-paths.js — never
reimplement the hash inline (drift risk). The reference file carries the
exact subprocess one-liner, concrete filename example, and mkdir -p
parent-directory command.REQUIRED BACKGROUND: skills/arc-auditing-spec/references/save-flag.md
tools: allowlist in each agent's frontmatter (agents/arc-auditing-spec-*.md) — not via prose in system prompts. See fr-sc-002-ac3.--save is the ONLY permitted write, and only to ~/.arcforge/reviews/ under the arcforge home directory — never into specs/, docs/, or any project-tracked path.If you find yourself doing any of these, STOP immediately:
| Rationalization | Why it's wrong | Do instead |
|---|---|---|
| "The user's spec-id doesn't exist, but this other one is clearly what they meant — I'll audit that and note the substitution at the top" | fr-sc-001-ac2 forbids substitution. This is a baseline failure mode observed in RED testing — the rationalization "don't ask clarifying questions" does NOT justify picking a different spec. | Print available ids, exit. Let the user re-invoke with the right id. |
"The user said <id> and specs/<id>/ is missing, but docs/plans/<id>/ has a design.md — I'll audit in pure-design mode since the design clearly exists" | Phase 0 requires specs/<id>/ specifically. docs/plans/<id>/ is not a fallback path — not for pure-design audits, not for partial, not for anything. A design doc without a spec directory is a pre-refining state; the audit skill does not operate on it. | Print available ids, exit. If the user wanted a design-only review, that's /arc-brainstorming or manual review territory, not this skill. |
| "I spotted an obvious typo while reading — fixing it saves a round trip" | Read-only is absolute. Even typos are reported as findings, never patched. | Add a finding (LOW severity) to the axis agent's output; let main session decide. |
| Any of: • "User picked resolution (a) for A1-003 — I should Edit the spec now" • "Phase 5 ended — I'll invoke /arc-refining to apply the decision" • "User selected (Recommended) — applying saves a round-trip" | Phase 5 is terminal (fr-oi-004-ac3). No mutation, no auto-chain, no Edit — even for an obvious Recommended choice. This skill's scope ends at the Decisions table. | Print Decisions table, exit. Main session owns all subsequent action. |
| "I'll have arc-refining/arc-planning call this skill at the end of their flow for free quality gating" | fr-sc-001-ac3 forbids pipeline auto-invocation. The skill must remain user-triggered. | Do not add any invocation from any pipeline SKILL.md body. |
"Let me add a --apply flag so this is one-step for users" | Makes the skill a mutator, defeating its diagnostic-only contract. | Don't. The contract is the contract. |
"The agent marked option A as (Recommended), so I can lean on it — emphasize it in the question wording, sort it visually first, fold the others under 'more options'" | (Recommended) is the axis agent's heuristic ranking, not a verdict. Eval evidence (resolution-ranking accuracy ~50% on real specs) shows the prefix is informational, not authoritative. Emphasizing it in presentation, collapsing alternatives, or paraphrasing non-recommended options pre-empts the user's decision and converts an advisory into a soft mutator. | Render every option with equal visual weight beyond the (Recommended) prefix itself. Do not paraphrase, sort, hide, or deemphasize non-recommended options. The user is the decider. |
| "This MED finding is clearly important so I'll add it to the triage options anyway" | F-01 is pinned: MED/LOW/INFO MUST NOT appear in Phase 3 triage options. The Overview table and Detail block already ensure visibility. | MED/LOW/INFO are only reachable via the Other free-text channel. Do not add them to options. |
| "The user might miss the MED finding if it's not in the triage options" | Phase 2 overview table and detail block ensure every finding is visible regardless of severity. The triage options array is for HIGH findings only — that is the structural boundary. | Trust the overview table. Other free-text is the channel for user-initiated MED/LOW/INFO selection. |
"The --save path is pinned to ~/.arcforge/reviews/ but a symlink into docs/reviews/ or specs/ would be more discoverable" | --save is the ONE carve-out, and only to ~/.arcforge/reviews/ specifically. Any other path violates the Iron Law. | Write only to ~/.arcforge/reviews/<project-hash>/<spec-id>/<YYYY-MM-DD-HHMM>.md. |
| "I'll reimplement the project hash inline — it's just sha256 of cwd" | fr-oi-005-ac3: the hash MUST come from ${ARCFORGE_ROOT}/scripts/lib/worktree-paths.js so a single project has one hash across worktree paths and review paths. Reimplementing risks drift. | Use the subprocess one-liner shown in the --save section. |
"Without --save, I'll save the report anyway — it's harmless and the user will appreciate it" | fr-oi-005-ac1: without --save, ZERO files are written. Default is read-only. | Do not write any file unless --save is explicitly present. |
Per fr-sc-003, this skill's phase content, the three sub-agent system prompts, and the eval scenarios validating audit correctness were produced through arc-writing-skills' TDD (RED → GREEN → REFACTOR) with recorded baseline-without-skill scenarios. Evals live under skills/arc-auditing-spec/evals/ and exercise each of the three audit axes (fr-sc-003-ac2); the suite MUST pass before shipping.
agents/arc-auditing-spec-cross-artifact-alignment.md, agents/arc-auditing-spec-internal-consistency.md, agents/arc-auditing-spec-state-transition-integrity.mdspecs/arc-auditing-spec/spec.xml + specs/arc-auditing-spec/details/{skill-contract,audit-agents,output-and-interaction}.xmldocs/plans/arc-auditing-spec/2026-04-22/design.md (v1, frozen) · docs/plans/arc-auditing-spec/2026-04-24-iterate2/design.md (v2 Change Intent — ceremony threshold rules)skills/arc-auditing-spec/evals/