This skill should be invoked when the user asks to "run a consensus pipeline", "validate a phase with consensus", "run consensus gates", "check with three agents", "unanimous validation", "audit with Lead Alpha Bravo", or invokes /consensus-run or /consensus-validate. Orchestrates the three-agent unanimous-gate phase pipeline: for each configured phase, invoke Lead, Alpha, and Bravo in parallel, collect independent PASS or FAIL votes, enforce unanimity, trigger fix cycles on any FAIL with re-validation by all three agents, and persist evidence per phase and role. Works for CLI, web, API, and iOS targets — the triad validates the target system independent of its platform.
npx claudepluginhub krzemienski/multi-agent-consensus --plugin multi-agent-consensus
Slash command: /multi-agent-consensus:consensus-pipeline
Execute a phase pipeline where every phase transition requires unanimous PASS from
three independent agents — Lead, Alpha, and Bravo. Any single FAIL triggers a fix
cycle, and ALL three agents re-validate after the fix, not just the one that failed.
Evidence persists per phase and per role under .consensus/evidence/, and state
persists at .consensus/state.json for resume after interrupt.
This skill ports the semantics of src/consensus/orchestrator.py (the Python CLI's
PipelineOrchestrator class) to Claude Code native primitives. Every runtime decision
the Python CLI makes, this skill makes via Task-tool invocations and bash scripts
bundled under scripts/.
The skill is triggered by two slash commands:
/consensus-run [--target PATH] [--phases P1,P2,...] [--resume]
/consensus-validate [--target PATH] [--phase PHASE] (single-phase spot check)
Read configuration from .claude/consensus.local.md frontmatter. The canonical
JSON Schema for both the plugin frontmatter and the Python YAML loader lives at
schemas/consensus-config.schema.json (draft 2020-12, schema_version: 1
required). See references/evidence-schema.md for the manifest/vote/state
schemas that consume this config. When .claude/consensus.local.md is absent,
defaults match src/consensus/config.py:ConsensusConfig exactly — phases
["explore", "audit", "fix", "verify"], max_fix_cycles: 3,
parallel_agents: true, require_unanimous: true, agents lead:opus,
alpha:sonnet, bravo:sonnet.
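The defaults listed above could be modeled roughly as follows. This is a minimal sketch, not the contents of src/consensus/config.py: the field names and values come from the text, while the standard-library dataclass layout and the `DEFAULTS` name are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConsensusConfig:
    # Field names and values mirror the defaults stated in the text;
    # the dataclass layout itself is an assumption, not the real class.
    phases: list[str] = field(
        default_factory=lambda: ["explore", "audit", "fix", "verify"]
    )
    max_fix_cycles: int = 3
    parallel_agents: bool = True
    require_unanimous: bool = True
    agents: dict[str, str] = field(
        default_factory=lambda: {"lead": "opus", "alpha": "sonnet", "bravo": "sonnet"}
    )


# Hypothetical name for "what the skill uses when no config file exists".
DEFAULTS = ConsensusConfig()
```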
For each phase in the configured phase order:
1. Announce the phase. Print the phase name, description, and max fix cycle budget. Increment the gate counter.
2. Materialize the evidence directory. Run
bash "${CLAUDE_PLUGIN_ROOT}/skills/consensus-pipeline/scripts/init-evidence-dir.sh" "$TARGET" "$PHASE"
to create $TARGET/.consensus/evidence/$PHASE/{lead,alpha,bravo}/.
3. Spawn three agents in parallel. This is the single most important step.
Issue ONE assistant tool-use block containing THREE parallel Task invocations,
one per role. Do NOT issue them sequentially across multiple assistant turns.
See references/role-invocation.md for the independence rationale and the
sequential-dispatch failure mode.
Each invocation uses the phase-prompt template from references/phase-prompts.md
with $TARGET, $PHASE, and $ROLE interpolated. Pass the gate number via the
environment variable CONSENSUS_GATE_NUMBER. Instruct the agent to write its
vote JSON to $TARGET/.consensus/evidence/$PHASE/$ROLE/gate-${N}-vote.json.
4. Collect votes. Each subagent returns a final message. Each agent also writes
its vote JSON to the per-role path. The shape of the JSON must match
references/evidence-schema.md (ports src/consensus/models.py:Vote). Any vote
that fails to write is treated as FAIL for safety (see gate.py:136-142 parse
failure fallback).
5. Evaluate the gate. Run
bash "${CLAUDE_PLUGIN_ROOT}/skills/consensus-pipeline/scripts/gate-check.sh" "$TARGET" "$PHASE" "$N"
On exit 0 (unanimous PASS), update .consensus/state.json with the phase marked
PASSED and advance to the next phase. On exit 1, enter the fix cycle.
6. Fix cycle. Up to max_fix_cycles iterations (default 3):
a. Collect all FAIL findings from the three vote JSONs.
b. Surface the findings to the user. The skill itself does NOT apply fixes. Fix
application is out of scope for the consensus triad by design: the voting
function and the editing function must remain separate. If fix_agent is
configured in .claude/consensus.local.md, invoke that subagent with the
findings; otherwise prompt the user to apply fixes and continue.
c. After the fix signal, re-invoke ALL THREE agents (step 3) with an incremented
fix_cycle_count. Loop to step 5.
d. If max_fix_cycles is exhausted without unanimous PASS, mark the phase FAILED
in the state file and halt the pipeline.
7. Save state after every phase transition. Write the full PipelineState shape
to $TARGET/.consensus/state.json so that --resume can restore.
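The gate and fix-cycle rules in the steps above can be sketched in Python. This is an illustration of the logic only, not a port of gate-check.sh or orchestrator.py. The names `evaluate_gate`, `run_phase`, `collect_votes`, and `apply_fixes` are hypothetical, and votes are reduced to plain strings rather than the full Vote JSON.

```python
from typing import Callable, Optional

ROLES = ("lead", "alpha", "bravo")

# Maps a role to "PASS", "FAIL", or None when the vote file was never written.
Votes = dict[str, Optional[str]]


def evaluate_gate(votes: Votes) -> bool:
    """Return True only when every role voted PASS.

    A missing or unreadable vote (None or absent) counts as FAIL,
    matching the fail-safe rule described in the text.
    """
    return all(votes.get(role) == "PASS" for role in ROLES)


def run_phase(
    collect_votes: Callable[[int], Votes],
    apply_fixes: Callable[[Votes], None],
    max_fix_cycles: int = 3,
) -> str:
    """Vote once, then fix and re-vote until PASS or the budget is spent.

    collect_votes(fix_cycle_count) stands in for dispatching all three
    agents; apply_fixes stands in for the user or a configured fix agent.
    """
    fix_cycle_count = 0
    votes = collect_votes(fix_cycle_count)
    while not evaluate_gate(votes):
        if fix_cycle_count >= max_fix_cycles:
            return "failed"
        apply_fixes(votes)
        fix_cycle_count += 1
        # All three roles re-validate, not only the one that voted FAIL.
        votes = collect_votes(fix_cycle_count)
    return "passed"
```

Keeping `apply_fixes` as a separate callable reflects the separation the text insists on: the voting function never edits the target itself.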
On every phase transition and every fix cycle, write
$TARGET/.consensus/state.json with the full PipelineState shape (see
references/evidence-schema.md). On /consensus-run --resume, read the state file
and resume from the last incomplete phase. Phases whose status is passed are
skipped — ports orchestrator.py:208-210 semantics.
The state file includes a schema_version: 1 field for forward-compatibility. The
Python CLI ignores unknown fields (pydantic models), so adding fields here does not
break consensus run on the same repo.
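The resume rule can be sketched as a small lookup over the state file. The state layout used here, a `phases` list of objects with `name` and `status` keys, is an assumption for illustration; the real PipelineState shape is defined in references/evidence-schema.md, and `next_phase_to_run` is a hypothetical helper name.

```python
import json
from pathlib import Path
from typing import Optional


def next_phase_to_run(target: Path) -> Optional[str]:
    """Return the first phase whose status is not "passed", or None.

    Assumes a hypothetical state layout:
    {"schema_version": 1, "phases": [{"name": ..., "status": ...}, ...]}
    """
    state_path = target / ".consensus" / "state.json"
    state = json.loads(state_path.read_text(encoding="utf-8"))
    for phase in state["phases"]:
        # Phases already marked passed are skipped on resume.
        if phase["status"] != "passed":
            return phase["name"]
    return None
```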
The Claude Code Task tool returns the subagent's final message as a tool_result. If the three Task calls are issued across three separate assistant turns, the second assistant turn's context will contain the first subagent's tool_result. That leaks reasoning across siblings and silently breaks the independence guarantee the consensus pattern depends on.
The correct pattern is exactly one assistant turn emitting exactly one parallel
tool-use block with three tool_use entries. The Claude Code runtime dispatches
all three, no agent sees another's result because none are available until all three
return, and the runtime's output stream interleaves them back into the caller's
context in one atomic step. This is validated by the Phase 00 VG-1 probe; do not
regress it.
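The difference between the two dispatch patterns can be illustrated with an analogy in plain Python. This is not the Claude Code Task API: `run_agent` is a hypothetical stand-in for a subagent, and the shared `context` list stands in for the caller's conversation context. The sketch only shows why launching all three before any result exists keeps the siblings independent.

```python
import asyncio


async def run_agent(role: str, context: list[str]) -> str:
    """Hypothetical stand-in for one subagent; reports what it could see."""
    # An agent can only see results already in the caller's context
    # at the moment it starts.
    seen = len(context)
    await asyncio.sleep(0)
    return f"{role}: PASS (saw {seen} sibling results)"


async def dispatch_parallel(roles: list[str]) -> list[str]:
    context: list[str] = []
    # All agents start before any result is appended, so each one
    # begins from the same empty context.
    results = await asyncio.gather(*(run_agent(r, context) for r in roles))
    context.extend(results)
    return list(results)


async def dispatch_sequential(roles: list[str]) -> list[str]:
    context: list[str] = []
    for role in roles:
        # Each later agent starts after earlier results landed in context.
        context.append(await run_agent(role, context))
    return context
```

In the parallel version every agent reports zero sibling results, while in the sequential version the second and third agents start with earlier results already visible, which is the leak the text warns about.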
If the invoking context does not support parallel Task (for instance, a headless
claude -p run where --input-format=text lacks tool-use batching), fall back to
CLI mode. See references/role-invocation.md section "CLI fallback".
Every gate writes four files under $TARGET/.consensus/evidence/$PHASE/:
lead/gate-${N}-vote.json
alpha/gate-${N}-vote.json
bravo/gate-${N}-vote.json
gate-${N}-verdict.json (merged by scripts/merge-votes.sh)
A run-level manifest is written at pipeline completion:
$TARGET/.consensus/evidence/manifest.json — lists every evidence artifact with
type, role, phase, and timestamp. See references/evidence-schema.md.
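The per-gate file layout described above can be expressed as a path builder, which is handy for checking that a gate produced everything it should. This is a sketch under the directory layout stated in the text; `gate_artifacts` is a hypothetical helper, not part of the plugin.

```python
from pathlib import Path

ROLES = ("lead", "alpha", "bravo")


def gate_artifacts(target: Path, phase: str, gate: int) -> list[Path]:
    """List the four files a single gate is expected to write."""
    phase_dir = target / ".consensus" / "evidence" / phase
    # One vote file per role, each in that role's own subdirectory.
    votes = [phase_dir / role / f"gate-{gate}-vote.json" for role in ROLES]
    # The merged verdict sits directly in the phase directory.
    verdict = phase_dir / f"gate-{gate}-verdict.json"
    return votes + [verdict]
```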
After pipeline completion, /consensus-report reads .consensus/state.json and
renders a phase table identical to orchestrator.py:230-277. /consensus-report --format json emits the machine-readable report matching orchestrator.py:279-313.
references/gate-evaluation.md: unanimity math, fix-cycle rules, edge cases (timeout → FAIL, malformed JSON → FAIL, zero-findings FAIL rationale).
references/evidence-schema.md: directory layout and JSON shapes for Vote, GateResult, Phase, PipelineState, Evidence, and manifest.
references/phase-prompts.md: the four default phase prompt templates from config.py:58-79 (explore, audit, fix, verify) and the substitution spec.
references/role-invocation.md: Task-primary mode vs CLI fallback mode, decision rules, and example invocations for each.
schemas/consensus-config.schema.json: canonical JSON Schema (draft 2020-12) for the plugin frontmatter config AND the Python YAML loader. Single source of truth; both loaders must agree. schema_version: 1 is required.
examples/streaming-bug-dogfood.md: walkthrough of running the pipeline against a known-broken ChatViewModel.swift sample and observing the audit gate catch the += bug, the fix cycle trigger, and the verify-phase unanimous PASS.
scripts/init-evidence-dir.sh: mkdir -p for the per-phase per-role tree.
scripts/merge-votes.sh: read three vote JSONs, emit gate-${N}-verdict.json matching models.py:GateResult.
scripts/gate-check.sh: dual-mode. Authoritative mode (invoked by skill step 5) returns exit 0 on unanimous PASS, exit 1 otherwise. Advisory mode (invoked by the SubagentStop hook) prints a systemMessage only if a gate is currently closed.
scripts/run-agent-cli.sh: CLI fallback invocation for a single agent. Direct port of gate.py:61-76 subprocess call.
A concrete walk-through of a single 4-phase run against a hypothetical target, showing what each piece of the pipeline produces. Timestamps are illustrative.
t=0 User invokes: /consensus-run --target ./webapp --phases explore,audit,fix,verify
t=0 Skill reads .claude/consensus.local.md (absent — uses defaults)
t=0 Skill writes .consensus/state.json — all phases pending, current_phase_index=0
t=1 PHASE explore — gate counter=1
t=1 init-evidence-dir.sh ./webapp explore → creates .consensus/evidence/explore/{lead,alpha,bravo}/
t=1 Skill emits ONE assistant turn with three parallel Task calls
t=2 Lead, Alpha, Bravo dispatch in parallel
t=40 All three return; each has written gate-1-vote.json to its per-role dir
t=40 gate-check.sh ./webapp explore 1 → reads three votes, merges → gate-1-verdict.json (unanimous PASS), exit 0
t=40 state.json updated: explore.status=passed, current_phase_index=1
t=41 PHASE audit — gate counter=2
(same sequence — three parallel Tasks, vote JSONs, verdict, state update)
t=80 gate-2-verdict.json unanimous_pass=false — Alpha flagged a bug at app.py:88
t=80 Skill surfaces findings to user; enters fix cycle
t=120 User applies fix; signals "applied"
t=120 Skill re-invokes ALL THREE agents with fix_cycle_count=1, gate counter=3
t=160 gate-3-verdict.json unanimous_pass=true (all three confirm the fix)
t=160 state.json updated: audit.status=passed, current_phase_index=2
t=161 PHASE fix — gate counter=4
(fix phase verifies fixes don't regress anything — usually unanimous PASS)
t=200 gate-4-verdict.json unanimous_pass=true
t=201 PHASE verify — gate counter=5
t=240 gate-5-verdict.json unanimous_pass=true
t=240 state.json: all phases passed, completed_at set
t=240 evidence/manifest.json written, listing 15 vote JSONs + 5 verdict JSONs + any inline evidence
t=240 /consensus-report invoked for final summary (phase table + gate count + fix cycles)
Total gates in this example: 5 (one per phase + one fix-cycle retry on audit). Total agent invocations: 5 × 3 = 15 completions. Total fix cycles counted in PipelineState: 1.
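The totals above follow from two small rules: each phase opens one gate, each fix cycle adds one more, and every gate invokes all three roles. A minimal sketch of that arithmetic, with `walkthrough_totals` as a hypothetical helper name:

```python
def walkthrough_totals(phases: int, fix_cycles: int, roles: int = 3) -> dict[str, int]:
    """Count gates and agent invocations for a run.

    Assumes every fix cycle re-runs exactly one gate with all roles.
    """
    gates = phases + fix_cycles
    return {
        "gates": gates,
        "agent_invocations": gates * roles,
        "fix_cycles": fix_cycles,
    }
```

For the run shown here, `walkthrough_totals(4, 1)` gives 5 gates and 15 agent invocations.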
When reading configuration, the skill applies this precedence (higher wins):
1. Flags passed to /consensus-run or /consensus-validate (e.g., --phases)
2. .claude/consensus.local.md frontmatter in the target repository
3. Built-in defaults (src/consensus/config.py:ConsensusConfig)
The skill MUST NOT read a .claude/consensus.local.md from anywhere other than
the target repository's root. Pulling config from the user's home directory or
the plugin install location would silently cross-contaminate settings between
unrelated projects.
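The precedence described above amounts to a layered merge in which command flags override the target repository's frontmatter, which in turn overrides the built-in defaults. The following is a sketch of that rule only; `resolve_config` and the plain-dict representation of each layer are assumptions, not the plugin's actual loader.

```python
from typing import Any, Optional


def resolve_config(
    defaults: dict[str, Any],
    frontmatter: Optional[dict[str, Any]],
    flags: dict[str, Any],
) -> dict[str, Any]:
    """Merge config layers; later layers override earlier ones."""
    merged = dict(defaults)
    # frontmatter is None when .claude/consensus.local.md is absent.
    merged.update(frontmatter or {})
    # Only flags the user actually passed (non-None) override lower layers.
    merged.update({k: v for k, v in flags.items() if v is not None})
    return merged
```

Passing the frontmatter in as an argument, rather than having the function search for it, keeps the caller responsible for reading it only from the target repository's root.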
Use /consensus-run for a full multi-phase pipeline run with persisted state.
Use /consensus-validate for a single-phase spot check.
Both invocations use the same three-parallel-Task pattern and produce the same
evidence shape. The only difference is phase count and state-file semantics —
/consensus-validate writes vote JSONs and a verdict but does NOT write
state.json (spot check, not pipeline run).
The Bravo agent has Bash in its tools: frontmatter. Running Bravo against an
untrusted target repository could execute arbitrary commands from that repo's
runbook. Do not run the plugin against code you have not reviewed. This is the
only role with runtime-execution scope; Lead and Alpha are read-only.