Performs multi-agent review of implementation plans using the PoLL consensus protocol. Independent expert panels surface diverse issues and blind spots before coding.

Install with `npx claudepluginhub kirich1409/krozov-ai-tools --plugin developer-workflow`. This skill uses the workspace's default tool permissions.
Multi-agent independent review of an implementation plan, followed by consensus synthesis.
A single reviewer has blind spots. Different experts catch different problems — an architect spots coupling issues, a security engineer finds auth gaps, a performance expert flags N+1 queries. Independent parallel review prevents groupthink: each agent forms their own opinion before seeing anyone else's, which surfaces more diverse issues than sequential discussion.
The core value is not that individual reviews are better — it's that multiple independent perspectives surface issues that any single reviewer would miss, and the structured synthesis makes disagreements and consensus explicit rather than hidden.
The review follows the Panel of LLM Evaluators protocol, backed by research showing that independent panels outperform iterative debates (debates cause conformity and suppress dissent):
```
Read plan (track source: Plan Mode / file / conversation)
  ↓
Discover available agents (only real, existing agents)
  ↓
Pre-select relevant agents → present multi-select to user
  ↓
Spawn selected agents in parallel (independent review)  ◄──loop──┐
  ↓                                                              │
Collect all reviews                                              │
  ↓                                                              │
Synthesize verdict (PoLL aggregation, or single-agent verdict)   │
  ↓                                                              │
Present verdict                                                  │
  ↓                                                              │
  ├─ PASS        → Done (proceed to implementation)              │
  ├─ CONDITIONAL → Fix plan at source → Re-review ───────────────┤
  └─ FAIL        → Fix plan at source → Re-review ───────────────┘
                                           (max 3 review cycles)
```
The review follows a strict state machine. Only these transitions are valid:
- Read Plan → Discover Agents
- Discover → Select Agents
- Select → Parallel Review
- Review → Synthesize
- Synthesize → Verdict
- Verdict:PASS → Done
- Verdict:COND → Fix Plan
- Verdict:FAIL → Fix Plan
- Fix Plan → Re-review (back to Parallel Review with same agents)
- Re-review → Synthesize → Verdict (same cycle)
Forbidden transitions: anything not in the list above. In particular: never go from Verdict straight to Done on CONDITIONAL or FAIL, never present a Verdict without a Synthesize step, and never swap the agent set between re-review cycles (re-review uses the same agents).
Cycle limit: maximum 3 full review cycles (initial + 2 re-reviews). If the plan still has blockers after 3 cycles, stop and escalate to the user — the plan may need a fundamentally different approach rather than incremental fixes.
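As a sketch, the whole state machine fits in a small transition table. TypeScript, all names illustrative rather than part of the skill:

```typescript
// Illustrative transition table for the review state machine.
type State =
  | "ReadPlan" | "DiscoverAgents" | "SelectAgents" | "ParallelReview"
  | "Synthesize" | "Verdict" | "FixPlan" | "Done";

type VerdictValue = "PASS" | "CONDITIONAL" | "FAIL";

// The only legal transitions; every other move is forbidden.
const transitions: Record<State, State[]> = {
  ReadPlan: ["DiscoverAgents"],
  DiscoverAgents: ["SelectAgents"],
  SelectAgents: ["ParallelReview"],
  ParallelReview: ["Synthesize"],
  Synthesize: ["Verdict"],
  Verdict: ["Done", "FixPlan"], // Done only on PASS
  FixPlan: ["ParallelReview"],  // re-review with the same agents
  Done: [],
};

const MAX_CYCLES = 3; // initial review + 2 re-reviews

function next(state: State, verdict?: VerdictValue, cycle = 1): State {
  if (state === "Verdict") {
    if (verdict === "PASS") return "Done";
    if (cycle >= MAX_CYCLES) throw new Error("cycle limit reached: escalate to the user");
    return "FixPlan"; // CONDITIONAL and FAIL both go back through Fix Plan
  }
  const [target] = transitions[state];
  if (!target) throw new Error(`no transition from ${state}`);
  return target;
}
```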
For long reviews (multiple agents, re-review cycles), save state to a file so work survives
context compaction. Use ./swarm-report/plan-review-state.md with this structure:
```
# Plan Review State
Source: {plan_mode | file:<path> | conversation}
Cycle: {1 | 2 | 3} of 3
Status: {discovering | reviewing | synthesizing | fixing | done}
## Plan Summary
{goal, technologies, scope — extracted in Step 1}
## Selected Agents
- {agent1} (recommended)
- {agent2} (recommended)
## Reviews Completed
- [x] {agent1} — {severity counts: N critical, M major, K minor}
- [ ] {agent2} — pending
## Verdict History
### Cycle 1: {PASS | CONDITIONAL | FAIL}
- Blockers: {list}
- Improvements: {list}
### Cycle 2: ...
```
Rules: update the state file after each step completes, and re-read it when resuming after an interruption or context compaction.
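A minimal sketch of the persistence side, assuming Node's fs/promises API and the structure above:

```typescript
import { mkdir, writeFile } from "node:fs/promises";

// Render the current review state to the markdown structure above and
// persist it, so a resumed session can pick up mid-review.
async function saveState(state: {
  source: string;
  cycle: number;
  status: string;
  planSummary: string;
  selectedAgents: string[];
  completed: Map<string, string>; // agent name -> severity counts, or "pending"
}): Promise<void> {
  const lines = [
    "# Plan Review State",
    `Source: ${state.source}`,
    `Cycle: ${state.cycle} of 3`,
    `Status: ${state.status}`,
    "## Plan Summary",
    state.planSummary,
    "## Selected Agents",
    ...state.selectedAgents.map((a) => `- ${a}`),
    "## Reviews Completed",
    ...[...state.completed].map(([agent, counts]) =>
      counts === "pending" ? `- [ ] ${agent} — pending` : `- [x] ${agent} — ${counts}`,
    ),
  ];
  await mkdir("./swarm-report", { recursive: true });
  await writeFile("./swarm-report/plan-review-state.md", lines.join("\n"), "utf8");
}
```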
Locate the current plan. Check these sources in order:
1. Plan Mode output from the current session
2. A plan file in the project (plan.md, PLAN.md, or any markdown file the user points to); read it
3. Conversation context

Track the plan source — remember whether it came from Plan Mode, a file (save the path), or conversation context. Step 5 needs this to know how to apply fixes.
Extract from the plan: its goal, the technologies involved, and the scope. These feed agent relevance scoring and the state file's Plan Summary.
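For illustration, the tracked source and the extracted summary can be modeled as small types (hypothetical names, reused in later sketches):

```typescript
// Sketch: the three possible plan sources tracked in Step 1.
// Step 5 dispatches on this to decide how to apply fixes.
type PlanSource =
  | { kind: "plan_mode" }
  | { kind: "file"; path: string } // remember the path for later edits
  | { kind: "conversation" };

// What Step 1 extracts for agent selection and the state file.
interface PlanSummary {
  goal: string;
  technologies: string[];
  scope: string;
}
```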
Find all available agents by scanning for real agent definition files:
- `Glob("**/agents/*.md")` across plugin paths in the project
- Agents listed in the system prompt (general-purpose, kotlin-engineer, compose-ui-architect, manual-tester, etc.)

Critical rule: only include agents that actually exist. Read each agent file's frontmatter (name, description) to confirm it's real. Never invent, imagine, or assume agents that aren't physically present as files or listed in the system prompt. If you're not sure an agent exists — check before listing it. A phantom agent in the selection list erodes trust.
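A sketch of discovery under these rules, assuming the glob npm package (in practice the orchestrating model uses its Glob and Read tools, but the filtering logic is the same):

```typescript
import { readFile } from "node:fs/promises";
import { glob } from "glob";

interface AgentDef {
  name: string;
  description: string;
  path: string;
}

// Only agents whose definition file physically exists and carries valid
// frontmatter make it into the list. Nothing is invented or assumed.
async function discoverAgents(): Promise<AgentDef[]> {
  const files = await glob("**/agents/*.md", { ignore: "node_modules/**" });
  const agents: AgentDef[] = [];
  for (const path of files) {
    const text = await readFile(path, "utf8");
    // Minimal frontmatter check: name and description must both be present.
    const name = /^name:\s*(.+)$/m.exec(text)?.[1]?.trim();
    const description = /^description:\s*(.+)$/m.exec(text)?.[1]?.trim();
    if (name && description) agents.push({ name, description, path });
  }
  return agents;
}
```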
Score each discovered agent's relevance to the plan based on how closely its description matches the plan's goal, technologies, and scope (extracted in Step 1).
Mark agents as recommended if their relevance score is high. Aim for at least 2 recommended
agents — multi-perspective review is the whole point. But if only 1 agent is genuinely relevant,
recommend just that one rather than padding with irrelevant agents. general-purpose is a good
fallback when no other agent covers a gap — it brings broad architectural perspective that
complements any domain specialist.
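One plausible heuristic, reusing the AgentDef and PlanSummary shapes from the sketches above (the skill prescribes no formula; this just makes "relevance" concrete):

```typescript
// Illustrative relevance scoring: term overlap between the plan summary
// and the agent's self-description. Real selection is a judgment call by
// the orchestrating model; this only shows the shape of it.
function relevanceScore(agent: AgentDef, plan: PlanSummary): number {
  const haystack = agent.description.toLowerCase();
  const terms = [plan.goal, plan.scope, ...plan.technologies]
    .flatMap((s) => s.toLowerCase().split(/\W+/))
    .filter((w) => w.length > 3);
  const hits = terms.filter((t) => haystack.includes(t)).length;
  return hits / Math.max(terms.length, 1);
}

function recommend(agents: AgentDef[], plan: PlanSummary): AgentDef[] {
  const picked = agents
    .map((a) => ({ a, score: relevanceScore(a, plan) }))
    .sort((x, y) => y.score - x.score)
    .filter((s) => s.score > 0.15) // arbitrary cutoff; never pad with irrelevant agents
    .map((s) => s.a);
  // Fall back to general-purpose when no specialist covers the plan.
  return picked.length > 0 ? picked : agents.filter((a) => a.name === "general-purpose");
}
```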
Use AskUserQuestion with multiSelect: true to present the agent list; a sketch of the call shape follows below.
AskUserQuestion supports max 4 options. If more agents are available, show only the top 4
by relevance and mention the rest in the question text so the user can type them in "Other".

Explicit agent specification: if the user named specific agents (e.g., "review with kotlin-engineer and security"), skip discovery entirely and use those agents directly. No multi-select needed — the user already chose.
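For illustration, the call might look like this; the field names (question, header, options, multiSelect, label, description) are an assumption about the tool's schema, and the agent names are the examples used elsewhere in this document:

```typescript
// Hypothetical AskUserQuestion payload for the agent multi-select.
const agentSelection = {
  question: "Which agents should review this plan?",
  header: "Review panel",
  multiSelect: true,
  options: [
    { label: "kotlin-engineer (recommended)", description: "Language and domain review" },
    { label: "compose-ui-architect (recommended)", description: "UI architecture review" },
    { label: "manual-tester", description: "Usability and edge-case perspective" },
    { label: "general-purpose", description: "Broad architectural fallback" },
  ],
};
```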
Spawn each selected agent as a subagent via the Agent tool. All agents launch in a single
message to maximize parallelism.
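In orchestration terms this is a fan-out. A sketch with a hypothetical runAgent standing in for the Agent tool; the point is that all calls launch together and no review sees another's output:

```typescript
// Hypothetical stand-in for spawning a subagent via the Agent tool.
declare function runAgent(name: string, prompt: string): Promise<string>;

// Fan out one independent review per selected agent (no groupthink:
// each agent only ever sees the plan, never the other reviews).
async function reviewInParallel(
  agents: AgentDef[],
  planText: string,
  buildPrompt: (role: string, plan: string) => string,
): Promise<Map<string, string>> {
  const results = await Promise.all(
    agents.map(async (a) =>
      [a.name, await runAgent(a.name, buildPrompt(a.name, planText))] as const,
    ),
  );
  return new Map(results);
}
```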
Each agent receives this prompt (adapted to their expertise). The structured format is important because the synthesis step depends on parsing severity, confidence, and domain_relevance from each review. Without consistent structure, aggregation becomes guesswork.
```
You are reviewing an implementation plan as a {agent_role} expert.
## The Plan
{full_plan_text}
## Your Task
Review this plan from the perspective of your expertise. Be specific and actionable.
## Required Output Format
You MUST structure your response exactly as follows:
### Summary
2-3 sentence overall assessment from your perspective.
### Domain Relevance
State one of: high | medium | low — how much does this plan touch your area of expertise.
### Issues
For each issue, use this exact structure:
**Issue N: {short title}**
- **severity**: critical | major | minor
- **confidence**: high | medium | low
- **issue**: what the problem is (1-2 sentences)
- **suggestion**: what to do instead (1-2 sentences)
Severity guide:
- critical = blocks implementation or will cause serious failures
- major = significantly affects quality, performance, or maintainability
- minor = nice to have, low risk if skipped
Confidence guide:
- high = this is squarely in your domain and you're certain
- medium = relevant to your domain but you could be wrong
- low = outside your core expertise but worth flagging
Be honest about confidence — a low-confidence flag from outside your domain is still valuable,
but it should be weighted accordingly in synthesis.
Respond in the same language the plan is written in.
```
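Because the format is fixed, the orchestrator can parse each review mechanically. A sketch of that parsing; the field names come from the template above, everything else is illustrative:

```typescript
type Severity = "critical" | "major" | "minor";
type Confidence = "high" | "medium" | "low";

interface Issue {
  agent: string;
  title: string;
  severity: Severity;
  confidence: Confidence;
  issue: string;
  suggestion: string;
}

// Parse one agent's review, keyed to the exact headings and field names
// the template mandates. Unparseable fields fall back to the weakest value.
function parseReview(agent: string, review: string): { domainRelevance: Confidence; issues: Issue[] } {
  const relevance = /### Domain Relevance\s+(high|medium|low)/i.exec(review)?.[1];
  const issues: Issue[] = [];
  for (const block of review.split(/\*\*Issue \d+: /).slice(1)) {
    const get = (field: string): string =>
      new RegExp(`\\*\\*${field}\\*\\*:\\s*(.+)`).exec(block)?.[1]?.trim() ?? "";
    issues.push({
      agent,
      title: block.slice(0, block.indexOf("**")),
      severity: (get("severity") as Severity) || "minor",
      confidence: (get("confidence") as Confidence) || "low",
      issue: get("issue"),
      suggestion: get("suggestion"),
    });
  }
  return { domainRelevance: (relevance ?? "low") as Confidence, issues };
}
```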
After all agents complete, the orchestrator (main session) reads all reviews and synthesizes. This is the step where multi-agent review delivers its core value — cross-referencing independent opinions to find signal that no single reviewer could produce alone.
Single-agent case: if only one agent reviewed the plan (either by user's choice or because only one was relevant), skip cross-referencing. Present that agent's issues directly using the same verdict format, but note that the review represents a single perspective. Convergence signals and uncertainties sections are not applicable — omit them.
| Signal | Action |
|---|---|
| Critical severity from any agent with high confidence | → Blocker. Must be addressed. |
| Same issue raised by 2+ agents independently | → Escalate to critical regardless of individual severity. Multiple experts seeing the same problem = real problem. |
| Major severity from agent with high domain_relevance | → Important improvement. Include in verdict. |
| Contradicting opinions between agents | → Surface as "Uncertainty — requires decision". Present both sides with context. Do NOT silently pick one. |
| Minor severity or low confidence from single agent | → Include as suggestion, not requirement. |
| Low domain_relevance agent flagging an issue | → Note it but weight lower. They may be right, but it's outside their core expertise. |
Pay special attention to convergence signals — when agents with different expertise independently flag the same concern, that's the strongest signal the review can produce. Call these out explicitly in the verdict.
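The table above translates almost directly into code. A sketch, reusing the Issue and Confidence types from the parsing sketch, with the simplifying assumption that convergence is detected by matching issue titles (the real orchestrator compares meaning, not strings):

```typescript
interface ParsedReview {
  agent: string;
  domainRelevance: Confidence;
  issues: Issue[];
}

type Bucket = "blocker" | "improvement" | "suggestion";

// Apply the aggregation table: convergent issues escalate to blockers,
// critical + high confidence gates blockers, major + high domain
// relevance gates improvements; everything else stays a suggestion.
function synthesize(reviews: ParsedReview[]): { issue: Issue; raisedBy: string[]; bucket: Bucket }[] {
  const tagged = reviews.flatMap((r) =>
    r.issues.map((issue) => ({ issue, relevance: r.domainRelevance })),
  );
  // Naive convergence: identical normalized titles count as the same issue.
  const groups = new Map<string, { issue: Issue; relevance: Confidence }[]>();
  for (const t of tagged) {
    const key = t.issue.title.trim().toLowerCase();
    groups.set(key, [...(groups.get(key) ?? []), t]);
  }
  return [...groups.values()].map((group) => {
    const raisedBy = [...new Set(group.map((t) => t.issue.agent))];
    const top = group[0];
    const convergent = raisedBy.length >= 2; // escalate regardless of severity
    const blocker =
      convergent || (top.issue.severity === "critical" && top.issue.confidence === "high");
    const important = top.issue.severity === "major" && top.relevance === "high";
    return {
      issue: top.issue,
      raisedBy,
      bucket: blocker ? "blocker" : important ? "improvement" : "suggestion",
    };
  });
}
```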
Present the synthesized result:
```
## Plan Review Verdict: {PASS | CONDITIONAL | FAIL}
### Blockers (must fix before implementing)
- {issue} — raised by {agent(s)}, severity: critical
Suggestion: {what to do}
### Important Improvements (strongly recommended)
- {issue} — raised by {agent(s)}, confidence: {level}
Suggestion: {what to do}
### Suggestions (nice to have)
- {issue}
Suggestion: {what to do}
### Uncertainties (requires your decision)
- {topic} — {Agent A} says X, {Agent B} says Y
Context: {why they disagree}
### Consensus
{What all agents agreed on — the strengths of the plan}
```
Verdict criteria: PASS means no blockers remain (suggestions are fine); CONDITIONAL means blockers exist but are fixable with targeted plan edits; FAIL means critical issues undermine the plan's approach and it needs substantial rework.
This step is not optional — always execute it based on the verdict.
The action depends on where the plan came from (tracked in Step 1):
| Source | How to fix |
|---|---|
| Plan Mode | Call EnterPlanMode with the list of issues to address |
| File (e.g., plan.md) | Edit the file directly with the improvements |
| Conversation context | Present the issues and work with the user to revise the plan inline |
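Expressed against the PlanSource type from the Step 1 sketch (the tool calls are hypothetical stand-ins):

```typescript
// Hypothetical stand-ins for the EnterPlanMode tool, a direct file edit,
// and an inline revision with the user.
declare function enterPlanMode(issues: string[]): void;
declare function editFile(path: string, issues: string[]): void;
declare function reviseInline(issues: string[]): void;

// Dispatch the fix step on where the plan came from (tracked in Step 1).
function fixAtSource(source: PlanSource, issues: string[]): void {
  switch (source.kind) {
    case "plan_mode":
      enterPlanMode(issues); // re-plan with the issue list in hand
      break;
    case "file":
      editFile(source.path, issues); // apply improvements directly
      break;
    case "conversation":
      reviseInline(issues); // work with the user to revise inline
      break;
  }
}
```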
Then act on the verdict:
- PASS: confirm the plan is ready. Say so explicitly and proceed to implementation.
- CONDITIONAL: EnterPlanMode with the improvement list.
- FAIL: EnterPlanMode with the blockers list and suggestions.