Launches parallel agents for iterative review of plans or code across 4 escalating rounds: broad, multisample, focused, focused+multisample. Best suited for plans over 500 lines or with more than 3 components.
Review an implementation plan through multiple quality lenses and iterate collaboratively on the findings. Use when the user wants to evaluate a plan before implementation: the swarm red-teams the plan, surfacing edge cases, security holes, scalability bottlenecks, error-propagation risks, and integration conflicts.
Iterative plan hardening through multisampling and focused decomposition.
Core insight: a single agent misses issues due to attention budget limits. Multiple independent agents reading the same document find different problems (stochastic diversity). Focused decomposition further improves depth per aspect. Iterative fix-then-re-review uncovers issues previously masked by other bugs.
Source: deksden (@deksden_notes) — "Plan Swarming" technique, April 2026. Related: Anthropic Harness Design (Generator-Evaluator), deep-review (parallel competency code review).
Research backing: correlated errors in identical-prompt ensembles [2502.11027]; multi-perspective vulnerability detection, MultiVer [2602.17875] and VulAgent [2509.11523].
This skill works in two modes:
Plan mode (default): review design docs, specs, ADRs, RFCs before implementation.
Code mode: review code files for bugs and vulnerabilities. Activated when the user passes code files instead of a plan, or says "review code", "find vulnerabilities", or "security audit". In code mode, the review aspects shift from plan-oriented (contracts, completeness) to code-oriented (injection, auth bypass, race conditions, memory).
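A sketch of that activation heuristic; the trigger phrases come from the text above, while the extension set and function shape are illustrative assumptions, not the skill's actual dispatch logic:

```python
from pathlib import Path

# Trigger phrases are from the skill text; the extension set is an illustrative assumption.
CODE_TRIGGERS = ("review code", "find vulnerabilities", "security audit")
CODE_EXTENSIONS = {".py", ".ts", ".js", ".go", ".rs", ".java", ".c", ".cpp"}

def detect_mode(user_request: str, targets: list[str]) -> str:
    """Return 'code' if the user passed source files or used a code-review trigger, else 'plan'."""
    if any(phrase in user_request.lower() for phrase in CODE_TRIGGERS):
        return "code"
    if targets and all(Path(t).suffix in CODE_EXTENSIONS for t in targets):
        return "code"
    return "plan"
```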
Ask the user which document to review if not obvious from context.
Plan mode: PLAN.md, ADR, spec, design doc, RFC, or any structured document describing what will be built and how.
Code mode: source code files, a module, or a directory. Best for security audits, bug hunts, or pre-release quality checks.
Read the target document(s) fully. Note:
- Total size (lines, sections/files)
- Key components/modules described or implemented
- Interfaces between components
- Data flows and mutations
- External dependencies and trust boundaries
If the target is <100 lines with 1-2 simple components, suggest a single-pass review instead — swarming is overkill for small targets.
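The small-target gate reduces to a line count plus the component tally from the initial read. A quick sketch; the `component_count` argument is assumed to come from the reviewer's notes above:

```python
def is_small_target(path: str, component_count: int) -> bool:
    """True when swarming is overkill: under 100 lines and at most 2 simple components."""
    with open(path, encoding="utf-8") as f:
        line_count = sum(1 for _ in f)
    return line_count < 100 and component_count <= 2
```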
Round 1 (broad). Purpose: catch obvious issues before spending tokens on multisampling.
Launch ONE Agent with this prompt:
You are a senior architect reviewing a plan document before implementation.
Your goal: find issues that would cause bugs, rework, or confusion during
implementation.
## Plan to review
{paste or reference the plan document path}
Read the entire plan. Then check for:
1. CONTRACTS — are interfaces between components fully specified?
Types, error codes, required vs optional fields, versioning.
2. DATA FLOW — is data transformation described end-to-end?
What happens at each boundary? Backward compatibility?
3. NEGATIVE SCENARIOS — what happens when things fail?
Timeouts, partial failures, invalid input, race conditions.
4. CONSISTENCY — do different sections contradict each other?
Same entity described differently in two places?
5. COMPLETENESS — are there gaps? Steps that say "TBD" or "later"?
Scenarios mentioned but not covered?
6. DEPENDENCIES — is implementation order clear?
Are blocking dependencies identified? Circular deps?
7. AMBIGUITY — could two engineers read a section and implement
it differently? Vague terms like "handle appropriately"?
## Output format
For EACH finding:
FINDING: {one-line description}
SECTION: {which section of the plan}
SEVERITY: HIGH | MEDIUM | LOW
EVIDENCE: {quote the problematic text, max 2 lines}
FIX: {concrete change to the plan text}
If the plan is clean — output: "NO_FINDINGS — plan review clean."
Do NOT pad with praise. Only problems.
Collect findings. If 0 findings → plan is clean, congratulate user, stop.
If findings exist: present them, apply the fixes to the plan, then proceed to Round 2. Fix before re-reviewing; every later round reads the updated document.
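Because findings arrive in a fixed labeled format, collecting them is mechanical. A minimal parser sketch, assuming agents follow the output format verbatim (a multi-line EVIDENCE field keeps only its first line here):

```python
import re
from dataclasses import dataclass

@dataclass
class Finding:
    description: str
    section: str
    severity: str   # HIGH | MEDIUM | LOW
    evidence: str
    fix: str

def parse_findings(reply: str) -> list[Finding]:
    """Split the reply on FINDING: markers and pull out each labeled field."""
    if "NO_FINDINGS" in reply:
        return []
    findings = []
    # Each block starts at a line beginning with "FINDING:" and runs until the next one.
    for block in re.split(r"(?=^FINDING:)", reply, flags=re.MULTILINE):
        if not block.startswith("FINDING:"):
            continue
        fields = dict(re.findall(
            r"^(FINDING|SECTION|SEVERITY|EVIDENCE|FIX):\s*(.*)$",
            block, flags=re.MULTILINE))
        findings.append(Finding(
            description=fields.get("FINDING", ""),
            section=fields.get("SECTION", ""),
            severity=fields.get("SEVERITY", "MEDIUM"),
            evidence=fields.get("EVIDENCE", ""),
            fix=fields.get("FIX", ""),
        ))
    return findings
```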
Round 2 (multisample). Purpose: stochastic diversity catches what one pass missed.
IMPORTANT: do NOT use identical prompts for all agents. Research [2502.11027] shows identical prompts produce correlated errors — agents "cluster" on the same issues and miss the same blind spots. Instead, give each agent a DIFFERENT perspective while reviewing the same document.
Launch 3 agents in parallel (or 5 for critical plans), each with a different reviewer persona:
CRITICAL: launch all agents in a SINGLE message (parallel tool calls). Each agent has isolated context — no cross-contamination.
Plan mode personas:

| Agent | Persona | Focus bias |
|---|---|---|
| 1 | Skeptical implementer | "I have to code this tomorrow — what's unclear, contradictory, or impossible?" |
| 2 | Security auditor | "Where are the trust boundaries? What happens with malicious input?" |
| 3 | QA engineer | "How do I test this? What edge cases aren't covered? What breaks at scale?" |
| 4 | New team member | "I just joined — what terms are undefined? What implicit knowledge is required?" |
| 5 | Ops/SRE | "What fails at 3am? What's the rollback plan? What's unmonitored?" |
Code mode personas:

| Agent | Persona | Focus bias |
|---|---|---|
| 1 | Attacker | "How do I exploit this? Injection, auth bypass, privilege escalation?" |
| 2 | Concurrency specialist | "What races, deadlocks, or ordering issues exist?" |
| 3 | Performance engineer | "What's O(n^2)? What allocates unbounded memory? What blocks the event loop?" |
| 4 | Error recovery auditor | "What happens when X fails? Is cleanup correct? Are resources leaked?" |
| 5 | Integration tester | "Do contracts match? Are types compatible? What breaks at boundaries?" |
You are a {PERSONA} reviewing {plan/code} before implementation/deployment.
Your perspective: {FOCUS_BIAS}
Review the ENTIRE document through your specific lens.
{same checklist and output format as Round 1}
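In orchestration terms, the fan-out looks like the sketch below. The `run_agent` coroutine is a hypothetical stand-in for the real agent-dispatch tool; inside Claude Code the equivalent is parallel Task tool calls in one message, not asyncio:

```python
import asyncio

# Three of the plan-mode personas from the table above.
PLAN_PERSONAS = {
    "Skeptical implementer": "I have to code this tomorrow -- what's unclear, contradictory, or impossible?",
    "Security auditor": "Where are the trust boundaries? What happens with malicious input?",
    "QA engineer": "How do I test this? What edge cases aren't covered? What breaks at scale?",
}

def build_persona_prompt(persona: str, focus_bias: str, plan_path: str) -> str:
    return (
        f"You are a {persona} reviewing a plan before implementation.\n"
        f"Your perspective: {focus_bias}\n"
        f"Review the ENTIRE document at {plan_path} through your specific lens.\n"
        "(...same checklist and output format as Round 1...)"
    )

async def run_round2(plan_path: str, run_agent) -> list[str]:
    """One agent per persona, all launched together; each gets an isolated context."""
    prompts = [build_persona_prompt(p, bias, plan_path) for p, bias in PLAN_PERSONAS.items()]
    return list(await asyncio.gather(*(run_agent(prompt) for prompt in prompts)))
```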
Stop criteria: if Round 2 found 0 high + <=2 medium → STOP. Plan is solid.
Round 3 (focused). Purpose: narrow scope = deeper analysis per aspect.
Based on the target content, select 3-7 aspects.
Plan mode aspects (a selection sketch follows the table):

| Aspect | When to include |
|---|---|
| Contracts & Interfaces | Plan describes >2 interacting components |
| Data Flow & Migrations | Plan involves data transformation, DB changes, or state migration |
| Negative Scenarios | Plan describes user-facing features or distributed systems |
| Consistency | Plan is >300 lines or written by multiple authors |
| Completeness | Plan references external systems or has phased rollout |
| Security & Trust | Plan involves auth, user input, or external APIs |
| Dependencies & Order | Plan has >5 implementation steps or parallel workstreams |
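The inclusion rules above are simple predicates over features noted during the initial read. A sketch; the `PlanTraits` field names are assumptions, not a defined schema:

```python
from dataclasses import dataclass

@dataclass
class PlanTraits:
    # Coarse features noted during the initial read; field names are illustrative.
    lines: int
    components: int
    touches_data: bool               # data transformation, DB changes, state migration
    user_facing_or_distributed: bool
    multi_author: bool
    external_systems: bool           # external systems or phased rollout
    touches_auth_or_input: bool
    steps: int

def select_aspects(t: PlanTraits) -> list[str]:
    """Apply the inclusion rules from the aspect table, capped at 7 aspects.
    The 3-aspect floor is left to the reviewer's judgment."""
    rules = [
        ("Contracts & Interfaces", t.components > 2),
        ("Data Flow & Migrations", t.touches_data),
        ("Negative Scenarios", t.user_facing_or_distributed),
        ("Consistency", t.lines > 300 or t.multi_author),
        ("Completeness", t.external_systems),
        ("Security & Trust", t.touches_auth_or_input),
        ("Dependencies & Order", t.steps > 5),
    ]
    return [name for name, include in rules if include][:7]
```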
Code mode: before launching agents, read references/vulnerability-kb.md for condensed detection heuristics per CWE class, and feed the relevant CWE heuristics into each agent's prompt.
Full Vul-RAG entries with code examples: knowledge-vault/docs/security/cwe/.
Based on MultiVer [2602.17875] and VulAgent [2509.11523] patterns:
| Aspect | What to trace |
|---|---|
| Injection & Input Validation | SQL/NoSQL/command/LDAP injection, XSS, path traversal, template injection |
| Auth & Access Control | Auth bypass, privilege escalation, IDOR, missing authorization checks |
| Concurrency & State | Race conditions, TOCTOU, deadlocks, shared mutable state, atomicity violations |
| Memory & Resources | Buffer overflows, use-after-free, resource leaks, unbounded allocations |
| Error Handling & Recovery | Swallowed errors, info leakage in errors, incomplete cleanup, missing rollback |
| Cryptography & Secrets | Weak algorithms, hardcoded secrets, improper random, timing attacks |
| Business Logic | State machine violations, numeric overflow in prices, missing validation of business rules |
Present selected aspects to user: "I'll focus review on these {N} aspects: ..."
For each aspect, launch ONE Agent with a FOCUSED prompt:
You are reviewing a plan document with a SINGLE focus: {ASPECT_NAME}.
Ignore everything outside your focus area — other reviewers handle those.
## Your focus: {ASPECT_NAME}
{ASPECT_DESCRIPTION — 2-3 sentences explaining what to look for}
## Plan to review
{reference the plan document path — the latest version with all prior fixes}
Read the ENTIRE plan but analyze ONLY through the lens of {ASPECT_NAME}.
Go deep: trace every {aspect-relevant thing} end-to-end. Check that every
scenario is complete, every interface is specified, every edge case is handled.
## Output format
FINDING: {one-line description}
SECTION: {which section}
SEVERITY: HIGH | MEDIUM | LOW
ASPECT: {ASPECT_NAME}
EVIDENCE: {quote, max 2 lines}
FIX: {concrete change}
If clean — output: "NO_FINDINGS — {ASPECT_NAME} review clean."
Launch ALL aspect agents in a SINGLE message (parallel).
Same dedup + synthesis. Present focused report.
Stop criteria: 0 high + <=2 medium → STOP. Otherwise ask about Round 4.
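Dedup, confidence, and the stop check work the same at every round. A sketch; the similarity key (section plus normalized description) is an assumption, since the skill text does not prescribe a matching rule:

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Finding:
    # Same shape as the Round 1 parser sketch, plus a confidence field set during dedup.
    description: str
    section: str
    severity: str              # HIGH | MEDIUM | LOW
    confidence: str = "MEDIUM"

def dedup(findings: list[Finding]) -> list[Finding]:
    """Collapse findings reported by multiple agents; agreement upgrades confidence to HIGH."""
    merged: dict[tuple[str, str], Finding] = {}
    hits: Counter = Counter()
    for f in findings:
        key = (f.section.strip().lower(), f.description.strip().lower())
        hits[key] += 1
        merged.setdefault(key, f)
    for key, f in merged.items():
        f.confidence = "HIGH" if hits[key] > 1 else "MEDIUM"
    return list(merged.values())

def should_stop(findings: list[Finding]) -> bool:
    """Stop criteria shared by Rounds 2-4: zero HIGH and at most two MEDIUM findings."""
    by_severity = Counter(f.severity for f in findings)
    return by_severity["HIGH"] == 0 and by_severity["MEDIUM"] <= 2
```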
Round 4 (focused + multisample). Purpose: maximum depth. Only for critical plans where Round 3 still found high-severity issues.
Gate: ask user explicitly: "Round 3 still found {N} high-severity issues. Round 4 will launch {aspects x 2-3} agents (~{estimate} tokens). Continue?"
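Assembling the gate message with a back-of-envelope cost estimate. A sketch; `tokens_per_agent` is an illustrative guess, not a measured figure:

```python
def round4_gate_message(flagged_aspects: list[str], high_count: int,
                        agents_per_aspect: int = 3, tokens_per_agent: int = 30_000) -> str:
    """Round 4 launches 2-3 agents per aspect that still had findings in Round 3."""
    agents = len(flagged_aspects) * agents_per_aspect
    return (f"Round 3 still found {high_count} high-severity issues. "
            f"Round 4 will launch {agents} agents (~{agents * tokens_per_agent:,} tokens). Continue?")
```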
For each aspect from Round 3 that had findings, launch 2-3 agents with the same focused prompt. Same parallel launch pattern.
Final synthesis. At this depth, the plan should be clean. If still finding high-severity issues → the plan likely needs structural rework, not just polish. Tell the user.
====================================================
ROUND {N}: {BROAD | MULTISAMPLE | FOCUSED | FOCUSED+MULTISAMPLE}
Agents: {count} | New findings: {count} | Dupes removed: {count}
====================================================
-- HIGH ({count}) ------------------------------------
1. [{aspect}] {description}
Section: {section reference}
Evidence: "{quoted text}"
Fix: {concrete change}
Confidence: {HIGH if found by multiple agents, MEDIUM otherwise}
-- MEDIUM ({count}) ----------------------------------
2. [{aspect}] {description}
...
-- LOW ({count}) -------------------------------------
3. ...
====================================================
CUMULATIVE: {total_high} high / {total_medium} medium / {total_low} low
RECOMMENDATION: CONTINUE -> Round {N+1} | STOP - plan is clean
====================================================
After the last round (wherever the process stops):
====================================================
PLAN SWARM REVIEW COMPLETE
====================================================
Rounds executed: {N}
Total agents launched: {count}
Total findings: {count} ({fixed} fixed, {deferred} deferred)
By severity:
HIGH: {count found} -> {count fixed}
MEDIUM: {count found} -> {count fixed}
LOW: {count found} -> {count fixed}
Round breakdown:
R1 (broad): {findings_count} findings
R2 (multisample): {findings_count} findings
R3 (focused): {findings_count} findings
R4 (focus+multi): {findings_count} findings
VERDICT: {HARDENED | IMPROVED | NEEDS_REWORK}
====================================================
Verdicts: HARDENED (all high and medium findings fixed), IMPROVED (findings fixed but some deferred), NEEDS_REWORK (high-severity issues persist; the plan needs structural rework, not polish).

When to use this skill versus related reviews:
| Scenario | Use |
|---|---|
| Quick architecture check | /plan-eng-review |
| CEO-level scope challenge | /plan-ceo-review |
| Design/UX review | /plan-design-review |
| Code diff review (pre-merge) | /review or /deep-review |
| Thorough plan hardening before implementation | /plan-swarm-review (plan mode) |
| Plan with many interacting components | /plan-swarm-review (plan mode) |
| High-stakes plan (infra, security, payments) | /plan-swarm-review (plan mode) |
| Security audit of a module/codebase | /plan-swarm-review (code mode) |
| Pre-release vulnerability hunt | /plan-swarm-review (code mode) |
| Bug hunt when "something is wrong but tests pass" | /plan-swarm-review (code mode) |