Runs parallel code reviews with multiple AI models (claude, codex, gemini), performs cross-critiques to detect hallucinations and severity issues, then synthesizes a deduplicated report for high-stakes reviews like security or pre-merge.
npx claudepluginhub prime-radiant-inc/parallel-adversarial-review

This skill uses the workspace's default tool permissions.
A three-stage review pipeline that uses multiple installed coding-agent CLIs as independent reviewers, then has them critique each other's findings, then synthesizes a final report. Catches model-specific blind spots and hallucinations that single-model PAR cannot.
| Situation | Use |
|---|---|
| Routine review, normal stakes | parallel-adversarial-review (faster, cheaper) |
| Pre-merge review on hot path code | MMAR |
| Security review | MMAR |
| Production incident postmortem code change | MMAR |
| You suspect a model has a blind spot for this kind of code | MMAR |
| Compliance / audit artifact | MMAR |
MMAR costs more: N reviewer invocations plus a synthesis pass, plus N×(N-1) cross-critiques in between. Don't reach for it on every commit.
Stage 1: Parallel Reviews
┌──────────────┬──────────────┬──────────────┐
diff ──────► │ claude │ codex │ gemini │ ───► findings_<model>.md
└──────┬───────┴──────┬───────┴──────┬───────┘
│ │ │
Stage 2: Cross-Critique (N×(N-1) grid)
┌──────────────┬──────────────┬──────────────┐
│ codex critiq │ gemini critiq│ codex critiq │
│ of claude │ of claude │ of gemini │ ───► critique_<a>_of_<b>.md
│ ... │
└──────┬───────┴──────┬───────┴──────┬───────┘
│ │ │
Stage 3: Synthesis
┌──────────────────────────────────────────────┐
│ synthesizer (Claude as subagent or CLI) │
│ - dedupe across reviewers │
│ - drop hallucinations flagged by critics │
│ - apply severity-disagreement rule │
│ - produce final findings report │
└──────────────────────────────────────────────┘
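With three reviewers, stage 2 enumerates every ordered (critic, author) pair: N×(N-1) critiques, and a model never critiques itself. A minimal sketch of that grid (illustration only; the real enumeration lives in scripts/mmar.py):

```python
from itertools import permutations

reviewers = ["claude", "codex", "gemini"]

# Every ordered (critic, author) pair; no model critiques its own findings.
critique_pairs = list(permutations(reviewers, 2))  # N*(N-1) = 6 pairs

for critic, author in critique_pairs:
    print(f"critique_{critic}_of_{author}.md")
```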
You do not call CLIs by hand. Run the driver:
${CLAUDE_PLUGIN_ROOT}/scripts/mmar.py review <diff_path_or_-> [options]
Options:
- --reviewers claude,codex,gemini — which CLIs to use (default: auto-detect)
- --workdir <dir> — repo to run reviewers in (default: cwd)
- --out <dir> — where to write per-stage artifacts (default: ./.mmar/<timestamp>/)
- --mock-dir <dir> — read pre-recorded responses from <dir>/<stage>/<reviewer>.txt instead of calling CLIs (for evals and CI)
- --skip-critique — stage 1 + stage 3 only, no cross-critique (degrades to multi-model PAR)
- --domain-prompt <file> — path to a file with the domain-specific reviewer instructions (e.g. "review for security bugs"); defaults to a generic code-quality prompt

The driver writes per-stage artifacts and a final findings.md in the output directory. Read it. Pass it to the implementer or whoever owns the next stage.
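A typical high-stakes invocation pipes the branch diff on stdin (the prompt file name below is hypothetical; every flag is from the list above):

git diff main...HEAD | ${CLAUDE_PLUGIN_ROOT}/scripts/mmar.py review - --reviewers claude,codex,gemini --domain-prompt security-review-prompt.md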
CLI invocations are defined in scripts/adapters.toml. Each entry maps a CLI name to a command template. To add a new CLI or fix an invocation that broke (CLIs change their flags), edit that file. The driver reads it on every run.
Default-on reviewers (enabled if installed): claude, codex, gemini, pi, opencode.
Opt-in reviewers (enabled=false; flip in adapters.toml after configuring credentials): amp, droid.
If a CLI is not installed or not enabled, it is silently skipped. The driver requires at least 2 reviewers to proceed; with 1 it errors and tells you to install another or use plain PAR.
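If you need to inspect or script against the adapter registry, tomllib (Python 3.11+) reads it directly. The entry shape in the comment is an assumption for illustration; the authoritative format is whatever scripts/adapters.toml actually contains:

```python
import shlex
import tomllib

# Assumed entry shape (check scripts/adapters.toml for the real fields):
#   [claude]
#   enabled = true
#   command = "claude -p {prompt_file}"

with open("scripts/adapters.toml", "rb") as f:
    adapters = tomllib.load(f)

for name, spec in adapters.items():
    if not spec.get("enabled", True):
        continue  # opt-in reviewers (amp, droid) stay off until flipped here
    # Render the command template into an argv for subprocess.
    argv = shlex.split(spec["command"].format(prompt_file="reviewer-prompt.md"))
    print(name, argv)
```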
Stage 1 reviewers get the prompt in reviewer-wrapper.md. Stage 2 critics get the prompt in critic-wrapper.md. Stage 3 synthesis uses synthesizer-prompt.md. The driver assembles these from the diff and findings; you do not invoke them directly.
The synthesizer applies the severity-disagreement rule:

- [high] if found by ≥2 reviewers and not flagged by any critic
- [medium] if found by 1 reviewer and not flagged
- [low] if found by 1 reviewer and partially flagged

These rules are enforced by synthesizer-prompt.md. Do not negotiate with them.
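In code form, the rule is a small function. This sketch covers only the stated cases; enforcement actually lives in synthesizer-prompt.md, and anything the rules leave unspecified is the synthesizer's call:

```python
def assign_severity(reviewer_count: int, critic_verdict: str) -> str | None:
    """critic_verdict: 'clean' (no critic flagged it), 'partial', or 'flagged'."""
    if critic_verdict == "flagged":
        return None  # dropped as a likely hallucination
    if reviewer_count >= 2 and critic_verdict == "clean":
        return "high"
    if reviewer_count == 1 and critic_verdict == "clean":
        return "medium"
    if reviewer_count == 1 and critic_verdict == "partial":
        return "low"
    return None  # not covered by the stated rules; left to the synthesizer
```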
| Failure | Response |
|---|---|
| A reviewer CLI hangs | Driver has a per-CLI timeout (default 5 min). Kill that adapter; continue with the rest. |
| A reviewer returns empty/unparseable output | Treated as "no findings"; do not retry — that biases the result. |
| Cross-critique pairs balloon (6 reviewers → 30 critiques) | Driver caps critiques per reviewer at 3 random others. Configurable. |
| Cost concerns | Use --reviewers to pick fewer, or use plain PAR. |
| Network down | Run with --mock-dir over a recorded fixture, or use plain PAR with the local Claude session. |
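The hang and empty-output rows amount to simple subprocess hygiene. A sketch of the behavior the table describes (not the driver's actual code):

```python
import subprocess

def run_reviewer(argv: list[str], diff_text: str, timeout_s: int = 300) -> str:
    """Run one reviewer CLI over the diff without letting it stall the run."""
    try:
        proc = subprocess.run(
            argv, input=diff_text, capture_output=True, text=True,
            timeout=timeout_s,  # per-CLI timeout, default 5 min
        )
    except subprocess.TimeoutExpired:
        return ""  # kill that adapter; continue with the remaining reviewers
    return proc.stdout or ""  # empty output counts as "no findings"; never retry
```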
The eval suite in evals/ measures recall and precision against fixtures with planted defects. Run:
${CLAUDE_PLUGIN_ROOT}/evals/runner.py --mode mock # cheap, deterministic, CI-safe
${CLAUDE_PLUGIN_ROOT}/evals/runner.py --mode live # actually invokes CLIs (costs $$)
If you change the wrappers or synthesizer, re-run evals before merging. A regression on recall or precision is a blocker.
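Recall and precision have their standard meanings over the planted defects: the share of planted defects the pipeline found, and the share of reported findings that are real. A sketch, assuming findings can be matched to planted-defect IDs:

```python
def score(planted: set[str], reported: set[str]) -> tuple[float, float]:
    hits = planted & reported
    recall = len(hits) / len(planted) if planted else 1.0
    precision = len(hits) / len(reported) if reported else 1.0
    return recall, precision
```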
- reviewer-wrapper.md, critic-wrapper.md, synthesizer-prompt.md — the prompts the driver uses.
- parallel-adversarial-review skill — single-model PAR for routine reviews.
- scripts/adapters.toml — how to add or fix a CLI invocation.