From superpowers-plus
Dispatches 5 parallel specialist reviewers (Defect Finder, Design Critic, Guardian, Standards Enforcer, Performance Analyst) to analyze code changes and aggregate prioritized findings.
npx claudepluginhub bordenet/superpowers-plus --plugin superpowers-plusThis skill uses the workspace's default tool permissions.
> **Wrong skill?** File-protocol review handoff → `code-review`. PR inline → `providing-code-review`. Pre-commit gate → `progressive-code-review-gate`. **Slash commands:** `/sp-cr-battery [min-score]` (primary; optional 1.0–10.0 quality threshold, default 7.0), `/sp-deepreview` (legacy).
Mandates invoking relevant skills via tools before any response in coding sessions. Covers access, priorities, and adaptations for Claude Code, Copilot CLI, Gemini CLI.
Share bugs, ideas, or general feedback.
Wrong skill? File-protocol review handoff →
code-review. PR inline →providing-code-review. Pre-commit gate →progressive-code-review-gate. Slash commands:/sp-cr-battery [min-score](primary; optional 1.0–10.0 quality threshold, default 7.0),/sp-deepreview(legacy).
Dispatch 5 specialized reviewer agents in parallel, each focused on a distinct set of review dimensions. A triage coordinator selects which reviewers to activate based on the diff, then aggregates findings into a unified report.
Why this exists: A single reviewer tries to evaluate everything simultaneously, leading to shallow coverage, inconsistent focus, and ~40% false positive rates. Specialized reviewers with focused prompts produce deeper analysis with near-zero false positives.
requesting-code-review or progressive-code-review-gate triggers a reviewverification-before-completion detects the implementation→presentation transition and no valid sentinel exists for HEADGate chain position 3 of 4:
| Gate (order) | Self-fires when | Short-circuit if |
|---|---|---|
1. debate | About to commit to a design before coding | Already ran this session |
2. progressive-harsh-review | About to present a non-code deliverable | Already ran on this artifact |
3. code-review-battery | About to present/commit/push code | Valid sentinel for HEAD exists |
4. verification-before-completion | About to write any results-presenting response | Sentinel SHA == HEAD → skip re-dispatch |
One-per-unit rule: Battery fires at most once per coherent unit of work. If a valid .code-review-cleared sentinel exists for HEAD, the gate is already satisfied — do not re-dispatch.
This is the canonical skip gate for the one-per-unit rule. Callers (requesting-code-review, finishing-a-development-branch, progressive-code-review-gate) should run this before dispatching. If a caller does not implement Phase 0 explicitly, the agent should apply this decision manually before invoking battery.
SENTINEL="$(git rev-parse --show-toplevel 2>/dev/null || echo .)/.code-review-cleared"
cat "$SENTINEL" 2>/dev/null || echo "NO CLEARANCE"
echo "HEAD: $(git rev-parse HEAD 2>/dev/null)"
git diff --quiet && git diff --cached --quiet && echo "WORKTREE_CLEAN" || echo "WORKTREE_DIRTY"
| Sentinel state | Decision |
|---|---|
NO CLEARANCE | Run battery (proceed to Phase 1). |
| Sentinel SHA ≠ HEAD SHA | Run battery (battery is stale). |
Sentinel valid for HEAD but WORKTREE_DIRTY | Run battery (staged/unstaged changes exist that were not reviewed). |
Valid sentinel for HEAD AND WORKTREE_CLEAN | Skip. Battery already ran on the current code. Note the clearance and skip to Phase 6. |
| Malformed | Delete .code-review-cleared, run battery. |
Analyze the diff and select reviewers:
| Reviewer | Focus | Activate When |
|---|---|---|
| Defect Finder | Correctness, edge cases, concurrency | Any code change |
| Design Critic | Factoring, complexity, API design | Adds/modifies classes, functions, public APIs |
| Guardian | Security, blast radius, backwards compat | Any code change |
| Standards Enforcer | Docs, test quality, observability | Always |
| Performance Analyst | Performance, logging | DB, loops, caching, network I/O, or >500 LOC |
| Monolith (on-demand) | All dimensions | --all flag or manual request |
Decision rules: Docs-only → Standards Enforcer only. Config-only → Guardian only. Any code → Defect Finder + Guardian + Standards Enforcer + conditionally Design Critic and Performance Analyst.
Mandatory activation (not subject to triage exclusion):
Overrides: --all (force all), --only=<name>, --skip=<name>, --round1-only (skip escalation).
State your triage decision before dispatching:
**Triage**: Activated: [list] | Skipped: [list] | Reason: [1-2 sentences]
Sub-agents have NO conversation context. Pass diff + source context inline.
1. Capture diff: git diff --cached, git diff HEAD~1, or git diff main..HEAD
2. Source context for ripple analysis (#1 missed-finding cause = reviewing diff in isolation):
3. Inbound reference scan (mandatory when diff renames, moves, or deletes files):
git diff --diff-filter=RD --name-only main..HEAD # old paths
grep -rn "old-filename" . --include="*.md" --include="*.ts" --include="*.sh" # scan ENTIRE repo
MUST scan outside the changed directory. The #1 failure mode: scoping grep to the refactored directory, missing sibling modules that reference old paths. Hits outside the diff are mandatory CRITICAL findings — broken consumers the author didn't update. Include grep results in every reviewer's context.
4. Dispatch ALL activated reviewers simultaneously via sub-agent-code-reviewer (Augment) or Task() (Claude). Each gets: reviewer prompt + full diff + source context + inbound reference scan results.
After all reviewers return:
[Reviewer Name][Reclassified: Critical → Minor — missing tests are a standards gap, not a production defect]| Finding | CX Impact | Complexity | Testability | Action |
|---|---|---|---|---|
| #1 ... | Fixes dead-air | +3 lines | Clearer tests | Implement |
| #3 ... | None | Adds abstraction | Marginal | Defer |
Tightening: If total findings >10, suppress Minor findings from the report body. Still count them in the summary line. Never suppress Critical or Important. State "Tightening applied: [N] Minor findings suppressed" in the report.
Report format: Header (activated/skipped reviewers) → Critical → Important → Minor (full, or "[N] Minor findings suppressed") → Clean Dimensions → Action Classification table → Durable Checks summary → Live Metrics → Summary (Findings: [N] Critical, [N] Important, [N] Minor ([N] suppressed) | Metrics: durable=[N]% or N/A, convergent-count=[N], unresolved-critical=[N]).
Metrics: Durable check rate (≥50%), convergent finding count, unresolved Critical count (target: 0). Offline: precision ≥75%, high-sev precision ≥80%, Round 2 yield ≤20%.
Score (after all fix rounds): 10.0 − (Critical×2.5) − (Important×1.5) − (Minor×0.25) − (durable<50% ? 0.5 : 0), floor 0.0. Calibration: 0 findings → 10.0; 1 Important → 8.5; 1 Critical → 7.5; 2 Importants + low durable → 6.5. Extract threshold from invocation (e.g. /sp-cr-battery 8.5 → 8.5; default 7.0). Score < threshold → abort Phase 6.
If ANY trigger fires after Round 1, re-dispatch a focused reviewer:
| Trigger | Re-run | Why |
|---|---|---|
| >2 state/flag findings | Defect Finder (interaction-path focus) | Systemic timing/ordering |
| >3 test quality issues | Standards Enforcer (mock-focused) | Shared mock infrastructure |
| >50 lines removed or functions deleted | Guardian (deletion focus) | Callers may depend on removed behavior |
| Files renamed/moved/deleted without inbound scan | Guardian (inbound-reference focus) | Broken consumers outside the changed directory |
| "Pre-existing" issues flagged | Defect Finder (lifecycle focus) | Deeper structural gaps |
Re-dispatch with focused instruction (diff slice + refreshed context + trigger signal). Append under ### Round 2 Findings. Skip if --round1-only, all clean, or diff <20 lines.
STOP when: unresolved Critical = 0, last 2 passes <20% new high-sev, durable ≥50%. CONTINUE if escalation trigger fires or Critical remains. ESCALATE TO HUMAN after 3 passes.
After synthesis, scan all reviewer outputs for shared blind spots:
⚠️ CORRELATED EVIDENCE — reviewers may share a blind spot outside the cited region. Expand the review scope to adjacent modules.⚠️ ECHO REASONING — findings may reflect shared analytical bias, not independent analysis. Require at least one reviewer to re-examine from a different entry point.⚠️ UNANIMOUS CLEAN — verify reviewers examined different evidence slices. Check that each reviewer's output references different source files or code paths.Correlated-failure flags do NOT change verdicts directly — they trigger expanded scope or re-examination. The goal is to surface shared blind spots, not to manufacture findings.
Prerequisite: Correlated-Failure Detection has completed and no re-examination was triggered.
If final verdict is PASS or PASS_WITH_NITS (all nits resolved):
# tools/run-battery.sh is the ONLY permitted way to write .code-review-cleared.
# Pass --min-score with the threshold from the invocation (default 7.0).
tools/run-battery.sh --verdict PASS # min-score 7.0 (default)
tools/run-battery.sh --verdict PASS --min-score 9.1 # explicit threshold
tools/run-battery.sh --verdict PASS_WITH_NITS --min-score 8.0 # nits verdict + threshold
❌ Never write
.code-review-cleareddirectly withecho. Usetools/run-battery.shso that automated checks run before the sentinel is written.
Timing: Battery may run before or after git commit. The sentinel records the SHA at time of writing (HEAD). If battery ran pre-commit, the sentinel will be stale after commit — run battery again before pushing. The pre-push hook is the authoritative validator: it checks that sentinel SHA matches the ref being pushed. Do not skip this step — without a valid sentinel, the push will be blocked.
If verdict is REJECT or PASS_WITH_FIXES: do NOT write the sentinel. Fix all Critical/Important findings, re-dispatch, then write sentinel when the re-run passes.
Monolith found something no specialist found → candidate pattern → candidates/. Reviewer fails → note, don't retry. Diff >3000 lines → warn, suggest chunks. Empty diff → skip.
| Anti-Pattern | Detection | Correction |
|---|---|---|
| All reviewers agree | No disagreements found | Force second-order critique: each reviewer must name ≥1 plausible failure mode or state why none exists — but the explanation must cite a specific property of the change (e.g., "pure rename, no callers affected"), not a generic "it's straightforward." Artificial dissent without reasoning is noise; generic dismissal is rubber-stamping |
| Duplicate findings | Same issue from 3 reviewers | Deduplicate in synthesis, attribute first finder |
| Reviewer fatigue | Later reviewers less thorough | Randomize dispatch order |
| Missing source context | Review diff without callers | Include grep results for all touched functions |
| Over-scoping | Reviewing unchanged code | Focus on diff + directly impacted callers only |
| Failure | Fix |
|---|---|
| Sub-agent returns no findings on complex diff | Verify diff + source context was passed inline — sub-agents have no conversation context |
| False positives from isolated diff review | Include source context (callers, field readers) per Phase 2 — isolation is the #1 cause |
| Convergence never reached | Escalate to human after 3 passes |
| Monolith finds issues specialists missed | Log as gap-analysis candidate for specialist prompt improvement |