This skill should be used when the user asks to "review my code", "do a code review", "review this project", "find bugs in my code", "security review", "review my changes", "code audit", "check my code for issues", "analyze code quality", mentions multi-model review, consensus review, or wants a thorough code review using multiple AI models. Performs a three-phase review: light models for reconnaissance, powerful models for deep analysis with consensus scoring, and verification agents that test each finding with code snippets to eliminate false positives.
```
npx claudepluginhub pixelsquared/claude-skills --plugin code-review
```

This skill uses the workspace's default tool permissions.
Perform a three-phase multi-model code review that produces verified findings scored by cross-model consensus. In Phase 1, three light Claude sub-agents (Haiku) run in parallel to map the codebase -- identifying file structure, data flows, dependency graphs, critical areas, and complexity hotspots. Their outputs merge into a single codebase map, and each agent writes its own report to a timestamped review directory. In Phase 2, three powerful Claude sub-agents (Sonnet and Opus) receive this map as context and perform deep analysis across six review dimensions (bugs, security, performance, architecture, code quality, error handling), skipping exploration entirely because the map provides full orientation. Each agent writes its findings report to the review directory. After Phase 2 completes, collect all findings, match them across agents by file proximity and category, and assign consensus confidence tiers: Confirmed (3/3 agents agree), Likely (2/3), or Possible (1/3). In Phase 3, up to six Sonnet verification agents (one per review dimension) independently verify each finding by reading the cited source code, researching whether the issue is real, writing a test script, and executing it. Findings are classified as Verified, False Positive, or Inconclusive. The final report presents findings organized by confidence tier and severity, with verification status on each finding and false positives moved to a separate Dismissed section.
Three tiers of Claude models, dispatched as native sub-agents via the Task tool.
Phase 1 -- Reconnaissance:

| Agent | Model | Perspective | Focus |
|---|---|---|---|
| Structure Scout | model="haiku" | Architect | Module boundaries, dependency graph, layer separation, public API surface |
| Data Tracer | model="haiku" | Data Engineer | Data flow end-to-end, trust boundaries, input/output paths, transformation chains |
| Risk Spotter | model="haiku" | Security/Reliability Engineer | Critical areas, complexity hotspots, error-prone patterns, attack surface |
Phase 2 -- Deep review:

| Agent | Model | Perspective | Focus |
|---|---|---|---|
| Security Auditor | model="opus" | Adversarial Security Engineer | Think like an attacker -- injection, auth bypass, data exposure, trust boundary violations, supply chain risks |
| Reliability Engineer | model="sonnet" | Performance & Reliability SRE | Failure modes, resource exhaustion, scalability bottlenecks, error recovery, observability gaps |
| Craft Reviewer | model="sonnet" | Senior Software Architect | Architecture health, abstraction quality, maintainability, naming clarity, unnecessary complexity |
Phase 3 -- Verification:

| Agent | Model | Scope | Focus |
|---|---|---|---|
| Bugs Verifier | model="sonnet" | Bugs & Correctness findings | Reproduce logic errors, test edge cases, verify code paths |
| Security Verifier | model="sonnet" | Security findings | Test injection vectors, verify auth flows, probe attack surfaces |
| Performance Verifier | model="sonnet" | Performance findings | Profile hotspots, test N+1 patterns, measure resource usage |
| Architecture Verifier | model="sonnet" | Architecture findings | Analyze coupling, check dependency cycles, verify layer violations |
| Quality Verifier | model="sonnet" | Code Quality findings | Check for dead code, verify duplication claims, test naming issues |
| Error Handling Verifier | model="sonnet" | Error Handling findings | Test failure paths, verify exception propagation, trigger error conditions |
Only dimensions with findings get a verification agent. Dimensions with zero findings are skipped.
Each agent receives a {PERSPECTIVE} block in its prompt that establishes its reviewer identity and what it pays closest attention to. Critically, every agent still reviews all sections/dimensions -- perspectives control depth of insight, not scope of coverage.
This produces genuinely diverse findings even though all agents are Claude models, because each approaches the same code with different priorities and mental models.
Override models or perspectives at runtime by telling Claude which to use.
Determine what code to review before entering Phase 1. Three scope modes exist:
- Full project (default): review all source files, excluding .git/, node_modules/, __pycache__/, dist/, build/, .next/, vendor/, binary files, images, lock files, and other build artifacts.
- Changes only: run git diff for unstaged changes or git diff --staged for staged changes. If the user mentions a branch, run git diff main...HEAD (or the appropriate base branch).
- Specific paths: review only the files or directories the user names.

For all modes, concatenate files with clear delimiters:
```
=== FILE: path/to/file.ext ===
[full file content]
```
Include the complete file path and full content for each file. For large codebases exceeding 50 files or 100KB of total content, split the code into batches. Run multiple agent rounds per phase, giving each round a subset of the files. Merge partial recon reports or partial review findings before proceeding.
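As an illustration, a minimal Python sketch of the gather-and-batch step (the skip list and thresholds mirror the guidance above; the helper names are assumptions, not part of the skill):

```python
import os

SKIP_DIRS = {".git", "node_modules", "__pycache__", "dist", "build", ".next", "vendor"}
MAX_FILES, MAX_BYTES = 50, 100_000  # batch thresholds from the guidance above

def gather(root="."):
    """Collect source files, skipping build artifacts, and format with delimiters."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    content = f.read()  # binaries are dropped via decode errors
            except (UnicodeDecodeError, OSError):
                continue
            entries.append(f"=== FILE: {path} ===\n{content}")
    return entries

def batches(entries):
    """Split delimited files into batches under the count/size limits."""
    batch, size = [], 0
    for entry in entries:
        if batch and (len(batch) >= MAX_FILES or size + len(entry) > MAX_BYTES):
            yield "\n".join(batch)
            batch, size = [], 0
        batch.append(entry)
        size += len(entry)
    if batch:
        yield "\n".join(batch)
```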
Before launching any agents, create a timestamped directory for all review artifacts:
- Path: docs/reviews/YYYY-MM-DD-HH-MM/ using the current date and time.
- Create it with mkdir -p docs/reviews/YYYY-MM-DD-HH-MM/.
- Resolve each agent's output file inside this directory into its {REPORT_PATH}.

All agents write their reports to this directory. The coordinator writes merged/consolidated files. The full directory structure when complete:
```
docs/reviews/YYYY-MM-DD-HH-MM/
  phase1-structure-scout.md
  phase1-data-tracer.md
  phase1-risk-spotter.md
  phase1-codebase-map.md
  phase2-security-auditor.md
  phase2-reliability-engineer.md
  phase2-craft-reviewer.md
  phase3-verification-bugs.md
  phase3-verification-security.md
  phase3-verification-performance.md
  phase3-verification-architecture.md
  phase3-verification-quality.md
  phase3-verification-error-handling.md
  final-report.md
```
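As a minimal sketch, the coordinator's directory setup in Python (the timestamp format matches the layout above; the variable names are illustrative):

```python
from datetime import datetime
from pathlib import Path

# Timestamped review directory, e.g. docs/reviews/2025-01-15-14-30/
review_dir = Path("docs/reviews") / datetime.now().strftime("%Y-%m-%d-%H-%M")
review_dir.mkdir(parents=True, exist_ok=True)

# One output file per agent, resolved into that agent's {REPORT_PATH}
report_path = review_dir / "phase1-structure-scout.md"
```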
Execute these steps in order:
Gather code content. Collect all source files for the resolved scope. Concatenate them with === FILE: path === delimiters into a single code block string. Record the total file count and scope description.
Read the recon prompt template. Load references/recon-prompt.md from this skill's directory. This template contains placeholders: {SCOPE_DESCRIPTION}, {CODE_CONTENT}, and {PERSPECTIVE}.
Prepare three perspective-specific recon prompts. Create three versions of the resolved template, each with a different {PERSPECTIVE} block:
Structure Scout: "You are an Architect. Focus on module boundaries, dependency relationships, layer separation, and the public API surface. In the File Inventory, be especially thorough about exports and inter-module contracts. In the Dependency Graph, trace coupling patterns and flag architectural violations. Map how the system is organized."
Data Tracer: "You are a Data Engineer. Focus on how data moves through the system end-to-end. In Data Flow, be especially thorough -- trace every entry point through every transformation to every output. Map trust boundaries where data crosses from untrusted to trusted contexts. Flag where input validation happens (or doesn't) along each path."
Risk Spotter: "You are a Security and Reliability Engineer. Focus on what can go wrong. In Critical Areas, be especially thorough -- flag every piece of code touching auth, crypto, databases, external APIs, user input, or the filesystem. In Complexity Hotspots, identify code most likely to harbor hidden bugs. Think about failure modes and attack surface."
Additionally, resolve {REPORT_PATH} in each agent's prompt to the corresponding output file:
- Structure Scout: docs/reviews/YYYY-MM-DD-HH-MM/phase1-structure-scout.md
- Data Tracer: docs/reviews/YYYY-MM-DD-HH-MM/phase1-data-tracer.md
- Risk Spotter: docs/reviews/YYYY-MM-DD-HH-MM/phase1-risk-spotter.md

Launch all three recon agents in parallel. Use the Task tool three times in the same message, each with subagent_type="general-purpose" and model="haiku". Place each perspective-specific resolved prompt in the corresponding task's prompt field. Instruct each sub-agent to return the full recon report as its response. Run all three in the background for parallelism.
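Conceptually, the three dispatches differ only in perspective and report path. A sketch of payload construction as plain data -- the Task tool itself is invoked by Claude, not through a Python API, so treat this as illustrative only (the truncated perspective strings stand in for the full blocks above):

```python
# Illustrative only: the three Phase 1 payloads, built from one template.
PERSPECTIVES = {
    "phase1-structure-scout.md": "You are an Architect. ...",
    "phase1-data-tracer.md": "You are a Data Engineer. ...",
    "phase1-risk-spotter.md": "You are a Security and Reliability Engineer. ...",
}

def recon_payloads(template: str, review_dir: str) -> list[dict]:
    """One Task payload per recon agent; only {PERSPECTIVE} and {REPORT_PATH} vary."""
    return [
        {
            "subagent_type": "general-purpose",
            "model": "haiku",
            "prompt": template
                .replace("{PERSPECTIVE}", perspective)
                .replace("{REPORT_PATH}", f"{review_dir}/{report}"),
        }
        for report, perspective in PERSPECTIVES.items()
    ]
```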
Wait for all three agents to complete. Monitor for failures. If an agent fails or times out, log the failure and proceed with the remaining reports. Two reports are sufficient for a useful merged map.
Collect all recon reports. Read the response from each completed sub-agent.
Merge into a unified codebase map. Create a single consolidated reconnaissance report, deduplicating observations that appear in more than one report while preserving each perspective's unique sections.
After merging, write the unified codebase map to docs/reviews/YYYY-MM-DD-HH-MM/phase1-codebase-map.md using the Write tool.
Execute these steps in order:
Read the deep review prompt template. Load references/review-prompt.md from this skill's directory. This template contains placeholders: {SCOPE_DESCRIPTION}, {CODEBASE_MAP}, {CODE_CONTENT}, and {PERSPECTIVE}.
Prepare three perspective-specific review prompts. Replace {SCOPE_DESCRIPTION}, {CODEBASE_MAP}, and {CODE_CONTENT} in all three. Then set a different {PERSPECTIVE} for each:
Security Auditor: "You are an Adversarial Security Engineer. Think like an attacker. Prioritize: injection vectors (SQL, XSS, command), authentication/authorization bypass, secrets exposure, path traversal, SSRF, unsafe deserialization, and trust boundary violations. Trace every path from user input to dangerous sinks. Question every assumption about data safety. For non-security dimensions, still review them but with a security-aware lens -- e.g., a performance issue that enables DoS, an architecture issue that makes security boundaries unclear."
Reliability Engineer: "You are a Performance and Reliability SRE. Think about what happens at scale and what happens when things fail. Prioritize: N+1 queries, blocking I/O in async paths, unbounded growth, memory leaks, missing timeouts, missing retries on transient failures, silent error swallowing, resource cleanup in error paths, and cascading failure risk. For non-performance dimensions, still review them but with a reliability lens -- e.g., a correctness bug that only manifests under load, an architecture issue that prevents graceful degradation."
Craft Reviewer: "You are a Senior Software Architect focused on long-term code health. Think about the developer who maintains this code next year. Prioritize: tight coupling, god classes, circular dependencies, violated abstractions, DRY violations, unclear naming, overly complex conditionals, missing interfaces at boundaries, and inconsistent patterns. For non-architecture dimensions, still review them but with a maintainability lens -- e.g., a bug hidden by unclear naming, an error handling gap caused by poor abstraction."
Additionally, resolve {REPORT_PATH} in each agent's prompt to the corresponding output file:
- Security Auditor: docs/reviews/YYYY-MM-DD-HH-MM/phase2-security-auditor.md
- Reliability Engineer: docs/reviews/YYYY-MM-DD-HH-MM/phase2-reliability-engineer.md
- Craft Reviewer: docs/reviews/YYYY-MM-DD-HH-MM/phase2-craft-reviewer.md

Launch all three deep review agents in parallel. Use the Task tool three times in the same message:

- subagent_type="general-purpose", model="opus" -- with the security perspective prompt
- subagent_type="general-purpose", model="sonnet" -- with the reliability perspective prompt
- subagent_type="general-purpose", model="sonnet" -- with the craft perspective prompt

Instruct each sub-agent to return the full findings report as its response. Run all three in the background for parallelism.
Wait for all three agents to complete. Handle failures as in Phase 1 -- log errors and continue with available results.
Collect all deep review outputs. Read the findings from each completed sub-agent. Each agent's output follows the findings format defined in references/review-prompt.md.
After collecting all Phase 2 findings, produce a single consolidated set of findings with consensus scores. Follow the detailed merging instructions in references/report-template.md. The procedure in brief:
Collect all findings. Normalize every finding from every agent into a standard record: file path, line number, category, severity, description, suggestion, and source agent name (e.g., "Opus", "Sonnet-1", "Sonnet-2").
Match findings across agents. Two findings from different agents are the same finding when both conditions hold: they reference the same file with line numbers within 5 lines of each other, AND they target the same review category. Compare all pairs.
Assign confidence tiers. Count how many distinct agents flagged each grouped finding. Three agents = Confirmed. Two agents = Likely. One agent = Possible.
Resolve severity conflicts. When matched agents assign different severity levels to the same finding, use the highest severity. A finding flagged as critical by one agent and medium by another becomes critical.
Select best description and combine suggestions. Pick the most specific, actionable description from among the agreeing agents. Merge non-redundant suggestions into a single recommendation. If agents propose different fixes, list them as alternatives.
Sort the final list. Order by confidence tier (Confirmed first, then Likely, then Possible), then by severity (critical, medium, low), then by file path and line number.
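A minimal sketch of this matching-and-tiering logic, assuming normalized finding dicts with file, line, category, severity, description, and agent fields (all field names are assumptions; description selection uses length as a crude proxy for specificity):

```python
TIERS = {3: "Confirmed", 2: "Likely", 1: "Possible"}
SEVERITY_RANK = {"critical": 0, "medium": 1, "low": 2}

def same_finding(a: dict, b: dict) -> bool:
    """Match rule: same file, line numbers within 5, same review category."""
    return (a["file"] == b["file"]
            and abs(a["line"] - b["line"]) <= 5
            and a["category"] == b["category"])

def consolidate(findings: list[dict]) -> list[dict]:
    # Group matching findings across agents.
    groups: list[list[dict]] = []
    for f in findings:
        for g in groups:
            if any(same_finding(f, m) for m in g):
                g.append(f)
                break
        else:
            groups.append([f])
    # Tier by distinct-agent count, take highest severity, keep best description.
    merged = []
    for g in groups:
        agents = {m["agent"] for m in g}
        merged.append({
            **max(g, key=lambda m: len(m["description"])),
            "severity": min((m["severity"] for m in g), key=SEVERITY_RANK.get),
            "tier": TIERS[min(len(agents), 3)],
        })
    # Sort: confidence tier, then severity, then file path and line.
    merged.sort(key=lambda m: (
        ["Confirmed", "Likely", "Possible"].index(m["tier"]),
        SEVERITY_RANK[m["severity"]],
        m["file"], m["line"],
    ))
    return merged
```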
Execute these steps after consensus scoring is complete:
Group findings by dimension. Take all consolidated findings from consensus scoring and group them into six buckets: bugs, security, performance, architecture, quality, error-handling. Record the finding IDs (F-1, F-2, etc.) for each group. Skip dimensions that have zero findings.
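For instance, the grouping step could look like this (a sketch; the dimension and field names follow the text above):

```python
from collections import defaultdict

DIMENSIONS = ["bugs", "security", "performance", "architecture", "quality", "error-handling"]

def group_by_dimension(findings: list[dict]) -> dict[str, list[str]]:
    """Bucket finding IDs (F-1, F-2, ...) by dimension; empty buckets are skipped."""
    groups: dict[str, list[str]] = defaultdict(list)
    for f in findings:
        groups[f["dimension"]].append(f["id"])
    return {d: groups[d] for d in DIMENSIONS if groups[d]}
```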
Read the verification prompt template. Load references/verification-prompt.md from this skill's directory. This template contains placeholders: {DIMENSION}, {FINDINGS}, {CODE_CONTENT}, {CODEBASE_MAP}, {REPORT_PATH}, {DATE}, {COUNT}, {VERIFIED_COUNT}, {FP_COUNT}, {INCONCLUSIVE_COUNT}.
Prepare dimension-specific verification prompts. For each dimension with findings, resolve the template:
- {DIMENSION} -- the dimension name (e.g., "Security", "Bugs & Correctness")
- {FINDINGS} -- the structured list of findings for this dimension, including finding ID, title, severity, consensus tier, file:line, description, code snippet, and suggestion
- {CODE_CONTENT} -- the full source code (same content passed to Phases 1 and 2)
- {CODEBASE_MAP} -- the merged Phase 1 codebase map
- {REPORT_PATH} -- the output path for this dimension's verification report (e.g., docs/reviews/YYYY-MM-DD-HH-MM/phase3-verification-security.md)

The remaining placeholders ({DATE}, {COUNT}, {VERIFIED_COUNT}, {FP_COUNT}, {INCONCLUSIVE_COUNT}) are filled in by the agent in its output, not by the coordinator.
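Template resolution itself is plain string substitution; a minimal sketch (the example values are hypothetical):

```python
def resolve_template(template: str, values: dict[str, str]) -> str:
    """Replace coordinator-supplied {PLACEHOLDER}s; agent-filled ones pass through untouched."""
    for key, value in values.items():
        template = template.replace("{" + key + "}", value)
    return template

# Example: {DATE} is deliberately left for the verification agent to fill.
print(resolve_template(
    "Verify the {DIMENSION} findings. Write your report to {REPORT_PATH}. Date: {DATE}",
    {"DIMENSION": "Security",
     "REPORT_PATH": "docs/reviews/2025-01-15-14-30/phase3-verification-security.md"},
))
```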
Launch verification agents in parallel. Use the Task tool for each dimension with findings. All use subagent_type="general-purpose", model="sonnet". Run in the background for parallelism.
The verification agents have access to: Read (to examine source files), Bash (to execute test scripts), Write (to save reports), WebSearch (to research language/framework behavior), Grep and Glob (to search the codebase).
Wait for all verification agents to complete. If an agent fails or times out, mark all its findings as Inconclusive and note the failure in the final report.
Collect all verification reports. Read each agent's output. Parse the verdicts for each finding.
Update the final report. For each finding in the consolidated report:
- Set its Verification field to the verdict from Phase 3
- Set its Evidence field to the 1-2 sentence summary

Write the final report. Save the complete report (with verification status on all findings) to docs/reviews/YYYY-MM-DD-HH-MM/final-report.md using the Write tool. Also present it to the user in the conversation.
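As a sketch, merging verdicts back into the consolidated findings might look like this (field names are assumptions consistent with the earlier sketches):

```python
def apply_verdicts(findings: list[dict], verdicts: dict[str, dict]) -> None:
    """Attach Phase 3 verdicts; findings from failed verifiers default to Inconclusive."""
    for f in findings:
        v = verdicts.get(f["id"], {"status": "Inconclusive",
                                   "evidence": "Verification agent did not complete."})
        f["verification"] = v["status"]  # Verified | False Positive | Inconclusive
        f["evidence"] = v["evidence"]    # 1-2 sentence summary from the verifier
```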
Produce the final markdown report following the template structure defined in references/report-template.md. Fill in all placeholders: date, scope description, file count, model lists, severity counts, consensus counts, and top concerns. Number all findings sequentially (F-1, F-2, F-3, ...) across the entire report.
Include the Codebase Map Summary section -- condense the Phase 1 map to 10-20 lines highlighting architecture points, critical areas, and data flows relevant to the findings. Include the Per-Model Raw Notes appendix capturing each agent's unique observations that did not become formal findings.
Present the complete report directly to the user in the conversation. For large reports (more than 30 findings or more than 200 lines), additionally offer to save the report to a file in the project directory (e.g., code-review-report-YYYY-MM-DD.md).
This skill relies on five supporting reference files in the references/ subdirectory:
- references/recon-prompt.md -- Phase 1 prompt template for recon agents. Contains the full prompt structure for codebase reconnaissance including file inventory, data flow, dependency graph, critical areas, and complexity hotspots. Uses placeholders {SCOPE_DESCRIPTION}, {CODE_CONTENT}, {PERSPECTIVE}.
- references/review-prompt.md -- Phase 2 prompt template for review agents. Contains the deep analysis prompt covering all six review dimensions with severity definitions and output format. Uses placeholders {SCOPE_DESCRIPTION}, {CODEBASE_MAP}, {CODE_CONTENT}, {PERSPECTIVE}.
- references/review-dimensions.md -- Detailed criteria for all six review dimensions: bugs and correctness, security, performance, architecture, code quality, and error handling. Each dimension includes a what-to-look-for checklist and severity classification guidance.
- references/report-template.md -- Final output format specification including the consensus scoring system, confidence tier definitions, finding matching rules, full markdown report template with placeholders, field definitions, section rules, and the complete seven-step merging procedure.
- references/verification-prompt.md -- Phase 3 prompt template for verification agents. Contains the verification procedure (locate code, research issue, write test script, execute, assign verdict) and output format. Uses placeholders {DIMENSION}, {FINDINGS}, {CODE_CONTENT}, {CODEBASE_MAP}, {REPORT_PATH}.