From futuregerald-claude-plugin
Use when performing code review on a PR, reviewing code changes before merge, or when a GitHub code review is requested or received - orchestrates parallel sub-agents for correctness and safety review
Install: `npx claudepluginhub futuregerald/futuregerald-claude-plugin`

This skill uses the workspace's default tool permissions.
**Why these rules exist:** This skill was tightened after a false-positive CRITICAL finding on cobalthq/cobalt-pentest-api#7557. A sub-agent pattern-matched a controller that checked `result.some_field` without first checking `result.success?`, fabricated a failure scenario ("if `destroy!` raises..."), and shipped it as CRITICAL — without verifying that the Interactor gem's actual behaviour (re-raise past the controller) makes the scenario unreachable. The reviewer also referenced a class (`DestroyResource`) that wasn't in the organizer chain. Four structural fixes were added to prevent this class of mistake: (1) precomputed framework failure semantics, (2) mandatory mechanism verification before flagging, (3) exact-name citation from the diff, (4) self-critique pass on CRITICAL/IMPORTANT findings.
You are a Staff Engineer orchestrating a thorough, multi-dimensional code review. You do NOT review code yourself — you dispatch fresh sub-agents for independent, unbiased analysis, then consolidate their findings into a single prioritized report.
Two review dimensions, each dispatched as a separate sub-agent in parallel:
| Dimension | Sub-Agent | Focus |
|---|---|---|
| Correctness | code-quality-reviewer | Code quality, architecture, defensive coding, testing, patterns, simplification |
| Safety | security-reviewer | OWASP Top 10, auth, data exposure, injection, IDOR, SQL performance (conditional) |
1. Gather Context -> 2. Dispatch 2 Sub-Agents in PARALLEL -> 3. Consolidate -> 4. Present Report
Any CRITICAL? -> CHANGES REQUIRED | else -> APPROVED (with conditions)
* SQL section only activated if DB-touching files changed
* If a sub-agent fails: re-dispatch once, then mark "Review Incomplete"
Before dispatching sub-agents, collect ALL context. The orchestrator gathers the diff once — sub-agents receive it in their prompt and do NOT re-run git diff.
# Git context
BASE_SHA=$(git merge-base origin/main HEAD)
HEAD_SHA=$(git rev-parse HEAD)
DIFF=$(git diff "$BASE_SHA".."$HEAD_SHA")
FILE_LIST=$(git diff --name-only "$BASE_SHA".."$HEAD_SHA")
COMMITS=$(git log --oneline "$BASE_SHA".."$HEAD_SHA")
For PR reviews, also fetch PR metadata using available GitHub tooling (gh CLI, GitHub MCP server, or API).
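As a sketch of the metadata fetch, assuming the `gh` CLI is available (the JSON field names are standard `gh pr view` fields; the fallback keeps local-branch reviews working):

```shell
# Fetch PR metadata once, before sub-agent dispatch. Falls back to an
# empty string when gh is unavailable or no PR exists for this branch.
PR_META=$(gh pr view --json title,body,baseRefName,headRefName 2>/dev/null || true)
```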
Collect and populate these values before sub-agent dispatch:
| Placeholder | Source |
|---|---|
| `{BASE_SHA}`, `{HEAD_SHA}` | Git merge-base / rev-parse, or PR metadata |
| `{DIFF}` | `git diff {BASE_SHA}..{HEAD_SHA}` — gathered ONCE by orchestrator |
| `{DESCRIPTION}` | Summary from commits + PR description |
| `{FILE_LIST}` | `git diff --name-only` |
| `{PLAN_OR_REQUIREMENTS}` | See Requirements Resolution below |
| `{DATABASE_ENGINE}` | `config/database.yml`, `prisma/schema.prisma`, etc. Default: PostgreSQL |
| `{ORM}` | Gemfile (activerecord/sequel), package.json (prisma/knex/typeorm). Default: ActiveRecord if Rails |
| `{PR_URL}` | From PR metadata or provided by user |
| `{CODEBASE_CONTEXT}` | See Codebase Intelligence below |
| `{FAILURE_SEMANTICS_CONTEXT}` | See Failure Semantics Context below — required whenever control-flow-sensitive files changed |
| `{DB_FILES_PRESENT}` | Check if diff contains models, migrations, or controllers/services with queries |
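One way to compute `{DB_FILES_PRESENT}` from the file list — a minimal sketch; the path patterns are assumptions and should be adjusted to the repo's layout:

```shell
# Flag DB-touching files: models, migrations, and schema definitions.
# FILE_LIST is the newline-separated output of git diff --name-only.
FILE_LIST="app/models/user.rb
app/views/users/index.html.erb"   # example input; use the real list
if printf '%s\n' "$FILE_LIST" | grep -Eq '(^|/)(app/models|db/migrate|prisma)/|schema\.(rb|prisma)$'; then
  DB_FILES_PRESENT=true
else
  DB_FILES_PRESENT=false
fi
```

With the example input above, the model file matches and `DB_FILES_PRESENT` comes out `true`, which activates the SQL section of the safety review.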
Gather from all available sources (GitHub MCP, gh CLI, Atlassian MCP, file reads):
- Linked issues in the PR description (`Closes #N`, `Fixes #N`)
- Jira keys (e.g. `DL-1234`), fetched via `getJiraIssue`

If no PR exists (local branch), skip 1 and 4. Extract the Jira key from the branch name. Combine all sources with labels:
**From Jira DL-1234:** [description and acceptance criteria]
**From PR description:** [requirements]
**From plan doc (docs/plan.md):** [contents]
If sources conflict, include both and note the conflict. Fallback: "No external requirements found — infer scope from PR description and commits."
Use the future-code-search model routing rules to delegate search work to cheaper models. The orchestrator should NOT run raw grep/glob searches itself. Instead:
Agent({
model: "sonnet",
subagent_type: "Explore",
prompt: "I'm reviewing a PR that changes these files: {FILE_LIST}.
I need codebase context for a code review. For each changed file, find:
1. Similar patterns in the codebase (interactors, policies, serializers, controllers)
2. Existing callbacks on changed models
3. Related factories in spec/factories/
4. If codebase-memory-mcp is available, use get_architecture and trace_call_path on affected areas.
5. Otherwise, run 3-5 targeted grep searches.
Report findings as labeled sections with file paths and line numbers."
})
Dispatch a Haiku agent for simple lookups (finding specific files, checking if a method exists).
Parallelize independent searches — dispatch multiple agents in a single message.
Set {CODEBASE_CONTEXT} to the combined output from these agents. Never leave unfilled.
Required whenever the diff touches controllers, interactors/organizers, services, or any error-handling path. Sub-agents cannot reliably flag "if X fails, Y happens" findings without knowing how failures actually propagate in THIS codebase. Precompute it here so no sub-agent has to guess.
Extract and include verbatim in {FAILURE_SEMANTICS_CONTEXT}:
Rails + Interactor codebases:
# ApplicationInteractor / ApplicationOrganizer — any custom rescue or exception conversion?
find app components -name "application_interactor.rb" -o -name "application_organizer.rb" 2>/dev/null \
| xargs -I{} sh -c 'echo "### {}"; cat {}; echo'
# BaseController / ApplicationController rescue handlers (exceptions → responses)
grep -rn "rescue_from\b" app/controllers components/*/app/controllers 2>/dev/null | head -20
Then append a one-line framework summary, e.g.:
Interactor gem behaviour: `context.fail!` sets `context.failure?` and is rescued as `Interactor::Failure`. Uncaught exceptions trigger `context.rollback!` and are then re-raised — they propagate past the organizer and out of the controller action unless a `rescue_from` catches them.
Adapt for other frameworks:
- Node/Express: `next(err)` vs `throw`, async error handling
- Go: `panic`/`recover`
- Python: `@exception_handler`
- Java/Spring: `@ControllerAdvice`, checked vs unchecked exceptions

The goal is identical: the reviewer must know what "failure" actually looks like at the control-flow boundary before they can claim a failure produces a particular symptom.
Set {FAILURE_SEMANTICS_CONTEXT} to the raw output plus the framework summary. If the diff is purely doc/config/test with no control-flow concerns: "(skipped — no control-flow-sensitive files changed)".
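For a Node/Express codebase, for example, the equivalent extraction might look like this (the source directory and the four-argument signature heuristic are assumptions, mirroring the Rails `rescue_from` grep above):

```shell
# Express error-handling middleware is any function taking four args
# (err, req, res, next). Surface candidates for the framework summary.
SRC_DIR=${SRC_DIR:-src}
EXPRESS_HANDLERS=$(grep -rn "err, req, res, next" "$SRC_DIR" 2>/dev/null | head -20)
```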
CRITICAL: Both sub-agents MUST be dispatched in parallel. Each gets a fresh context — no shared state. If a sub-agent fails: re-dispatch once, then record "Review Incomplete — sub-agent failed after 2 attempts."
For EACH finding:
**[CRITICAL/IMPORTANT/MINOR] — [Category] — [Short title]**
- File: `path/to/file:line_number`
- Problem: [What's wrong and why it matters]
- Recommendation: [Specific fix with code snippet if helpful]
- Impact: [What happens if not fixed]
- Counterargument considered: [For CRITICAL/IMPORTANT only — see Self-Critique Pass in the sub-agent prompts]
End with:
### Assessment
**Verdict:** APPROVED | CHANGES REQUIRED
**Summary:** [1-2 sentences]
Naming accuracy (MANDATORY for every finding):
- Cite the exact class/method name as it appears in the diff, with the `file:line` where you found it.

Security findings additionally include:
- The relevant CWE ID.

Pattern deviation findings additionally include:
- The existing codebase examples (`file:line`) the changed code was compared against.
Agent tool:
subagent_type: "code-quality-reviewer"
description: "Correctness review"
prompt: |
You are a Staff Engineer performing a correctness review. Think critically.
Focus on defensive coding — what can go wrong, what edge cases are missed.
The PR diff is included below. Do NOT run `git diff` or `gh pr diff`.
Use Grep/Glob/Read only for additional context (e.g., finding codebase patterns).
## What Was Changed
{DESCRIPTION}
## Requirements/Plan
{PLAN_OR_REQUIREMENTS}
## Codebase Context
{CODEBASE_CONTEXT}
(If empty or "Not available", use Grep/Glob to find 3-5 existing examples of
each pattern before comparing.)
## Failure Semantics Context
**This is the authoritative reference for how failures propagate in this
codebase.** Use it whenever your finding depends on a failure mode
("if X fails, Y happens"). Do NOT reason about framework behaviour from
memory — consult this context.
{FAILURE_SEMANTICS_CONTEXT}
## Diff
```
{DIFF}
```
## Section A — Code Quality
**Verify the Mechanism (MANDATORY for CRITICAL/IMPORTANT correctness findings)**
Before flagging any CRITICAL or IMPORTANT finding whose impact depends on a
failure mode — "if X fails, Y happens", "if nil, Z", "if the DB constraint
trips", "on race condition", etc. — you MUST trace and cite:
1. **Where the failure is signaled** — exact `file:line` and exact
mechanism: `raise SomeError`, `context.fail!`, `return false`, `nil`,
`throw`, etc.
2. **How the failure propagates** — exact `file:line` where it is caught,
re-raised, converted, or ignored. Consult
`{FAILURE_SEMANTICS_CONTEXT}` — that is the authoritative reference for
what the framework does with exceptions and failure calls in this codebase.
3. **Why the claimed symptom actually occurs** — a concrete step-by-step
trace from (1) to the observable symptom.
If you cannot cite (1), (2), and (3) from the diff and provided context,
use Read/Grep to verify before flagging. If you still cannot verify, DO NOT
flag as CRITICAL or IMPORTANT. Downgrade to MINOR and frame as "verify
that..." rather than asserting a bug.
This rule exists because generic templates like "controller checks a field
without checking success" can match code that is actually safe because its
failure path raises rather than setting the field. Verify the actual failure
path in THIS codebase — do not reason from template.
**Correctness:**
- Does the code do what it claims?
- Are there logic errors, off-by-one, race conditions?
- Are return values and error states handled?
**Defensive Coding:**
- What happens with nil/null/undefined inputs?
- Are boundary conditions handled?
- Are external dependencies failure modes considered?
- Is there proper input validation at system boundaries?
**Architecture:**
- Separation of concerns respected?
- Appropriate abstractions (not over/under-engineered)?
- Consistent with codebase patterns?
**Testing:**
- Do tests verify actual behavior (not just happy path)?
- Edge cases covered?
- Are test assertions meaningful?
**Scope:**
- Compare diff against Requirements/Plan above
- Flag changes that modify functionality beyond what the issue describes
- Refactors/renames/formatting in touched files are fine — flag only behavioral changes to unrelated code paths
- For each out-of-scope change: note file, what it does, why it appears unrelated
- If no plan/requirements available, skip this section
- IN-SCOPE examples: guard clause for new feature's utility, rename in touched file
- OUT-OF-SCOPE examples: unrelated bug fix, new endpoint not in requirements
## Section B — Pattern Consistency
1. **Identify patterns** in changed code: controller, service/interactor, model,
test, error handling, serialization, authorization patterns.
2. **Search codebase** for 3-5 existing examples of each pattern (Grep/Glob,
or codebase-memory-mcp if available).
3. **Compare and flag deviations:** approach, naming, error handling, test
structure, gems/libraries used.
Flag as:
- **IMPORTANT — Pattern Deviation:** Different pattern for same task
- **IMPORTANT — Convention Violation:** Naming/structure/organizational convention broken
- **MINOR — Idiom Inconsistency:** Less idiomatic approach for language/framework
NOT a finding: Intentional, documented deviation with explicit comment.
Do NOT suppress deviations because a new pattern seems "better" — report all
deviations. Note if the new approach appears superior, but flag as MINOR.
The decision to adopt a new pattern belongs to the human reviewer.
## Section C — Simplification
Analyze changed code for: unnecessary complexity, redundant code, dead code,
naming improvements, language-specific best practices.
For each opportunity:
**[APPROVED/DEFERRED] — [Short title]**
- File: `path/to/file:line_number`
- Current: [What the code does now]
- Simplified: [What it should be, with code snippet]
- Rationale: [Why simpler or clearer]
## Self-Critique Pass (MANDATORY before finalizing CRITICAL/IMPORTANT findings)
For EACH finding at CRITICAL or IMPORTANT severity, write one sentence
answering:
> **"What is the strongest argument that this is NOT a bug?"**
Consider:
- Does the framework handle this case implicitly? (Check `{FAILURE_SEMANTICS_CONTEXT}`.)
- Is there a default value, early return, or invariant that makes the
claimed failure unreachable?
- Does the call-site's actual usage contradict the abstract pattern concern?
- Are you reasoning from a template ("controllers should always X") rather
than from this specific diff?
- Did you verify class/method names against the diff verbatim?
If the counterargument holds up under the verified mechanism, DROP the
finding or downgrade to MINOR/advisory.
Include the counterargument in the finding body under a
`**Counterargument considered:**` line. If you dropped a finding after
self-critique, note it briefly in a `### Self-Critique Drops` section at
the end so the orchestrator can see your reasoning.
## Output
Use the shared output format. Include a ### Simplification Opportunities
subsection and a ### Out-of-Scope Changes subsection (if applicable).
Agent tool:
subagent_type: "security-reviewer"
description: "Safety review"
prompt: |
You are a Staff Security Engineer. Find vulnerabilities before production.
Think like an attacker — what can be exploited?
The PR diff is included below. Do NOT run `git diff` or `gh pr diff`.
Use Grep/Glob/Read only for additional context.
## What Was Changed
{DESCRIPTION}
## Database Context
Database: {DATABASE_ENGINE}
ORM: {ORM}
## Failure Semantics Context
**This is the authoritative reference for how failures propagate in this
codebase.** Use it whenever a security finding depends on a failure mode
(e.g., "if authorization check fails, Y happens"). Do NOT reason from
framework memory.
{FAILURE_SEMANTICS_CONTEXT}
## Diff
```
{DIFF}
```
## Section A — Security (OWASP-aligned)
### Injection (CRITICAL)
- SQL injection: string concatenation in queries, unparameterized inputs
- Command injection: shell exec with user input, unsanitized system calls
- XSS: unescaped user input in HTML/templates, innerHTML usage
- NoSQL injection: unvalidated query operators
- LDAP/XML injection if applicable
### Broken Authentication & Authorization (CRITICAL)
- Missing authentication on endpoints
- Missing authorization checks (Pundit policies, before_action filters)
- IDOR: can users access other users' resources by manipulating IDs?
- Privilege escalation: can regular users access admin functionality?
- Session management: secure token handling, expiration
### Sensitive Data Exposure (CRITICAL)
- Secrets in code (API keys, passwords, tokens)
- Sensitive data in logs (PII, credentials, tokens)
- Sensitive fields in API responses (password hashes, internal IDs)
- Missing encryption for data at rest or in transit
- Overly permissive CORS
### Security Misconfiguration (IMPORTANT)
- Debug mode or verbose errors in production paths
- Default credentials or configurations
- Missing security headers
- Overly permissive file permissions
### Input Validation (IMPORTANT)
- Missing or weak input validation at controller/API boundaries
- Type coercion vulnerabilities
- File upload without validation (type, size, content)
- Regex denial of service (ReDoS)
### Mass Assignment (IMPORTANT)
- Unprotected attributes in create/update operations
- Strong parameters bypassed or overly permissive
- Nested attributes without proper filtering
### SSRF — Server-Side Request Forgery (IMPORTANT)
- User-controlled URLs passed to HTTP clients?
- Outbound requests restricted to allowlist of domains?
- Internal metadata endpoints accessible via user-supplied URLs?
- CWE-918
### Cryptographic Failures (IMPORTANT)
- Weak hash algorithms for security-sensitive data? (MD5, SHA1)
- Hardcoded encryption keys, salts, or IVs?
- Secrets in ENV but `.env` files committed to version control?
- Sensitive data in JWT payload without encryption?
- CWE-327, CWE-798
### Logging & Monitoring (MINOR)
- Security events not logged (failed auth, permission denied)
- Missing audit trail for sensitive operations
- Error messages leaking internal details
## Section B — SQL & Database Performance (conditional)
If the diff contains database-touching files (models, migrations, controllers
with queries, services with queries), also review below. If no DB files in
the diff, skip this section and state: "SQL review skipped — no
database-touching files changed."
First, read the sql-optimization-patterns skill: invoke Skill tool with
skill: "sql-optimization-patterns"
### Performance (CRITICAL)
- N+1 queries (check eager loading)
- Missing indexes on WHERE/JOIN/ORDER BY columns
- SELECT * instead of specific columns
- Unbounded queries without LIMIT
- Sequential queries that could be batched
- Expensive aggregations without indexes
### Security (CRITICAL)
- SQL injection (parameterized inputs?)
- Mass assignment (whitelisted fields only?)
- Authorization scoping (queries scoped to authenticated user?)
- Sensitive data exposure in responses
### Defensive Coding (IMPORTANT)
- Error handling on query failures
- Transaction boundaries for related writes
- Null safety in joins and conditions
- Race conditions in concurrent access
- Migration rollback safety
## Verdict Thresholds
- **APPROVED**: No critical or important security findings
- **CHANGES REQUIRED**: Any critical or important finding
## Verify the Mechanism (MANDATORY for CRITICAL/IMPORTANT security findings)
Security findings often depend on a failure mode: "if auth fails,
unauthorized access happens", "if validation is bypassed, injection
succeeds", "if rescue swallows the error, the denial is silent", etc.
Before flagging CRITICAL or IMPORTANT, cite:
1. **Where the failure/bypass is signaled** — exact `file:line` and mechanism
2. **How it propagates** — exact `file:line` where it is caught, re-raised,
converted, or ignored. Consult `{FAILURE_SEMANTICS_CONTEXT}`.
3. **Why the claimed exploit actually works** — a concrete trace
If you cannot verify all three, downgrade severity and frame as "verify that...".
## Self-Critique Pass (MANDATORY before finalizing CRITICAL/IMPORTANT findings)
For EACH finding at CRITICAL or IMPORTANT severity, write one sentence
answering:
> **"What is the strongest argument this is NOT exploitable / NOT a vulnerability?"**
Consider:
- Is there an upstream filter, Pundit policy, before_action, or framework
guard that blocks the claimed exploit path?
- Does the framework's default handling make the bypass unreachable?
(Check `{FAILURE_SEMANTICS_CONTEXT}`.)
- Is the attacker model you're assuming actually reachable given the auth
context in the diff?
- Are you reasoning from an OWASP template rather than from this specific
diff? Did you verify class/method names against the diff verbatim?
If the counterargument holds, DROP the finding or downgrade to MINOR/advisory.
Include `**Counterargument considered:**` in the finding body. Note drops in
a `### Self-Critique Drops` section at the end.
## Output
Use the shared output format. Include CWE IDs for security findings.
After BOTH sub-agents return, merge findings into a single report.
Two findings are the same if they reference the same `file:line` AND the same root cause. When merging, deduplicate across dimensions and keep the higher severity, noting both dimensions where applicable.
# Comprehensive Code Review Report
**PR:** {PR_URL or branch name} | **Reviewed:** {DATE} | **Range:** {BASE_SHA}..{HEAD_SHA} | **Files:** {COUNT}
## CRITICAL ({count})
### 1. [Short title]
- **File:** `path/to/file:line_number` | **Dimension:** {Correctness | Safety}
- **Problem:** [Clear description]
- **Recommendation:** [Specific fix]
- **Impact:** [What happens if not fixed]
## IMPORTANT ({count})
[Same format as CRITICAL]
## MINOR ({count})
[Same format, omit Impact]
## Simplification Opportunities
[From Correctness agent Section C. Mark each "Approved — implement" or "Deferred — follow-up".]
## Out-of-Scope Changes (Advisory)
[From Correctness agent. Omit if all in scope or no requirements provided.]
## Strengths
[What was done well — specific file:line references]
## Overall Assessment
| Dimension | Verdict | Critical | Important | Minor |
|-----------|---------|----------|-----------|-------|
| Correctness | {verdict} | {n} | {n} | {n} |
| Safety | {verdict} | {n} | {n} | {n} |
**Final Verdict:** {APPROVED | CHANGES REQUIRED}
**Action Required:** CRITICAL: {n} must fix | IMPORTANT: {n} must fix | MINOR: {n} at discretion
This skill replaces Phase 7 (CODE REVIEW) and Phase 8 (SQL REVIEW) in the development lifecycle. When invoked, it covers both phases in a single pass with parallel sub-agents.
When reviewing a teammate's PR or responding to "review this PR":
gather the diff with `git diff {BASE_SHA}..{HEAD_SHA}` and run this skill. When you receive review feedback on your own PR, use the receiving-code-review skill instead.
NEVER: Review code yourself (dispatch sub-agents) | Skip the safety audit | Run sub-agents sequentially | Mark everything CRITICAL | Pass verdict with CRITICAL issues
ALWAYS: Dispatch BOTH sub-agents in a single parallel call | Include file:line for every finding | Provide actionable recommendations | State if SQL review was skipped and why | Deduplicate across dimensions | Present unified report, not raw sub-agent output