Reviews code changes for correctness, readability, architecture, security, and performance using project standards. Checks lint compliance, type safety, test coverage, security issues for PRs and audits.
Install: `npx claudepluginhub williamzujkowski/nexus-agents`
This skill is the post-hoc review gate — evaluating someone else's diff after it lands as a PR. The complementary pre-emit gate is self-critique, which the author runs on their own work before submitting:
| Skill | Reviews | When |
|---|---|---|
self-critique | Your own output | Before you emit anything |
reviewing-code (this) | Others' code (PRs, commits) | After someone else writes code |
Both use the same five-axis framework below. Both can apply to the same artifact at different lifecycle points — author self-critiques before PR; reviewer applies this skill on the resulting diff.
Evaluate every change across these dimensions. The order matters — correctness gates merge; the rest are quality.
- **Architecture** — Follows an existing canonical pattern (CLAUDE.md "Canonical Paths") or introduces a new one? Module boundaries respected? No circular deps? Abstraction level appropriate?
- **Security** — Untrusted input handled (.rules/untrusted-input.md)? Secrets out of code/logs? Auth/authz where needed? Parameterized queries? See security-checklist.
- **Performance** — See the performance-optimization skill.
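For the "secrets out of code/logs" question, one concrete pattern to look for in a diff is configuration read from the environment rather than hard-coded literals. A minimal sketch — the helper name is invented for illustration, not part of this project:

```typescript
// Hypothetical helper: secrets come from the environment, never from source.
function requireSecret(name: string): string {
  const value = process.env[name];
  if (!value) {
    // Fail fast; log only the missing variable's name, never its value.
    throw new Error(`missing required secret: ${name}`);
  }
  return value;
}
```

A diff that replaces a string literal with a call like this usually passes the secrets check; one that adds the literal, or logs the returned value, should not.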
| Criterion | Limit |
|---|---|
| File | ≤ 400 lines |
| Function | ≤ 50 lines |
| Nesting | ≤ 4 levels |
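These limits can be enforced mechanically rather than checked by eye. A sketch assuming ESLint with flat config is in use (the rule names are real ESLint built-ins; the numbers mirror the table above):

```typescript
// eslint.config.js — hypothetical fragment mirroring the size limits table
export default [
  {
    rules: {
      // File ≤ 400 lines (blank lines excluded)
      "max-lines": ["error", { max: 400, skipBlankLines: true }],
      // Function ≤ 50 lines
      "max-lines-per-function": ["error", { max: 50 }],
      // Nesting ≤ 4 levels
      "max-depth": ["error", 4],
    },
  },
];
```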
- No `any` types
- `Result<T, E>` for fallible ops

See SECURITY.md for the full threat model.
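A minimal sketch of the `Result<T, E>` convention — the project's actual Result type isn't shown here, so this shape and the example function are assumptions:

```typescript
// Hypothetical minimal Result type; the project's real definition may differ.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Example fallible op: parsing a TCP port without throwing.
function parsePort(input: string): Result<number, string> {
  const n = Number(input);
  if (!Number.isInteger(n) || n < 1 || n > 65535) {
    return { ok: false, error: `invalid port: ${input}` };
  }
  return { ok: true, value: n };
}
```

Callers must branch on `ok` before touching `value`, so the failure path is type-checked instead of relying on try/catch discipline.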
```bash
# 1. Run quality gates
pnpm lint && pnpm typecheck && pnpm test

# 2. Check coverage
pnpm test:coverage
```
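The coverage step can fail the gate automatically if thresholds are configured in the test runner. A sketch assuming Vitest — the project's actual runner and threshold values are not shown here, so both are assumptions:

```typescript
// vitest.config.ts — hypothetical: make `pnpm test:coverage` exit nonzero
// when coverage drops below the configured thresholds.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      thresholds: { lines: 80, branches: 80, functions: 80, statements: 80 },
    },
  },
});
```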
A 2026-04-25 audit (#2225) found a 100% false-positive rate in second-pass code-review findings. Each false positive cost ~5min of triage. The fix is a stricter pre-file gate.
Before flagging any finding, run this checklist:
- Is the input already bounded? e.g. a `.slice(0, 20)` cap on the line above.
- Does an upstream guard (`isValidCommand`, Zod schema) already filter the input?
- Is there actually an `await` between read and write? JS is single-threaded; sync code is atomic at the microtask level.
- Is the failure observable? `NaN` comparisons fail closed silently (no observable bad state).
- `as Record<string, unknown>` is safe IF every access has `typeof` guards — read the whole function before flagging the cast.

If any check raises "wait, actually..." → drop the finding. Don't file it, don't include it in the report. False positives compound: they pollute the backlog, train future agents on noise, and erode trust in tooling.
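As a concrete illustration of the cast check, here is a hypothetical function (names invented for this sketch) where flagging the cast line in isolation would be a false positive:

```typescript
// Hypothetical sketch: the cast looks unsafe in isolation, but every
// access to `cfg` is typeof-guarded, so the cast is not a real finding.
function readTimeout(raw: object): number {
  const cfg = raw as Record<string, unknown>;
  const t = cfg["timeoutMs"];
  // Guard narrows unknown -> number before the value is ever used.
  if (typeof t === "number" && Number.isFinite(t)) return t;
  return 5000; // safe fallback when the field is missing or malformed
}
```

Reading only the cast line suggests a type-safety hole; reading the whole function shows the guard and the fallback. That is what the gate demands before filing.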
| Excuse | Counter |
|---|---|
| "It's a small change, no need for thorough review" | Small changes hide subtle bugs: off-by-one, missed null check, type narrowing slipped. Apply the 4-point gate to every finding regardless of diff size. |
| "Tests pass, so it's correct" | Tests verify what was tested. Edge cases the tests don't cover are still bugs. Use the five-axis framework — correctness ≠ "tests green." |
| "I trust the author" | Trust is for the human relationship; the review is for the code. Apply the same gate to senior contributors as to first-timers. |
| "The CI gates would catch any real issue" | CI catches a known set of failure modes. Reviews catch the ones CI doesn't model — architecture drift, unclear naming, subtle security gaps, unintended public-API changes. |
| "I'd refactor this differently, but it works" | If the existing pattern is canonical (per CLAUDE.md), don't fight it. If your way is genuinely better, file a follow-up — don't gate the merge. |
| "I'll flag it; the author can decide" | Every flagged finding costs review-cycle time. Apply the 4-point gate first; only flag what passed all four. |
Every finding gets one of three tags. The bar matters — over-flagging dilutes the signal.
If you tag everything Critical, nothing is Critical. Three or more Critical findings on a non-emergency PR usually means the PR is too big — split it.
## Code Review: [File/PR]
### Critical (Must Fix)
- [ ] Issue at `file:line`
### Major (Should Fix)
- [ ] Issue
### Recommendation
[ ] APPROVE / [ ] REQUEST_CHANGES
See CODING_STANDARDS.md for common issues and patterns.
Cite every finding with `file:line` references.