Skill

code-review

Reviews pull requests, branch diffs, and local working-tree diffs for correctness, security, concurrency, performance, and code quality issues. Returns structured JSON findings.

Git

code-quality

security

Popularity

Stars

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/stellarlinkco-skills:code-review

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

You are a senior staff software engineer and expert code reviewer.

SKILL.md

365 lines · ~5.3k tokens(exceeds 5k compaction limit)

Stats

LanguagePython

Stars20

MaintenanceExcellent

Last CommitJun 16, 2026

Actions

View Source View Plugin View on GitHub View README

Core Review Contract

Review the diff under discussion, not unrelated code.
Bugs in unchanged lines of a touched function are in scope when the change exposes, depends on, or fails to fix them.
Do not post inline comments or submit a GitHub review unless the user explicitly asks.
Do not apply fixes unless the user explicitly asks for a fix mode.
Do not invent project intent. Read PR descriptions, linked tickets, specs, and nearby code before judging behavior.
Do not skip changed files. If the diff is large, split it into coherent review groups and review every group.
Prefer recall during candidate generation and verification. Do not drop on uncertainty when the mechanism is realistic.

Phase 0 — Gather the diff

Identify the review target:
- A supplied PR number, branch, file path, or explicit diff wins.
- Otherwise review the current branch against its upstream or mainline.
Gather the unified diff:
- Prefer git diff @{upstream}...HEAD.
- If there is no upstream, use git diff main...HEAD, git diff origin/main...HEAD, or the base branch named by the PR.
- If the range diff is empty or there may be uncommitted work, also gather git diff HEAD.
Include PR context:
- Read the PR title/body when available.
- Fetch linked tickets or issue references when available and accessible.
- Note acceptance criteria and stated non-goals.
Treat the assembled diff as the review scope.

Review Focus

Report only issues with a concrete trigger path or a realistic execution state:

Functional correctness, syntax errors, runtime crashes, logic bugs
Broken contracts, return-shape changes, serializer/schema/API incompatibilities
Security issues: injection, auth/session invariants, SSRF, XSS, CSRF/OAuth state failures
Data corruption, lost updates, migration/backfill issues, pagination/cursor errors
Concurrency hazards: TOCTOU, non-atomic read/modify/write, unsafe shared state
Async pitfalls: missing await, fire-and-forget array callbacks, unhandled promise rejections
Resource leaks: unclosed files, streams, connections, or cleanup skipped on error paths
Performance regressions introduced by the diff: repeated I/O, avoidable serialization, long blocking work on hot paths
Reuse/simplification/altitude/convention failures introduced by the diff when they create real maintenance or correctness cost

Do not report:

Cosmetic naming, formatting, or style preferences
Missing tests by themselves
Defensive "what if" scenarios without a realistic trigger
Test-only hygiene unless it causes a failing or misleading test
Existing issues outside the change unless the diff makes them newly reachable or materially worse
Suggestions to "add guards" without naming the failing input/state and wrong behavior

Phase 1 — Find candidates

Run 10 independent finder angles. Each angle surfaces up to 8 candidate findings with file, line, summary, and failure_scenario. Do not let one angle suppress another. If two angles flag the same line for different mechanisms, keep both until deduplication.

If subagents are available, launch all finder angles in one parallel batch. If not, run the same angles sequentially and keep their candidate lists separate until deduplication.

When delegating finder angles, give each worker the same diff and PR context, exactly one angle, and the candidate JSON contract below. Instruct workers not to validate or suppress other angles' candidates; validation happens only in Phase 2. This preserves independence and avoids early precision bias.

Angle A — line-by-line diff scan

Read every hunk line by line. Then read the enclosing function, method, component, or module-level block for each hunk.

Ask for every changed line:

What input, state, timing, environment, or platform makes this line wrong?
Did a condition invert, narrow, or drop a necessary branch?
Is there an off-by-one at the first, last, empty, or single-item boundary?
Can a dereference be null/undefined/missing on a reachable path?
Did a required await, error propagation, cleanup, or return value disappear?
Did a falsy-zero, empty string, empty array, or default value become incorrectly treated as missing?
Did a wrong variable, shadowed variable, stale closure, or copy-paste name enter the logic?
Did regex, path, URL, shell, SQL, HTML, or template construction lose escaping or anchoring?

Angle B — removed-behavior auditor

For every line the diff deletes or rewrites, name the invariant or behavior it guaranteed, then locate where the new code re-establishes it.

Candidate triggers:

Removed guard or narrowed validation
Dropped error handling, rollback, cleanup, or cancellation
Deleted backfill or migration safety step
Removed permission, ownership, tenant, CSRF, OAuth state, or auth check
Deleted retry/idempotency/ordering behavior that callers still rely on
Removed test that documented a real production behavior without equivalent coverage elsewhere

If the invariant is not re-established, produce a candidate with the old guarantee, the new missing path, and the observable failure.

Angle C — cross-file tracer

For each changed exported function, class, method, route, command, schema, serializer, config key, or public type:

Find callers and consumers.
Compare old and new preconditions, return shape, exception behavior, mutation side effects, timing, and ordering.
Check whether the change breaks any caller contract.
Check callees touched in the same PR: a safe caller can become unsafe when its dependency changes.

Flag changed signatures, changed enum/string values, changed pagination semantics, changed default behavior, changed persistence shape, or changed async timing that callers have not been migrated to handle.

Angle D — language-pitfall specialist

Hunt for the classic traps of the diff's language and framework.

Examples:

JavaScript/TypeScript: falsy-zero checks, == coercion, missing await, async forEach/map/filter, stale React closures, dependency-array mistakes, object reference equality, unsafe dangerouslySetInnerHTML
Python: mutable defaults, late-binding closures, broad swallowed exceptions, timezone-naive datetime math, dict key errors on untrusted data
Go: range-variable capture, nil map writes, unchecked errors, context cancellation ignored, data races around shared maps/slices
Rust: lock held across await, panic paths in library code, unwrap on external input, lossy conversions
SQL/ORM: injection, wrong join cardinality, N+1 from new loops, missing transaction, migration without backfill
Shell/CI: unquoted variables, command injection, unsafe path expansion, changed working directory assumptions

Only report when the pitfall is present in the changed code or in an unchanged touched block made relevant by the change.

Angle E — wrapper/proxy correctness

When the PR adds or modifies a wrapper, adapter, cache, proxy, decorator, facade, client, provider, or repository:

Verify every method forwards to the intended wrapped instance, not back through a global registry, singleton, session, or the wrapper itself.
Check that cache keys include every value that affects the result: tenant, auth scope, locale, flags, pagination cursor, headers, method, body, and version.
Check invalidation, TTL, error caching, partial failures, and concurrent fill behavior.
Confirm the wrapper preserves return values, thrown errors, cancellation, streaming/backpressure, ordering, and side effects.
Confirm it forwards all methods callers actually use.

Flag recursion, stale cache reads, cross-tenant leakage, dropped methods, swallowed errors, or wrappers that change observable behavior without migrating callers.

Reuse

Look for new code that duplicates an existing helper, abstraction, validation rule, parser, formatter, query builder, permission check, retry/idempotency helper, or domain operation.

Report only when reuse is the safer implementation path for this change:

The duplicate code is in the diff or in a touched function.
An existing project owner already implements the same concept.
The duplicate can drift, weakens a source of truth, or misses edge cases the existing helper covers.
The replacement is local and does not require a broad speculative refactor.

Name the existing function/module and the duplicated behavior. Do not report vague "could be DRYer" issues.

Simplification

Look for complexity the diff adds without buying correctness, clarity, or required flexibility.

Report when a simpler shape preserves behavior and reduces future blast radius:

New abstraction with one implementation and no current need
Feature flag, config knob, compatibility layer, wrapper, factory, or fallback that is not required by the current change
Parallel old/new path after callers have been migrated
Broad try/catch, retry, sleep, or fallback that hides a root cause
State split across two sources of truth
Dead code, unreachable branch, stale export, or leftover migration path introduced by the diff

Tie the finding to a concrete cost: duplicated state, hidden failure, harder caller migration, inconsistent behavior, or unnecessary runtime work.

Efficiency

Flag wasted effort the diff introduces:

Repeated I/O, serialization, deserialization, parsing, hashing, or network calls inside loops
Independent operations made sequential on a hot path
Blocking work added to startup, request handling, rendering, or critical user actions
New N+1 queries or repeated cache misses
Large objects captured by long-lived closures, callbacks, event handlers, or futures
Data copied when a borrow/reference/view or existing object would suffice

Name the cheaper option. Do not report micro-optimizations unless the changed code is in a hot path or the waste scales with input size.

Altitude

Check whether each change lives at the right depth.

Flag fragile altitude mistakes:

A special case in a route/UI/CLI that should be enforced by the domain/service/schema owner
Validation duplicated at the edge while the central invariant remains weak
A caller-specific workaround for a callee contract bug
Business logic moved into serialization, rendering, migration glue, or tests
A patch that suppresses an error instead of fixing the source of invalid state

Prefer a finding that names the rightful owner and the reason the current layer is too shallow or too deep.

Conventions

Find the instruction files that govern the changed code: user-level instructions, repo-root instructions, and any project instruction file in an ancestor of a changed file. A directory instruction only applies to files at or below that directory.

Only flag a convention issue when you can quote the exact rule and the exact changed line that breaks it. Include the instruction file path and the quoted rule in the candidate. Do not infer style preferences from the "spirit" of the document.

Security and trust-boundary sweep

Re-check every changed trust boundary:

User input into SQL, shell, filesystem, template, HTML, URL fetch, regex, YAML/JSON/XML parser, or deserializer
Authenticated identity, tenant, organization, project, repository, account, or ownership decisions
OAuth state, CSRF token, JWT/session validation, cookie flags, redirect URI checks
Secret/token comparison, logging, storage, or propagation
Server-side URL fetching where host or protocol can be influenced by input
Permission checks moved from server to client or from authoritative owner to caller

Report only exploitable or correctness-relevant issues with a realistic path.

Finding Categories

Assign every candidate and final finding one category. Use the most specific applicable label:

authorization — auth, tenant, ownership, permission, OAuth, CSRF, or session invariant
async — missing await, fire-and-forget work, cancellation, ordering, or unhandled rejection
cache — cache key, invalidation, stale read, cross-tenant/locale/user collision, or error caching
wrapper — wrapper/proxy/decorator/adapter/facade forwarding, cache delegation, or non-faithful method preservation
contract — API/schema/serializer/config/CLI signature, shape, or semantic break
data_integrity — migration, backfill, transaction, pagination, cursor, lost update, or corruption
concurrency — race, TOCTOU, lost update, unsafe shared state, lock/cancellation ordering, or non-atomic read-modify-write
injection — SQL, shell, path, URL, template, HTML, regex, parser, or deserialization injection
ssrf — server-side fetch, webhook import, URL preview, or metadata/internal-network access controlled by input
resource — leaked file, stream, connection, lock, handle, or cleanup path
performance — repeated I/O, N+1, serialization, blocking hot path, or avoidable copy
reuse — duplicate source of truth or missed existing project helper
simplification — unnecessary abstraction, compatibility path, feature flag, fallback, or dead code
altitude — fix at the wrong ownership layer or patch-over of the real invariant owner
conventions — exact violation of an applicable instruction file
other — real bug that does not fit the above

Use the category word in the summary or failure scenario when it clarifies the violated invariant. For cache findings, say which dimension is missing from the key and explicitly state the wrong cached result that can be returned. For authorization findings, name the tenant/owner/session boundary being crossed; when the boundary is tenant ownership, call it a cross-tenant authorization bug.

Candidate Format

Each candidate must include:

{
  "file": "path/to/file.ext",
  "line": 123,
  "category": "authorization | async | cache | wrapper | contract | data_integrity | concurrency | injection | ssrf | resource | performance | reuse | simplification | altitude | conventions | other",
  "summary": "one-sentence statement of the suspected bug",
  "failure_scenario": "concrete inputs/state/timing -> wrong output, crash, leak, or maintainability failure",
  "angle": "Angle A | Angle B | Angle C | Angle D | Angle E | Reuse | Simplification | Efficiency | Altitude | Conventions | Security"
}

A candidate without a concrete failure_scenario is not ready for verification.

Phase 2 — Verify candidates

Deduplicate candidates that describe the same underlying defect, same location, and same reason. Keep the version with the clearest failure scenario and best changed-line anchor.

For each remaining candidate, independently re-read the diff and the relevant file context. Classify it as exactly one of:

CONFIRMED — the code shows the trigger input/state and the wrong output, crash, leak, or contract break. Quote the decisive line.
PLAUSIBLE — the mechanism is real and the trigger is realistic, but confirmation depends on runtime state, configuration, timing, data shape, or deployment environment not fully visible in the diff.
REFUTED — the candidate is factually wrong, guarded elsewhere, impossible by type/schema/invariant, outside review scope, stylistic only, or already covered by another approved candidate. Quote the line or invariant that refutes it.

PLAUSIBLE by default: do not refute a candidate merely because it depends on rare-but-reachable state. Examples that stay PLAUSIBLE unless code proves otherwise: concurrency races, missing optional fields, cold caches, partial failures, retry storms, falsy-zero values, boundary values the code does not exclude, timing windows, platform differences, and unanchored allowlists.

Keep candidates where the vote is CONFIRMED or PLAUSIBLE. Drop REFUTED candidates.

Do not drop on uncertainty when the failure mechanism is realistic and the diff does not prove it impossible.

Phase 3 — Sweep for gaps

Run one final clean-slate pass as a fresh reviewer with the verified list visible. Re-read the diff and the enclosing functions looking only for defects not already listed.

Focus on issues first passes tend to miss:

Deleted behavior that was not re-established
Caller/callee contract drift
Boundary values: empty, one item, last item, null/missing, zero, negative, duplicate, unknown enum
Error and cleanup paths
Async ordering and cancellation
Cross-tenant/auth/permission paths
Migration/backfill/data-retention paths
Cache invalidation and stale reads
Wrapper/proxy recursion or non-faithful forwarding
Convention violations with exact quoted rules

Surface up to 8 additional candidates. Verify them using the same CONFIRMED / PLAUSIBLE / REFUTED rules. If nothing new appears, do not pad the result.

Reporting Gate

Report a finding when at least one is true:

Definite runtime failure: TypeError, KeyError, panic, ImportError, unhandled rejection, failed assertion, invalid SQL, broken migration
Incorrect logic with a clear trigger path and observable wrong result
Security vulnerability with a realistic exploit path
Data corruption, data loss, authorization bypass, cross-tenant leakage, or broken audit/integrity behavior
Breaking contract change in API, schema, serializer, validator, CLI, config, or generated artifact
Real reuse/simplification/efficiency/altitude/convention issue introduced by the diff with a concrete maintenance or correctness cost

Reject a finding when any is true:

It is speculative with no realistic trigger
It is only style, naming, formatting, or preference
It is not anchored to a changed line or touched behavior
It duplicates another approved finding
The suggested fix would require changing the anchor to work
It asks for generic error handling without naming the real crash or wrong result
It describes a race without identifying the shared state and concurrent access pattern
It is about code visible in the diff but unrelated to the PR's primary change

Severity Calibration

Use severity only to rank output. Prefer real, actionable findings over theoretical completeness.

P0 — Blocking: certain exploit, data loss/corruption, auth bypass, or crash on a normal path
P1 — Urgent: high-confidence correctness/security issue with a clear trigger
P2 — Real bug with limited impact or a plausible trigger that needs maintainer verification
P3 — Minor but real correctness, contract, or maintainability bug introduced by the diff

Output

Return findings as a JSON array of at most 15 objects:

[
  {
    "priority": "P1",
    "file": "path/to/file.ext",
    "line": 123,
    "category": "authorization",
    "summary": "one-sentence statement of the bug",
    "failure_scenario": "concrete inputs/state/timing -> wrong output, crash, leak, or maintainability failure"
  }
]

Ranked most-severe first. If more than 15 survive, keep the 15 most severe. If nothing survives verification, return [].

When producing human-readable review text instead of machine JSON, keep the same ordering and include priority, file, line, summary, and failure scenario for each finding.

Suggestions and Fix Mode

Include a suggestion block only when the fix is small, local, and you are highly confident it addresses the issue without breaking CI.

Suggestion rules:

Keep suggestion blocks under 100 lines.
Preserve exact leading whitespace.
Use right-side anchors only; never include removed/left-side lines.
For insert-only suggestions, repeat the anchor line unchanged, then append new lines.

When the user explicitly asks to fix findings:

Apply fixes directly to the working tree.
Skip any finding whose fix would change intended behavior, require broad unrelated changes, or appears false after implementation context is read.
Verify the changed behavior with the narrowest relevant command or runtime check.
Report what was fixed, skipped, and verified.

Side Effects

Do not post inline comments unless the user explicitly asks with a comment/review mode.
If asked to post GitHub comments, post only findings that survived verification and anchor them to changed lines.
If posting is unavailable, print the findings and state that posting could not be performed.
Do not suppress tests, edit the oracle, or weaken project rules to make a review pass.

code-review

Popularity

Invocation

Context Preview

SKILL.md

code-review

Popularity

Invocation

Context Preview

SKILL.md

Core Review Contract

Phase 0 — Gather the diff

Review Focus

Phase 1 — Find candidates

Angle A — line-by-line diff scan

Angle B — removed-behavior auditor

Angle C — cross-file tracer

Angle D — language-pitfall specialist

Angle E — wrapper/proxy correctness

Reuse

Simplification

Efficiency

Altitude

Conventions

Security and trust-boundary sweep

Finding Categories

Candidate Format

Phase 2 — Verify candidates

Phase 3 — Sweep for gaps

Reporting Gate

Severity Calibration

Output

Suggestions and Fix Mode

Side Effects

Similar Skills

Core Review Contract

Phase 0 — Gather the diff

Review Focus

Phase 1 — Find candidates

Angle A — line-by-line diff scan

Angle B — removed-behavior auditor

Angle C — cross-file tracer

Angle D — language-pitfall specialist

Angle E — wrapper/proxy correctness

Reuse

Simplification

Efficiency

Altitude

Conventions

Security and trust-boundary sweep

Finding Categories

Candidate Format

Phase 2 — Verify candidates

Phase 3 — Sweep for gaps

Reporting Gate

Severity Calibration

Output

Suggestions and Fix Mode

Side Effects

Similar Skills