From mtk
Reviews code for correctness, security, architecture, and test quality against project standards. Use after implementation and before merge, or when reviewing a PR.
npx claudepluginhub moberghr/mtk-agent-toolkit --plugin mtk
Slash command
/mtk:code-review-and-quality
```!
echo "--- Branch ---"
git branch --show-current 2>/dev/null || echo "(detached)"
echo "--- Tech Stack ---"
cat .claude/tech-stack 2>/dev/null || echo "(not set)"
echo "--- Diff stat ---"
git diff --stat HEAD 2>/dev/null || git diff --stat --cached 2>/dev/null || echo "(no diff)"
```
Review changed code as an adversary, not a collaborator. The review must prioritize real risks over style and decide whether the change improves overall code health.

Read before reviewing:

- `CLAUDE.md`
- `.claude/tech-stack` to identify the active stack, then `.claude/skills/tech-stack-{stack}/SKILL.md` for stack-specific reference paths
- the `## Reference Files` section
- `.claude/references/security-checklist.md`
- `.claude/references/testing-patterns.md`
- `.claude/references/performance-checklist.md`
- a domain reference, if one exists (e.g. `.claude/references/domain-finance.md`): load it for domain-specific rationalizations

If reviewing a PR or branch with CI runs, check CI status:

- Run `bash hooks/ci-status.sh` to get check run results
- If CI failed, note which checks failed; the review should focus on those areas
- If CI passed, note any warnings from the build output (`.mtk/analyzer-output.json`)
- If `hooks/ci-status.sh` is not available or `gh` is not installed, proceed without CI context
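
The fallback in the CI steps above can be sketched as a small wrapper. This is a minimal sketch, assuming `hooks/ci-status.sh` prints its results to stdout; the function name `ci_context` and the placeholder text `(no CI context)` are hypothetical and not part of the toolkit.

```shell
# Hypothetical helper: print CI results when available, otherwise a placeholder.
# Assumes hooks/ci-status.sh writes check-run results to stdout.
ci_context() {
  ci_script="${1:-hooks/ci-status.sh}"
  if [ -f "$ci_script" ] && command -v gh >/dev/null 2>&1; then
    # A failing hook should not abort the review; fall back to the placeholder.
    bash "$ci_script" 2>/dev/null || echo "(no CI context)"
  else
    # Hook missing or gh not installed: proceed without CI context.
    echo "(no CI context)"
  fi
}
```

Calling `ci_context` with no argument uses the default hook path; the review continues either way.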
Review across these axes:

- stub detection per `.claude/references/stub-detection.md` (empty bodies, `NotImplementedException`, suspect `return null`/`[]`/`{}`, mock data in production paths, unwired handlers)

Route specialized review when needed:

- `compliance-reviewer` for security/compliance-sensitive work
- `test-reviewer` for coverage and verification quality
- `architecture-reviewer` for boundary and slice integrity concerns
- `silent-failure-hunter` when the diff touches error handling. Dispatch it when `git diff` matches any of `\b(catch|except|finally)\b`, `\.catch\(`, `\?\?`, `\|\|`, or adds `// eslint-disable`, `# noqa`, `@ts-ignore`, `@ts-expect-error`, `Skip =`, `it\.skip`, `xit\(`. Run it in parallel with `compliance-reviewer`; merge findings and dedupe by `(file, line, rule)`. The hunter emits `category: "error-handling"`, so dedupe is straightforward.

Categorize findings per the schema in `.claude/references/review-finding-schema.md`:

- `critical`, `warning`, `suggestion` severities
- confidence score 0–100 per the rubric
- `internet_facing` for boundary exposure and `needs_human_review` for axes the AI cannot honestly clear from the diff alone

Score the five dimensions (1–10). Assign one score per dimension and cite at least one `file:line` evidence quote per score (high or low):

- `correctness`: does the code do what the spec said? Edge cases? Invariants?
- `security`: auth, secrets, input validation, audit, supply chain
- `test_coverage`: public behaviors tested? Error paths exercised? Assertions meaningful?
- `architecture_fit`: slices, boundaries, patterns honored?
- `simplicity`: fewer files / abstractions / moving parts feasible?

Score rubric: 9–10 exemplary · 7–8 acceptable · 4–6 blocks merge · 1–3 severe.
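
The `silent-failure-hunter` dispatch rule and the `(file, line, rule)` dedupe described earlier can be sketched in shell. This is an illustration under stated assumptions, not the toolkit's implementation: the function names are hypothetical, `\b` is spelled out with a POSIX character class for portability, the "adds" patterns are checked only on `+` lines, and findings are assumed to arrive as tab-separated `file`, `line`, `rule` columns.

```shell
# Hypothetical: exit 0 when a unified diff on stdin should trigger the hunter.
touches_error_handling() {
  diff_text=$(cat)
  # Patterns that may match anywhere in the diff.
  anywhere='(^|[^[:alnum:]_])(catch|except|finally)([^[:alnum:]_]|$)|\.catch\(|\?\?|\|\|'
  # Suppression and skip markers that count only when added.
  added='// eslint-disable|# noqa|@ts-ignore|@ts-expect-error|Skip =|it\.skip|xit\('
  printf '%s\n' "$diff_text" | grep -Eq "$anywhere" && return 0
  # Keep added lines, drop "+++ b/file" headers, then look for the markers.
  printf '%s\n' "$diff_text" | grep -E '^\+' | grep -Ev '^\+\+\+ ' | grep -Eq "$added"
}

# Hypothetical: keep the first finding for each (file, line, rule) key.
# Assumes one finding per line, tab-separated, key in the first three columns.
dedupe_findings() {
  awk -F '\t' '!seen[$1 FS $2 FS $3]++'
}
```

A typical call would be `git diff HEAD | touches_error_handling && echo "dispatch silent-failure-hunter"`.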
Auto-fail rules:

- Any dimension below 7 → `NEEDS_CHANGES` regardless of finding count.
- A score without a `file:line` evidence quote → treated as 0 (auto-fail).

Iteration cap. If a dimension has scored < 7 in two prior iterations and a third iteration would also score it < 7, stop and escalate to a human: automated remediation has stopped converging. Report the dimension, iteration count, and remaining findings.
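
The scoring gate and the iteration cap reduce to two small checks. The sketch below assumes scores are passed as integers and that a score lacking evidence has already been replaced by 0; the function names `needs_changes` and `should_escalate` are hypothetical.

```shell
# Hypothetical: print NEEDS_CHANGES and exit 0 if any dimension score is below 7.
# Pass 0 for a score that has no file:line evidence quote.
needs_changes() {
  for score in "$@"; do
    if [ "$score" -lt 7 ]; then
      echo "NEEDS_CHANGES"
      return 0
    fi
  done
  return 1
}

# Hypothetical: exit 0 when one dimension scored below 7 in two prior
# iterations and again in the current one (arguments: prior1 prior2 current).
should_escalate() {
  [ "$1" -lt 7 ] && [ "$2" -lt 7 ] && [ "$3" -lt 7 ]
}
```

For example, `needs_changes 9 8 6 7 8` prints `NEEDS_CHANGES` because one dimension scored 6, and `should_escalate 5 6 6` succeeds, which signals a hand-off to a human.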
Emit output in the canonical format:

- (… `.claude/review-config.json`, default 80)

If `findings[]` has fewer than 2 entries, populate `below_threshold_rationale`, explicitly stating which axes were checked and why the code is genuinely clean. Silent empty reviews are invalid.
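
The "no silent empty reviews" rule can be enforced with a simple guard before the output is emitted. This is a sketch only: it assumes the caller has already extracted the finding count and the rationale text from the review output, and the function name `rationale_ok` is hypothetical.

```shell
# Hypothetical: exit non-zero for a silent empty review, i.e. fewer than
# 2 findings together with an empty below_threshold_rationale.
rationale_ok() {
  findings_count="$1"
  rationale="$2"
  if [ "$findings_count" -lt 2 ] && [ -z "$rationale" ]; then
    return 1
  fi
  return 0
}
```

A review with three findings passes without a rationale; a review with one finding passes only when the rationale is non-empty.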
If a workflow artifact is active (`MTK_WF_UUID` set), record scores:

- `scripts/workflow-artifact.sh set "$MTK_WF_UUID" results.review_scores.<dimension>=<n>` for each of the five dimensions, and `results.review_iteration=<n>` for the current cycle.
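
Recording the scores is a loop over the five dimension names. The sketch below assumes `scripts/workflow-artifact.sh set` accepts one `key=value` pair per call, as the command above suggests; the function name `record_review_scores` and the `WF_SCRIPT` override (useful for dry runs) are hypothetical.

```shell
# Hypothetical: record one score per dimension, then the iteration number.
# Arguments: iteration, then five scores in the dimension order used below.
record_review_scores() {
  # Do nothing when no workflow artifact is active.
  [ -n "${MTK_WF_UUID:-}" ] || return 0
  # WF_SCRIPT is a hypothetical override; the default is the documented script.
  wf_script="${WF_SCRIPT:-scripts/workflow-artifact.sh}"
  iteration="$1"
  shift
  for dim in correctness security test_coverage architecture_fit simplicity; do
    "$wf_script" set "$MTK_WF_UUID" "results.review_scores.$dim=$1" || return 1
    shift
  done
  "$wf_script" set "$MTK_WF_UUID" "results.review_iteration=$iteration"
}
```

Setting `WF_SCRIPT=echo` prints the six `set` invocations instead of running them, which is a cheap way to check the arguments before touching a real artifact.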
See .claude/skills/context-engineering/SKILL.md for the shared MTK rationalization table. Review-specific traps: authors are blind to their own assumptions (self-review isn't review), "mostly style" is a dodge (real review starts with correctness and risk), and soft-pedaling a real production risk to avoid blocking progress is a review failure.

- output follows `.claude/references/review-finding-schema.md` (markdown table + fenced JSON)
- (… `findings[]`)
- `below_threshold_rationale` is populated when fewer than 2 findings surface
- verdict is `NEEDS_CHANGES` when any dimension < 7