Help us improve
Share bugs, ideas, or general feedback.
From exarchos
Stage 2 code quality review assessing SOLID, DRY, security, and test quality. Required after spec review passes. Uses git diff and static analysis scripts.
npx claudepluginhub lvlup-sw/exarchosHow this skill is triggered — by the user, by Claude, or both
Slash command
/exarchos:skills-codex-quality-reviewThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Stage 2 of two-stage review: Assess code quality, maintainability, and engineering best practices.
Stage 2 code quality review assessing SOLID, DRY, security, and test quality. Required after spec review passes. Uses git diff and static analysis scripts.
Dispatches 5 specialized agents for multi-perspective code review on correctness, architecture, security, production readiness, and test quality. Merges findings, auto-fixes Critical/Important issues up to 3 rounds.
Orchestrates two-stage code review: spec compliance against story acceptance criteria first, then code quality. For quantum-loop pipeline after implementation or standalone via git diffs.
Share bugs, ideas, or general feedback.
Stage 2 of two-stage review: Assess code quality, maintainability, and engineering best practices.
Prerequisite: Spec review must PASS before quality review.
MANDATORY: Before accepting any rationalization for rubber-stamping code quality, consult
references/rationalization-refutation.md. Every common excuse is catalogued with a counter-argument and the correct action.
Activate this skill when:
review command (after spec review)This skill runs in a SUBAGENT spawned by the orchestrator, not inline.
The orchestrator provides the state file path, diff output from exarchos_orchestrate({ action: "review_diff" }), task ID, and spec review results (must be PASS).
The subagent reads the state file for artifact paths, uses the diff output instead of full files, runs static analysis, performs a code walkthrough, generates a report, and returns the verdict.
The orchestrator is responsible for generating the diff before dispatching the quality-review subagent. The subagent does NOT generate its own diff.
Orchestrator responsibilities:
git diff main...HEAD or git diff main...integration-branchSubagent responsibilities:
Instead of reading full files, receive the integrated diff:
# Generate integrated diff for review (branch stack vs main)
git diff main...HEAD > /tmp/stack-diff.patch
# Alternative: git diff for integration branch
git diff main...integration-branch > /tmp/integration-diff.patch
# Alternative: use gh CLI to get PR diff
# gh pr diff <number>
# Or use GitHub MCP pull_request_read with method "get_diff" if available
This reduces context consumption by 80-90% while providing the complete picture.
Before evaluating, query the review strategy runbook to determine the appropriate evaluation approach:
exarchos_orchestrate({ action: "runbook", id: "review-strategy" }) to determine single-pass vs two-pass evaluation strategy based on diff size and task count.After delegation completes, quality review examines:
This enables catching:
Quality Review focuses on:
Does NOT re-check:
Before reviewing, query quality signals for the skill(s) under review:
mcp__exarchos__exarchos_view({ action: "code_quality", workflowId: "<featureId>" })
regressions is non-empty, report active quality regressions to the user before proceedingconfidenceLevel: 'actionable', present the suggestedAction to the usergatePassRate < 0.80 for the target skill, warn about degrading qualityBefore proceeding, confirm spec review passed for all tasks:
action: "get", featureId: "<id>", query: "reviews"
If ANY task has specReview.status !== "pass", STOP and return:
{ "verdict": "blocked", "summary": "Spec review not passed — run spec-review first" }
If this review follows a delegation phase, verify triage routing:
exarchos_orchestrate({
action: "verify_review_triage",
featureId: "<feature-id>"
})
featureId resolves PRs (event-store projection) and review.routed events directly from the store — no .state.json/.events.jsonl needed (INV-1). A legacy file-based workflow may instead pass stateFile + eventStream.
passed: true: triage routing correct — continue to Step 1. passed: false: triage issues found — investigate and resolve before proceeding.
Runbook: Run quality evaluation gates via runbook:
exarchos_orchestrate({ action: "runbook", id: "quality-evaluation" })If runbook unavailable, usedescribeto retrieve gate schemas:exarchos_orchestrate({ action: "describe", actions: ["check_static_analysis", "check_security_scan", "check_convergence", "check_review_verdict"] })
Run automated gates via orchestrate actions. See references/gate-execution.md for orchestrate action signatures and response handling.
check_static_analysis — lint, typecheck, quality-check (D2). Must pass before continuing.check_security_scan — security pattern detection (D1). Include findings in report.check_context_economy, check_operational_resilience, check_workflow_determinism — advisory, feed convergence view.Evaluate agent-generated tests against Kent Beck's Test Desiderata. Four properties are critical for agentic code:
| Property | What to check | Flag when |
|---|---|---|
| Behavioral | Tests assert on observable behavior, not implementation details | Mock call count assertions, internal state inspection, testing private methods |
| Structure-insensitive | Tests survive refactoring without behavioral change | Tests coupled to internal helper method signatures, tests that break when internals are renamed |
| Deterministic | Tests produce the same result every run | Uncontrolled Date.now(), Math.random(), setTimeout race conditions, network-dependent tests |
| Specific | Test failures pinpoint the cause | toBeTruthy() / toBeDefined() without additional specific assertions, catch-all tests with vague descriptions |
Test layer mismatch detection: Flag unit tests with >3 mocked dependencies as potential layer mismatches — unit tests with many mocks often indicate the test is asserting integration concerns rather than unit logic. Advisory finding: suggest re-classifying as integration test with real collaborators.
Include Test Desiderata findings in the quality review report under a "Test Quality" section. Output format: Report Test Desiderata violations as entries in the issues array with category: "test-quality".
Use the template from references/review-report-template.md to structure the review output.
| Priority | Action | Examples |
|---|---|---|
| HIGH | Must fix before merge | Security issues, data loss risks |
| MEDIUM | Should fix, may defer | SOLID violations, complexity |
| LOW | Nice to have | Style preferences, minor refactors |
If classification is ambiguous, default to MEDIUM and flag for human decision.
If HIGH-priority issues found:
Fix loop iteration limit: max 3. If HIGH-priority issues persist after 3 fix-review cycles, pause and escalate to the user with a summary of unresolved issues. The user can override: review --max-fix-iterations 5
After the quality-review fix loop completes and quality passes, re-verify that the quality fixes did not break spec compliance. Run inline (not a full dispatch):
npm run test:run
npm run typecheck
exarchos_orchestrate({
action: "check_tdd_compliance",
featureId: "<featureId>",
taskId: "<taskId>",
branch: "<branch>"
})
The subagent MUST return results as structured JSON. The orchestrator parses this JSON to populate state. Any other format is an error.
{
"verdict": "pass | fail | blocked",
"summary": "1-2 sentence summary",
"issues": [
{
"severity": "HIGH | MEDIUM | LOW",
"category": "security | solid | dry | perf | naming | test-quality | other",
"file": "path/to/file",
"line": 123,
"description": "Issue description",
"required_fix": "What must change"
}
],
"test_results": {
"passed": 0,
"failed": 0,
"coverage_percent": 0
}
}
| Don't | Do Instead |
|---|---|
| Block on LOW priority | Accept and track for later |
| Review before spec passes | Complete spec review first |
| Be overly pedantic | Focus on impactful issues |
| Skip security checks | Always verify basics |
| Accept poor test quality | Tests are code too |
| Apply generic standards to language issues | Reference language-specific rules |
If an issue spans multiple tasks:
Key format: The review key MUST be kebab-case
reviews["quality-review"], not camelCasereviews.qualityReview. The guard matches on the exact key string.
On review complete:
action: "update", featureId: "<id>", updates: {
"reviews": { "quality-review": { "status": "pass", "summary": "...", "issues": [...] } }
}
On all reviews pass — advance to synthesis:
action: "transition", featureId: "<id>", target: "synthesize"
For the full transition table, consult @skills/workflow-state/references/phase-transitions.md.
Quick reference:
review → synthesize requires guard all-reviews-passed — all reviews.{name}.status must be passingreview → delegate requires guard any-review-failed — triggers fix cycle when any review failsUse exarchos_workflow({ action: "describe", actions: ["update", "init"] }) for
parameter schemas and exarchos_workflow({ action: "describe", playbook: "feature" })
for phase transitions, guards, and playbook guidance. Use
exarchos_orchestrate({ action: "describe", actions: ["check_static_analysis", "check_security_scan", "check_review_verdict"] })
for orchestrate action schemas.
For review verdict routing, query the decision runbook:
exarchos_orchestrate({ action: "runbook", id: "review-escalation" })
This runbook provides structured criteria for routing between APPROVED and NEEDS_FIXES verdicts based on finding severity and fix cycle count. APPROVED transitions to synthesize; NEEDS_FIXES transitions back to delegate for a fix cycle. (BLOCKED routing is only relevant in plan-review, not here.)
After Tier 1 MCP gates complete, execute the quality check catalog. This provides deterministic quality checks (grep patterns, structural analysis) that run on any MCP platform — no companion plugins required.
exarchos_orchestrate({ action: "prepare_review", featureId: "<id>" })
The response contains:
catalog — structured check dimensions with grep patterns, structural thresholds, and heuristic instructionsfindingFormat — the TypeScript interface for submitting findingspluginStatus — which companion plugins are configured in .exarchos.ymlExecute each check in the catalog against the codebase:
pattern against files matching fileGlobthreshold (e.g., nesting depth, function length)descriptionCollect all matches as findings in the format specified by findingFormat, then pass them as pluginFindings to check_review_verdict.
On platforms with skill support (Claude Code, Cursor), the orchestrator may additionally invoke impeccable:critique for deeper qualitative analysis. These findings are also passed as pluginFindings.
Query convergence status and compute verdict via orchestrate. See references/convergence-and-verdict.md for full orchestrate calls, response fields, and verdict routing logic.
Summary: check_convergence returns per-dimension D1-D5 status. check_review_verdict takes finding counts and optional pluginFindings array (from catalog execution and companion plugins), emits gate events, and returns APPROVED or NEEDS_FIXES.
All transitions are automatic — no user confirmation. See references/auto-transition.md for per-verdict transition details, Skill invocations, and integration notes.
Before transitioning, record the review verdict. The reviews value MUST be an object with a status field, not a flat string:
APPROVED:
exarchos_workflow({ action: "update", featureId: "<id>", updates: {
reviews: { "quality-review": { status: "pass", summary: "...", issues: [] } }
}})
exarchos_workflow({ action: "transition", featureId: "<id>", target: "synthesize" })
Then invoke synthesize.
NEEDS_FIXES:
exarchos_workflow({ action: "update", featureId: "<id>", updates: {
reviews: { "quality-review": { status: "fail", summary: "...", issues: [{ severity: "HIGH", file: "...", description: "..." }] } }
}})
Then invoke delegate --fixes.
Gate events: Do NOT manually emit
gate.executedevents viaexarchos_event. Gate events are automatically emitted by thecheck_review_verdictorchestrate handler. Manual emission causes duplicates.
Guard shape: The
all-reviews-passedguard requiresreviews.{name}.statusto be a passing value (pass,passed,approved,fixes-applied). Flat strings likereviews: { "quality-review": "pass" }are silently ignored and will block thereview → synthesizetransition.