From kiro
Substantive critical review of kiro spec artifacts (requirements, design, tasks) against the actual codebase and quality standards. Produces GO/NO-GO verdicts with specific, actionable issues.
npx claudepluginhub softwarecatscom/scc-marketplace --plugin kiroThis skill uses the workspace's default tool permissions.
Searches, retrieves, and installs Agent Skills from prompts.chat registry using MCP tools like search_skills and get_skill. Activates for finding skills, browsing catalogs, or extending Claude.
Searches prompts.chat for AI prompt templates by keyword or category, retrieves by ID with variable handling, and improves prompts via AI. Use for discovering or enhancing prompts.
Compares coding agents like Claude Code and Aider on custom YAML-defined codebase tasks using git worktrees, measuring pass rate, cost, time, and consistency.
Perform substantive critical review of kiro spec artifacts by cross-referencing against the actual codebase and project quality standards. Produces a GO/NO-GO decision with specific, actionable issues.
/kiro:reviewer {feature} requirements # Review requirements.md
/kiro:reviewer {feature} design # Review design.md
/kiro:reviewer {feature} tasks # Review tasks.md
This reviewer does what a senior engineer would do: read the spec, read the code, compare them, and identify gaps. Mechanical checks (EARS format, checkbox syntax) are necessary but insufficient. The real value is substantive review -- finding hollow sections, orphaned references, missing integrations, and misalignment between spec and reality.
1. Load Context
|-- Read the target artifact
|-- Read spec.json for feature metadata
|-- Read pipeline-log.md for prior decisions (if exists)
+-- Check for design mockup references in the spec description
2. Explore the Codebase
|-- Identify target source files from the artifact
|-- Read current implementations of referenced components
|-- Inventory existing hooks, utilities, and data patterns
+-- Map what EXISTS vs what the artifact ASSUMES exists
3. Apply Review Checklist (artifact-specific, see below)
4. Produce Verdict
|-- GO: No high-severity issues, artifact is ready for next phase
|-- NO-GO: High-severity issues that would cause problems downstream
+-- Specific issues with severity, evidence, and fix guidance
| # | Check | Severity | What to Look For |
|---|---|---|---|
| R1 | Content depth | HIGH | Are all requirements substantive? A section with 3 or fewer ACs that duplicate existing functionality is hollow. Compare each requirement's AC count and richness against the actual components available. |
| R2 | Codebase alignment | HIGH | Do requirements reference components that exist? Do they MISS components that exist? Read the target files and their imports. If the codebase has existing components or hooks that the requirement ignores, that's a gap. |
| R3 | Orphaned references | MEDIUM | Do test requirements (unit/E2E) reference behaviors not specified in functional requirements? If a test requirement says "test authorization logic" but no functional requirement defines authorization, the test AC is orphaned. |
| R4 | Implementation leaking | LOW | Are requirements prescribing HOW (specific component names, file paths, architecture choices) instead of WHAT (observable behavior)? Implementation details belong in design.md. |
| R5 | Phase completeness | MEDIUM | If work is phased, is every phase transition state defined? Example: 5 sections exist in Phase 1 but section 4 content is Phase 2 -- what does section 4 show in Phase 1? |
| R6 | EARS compliance | LOW | Do ACs use "shall" language? Are they testable and verifiable? Do they describe single behaviors? |
| R7 | Test coverage requirements | HIGH | Is there a unit test requirement with coverage threshold? Is there an E2E test requirement? Are test scenarios concrete enough to implement? |
| # | Check | Severity | What to Look For |
|---|---|---|---|
| D1 | Architecture alignment | HIGH | Does the design integrate with existing patterns? Read current source files -- does the design account for the project's actual data fetching, state management, and error handling patterns? |
| D2 | Component interface consistency | HIGH | Do proposed interfaces match existing patterns in the codebase? Check prop patterns, hook signatures, error handling conventions. |
| D3 | Missing integration points | MEDIUM | Does the design cover how new components wire into existing ones? Layout imports, route registration, navigation updates. |
| D4 | Data flow completeness | MEDIUM | Is every data fetch described? Are loading, error, and empty states covered for each async operation? |
| D5 | Testability section | MEDIUM | Does the design include a testability section? Mock boundaries, test data strategy, component isolation approach. |
| D6 | Design review rules compliance | LOW | Cross-check against .kiro/settings/rules/design-review.md criteria if the file exists. |
| # | Check | Severity | What to Look For |
|---|---|---|---|
| T1 | Requirements traceability | HIGH | Every requirement must map to at least one task. Every task must reference requirement IDs. Cross-check both directions. |
| T2 | Task actionability | MEDIUM | Are tasks concrete enough to implement without guessing? Vague tasks like "set up section" without specifying what data to show are not actionable. |
| T3 | Testing task completeness | HIGH | Dedicated unit test task group exists. Dedicated E2E test task group exists. Coverage targets specified. Testing tasks are NOT marked optional. |
| T4 | Dependency ordering | MEDIUM | Do tasks reference outputs of earlier tasks? Is the execution order logical? Can a developer start task 1 without reading task 5 first? |
| T5 | Parallel analysis correctness | LOW | If (P) markers exist, verify the marked tasks truly have no data/file dependencies. If tasks share target files, they cannot be parallel. |
| T6 | Task generation rules compliance | LOW | Cross-check against .kiro/settings/rules/tasks-generation.md if the file exists. Natural language descriptions, 2-level hierarchy, sequential numbering. |
+-- Kiro Review: {artifact-type} -- {feature-name} ----------+
| |
| Codebase explored: {N files read, key components found} |
| |
| Issues Found: |
| |
| [HIGH] Issue title |
| Evidence: {what you found in code/spec} |
| Impact: {what goes wrong if not fixed} |
| Fix: {specific revision guidance} |
| |
| [MED] Issue title |
| Evidence: ... |
| Impact: ... |
| Fix: ... |
| |
| [LOW] Issue title |
| Evidence: ... |
| Fix: ... |
| |
| Strengths: |
| - {1-2 things done well} |
| |
| -----------------------------------------------------------+
| Verdict: GO / NO-GO |
| Reason: {1-2 sentences} |
| Action: {what to do next} |
+-------------------------------------------------------------+
The reviewer operates autonomously by default. It uses a ternary decision model for every issue found:
| Action | Criteria | Example |
|---|---|---|
| Fix | Gap in the current artifact that has one clear correct resolution. The fix does not change product scope or user-facing behavior -- it fills in what was obviously missing. | Missing E2E scenario for a section that exists in requirements. Orphaned test AC referencing undefined behavior. |
| Defer | Concern that belongs to a later phase. The current artifact cannot resolve it because the answer depends on design, implementation, or data that doesn't exist yet. | Requirements reference a data source that doesn't exist -- design will define the data model. |
| Escalate | Two or more valid approaches exist that represent different product decisions, and no evidence in the spec, codebase, or prior decisions resolves the ambiguity. The reviewer cannot pick one without making a product judgment call. | A section could show summary metrics OR a detail table -- both are valid UX choices. A requirement could scope to Phase 1 or Phase 2 -- both are valid timeline choices. |
Escalate ONLY when ALL of these are true:
If any condition is false, do NOT escalate -- fix or defer instead.
When uncertain whether to fix, defer, or escalate:
The reviewer MUST read actual source files, not just the spec. Strategy by artifact type:
For requirements review:
src/ (or equivalent) for related existing code -- components, hooks, utilitiesFor design review:
For tasks review:
The reviewer is invoked by /kiro:pipeline at these points:
| Pipeline Phase | Reviewer Invocation | Purpose |
|---|---|---|
| Phase 3: After gap analysis | /kiro:reviewer {feature} requirements | Substantive review before proceeding to design |
| Phase 5: After tasks generation | /kiro:reviewer {feature} tasks | Verify traceability and actionability before implementation |
The reviewer runs BEFORE presenting the artifact to the user for approval. Its verdict and issues are included in the presentation so the user can make an informed approval decision.
| Do NOT | Do Instead |
|---|---|
| Review only the spec artifact in isolation | Read the codebase -- that's the whole point |
| Accept thin/hollow sections as "intentionally minimal" | Flag them as HIGH issues -- thin sections create poor outcomes |
| Assume referenced components exist | Verify by reading source files |
| Produce vague issues ("could be better") | Every issue needs Evidence, Impact, and Fix |
| Review implementation details in requirements | Flag them as LOW issues -- they belong in design |
| Escalate fixable gaps | Fix them. Escalation is a last resort. |