ALWAYS load before starting any task. Provides the 10-dimension weighted scoring rubric (1-10 scale) used to evaluate all implementations in WRFC loops. Dimensions include correctness, completeness, security, performance, conventions, testability, readability, error handling, type safety, and integration. Includes deterministic validation scripts and score thresholds for pass/revise/reject decisions.
Applies a weighted 10-dimension scoring rubric to evaluate code implementations and determine pass/fail verdicts.
/plugin marketplace add mgd34msu/goodvibes-plugin
/plugin install goodvibes@goodvibes-market

This skill inherits all available tools. When active, it can use any tool Claude has access to.
scripts/
  validate-review.sh
  validate-fix.sh
references/
  scoring-examples.md
This skill defines the precise scoring rubric and review format used in Work-Review-Fix-Check (WRFC) loops. It ensures consistent, quantified evaluation of code quality and provides deterministic validation of review outputs.
Every review evaluates code across 10 dimensions. Each dimension receives a score from 1 to 10.
**Correctness (20%)**: Does the code work as intended?
Scoring criteria:
Common issues:
**Completeness (15%)**: Is everything implemented fully?
Scoring criteria:
Common issues:
**Security (15%)**: Are vulnerabilities prevented?
Scoring criteria:
Common issues:
**Performance (10%)**: Is it efficient?
Scoring criteria:
Common issues:
**Conventions (10%)**: Does it follow project patterns?
Scoring criteria:
Common issues:
**Testability (10%)**: Are tests present and meaningful?
Scoring criteria:
Common issues:
**Readability (5%)**: Is it easy to understand?
Scoring criteria:
Common issues:
**Error Handling (5%)**: Are errors handled gracefully?
Scoring criteria:
Common issues:
**Type Safety (5%)**: Are types correct and comprehensive?
Scoring criteria:
- No `any`, generics used appropriately, strict mode enabled
- Minimal `any` types with TODO comments to fix
- Some `any` types, type assertions without validation, loose typing
- `any` everywhere, type assertions hiding errors

Common issues:
- Use of `any`
- Type assertions (`as`) without runtime validation

**Integration (5%)**: Does it work with existing code?
Scoring criteria:
Common issues:
The overall score is a weighted average of the 10 dimensions:
Overall = (Correctness x 0.20) +
(Completeness x 0.15) +
(Security x 0.15) +
(Performance x 0.10) +
(Conventions x 0.10) +
(Testability x 0.10) +
(Readability x 0.05) +
(Error Handling x 0.05) +
(Type Safety x 0.05) +
(Integration x 0.05)
Example:
Correctness: 9
Completeness: 8
Security: 10
Performance: 7
Conventions: 9
Testability: 6
Readability: 8
Error Handling: 7
Type Safety: 9
Integration: 9
Overall = (9x0.20) + (8x0.15) + (10x0.15) + (7x0.10) + (9x0.10) +
(6x0.10) + (8x0.05) + (7x0.05) + (9x0.05) + (9x0.05)
= 1.8 + 1.2 + 1.5 + 0.7 + 0.9 + 0.6 + 0.4 + 0.35 + 0.45 + 0.45
= 8.35/10
The overall score determines the verdict:
| Score Range | Verdict | Action Required |
|---|---|---|
| >= 9.5 | PASS | Ship it -- production ready |
| 8.0-9.49 | CONDITIONAL PASS | Minor issues -- fix and re-check (8.0 is inclusive, no full re-review) |
| 6.0-7.9 | FAIL | Significant issues -- fix and full re-review required |
| Below 6.0 | FAIL | Major rework needed -- fix and full re-review required |
Critical dimension rule: If any dimension scores below 4, the overall verdict is automatically FAIL regardless of the calculated score.
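As a minimal sketch, the weighting, thresholds, and critical dimension rule above can be expressed in code (the function and type names here are illustrative, not part of this skill's actual scripts):

```typescript
// Illustrative sketch of the scoring math above; names are hypothetical.
type Dimension =
  | "correctness" | "completeness" | "security" | "performance"
  | "conventions" | "testability" | "readability" | "errorHandling"
  | "typeSafety" | "integration";

const WEIGHTS: Record<Dimension, number> = {
  correctness: 0.20, completeness: 0.15, security: 0.15,
  performance: 0.10, conventions: 0.10, testability: 0.10,
  readability: 0.05, errorHandling: 0.05, typeSafety: 0.05,
  integration: 0.05,
};

// Weighted average across all 10 dimensions.
function overallScore(scores: Record<Dimension, number>): number {
  let total = 0;
  for (const dim of Object.keys(WEIGHTS) as Dimension[]) {
    total += scores[dim] * WEIGHTS[dim];
  }
  return total;
}

function verdict(scores: Record<Dimension, number>): string {
  // Critical dimension rule: any dimension below 4 forces a FAIL.
  if (Object.values(scores).some((s) => s < 4)) return "FAIL";
  const overall = overallScore(scores);
  if (overall >= 9.5) return "PASS";
  if (overall >= 8.0) return "CONDITIONAL PASS";
  return "FAIL";
}

// The worked example from above, which totals 8.35.
const example: Record<Dimension, number> = {
  correctness: 9, completeness: 8, security: 10, performance: 7,
  conventions: 9, testability: 6, readability: 8, errorHandling: 7,
  typeSafety: 9, integration: 9,
};
```

Running `verdict(example)` on the worked example yields CONDITIONAL PASS, matching the 8.0-9.49 band in the table.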
Every review MUST produce this exact structure. Validation scripts check for these sections.
## Review Summary
- **Overall Score**: X.X/10
- **Verdict**: PASS | CONDITIONAL PASS | FAIL
- **Files Reviewed**: [list of files]
## Dimension Scores
| Dimension | Score | Notes |
|-----------|-------|-------|
| Correctness | X/10 | [specific findings] |
| Completeness | X/10 | [specific findings] |
| Security | X/10 | [specific findings] |
| Performance | X/10 | [specific findings] |
| Conventions | X/10 | [specific findings] |
| Testability | X/10 | [specific findings] |
| Readability | X/10 | [specific findings] |
| Error Handling | X/10 | [specific findings] |
| Type Safety | X/10 | [specific findings] |
| Integration | X/10 | [specific findings] |
## Issues Found
### Critical (must fix)
- [FILE:LINE] Description of issue. Fix: [specific fix]
- [FILE:LINE] Description of issue. Fix: [specific fix]
### Major (should fix)
- [FILE:LINE] Description of issue. Fix: [specific fix]
- [FILE:LINE] Description of issue. Fix: [specific fix]
### Minor (nice to fix)
- [FILE:LINE] Description of issue. Fix: [specific fix]
- [FILE:LINE] Description of issue. Fix: [specific fix]
## What Was Done Well
- [specific positive observation with file reference]
- [specific positive observation with file reference]
Critical: Must be fixed before shipping. Examples:
Major: Should be fixed before shipping. Examples:
Minor: Nice to fix but not blockers. Examples:
Note: Severity can depend on system risk context (e.g., performance issue may be Critical in high-scale systems).
When a fix agent receives a review with issues:
After applying fixes, the fix agent must produce:
## Fixes Applied
### Critical Issues Addressed
- [FILE:LINE] [Original issue] -> Fixed by: [what was changed]
- [FILE:LINE] [Original issue] -> Fixed by: [what was changed]
### Major Issues Addressed
- [FILE:LINE] [Original issue] -> Fixed by: [what was changed]
### Minor Issues Addressed
- [FILE:LINE] [Original issue] -> Fixed by: [what was changed]
### Issues Not Fixed
- [FILE:LINE] [Original issue] -> Reason: [why it wasn't fixed]
**Example**:
- [src/api/legacy.ts:45] Complex refactoring of legacy code -> Reason: Out of scope for this PR, tracked in ticket #1234
After fixes are applied, a re-reviewer must:
Check each previously flagged issue
Re-score all dimensions
Identify new issues
## Re-Review Summary
- **Overall Score**: X.X/10 (was Y.Y/10)
- **Verdict**: PASS | CONDITIONAL PASS | FAIL
- **Previous Issues**: X critical, Y major, Z minor
- **Issues Resolved**: X critical, Y major, Z minor
- **New Issues Found**: X critical, Y major, Z minor
## Dimension Score Changes
| Dimension | Previous | Current | Change |
|-----------|----------|---------|--------|
| Correctness | X/10 | X/10 | +/- X |
| [etc] | ... | ... | ... |
## Previous Issues - Resolution Status
### Critical Issues
- [RESOLVED] [FILE:LINE] [Original issue] -> RESOLVED
- [NOT FIXED] [FILE:LINE] [Original issue] -> NOT FIXED: [reason]
### Major Issues
- [RESOLVED] [FILE:LINE] [Original issue] -> RESOLVED
### Minor Issues
- [RESOLVED] [FILE:LINE] [Original issue] -> RESOLVED
## New Issues Found
[Use standard Critical/Major/Minor format]
This skill is a critical component of the Work-Review-Fix-Check loop:
The orchestrator uses the numeric score and verdict to make decisions:
Wrong: Giving 8-9 scores when significant issues exist
Right: Use the rubric literally -- 6-7 means "acceptable but with notable issues"

Wrong: Marking a security vulnerability as "Major" instead of "Critical"
Right: Use severity guidelines -- auth bypass is ALWAYS Critical

Wrong: "The error handling is poor"
Right: "src/api/users.ts:42 - Empty catch block silently swallows errors"
Wrong: "Fix the type safety issues"
Right: "Replace any with User type and add runtime validation with zod schema"
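The "Right" fix can be sketched as follows; the `User` shape and `isUser` guard are hypothetical, and a zod schema could replace the hand-written guard:

```typescript
// Hypothetical example of replacing `any` with a concrete type plus
// runtime validation (a zod schema could serve the same purpose).
interface User {
  id: number;
  name: string;
}

// Runtime type guard: validates unknown input instead of asserting with `as`.
function isUser(value: unknown): value is User {
  return (
    typeof value === "object" && value !== null &&
    typeof (value as { id?: unknown }).id === "number" &&
    typeof (value as { name?: unknown }).name === "string"
  );
}

// Before: const user = JSON.parse(body) as any;  // hides errors
// After: parse to `unknown`, then validate before use.
function parseUser(body: string): User {
  const parsed: unknown = JSON.parse(body);
  if (!isUser(parsed)) throw new Error("Invalid user payload");
  return parsed;
}
```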
Wrong: Scoring 9.5 instead of 10 because "the domain is complex" or "staying below perfection"
Right: If no issues are identified, the score is 10. Scores must be objective and based solely on identified, actionable issues. A 10/10 is always achievable and must be awarded when no deficiencies are found. Never withhold points for subjective reasons like domain complexity, code novelty, or philosophical caution.

Wrong: Only listing problems
Right: Also document what was done well (encourages good patterns)

Wrong: Overall score 7.2/10 with verdict "PASS"
Right: Score 7.2 -> Verdict FAIL (threshold is 8.0 for conditional, 9.5 for pass)
This skill includes deterministic validation scripts:
See scripts/ directory for implementation details.
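A sketch of the kind of deterministic check these scripts perform (the actual validate-review.sh may check more or differently; this TypeScript version is illustrative only):

```typescript
// Illustrative re-implementation of the section checks a review validator
// might run; the real validate-review.sh may differ.
const REQUIRED_SECTIONS = [
  "## Review Summary",
  "## Dimension Scores",
  "## Issues Found",
  "## What Was Done Well",
];

// Returns the list of required elements missing from the review text.
function validateReview(review: string): string[] {
  const missing = REQUIRED_SECTIONS.filter((s) => !review.includes(s));
  // The summary must also carry a score line and a verdict line.
  if (!/\*\*Overall Score\*\*: \d+(\.\d+)?\/10/.test(review)) {
    missing.push("Overall Score line");
  }
  if (!/\*\*Verdict\*\*: (PASS|CONDITIONAL PASS|FAIL)/.test(review)) {
    missing.push("Verdict line");
  }
  return missing;
}
```

An empty result means the review conforms to the required output format; a non-empty result lists what a fix agent must add before re-validation.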
Correctness: 20% (most important)
Completeness: 15%
Security: 15%
Performance: 10%
Conventions: 10%
Testability: 10%
Readability: 5%
Error Handling: 5%
Type Safety: 5%
Integration: 5%
9.5+ -> PASS
8.0-9.49 -> CONDITIONAL PASS
6.0-7.9 -> FAIL
<6.0 -> FAIL (major rework)
Critical: Security, crashes, data corruption -> MUST FIX
Major: Performance, missing features, poor practices -> SHOULD FIX
Minor: Style, optimization opportunities -> NICE TO FIX