From project-toolkit:

Assesses code maintainability through 5 qualities (cohesion, coupling, encapsulation, testability, non-redundancy) with scoring rubrics. Generates markdown reports with remediation for methods, classes, and modules across languages.

```shell
npx claudepluginhub rjmurillo/ai-agents --plugin project-toolkit
```

This skill uses the workspace's default tool permissions.
Evaluate code maintainability using 5 timeless design qualities with quantifiable scoring rubrics.
Trigger phrases: "assess code quality", "evaluate maintainability", "check code qualities", "testability review", "run quality assessment".

```shell
# Assess a single file
python3 scripts/assess.py --target src/services/auth.py

# Assess changed files only (CI mode)
python3 scripts/assess.py --target . --changed-only --format json

# Full module assessment with HTML report
python3 scripts/assess.py --target src/services/ --format html --output quality-report.html
```
| Quality | Question | Score 10 | Score 1-3 |
|---|---|---|---|
| Cohesion | How related are responsibilities? | Single, well-defined responsibility | Unrelated responsibilities jammed together |
| Coupling | How dependent on other code? | Minimal deps, depends on abstractions | Tightly coupled, hard-coded dependencies |
| Encapsulation | How well are internals hidden? | All internals private, minimal API | Everything public, no information hiding |
| Testability | How easily verified in isolation? | Pure functions, injected dependencies | Hard to test, requires full integration |
| Non-Redundancy | How unique is each piece of knowledge? | Zero duplication, appropriate abstractions | Pervasive copy-paste |
Use this skill when:
Use analyze instead when:
The skill runs automated assessment via `scripts/assess.py`:

1. **Symbol Extraction**
2. **Quality Scoring**
3. **Comparison** (if historical data exists)
4. **Report Generation**
5. **Gate Enforcement** (CI mode)
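Under the assumption that each step maps to a function, the pipeline above can be sketched as follows. All names here are hypothetical and do not reflect the actual internals of `scripts/assess.py`:

```python
# Hypothetical sketch of the five-step assessment pipeline.
def extract_symbols(path):
    """Step 1: parse the target and list assessable symbols (placeholder)."""
    return [{"name": "User", "kind": "class", "file": path}]

def score_symbol(symbol):
    """Step 2: apply the five scoring rubrics (placeholder values)."""
    return {"cohesion": 8, "coupling": 4, "encapsulation": 9,
            "testability": 7, "nonRedundancy": 9}

def assess(path, previous=None):
    scores = [score_symbol(s) for s in extract_symbols(path)]
    report = {"target": path, "scores": scores}
    # Step 3: compare against historical data when available
    if previous is not None:
        report["regressed"] = any(
            cur[q] < prev[q]
            for cur, prev in zip(scores, previous["scores"])
            for q in cur
        )
    return report  # Step 4: report generation; step 5 gates on the result

report = assess("src/models/user.py")
print(report["scores"][0]["cohesion"])  # 8
```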
```shell
python3 scripts/assess.py --target <path> [options]
```
| Parameter | Required | Default | Description |
|---|---|---|---|
| `--target` | Yes | - | File, directory, or glob pattern |
| `--context` | No | `production` | `production`, `test`, or `generated` |
| `--changed-only` | No | `false` | Only assess changed files (git diff) |
| `--format` | No | `markdown` | `markdown`, `json`, or `html` |
| `--config` | No | `.qualityrc.json` | Path to config file |
| `--output` | No | stdout | Output file path |
| `--use-serena` | No | `auto` | `auto`, `yes`, or `no` (Serena integration) |
| Code | Meaning |
|---|---|
| 0 | Assessment complete, all thresholds met |
| 10 | Quality degraded vs previous run |
| 11 | Quality below configured thresholds |
| 1 | Script error (invalid args, file not found) |
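A CI wrapper can branch on these codes. The sketch below is illustrative; `ci_verdict` is a hypothetical helper, not part of the skill, and the constants simply mirror the table above:

```python
# Exit codes as documented above
EXIT_OK = 0          # all thresholds met
EXIT_DEGRADED = 10   # quality degraded vs previous run
EXIT_BELOW = 11      # quality below configured thresholds
EXIT_ERROR = 1       # script error (invalid args, file not found)

def ci_verdict(exit_code, block_on_regression_only=True):
    """Translate an exit code into a pass/fail decision for a PR gate."""
    if exit_code == EXIT_OK:
        return "pass"
    if exit_code == EXIT_DEGRADED:
        return "fail"  # regression: always block
    if exit_code == EXIT_BELOW:
        # Pre-existing low scores: block only if configured to do so
        return "pass" if block_on_regression_only else "fail"
    return "error"

print(ci_verdict(10))  # fail
```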
Create `.qualityrc.json` to customize thresholds:

```json
{
  "thresholds": {
    "cohesion": { "min": 7, "warn": 5 },
    "coupling": { "max": 3, "warn": 5 },
    "encapsulation": { "min": 7, "warn": 5 },
    "testability": { "min": 6, "warn": 4 },
    "nonRedundancy": { "min": 8, "warn": 6 }
  },
  "context": {
    "test": {
      "testability": { "min": 3 }
    }
  },
  "ignore": [
    "**/generated/**",
    "**/*.pb.py",
    "**/migrations/**"
  ]
}
```
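One plausible way such thresholds could be applied is sketched below. The skill's actual config loader may differ; `check` is a hypothetical helper, and the `max` rule is interpreted here as a ceiling:

```python
import json

# Inline copy of two example threshold rules; normally read from .qualityrc.json
CONFIG = json.loads("""
{
  "thresholds": {
    "cohesion": {"min": 7, "warn": 5},
    "coupling": {"max": 3, "warn": 5}
  }
}
""")

def check(quality, score, thresholds=CONFIG["thresholds"]):
    """Return 'ok', 'warn', or 'fail' for one quality score."""
    rule = thresholds[quality]
    if "min" in rule:
        # 'min' rule: score must reach the floor
        if score >= rule["min"]:
            return "ok"
        return "warn" if score >= rule["warn"] else "fail"
    # 'max' rule: score must not exceed the ceiling
    if score <= rule["max"]:
        return "ok"
    return "warn" if score <= rule["warn"] else "fail"

print(check("cohesion", 6))  # warn
print(check("coupling", 4))  # warn
```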
| Avoid | Why | Instead |
|---|---|---|
| Running on entire codebase every commit | Slow, noisy | Use `--changed-only` in CI |
| Using scores for performance reviews | Gaming the system | Focus on trend improvement |
| Blocking merges on absolute scores | Discourages refactoring old code | Block on regression only |
| Ignoring context (test vs production) | False positives | Use `--context` flag |
| Not configuring thresholds | One-size-fits-all does not fit | Customize `.qualityrc.json` |
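"Block on regression only" can be implemented by diffing two runs. A minimal sketch, assuming a simple `{file: {quality: score}}` shape (illustrative, not the skill's actual JSON layout):

```python
def regressions(baseline, current):
    """Yield (file, quality, old, new) for every score that dropped."""
    for path, new_scores in current.items():
        old_scores = baseline.get(path, {})
        for quality, new in new_scores.items():
            old = old_scores.get(quality)
            if old is not None and new < old:
                yield path, quality, old, new

baseline = {"src/auth.py": {"cohesion": 8, "coupling": 4}}
current = {"src/auth.py": {"cohesion": 8, "coupling": 3}}
print(list(regressions(baseline, current)))
# [('src/auth.py', 'coupling', 4, 3)]
```

A gate built this way never penalizes pre-existing low scores, only decreases relative to the baseline.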
After running assessment:
**Cohesion**: How strongly related are responsibilities within a boundary?
High cohesion = focused, understandable code. Low cohesion = "god objects" doing too much.
| Score | Description |
|---|---|
| 10 | Single, well-defined responsibility |
| 7-9 | Primary responsibility clear, minor supporting concerns |
| 4-6 | Multiple loosely related responsibilities |
| 1-3 | Unrelated responsibilities jammed together |
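To make the extremes concrete, here is an illustrative (hypothetical, not from the skill) low-cohesion class next to a focused alternative:

```python
# Low cohesion (score 1-3): unrelated responsibilities jammed together
class AppManager:
    def authenticate(self, user, password): ...
    def render_invoice_pdf(self, invoice): ...
    def resize_avatar(self, image): ...

# High cohesion (score 10): one well-defined responsibility
class Authenticator:
    def authenticate(self, user, password):
        # Toy check for illustration only; real code would hash and compare
        return password == user.get("password")

user = {"name": "ada", "password": "s3cret"}
print(Authenticator().authenticate(user, "s3cret"))  # True
```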
**Coupling**: How dependent is this code on other code?
Loose coupling = independent evolution, easy testing. Tight coupling = fragile, hard to test.
| Score | Description |
|---|---|
| 10 | Minimal dependencies, depends on abstractions |
| 7-9 | Few dependencies, all explicit |
| 4-6 | Moderate dependencies, some global state |
| 1-3 | Tightly coupled, hard-coded dependencies |
**Encapsulation**: How well are implementation details hidden?
Good encapsulation = freedom to change internals. Poor encapsulation = brittle API.
| Score | Description |
|---|---|
| 10 | All internals private, minimal public API |
| 7-9 | Mostly private, well-defined API |
| 4-6 | Some internals exposed |
| 1-3 | Everything public, no information hiding |
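A hypothetical illustration of the two ends of the encapsulation scale:

```python
# Poor encapsulation (score 1-3): internal state is part of the API
class Counter:
    def __init__(self):
        self.values = []  # callers can mutate this freely

# Good encapsulation (score 10): private state, minimal public API
class SafeCounter:
    def __init__(self):
        self._count = 0  # leading underscore: internal by convention

    def increment(self):
        self._count += 1

    @property
    def count(self):
        return self._count  # read-only view of internal state

c = SafeCounter()
c.increment()
print(c.count)  # 1
```

Because callers only see `increment()` and `count`, the internal representation can change without breaking them.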
**Testability**: How easily can behavior be verified in isolation?
Testable code = fast feedback, confidence to refactor. Untestable code = fear of change.
| Score | Description |
|---|---|
| 10 | Pure functions, injected dependencies |
| 7-9 | Mostly testable, straightforward to mock |
| 4-6 | Moderately testable, requires setup |
| 1-3 | Hard to test, requires full integration |
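An illustrative (hypothetical) contrast between a hidden dependency and an injected one:

```python
import datetime

# Hard to test (score 1-3): hidden dependency on the current time
def is_weekend_now():
    return datetime.date.today().weekday() >= 5

# Testable (score 10): the dependency becomes a parameter
def is_weekend(day: datetime.date) -> bool:
    return day.weekday() >= 5

# The pure version can be verified in isolation with fixed inputs
print(is_weekend(datetime.date(2024, 1, 6)))  # True (a Saturday)
```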
**Non-Redundancy**: How unique is each piece of knowledge?
DRY code = fix once, single source of truth. Duplication = fix N times, maintenance burden.
| Score | Description |
|---|---|
| 10 | Zero duplication, appropriate abstractions |
| 7-9 | Minimal duplication (intentional) |
| 4-6 | Moderate duplication, missed abstractions |
| 1-3 | Pervasive copy-paste |
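A hypothetical example of duplicated knowledge versus a single source of truth (the tax rate here is made up):

```python
# Duplicated knowledge (score 1-3): the tax rate lives in two places
def invoice_total(net):
    return round(net * 1.19, 2)

def quote_total(net):
    return round(net * 1.19, 2)  # copy-paste: a rate change means fixing twice

# Single source of truth (score 10): define the rate once
TAX_RATE = 0.19

def gross(net):
    return round(net * (1 + TAX_RATE), 2)

print(gross(100))  # 119.0
```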
```shell
python3 scripts/assess.py --target src/models/user.py
```

Output:
```markdown
# Code Quality Assessment: src/models/user.py

## Summary

- **Cohesion**: 8/10
- **Coupling**: 4/10 (warning)
- **Encapsulation**: 9/10
- **Testability**: 7/10
- **Non-Redundancy**: 9/10

## Issues Found

### Coupling: 4/10 (Warning)

**Problem**: Direct instantiation of DatabaseConnection in constructor
**Impact**: Hard to test, tightly coupled to database layer
**Remediation**: Use dependency injection

- See: [Dependency Injection](references/patterns/dependency-injection.md)
- Related ADR: ADR-023 (Dependency Management)
```
Example fix:

```python
# Before
class User:
    def __init__(self):
        self.db = DatabaseConnection()  # Hard-coded dependency

# After
class User:
    def __init__(self, db: DatabaseInterface):
        self.db = db  # Injected dependency
```
```shell
# In CI pipeline
python3 scripts/assess.py --target . --changed-only --format json --output quality.json

# Exit code 10 = quality degraded, fail PR
# Exit code 0  = quality maintained, pass
```
```shell
python3 scripts/assess.py --target src/ --format html --output reports/quality.html
```
Opens dashboard showing:
```shell
# Identify refactoring targets
python3 scripts/assess.py --target src/ --format json | \
  jq '.files | sort_by(.overall) | .[0:5]' > low-quality-files.json

# Feed to planner
planner --input low-quality-files.json --goal "Refactor lowest quality files"
```
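If jq is unavailable, the same selection works in plain Python. The JSON shape below is assumed from the jq filter (a `files` array with an `overall` field), not confirmed by the skill's actual output:

```python
import json

# Hypothetical shape of the --format json output
report = json.loads("""
{"files": [
  {"path": "src/a.py", "overall": 7.2},
  {"path": "src/b.py", "overall": 3.1},
  {"path": "src/c.py", "overall": 5.0}
]}
""")

# Lowest-scoring files first, as refactoring candidates
worst = sorted(report["files"], key=lambda f: f["overall"])[:5]
print([f["path"] for f in worst])  # ['src/b.py', 'src/c.py', 'src/a.py']
```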
When reviewing ADRs, include quality impact:
```shell
# Before implementing ADR
python3 scripts/assess.py --target affected-files.txt > baseline.md

# After implementing ADR
python3 scripts/assess.py --target affected-files.txt > post-implementation.md

# Compare
diff baseline.md post-implementation.md
```
Combine broad analysis with focused quality metrics:
```shell
# First: broad exploration
analyze --target src/

# Then: quality deep dive on problem areas
python3 scripts/assess.py --target src/services/auth.py
```
For detailed scoring methodology and examples:
| Support Level | Languages |
|---|---|
| Full | Python (.py), TypeScript/JavaScript (.ts, .js, .tsx, .jsx), C# (.cs), Java (.java), Go (.go) |
| Partial (heuristic) | Ruby (.rb), Rust (.rs), PHP (.php), Kotlin (.kt) |
Serena integration improves accuracy when available.
This skill embodies "sergeant methods directing privates":
Each quality scorer is cohesive (single responsibility), loosely coupled (independent), and testable (pure calculation).
These 5 qualities are computer science fundamentals.
Language-agnostic design ensures longevity across technology shifts.