From universe

Reviews test files for bug-catching quality, grading on six dimensions such as assertion depth, input coverage, and mock health, with an actionable scorecard.

npx claudepluginhub mbwsims/claude-universe --plugin universe
Evaluate existing tests for quality issues that let bugs through. Grades test files on six dimensions and produces a scorecard with specific, actionable findings — not generic advice.
This is not about code style or formatting. It is about whether the tests actually catch bugs.
Call testkit_analyze with the target test file. If available, use the structured metrics
(shallow assertion count, error coverage ratio, mock health, name quality) as the foundation
for dimension scoring in step 3. Supplement with semantic analysis for dimensions the tool
cannot measure (input coverage, independence).
If testkit_analyze is unavailable, perform full manual analysis as described below.
Note to the user: "Running without testkit-mcp — analysis is based on code reading.
Install testkit-mcp (npm install -g testkit-mcp) for precise metrics."
If a test file was specified, read it. Otherwise, discover test files using Glob:
**/*.test.*, **/*.spec.*, **/__tests__/**.
If multiple test files are found and none was specified, review the most recently modified
test file. Use ls -t or equivalent to determine modification order. Do not ask the user
to choose — pick the most recent one and note it: "Reviewing {test-file-name} (most recently
modified test file). Specify a file to review a different one."
Read both the test file and the source code it exercises.
Understanding the source code's contract is essential for evaluating whether the tests cover the right things.
Score each dimension using letter grades (A through F, with + and - for fine-grained
distinctions). Consult references/smell-catalog.md for detection patterns and
references/scoring-rubric.md for calibration.
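As a rough sketch of how a measured ratio might map to a letter grade, here is a hypothetical helper. The function name and cutoffs are illustrative only (and the +/- modifiers are omitted for brevity); the actual calibration lives in references/scoring-rubric.md.

```javascript
// Illustrative only: map a 0–1 dimension ratio (e.g. error coverage
// ratio from testkit_analyze) to a letter grade. Cutoffs are assumed,
// not taken from references/scoring-rubric.md.
function letterGrade(ratio) {
  if (ratio >= 0.9) return 'A';
  if (ratio >= 0.8) return 'B';
  if (ratio >= 0.65) return 'C';
  if (ratio >= 0.5) return 'D';
  return 'F';
}
```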
Dimension 1: Assertion Depth
Do assertions verify specific values, or only existence? Flag shallow matchers:
toBeDefined, toBeTruthy, toHaveBeenCalled without arguments.

Dimension 2: Input Coverage
Dimension 3: Error Testing
Are failure paths asserted with specific error expectations (e.g. expect(fn).toThrow())?

Dimension 4: Mock Health
Dimension 5: Specification Clarity
Dimension 6: Independence
Does beforeEach properly reset state, or does it accumulate?

If a /test-plan was produced for this code earlier in the conversation:
compare the planned cases against the tests that actually exist, and flag planned cases that were never written.
This catches the gap between "what we planned to test" and "what we actually tested."
Report format:
## Test Review — {test-file-name}
**Grade: {letter}** — {one-line summary}
| Dimension | Score | Finding |
|-----------|-------|---------|
| Assertion depth | {A-F} | {specific finding with count} |
| Input coverage | {A-F} | {what's missing} |
| Error testing | {A-F} | {count of gaps} |
| Mock health | {A-F} | {assessment} |
| Specification clarity | {A-F} | {assessment} |
| Independence | {A-F} | {assessment} |
### Priority Fixes
1. **{file}:{line}** — {specific issue}
{What to change and why — what bugs this would catch}
2. **Missing: {category}** — {what tests don't exist}
{Specific test cases to add}
3. ...
After the scorecard, list the most important tests that don't exist yet. These should be concrete — specific input values, specific expected outputs, specific error types. Not "add more edge case tests" but "add a test for empty email that expects ValidationError."
/test — Use to fix issues found in the review
/test-plan — Use to identify missing coverage categories before adding tests
references/smell-catalog.md — Test smells organized by dimension with detection patterns, fixes, and before/after examples
references/scoring-rubric.md — Grading methodology with calibrated examples at each grade level