From armory
Performs diff-based PR reviews across code quality, test coverage, silent failures, type design, and comment quality with severity-ranked findings on git diffs or specified PRs.
Install:

```
npx claudepluginhub mathews-tom/armory --plugin armory
```

This skill uses the workspace's default tool permissions.
Diff-based code review across five dimensions. Reads the changed files, selects applicable review methodologies, and produces an aggregated report with severity-ranked findings.
Native alternative: Claude Code's `/ultrareview` runs a lightweight native bug-focused review (three free per month on Pro/Max plans at Opus 4.7's launch). Use this skill for five-dimension severity-ranked analysis (code quality + tests + error handling + types + comments) with `file:line` references; use `/ultrareview` for a quick bug-hunting pass on a diff.
| File | Contents | Load When |
|---|---|---|
| `references/code-review.md` | Guideline compliance, bug detection, confidence scoring | Always |
| `references/test-analysis.md` | Behavioral test coverage, criticality rating | Test files changed |
| `references/error-handling.md` | Silent failure patterns, catch block analysis | Error handling changed |
| `references/type-design.md` | Invariant analysis, 4-dimension rating rubric | Type definitions added/modified |
| `references/comment-quality.md` | Comment accuracy, long-term value, rot detection | Comments/docstrings added |
Gather the diff with one of:

- `git diff` (unstaged changes)
- `git diff main...HEAD` or `gh pr diff <number>`
- `git diff --name-only`

Classify changed files and select applicable dimensions:
| Condition | Dimension | Reference to Load |
|---|---|---|
| Always | Code review | references/code-review.md |
| Files matching `*test*`, `*spec*`, `*_test.*`, `test_*` | Test analysis | references/test-analysis.md |
| Files containing `try/catch`, `except`, `.catch`, `Result`, error callbacks | Error handling | references/error-handling.md |
| Files containing `class`, `interface`, `type`, `struct`, `enum`, `dataclass` definitions | Type design | references/type-design.md |
| Files with new/modified docstrings, JSDoc, or block comments | Comment quality | references/comment-quality.md |
Load only the reference files that apply. Skip dimensions with no matching files.
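The routing table above can be sketched in code. This is a hypothetical illustration, not part of the skill itself: the `select_dimensions` name, the token lists, and the content-sniffing heuristics are assumptions that mirror the table's conditions.

```python
import re

# Filename patterns from the routing table (glob-style, expressed as regexes).
TEST_PATTERNS = [r".*test.*", r".*spec.*", r".*_test\..*", r"test_.*"]

def select_dimensions(path: str, contents: str = "") -> list[str]:
    """Map one changed file to the review dimensions that apply to it."""
    dims = ["code-review"]  # always applied
    name = path.rsplit("/", 1)[-1]
    if any(re.fullmatch(p, name) for p in TEST_PATTERNS):
        dims.append("test-analysis")
    # Crude token sniffing for error-handling constructs (try/except/.catch).
    if any(tok in contents for tok in ("try:", "try {", "except", ".catch(")):
        dims.append("error-handling")
    # Type definitions: class/interface/struct/enum/dataclass keywords.
    if any(tok in contents for tok in ("class ", "interface ", "struct ", "enum ", "@dataclass")):
        dims.append("type-design")
    # New docstrings, JSDoc, or block comments.
    if '"""' in contents or "/**" in contents:
        dims.append("comment-quality")
    return dims
```

A real implementation would inspect the diff hunks rather than whole-file contents, but the routing decision is the same shape.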
For each applicable dimension, analyze the diff using the loaded methodology:
Merge all findings into a single report, deduplicated and severity-ranked.
Deduplication rules:
Severity mapping across dimensions:
| Dimension | Maps to Critical | Maps to Important | Maps to Suggestion |
|---|---|---|---|
| Code review | Confidence 90-100 | Confidence 80-89 | — |
| Test analysis | Rating 9-10 | Rating 7-8 | Rating 5-6 |
| Error handling | CRITICAL | HIGH | MEDIUM |
| Type design | Any rating <= 3/10 | Any rating 4-6/10 | Rating 7-8/10 |
| Comment quality | Factually incorrect | Misleading or incomplete | Restates obvious code |
# PR Review Summary
**Scope:** [X files changed, Y dimensions applied]
**Dimensions:** [list of active dimensions]
## Critical Issues (must fix before merge)
- **[dimension]** `file:line` — Description. Fix suggestion.
## Important Issues (should fix)
- **[dimension]** `file:line` — Description. Fix suggestion.
## Suggestions (consider)
- **[dimension]** `file:line` — Description.
## Strengths
- What's well-done in this changeset.
## Recommended Action
1. Fix critical issues
2. Address important issues
3. Consider suggestions
4. Re-run review after fixes
If no issues are found at any severity level, confirm the code meets standards with a brief summary of what was reviewed and which dimensions were applied.
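To make the template concrete, here is a hypothetical rendering sketch; the `Finding` dataclass and `render_report` signature are assumptions, not part of the skill. Empty severity sections are omitted, matching the "no issues found" behavior described above.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    dimension: str
    location: str     # "file:line"
    description: str
    severity: str     # "Critical" | "Important" | "Suggestion"

def render_report(findings: list[Finding], files_changed: int,
                  dimensions: list[str]) -> str:
    """Render aggregated findings into the markdown report template."""
    lines = ["# PR Review Summary",
             f"**Scope:** [{files_changed} files changed, "
             f"{len(dimensions)} dimensions applied]",
             f"**Dimensions:** [{', '.join(dimensions)}]"]
    sections = [("Critical", "Critical Issues (must fix before merge)"),
                ("Important", "Important Issues (should fix)"),
                ("Suggestion", "Suggestions (consider)")]
    for severity, heading in sections:
        hits = [f for f in findings if f.severity == severity]
        if hits:  # skip empty sections entirely
            lines.append(f"## {heading}")
            lines += [f"- **[{f.dimension}]** `{f.location}` — {f.description}"
                      for f in hits]
    return "\n".join(lines)
```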
Users can request specific dimensions instead of running all:
| User Says | Dimensions Applied |
|---|---|
| "review my PR" / "check my changes" | All applicable (default) |
| "review the code" / "check code quality" | Code review only |
| "check the tests" / "is test coverage good" | Test analysis only |
| "check error handling" / "find silent failures" | Error handling only |
| "review the types" / "check type design" | Type design only |
| "check the comments" / "review documentation" | Comment quality only |
When a specific aspect is requested, load only that reference file and skip routing.
| Problem | Resolution |
|---|---|
| No git diff available | Ask user to specify files or scope |
| CLAUDE.md not found | Review against general best practices; note the absence |
| No test files in diff | Skip test analysis dimension; note in output |
| Diff is empty | Report "no changes to review" and stop |
| Diff exceeds context limits | Focus on files the user is most likely to care about; summarize skipped files |
This skill reports findings; rewriting the code, including function declarations, is the `code-refiner` skill's job. Keep the roles separate.

| Rationalization | Reality |
|---|---|
| "Tests pass, so the code is fine" | Tests are necessary but insufficient — they miss architecture, security, readability, and maintainability concerns |
| "It's a small diff, no real review needed" | Small changes cause most production incidents; a 3-line auth bypass is worse than a 300-line refactor |
| "We'll clean it up later" | Later never comes — the review IS the quality gate before code becomes legacy |
| "The author is senior, I trust them" | Seniority doesn't prevent mistakes; fresh eyes catch what familiarity blinds |
| "I already reviewed similar code recently" | Each diff has unique context — assumptions from past reviews cause missed issues |
| "This is just a refactor, nothing can break" | Refactors change behavior in subtle ways — verify with tests and trace call sites |