npx claudepluginhub mathews-tom/armory --plugin armory
This skill uses the workspace's default tool permissions.
Transform skills authored for high-capability models (Opus) into deterministic workflows that execute reliably on lower-cost models (Sonnet, Haiku). The core insight from EvoSkills: skills encode reusable task structure, not model-specific artifacts. A skill evolved on Opus transfers to other models with gains of +35-45pp, but only when the instructions are deterministic enough that lower-capability models can follow them without improvising.
| File | Contents | Load When |
|---|---|---|
| references/distillation-patterns.md | Pattern catalog for converting reasoning to rules | Always |
- package-evaluator at >= 70%
- surrogate-verifier skill for cross-model assertion checking

Score each section of the source SKILL.md for reasoning difficulty:
| Complexity Signal | Score | Distillation Action |
|---|---|---|
| Decision tree with 3+ branches | HIGH | Convert to explicit if/then lookup table |
| "Use judgment" or "consider context" | HIGH | Replace with concrete heuristic rules |
| Multi-step inference chain | HIGH | Break into numbered atomic steps |
| Reference to domain expertise | MED | Add explicit reference file with knowledge |
| Clear enumerated steps | LOW | Keep as-is |
| Concrete examples with expected output | LOW | Keep as-is |
Produce a complexity map: section name -> complexity score -> planned action.
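
The complexity map can be sketched as a small keyword scan over the sections of the source skill. This is only an illustration: the regular expressions and the function name `complexity_map` are assumptions, and only the HIGH/MED/LOW scores and their actions come from the table above. A real pass would also need to detect decision trees and multi-step inference chains, which a keyword match cannot do.

```python
import re

# Hypothetical signal patterns, ordered from highest score to lowest.
# The regexes are illustrative assumptions; the scores and actions
# mirror the complexity table.
SIGNALS = [
    (re.compile(r"use judgment|consider (the )?context", re.I), "HIGH",
     "Replace with concrete heuristic rules"),
    (re.compile(r"best practices|domain expertise", re.I), "MED",
     "Add explicit reference file with knowledge"),
]
DEFAULT = ("LOW", "Keep as-is")


def complexity_map(sections: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Map section name -> (complexity score, planned action)."""
    result = {}
    for name, body in sections.items():
        result[name] = DEFAULT
        for pattern, score, action in SIGNALS:
            if pattern.search(body):
                result[name] = (score, action)
                break  # first match wins because signals are ordered by score
    return result
```

Sections that match no signal fall through to LOW and are kept as-is, which matches the last two rows of the table.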
Execute the source skill with Opus on 5 representative tasks:
- evals/cases.yaml (positive cases) or generate new ones

From the collected traces, extract deterministic patterns:
Rewrite the SKILL.md applying all distillation actions from Phase 1:
| Source Pattern | Distilled Replacement |
|---|---|
| "Analyze the code and determine..." | "Check for these 5 specific patterns: [list]" |
| "Use appropriate formatting" | "Output as a markdown table with columns: [A, B, C]" |
| "Consider the context to decide..." | "If [condition A]: do X. If [condition B]: do Y. Default: Z" |
| "Apply best practices for..." | Reference file with explicit best practices enumerated |
| Multi-paragraph reasoning instruction | Numbered step list with single-sentence steps |
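
To make the "If [condition A]: do X. If [condition B]: do Y. Default: Z" replacement concrete, here is a minimal sketch of an ordered rule table with a default. The conditions, actions, and the `decide` helper are hypothetical placeholders, not taken from any real skill.

```python
# Hypothetical rule table: each entry is (condition, action).
# Conditions are checked in order, so the first match decides the outcome.
RULES = [
    (lambda task: task.get("file_count", 0) > 10, "summarize per directory"),
    (lambda task: task.get("language") == "python", "run the Python checklist"),
]
DEFAULT_ACTION = "run the generic checklist"


def decide(task: dict) -> str:
    """Return the action of the first matching condition, else the default."""
    for condition, action in RULES:
        if condition(task):
            return action
    return DEFAULT_ACTION
```

The point of this shape is that a lower-capability model never has to weigh context: every input lands on exactly one row or on the default.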
Rules for the rewrite:
Run the distilled skill on the target model (Haiku or Sonnet):
- surrogate-verifier to generate assertions for each task output

| Metric | Source (Opus + original) | Target (Haiku + distilled) | Delta |
|---|---|---|---|
| Assertions passed | N/M | N/M | ± |
| Weighted score | X.XX | X.XX | ± |
| Output completeness | % | % | ± |
| Format compliance | % | % | ± |
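
The Delta column can be filled in mechanically once both runs are scored. The sketch below assumes both runs are checked against the same assertion set and that the four metrics are already available as numbers; the `RunMetrics` type and `deltas` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    assertions_passed: int   # N in "N/M"; M is assumed equal for both runs
    weighted_score: float    # e.g. 0.85
    completeness_pct: float  # output completeness, in percent
    format_pct: float        # format compliance, in percent


def deltas(source: RunMetrics, target: RunMetrics) -> dict[str, float]:
    """Target minus source per metric; a negative value means the target regressed."""
    return {
        "assertions_passed": target.assertions_passed - source.assertions_passed,
        "weighted_score": round(target.weighted_score - source.weighted_score, 2),
        "completeness_pct": target.completeness_pct - source.completeness_pct,
        "format_pct": target.format_pct - source.format_pct,
    }
```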
Produce the final comparison:
# Skill Distillation Report: <skill-name>
## Complexity Reduction
- Sections distilled: N/M (HIGH → LOW)
- Instruction word count: original X → distilled Y (Z% reduction)
- Decision points replaced with lookup tables: N
## Cross-Model Performance
| Model | Assertions Passed | Weighted Score | Format Compliance |
|---------|-------------------|----------------|-------------------|
| Opus | 7/7 | 1.00 | 100% |
| Sonnet | 6/7 | 0.92 | 100% |
| Haiku | 5/7 | 0.85 | 85% |
## Changes Made
1. [Section] "Analyze complexity" → explicit 5-item checklist
2. [Section] "Apply formatting" → fixed markdown table template
...
## Recommendation
[SHIP | ITERATE | MANUAL_REVIEW_NEEDED]
| Error | Resolution |
|---|---|
| Source skill scores below 70% | Refuse distillation; recommend evolution via test-engineer |
| No execution traces available | Generate synthetic tasks and collect traces before proceeding |
| Target model fails all assertions | Skill may be too complex for target model; report with detail |
| Distilled skill longer than source | Review distillation; patterns may need consolidation |
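
The error table above maps naturally onto a set of guard checks. The following is a rough sketch, assuming the source score is a percentage and that "longer" is measured by instruction word count, as in the report template; the function name and its parameters are hypothetical.

```python
def check_distillation(
    source_score_pct: float,
    trace_count: int,
    target_assertions_passed: int,
    source_words: int,
    distilled_words: int,
) -> list[str]:
    """Return the resolutions from the error table that apply to this run."""
    resolutions = []
    if source_score_pct < 70:
        resolutions.append("Refuse distillation; recommend evolution via test-engineer")
    if trace_count == 0:
        resolutions.append("Generate synthetic tasks and collect traces before proceeding")
    if target_assertions_passed == 0:
        resolutions.append("Skill may be too complex for target model; report with detail")
    if distilled_words > source_words:
        # Assumption: length is compared by word count
        resolutions.append("Review distillation; patterns may need consolidation")
    return resolutions
```

An empty list means none of the listed error conditions was hit.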