Batch evaluates all skills in a plugin by running /evaluate:skill on those with eval cases, then aggregates a plugin-level quality report. Use for auditing plugin quality or pre-release checks.
```
npx claudepluginhub laurigates/claude-plugins --plugin evaluate-plugin
```
Batch evaluate all skills in a plugin. Runs /evaluate:skill for each skill, then produces a plugin-level quality report.
| Use this skill when... | Use alternative when... |
|---|---|
| Auditing all skills in a plugin before release | Evaluating a single skill -> /evaluate:skill |
| Establishing quality baselines across a plugin | Viewing past results -> /evaluate:report |
| Checking overall plugin quality after refactoring | Need structural compliance -> plugin-compliance-check.sh |
```
find $1/skills -maxdepth 3 -name "SKILL.md"
find $1/skills -maxdepth 3 -name "evals.json"
```

Parse these from $ARGUMENTS:
| Parameter | Default | Description |
|---|---|---|
| `<plugin-name>` | required | Name of the plugin to evaluate |
| `--create-missing-evals` | false | Generate evals for skills that lack them |
| `--parallel N` | 1 | Max concurrent skill evaluations |
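A minimal POSIX-sh sketch of the argument parsing described above (the `parse_args` function and variable names are illustrative, not part of the skill):

```shell
# Sketch: parse the evaluate-plugin arguments from the table above.
parse_args() {
  PLUGIN_NAME=""
  CREATE_MISSING_EVALS=false
  PARALLEL=1
  while [ $# -gt 0 ]; do
    case "$1" in
      --create-missing-evals) CREATE_MISSING_EVALS=true ;;
      --parallel) PARALLEL="$2"; shift ;;      # flag takes a value
      *) PLUGIN_NAME="$1" ;;                   # first bare word is the plugin
    esac
    shift
  done
}
```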
Find all skills in the plugin:
`<plugin-name>/skills/*/SKILL.md`
List them and count the total.
For each skill, check if evals.json exists:

- Has evals.json: include in the evaluation
- Missing evals.json, without `--create-missing-evals`: skip
- Missing evals.json, with `--create-missing-evals`: include; evals will be created during evaluation

Report the breakdown:
```
Found N skills in <plugin-name>:
- M with eval cases
- K without eval cases (skipped | will create)
```
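The discovery and breakdown steps above could be sketched in shell; `count_skills` is a hypothetical helper, and the paths follow the `skills/*/SKILL.md` layout shown earlier:

```shell
# Sketch: count skills with and without eval cases under <plugin>/skills/.
count_skills() {
  plugin="$1"
  total=0
  with_evals=0
  for skill_md in "$plugin"/skills/*/SKILL.md; do
    [ -f "$skill_md" ] || continue            # skip if the glob matched nothing
    total=$((total + 1))
    skill_dir=$(dirname "$skill_md")
    [ -f "$skill_dir/evals.json" ] && with_evals=$((with_evals + 1))
  done
  echo "Found $total skills: $with_evals with eval cases, $((total - with_evals)) without"
}
```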
For each included skill, invoke /evaluate:skill via the SlashCommand tool:
```
SlashCommand: /evaluate:skill <plugin-name>/<skill-name> [--create-evals]
```
If --parallel N is set and N > 1, batch evaluations into groups of N. Otherwise, run sequentially.
Track progress with TodoWrite — mark each skill as it completes.
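A rough shell analogy of the batching logic, using `xargs -P` to cap concurrency. Here `echo` stands in for the real `/evaluate:skill` invocation, which actually goes through the SlashCommand tool rather than the shell:

```shell
# Sketch: run up to N "evaluations" at once; extra jobs wait for a free slot.
run_batch() {
  parallel="$1"; shift
  # one skill name per line, at most $parallel concurrent processes
  printf '%s\n' "$@" | xargs -n 1 -P "$parallel" echo evaluating
}
```

With `--parallel 1` (the default) this degenerates to a plain sequential loop, which is why the skill runs sequentially unless `N > 1`.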
After all skill evaluations complete, read each skill's benchmark.json and aggregate:
```
bash evaluate-plugin/scripts/aggregate_benchmark.sh <plugin-name>
```

Write aggregated results to `<plugin-name>/eval-results/plugin-benchmark.json`.
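The pass-rate arithmetic behind the aggregation can be illustrated with awk. The real logic lives in `aggregate_benchmark.sh`; the `passed total` input line format here is an assumption for illustration only:

```shell
# Sketch: overall pass rate from per-skill "passed total" pairs on stdin.
overall_pass_rate() {
  awk '{ p += $1; t += $2 } END { printf "%d", (t ? 100 * p / t : 0) }'
}
```

For example, skills with 4/4, 2/3, and 4/5 passing cases combine to 10 of 12 overall, not an average of the three per-skill percentages.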
Print a plugin-level summary table:
## Plugin Evaluation: <plugin-name>
| Skill | Evals | Pass Rate | Status |
|-------|-------|-----------|--------|
| skill-a | 4 | 100% | PASS |
| skill-b | 3 | 67% | PARTIAL |
| skill-c | 5 | 80% | PASS |
**Overall**: 83% pass rate (10 of 12 eval cases)
Rank skills by pass rate. Flag any below 50% as needing attention.
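Ranking and flagging could look like this in shell, assuming an intermediate `skill-name rate` line format (an illustrative assumption, not the skill's actual data format):

```shell
# Sketch: sort skills by pass rate (descending) and flag any below 50%.
rank_skills() {
  sort -k2 -nr | awk '{ flag = ($2 < 50) ? "  <- needs attention" : ""; print $1, $2 "%" flag }'
}
```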
| Context | Command |
|---|---|
| List plugin skills | `ls -d <plugin>/skills/*/SKILL.md` |
| Check for evals | `find <plugin>/skills -name evals.json` |
| Count skills | `ls -d <plugin>/skills/*/SKILL.md \| wc -l` |
| Aggregate results | `bash evaluate-plugin/scripts/aggregate_benchmark.sh <plugin>` |
| Flag | Description |
|---|---|
| `--create-missing-evals` | Generate eval cases for skills without them |
| `--parallel N` | Max concurrent evaluations (default: 1) |