# evaluate-plugin
Grades evaluation runs against predefined assertions by examining transcripts and outputs, determining pass/fail with cited evidence, and producing structured grading.json.
```shell
npx claudepluginhub laurigates/claude-plugins --plugin evaluate-plugin
```

**Model**: opus

Grades evaluation runs against predefined assertions. Produces structured grading output with evidence for each assertion.

- **Input**: Eval case (from `evals.json`) + execution transcript + output artifacts
- **Output**: `grading.json` with per-assertion pass/fail verdicts and evidence
- **Steps**: 5-10 per eval run
- **Model justification**: Opus required for nuanced judgment: distinguishing ...
Grade evaluation runs against predefined assertions. Produces structured grading output with evidence for each assertion.
- **Input**: Eval case (from `evals.json`) + execution transcript + output artifacts
- **Output**: `grading.json` with per-assertion pass/fail verdicts and evidence

For each assertion in the eval case, determine pass/fail, cite evidence from the transcript or output artifacts, and assign a confidence level:

- `high` (clear evidence)
- `medium` (indirect evidence)
- `low` (ambiguous)

Watch for these superficial compliance patterns:
Beyond explicit assertions, identify implicit claims in the output:
Flag assertions that are too weak:
Suggest stronger alternatives when possible.
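The per-assertion loop described above can be sketched as follows. This is a minimal illustration, not the plugin's implementation: the assertion is treated as a literal pattern and the substring check stands in for the model's nuanced judgment, and the evidence strings are hypothetical.

```python
def grade_assertion(assertion: str, transcript: str) -> dict:
    """Grade one assertion against a transcript.

    Hypothetical matcher: a substring check standing in for the
    model's judgment. Real grading cites specific transcript lines.
    """
    hit = assertion in transcript
    return {
        "assertion": assertion,
        "passed": hit,
        "evidence": "substring match in transcript" if hit
                    else "no supporting evidence found",
        "confidence": "high" if hit else "low",
    }

print(grade_assertion("feat(", "git commit -m 'feat(auth): add OAuth2 support'"))
```

Each verdict dict maps directly onto one entry of the `expectations` array in the output file.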
Write `grading.json` with this structure:
```json
{
  "eval_id": "eval-001",
  "skill_path": "git-plugin/skills/git-commit/SKILL.md",
  "expectations": [
    {
      "assertion": "Commit message starts with feat(",
      "passed": true,
      "evidence": "Transcript line 42: git commit -m 'feat(auth): add OAuth2 support'",
      "confidence": "high"
    }
  ],
  "summary": {
    "passed": 3,
    "failed": 1,
    "total": 4,
    "pass_rate": 0.75
  },
  "claims": [
    {
      "claim": "Created commit with conventional format",
      "verified": true,
      "evidence": "git log shows commit abc1234 with feat(auth) prefix"
    }
  ],
  "eval_feedback": "Consider adding an assertion for scope relevance",
  "metrics": {
    "tool_calls": 12,
    "output_chars": 4500,
    "errors": 0
  }
}
```
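The `summary` block is derivable from the `expectations` array, so a consumer of `grading.json` can sanity-check it. A minimal Python sketch under that assumption (the sample assertions below are hypothetical, not from a real eval case):

```python
def summarize(expectations: list[dict]) -> dict:
    """Recompute the summary block from per-assertion verdicts."""
    passed = sum(1 for e in expectations if e.get("passed"))
    total = len(expectations)
    return {
        "passed": passed,
        "failed": total - passed,
        "total": total,
        "pass_rate": round(passed / total, 2) if total else 0.0,
    }

expectations = [
    {"assertion": "Commit message starts with feat(", "passed": True},
    {"assertion": "Scope matches changed files", "passed": True},
    {"assertion": "Body explains the why", "passed": True},
    {"assertion": "No trailing whitespace", "passed": False},
]
print(summarize(expectations))
# {'passed': 3, 'failed': 1, 'total': 4, 'pass_rate': 0.75}
```

Comparing the recomputed block against the one the grader wrote catches miscounted verdicts before the grading is consumed downstream.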
**Recommended role**: Subagent
| Mode | When to Use |
|---|---|
| Subagent | Grading a single eval run (primary use) |
| Teammate | Grading multiple eval runs in parallel within a batch evaluation |