npx claudepluginhub jnuyens/gsd-plugin --plugin gsd

This skill is limited to using the following tools:
<objective>
<execution_context>
@${CLAUDE_PLUGIN_ROOT}/workflows/eval-review.md
@${CLAUDE_PLUGIN_ROOT}/references/ai-evals.md
</execution_context>
Phase: $ARGUMENTS (optional; defaults to the last completed phase).
Execute @${CLAUDE_PLUGIN_ROOT}/workflows/eval-review.md end-to-end. Preserve all workflow gates.