From agent-evaluation-lab
Use when designing, running, debugging, or hardening deterministic eval suites for agent skills, prompts, tool workflows, or MCP-backed cases.
Slash command
/agent-evaluation-lab:skill-evaluation-workbench
- A skill or prompt needs repeatable quality checks across models.
references/ area and case fixtures into scoped support dirs.
result, summary, trace, and workspace evidence.
references/workbench-suite-model.md
npx claudepluginhub yeaight7/agent-powerups --plugin agent-evaluation-lab