From agent-evaluation-lab
Use when evaluating prompts, LLM outputs, red-team suites, or model behavior with local eval configs and safe provider/cost controls.
How this skill is triggered — by the user, by Claude, or both
Slash command
/agent-evaluation-lab:prompt-evaluation-runner
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Use this skill when you need to evaluate an LLM app, test a prompt, or run red-teaming/vulnerability scans against a target model or application.
npx Prompt evaluation@latest directly.
{{env.NAME}}; never hardcode keys.
references/eval-config-patterns.md
npx claudepluginhub yeaight7/agent-powerups --plugin agent-evaluation-lab
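The `{{env.NAME}}` note above says that keys should come from the environment instead of being written into config files. A minimal sketch of that idea follows. The helper `resolve_env_placeholders` and the variable name `DEMO_API_KEY` are hypothetical illustrations, not part of this plugin or of any evaluation tool's API; the real tool may resolve placeholders differently.

```python
import os
import re

# Matches placeholders of the form {{env.NAME}} (optional inner whitespace).
_ENV_PLACEHOLDER = re.compile(r"\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def resolve_env_placeholders(text, environ=None):
    """Replace each {{env.NAME}} in `text` with the value of NAME.

    Hypothetical helper for illustration only. Raises KeyError when a
    referenced variable is unset, so a missing key fails loudly instead
    of silently producing an empty credential.
    """
    env = os.environ if environ is None else environ

    def _lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _ENV_PLACEHOLDER.sub(_lookup, text)


if __name__ == "__main__":
    # The config text carries only the placeholder; the secret stays outside it.
    config_line = "apiKey: {{env.DEMO_API_KEY}}"
    print(resolve_env_placeholders(config_line, {"DEMO_API_KEY": "example-value"}))
```

Passing an explicit mapping, as in the usage lines, keeps the sketch testable without touching the real process environment.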