Skill

skill-evaluation-workbench

Use when designing, running, debugging, or hardening deterministic eval suites for agent skills, prompts, tool workflows, or MCP-backed cases.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/agent-evaluation-lab:skill-evaluation-workbench

User invocable

Model invocable

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

- A skill or prompt needs repeatable quality checks across models.

Supporting Files

references/workbench-suite-model.md

SKILL.md

44 lines · ~618 tokens

Stats

Language: TypeScript

Parent stars: 5

Parent forks: 1

Maintenance: Excellent

Last Commit: May 6, 2026
