Skill

generate-tests

Generates EvalView test cases from SKILL.md files using LLM, captures real agent interactions as tests, or creates individual test YAMLs manually.

testing

developer-tools

Popularity

Stars

114

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/evalview:generate-tests

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Use this skill when the user wants to create test cases for their AI agent or skill without writing YAML by hand.

SKILL.md

66 lines · ~658 tokens

Stats

LanguagePython

Stars114

Forks20

MaintenanceExcellent

Last CommitJun 15, 2026

Actions

View Source View Plugin View on GitHub View README

Generate Tests

Use this skill when the user wants to create test cases for their AI agent or skill without writing YAML by hand.

Four approaches

1. Generate tests from a SKILL.md file

Use the generate_skill_tests MCP tool to auto-generate a test suite from a skill definition. This reads the SKILL.md and produces YAML test cases covering explicit triggers, implicit triggers, contextual triggers, and negative cases.

Steps:

Ask the user which SKILL.md to generate tests for (or detect it from context).
Call generate_skill_tests with:
- skill_path: path to the SKILL.md file
- output_path (optional): where to save the generated YAML
- count (optional): number of test cases (default: 10)
After generation, offer to run the tests with run_skill_test.

CLI equivalent:

evalview skill generate-tests .claude/skills/my-skill/SKILL.md --auto
evalview skill generate-tests .claude/skills/my-skill/SKILL.md -c 20 -o tests/my-skill-tests.yaml

2. Create individual test cases manually

Use the create_test MCP tool to create a single test YAML file from a description.

Steps:

Gather from the user: test name, query, expected tools, forbidden tools, expected output keywords, and minimum score.
Call create_test with the parameters.
After creating the test, call run_snapshot to establish the golden baseline.

3. Capture real interactions

Use the CLI evalview capture command to proxy real agent traffic and save interactions as test YAMLs automatically. This records the query, output, and tool calls from live usage.

CLI equivalent:

evalview capture --agent http://localhost:8080/execute --output-dir tests/test-cases
evalview capture --multi-turn  # saves all turns as one multi-turn conversation test

4. Validate a skill before testing

Use validate_skill to check a SKILL.md for correct structure and completeness before generating tests from it.

Running generated tests

After generating tests, execute them with run_skill_test:

test_file: path to the generated YAML
no_rubric: true for fast deterministic-only checks (no LLM cost)
verbose: true for detailed output on all tests

CLI equivalent:

evalview skill test tests/my-skill-tests.yaml
evalview skill test tests/my-skill-tests.yaml --no-rubric  # fast, $0
evalview skill test tests/my-skill-tests.yaml --verbose --model claude-sonnet-4-20250514

generate-tests

Popularity

Invocation

Context Preview

SKILL.md

generate-tests

Popularity

Invocation

Context Preview

SKILL.md

Generate Tests

Four approaches

1. Generate tests from a SKILL.md file

2. Create individual test cases manually

3. Capture real interactions

4. Validate a skill before testing

Running generated tests

Similar Skills

Generate Tests

Four approaches

1. Generate tests from a SKILL.md file

2. Create individual test cases manually

3. Capture real interactions

4. Validate a skill before testing

Running generated tests

Similar Skills