ALWAYS use this skill when the user mentions writing, designing, creating, or adding test cases for any skill, even if they also describe specific behavior to test. Triggers on "write a test case", "write me a test case", "write test cases", "design tests", "create a spec file", "help me write tests", "add tests", "no tests yet", "/test-design", or any request that involves creating test cases, spec files, or test coverage for a skill. If the user says "write a test case for X that covers Y", this skill handles it, not the skill being tested.
`npx claudepluginhub dflor003/skill-unit --plugin skill-unit`

This skill uses the workspace's default tool permissions.
Design, write, and refine `*.spec.md` test files for AI agent skills. This skill reads a target skill's SKILL.md, asks targeted questions about gaps it cannot infer, and incrementally generates test cases by category with refinement loops after each.
/test-design, /test-design <skill-name>

Search for a matching SKILL.md in both locations:

- skills/*/SKILL.md — plugin-level skills in the current repo
- .claude/skills/*/SKILL.md — repo-level skills

Use the Glob tool with patterns skills/*/SKILL.md and .claude/skills/*/SKILL.md. Match the directory name against the provided skill name (case-insensitive).
Scan both locations using the Glob tool. Collect all found skills and present a numbered list:
I found these skills:
1. skill-unit (plugin: skills/skill-unit/SKILL.md)
2. test-design (plugin: skills/test-design/SKILL.md)
3. report-card (repo: .claude/skills/report-card/SKILL.md)
Which skill would you like to design tests for?
Wait for the user to pick one before proceeding.
This skill operates on a single target skill per invocation. If the user wants to design tests for multiple skills, they invoke /test-design separately for each.
This flow activates in two scenarios:

- The named skill does not exist yet (no matching SKILL.md was found).
- The skill exists, but the user described behavior or capabilities it does not currently have.
In either case, the agent MUST:
Explicitly name what is happening. Tell the user clearly:
"The skill
{name}doesn't exist yet (or:{name}doesn't currently have {described capability})." "I can help you use prompt-driven development — we'll define what the skill should do by writing test cases first, then you can build (or extend) the skill to pass them."
Ask discovery questions to understand the intended behavior. These replace the targeted questions in Step 4. Ask one at a time:
Stop asking when you have enough context to write meaningful test cases. Do not ask questions whose answers would be obvious from what the user already said.
Proceed to Step 5 (ID Prefix) and continue through the normal generation flow. The test cases now define the intended behavior of a skill that doesn't exist yet (or intended new behavior for an existing skill).
After writing the spec file, remind the user of the next step:
"These test cases define the behavior for
{name}. You can now build (or update) the skill to pass them, then run/skill-unit {name}to verify."
This flow is mandatory, not optional. When either trigger condition is met, the agent must enter this flow rather than simply reporting "skill not found" or silently proceeding as if the feature exists. The goal is to make test-first development the natural path for new skills and new capabilities.
Read .skill-unit.yml from the repo root (if it exists) to determine the test directory. Default to skill-tests/ if not configured.
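A minimal sketch of what that file might contain. The key name is an assumption taken from the {test-dir} placeholder used later in this document; only the file's purpose (pointing at the test directory) is actually documented:

```yaml
# .skill-unit.yml — hypothetical contents; the real key name may differ
test-dir: skill-tests/
```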
Use the Glob tool to search recursively for **/*.spec.md under the test directory. For each found spec file, read its YAML frontmatter and check if the skill field matches the selected skill name. Skip files with missing or malformed frontmatter, or where the skill field is absent.
If one spec file is found:
"I found an existing spec at
{path}. Want me to review it for gaps, or are you looking to add something specific?"
If multiple spec files are found:
"I found {N} spec files for {skill-name}:
{path-1}— {name from frontmatter}{path-2}— {name from frontmatter}Which one would you like to work with, or should I review all of them for gaps?"
If no spec files are found:
Proceed to New Spec Creation.
Read the target skill's SKILL.md using the Read tool.
Validate the frontmatter first. Before extracting content, check that the YAML frontmatter between the --- delimiters is valid:
- Keys use colon syntax (name: my-skill, not name my-skill)
- Lists are balanced and well-formed (tags: [a, b], not tags: [a, b))

If the frontmatter is malformed, stop and inform the user. Tell them what is wrong with the frontmatter and suggest how to fix it. Do not proceed with test case generation for a skill with invalid frontmatter.
If the frontmatter is valid, extract and summarize:
Check for prompt-driven development trigger: Compare the user's original request against the skill's actual capabilities. If the user described functionality, behaviors, or features that are not present in the SKILL.md, enter the Prompt-Driven Development Flow (from Step 1). Do not silently proceed as if the capability exists — the user needs to know they are defining new behavior.
If the skill uses Read, Write, Edit, Glob, or Grep on project files, or references specific file types or directory structures, or depends on git state — note that fixtures will likely be needed. Use the Read tool to load this skill's references/fixture-design.md (path: skills/test-design/references/fixture-design.md from repo root) for guidance on fixture design and incorporate fixture questions into the targeted questions below.
Ask focused questions about things that cannot be inferred from the SKILL.md. These are gap-fillers, not an exhaustive interview. If the SKILL.md is thorough, this step may produce zero questions.
Possible questions (ask only what is relevant):
If fixtures are needed (determined in Step 3):
Ask questions one at a time. Wait for each answer before asking the next. Stop when you have enough context to generate good test cases.
Auto-generate a 2-4 letter prefix from the skill name by taking uppercase initials or a short abbreviation:
| Skill Name | Prefix |
|---|---|
| commit | COM |
| report-card | RC |
| brainstorming | BRN |
| skill-unit | SU |
| test-design | TD |
Check for collisions: use the Glob tool to find all *.spec.md files in the test directory and read their ### headings to collect existing prefixes. If the auto-generated prefix matches an existing one from a different skill, prompt the user:
"The prefix
{PREFIX}is already used by{other-skill}. What prefix would you like to use instead?"
Present the chosen prefix to the user for confirmation before proceeding.
Generate the YAML frontmatter for the new spec file and present it to the user for approval:
---
name: {skill-name}-tests
skill: {skill-name}
tags: [{inferred-tags}]
# Include these only if applicable:
# global-fixtures: ./fixtures/{fixture-folder-name}
# setup: setup.sh
# teardown: teardown.sh
---
Infer tags from the skill's characteristics:
- slash-command if the skill has a slash command
- activation if the skill has auto-activation triggers
- fixtures if fixture folders are needed

Present the frontmatter and ask: "Does this look right, or would you like to change anything?"
Wait for approval before proceeding to test case generation.
Generate test cases one category at a time, in this order:
| Order | Category | Purpose | When to Include |
|---|---|---|---|
| 1 | Activation tests | Verify the skill triggers (and doesn't trigger) on expected prompts | Always for auto-activating skills; slash-command-only skills test the command |
| 2 | Happy path tests | Core functionality with realistic, well-formed inputs | Always |
| 3 | Failure mode tests | Missing files, bad input, conflicting state, empty data | Always |
| 4 | Boundary tests | Edge cases at the limits of the skill's scope | When the skill has identifiable scope boundaries |
| 5 | Graceful decline tests | Requests adjacent to but outside the skill's purpose | Always |
| 6 | Interaction style tests | Tone, format, clarifying questions | When the skill has specific interaction expectations |
For each category:
Continue ID numbering sequentially across categories (e.g., COM-1, COM-2 in activation, COM-3, COM-4 in happy path).

### {PREFIX}-{N}: {Human-Readable Descriptive Title}
{Plain-text purpose statement explaining why this test exists and what risk it guards against.}
**Fixtures:**
- {./path/to/fixture — only if this test needs fixtures beyond global-fixtures}
**Prompt:**
> {natural, human-sounding prompt}
**Expectations:**
- {observable outcome}
- {observable outcome}
**Negative Expectations:**
- {specific prohibited behavior}
The **Fixtures:** section is optional. Include it only when a test case needs additional fixture state beyond what global-fixtures provides. Per-test fixtures are layered on top of global fixtures.
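A filled-in example for a hypothetical commit skill may help make the template concrete. The prefix, fixture path, and scenario are invented for illustration, not taken from a real spec:

```markdown
### COM-1: Commits Staged Changes from a Natural Request

Guards against the skill failing to act when the user asks to commit in plain
language without naming the skill or any git commands.

**Fixtures:**
- ./fixtures/staged-auth-fix

**Prompt:**
> commit my changes

**Expectations:**
- A commit is created from the staged changes
- Commit message references the nature of the changes

**Negative Expectations:**
- Does not run git push
```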
"Want to refine any of these, add more, or move on to {next-category}?"
The user can refine specific test cases, add more, or approve the category and move on.
Repeat the refinement loop until the user approves the category.
Prompt quality rules (apply to every generated prompt):
Good: "commit my changes", "how are the students doing?", "this test keeps failing, can you help?" Bad: "use the commit skill", "generate a report card using the report-card skill", "run git commit on staged files"
Expectation quality rules (apply to every generated expectation):
Good: "Commit message references the nature of the changes"
Bad: "Ran git commit -m 'fix: auth bug'"
Prompt variation — across test cases in the same spec, vary:
Once all categories have been approved:
- Assemble the full spec file, separating test cases with --- horizontal rules.
- Write it to {test-dir}/{skill-name}/{skill-name}.spec.md:
  - {test-dir} comes from .skill-unit.yml or defaults to skill-tests/.
  - {skill-name} is the directory name of the target skill.
  - Create a results/ subfolder inside it.
- Tell the user: "Spec file written to {path}. Created {N} test cases across {M} categories." "You can run these tests with /skill-unit {skill-name}."
If fixture folders were created, also note:
"Fixture folder created at
{fixture-path}. Review and adjust the fixture files as needed — they contain the minimal structure we discussed."
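Putting these paths together, the on-disk layout after a run of this skill might look roughly like this (the skill and fixture folder names are illustrative):

```
skill-tests/                      # {test-dir}, from .skill-unit.yml or the default
  report-card/
    report-card.spec.md           # the generated spec file
    fixtures/                     # fixture folders, if any were created
    results/                      # the results/ subfolder created alongside the spec
```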
When an existing spec file is found in Step 2, the skill enters one of two edit modes based on the user's response.
Used when the user asks for a review without specific instructions.
Review the spec against the coverage checklist and quality rules, for example: are any test titles in lower-kebab-case instead of human-readable Title Case? Then report what you find:

"Here's what I found in {spec-file}:

Missing coverage:
- No failure mode tests (minimum: 1)
- No graceful decline tests (minimum: 1)
Prompt quality issues:
- {PREFIX}-{N}: Prompt mentions the skill name — rewrite to be more natural
Expectation quality issues:
- {PREFIX}-{N}: Expectation 'Ran git commit -m ...' tests implementation detail — rewrite as behavioral assertion
- {PREFIX}-{N}: Expectation combines two checks — split into separate bullets
Want me to work through these one at a time?"
Used when the user has specific changes in mind.
To number new test cases, scan the existing ### headings, find the highest number, and use the next sequential number.

For new test cases, follow the same prompt and expectation quality rules as new spec creation.
This guide is used during both new spec creation and gap analysis. It defines the quality standards for generated test cases and the coverage checklist for evaluating existing specs.
Minimum coverage requirements by category. Use this during gap analysis to identify missing or thin categories.
| Category | Minimum | Applies When |
|---|---|---|
| Activation (positive) | 1 | Skill has auto-activation or slash command |
| Activation (negative) | 1 | Skill has auto-activation |
| Happy path | 1 | Always |
| Failure mode | 1 | Always |
| Boundary | 0 | Skill has identifiable scope boundaries |
| Graceful decline | 1 | Always |
| Interaction style | 0 | Skill has specific tone/format expectations |
Good prompts — natural, vague, human-sounding:
Bad prompts — leak implementation details or lead the agent:
Single-turn prompts — when tests run as a single prompt/response, front-load project context so the agent skips discovery:
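For example (the scenario below is invented purely for illustration), a single-turn prompt for a commit-style skill might front-load the context like this:

```markdown
**Prompt:**
> I've finished the fix for the login redirect bug and the changes are already staged. Commit my changes.
```

The prompt states what changed and that staging is done, so the agent can act without a discovery turn.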
Prompt variation — across test cases in the same spec, vary:
Good expectations — behavioral, observable, independently verifiable:
Bad expectations — implementation-specific, compound, or vague:
git commit -m 'fix: auth bug'" (too specific to implementation)Scope precision — when an expectation applies to a specific part of the output, say so:
Prompts vs. expectations boundary — skill internals (specific values, fallback behaviors, config names) are allowed in expectations because that is how you verify correct behavior. They must NOT appear in prompts because that leaks the answer to the agent under test.
Negative expectations — specific prohibited behaviors:
git push"All generated spec files must follow this exact structure.
Frontmatter: YAML block delimited by --- at the top of the file.
| Field | Required | Type | Description |
|---|---|---|---|
name | Yes | string | Human-readable name for the test suite |
skill | No | string | Skill being tested (always emitted during generation; used for spec detection) |
tags | No | list | Tags for filtering test runs |
timeout | No | duration | Per-test timeout (e.g., 90s) |
global-fixtures | No | path | Path to fixture folder copied for every test case, relative to spec file directory |
setup | No | filename | Script to run before tests |
teardown | No | filename | Script to run after tests |
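For reference, a frontmatter block exercising most of these fields might look like the following (the skill name and fixture folder are illustrative):

```yaml
---
name: report-card-tests
skill: report-card
tags: [slash-command, fixtures]
timeout: 90s
global-fixtures: ./fixtures/classroom
setup: setup.sh
teardown: teardown.sh
---
```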
Test case structure:
### {ID}: {Human-Readable Descriptive Title}
{Purpose statement — a plain-text sentence or two explaining why this test exists
and what risk or gap it guards against. This goes before the Prompt so the reader
understands the test's intent before seeing the mechanics.}
**Fixtures:**
- {./path/to/fixture — optional, only when this test needs additional fixtures}
**Prompt:**
> {prompt text — multi-line prompts use continued blockquote lines}
**Expectations:**
- {observable outcome — one per bullet}
**Negative Expectations:**
- {specific prohibited behavior — one per bullet}
Rules:
- Each test case starts with a ### heading.
- Titles are human-readable Title Case, not lower-kebab-case. The title should convey the test's purpose at a glance (e.g., "Asks the User Before Overwriting an Existing Spec" not asks-before-overwriting).
- The purpose statement sits between the heading and the **Prompt:** label. It explains why the test exists — what behavior it validates, what failure it prevents, or what design intent it captures.
- Per-test **Fixtures:** are layered on top of global-fixtures from frontmatter. Paths are relative to the spec file's directory.
- The prompt is written as a blockquote under **Prompt:**. Leading > markers are stripped.
- Sections within a test case use **bold labels**.
- --- horizontal rules between test cases are optional (cosmetic).
- Spec files are named *.spec.md.

When a user's test cases are failing and they need help diagnosing the problem, load this skill's references/troubleshooting.md (path: skills/test-design/references/troubleshooting.md from repo root) for common failure modes and fixes.