# spec-forge
Generates structured test case sets with multi-dimensional coverage from project code analysis or specification documents. This skill activates when the user asks to write test cases, generate tests, supplement tests, create test coverage, or improve test completeness. It auto-scans the project to extract testable units (APIs, functions, components, CLI commands, tool definitions), identifies coverage gaps, designs test cases across multiple dimensions (coverage depth, input types, interaction patterns), and produces a structured test case document with coverage matrix. Includes test strategy and methodology by default. Use --formal flag to add management sections (environment, roles, schedule, defect management).
Install:

```
npx claudepluginhub tercel/tercel-claude-plugins --plugin spec-forge
```

This skill uses the workspace's default tool permissions.
Test cases are structured specifications that define what to test, how to test it, and what the expected outcome should be. Unlike a test plan (which focuses on strategy, schedule, roles, and process), test cases are the actionable core — each one is detailed enough for an engineer to translate directly into test code.
This skill treats test case design as a distinct discipline from test planning and test implementation.

The skill can automatically scan a project to extract testable units without requiring the user to provide a specification document. It identifies APIs, functions, components, CLI commands, and tool definitions, along with their existing coverage status.
Every test case set is organized across dimensions. The skill provides built-in dimensions and can auto-detect project-specific dimensions.
Coverage Depth (always applied):
| Level | Name | Description | Minimum per testable unit |
|---|---|---|---|
| L1 | Happy Path | Basic correct behavior with valid inputs | 1 case |
| L2 | Boundary & Error | Edge cases, invalid inputs, error handling | 2 cases |
| L3 | Negative | Scenarios that should NOT trigger behavior | 1 case |
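To make the depth levels concrete, here is a minimal sketch of how the three levels might translate into test code, assuming vitest and a hypothetical `validateEmail` unit (none of these names come from the skill itself):

```typescript
import { describe, it, expect } from "vitest";
import { validateEmail } from "./validate"; // hypothetical unit under test

describe("validateEmail", () => {
  // L1 Happy Path: basic correct behavior with a valid input
  it("accepts a well-formed address", () => {
    expect(validateEmail("test@example.com")).toBe(true);
  });

  // L2 Boundary & Error: edge cases and invalid inputs (minimum 2 cases)
  it("rejects an address with no domain", () => {
    expect(validateEmail("test@")).toBe(false);
  });
  it("rejects the empty string", () => {
    expect(validateEmail("")).toBe(false);
  });

  // L3 Negative: behavior that should NOT trigger
  it("does not throw on non-string input", () => {
    expect(() => validateEmail(null as unknown as string)).not.toThrow();
  });
});
```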
Test Category (apply categories relevant to the project):
| Category | When to Include | Description |
|---|---|---|
| Functional | Always | Correct behavior verification |
| Data Integrity | Project has database / persistent store | Constraints, transactions, cascades |
| Security | Project handles auth, user input, or sensitive data | Auth, injection, access control |
| Performance | Project has latency/throughput requirements | Response time, throughput, resource usage |
The skill analyzes the project to discover additional dimensions. Examples:
| Project Type | Auto-Detected Dimension | Values |
|---|---|---|
| AI Tool Calling | Trigger Mode | Single tool / Combo tools |
| AI Tool Calling | Conversation Turns | Single turn / Multi-turn |
| REST API | Auth Context | Unauthenticated / User / Admin |
| Frontend Component | Device Context | Desktop / Mobile / Tablet |
| CLI Tool | Input Source | Args / Stdin / Config file |
| CLI Tool | Output Format | JSON / Table / Plain text |
| Event System | Delivery Mode | Sync / Async / Batch |
| Function Library | Input Type | Primitive / Object / Array / Null / Undefined |
| Data Pipeline | Data Volume | Empty / Small / Large / Malformed |
| SDK / Client | Connection State | Connected / Disconnected / Reconnecting |
Auto-detected dimensions are presented to the user for confirmation before generating cases.
When testable units interact with each other, the skill generates combination test cases that exercise those interactions.
This step answers two questions: how to find testable units (input mode) and what kind of project this is (project profile).
| User Provides | Mode | Behavior |
|---|---|---|
| Specification document (`@docs/features/auth.md`) | Spec Mode | Parse spec, extract testable requirements |
| Code path (`@src/services/payment.ts`) | Code Mode | Analyze code, extract testable units |
| Nothing or just a name | Scan Mode | Full project scan, discover everything |
Scan the project to determine its type. Check for framework signatures and unit type distribution:
| Signal | Project Profile |
|---|---|
| HTTP framework detected (Express, FastAPI, Spring Boot, Gin, NestJS, Koa, Hono, etc.) AND route/endpoint units present | Web API |
| CLI framework detected (Click, Cobra, Commander, clap, argparse with subcommands, etc.) OR command/subcommand units present | CLI Tool |
| Frontend framework detected (React, Vue, Svelte, Angular, etc.) AND component units present | Frontend App |
| Tool/plugin definitions with trigger conditions, LLM framework detected (LangChain, Vercel AI SDK, etc.) | AI Agent |
| Pipeline/transform/ETL patterns, data processing frameworks (Airflow, Prefect, dbt, etc.) | Data Pipeline |
| Exported functions/classes as primary interface, no routes/commands/components, published as package | Function Library |
| Client/SDK methods with connection management, API wrapper patterns | SDK / Client Library |
Detection method:
- Read `package.json` dependencies, `pyproject.toml`, `Cargo.toml`, `go.mod`, and `build.gradle` for framework signatures

Output: an explicit project profile label with rationale, e.g., "Project Profile: Web API (Express detected in package.json, 12 route handlers found, PostgreSQL via Prisma)"
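As a rough sketch of what the signature check could look like for a Node.js project (the profile label and framework list here are illustrative, not the skill's actual detection table):

```typescript
import { readFileSync } from "node:fs";

// Illustrative mapping from dependency names to HTTP framework signals.
const HTTP_FRAMEWORKS = ["express", "fastify", "koa", "hono", "@nestjs/core"];

function detectWebApiProfile(projectRoot: string): string | null {
  const pkg = JSON.parse(
    readFileSync(`${projectRoot}/package.json`, "utf8"),
  ) as { dependencies?: Record<string, string> };
  const deps = Object.keys(pkg.dependencies ?? {});
  const hit = HTTP_FRAMEWORKS.find((f) => deps.includes(f));
  // A real profiler would also require route/endpoint units to be present.
  return hit ? `Web API (${hit} detected in package.json)` : null;
}
```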
Before scanning code or extracting testable units, the doc-first discipline applies. Read the existing test documentation, identify what already exists, and decide for each test case set whether to REUSE (reference existing), EXTEND (edit existing in place), or NEW (genuinely missing). The default for any module that already has a `test-cases.md` file is to extend in place — never create parallel `test-cases-v2.md` files, never append `## Update` blocks, never leave deprecated cases struck through.
The full discipline, the four rules, the pre-generation checklist, and the anti-patterns to avoid:
@../shared/doc-first.md
You MUST run the pre-generation checklist (the five questions) from doc-first.md before proceeding to Step 2. If an existing test-cases.md (or equivalent) already covers any of the units this generation will discuss, you must:
- Reference the existing cases instead of regenerating them (with `file:line` references)
- Check which TC IDs are already referenced by test code (`grep` for the ID literal in test files) — those are load-bearing and must not be deleted without coordinating the test code update

If the project shows signs of doc drift (multiple `test-cases-*.md` files, TC ID collisions across files, obvious duplication of cases), warn the user and recommend `/spec-forge:analyze` or `/spec-forge:propagate` first.
If a usable existing test-cases doc already covers most of what the user is asking for, the right action is to edit it in place, not to generate a new file.
Extraction must go beyond listing function signatures. For each testable unit, capture four layers:
| Layer | What to Extract | Why It Matters |
|---|---|---|
| Interface | Public API surface — function signatures, type contracts, trait/interface boundaries, input/output types | Defines what CAN be tested from outside |
| Logic | Branch paths, error handling chains, state transitions, validation rules, business logic | Defines what SHOULD be tested (complexity = risk) |
| Architecture | Module structure, layer boundaries, dependency direction, separation of concerns | Defines test STRATEGY (unit vs integration vs E2E) |
| Relationships | Call graphs, data flow between units, event propagation, shared state, trait implementations | Defines COMBINATION tests (which units interact) |
Detect the language from manifest files (`package.json`, `pyproject.toml`, `Cargo.toml`, `go.mod`, etc.), then use the language-specific strategy matching the project's tech stack. Each strategy defines how to find ALL testable units across the four layers.
Python:

| Layer | How to Extract |
|---|---|
| Interface | Read __init__.py for __all__ or public imports. Scan all .py files for def/class without leading _. Record decorators (@app.route, @click.command, @property). Extract type hints from signatures. |
| Logic | Count if/elif/else, try/except, match/case branches per function. Identify raise statements (error paths). Find assert statements (invariants). Identify state machines (class with state-changing methods). |
| Architecture | Map import statements to build module dependency graph. Identify layers (routes → services → repositories). Detect circular imports. Identify Abstract Base Classes (ABC) that define contracts. |
| Relationships | Trace function calls across modules (A calls B). Identify shared state (module-level variables, singletons). Map signal/event dispatchers to handlers. Identify dependency injection patterns. |
TypeScript / JavaScript:

| Layer | How to Extract |
|---|---|
| Interface | Read index.ts for export statements. Follow export * from './module' re-exports. Scan for export function, export class, export interface, export type. Extract parameter types and return types. Record @decorator() patterns. |
| Logic | Count if/else, switch/case, ternary operators per function. Identify throw statements and try/catch chains. Find Promise rejection paths. Identify state management patterns (reducers, stores). Detect async/await error propagation. |
| Architecture | Map import statements to build dependency graph. Identify barrel exports (index.ts re-export chains). Detect layer patterns (controllers → services → repositories). Identify React component tree (parent → child props). |
| Relationships | Trace function/method calls across modules. Map event emitters to listeners (on/emit, addEventListener). Map React context providers to consumers. Identify callback chains and promise pipelines. Map API client calls to server endpoints. |
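As an illustration of the interface-layer scan, a first-pass extractor might look like the sketch below; a production implementation would presumably use the TypeScript compiler API rather than regexes:

```typescript
import { readFileSync } from "node:fs";

interface ExportedUnit {
  kind: "function" | "class" | "interface" | "type";
  name: string;
  file: string;
}

// First-pass scan: find top-level `export` declarations in one file.
// Re-exports (`export * from './module'`) would be followed recursively.
function scanExports(file: string): ExportedUnit[] {
  const src = readFileSync(file, "utf8");
  const pattern = /^export\s+(function|class|interface|type)\s+(\w+)/gm;
  const units: ExportedUnit[] = [];
  for (const m of src.matchAll(pattern)) {
    units.push({ kind: m[1] as ExportedUnit["kind"], name: m[2], file });
  }
  return units;
}
```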
Go:

| Layer | How to Extract |
|---|---|
| Interface | Scan all .go files for capitalized identifiers (exported). Extract function signatures, method receivers, interface definitions. Record struct field visibility (capitalized = public). Identify interface satisfaction (implicit implementation). |
| Logic | Count if/else, switch/case branches. Trace error return values through call chains (Go's error propagation pattern). Identify defer/panic/recover paths. Find goroutine spawning (go func()) and channel operations. |
| Architecture | Map package imports to build dependency graph. Identify internal packages (internal/). Detect interface-based abstraction layers. Identify cmd/ entry points vs. library packages. |
| Relationships | Trace function calls across packages. Map interface implementors (which structs satisfy which interfaces). Identify channel communication patterns between goroutines. Map middleware chains. |
Rust requires the deepest extraction due to its unique type system. Surface-level scanning of lib.rs is NOT sufficient.
| Layer | How to Extract |
|---|---|
| Interface | 1. Follow the mod tree: Read src/lib.rs → find all mod and pub mod declarations → recursively read each module file (src/{mod_name}.rs or src/{mod_name}/mod.rs). 2. Track re-exports: Follow pub use chains to determine the final public API. pub use submod::MyStruct in lib.rs makes MyStruct part of the crate's public API even if defined deep in a submodule. 3. Extract pub items: pub fn, pub struct, pub enum, pub trait, pub type, pub const, pub static. 4. Parse visibility modifiers: Distinguish pub (fully public), pub(crate) (crate-internal), pub(super) (parent module only). Only pub items are part of the external API; pub(crate) items are testable but internal. 5. Extract generic constraints: Record where clauses and trait bounds (e.g., fn process<T: Handler + Send>(item: T) → testable with any T implementing Handler + Send). |
| Logic | 1. Pattern matching exhaustiveness: Identify match arms — Rust enforces exhaustive matching, each arm is a branch to test. 2. Result/Option chains: Trace ? operator propagation, unwrap() calls (panic risk), map/and_then/unwrap_or chains. 3. Error types: Find enum error types and their variants — each variant is a testable error path. 4. Unsafe blocks: Each unsafe block is a high-risk area needing focused testing. 5. Lifetime constraints: Functions with complex lifetime bounds may have subtle edge cases around borrow validity. 6. Derive macros: #[derive(Serialize, Deserialize)] generates testable serialization behavior. #[derive(Clone, PartialEq)] generates comparison behavior. |
| Architecture | 1. Crate structure: Map src/lib.rs → modules → submodules to understand the module tree. 2. Dependency graph: Read Cargo.toml for crate dependencies; read use statements for internal module dependencies. 3. Trait abstraction layers: Identify trait definitions that serve as interfaces between layers (e.g., trait Repository defining the data access contract). 4. Feature flags: Read [features] in Cargo.toml — conditional compilation means different code paths need different tests. 5. Workspace structure: If Cargo.toml has [workspace], scan member crates for cross-crate dependencies. |
| Relationships | 1. Trait implementations: Find all impl Trait for Struct blocks — these define behavioral contracts that must be tested. 2. Method implementations: Find all impl Struct blocks to discover methods. Methods may be spread across multiple files. 3. Generic type consumers: Trace where generic functions are called with concrete types — each concrete instantiation may have different behavior. 4. Async task spawning: Identify tokio::spawn, async_std::task::spawn — each spawned task is an interaction point. 5. Channel communication: mpsc::channel, oneshot::channel, broadcast — data flow between components. 6. Trait object dispatch: Box<dyn Trait>, &dyn Trait — dynamic dispatch points where different implementations may be used. |
For languages not listed above, apply the same four-layer extraction principle.

Scan for existing test files (`**/*.test.*`, `**/*.spec.*`, `**/__tests__/**`, `**/test_*`, `**/tests/**`, `tests/**`, `*_test.go`, `**/*_test.rs`) to mark which units already have coverage.

After extraction, verify completeness:
- Every `mod` declaration in `lib.rs` has a corresponding scanned module file. Missing modules = missed API surface.
- All `pub use` / `export * from` chains were followed to their source.

If verification reveals gaps, re-scan the missed areas before proceeding.
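A sketch of the re-export completeness check, assuming the scanner tracks declared re-export targets and visited files (both data shapes are assumptions for illustration):

```typescript
// Hypothetical scanner state: re-export targets declared in barrel files,
// and the set of files the extractor actually visited.
interface ScanState {
  declaredReexports: Map<string, string>; // barrel file -> target module path
  scannedFiles: Set<string>;
}

// Completeness check: every re-export chain must end at a scanned file.
function findMissedModules(state: ScanState): string[] {
  const missed: string[] = [];
  for (const [barrel, target] of state.declaredReexports) {
    if (!state.scannedFiles.has(target)) {
      missed.push(`${barrel} re-exports ${target}, which was never scanned`);
    }
  }
  return missed;
}
```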
The inventory must include all four layers per unit:
```
Unit: createUser
Type: route (POST /api/users)
File: src/routes/users.ts:42
Interface: (name: string, email: string) → User | ValidationError
Logic: 3 branches (valid → create, duplicate → conflict, invalid → 400)
Architecture: Route layer → calls UserService.create → calls UserRepository.save
Relationships: depends on UserService, UserRepository; triggers UserCreatedEvent
Has Tests: Partial (happy path only)
Coverage Status: L1 covered, L2/L3 missing
```
Scan Mode: Perform steps 2a-2e for the full project.
Code Mode: Perform steps 2a-2e scoped to specified files + their dependencies. Project profile from full project context.
Spec Mode: Read spec document → extract requirements → map to testable behaviors → infer layers from spec descriptions.
After extracting testable units, analyze the project to identify applicable dimensions. Present the analysis results and ask the user to confirm the scope and dimensions.
Wait for user responses before proceeding.
Using the confirmed scope, dimensions, and user context, generate the test case set by filling in the template at references/template.md.
Per testable unit, generate at minimum the coverage-depth floor from the table above: one L1 happy-path case, two L2 boundary/error cases, and one L3 negative case.
For each auto-detected dimension, cross its values with the coverage depth levels, as sketched below.
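A minimal sketch of that cross-product, assuming dimensions are modeled as named value lists (the shapes are illustrative):

```typescript
interface Dimension {
  name: string; // e.g. "Auth Context"
  values: string[]; // e.g. ["Unauthenticated", "User", "Admin"]
}

const DEPTH_LEVELS = ["L1", "L2", "L3"] as const;

// Enumerate one test-case slot per (dimension value x depth level).
function enumerateSlots(dimensions: Dimension[]): string[] {
  const slots: string[] = [];
  for (const dim of dimensions) {
    for (const value of dim.values) {
      for (const level of DEPTH_LEVELS) {
        slots.push(`${level} / ${dim.name}: ${value}`);
      }
    }
  }
  return slots;
}

// e.g. enumerateSlots([{ name: "Auth Context", values: ["User", "Admin"] }])
// -> ["L1 / Auth Context: User", "L2 / Auth Context: User", ...]
```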
Combination test cases: generated for units that interact, as identified in the Relationships layer.
I/O and external dependency test cases: generated where units cross external boundaries.
Priority assignment: every generated case receives a priority.
After generating all test cases, construct the coverage matrix.
If upstream SRS/Tech Design documents exist, also build a Requirements Traceability Matrix mapping requirement IDs to test case IDs.
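The authoritative layout comes from references/template.md; purely as an illustration, a unit-by-depth slice of the matrix might look like:

```
| Unit       | L1 | L2 | L3 | Total |
|------------|----|----|----|-------|
| createUser | 1  | 2  | 1  | 4     |
| loginUser  | 1  | 3  | 1  | 5     |
```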
- Validate the result against `references/checklist.md`
- Write the document to `docs/<feature-name>/test-cases.md`
- Assign test case IDs in the format `TC-<MODULE>-<NNN>`
Each test case must be implementation-ready. Fields:
- Title: `[action] [condition] [expected outcome]`
- Dimension tags (e.g., `L2`, `Auth:admin`)
- Concrete test data, e.g., `name: "John Doe", email: "test@example.com"`. Never use placeholders like `[valid name]`.
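The exact field layout is defined in references/template.md; purely to illustrate the required level of specificity, a single case might read:

```
### TC-USERS-002: Reject user creation when email already exists (L2, Functional)

Preconditions: a user with email "test@example.com" already exists
Steps:
  1. POST /api/users with { name: "John Doe", email: "test@example.com" }
Expected: HTTP 409 Conflict; no new row in the users table
```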
These shortcuts are strictly prohibited:

- Placeholder test data such as `[valid email]` instead of concrete values

The default output includes a concise test strategy covering:
This is NOT a full test plan — it's the methodology context needed to understand the test cases. Sections are adapted to the project type (e.g., Data Integrity is omitted for projects without a database).
When the `--formal` flag is provided, the output additionally includes management sections: test environment, roles and responsibilities, schedule, and defect management.
These sections follow IEEE 829 structure for teams that need formal QA documentation.
- `references/template.md` — output template with default and formal sections
- `references/checklist.md` — quality validation checklist

Always read both files before generating test cases.
The finished test cases document is written to:
```
docs/<feature-name>/test-cases.md
```
where `<feature-name>` is a lowercase, hyphen-separated slug (e.g., `user-authentication`). If the directory does not exist, create it. If a file already exists, confirm before overwriting.
The output is designed to be consumed by code-forge:tdd in driven mode:
```
/code-forge:tdd @docs/<feature-name>/test-cases.md
```
This will iterate through the test cases and implement each one following the Red-Green-Refactor cycle.