Use when implementing any feature or fix outside the code-forge workflow — enforces the Red-Green-Refactor cycle with mandatory test-first discipline. Supports three modes: (1) Standalone — ad-hoc TDD for quick changes; (2) Auto-Analysis — runs the full spec-forge:test-cases analysis pipeline (project profile, four-layer deep scan, multi-dimensional coverage), then implements all cases via TDD; (3) Driven — reads a test-cases.md document and implements each case via TDD.
Install: `npx claudepluginhub tercel/tercel-claude-plugins --plugin code-forge`

This skill uses the workspace's default tool permissions.
@../shared/execution-entrypoint.md
For this skill: start at Step 0 (Determine Mode). If you catch yourself about to say "falling back to manual TDD", STOP and go to the indicated step.
Test-Driven Development enforcement for any code change, with built-in code analysis.
Note: code-forge:impl already enforces TDD internally. This skill is for work outside that workflow.
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST.
No exceptions. Not for "simple" changes. Not for "obvious" fixes. Not when under time pressure.
Before any RED step in any mode (Driven, Auto-Analysis, Standalone), you MUST run the design-first pre-code checklist: read the relevant subsystem, consider the optimal interface-stable design, and decide whether to refactor existing code or add new code. The TDD cycle's REFACTOR step is the second enforcement point — every GREEN must be followed by a real consideration of whether the new code is the cleanest shape, not the most expedient one.
This discipline is the upstream defense against patch-first development. Read it once at the start of every session and again whenever you are tempted to add a new branch / wrapper / parallel module instead of refactoring:
@../shared/design-first.md
Examine the arguments to determine the operating mode:
| Argument | Mode | Behavior |
|---|---|---|
| `@docs/.../test-cases.md` | Driven Mode | Read the test cases document, implement each case via TDD |
| `@src/services/payment.ts` or a specific code path | Auto-Analysis Mode | Analyze the specified code, design cases, implement via TDD |
| Feature name or description (e.g., "add validation to user signup") | Standalone Mode | Classic TDD — write tests for the described change |
| Empty (no arguments) | Auto-Analysis Mode | Scan the project for coverage gaps, design cases, implement |
When a test-cases.md file is provided (generated by spec-forge:test-cases):
Present the case list and priorities to the user, and confirm which cases are in scope.
For each test case in scope:
test("TC-AUTH-001: create user with valid email returns 201", ...)After each case, display progress:
```
TDD Progress: {completed}/{total} ({percentage}%)
[x] TC-AUTH-001: Create user with valid email (P0) — DONE
[x] TC-AUTH-010: Create user with duplicate email rejected (P0) — DONE
[ ] TC-AUTH-011: Create user with invalid email format (P1) — next
[ ] TC-AUTH-030: Create user should NOT bypass email validation (P1)
```
Ask: "Continue with next case, skip, or pause?"
After all cases are implemented, suggest: "Run /code-forge:verify to confirm completion."

When the user points to code or says "help me write tests" without a test-cases document.
Iron Rule: Auto-Analysis uses the SAME full analysis as spec-forge:test-cases. The only difference is the output — auto-analysis produces code directly instead of a document. The analysis quality must be identical.
Execute the complete spec-forge:test-cases analysis pipeline. The full workflow is defined in the spec-forge test-cases-generation skill (spec-forge/skills/test-cases-generation/SKILL.md). The essential steps are inlined below — follow them exactly:
Step 1 — Determine Input Mode and Project Profile
Step 2 — Deep Scan and Extract (Four Layers)
Step 3 — Detect Dimensions
Step 4 — Confirm Scope with User
Step 5 — Design Test Cases
Result: A complete set of structured test cases in memory — identical quality to what spec-forge:test-cases would produce as a document.
Ask the user: "Save the test cases as docs/{feature}/test-cases.md for future reference? (Y/n)"
For each test case (sorted by priority: P0 → P1 → P2), follow the same TDD cycle as Driven Mode:
test("TC-AUTH-001: create user with valid email returns 201", ...)After each case, display progress (same format as Driven Mode D.4):
```
TDD Progress: {completed}/{total} ({percentage}%)
[x] TC-AUTH-001: Create user with valid email (P0) — DONE
[x] TC-AUTH-010: Duplicate email rejected (P0) — DONE
[ ] TC-AUTH-011: Invalid email format (P1) — next
```
Ask: "Continue with next case, skip, or pause?"
After all cases are implemented, suggest: "Run /code-forge:verify to confirm completion."

For ad-hoc changes where the user describes what to build or fix:
RED (write failing test) → VERIFY RED → GREEN (minimal code) → VERIFY GREEN → REFACTOR → REPEAT
Complete each phase fully before moving to the next.
Run the test. Confirm that it fails, and that it fails for the right reason: an assertion about the missing behavior, not a setup or syntax error.
If the test passes: you're testing existing behavior. Rewrite the test. If the test errors: fix the error and re-run until it fails correctly.
Run the test. Confirm that the new test passes and that the rest of the suite is still green.
If the new test fails: fix the code, not the test. If other tests fail: fix them now, before proceeding.
See @../shared/design-first.md for the discipline.
Go back to Step 1 for the next behavior.
| If you're about to... | Instead... | Why |
|---|---|---|
| Write production code without a test | STOP — write the failing test first | Tests written after implementation pass immediately and prove nothing |
| Skip testing because the change is "simple" | Write the test — it will be quick if it's truly simple | Simple code has the sneakiest bugs (off-by-one, null edge cases) |
| Apply a quick fix without a regression test | Write the test, then fix | Untested fixes become permanent regressions |
| Continue with code that wasn't test-driven | Consider rewriting test-first | Sunk cost — untested code is a liability regardless of time spent |
Principle: test your own dependencies for real; only mock what you don't control.
| Your Dependency | Approach |
|---|---|
| Own database | Real DB (TestContainers, test instance, SQLite in-memory) |
| Own file system | Real temp directory |
| Own cache / message queue | Real (TestContainers, embedded) |
| External third-party API | Mock / stub acceptable |
| Non-deterministic input (time, random) | Inject controlled values |
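As one concrete illustration of the last two rows, a test can use a real temp directory while injecting a controlled clock. A sketch in Jest; `writeReport` is a hypothetical unit under test, shown inline so the example is self-contained:

```js
// Sketch: exercises "own file system" with a real temp directory and
// "non-deterministic input" with an injected clock.
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical unit under test: writes a timestamped report file.
function writeReport(dir, now = () => new Date()) {
  const stamp = now().toISOString().replace(/:/g, "-");
  const file = path.join(dir, `report-${stamp}.txt`);
  fs.writeFileSync(file, "ok");
  return file;
}

test("writeReport writes a timestamped file", () => {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "tdd-")); // real temp dir
  const fixedClock = () => new Date("2024-01-01T00:00:00Z");  // controlled value
  const file = writeReport(dir, fixedClock);
  expect(fs.existsSync(file)).toBe(true);
  expect(path.basename(file)).toBe("report-2024-01-01T00-00-00.000Z.txt");
});
```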
Task: Add `isPalindrome(str)` function

1. RED — Write test:

```js
test("isPalindrome returns true for 'racecar'", () => {
  expect(isPalindrome("racecar")).toBe(true);
});
```

2. VERIFY RED — Run `npm test`:

```
✗ ReferenceError: isPalindrome is not defined   ← fails correctly
```

3. GREEN — Minimal code:

```js
function isPalindrome(str) {
  return str === str.split("").reverse().join("");
}
```

4. VERIFY GREEN — Run `npm test`:

```
✓ isPalindrome returns true for 'racecar'   ← passes
42 passed, 0 failed
```

5. REFACTOR — (no changes needed)

6. REPEAT — next test: edge case with empty string
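A note on that next step: with the implementation above, `isPalindrome("")` already returns `true`, so an empty-string test passes on first run. Per VERIFY RED, a test that passes immediately is pinning existing behavior, not driving new code; keep it as documentation if it is valuable, then pick a genuinely failing behavior. A sketch (the case-insensitivity requirement is hypothetical):

```js
// Boundary sketch: passes immediately against the implementation above,
// so it documents existing behavior rather than driving new code.
test("isPalindrome returns true for empty string", () => {
  expect(isPalindrome("")).toBe(true);
});

// A genuinely RED next test targets behavior the code lacks, e.g. a
// hypothetical case-insensitivity requirement:
test("isPalindrome ignores case", () => {
  expect(isPalindrome("Racecar")).toBe(true); // fails: "Racecar" !== "racecaR"
});
```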
Test runner detection: Check package.json scripts, pytest.ini, Cargo.toml, go.mod, or Makefile for the project's test command before starting the cycle. Use the same runner consistently.
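For illustration only, detection could look like the sketch below. The marker files come from the list above; any project may define its test command elsewhere, so treat the result as a default to confirm, not a guarantee:

```js
// Sketch: map common project marker files to a likely test command.
const fs = require("fs");

function detectTestCommand(dir = ".") {
  const has = (f) => fs.existsSync(`${dir}/${f}`);
  if (has("package.json")) {
    const pkg = JSON.parse(fs.readFileSync(`${dir}/package.json`, "utf8"));
    if (pkg.scripts && pkg.scripts.test) return "npm test";
  }
  if (has("pytest.ini")) return "pytest";
  if (has("Cargo.toml")) return "cargo test";
  if (has("go.mod")) return "go test ./...";
  if (has("Makefile")) return "make test"; // assumes a `test` target exists
  return null; // ask the user rather than guessing
}
```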
Before claiming work is complete: run the full suite and confirm every case in scope is green, and confirm no production code was written without a failing test first.
Tip: for larger changes, run /spec-forge:test-cases first to generate a structured case set, then implement it in Driven Mode.