Guide test-driven development through the Red-Green-Refactor cycle, test quality standards, and test runner discipline. Enforce test-first when tddMode is 'enforce'. Use when implementing features or fixing bugs.
From flownpx claudepluginhub synaptiai/synapti-marketplace --plugin flowThis skill is limited to using the following tools:
Searches, retrieves, and installs Agent Skills from prompts.chat registry using MCP tools like search_skills and get_skill. Activates for finding skills, browsing catalogs, or extending Claude.
Searches prompts.chat for AI prompt templates by keyword or category, retrieves by ID with variable handling, and improves prompts via AI. Use for discovering or enhancing prompts.
Enables AI agents to execute x402 payments with per-task budgets, spending controls, and non-custodial wallets via MCP tools. Use when agents pay for APIs, services, or other agents.
Domain skill for test-driven development: Red-Green-Refactor cycle, test quality, and runner discipline.
IF YOU DIDN'T WATCH THE TEST FAIL, YOU DON'T KNOW IF IT TESTS THE RIGHT THING.
A test that has never failed might pass for the wrong reason. The RED phase exists to prove the test is valid.
For each feature or fix, track the TDD cycle with tasks:
TaskCreate("RED: Write failing test for {behavior}", "Minimal test describing desired behavior. MUST fail on first run.")
TaskCreate("GREEN: Implement {behavior}", "Simplest code to make the test pass. No cleverness.")
TaskCreate("REFACTOR: Clean up {behavior}", "Improve quality. All tests must still pass after each step.")
TaskUpdate(RED task, status: "in_progress")
TaskUpdate(RED task, status: "completed")
TaskUpdate(GREEN task, status: "in_progress")
TaskUpdate(GREEN task, status: "completed")
TaskUpdate(REFACTOR task, status: "in_progress")
TaskUpdate(REFACTOR task, status: "completed") Use TaskList to confirm the full cycle completed before moving to the next behavior.
The cycle is non-negotiable. RED → GREEN → REFACTOR. Always in this order.
Always use run mode, never watch mode:
| Framework | Correct | Wrong |
|---|---|---|
| Vitest | vitest run | vitest (watch) |
| Jest | CI=true jest or jest --watchAll=false | jest (watch) |
| Pytest | pytest | — |
| RSpec | rspec | guard (watch) |
Watch mode leaves orphan processes. Verify cleanup:
pgrep -f "vitest|jest|pytest" && echo "ORPHAN PROCESS — kill it"
# GOOD: Each test verifies one thing
test "returns empty array when no items match"
test "returns matching items sorted by relevance"
# BAD: Multiple behaviors in one test
test "search works correctly" # What does "correctly" mean?
Test names should read as specifications:
| Scenario | TDD? | Notes |
|---|---|---|
| New feature | Always | Define behavior before implementing |
| Bug fix | Always | Write reproducing test first |
| Refactor | Existing tests | Ensure tests pass before AND after |
| Prototype/spike | Optional | But write tests before merging |
| Config changes | No | Unless behavior changes |
Check settings.json → testing.tddMode:
| Mode | Behavior |
|---|---|
enforce | Block implementation without a failing test. No exceptions. |
suggest (default) | Recommend TDD. Allow override with explicit user decision. |
off | No TDD guidance. Tests still run in verification. |
Aim for meaningful coverage, not vanity metrics:
| Excuse | Response |
|---|---|
| "Skip TDD just this once" | Delete the code. Start over. The cycle is the discipline. |
| "This is too simple to test" | Simple code, simple test. Write it in 30 seconds. |
| "I'll write tests after" | You won't. And if you do, they'll test what you wrote, not what you should have written. |
| "Tests slow me down" | Tests slow you down NOW. Bugs slow you down FOREVER. |
| "The test passes on first try" | That's a red flag. It should have failed first. Check: is it testing the right thing? |
| Trigger | Action |
|---|---|
| Test passes immediately on first write | Test is suspect. Verify it fails when you break the code intentionally. |
| >3 tests needed for one function | Function may be doing too much. Consider splitting. |
| Mocking >2 dependencies | Design smell. Refactor to reduce coupling. |
| Test is harder to write than the code | Step back. Either the interface is wrong or the test is testing implementation. |