Enforce TDD red-green-refactor discipline during CCC Stage 5-6 implementation. Derives test cases from spec acceptance criteria and PR/FAQ documents rather than generic test suggestions. Blocks implementation code before a failing test exists. Tracks cycle state across the session and integrates with .ccc-state.json task context. Use when implementing features with testable acceptance criteria in the CCC workflow, when the execution mode is exec:tdd, or when you need to enforce test-first discipline. Trigger with phrases like "enforce TDD", "red green refactor", "test first", "write a failing test", "no implementation yet", "TDD cycle", "exec:tdd mode", "derive tests from spec", "acceptance criteria to tests".
Install via the plugin hub:

```
npx claudepluginhub cianos95-dev/claude-command-centre --plugin claude-command-centre
```

This skill uses the workspace's default tool permissions.
Strict red-green-refactor discipline for CCC Stage 5-6 implementation. This skill enforces _how_ to do TDD within the CCC workflow. For _when_ to use TDD (mode selection), see the `execution-modes` skill.
Claude's natural tendency is to write implementation first, then tests. This inversion produces tests that confirm what the code already does rather than verify what the spec requires.
TDD enforcement reverses this: the spec drives the tests, and the tests drive the implementation.
The execution-modes skill defines exec:tdd as a mode and provides a decision heuristic for when to use it:
> When: Well-defined acceptance criteria that can be expressed as automated tests. The requirements are clear enough to write a failing test before writing implementation code.
This skill takes over once exec:tdd is selected. It enforces the discipline of the red-green-refactor loop, derives test cases from spec acceptance criteria, and prevents the common shortcuts that undermine TDD.
If you find yourself unable to write a failing test, the scope is not well-defined enough for TDD. Downgrade to exec:pair per the execution-modes heuristic.
Goal: Express a single acceptance criterion as an automated test that fails.
Process:
Read the spec. Open the PR/FAQ or issue description linked in .ccc-state.json. Identify the acceptance criteria checklist.
Select the next uncovered criterion. Work through criteria in dependency order — test foundational behavior before derived behavior.
Translate the criterion to a test assertion. The test name should read like the acceptance criterion:
AC: "API returns 404 when resource does not exist"
Test: test_returns_404_when_resource_not_found()
Write ONLY the test. No implementation code. No helper functions for the implementation. No "let me just stub out the interface first." The test file is the only file you touch in the RED phase.
Run the test. Confirm it fails. If the test passes without implementation, either the behavior already exists (move to the next uncovered criterion) or the test asserts nothing (strengthen the assertion before continuing).
Verify the failure message is meaningful. A good RED failure says what is missing. A bad RED failure says "undefined is not a function." If the failure message is unhelpful, improve the test before moving to GREEN.
Anti-rationalization blocks:
| Thought | Response |
|---|---|
| "Let me just write a quick implementation first" | NO. Write the test first. This is the entire point of TDD. |
| "I need the interface to know what to test" | Write the test as if the interface already exists. The test defines the interface. |
| "This criterion isn't really testable" | If it's in the acceptance criteria, it must be verifiable. Find a way to test it, or flag it as needing spec clarification. |
| "Let me write all the tests at once" | NO. One criterion, one test, one RED-GREEN-REFACTOR cycle. Batching tests defeats the feedback loop. |
| "The test framework needs setup first" | Test infrastructure setup is pre-TDD work. Do it before entering the RED phase. Don't conflate setup with the first test. |
Goal: Write the minimum implementation code to make the failing test pass.
Process:
Focus on the single failing test. Do not think about the next criterion. Do not think about "good design." Make this one test pass.
Write the minimum code. If a hardcoded return value makes the test pass, that is a valid GREEN step. The refactor phase handles design quality.
Run the test. Confirm it passes. All previously passing tests must still pass. If a new test breaks an old test, you have a design problem — do not fix it by weakening the old test.
Do not add functionality beyond the test. No "while I'm here" additions. No "this will obviously be needed later." If it's not required by a failing test, it doesn't exist yet.
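As a sketch of a legitimately minimal GREEN step, consider the pagination criterion used as an example elsewhere in this document. The `search` function and its signature are illustrative assumptions.

```python
# The current RED test (hypothetical criterion: results come in pages
# of 20):
def test_pagination_returns_20_items():
    results = search("widget", page=1)
    assert len(results) == 20

# Minimal GREEN implementation. A hardcoded shape is a valid GREEN step:
# later tests (page 2, fewer than 20 matches) will force real pagination
# logic, and REFACTOR will clean up the design.
def search(query, page):
    return [f"{query}-{i}" for i in range(20)]
```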
Minimum code principle:
The GREEN phase deliberately produces "ugly" code. This is correct. Premature design in GREEN leads to speculative abstractions and to code that no failing test ever required.
Trust the process: RED defines what's needed, GREEN makes it work, REFACTOR makes it clean.
Anti-rationalization blocks:
| Thought | Response |
|---|---|
| "This code is ugly, let me clean it up" | That's the REFACTOR phase. Make the test pass first. |
| "I should add error handling for edge cases" | Is there a test for those edge cases? No? Then don't add the handling. Write a test first. |
| "Let me also implement the next feature while I'm in this file" | NO. One RED-GREEN-REFACTOR cycle at a time. |
| "This obviously needs a proper abstraction" | Maybe. But that's REFACTOR, not GREEN. Make it work first. |
Goal: Improve code quality without changing behavior. All tests must remain green throughout.
Process:
Run all tests. Confirm green. This is your baseline. Every refactoring step must maintain this state.
Look for code smells introduced in GREEN: duplication, hardcoded values from the minimal implementation, unclear names, functions doing more than one thing.
Refactor one thing at a time. After each change, run tests. If any test fails, the refactoring introduced a behavior change — revert and try again.
Refactor tests too. Test code is production code. Apply the same quality standards: remove duplication, improve naming, extract helpers for repeated setup.
Stop when the code is clean enough. "Clean enough" means: another developer could read it and understand the intent without comments. Do not gold-plate.
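A behavior-preserving refactor of a GREEN-phase pagination sketch could look like this. Both versions are hypothetical; the point is that the full suite must pass identically against each.

```python
PAGE_SIZE = 20

# Before: GREEN output with magic numbers and opaque names.
def search_v1(q, p):
    data = [f"{q}-{i}" for i in range(100)]
    return data[(p - 1) * 20:(p - 1) * 20 + 20]

# After: same behavior, intent-revealing names, the page-size constant
# extracted. Every intermediate refactoring step must keep the suite
# green.
def search_v2(query, page):
    all_results = [f"{query}-{i}" for i in range(100)]
    start = (page - 1) * PAGE_SIZE
    return all_results[start:start + PAGE_SIZE]
```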
Anti-rationalization blocks:
| Thought | Response |
|---|---|
| "This refactoring needs a new test" | Then it's not a refactoring — it's a new feature. Go back to RED. |
| "I'll skip refactoring, the code is fine" | You are the worst judge of your own code quality. Spend at least 2 minutes looking for improvements. |
| "Let me refactor the entire module while I'm here" | Refactor only what you touched. Scope creep in refactoring is still scope creep. |
The unique value of TDD within CCC (vs. generic TDD) is that test cases are derived from the spec, not invented ad hoc.
Each acceptance criterion in the PR/FAQ maps to one or more test cases:
Spec AC: "Users can search by keyword with results ranked by relevance"
Test cases derived:
1. test_search_returns_results_matching_keyword()
2. test_search_results_ordered_by_relevance_score()
3. test_search_with_no_matches_returns_empty()
4. test_search_with_special_characters_handled()
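The four derived cases above can be sketched as pytest-style tests. The `search` function here is a hypothetical stand-in so the sketch is self-contained; a real RED phase would start without it and let each test fail first.

```python
# Hypothetical stand-in for the search API under test.
def search(keyword):
    corpus = [{"text": "alpha one", "score": 0.9},
              {"text": "alpha two", "score": 0.7}]
    term = keyword.split()[0] if keyword.split() else ""
    return [r for r in corpus if term and term in r["text"]]

# One test per derived case; names mirror the acceptance criterion.
def test_search_returns_results_matching_keyword():
    assert all("alpha" in r["text"] for r in search("alpha"))

def test_search_results_ordered_by_relevance_score():
    scores = [r["score"] for r in search("alpha")]
    assert scores == sorted(scores, reverse=True)

def test_search_with_no_matches_returns_empty():
    assert search("zzz-no-match") == []

def test_search_with_special_characters_handled():
    # Must not raise, and must return a list.
    assert isinstance(search("alpha & <beta>"), list)
```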
Derivation rules: each criterion yields at least a happy-path test and a failure-or-empty-case test, plus edge-case tests for inputs the spec implies but does not enumerate.
The PR/FAQ structure maps to testing layers:
| PR/FAQ Section | Test Layer | What to Test |
|---|---|---|
| Press Release (user outcome) | Integration/E2E | Full user workflow produces stated outcome |
| Customer Problem | Edge cases | Scenarios that recreate the original problem |
| Solution | Unit tests | Individual components implement the solution |
| FAQ - Technical | Unit + Integration | Technical constraints are enforced |
| FAQ - Business | Acceptance | Business rules produce correct results |
| Pre-Mortem | Failure tests | Each failure mode is detected/handled |
Maintain a mapping between acceptance criteria and test status:
## Test Coverage — [Issue ID]
| # | Acceptance Criterion | Test(s) | Status |
|---|---------------------|---------|--------|
| 1 | Users can search by keyword | `test_search_*` (4 tests) | GREEN |
| 2 | Results paginated at 20/page | `test_pagination_*` (3 tests) | RED (current) |
| 3 | Admin can override rankings | — | NOT STARTED |
Update this table after each RED-GREEN-REFACTOR cycle. When all criteria reach GREEN, the implementation phase is complete.
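Keeping the table consistent by hand is error-prone; a small helper can regenerate it from tracked criteria. The tuple shape used here is a hypothetical convention, not part of the CCC spec.

```python
def render_coverage_table(issue_id, criteria):
    """Render the acceptance-criteria coverage table as markdown.

    `criteria` is a list of (criterion, tests, status) tuples; this
    shape is a hypothetical convention, not part of the CCC spec.
    """
    lines = [
        f"## Test Coverage — {issue_id}",
        "| # | Acceptance Criterion | Test(s) | Status |",
        "|---|---------------------|---------|--------|",
    ]
    for number, (criterion, tests, status) in enumerate(criteria, start=1):
        lines.append(f"| {number} | {criterion} | {tests or '—'} | {status} |")
    return "\n".join(lines)
```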
TDD state persists across the session. If context compaction occurs or you lose track, reconstruct state from the `tdd_state` block in .ccc-state.json, the coverage table in the issue comment, and the most recent `wip(tdd)` commit message.
At the end of a session mid-TDD:
Commit the current state with a message indicating TDD phase:
```
wip(tdd): RED — failing test for [criterion description]
wip(tdd): GREEN — [criterion] passing, pre-refactor
wip(tdd): REFACTOR — [criterion] complete
```
Update the issue comment with the coverage table so the next session can resume.
Do not end a session in GREEN without REFACTOR. If time is limited, either complete the refactor or revert to the last clean RED state.
When .ccc-state.json tracks the current task, the TDD cycle adds context:
```json
{
  "current_task": "CIA-123",
  "execution_mode": "tdd",
  "tdd_state": {
    "phase": "RED",
    "current_criterion": "Results paginated at 20/page",
    "criteria_completed": 1,
    "criteria_total": 5,
    "failing_test": "test_pagination_returns_20_items"
  }
}
```
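Updating this block after each phase change can be sketched as a small helper. The field names follow the state block above; the helper itself is a hypothetical convenience, not part of the CCC tooling.

```python
import json
from pathlib import Path

def update_tdd_phase(state_path, phase, criterion, failing_test=None):
    """Record the current TDD phase in .ccc-state.json.

    Field names follow the documented state block; this helper is a
    hypothetical convenience, not part of the CCC tooling.
    """
    path = Path(state_path)
    state = json.loads(path.read_text()) if path.exists() else {}
    tdd = state.setdefault("tdd_state", {})
    tdd["phase"] = phase
    tdd["current_criterion"] = criterion
    if failing_test is not None:
        tdd["failing_test"] = failing_test
    path.write_text(json.dumps(state, indent=2))
    return state
```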
This state enables drift detection: if implementation changes appear without a corresponding RED test, the drift-prevention skill can flag the violation.
Symptom: Implementation and test written in the same commit. The test passes on first run.
Fix: If a test passes without ever failing, it was not written first. Delete the test, delete the implementation, start RED properly.
Symptom: A single test covers multiple acceptance criteria. Failure doesn't indicate which criterion is broken.
Fix: One test per criterion (at minimum). A test that checks 5 things is 5 tests wearing a trench coat.
Symptom: Tests assert internal state (private methods, data structures) rather than observable behavior.
Fix: Test the what, not the how. "It returns the correct result" not "it calls the internal helper method." Implementation details change in REFACTOR — behavior doesn't.
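The contrast can be sketched with a hypothetical class; the names and the cache are purely illustrative.

```python
# Illustrative class; the cache is an implementation detail.
class WordCounter:
    def __init__(self):
        self._cache = {}

    def count(self, text):
        if text not in self._cache:
            self._cache[text] = len(text.split())
        return self._cache[text]

# Brittle: asserts the "how". This breaks if REFACTOR removes the
# cache, even though behavior is unchanged.
def test_count_populates_internal_cache():
    counter = WordCounter()
    counter.count("one two")
    assert "one two" in counter._cache

# Robust: asserts the "what". This survives any behavior-preserving
# refactor.
def test_count_returns_number_of_words():
    assert WordCounter().count("one two three") == 3
```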
Symptom: Spending 20+ minutes on a single test without writing it. Usually caused by unclear acceptance criteria.
Fix: The problem is upstream. The acceptance criterion is ambiguous. Flag it in the issue, request clarification, and move to the next testable criterion.
Symptom: Rapid RED-GREEN-RED-GREEN cycles without cleanup. Code quality degrades with each cycle.
Fix: REFACTOR is mandatory, not optional. Even if "the code is fine," take 2 minutes to look. The habit matters more than any single cleanup.
Symptom: Refactoring session grows to 30+ minutes. Extracting abstractions for patterns that appear twice. Adding generalization "for the future."
Fix: Refactoring scope = files touched in GREEN. Time budget = roughly equal to GREEN time. If refactoring takes longer than implementation, you're over-engineering.
No implementation without RED. If you catch yourself writing implementation code without a failing test, stop. Write the test first. This is non-negotiable.
One cycle at a time. Complete RED-GREEN-REFACTOR for one criterion before starting the next. No batching.
Tests run after every phase change. After writing the test (RED): run. After implementation (GREEN): run. After refactoring: run. No exceptions.
Failing test count = 1. During GREEN, exactly one test should be failing (the current one). If multiple tests fail, you've broken something — fix it before continuing.
30-minute escape hatch. If a single RED-GREEN-REFACTOR cycle takes more than 30 minutes, the criterion is too large. Split it into smaller criteria and test each independently.
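The failing-test-count rule can be checked mechanically. This sketch parses pytest's human-readable summary line, which is an assumption about output format rather than a stable interface.

```python
import re

def failing_count(pytest_summary):
    """Extract the failed-test count from a pytest summary line such as
    '1 failed, 12 passed in 0.34s'. Returns 0 if none reported.
    Parsing the human-readable summary is a sketch, not a stable API."""
    match = re.search(r"(\d+) failed", pytest_summary)
    return int(match.group(1)) if match else 0

def check_green_invariant(pytest_summary):
    """During GREEN, exactly one test (the current RED test) may fail."""
    count = failing_count(pytest_summary)
    if count != 1:
        raise AssertionError(
            f"GREEN invariant violated: {count} failing tests, expected 1")
```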
See the execution-modes skill for the exec:tdd mode definition and the decision heuristic for when TDD is appropriate. This skill assumes exec:tdd has already been selected.