From toby-essentials
Orchestrates agentic Red-Green-Refactor TDD cycles to build features incrementally: decompose into small tasks, write failing tests, make them pass, refactor. Detects the project's language and test setup.
npx claudepluginhub tobyilee/toby-plugins --plugin toby-essentials

This skill uses the workspace's default tool permissions.
Orchestrate a 3-phase Red-Green-Refactor TDD cycle using sequential Agent calls. Each cycle implements one small behavior increment: write a failing test, make it pass with minimal code, then clean up.
| Agent | Phase | Responsibility |
|---|---|---|
| red | RED — Write failing test | Create a test that compiles but fails, then verify the failure |
| green | GREEN — Make it pass | Implement the simplest code to make the test pass |
| refactor | REFACTOR — Clean up | Improve code quality while keeping all tests passing |
Before starting, detect the project's language and build tool:
Look for build files (build.gradle.kts, pom.xml, package.json, Cargo.toml, go.mod, pyproject.toml, etc.) and identify the matching test command (./gradlew test, npm test, pytest). Capture as environment context:
PROJECT_ROOT: /path/to/project
SOURCE_DIR: src/main/java (or equivalent)
TEST_DIR: src/test/java (or equivalent)
TEST_CMD: ./gradlew test
TEST_FRAMEWORK: JUnit 5 / Jest / pytest / etc.
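The detection step above can be sketched as a simple lookup from marker files to test commands. This is an illustrative sketch, not the skill's actual implementation; the mapping and helper name (`detect_environment`) are assumptions:

```python
import os

# Hypothetical mapping from build-file markers to (TEST_CMD, TEST_FRAMEWORK).
BUILD_MARKERS = {
    "build.gradle.kts": ("./gradlew test", "JUnit 5"),
    "pom.xml": ("mvn test", "JUnit 5"),
    "package.json": ("npm test", "Jest"),
    "Cargo.toml": ("cargo test", "cargo test"),
    "go.mod": ("go test ./...", "go testing"),
    "pyproject.toml": ("pytest", "pytest"),
}

def detect_environment(project_root):
    """Return (TEST_CMD, TEST_FRAMEWORK) for the first marker found, else None."""
    for marker, (cmd, framework) in BUILD_MARKERS.items():
        if os.path.exists(os.path.join(project_root, marker)):
            return cmd, framework
    return None
```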
This is the most important planning step. Break the user's feature request into a sequence of small, incremental behaviors — each one becomes a TDD cycle.
How to decompose well: each task should describe one observable behavior (a concrete input and expected output), be small enough for a single cycle, and be ordered from the simplest case toward edge cases.
Example: User says "Create a Calculator class"
TDD Tasks:
1. add(1, 2) returns 3 (basic addition)
2. add(0, 0) returns 0 (zero case)
3. subtract(5, 3) returns 2 (basic subtraction)
4. multiply(3, 4) returns 12 (basic multiplication)
5. divide(10, 2) returns 5 (basic division)
6. divide(10, 0) throws ArithmeticException (division by zero)
Present the task list to the user for confirmation before starting. The user can reorder, add, remove, or modify tasks.
For each task, run three sequential Agent calls. This is simpler and more reliable than using Team/Task APIs because RED→GREEN→REFACTOR is inherently sequential.
Spawn an Agent with the Red agent prompt from references/agent-prompts.md, appending:
Execute TDD RED phase:
- Task: "{task description}"
- Environment: {environment context}
- Write a failing test, verify it fails
- Report: test file path, test method name, failure message
- Save any created/modified files
Wait for completion. Check the result: the test must fail for the intended reason (an assertion failure, not a compile or import error) before moving to GREEN.
Spawn an Agent with the Green agent prompt, including:
Execute TDD GREEN phase:
- Failing test: {test file path and method}
- Failure message: {from RED phase}
- Environment: {environment context}
- Write the MINIMUM code to make the test pass
- Run ALL tests, confirm everything passes
- Report: files modified, all test results
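For the Calculator example, the GREEN phase's minimum can be surprisingly small. An illustrative Python sketch (not the skill's output):

```python
# GREEN for task 1: the minimum that makes test_add_returns_sum pass.
# A hardcoded constant is legitimate at this stage; later tests in the
# task list will force the implementation to generalize.
class Calculator:
    def add(self, a, b):
        return 3

def test_add_returns_sum():
    assert Calculator().add(1, 2) == 3
```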
Wait for completion. If tests still fail, send the failure back and ask to retry.
Spawn an Agent with the Refactor agent prompt, including:
Execute TDD REFACTOR phase:
- Just implemented: {summary of RED+GREEN}
- Environment: {environment context}
- Look for: duplication, naming, complexity, test readability
- Run ALL tests after each change
- If code is clean enough, report "no refactoring needed"
- Report: what changed and why, final test results
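A typical test-readability refactor for the Calculator example, sketched with Python's stdlib unittest (illustrative; a pytest project would use `pytest.mark.parametrize` instead):

```python
import unittest

class Calculator:
    def add(self, a, b):
        return a + b

class CalculatorTest(unittest.TestCase):
    def test_add(self):
        # Refactor: three near-identical test methods collapsed into one
        # table-driven test. Coverage and behavior are unchanged, so all
        # tests stay green.
        cases = [(1, 2, 3), (0, 0, 0), (-1, 1, 0)]
        for a, b, expected in cases:
            with self.subTest(a=a, b=b):
                self.assertEqual(Calculator().add(a, b), expected)
```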
After each cycle completes, present a summary to the user:
── TDD Cycle {N} Complete ──
Task: {task description}
RED: ✅ Test written: UserServiceTest.shouldReturnUserById()
GREEN: ✅ Implementation: UserService.findById() — hardcoded return
REFACTOR: ✅ Extracted UserRepository interface
Tests: 5 passed, 0 failed
Files changed: UserService.java, UserServiceTest.java, UserRepository.java
── Progress ──
[x] 1. findById returns user when exists
[x] 2. findById throws when not found
[ ] 3. createUser saves and returns user
[ ] 4. deleteUser removes user
[ ] 5. listUsers returns all users
Continue with task 3? (or modify the remaining tasks)
This checkpoint lets the user review the cycle's output, reorder or modify the remaining tasks, or stop the session.
Maintain a running task list throughout the session. After each cycle, update the status:
[x] Completed tasks (with cycle number)
[>] Current task
[ ] Remaining tasks
Show this progress list at every user checkpoint. This gives visibility into the session's arc and helps the user decide what to focus on next.
Strict phase separation prevents the common anti-pattern of writing tests and implementation simultaneously, which defeats the purpose of TDD: you lose confidence that the test actually tests what you think it tests.
This feels counterintuitive but is fundamental to TDD. The simplest implementation reveals whether the test is specific enough. If hardcoding "passes" a test that should require real logic, the test needs improvement.
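Triangulation makes this concrete: in the Calculator example, task 2's zero case defeats a hardcoded return value and forces the general implementation. A Python sketch, illustrative only:

```python
class Calculator:
    # Cycle 1's GREEN was `return 3` (hardcoded). The second test below makes
    # that fail, so cycle 2's GREEN must generalize to real addition:
    def add(self, a, b):
        return a + b

def test_add_basic():
    assert Calculator().add(1, 2) == 3

def test_add_zero():
    assert Calculator().add(0, 0) == 0
```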
| Situation | Action |
|---|---|
| Build fails in RED | Ask agent to fix stubs, re-verify failure |
| GREEN can't pass test | Send failure output, ask to retry with different approach |
| REFACTOR breaks tests | Ask agent to revert and retry with smaller changes |
| Agent produces incorrect output | Re-read the source files, correct, and re-run tests |
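The sequential orchestration with the error handling above might be sketched like this; `spawn_agent` and the result fields are assumed names for illustration, not a real API:

```python
def run_tdd_cycle(task, env, spawn_agent, max_retries=2):
    """One RED -> GREEN -> REFACTOR cycle as three sequential agent calls."""
    red = spawn_agent("red", task=task, env=env)
    if not red["test_fails"]:                      # build fails / test passes:
        red = spawn_agent("red", task=task, env=env,  # fix stubs, re-verify
                          feedback=red["output"])

    green = spawn_agent("green", failing_test=red, env=env)
    for _ in range(max_retries):                   # GREEN can't pass: retry
        if green["all_tests_pass"]:
            break
        green = spawn_agent("green", failing_test=red, env=env,
                            feedback=green["output"])

    refactor = spawn_agent("refactor", summary=(red, green), env=env)
    if not refactor["all_tests_pass"]:             # REFACTOR broke tests
        refactor = spawn_agent("refactor", summary=(red, green), env=env,
                               feedback="revert and retry with smaller changes")
    return red, green, refactor
```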
When the user finishes the TDD session, provide a final summary: completed tasks, all files created or modified, and the final test results.
For detailed agent system prompts:
references/agent-prompts.md — Complete prompts for Red, Green, and Refactor agents including rules, workflow, and verification steps