From sdd-pipeline
Enforces TDD cycle: one failing test, minimal passing code, repeat. For features, bugs, refactors in codebases with tests. Runs tests via hook at turn end.
npx claudepluginhub eduwxyz/my-awesome-skills --plugin sdd-pipeline

This skill uses the workspace's default tool permissions.
One small failing test, just enough code to pass, then a look. Repeat.
Guides Test-Driven Development (TDD) for features, bug fixes, refactors: write failing test first, verify failure, add minimal passing code, refactor.
Drop in ~/.claude/skills/tdd/ (per-user) or <repo>/.claude/skills/tdd/ (per-project).
Not for: spikes, copy/style/visual edits, one-off scripts, or generated files. If the codebase has no tests at all, see untested-code.md first.
If either slips, back up.
The name describes a capability, not a method. Read it out loud — it should sound like something a user or caller would care about.
✅ logged_out_user_cannot_publish_a_post
✅ schedule_overlap_returns_409
❌ post_service_calls_repo_save
❌ schedule_returns_object_with_status
The body has three sections in order — set up, do the thing, check what's observed:
```
test "expired tokens are rejected":
    token = issue_token(ttl_seconds: 60)
    advance_clock(seconds: 120)
    result = verify(token)
    assert result.ok == false
    assert result.reason == "expired"
```
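The same three-part shape in runnable Python, as a sketch: the injectable `Clock` and the `issue_token`/`verify` helpers are hypothetical, invented here so the example is self-contained.

```python
# Sketch only: Clock, issue_token, verify, and Result are hypothetical
# helpers invented so the test runs; they are not part of the skill.
class Clock:
    """Injectable clock so tests can advance time without sleeping."""
    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class Result:
    def __init__(self, ok, reason=None):
        self.ok = ok
        self.reason = reason


def issue_token(ttl_seconds, clock):
    # A token is just its expiry instant here.
    return {"expires_at": clock.now + ttl_seconds}


def verify(token, clock):
    if clock.now > token["expires_at"]:
        return Result(ok=False, reason="expired")
    return Result(ok=True)


def test_expired_tokens_are_rejected():
    clock = Clock()                                    # set up
    token = issue_token(ttl_seconds=60, clock=clock)
    clock.advance(seconds=120)                         # do the thing
    result = verify(token, clock)
    assert result.ok is False                          # check what's observed
    assert result.reason == "expired"
```

Note that the test touches only the public surface (`issue_token`, `verify`); renaming anything inside them would not break it.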
If renaming a private function tomorrow would break the test even though no behavior changed, the test was tied to internals. Rewrite or delete it. See test-anatomy.md.
Writing five tests up front and then five implementations produces tests that describe an imagined system. They lock you into the wrong shape and stop pulling their weight once any pair shares a code path.
Each test must exist because of something you learned writing the previous one.
Wrong: RED t1 t2 t3 t4 t5 → GREEN c1 c2 c3 c4 c5
Right: t1→c1, t2→c2, t3→c3, ...
Same mistake in miniature: writing one test, then reaching into the implementation to "also handle" cases it doesn't cover. Don't.
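The t1→c1, t2→c2 rhythm, sketched in Python with a hypothetical `slugify` function. The later definition overwrites the earlier one, standing in for editing the same function between cycles.

```python
# Hypothetical example: slugify is invented to show the rhythm.

# t1: first failing test (slugify doesn't exist yet)
def test_spaces_become_hyphens():
    assert slugify("hello world") == "hello-world"

# c1: minimal code to pass t1, nothing more
def slugify(text):
    return text.replace(" ", "-")

# t2: written only after c1 is green, because of what c1 taught us
def test_uppercase_is_lowered():
    assert slugify("Hello World") == "hello-world"

# c2: the smallest change that passes t2 while keeping t1 green
def slugify(text):
    return text.replace(" ", "-").lower()
```

In a real cycle each test is run and seen to fail before its code exists; the redefinition above stands in for editing one function in place.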
The Stop hook runs your tests at the end of every turn (green allows the turn to end, red blocks it). It reads the command from .claude/tdd/test-command.txt at the project root.
If that file is missing, set it up before starting the cycle:
1. Check CLAUDE.md first — projects often document the canonical test command there.
2. Otherwise look at package.json (scripts.test), pyproject.toml / pytest.ini, Makefile (a test: target), Cargo.toml, or analogous config files for the project's stack.
3. If there are multiple candidates (e.g. test:unit vs test:e2e), ask the user which one to use for TDD.

Once decided, create .claude/tdd/ if missing and write the command on a single line to .claude/tdd/test-command.txt. Confirm with the user the first time.
The user can edit the file at any time to change the command (e.g. switching from npm test to npm run test:unit).
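The one-time setup can be sketched in Python; the chosen command here, npm run test:unit, is only an example.

```python
from pathlib import Path

# Write the chosen test command on a single line to the file the
# Stop hook reads. "npm run test:unit" is an example, not a default.
cfg_dir = Path(".claude/tdd")
cfg_dir.mkdir(parents=True, exist_ok=True)
(cfg_dir / "test-command.txt").write_text("npm run test:unit\n")
```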
1. Decide. If a spec/<slug>.md exists for this task, read it — Behaviors + Acceptance criteria are your queue. Otherwise list the behaviors to verify in rough order with whoever cares.
2. Smoke run. Pick the first behavior. Test → fails → minimum code → passes. If this first cycle takes more than 15 minutes, the slice is too big.
3. Each next behavior. Same shape. One test at a time. No speculation. No "while I'm here." If a rule slips, see smells.md.
4. Cleanup. Tests green? Invoke the simplify skill on the recent changes — it reviews code for reuse, quality, and efficiency and applies any improvements found. After simplify, see cleanup.md for additional cleanup. Cleanup only on green; never add behavior during cleanup. (The Stop hook will also remind you to invoke simplify once per session if you forget.)
Stop adding tests when all are true:
100% coverage with bad tests is worse than 70% with good ones.