TDD-first feature development — crystallise API as a demo test, drive implementation to pass it, run quality stack and progressive review loop.
npx claudepluginhub borda/ai-rig --plugin develop
<objective>
TDD-first feature development. Crystallise the API as a demo use-case test, drive implementation to pass it, then close quality gaps with review, documentation, and the quality stack.
NOT for: bug fixes (use /develop:fix); .claude/ config changes (use /manage).
Foundry plugin check — run:
ls ~/.claude/plugins/cache/ 2>/dev/null | grep -q foundry
Exit code 0 means installed. If the check fails or you are uncertain, proceed as if foundry is available — it is the common case; fall back only if an agent dispatch explicitly fails.
When foundry is not installed, substitute foundry:X references with general-purpose and prepend the role description plus model: <model> to the spawn call:
| foundry agent | Fallback | Model | Role description prefix |
|---|---|---|---|
| foundry:sw-engineer | general-purpose | opus | You are a senior Python software engineer. Write production-quality, type-safe code following SOLID principles. |
| foundry:qa-specialist | general-purpose | opus | You are a QA specialist. Write deterministic, parametrized pytest tests covering edge cases and regressions. |
| foundry:doc-scribe | general-purpose | sonnet | You are a documentation specialist. Write Google-style docstrings and keep README content accurate and concise. |
Skills with --team mode: team spawning with fallback agents still works but produces lower-quality output.
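The check-and-fallback choice can be sketched in shell. This is a minimal sketch assuming the cache path from the check above; `pick_agent` and its output handling are illustrative, and the actual spawn call depends on your runtime:

```shell
# Pick the engineer agent: foundry if installed, otherwise general-purpose.
# When falling back, the caller must prepend the role description from the
# table above and add "model: opus" to the spawn call.
pick_agent() {
  if ls ~/.claude/plugins/cache/ 2>/dev/null | grep -q foundry; then
    echo "foundry:sw-engineer"
  else
    echo "general-purpose"
  fi
}
pick_agent
```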
| Temptation | Reality |
|---|---|
| "The feature is clear — I can skip the demo and go straight to code" | Without a crystallized API contract, implementation drifts. The demo is the spec. |
| "I know this library — no need to check docs" | Training data contains deprecated patterns. One fetch prevents hours of rework. |
| "I'll write tests after the implementation is stable" | Tests drive design. Writing them first reveals API problems before they are baked in. |
| "The existing suite still passes — the feature is good" | The existing suite doesn't cover the new feature. The demo and edge-case tests do. |
| "Step 1 analysis is unnecessary for a small addition" | Scope analysis reveals reuse opportunities and blast radius. Small additions regularly grow. |
Task hygiene: Before creating tasks, call TaskList. For each found task:
- completed if the work is clearly done
- deleted if orphaned / no longer relevant
- in_progress only if genuinely continuing

Task tracking: immediately after Step 1 (scope is known), create TaskCreate entries for all steps of this workflow before doing any other work. Mark each step in_progress when starting it, completed when done.
Gather full context before writing any code:
# If issue number: fetch the full issue with comments
gh issue view <number> --comments
If a free-text description was provided: use the Grep tool (pattern <keyword>, glob **/*.py, path src/) to search for related code before spawning the analysis agent.
Structural context (codemap, if installed) — soft PATH check, silently skip if scan-query not found:
PROJ=$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null) || PROJ=$(basename "$PWD")
if command -v scan-query >/dev/null 2>&1 && [ -f ".cache/scan/${PROJ}.json" ]; then
scan-query central --top 5
fi
If results are returned: prepend a ## Structural Context (codemap) block to the foundry:sw-engineer spawn prompt with the hotspot JSON. This gives the analysis agent immediate complexity awareness without cold Glob/Grep exploration. If scan-query is not found or index is missing: proceed silently — do not mention codemap to the user.
Spawn a foundry:sw-engineer agent to analyse the codebase and produce:
Gate: If complexity smell was flagged, present the scope concern to the user before proceeding to Step 2.
Present the analysis summary before proceeding.
Trigger: the feature calls an external library API — a new framework feature, a third-party SDK method, or a stdlib function that has changed in a recent Python version. Skip this step if none of these apply.
DETECT → FETCH → CITE pipeline:
DETECT — read pyproject.toml or requirements*.txt for the exact version and output:
STACK DETECTED:
- <library> <exact-version> (from pyproject.toml)
→ Fetching official docs for the relevant API.
FETCH — use WebFetch to retrieve the specific relevant docs page (not the homepage). Source priority: official docs > official changelog/migration guide > web standards (MDN). Never cite Stack Overflow, blog posts, or AI training data.
CITE — when implementing, embed a comment with the source URL and the key quoted passage:
# Docs: https://docs.example.com/v2/api/method
# "The recommended pattern for X is Y" (v2.1 docs)
Conflict — if docs describe a pattern that conflicts with how the codebase currently uses the library:
CONFLICT DETECTED:
Existing code uses <old pattern>.
<library> <version> docs recommend <new pattern> for this use case.
Options:
A) Use the documented pattern (may require updating existing call sites)
B) Match existing code (works but not idiomatic for this version)
→ Which approach?
Before crystallising the API, surface any non-obvious design decisions:
ASSUMPTIONS I'M MAKING:
- [assumption about API shape, e.g. "returning a list not a generator"]
- [assumption about caller context, e.g. "called once per batch, not per item"]

→ Correct me now or I'll proceed with these.
Do not proceed to the demo if any assumption would materially change the API shape.
Crystallise the intended API contract before any implementation exists. Choose the form based on scope:
Unit function / simple API -> inline doctest:
def predict(self, x: Tensor) -> Tensor:
"""
>>> model = Classifier()
>>> model.predict(torch.zeros(1, 3))
tensor([0])
"""
Complex feature (setup required, side effects, multi-step flow) -> minimal example script:
# examples/demo_<feature>.py — throwaway script, run manually
from mypackage import Classifier
model = Classifier.from_pretrained("tiny")
result = model.predict_batch(["hello", "world"])
print(result) # expected: [label, label]
The example script captures what the feature should feel like to use. It becomes a formal pytest test once the implementation is complete and the API is stable (end of Step 3).
Both forms must fail before any implementation exists:
# Doctest form: confirm it fails
python -m pytest --doctest-modules src/<module>.py -v 2>&1 | tail -10
# Script form: confirm it errors (ImportError, AttributeError, NotImplementedError)
python examples/demo_<feature>.py 2>&1 | tail -5
Gate: demo must fail or error. If it passes, the feature may already exist — revisit Step 1.
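The gate can be sketched as a failure check on the demo script's exit status; the script path here is illustrative:

```python
# Step 2 gate sketch: the demo must exit non-zero before implementation begins.
import subprocess
import sys


def demo_fails(script: str) -> bool:
    """True if running the demo script exits with a non-zero status."""
    result = subprocess.run([sys.executable, script], capture_output=True)
    return result.returncode != 0


assert demo_fails("examples/demo_feature.py"), (
    "Demo passed: the feature may already exist; revisit Step 1"
)
```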
Before proceeding to implementation, critically evaluate the demo itself:
Does the demo assert concrete expected values, or merely print-and-inspect? If any issue is found: revise the demo and re-run the gate. Do not proceed to Step 3 with a flawed API contract — the entire TDD loop is anchored to this.
Drive the implementation by making tests pass, one cycle at a time:
# Baseline: confirm existing suite is green before adding any new code
python -m pytest --tb=short -q <target_test_dir> 2>&1 | tail -20
Gate: all existing tests must pass before proceeding. If any fail, stop — do not add new code on a broken baseline. Use /develop:fix to address pre-existing failures first, then return here.
Start from the Step 2 demo — it is already failing and becomes the first target. For each piece of functionality:
python -m pytest --tb=short -q <target_test_dir> 2>&1 | tail -20
# doctest form
python -m pytest --doctest-modules src/<module>.py --tb=short -v 2>&1 | tail -10
# pytest form
python -m pytest --tb=short <test_file>::<test_name> -v
# script form
python examples/demo_<feature>.py 2>&1 | tail -5
python -m pytest --tb=short -q <target_test_dir>
Repeat until all feature tests pass and the Step 2 demo passes.
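One red→green cycle, sketched with a hypothetical normalize() helper: the doctest is written first (red), then the minimal body that makes it pass (green):

```python
def normalize(text: str) -> str:
    """Lowercase and strip surrounding whitespace.

    >>> normalize("  Hello ")
    'hello'
    """
    # Red: the body was `raise NotImplementedError` and the doctest failed.
    # Green: the minimal implementation that makes the doctest pass.
    return text.strip().lower()
```

After each green step, re-run the doctest command above to confirm, then the full suite to catch regressions.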
If Step 2 produced an example script: promote it into a formal pytest test now that the API is stable. Delete the script once the test is in place.
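Promotion can look like the sketch below. `Classifier` is a minimal stub so the example is self-contained; the real test would instead import it from your package, mirroring the earlier example script:

```python
# tests/test_demo_feature.py — the Step 2 script, promoted to a formal test.
class Classifier:
    """Stand-in for the real class (normally: from mypackage import Classifier)."""

    @classmethod
    def from_pretrained(cls, name: str) -> "Classifier":
        return cls()

    def predict_batch(self, texts: list[str]) -> list[str]:
        return ["label" for _ in texts]


def test_predict_batch_demo() -> None:
    # Same flow as examples/demo_<feature>.py, but asserting, not printing.
    model = Classifier.from_pretrained("tiny")
    result = model.predict_batch(["hello", "world"])
    assert result == ["label", "label"]
```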
Full review of the implementation. This is a loop — review -> fix -> re-review until only nits remain. Maximum 3 cycles.
Each cycle:
5-axis quality scan — before the full criteria evaluation, assess the implementation on each axis:
Use this scan to prioritize which criteria below get deepest scrutiny.
Evaluate against all criteria:
For every gap found: implement the fix immediately — add missing tests, remove dead code, revert out-of-scope edits. Return to Step 3 for any substantive implementation gap that needs a new TDD cycle.
Re-run the full suite to confirm nothing regressed:
python -m pytest --tb=short -q <target_test_dir> 2>&1 | tail -20
If only nits remain (style, cosmetic naming, minor formatting): document in Follow-up and exit the loop.
If substantive gaps remain: start the next cycle (max 3 total).
After 3 cycles: if substantive issues remain, stop — surface them to the user before proceeding to Step 5.
Spawn a foundry:doc-scribe agent to update all affected documentation:
- CHANGELOG.md with a one-line entry under Unreleased
- README.md usage examples

# Verify doctests pass after doc updates
python -m pytest --doctest-modules <target_module> -v 2>&1 | tail -20
Read .claude/skills/_shared/quality-stack.md and execute the Branch Safety Guard, Quality Stack, Codex Pre-pass, Progressive Review Loop, and Codex Mechanical Delegation steps.
## Feature Report: <feature name>
### Purpose
[1-2 sentence description of what was built and why]
### Codebase Analysis
- Reused: [list of existing utilities/patterns leveraged]
- Modified: [files changed and why]
- New files: [list]
### Demo Use-Case
- Location: <file>::<test or doctest>
- API: [the function/class signature exposed]
### TDD Cycle
- Tests written: N
- Tests passing: N/N
- Regressions introduced: 0
### Quality
- Lint: clean / N issues fixed
- Types: clean / N issues fixed
- Doctests: passing
- Review: pass / N issues fixed (N cycles)
### Follow-up
- [any deferred items, known limitations, or suggested next steps]
## Confidence
**Score**: 0.N — [high ≥0.9 | moderate 0.8–0.9 | low <0.8 ⚠]
**Gaps**: [e.g., review cycle incomplete, edge cases not fully explored]
**Refinements**: N passes.
When to use team mode: feature spans 3+ modules, OR changes a public API, OR involves auth/payment/data scope.
Coordination:
{feature: <desc>, scope: <modules>, API: <proposed signature>}

Spawn prompt template:
You are a [role] teammate implementing: [feature].
Read ~/.claude/TEAM_PROTOCOL.md — use AgentSpeak v2 for inter-agent messages.
Your task: [specific responsibility].
[If QA]: include security checks for any auth/payment/data-handling code.
Compact Instructions: preserve file paths, test results, API signatures. Discard verbose tool output.
Task tracking: do NOT call TaskCreate or TaskUpdate — the lead owns all task state. Signal your completion in your final delta message: "Status: complete | blocked — <reason>".