This skill should be used when autonomously iterating on test failures until all tests pass. It runs the test suite, diagnoses failures, applies minimal fixes, and re-runs in a loop with checkpoint commit isolation.
From soleur

npx claudepluginhub jikig-ai/soleur --plugin soleur

This skill uses the workspace's default tool permissions.
Autonomous test-fix iteration loop. Run the test suite, diagnose failures, apply fixes to implementation code, and re-run until all tests pass or a termination condition is met. This is a recovery mechanism for unexpected failures -- not a replacement for RED/GREEN/REFACTOR (use atdd-developer for TDD discipline).
soleur:work GREEN phase fails and manual diagnosis is tedious

atdd-developer)

Auto-detect the test command from project files in priority order:

1. CLAUDE.md -- explicit test command (highest priority)
2. package.json -- scripts.test field
3. Cargo.toml -- cargo test
4. Makefile / Justfile -- test target
5. Gemfile / Rakefile -- bundle exec rake test or bin/rails test
6. pyproject.toml -- pytest
7. go.mod -- go test ./...

If $ARGUMENTS contains a custom test command, use it instead of auto-detection.
If $ARGUMENTS contains a number, use it as max iterations (default: 5).
If no runner is detected, ask the user for the test command.
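The detection rules above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the skill's actual implementation: `parse_arguments` and `detect_test_command` are hypothetical names, CLAUDE.md parsing is left out because its format is free-form, and the concrete commands `npm test`, `make test`, and `just test` are assumed mappings for the package.json, Makefile, and Justfile entries. The argument parser also assumes any bare integer token is the iteration limit, which would misread a custom command that contains a numeric flag value.

```python
import json
from pathlib import Path
from typing import Optional

DEFAULT_MAX_ITERATIONS = 5  # default stated in the text


def parse_arguments(arguments: str) -> tuple[Optional[str], int]:
    """Split $ARGUMENTS into (custom test command, max iterations).

    Assumption: a bare integer token is the iteration limit and the
    remaining tokens, if any, form the custom test command.
    """
    max_iterations = DEFAULT_MAX_ITERATIONS
    command_tokens = []
    for token in arguments.split():
        if token.isdigit():
            max_iterations = int(token)
        else:
            command_tokens.append(token)
    return (" ".join(command_tokens) or None), max_iterations


def detect_test_command(root: Path) -> Optional[str]:
    """Return the first matching test command, or None if nothing is detected."""
    # CLAUDE.md has the highest priority, but extracting a command from
    # free-form text is outside the scope of this sketch.
    package_json = root / "package.json"
    if package_json.is_file():
        scripts = json.loads(package_json.read_text()).get("scripts", {})
        if "test" in scripts:
            return "npm test"  # assumption: npm is the package manager
    if (root / "Cargo.toml").is_file():
        return "cargo test"
    if (root / "Makefile").is_file():
        return "make test"  # assumption: a `test` target exists
    if (root / "Justfile").is_file():
        return "just test"  # assumption: a `test` recipe exists
    if (root / "Gemfile").is_file() or (root / "Rakefile").is_file():
        return "bundle exec rake test"  # the text also allows bin/rails test
    if (root / "pyproject.toml").is_file():
        return "pytest"
    if (root / "go.mod").is_file():
        return "go test ./..."
    return None  # caller should ask the user for the test command
```

A `None` result corresponds to the "ask the user" fallback, and a non-`None` command from `parse_arguments` would take precedence over whatever `detect_test_command` returns.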
Run git status --porcelain. If output is non-empty, STOP and tell the user to commit their changes first.
<decision_gate> Show the user: detected test command, max iterations, current branch. Get one confirmation before starting the loop. This is the only approval gate -- no per-iteration approval. </decision_gate>
Record the current commit SHA as <initial-sha> before entering the loop. This is the rollback target if the loop terminates on failure after multiple iterations.
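The clean-tree check and the recording of `<initial-sha>` could look something like the sketch below. The helper names `git` and `preflight` are hypothetical, and using `git rev-parse HEAD` to read the current SHA is an assumption, since the text does not say which command records it. The git runner is injectable so the logic can be exercised without a real repository.

```python
import subprocess
from typing import Callable


def git(*args: str) -> str:
    """Run a git subcommand in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def preflight(run_git: Callable[..., str] = git) -> str:
    """Refuse to start on a dirty tree; otherwise return the initial SHA."""
    # Any output from `git status --porcelain` means uncommitted changes.
    if run_git("status", "--porcelain").strip():
        raise RuntimeError("Working tree is not clean; commit your changes first.")
    # This SHA is the <initial-sha> rollback target used by the loop.
    return run_git("rev-parse", "HEAD").strip()
```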
Run the initial test suite. If all tests pass, exit with "All tests already pass. Nothing to fix."
For each iteration (up to max iterations):
Extract failure summaries from test output: test name and error message only (one line each). Discard full stack traces and passing test output to minimize context consumption.
Distinguish build/compilation errors from test failures. If the suite fails to compile, treat the entire build error as a single cluster and fix the compilation issue first.
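As an illustration of the extraction step, the sketch below reduces raw runner output to one (test name, error message) pair per failure. It assumes pytest-style short summary lines of the form `FAILED <test id> - <message>`; other runners would need their own pattern. `Failure` and `extract_failures` are hypothetical names, and build-error detection is not covered here.

```python
import re
from typing import NamedTuple


class Failure(NamedTuple):
    test_name: str
    message: str


# Assumed line shape: "FAILED <test id> - <error message>" (message optional).
_FAILED_LINE = re.compile(r"^FAILED (.+?)(?: - (.*))?$")


def extract_failures(output: str) -> list[Failure]:
    """Keep only failing test names and their one-line messages."""
    failures = []
    for line in output.splitlines():
        match = _FAILED_LINE.match(line.strip())
        if match:
            # Stack traces and passing-test output never match, so they are dropped.
            failures.append(Failure(match.group(1), match.group(2) or ""))
    return failures
```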
Before attempting fixes, check whether to stop:
| Condition | Detection | Action |
|---|---|---|
| All tests pass | Zero failures | Stage fixes with git add -A, report success |
| Max iterations | iteration == limit | git reset --hard <initial-sha> (revert all iterations), report |
| Regression | Failure count increased vs previous iteration | git reset --hard HEAD (discard uncommitted fixes), report |
| Circular fix | Failure name set matches any prior iteration | git reset --hard <initial-sha> (revert all iterations), report |
| Non-convergence | Failure count unchanged for 2 consecutive iterations | git reset --hard <initial-sha> (revert all iterations), report |
| Build error persists | Same compilation error after fix attempt | git reset --hard <initial-sha> (revert all iterations), report |
If a termination condition triggers, skip to the Diagnostic Report.
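One way to read the termination table is as a single function evaluated before each fix attempt. The sketch below is illustrative only: `Snapshot` and `check_termination` are hypothetical names, the evaluation order is an assumption (the table does not specify one), and "unchanged for 2 consecutive iterations" is interpreted as three equal failure counts in a row. Failure counts are compared only between runs that compiled, which is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional

STAGE = "git add -A"
RESET_HEAD = "git reset --hard HEAD"
RESET_INITIAL = "git reset --hard <initial-sha>"


@dataclass(frozen=True)
class Snapshot:
    failures: frozenset  # names of failing tests in this run
    build_error: Optional[str] = None  # compiler output if the suite did not build


def check_termination(
    history: list[Snapshot], iteration: int, limit: int
) -> Optional[tuple[str, str]]:
    """Return (condition, action) if the loop should stop, else None.

    history[-1] is the latest test run; earlier entries are prior runs.
    """
    current, prior = history[-1], history[:-1]
    if current.build_error is None and not current.failures:
        return ("all tests pass", STAGE)
    if iteration == limit:
        return ("max iterations", RESET_INITIAL)
    if current.build_error is not None:
        # A build error is a single cluster; only its persistence ends the loop.
        if prior and prior[-1].build_error == current.build_error:
            return ("build error persists", RESET_INITIAL)
        return None
    # Assumption: compare only against earlier runs that compiled.
    built = [p for p in prior if p.build_error is None]
    if built and len(current.failures) > len(built[-1].failures):
        return ("regression", RESET_HEAD)
    if any(current.failures == p.failures for p in built):
        return ("circular fix", RESET_INITIAL)
    counts = [len(p.failures) for p in built[-2:]] + [len(current.failures)]
    if len(counts) == 3 and len(set(counts)) == 1:
        return ("non-convergence", RESET_INITIAL)
    return None
```

Returning the action alongside the condition keeps the distinction visible between discarding only uncommitted fixes (`HEAD`) and reverting every iteration (`<initial-sha>`).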
Cluster failures by file or module (max 5 groups, sorted by failure count descending). If more than 5 modules fail, take the top 5 and note the skipped modules.
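The clustering rule can be sketched as below. This assumes test ids of the form `path/to/file.py::test_name`, so that the text before the first `::` identifies the file or module; `cluster_failures` and `MAX_CLUSTERS` are hypothetical names chosen for illustration.

```python
from collections import Counter
from typing import Iterable

MAX_CLUSTERS = 5  # limit stated in the text


def cluster_failures(
    test_names: Iterable[str],
) -> tuple[list[tuple[str, int]], list[str]]:
    """Group failures by file and keep the largest MAX_CLUSTERS groups.

    Returns (top clusters as (module, failure count), skipped module names).
    """
    # Assumption: the part before the first "::" names the file or module.
    counts = Counter(name.split("::", 1)[0] for name in test_names)
    ranked = counts.most_common()  # sorted by failure count, descending
    top = ranked[:MAX_CLUSTERS]
    skipped = [module for module, _ in ranked[MAX_CLUSTERS:]]
    return top, skipped
```

The second return value carries the modules that were left out, so they can be noted in the report rather than silently dropped.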
For each cluster, apply the diagnostic-first rule:
<critical_sequence>
Commit the current working tree as a rollback checkpoint before applying fixes. Skip on iteration 1 if the tree is clean -- <initial-sha> already serves as the rollback point.
git add -A && git commit -m "test-fix-loop: checkpoint iteration N"
Apply fixes to implementation code only. NEVER modify test files, add skip annotations, delete tests, or weaken assertions.
Re-run the full test suite after applying fixes.
Evaluate the result:
- git add -A, report success, STOP
- git reset --hard HEAD (discard uncommitted fixes, return to checkpoint), STOP
- git reset --hard <initial-sha> (revert ALL iterations), STOP
- git reset --hard <initial-sha> (revert ALL iterations), STOP

</critical_sequence>

On termination (success or failure), write a report to stdout.
On success, fixes are staged but NOT committed. The user reviews and commits via /ship or manually.