From claude-swe-workflows
Autonomous comprehensive refactoring through a three-phase pipeline: tactical cleanup, architectural restructuring, then final tactical pass.
npx claudepluginhub chrisallenlane/claude-swe-workflows --plugin claude-swe-workflows
Slash command: /claude-swe-workflows:lead-refactor (opus)
Drives a codebase through tactical cleanup, architectural restructuring, and a final tactical cleanup pass — all without operator involvement between startup and termination. The operator states scope, severity floor, constraints, and refactor aggression at startup; the skill then runs a three-phase pipeline that picks up where each previous phase converges.
This skill is the orchestrator-family member that pairs tactical refactoring with architectural review. It is a narrower sibling of /lead-project and a peer of /lead-bug-hunt. Unlike /lead-bug-hunt's convergence loop, /lead-refactor has a fixed three-phase shape: each phase converges internally (via the sub-skills' own loops) rather than via a global loop over the macro phases.
This skill implements the autonomy discipline documented in references/autonomy.md. The shared discipline governs the five levers (altitude rule, pre-loaded options, pre-rebutted recommendation, commander's intent, risk budgets), the cascade rule, the no-unilateral-breaking-changes guardrail, and the shared handoff template.
Phase 1 invokes /refactor, which loops internally until no more tactical improvements remain. Phase 2 runs /review-arch and acts on findings at or above the severity floor via /implement-batch, re-running /review-arch until it produces no findings at or above the floor (or the architectural-iteration hard cap of 5 is hit). Phase 3 invokes /refactor again to catch any tactical issues introduced by Phase 2's structural changes.
There is no global loop over the macro phases. /refactor converging once and then /review-arch converging is sufficient — /refactor's tactical scope (DRY, dead code, naming, complexity) does not generate new architectural opportunities, and /review-arch's noun analysis is stable across re-runs. Two /refactor invocations and one Phase-2 inner loop is the right shape.
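The fixed three-phase shape can be sketched as a small driver. This is a minimal illustration, not the skill's implementation: `run_pipeline`, `RunLog`, and `AndonCord` are hypothetical names, the three callables stand in for the /refactor, /review-arch, and /implement-batch invocations, and `review_arch` is assumed to return only the findings at or above the severity floor. The sketch also assumes the converging /review-arch run counts toward the cap of 5.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_ARCH_ITERATIONS = 5  # hard cap on Phase 2 iterations stated in the text


class AndonCord(Exception):
    """Hypothetical signal: stop the run and hand off to the operator."""


@dataclass
class RunLog:
    phases: List[str] = field(default_factory=list)
    arch_iterations: int = 0


def run_pipeline(
    refactor: Callable[[], None],
    review_arch: Callable[[], List[str]],
    implement_batch: Callable[[List[str]], None],
) -> RunLog:
    log = RunLog()

    # Phase 1: tactical refactor (the sub-skill converges internally).
    refactor()
    log.phases.append("phase-1")

    # Phase 2: bounded architectural loop.
    converged = False
    for _ in range(MAX_ARCH_ITERATIONS):
        log.arch_iterations += 1
        fix_list = review_arch()
        if not fix_list:  # nothing at or above the floor: converged
            converged = True
            break
        implement_batch(fix_list)
    if not converged:
        raise AndonCord("Phase 2 did not converge within the iteration cap")
    log.phases.append("phase-2")

    # Phase 3: second tactical pass. There is no loop back to Phase 2.
    refactor()
    log.phases.append("phase-3")
    return log
```

Note that the macro phases run exactly once each; only Phase 2 has an inner loop, which mirrors the "two /refactor invocations and one Phase-2 inner loop" shape.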
/review-arch is advisory; its ticket proposals are auto-approved under /lead-refactor per the orchestrator-family contract documented in references/autonomy.md § "Auto-approval of sub-skill ticket proposals". The commander's-intent severity floor (field 2) is applied at the Phase-2 triage step (2b), not at the approval moment. The completion report lists every ticket created.
/review-arch produces findings backed by noun-analysis evidence. This skill trusts findings at or above the severity floor as actionable — they are real architectural opportunities by /review-arch's contract. The skill does not silently dismiss findings. If the skill genuinely believes a finding is wrong (the proposed restructuring would break a constraint, the finding misreads the domain model), that is an andon trigger ("contested finding"), not a unilateral disregard. No escape hatches.
The skill may: invoke /refactor, /review-arch, /implement-batch, /implement; create tickets via auto-approved /review-arch proposals; commit refactoring work via sub-skills; create and modify the working branch.
The skill may NOT without explicit authorization: push or merge to main/master, force-push, propose breaking changes (see references/autonomy.md § "No unilateral breaking changes"), invoke other skills outside the bounded repertoire, install dependencies, run irreversible destructive operations.
┌──────────────────────────────────────────────────────────────────┐
│ LEAD-REFACTOR WORKFLOW │
├──────────────────────────────────────────────────────────────────┤
│ 0. Startup │
│ ├─ 0a. Branch and working-tree check │
│ ├─ 0b. Resume existing run or start fresh │
│ ├─ 0c. Elicit commander's intent (4 fields) │
│ └─ 0d. Seed LEAD_REFACTOR_STATE.md │
│ │
│ 1. Phase 1: Tactical refactor │
│ └─ Invoke /refactor (loops internally to convergence) │
│ │
│ 2. Phase 2: Architectural review loop (max 5 iterations) │
│ ├─ 2a. Invoke /review-arch (auto-approves ticket proposals) │
│ ├─ 2b. Triage findings against severity floor │
│ ├─ 2c. Form batch, invoke /implement-batch │
│ ├─ 2d. Verify tests pass │
│ └─ 2e. Convergence check (re-run /review-arch) │
│ │
│ 3. Phase 3: Final tactical refactor │
│ └─ Invoke /refactor (catches tactical issues from Phase 2) │
│ │
│ 4. Termination │
│ ├─ 4a. Final verification pass │
│ └─ 4b. Completion report │
└──────────────────────────────────────────────────────────────────┘
Follow the shared startup protocol in references/lead-startup.md. Skill-specific values:
- 0a. Branch name: lead-refactor/<date> (e.g., lead-refactor/2026-05-12).
- 0b. State doc: LEAD_REFACTOR_STATE.md. "Resume as-is" semantic: re-verify the current phase's state, then continue.
- 0c. Commander's-intent schema: references/autonomy.md § "Commander's-intent schemas per skill / /lead-refactor". Push-back examples specific to this skill: "Clean it all up" is not a scope, so ask which modules; "Whatever severity" is not a floor, so push for HIGH+ as the productive default.
- 0d. Seed LEAD_REFACTOR_STATE.md: include the four pinned intent fields, Current phase: phase-1, an empty cycle log, and an empty findings ledger. Gitignore the state doc per the protocol.
Phase 1: invoke /refactor with the scope and refactor aggression from commander's intent.
/refactor loops internally until no more tactical improvements remain. Suppress /refactor's built-in /tidy-docs pass (step 7 in /refactor) — /lead-refactor does not include a documentation finisher; the operator can run /tidy-docs separately if needed.
After /refactor concludes:
- Update state doc: current phase becomes phase-2.
If tests fail after Phase 1, pull the andon cord (regression introduced by tactical refactor).
Phase 2 is a bounded loop with a maximum of 5 architectural iterations. Each iteration has five sub-phases.
2a. Invoke /review-arch
Run /review-arch with:
When /review-arch reaches its ticket-proposal step, auto-approve. Record in the cycle log: ticket IDs proposed, scope of the proposal, and the fact that auto-approval was applied per commander's intent.
For each finding produced by /review-arch:
Classify against severity floor:
- Below the floor: record in ## Deferred findings with severity and a one-line summary.
Screen for contested findings. If the skill believes a finding is wrong on substance (the proposed restructuring contradicts a constraint in commander's intent, the finding misreads the domain model), do not silently dismiss it. Record in ## Contested findings and pull the andon cord.
Screen for breaking-change implications. If a finding's proposed restructuring would require a breaking change (per references/autonomy.md § "No unilateral breaking changes"), record in ## Breaking-change findings and pull the andon cord. Do not auto-fix.
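The triage pass above can be sketched as a single partition over the findings. This is an illustrative sketch under assumptions: `Finding`, `TriageResult`, and `triage` are hypothetical names, the `contested` and `breaking` flags stand in for the skill's own screening judgment, and the sketch assumes the two screens apply only to findings that would otherwise be acted on.

```python
from dataclasses import dataclass, field
from typing import List

# Ascending severity order; the labels follow the severity-floor vocabulary.
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


@dataclass(frozen=True)
class Finding:
    summary: str
    severity: str
    contested: bool = False  # hypothetical flag: the skill disputes the finding
    breaking: bool = False   # hypothetical flag: the fix needs a breaking change


@dataclass
class TriageResult:
    fix_list: List[Finding] = field(default_factory=list)
    deferred: List[Finding] = field(default_factory=list)
    contested: List[Finding] = field(default_factory=list)
    breaking: List[Finding] = field(default_factory=list)

    @property
    def pull_andon_cord(self) -> bool:
        # Any contested or breaking-change finding stops autonomous work.
        return bool(self.contested or self.breaking)


def triage(findings: List[Finding], floor: str) -> TriageResult:
    result = TriageResult()
    for finding in findings:
        if SEVERITY_RANK[finding.severity] < SEVERITY_RANK[floor]:
            result.deferred.append(finding)   # below floor: record, do not act
        elif finding.contested:
            result.contested.append(finding)  # never silently dismissed
        elif finding.breaking:
            result.breaking.append(finding)   # never auto-fixed
        else:
            result.fix_list.append(finding)
    return result
```

Every finding lands in exactly one of the four buckets, so nothing is dropped without a ledger entry.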
Three possible outcomes: form batch, escalate, or converged.
Batches are per-iteration. Don't co-mingle with prior iterations' tickets — verification stays tight.
Record the chosen outcome and rationale in the cycle log.
Execute the batch via /implement-batch (or /implement for single-ticket batches when /implement-batch would be ceremony).
After the batch completes:
If tests fail — andon cord (regression introduced by architectural change).
Sub-skill escalations cascade up per references/autonomy.md § "Cascade rule."
Phase 2 ends when 2c determines the fix-list is empty (no findings above floor on the most recent /review-arch run). Update state doc: current phase becomes phase-3.
Invoke /refactor again with the same scope and aggression as Phase 1. This catches tactical issues introduced by Phase 2's restructuring (renamed-but-not-cleaned modules, redundant code paths after consolidation, dead imports left from moves).
/refactor loops internally to convergence. Suppress its built-in /tidy-docs pass.
After /refactor concludes:
- Update state doc: current phase becomes termination.
If tests fail, pull the andon cord.
Before declaring done:
If any check fails, treat as a blocker — return to Phase 3 to address.
## Lead-Refactor Complete
### Commander's intent
[All four fields, verbatim]
### Outcome
[One paragraph: did the pipeline complete cleanly? Did Phase 2 converge
within the 5-iteration cap? Were there contested or breaking-change
findings that surfaced? Note any caveats.]
### Phase summary
- Phase 1 (tactical): N commits, net -X/+Y lines
- Phase 2 (architectural): N iterations, K tickets fixed, M tickets deferred
- Phase 3 (tactical cleanup): N commits, net -X/+Y lines
### Top things to scrutinize
[Three to five items where the skill's judgment is most likely to need
review. Each item: one sentence + artifact (SHA, ticket ID, file:line).
Examples: aggressive triage calls in Phase 2, batches with multiple
attempts, areas where /review-arch findings were close-to-floor.]
### Tickets created and fixed (Phase 2)
- [#N] <title> — fix SHA <short> — /review-arch finding type
- [#N+1] <title> — fix SHA <short>
- ...
### Deferred findings (below severity floor)
[/review-arch findings below the severity floor, which did not gate Phase 2
convergence. Operator may run another iteration with a lower floor to address them.]
- [Medium | iteration 1] <description>
- [Low | iteration 2] <description>
- ...
### Contested or breaking-change findings (if any escalated)
- [contested | iteration 3] <description> — andon-cord SHA <short> in state doc
- ...
### Constraint adherence
[Confirm no violations. If any close calls, name them with commit SHAs.]
### Changes summary
- Branch: <branch name> (SHA <short>)
- Base: <base branch> (SHA <short>)
- Commits on branch: N (Phase 1: P1 + Phase 2 fixes: P2 + Phase 3: P3)
- Net lines: +X/-Y
### Run metadata
- Phase 2 iterations: N of 5
- Andon-cord pulls during run: N (each with handoff in state doc)
- Duration (wall-clock, approximate)
The operator decides whether to merge, run another iteration (e.g., with a lower severity floor), or pause.
Scope:
Same shape as /refactor's and /review-arch's scope questions. Examples:
- "… vendor/ and gen/."
- "pkg/auth and pkg/session: recent rewrite, want comprehensive cleanup."
- "… pkg/legacy (slated for restructuring)."
Severity floor:
The lowest /review-arch severity that gates Phase 2 convergence.
| Floor | Effect |
|---|---|
| CRITICAL only | Loops only on CRITICAL findings. Ships with HIGH+MEDIUM+LOW deferred. Use for "fix only the worst" sweeps. |
| HIGH+ | Default. Acts on HIGH and CRITICAL architectural findings; defers MEDIUM and LOW. |
| MEDIUM+ | Acts on MEDIUM and above. Phase 2 may not converge in 5 iterations on large codebases. |
| All severities | Not recommended. /review-arch tends to find LOW-severity opportunities indefinitely; unlikely to converge. |
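The table reduces to one cut point in an ordered severity list. A minimal sketch, assuming the four severities named in the table and a hypothetical helper `split_by_floor`; the bare label "All" is accepted as an alias because the state doc template uses it.

```python
from typing import List, Tuple

SEVERITIES = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]  # ascending order

# Floor label -> lowest severity that Phase 2 acts on.
FLOORS = {
    "CRITICAL only": "CRITICAL",
    "HIGH+": "HIGH",
    "MEDIUM+": "MEDIUM",
    "All severities": "LOW",
    "All": "LOW",
}


def split_by_floor(floor_label: str) -> Tuple[List[str], List[str]]:
    """Return (severities acted on, severities deferred) for a floor label."""
    cut = SEVERITIES.index(FLOORS[floor_label])
    return SEVERITIES[cut:], SEVERITIES[:cut]


acted_on, deferred = split_by_floor("HIGH+")
print(acted_on, deferred)  # ['HIGH', 'CRITICAL'] ['LOW', 'MEDIUM']
```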
Constraints:
Hard limits beyond the always-on guardrails (no breaking changes, no main/master writes).
Examples:
- "… auth."
- "… pkg/legacy (slated for removal next quarter)."
Refactor aggression:
Aggression ceiling for both /refactor passes (Phase 1 and Phase 3). Mirrors /refactor's vocabulary.
| Setting | Effect |
|---|---|
| conservative | Only SAFEST and SAFE categories — dead code, formatters, simple DRY, single-use indirection. |
| moderate | Default. Adds cross-module DRY, splitting files, removing abstraction layers. |
| aggressive | Adds removal of legacy code with unclear purpose, consolidating similar-but-not-identical behavior. |
Triage is mechanical, not judgment-based. The severity floor is the only knob.
- … /implement-batch.
The skill does NOT:
- … /review-arch's severity classifications. The reviewer's judgment stands.
Follow the shared handoff template and per-skill extension protocol in references/autonomy.md § "Shared handoff template" and § "Per-skill handoff extensions". Skill-specific values:
- Header: ## Andon Cord — /lead-refactor — Phase N (the phase is load-bearing).
- Current phase: <phase-1 | phase-2 | phase-3 | termination>
- Phase 2 iteration counter: <N> (if currently in Phase 2)
- Findings ledger: <K fixed, M deferred, L contested> (if currently in Phase 2)
- State doc pointer: see LEAD_REFACTOR_STATE.md
Pull the cord when:
- A /review-arch finding is wrong on substance: the proposed restructuring contradicts a constraint in commander's intent, or the finding misreads the domain model.
- A /review-arch recommendation requires a breaking change (per references/autonomy.md § "No unilateral breaking changes").
- A /refactor or /implement-batch invocation made the test suite fail in ways unrelated to the targeted change.
- /refactor, /review-arch, /implement-batch, or /implement pulled its own cord for a reason this skill cannot resolve.
- /implement-batch fails on the same finding 3 times across different approaches.
LEAD_REFACTOR_STATE.md:
Maintained at the repo root. Gitignored. Survives across invocations.
Structure:
# Lead-Refactor State
Started: <timestamp>
Branch: <branch-name>
Branch SHA at startup: <short SHA>
Base branch: <main-branch>
Base SHA at startup: <short SHA>
Last cycle HEAD: <short SHA>
Current phase: <phase-1 | phase-2 | phase-3 | termination>
Phase 2 iteration: N (if applicable)
Status: <active | paused-on-andon | complete>
## Commander's Intent
### Scope
<verbatim>
### Severity floor
<CRITICAL only | HIGH+ | MEDIUM+ | All>
### Constraints
- <constraint 1>
### Refactor aggression
<conservative | moderate | aggressive>
## Cycle log
### Phase 1 — <timestamp> — HEAD <short SHA>
- /refactor invoked with aggression <X>, scope <Y>
- Outcome: N commits, net -X/+Y lines, tests pass
### Phase 2 — Iteration 1 — <timestamp> — HEAD <short SHA>
- /review-arch: <findings count by severity>, tickets proposed <#N..#M>
- Triage: <count above floor | below floor | contested | breaking>
- Decide: <form batch | escalate | converged>
- Act: /implement-batch outcome, fix SHAs
- Verify: tests pass / fail
### Phase 2 — Iteration 2 ...
### Phase 3 — <timestamp> — HEAD <short SHA>
- /refactor invoked, outcome: ...
## Findings ledger
### Fixed (Phase 2)
- [iteration 1 | ticket #14] <description> — fix SHA <short>
- ...
### Deferred (below floor)
- [Medium | iteration 1] <description>
- ...
### Contested
- [iteration 3] <description> — andon cord pulled, see § Andon Cord history
### Breaking-change-required
- [iteration 2] <description> — andon cord pulled, see § Andon Cord history
## Andon cord history
### Phase 2 iteration 3 pull
<full handoff text, pasted>
## Open questions
- <question>
Update at every phase transition and every Phase 2 iteration. The state doc is the durable orientation — losing it means losing the agent's memory.
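A phase transition amounts to rewriting one header line of the state doc. The sketch below is one way to do that on the document text; `set_current_phase` is a hypothetical helper, and it assumes the header keeps the exact `Current phase: ...` form shown in the structure above.

```python
import re

VALID_PHASES = ("phase-1", "phase-2", "phase-3", "termination")


def set_current_phase(state_text: str, phase: str) -> str:
    """Return the state doc text with its 'Current phase:' line updated."""
    if phase not in VALID_PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    new_text, count = re.subn(
        r"^Current phase: .*$",
        f"Current phase: {phase}",
        state_text,
        count=1,             # only the header line, not later mentions
        flags=re.MULTILINE,  # let ^ and $ match at line boundaries
    )
    if count != 1:
        # Refuse to continue with a state doc that has lost its header.
        raise ValueError("state doc has no 'Current phase:' line")
    return new_text
```

Failing loudly when the line is missing fits the role of the state doc as the durable orientation: a silent no-op would leave the recorded phase stale.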
.gitignore:
Ensure LEAD_REFACTOR_STATE.md is ignored. Commit the .gitignore change on the working branch at startup if needed.
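The ignore check can be made idempotent with a few lines. This is a sketch under assumptions: `ensure_ignored` is a hypothetical helper, and it only recognizes an exact-line entry, so a broader pattern that already covers the file (for example a wildcard) would not be detected.

```python
from pathlib import Path

STATE_DOC = "LEAD_REFACTOR_STATE.md"


def ensure_ignored(repo_root: Path, entry: str = STATE_DOC) -> bool:
    """Append `entry` to .gitignore if absent. Return True if the file changed."""
    gitignore = repo_root / ".gitignore"
    existing = gitignore.read_text() if gitignore.exists() else ""
    if entry in (line.strip() for line in existing.splitlines()):
        return False  # already ignored: nothing to commit
    # Keep the new entry on its own line even if the file lacks a trailing newline.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    gitignore.write_text(existing + separator + entry + "\n")
    return True
```

The boolean return tells the caller whether there is a .gitignore change to commit on the working branch.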
- … /review-arch + one batch-implementation. If the loop hits 5 without convergence, the architectural shape is not stable under the current severity floor or scope; pull the andon cord.
- If /implement-batch fails on the same finding 3 times across different approaches, pull the andon cord.
- … references/autonomy.md § "Risk budgets."
Relationship to /refactor:
/lead-refactor invokes /refactor twice — once in Phase 1 (clear tactical noise before architectural analysis) and once in Phase 3 (catch tactical issues introduced by Phase 2's restructuring). Run /refactor directly when you want only tactical cleanup; run /lead-refactor when you want the full tactical + architectural + tactical pipeline.
Relationship to /review-arch:
/lead-refactor's Phase 2 invokes /review-arch repeatedly until findings converge below the severity floor. /review-arch's ticket proposals are auto-approved per the orchestrator-family contract. Run /review-arch directly when you want an advisory architectural read-out without implementation.
Relationship to /lead-project:
/lead-project is the open-ended orchestrator — it takes broad commander's intent and decides which skills to invoke. /lead-refactor is a fixed-shape three-phase pipeline for one specific outcome. Use /lead-project when refactoring is one of several concerns; use /lead-refactor when comprehensive refactoring is the sole objective. /lead-project may invoke /lead-refactor as a sub-skill when its Decide phase identifies a refactor sweep as the next move.
Relationship to /lead-bug-hunt:
Sibling orchestrator-family skill with a different outcome contract. /lead-bug-hunt eliminates bugs above a severity floor; /lead-refactor cleans up code structure above a severity floor. Both use the same auto-approval contract for sub-skill ticket proposals and the same 4-field commander's intent shape (scope, severity floor, constraints, aggression-or-finisher).
Out-of-axis skills:
/review-* skills other than /review-arch, /scope-project, /test-mutation, /tidy-docs, /tidy-git, /bug-* are not in this skill's repertoire. If your goal mixes refactoring with bug elimination, security review, doc tidying, or test surveying, use /lead-project instead. /lead-refactor is deliberately narrower.
If you want doc cleanup after refactoring, run /tidy-docs separately after /lead-refactor completes — structural changes in Phase 2 commonly invalidate inline docs and code examples.