From ak
Analyze structure and produce a Refactor Proposal. Analysis only; no files modified.
Slash command: `/ak:code-refactor`
**Reachable behavior is preserved. Structure is negotiable. The design premise itself is on the table.**
This skill does not find things to clean up. It asks whether the current shape of a component is the right shape, and proposes structural changes when it isn't. Surface-level smells are treated as evidence, not as the thing to fix.
The skill's primary failure mode is bottom-up cataloguing: listing four symptoms, proposing four independent fixes, and missing that all four trace back to one upstream design decision. Guard against this above all else. Premise-first diagnosis is what makes this skill different from code-simplify, not "being bolder."
The second failure mode is cowardly compatibility: preserving a bad internal interface because changing call sites feels disruptive. If the signature, dependency direction, wrapper, or module boundary is the disease, the refactor should break and reshape that internal surface atomically. Compatibility is sacred only at hard-stop surfaces: public exports, HTTP routes, webhook callbacks, event/message schemas, database fields, reflection targets, feature flags, and user-protected files.
code-refactor vs. code-simplify: these skills are orthogonal, not a spectrum.

- code-simplify operates within the current structure. Accept the design, improve the expression. Signatures preserved, call sites untouched.
- code-refactor operates on the current structure. Question the design, reshape it if wrong. Signatures negotiable, call sites reshape atomically.

A piece of code can need either, both, or neither. They do not interpolate.
If a user request is fundamentally about code expression (long method, unclear variable names, dedup within a function, style), route to code-simplify. If it involves questioning whether the current decomposition is right, this skill applies.
Topological boundary. The user specifies what to look at as paths, globs, files, or a specific diff — not as semantic domains. Legacy code bleeds across semantic lines. If the user gives a semantic boundary ("the billing domain"), translate it to paths and confirm.
Awareness of test coverage. Note whether tests exist for the boundary. Their presence or absence directly affects the confidence tier of every proposed change — a HIGH-confidence proposal on an untested surface is a contradiction.
codebase-dna artifact (optional). If available, read it — it provides component context that prevents hallucinated call relationships. If not, read files directly during Phase 1.
Feature-flag evidence (only if flag cleanup is in scope). The user must explicitly state "Flag X is retired" or "fully rolled out as of [date]." Require explicit user confirmation of flag state — it lives in production, not in config files.
Build a map of the component. For each key symbol in the boundary:
Use deterministic tools where available (rg, grep, language servers, tsc, ts-prune, knip, vulture, gopls) to find callers — do not rely on naming conventions. If no tools are available, read files.
For boundaries over ~10 files: do not attempt to deep-survey every symbol. Survey everything shallowly first (file responsibilities, exports, obvious call relationships), then identify the 5–10 load-bearing symbols — widely called, central to the boundary's purpose, or exposed as external surface — and survey those deeply. Skim the rest: what does it do, is it reached from outside the boundary. Deep-surveying every symbol in a 50-file boundary produces a map that's exhaustive and useless. The load-bearing symbols are where design decisions live; the rest inherits from them.
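The shallow-then-deep survey above can be pictured as a ranking step. The following is a minimal TypeScript sketch, not something the skill prescribes: the `SymbolSurvey` shape, the scoring weights, and the `pickLoadBearing` helper are all hypothetical and exist only to illustrate the idea.

```typescript
// Hypothetical shape for one entry of the shallow survey.
interface SymbolSurvey {
  name: string;
  callerCount: number;         // callers found by a deterministic tool (rg, language server, ...)
  reachedFromOutside: boolean; // called from outside the boundary
  isExternalSurface: boolean;  // public export, route handler, event schema, ...
}

// Rank symbols so the deep survey starts with the load-bearing ones.
// The weights are illustrative assumptions, not values from the skill.
function pickLoadBearing(symbols: SymbolSurvey[], limit = 10): SymbolSurvey[] {
  const score = (s: SymbolSurvey): number =>
    s.callerCount + (s.reachedFromOutside ? 5 : 0) + (s.isExternalSurface ? 10 : 0);
  // Copy before sorting so the shallow survey itself is left untouched.
  return [...symbols].sort((a, b) => score(b) - score(a)).slice(0, limit);
}

const shallowSurvey: SymbolSurvey[] = [
  { name: "formatDate", callerCount: 2, reachedFromOutside: false, isExternalSurface: false },
  { name: "createBooking", callerCount: 14, reachedFromOutside: true, isExternalSurface: true },
  { name: "loadConfig", callerCount: 6, reachedFromOutside: true, isExternalSurface: false },
];

// With a limit of 2, the widely called external surface and the shared
// config loader are surveyed deeply; formatDate is only skimmed.
const deepSurveyTargets = pickLoadBearing(shallowSurvey, 2).map((s) => s.name);
```

Any scoring scheme would do, as long as it separates the handful of symbols that carry design decisions from the ones that merely inherit them.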
Phase 1's output is an internal map. Keep it internal — present only if the user asks.
Step 2.0 — Premise check. Do this before anything else.
Ask: what is this component trying to accomplish, and is the current shape the right way to accomplish it?
Then, before listing individual issues, force this question: are there multiple symptoms in this boundary that trace back to one upstream design decision?
Examples of the pattern to look for:
When you find a root-cause decision, propose reversing that decision as the primary refactor. Individual symptom-fixes become either unnecessary or trivial consequences. Do not catalogue the symptoms separately — list them as consequences of the root cause.
Root-cause reversal bar: Reverse the wrong decision directly. Do not preserve the old shape with adapters, wrappers, option flags, compatibility layers, or "temporary" dual paths unless the old shape crosses a hard-stop surface. Inside the confirmed boundary, migrate callers atomically and delete the wrong shape.
If no single upstream decision explains multiple symptoms, then — and only then — fall back to cataloguing individual issues.
Step 2.1 — Individual structural issues (only after the premise check).
Look for these categories, but only list items that survive premise-check (i.e., are not already absorbed by a root-cause refactor):
Signature problems:
Shape problems:
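Dead code: for example, code that follows an unconditional `return`/`throw`.

Out of scope for this skill: code-expression issues such as long methods, unclear variable names, dedup within a function, and style. Those are code-simplify's job.

"Nothing meaningful to refactor" is a valid diagnosis. A refactor skill that always finds something is a vandal. If the premise is sound and no structural issues survive, say so directly in Phase 3.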
The proposal's shape depends on what Phase 2 found. There are three cases, not two.
Case A — Root-cause decision only, no residuals. Lead with prose, not a table. Explain the upstream decision, why it's wrong, what the reversal looks like, and how the downstream symptoms become free consequences. Example shape:
The 2nd parameter on `createPublicBooking` is the wrong abstraction. Encryption is a test-setup concern and doesn't belong in the booking service. If you push the encrypted card into the payload at the call site, three things happen for free: the duplicated transform disappears, the VCC/regular inconsistency disappears (both use the same pattern), and the `apiEverVault` constructor leak in management services disappears (the service no longer needs to build encrypted payloads internally).

Proposed change: remove the 2nd parameter from `createPublicBooking`, `createChangeBooking`, `cancelBooking`, and `createHopperBooking`. Call sites pass `creditCard: { ...encryptedCard.cardToken, isEncrypted: true }` in the payload directly. Optionally, add `FinanceService.toEncryptedCardPayload(res, extra?)` as a one-liner to cover the VCC case with `chargeableAmount`.
This is reviewer voice — a senior engineer explaining the real fix, not a cataloguer producing a severity matrix.
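To make the Case A example concrete, here is a hedged before/after sketch of one call site. The real signatures and payload types are not given in the example, so `BookingPayload`, `EncryptedCard`, and the function bodies (which simply return the payload in place of making a request) are assumptions for illustration only.

```typescript
// Hypothetical types; the real payload and card shapes are not shown in the example.
interface EncryptedCard {
  cardToken: { token: string; last4: string };
}
interface BookingPayload {
  hotelId: string;
  creditCard?: { token: string; last4: string; isEncrypted: boolean };
}

// Before: the 2nd parameter makes the booking service responsible for
// turning an encrypted card into a payload field.
function createPublicBookingBefore(
  payload: BookingPayload,
  encryptedCard?: EncryptedCard,
): BookingPayload {
  return encryptedCard
    ? { ...payload, creditCard: { ...encryptedCard.cardToken, isEncrypted: true } }
    : payload;
}

// After: one parameter. The service no longer knows about encryption.
function createPublicBooking(payload: BookingPayload): BookingPayload {
  return payload;
}

const encryptedCard: EncryptedCard = { cardToken: { token: "tok_123", last4: "4242" } };

// Old call site: passes the card as a separate argument.
const payloadBefore = createPublicBookingBefore({ hotelId: "h1" }, encryptedCard);

// New call site: builds the full payload itself, as the proposal describes.
const payloadAfter = createPublicBooking({
  hotelId: "h1",
  creditCard: { ...encryptedCard.cardToken, isEncrypted: true },
});
```

Both call sites produce the same payload, which is the sense in which reachable behavior is preserved while the signature changes.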
Case B — Root cause AND residual independent issues. Lead with the root-cause prose as in Case A. Then, underneath, add a short residuals section for the items the reversal does NOT absorb. Do not mix them — the root cause is the headline; the residuals are a follow-up note. Example shape:
[root-cause prose, as in Case A]
Residual items (not absorbed by the reversal):
| Change | Files touched | Confidence | Why |
|---|---|---|---|
| Rename `processStuff` → `reconcileInventory` | `stock.ts` | HIGH | Name lies; unrelated to root cause |
| Remove unused `debug` param from `logAudit` | `audit.ts` | HIGH | Zero callers pass it |
The residuals table follows the same confidence-tier rules as Case C. Keep it short — if the residuals table is longer than the root-cause prose, reconsider whether you actually found a root cause or whether it's really Case C.
Case C — Bag of independent issues, no root cause. List them as numbered refactor items, each with a confidence tier. Tables are appropriate for the top-level scan, but multi-file suggestions need stable IDs so the user can approve, reject, or delegate them independently. Example shape:
| ID | Refactor | Files touched | Confidence | Why |
|---|---|---|---|---|
| R1 | Remove unused retryCount param from fetchUser | user.ts + 7 call sites | HIGH | Static, tool-confirmed zero usage |
| R2 | Collapse getUserData wrapper around fetchUser | user.ts + 3 call sites | MEDIUM | Wrapper adds no transformation |
| R3 | Delete legacyAuthFallback function | auth.ts | LOW | Exported, dynamic language, unprovable reachability |
For any item touching multiple files, crossing a module boundary, or requiring sequencing, add a short detail block under the table:
#### R2 — Collapse `getUserData` wrapper
**Files:** `user.ts`, `profile.ts`, `orders.ts`
**Call sites:** 3
**Change:** Replace wrapper calls with direct `fetchUser` calls and delete the wrapper.
**Risk:** Medium — wrapper is internal, but call sites span 2 modules.
**Verification:** `npm test -- user`
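As an illustration of what R2 describes, the sketch below shows a wrapper that adds no transformation and one call site before and after the change. The `User` type and the function bodies are hypothetical stand-ins (synchronous for brevity); only the names `getUserData` and `fetchUser` come from the example.

```typescript
// Hypothetical stand-ins for the symbols named in R2.
interface User {
  id: string;
  name: string;
}

function fetchUser(id: string): User {
  return { id, name: `user-${id}` };
}

// Before: the wrapper forwards its argument and returns the result unchanged.
function getUserData(id: string): User {
  return fetchUser(id);
}

// A call site before the refactor.
function profileTitleBefore(id: string): string {
  return getUserData(id).name;
}

// The same call site after the refactor: it calls fetchUser directly.
// Once every call site is migrated, getUserData is deleted.
function profileTitleAfter(id: string): string {
  return fetchUser(id).name;
}
```

If the wrapper did anything beyond forwarding (caching, mapping, error translation), it would not qualify for this item.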
Confidence tiers:
Case D — Nothing to refactor. State this directly. Explain what the premise check found (the design is sound) and what the individual sweep found (no structural issues survive). A clean bill of health needs no padding — do not produce a table or list to justify the conclusion. Skip Phase 4.
For every proposal, state concrete numbers: "7 call sites," not "several." Vagueness is how silent drift starts.
Present the proposal and stop. This skill's output is a structured proposal — source files are not modified. Implementation is carried out separately by the user or the code skill.
Wrap the proposal in a structured header and output it.
## 🧱 Refactor Proposal
**Boundary:** `[files / paths analyzed]`
**Case:** `A root cause only | B root cause + residuals | C independent issues | D nothing to refactor`
**Refactors Proposed:** `[N]`
**Confidence:** `HIGH | MEDIUM | LOW | mixed`
**Hard Stops:** `[items requiring explicit per-item sign-off] | none`
**Follow-ups:** `[suggested next steps] | none`
### 🔁 Root-Cause Reversal
[For Case A or B: proposal body from Phase 3. Explain the wrong upstream decision, the reversal, and the concrete call-site/module changes.]
### 🧭 Refactor Index
| ID | Refactor | Files | Call Sites | Confidence | Why |
| :-- | :-- | :-- | :-- | :-- | :-- |
| R1 | [short imperative change] | `[file list or count]` | `[N]` | `HIGH/MEDIUM/LOW` | [one-line evidence] |
| R2 | [short imperative change] | `[file list or count]` | `[N]` | `HIGH/MEDIUM/LOW` | [one-line evidence] |
[Use this for Case B residuals and all Case C items. For Case A with one root-cause reversal, omit this section unless it helps summarize multiple affected file groups.]
### 🧩 Refactor Details
#### R1 — [Refactor name]
**Files:** `[concrete files]`
**Call sites:** `[N]`
**Change:** [specific structural change]
**Risk:** [what could break and why]
**Verification:** [test command or manual check]
[Repeat one subsection per item that needs detail. Omit detail subsections for trivial one-line HIGH-confidence items.]
### 🛑 Hard Stops
- `[surface requiring explicit sign-off]` — [why implementation cannot proceed without explicit approval]
- `none`
### 🔍 Evidence Checked
- **Call sites:** `[N]` checked via `[tool/manual trace]`
- **Tests:** `[coverage found / no coverage found]`
- **External surfaces:** `[exports/routes/events/schemas/reflection targets checked]`
### ✅ Recommended Next Step
[One concrete next action, such as "Run code-refactor implementation through the code skill after approving the hard-stop items" or "No refactor recommended."]
If a section is not applicable, omit it unless the omission would hide a risk. Do not output empty placeholder sections. For Case D, output only the metadata, a short ### ✅ Result section, and evidence checked.
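Hard stops: halt at Phase 3 and flag every change that touches a hard-stop surface (public exports, HTTP routes, webhook callbacks, event/message schemas, database fields, reflection targets, feature flags, user-protected files). General "looks good" approval is insufficient; each item needs explicit sign-off.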
Static (TypeScript, Go, Rust, Kotlin, Java, Swift, C#): dead-code and signature-refactor claims can reach HIGH when the language server confirms.
Dynamic (JS without TS, Python, Ruby): no symbol reachable from outside a module exceeds MEDIUM. Exported symbols default to UNCERTAIN — never deleted without the user explicitly saying "I've confirmed X is dead."
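The two language rules above amount to a ceiling on confidence. A minimal sketch follows, assuming a hypothetical `Claim` record and a `maxConfidence` helper. The branches marked as assumptions cover cases the rules do not spell out.

```typescript
type Confidence = "HIGH" | "MEDIUM" | "LOW" | "UNCERTAIN";

// Hypothetical description of a dead-code or signature-refactor claim.
interface Claim {
  typing: "static" | "dynamic";
  toolConfirmed: boolean;              // language server or equivalent confirmed the claim
  reachableFromOutsideModule: boolean;
  exported: boolean;
}

// Returns the highest tier a claim may be given under the rules above.
function maxConfidence(claim: Claim): Confidence {
  if (claim.typing === "static") {
    // Assumption: without tool confirmation, a static-language claim stays at MEDIUM.
    return claim.toolConfirmed ? "HIGH" : "MEDIUM";
  }
  // Dynamic languages: exported symbols default to UNCERTAIN.
  if (claim.exported) return "UNCERTAIN";
  // Dynamic languages: anything reachable from outside its module is capped at MEDIUM.
  if (claim.reachableFromOutsideModule) return "MEDIUM";
  // Assumption: module-private symbols follow the same tool-confirmation rule.
  return claim.toolConfirmed ? "HIGH" : "MEDIUM";
}

const pythonExport = maxConfidence({
  typing: "dynamic",
  toolConfirmed: true,
  reachableFromOutsideModule: true,
  exported: true,
});

const goPrivateFunc = maxConfidence({
  typing: "static",
  toolConfirmed: true,
  reachableFromOutsideModule: false,
  exported: false,
});
```

The function returns a ceiling, not a verdict: missing tests or other evidence gaps can still pull an item below it.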
Cataloguer voice when reviewer voice is needed. If Phase 2 found a root cause, Phase 3 must lead with prose that names the wrong decision and its reversal. Tables are for residuals or bags of independent changes, not root-cause refactors. Producing a severity matrix when the real answer is "the 2nd parameter shouldn't exist" is the exact failure this skill is designed to avoid.
Speculative abstraction. "I could extract this into a strategy pattern" → almost always wrong. Only propose new abstractions when 3+ concrete uses demand it.
Pattern-matching on syntax, not semantics. Two functions that look similar may serve different domain purposes. Merging them creates a god-function with a boolean flag.
Cowardly compatibility. Keeping the old signature plus adding a new helper often preserves the disease. If no hard-stop surface requires compatibility, migrate callers and remove the old shape.
"Unused" that isn't. A function with no static callers may be reached via DI containers, route registries, plugin loaders, reflection. UNCERTAIN, never deleted.
Defensive-code removal. A null check that "seems unnecessary" often catches a real production edge case. Require evidence of unreachability, not aesthetic judgment.
Feature-flag removal from config. Require explicit user confirmation that the flag is retired — flag state lives in production, not in config.yaml or launchdarkly.json.
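The "unused that isn't" trap in the list above is easiest to see with a registry. The sketch below is a hypothetical example (the registry, `exportCsv`, and `dispatch` are invented for illustration): a search for direct calls to `exportCsv` finds none, yet the function runs in production.

```typescript
// Hypothetical plugin registry: handlers are looked up by a string key at runtime.
type Handler = (input: string) => string;
const registry = new Map<string, Handler>();

function register(name: string, handler: Handler): void {
  registry.set(name, handler);
}

// Nothing calls exportCsv by name. A search for call expressions finds none;
// the only reference is the registration on the line after the function.
function exportCsv(input: string): string {
  return input.split(" ").join(",");
}
register("export-csv", exportCsv);

// The only invocation goes through a string key, which in a real system
// might come from config, a route table, or a request.
function dispatch(name: string, input: string): string {
  const handler = registry.get(name);
  if (!handler) throw new Error(`no handler registered for ${name}`);
  return handler(input);
}

const csv = dispatch("export-csv", "a b c");
```

Deleting `exportCsv` together with its registration would type-check and still break every caller that dispatches `"export-csv"`, which is why such symbols stay UNCERTAIN.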
npx claudepluginhub hanh-nd/agent-kit --plugin ak