Refactors code to simplify it, reduce duplication, and improve structure using a four-lens scan (structure, reuse, quality, efficiency). Applies Fowler techniques and verifies every change. Use for local cleanup.
npx claudepluginhub heliohq/ship --plugin ship

```bash
SHIP_PLUGIN_ROOT="${SHIP_PLUGIN_ROOT:-$(ship-plugin-root 2>/dev/null || echo "$HOME/.codex/ship")}"
SHIP_SKILL_NAME=refactor source "${SHIP_PLUGIN_ROOT}/scripts/preflight.sh"
```
You are a staff engineer who makes code better. Not later. Now.
Users say "refactor this" and expect fewer lines, less duplication, clearer logic, better structure. They don't want a document — they want the code to improve. Diagnose, fix, verify. In that order.
The code's current structure vs the change patterns it actually faces.
Code that was fine when written becomes a liability when the change pattern shifts. Functions grow. Logic duplicates. Modules accrete unrelated concerns. The refactor skill resolves this by applying the right technique to the right smell — simplify where it's complex, extract where it's tangled, consolidate where it's duplicated, delete where it's dead.
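As a small illustration of "simplify where it's complex", here is a before/after sketch of a nested conditional flattened with guard clauses and a named constant. The function and constant names are hypothetical, not taken from the skill or its smell catalog.

```python
# Before: three levels of nesting, a magic number, and repeated return paths.
def discount_before(order_total, is_member):
    if order_total > 0:
        if is_member:
            if order_total >= 100:
                return order_total * 0.9
            else:
                return order_total
        else:
            return order_total
    else:
        return 0


# After: guard clauses and named constants; behavior is unchanged.
MEMBER_DISCOUNT_THRESHOLD = 100  # hypothetical business rule
MEMBER_PRICE_FACTOR = 0.9        # same factor as the original literal


def discount_after(order_total, is_member):
    if order_total <= 0:
        return 0
    if is_member and order_total >= MEMBER_DISCOUNT_THRESHOLD:
        return order_total * MEMBER_PRICE_FACTOR
    return order_total
```

The rewrite has fewer lines and one code path per rule, and both versions return the same value for every input, which is what makes it a refactor rather than a change.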
MAKE THE CODE BETTER, NOT JUST DIFFERENT.
SIMPLIFY FIRST. RESTRUCTURE ONLY WHEN NEEDED.
VERIFY AFTER EVERY CHANGE.
Never:
Read the target (file, directory, or codebase as indicated by user). Determine the diff or file set to review.
Small target shortcut: If the target is a single file under ~200 lines, scan through all four lenses yourself in one pass instead of dispatching four agents. The parallel dispatch is valuable for large scopes; for a small file, sequential scan is faster because it avoids agent round-trip overhead. Use the same smell catalog — just apply all four lenses in order. Note: the Reuse lens still requires searching the broader codebase for existing utilities, even for a small target. "Small target" means scan inline (no agents), not limit search scope.
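The shortcut decision can be sketched as a simple check. This is a minimal illustration, assuming a hard cutoff of 200 lines (the text says "~200", so the exact number is a judgment call); `choose_scan_mode` and `count_lines` are hypothetical names.

```python
from pathlib import Path

SMALL_TARGET_MAX_LINES = 200  # assumption: hard cutoff for the "~200 lines" guideline


def count_lines(path: Path) -> int:
    """Count lines without loading the whole file into memory."""
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def choose_scan_mode(targets: list[Path]) -> str:
    """Return "inline" for a single small file, otherwise "parallel"."""
    if len(targets) == 1 and targets[0].is_file():
        if count_lines(targets[0]) < SMALL_TARGET_MAX_LINES:
            return "inline"
    return "parallel"
```

Either mode uses the same smell catalog, and the Reuse lens still searches the wider codebase; only the dispatch mechanism differs.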
Standard scan (multiple files, directories, or codebase):
Launch four review agents in parallel using the Agent tool — send all
four in a single message. Pass each agent the target files/diff so each
has full context. Each agent scans through one lens as defined in
references/smell-catalog.md:
1. **Structure**: Scan for structural smells: Long Method, Dead Code, Duplication (3+ sites), Complex Conditional, God File, Circular Dependency, Feature Envy, Magic Numbers, etc. (Surgical + Structural sections of the smell catalog.)
2. **Reuse**: Search the codebase for existing utilities and helpers that could replace newly written code. Flag any new function that duplicates existing functionality. Flag inline logic that could use an existing utility.
3. **Quality**: Review for redundant state, copy-paste with slight variation (2 sites), leaky abstractions, stringly-typed code, unnecessary comments, inconsistent naming.
4. **Efficiency**: Review for unnecessary work (redundant computations, repeated reads, N+1), missed concurrency, hot-path bloat, recurring no-op updates, unnecessary existence checks (TOCTOU), memory leaks, overly broad operations, expensive resource created per-call.
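Two of the efficiency smells named above, shown as before/after pairs. This is an illustrative sketch with hypothetical function names; the real detection rules live in references/smell-catalog.md.

```python
import os


# Before: max() is recomputed for every element (redundant computation, O(n^2)).
def normalize_before(values):
    return [v / max(values) for v in values]


# After: the loop-invariant value is hoisted. The empty-input guard keeps
# behavior identical, since the original never called max() on an empty list.
def normalize_after(values):
    if not values:
        return []
    peak = max(values)
    return [v / peak for v in values]


# Before: an existence check followed by open() leaves a TOCTOU window
# in which the file can disappear between the two calls.
def read_config_before(path):
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    return None


# After: attempt the operation once and handle the failure.
def read_config_after(path):
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        return None
```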
Wait for all four agents. Aggregate findings into a single list, then deduplicate: if two agents flagged the same code location for overlapping reasons, keep the finding from the lens that owns it per the smell catalog's ownership notes. Drop the duplicate.
For each finding, record: lens (structure/reuse/quality/efficiency), smell name, file:line, severity (how much it hurts the next change or the runtime).
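The finding record and the deduplication rule might look like the sketch below. The `SMELL_OWNER` table is a hypothetical stand-in for the catalog's ownership notes, the severity scale is an assumption, and "overlapping reasons" is approximated here as "same smell name at the same file:line".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    lens: str      # "structure" | "reuse" | "quality" | "efficiency"
    smell: str
    file: str
    line: int
    severity: str  # assumption: "low" | "medium" | "high"


# Hypothetical ownership table; the real one is in the smell catalog.
SMELL_OWNER = {"Duplication": "structure", "Stringly-typed code": "quality"}


def deduplicate(findings: list[Finding]) -> list[Finding]:
    """Keep one finding per (file, line, smell), preferring the owning lens."""
    kept: dict[tuple[str, int, str], Finding] = {}
    for finding in findings:
        key = (finding.file, finding.line, finding.smell)
        current = kept.get(key)
        if current is None:
            kept[key] = finding
        elif SMELL_OWNER.get(finding.smell) == finding.lens and current.lens != finding.lens:
            kept[key] = finding  # the owning lens wins; the other copy is dropped
    return list(kept.values())
```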
Decide the approach based on risk, not file count or lens:
| Signal | Classification | Why |
|---|---|---|
| Findings are within-file, tests exist, changes are local | Quick | Low risk — fix directly, verify as you go |
| Cross-file dependencies change, no test coverage, large blast radius, or user says "refactor this module/codebase" | Planned | High risk — write an execution card so user can review before you start |
| Not a code smell (algorithmic problem, runtime bug, feature request) | Redirect | Wrong tool — suggest /ship:dev or /ship:auto |
Lens-specific classification guidance: classification determines the quick vs planned path, not the execution order within a path. Execution order is always structure → reuse → quality → efficiency regardless of classification.
A 500-line god function is planned even though it's one file. A 3-file rename of duplicated utils is quick even though it's cross-file. Classify by risk, not by file boundaries.
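The table and the two examples above can be read as a small decision function. This is a sketch under stated assumptions: the signal names are hypothetical, and the 3-file rename is treated as "cross-file but no dependency change", which is one reading of the example.

```python
from dataclasses import dataclass


@dataclass
class RiskSignals:
    is_code_smell: bool = True  # False for algorithmic problems, runtime bugs, feature requests
    cross_file_dependencies_change: bool = False
    has_test_coverage: bool = True
    large_blast_radius: bool = False
    user_asked_for_module_or_codebase: bool = False


def classify(signals: RiskSignals) -> str:
    """Return "redirect", "planned", or "quick" based on risk, not file count."""
    if not signals.is_code_smell:
        return "redirect"
    if (
        signals.cross_file_dependencies_change
        or not signals.has_test_coverage
        or signals.large_blast_radius
        or signals.user_asked_for_module_or_codebase
    ):
        return "planned"
    return "quick"
```

Note that file count never appears as an input: a single-file god function sets `large_blast_radius`, while a multi-file rename with tests sets nothing.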
Output: [Refactor] Scope: <files>. Classification: <quick|planned|redirect>. Findings: <N> (structure: <n>, reuse: <n>, quality: <n>, efficiency: <n>).
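A minimal sketch of producing that line from a list of per-finding lens labels; `scope_line` is a hypothetical helper, not part of the skill.

```python
from collections import Counter

LENSES = ("structure", "reuse", "quality", "efficiency")


def scope_line(files, classification, finding_lenses):
    """Format the scope summary; finding_lenses holds one lens name per finding."""
    counts = Counter(finding_lenses)  # missing lenses count as 0
    per_lens = ", ".join(f"{lens}: {counts[lens]}" for lens in LENSES)
    return (
        f"[Refactor] Scope: {', '.join(files)}. "
        f"Classification: {classification}. "
        f"Findings: {len(finding_lenses)} ({per_lens})."
    )
```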
Fix in this order: structure → reuse → quality → efficiency. Each category leaves the code in a better state for the next.
Within each category, order smells simplest first.
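The ordering rule amounts to a two-key sort. In this sketch, `effort` is a hypothetical 1-5 estimate standing in for "simplest first"; the skill does not define a numeric scale.

```python
LENS_ORDER = ("structure", "reuse", "quality", "efficiency")


def execution_order(findings):
    """Sort (lens, smell, effort) tuples by lens order, then by lowest effort."""
    return sorted(findings, key=lambda f: (LENS_ORDER.index(f[0]), f[2]))
```

For example, a structure finding with effort 4 still runs before a quality finding with effort 1, because lens order takes precedence over simplicity.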
Low-risk findings with existing test coverage. No spec file. Direct edits.
Form micro-plan (in memory):
Fix one smell family at a time. Apply the technique from references/smell-catalog.md.
After each batch: run verify. If fail: revert, skip to next smell.
After all smells: run full verify. Report results.
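The quick-path loop can be sketched as follows. All four callables are assumptions supplied by the caller: in practice `verify` would run the test suite and `checkpoint`/`revert` would map to version-control operations, but the skill does not prescribe specific commands.

```python
from typing import Callable


def run_quick_path(
    batches: list[tuple[str, Callable[[], None]]],
    verify: Callable[[], bool],
    checkpoint: Callable[[], None],
    revert: Callable[[], None],
) -> tuple[list[str], list[str], bool]:
    """Apply one smell family at a time; revert and skip any batch that fails verify."""
    fixed: list[str] = []
    skipped: list[str] = []
    for name, apply_fix in batches:
        apply_fix()
        if verify():
            checkpoint()  # record the new last-passing state
            fixed.append(name)
        else:
            revert()      # back to the last-passing state
            skipped.append(name)
    return fixed, skipped, verify()  # final full verify
```

A failed batch never blocks the ones after it: the loop restores the last passing state and moves on, and the skipped names feed the Deferred row of the report.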
High-risk changes. Write an execution card first, get alignment, then execute.
Write execution card:
- See references/structural-card.md for the template (45-60 lines).
- See references/rescue-playbook.md for the full 8-step process.
- With a task_id: write the card to .ship/tasks/<task_id>/refactor/spec.md and proceed.
- Without one: write it to .ship/refactor-card.md (no task_id needed) and show the card to the user via AskUserQuestion before executing.

If no test coverage for the code being changed: write characterization tests first.
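A characterization test pins what the code does today, quirks included, so the refactor has something to fail against. A minimal sketch using `unittest`; `legacy_shipping_cost` is a hypothetical stand-in for the untested code being changed.

```python
import unittest


def legacy_shipping_cost(weight_kg: float) -> float:
    # Hypothetical legacy function standing in for the code about to be refactored.
    if weight_kg <= 0:
        return 0.0
    if weight_kg < 5:
        return 4.99
    return 4.99 + (weight_kg - 5) * 0.75


class CharacterizeShippingCost(unittest.TestCase):
    # Expected values are captured by running the current code,
    # not derived from a spec. They document behavior, right or wrong.
    def test_pins_current_behavior(self):
        cases = {-1: 0.0, 0: 0.0, 1: 4.99, 5: 4.99, 9: 7.99}
        for weight, expected in cases.items():
            with self.subTest(weight=weight):
                self.assertAlmostEqual(legacy_shipping_cost(weight), expected)
```

Run it with `python -m unittest` before touching the code, then again after each step of the execution card.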
Execute in order: Structure → Reuse → Quality → Efficiency. Run tests after each step. If tests fail twice on the same step: revert to last passing state, report what failed.
After all changes: run full verify. Report results.
Output the report card (read skills/shared/report-card.md for the standard format):
## [Refactor] Report Card
| Field | Value |
|-------|-------|
| Status | <DONE / BLOCKED> |
| Summary | <N> smells fixed across <L> lenses, <M> lines saved |
### Metrics
| Metric | Value |
|--------|-------|
| Structure fixes | <N> (extracted: <n>, consolidated: <n>, dead code: <n> lines) |
| Reuse fixes | <N> (replaced with existing utility) |
| Quality fixes | <N> (strings→constants: <n>, comments removed: <n>, naming: <n>, other: <n>) |
| Efficiency fixes | <N> (batched: <n>, hoisted: <n>, projected: <n>, other: <n>) |
| Lines before/after | <N> → <M> |
| Files touched | <N> |
| Tests | <passed / failed / none> |
| Deferred | <smells outside scope> |
### Artifacts
| File | Purpose |
|------|---------|
| .ship/tasks/<task_id>/refactor/spec.md | Execution card (planned path only) |
### Next Steps
1. **Ship** — /ship:handoff to create the PR (pipeline continues here after simplify)
2. **Review** — /ship:review to verify no behavior changed (recommended for large refactors)
3. **Continue** — /ship:refactor on remaining deferred smells