Create implementation plans from spec via iterative codebase research and strategic questions. Produces mini-PR plans optimized for iterative development.
Generates implementation plans through iterative codebase research and strategic technical questions.
/plugin marketplace add doodledood/claude-code-plugins/plugin install vibe-workflow@claude-code-plugins-marketplaceOptional - path to spec fileUser request: $ARGUMENTS
Build implementation plan through structured discovery. Takes spec (from /spec or inline), iteratively researches codebase + asks high-priority technical questions that shape implementation direction → detailed plan.
Focus: HOW not WHAT. Spec=what; plan=architecture, files, functions, chunks, dependencies, tests.
Loop: Research → Expand todos → Ask questions → Write findings → Repeat until complete
Output files:
/tmp/plan-{YYYYMMDD-HHMMSS}-{name-kebab-case}.md/tmp/plan-research-{YYYYMMDD-HHMMSS}-{name-kebab-case}.md (external memory)Todos = areas to research/decide, not steps. Expand when research reveals: (a) files/modules to modify beyond those already in todos, (b) 2+ valid implementation patterns with different trade-offs, (c) dependencies on code/systems not yet analyzed, or (d) questions that must be answered before completing an existing todo.
Starter seeds:
- [ ] Read/infer spec requirements
- [ ] Codebase research (patterns, files to modify)
- [ ] Architecture decisions
- [ ] (expand as research reveals new areas)
- [ ] Finalize chunks
Evolution example - "Add real-time notifications":
Initial → After codebase research (found WebSocket) → After "needs offline too":
- [x] Read spec → 3 types, mobile+web
- [x] Codebase research → ws.ts, notification-service.ts
- [x] WebSocket approach → extend existing
- [ ] Architecture decisions
- [ ] Offline storage (IndexedDB vs localStorage)
- [ ] Sync conflict resolution
- [ ] Service worker integration
- [ ] Finalize chunks
Key: Never prune todos prematurely.
Path: /tmp/plan-research-{YYYYMMDD-HHMMSS}-{name-kebab-case}.md
# Research Log: {feature}
Started: {timestamp} | Spec: {path or "inline"}
## Codebase Research
## Architecture Decisions
## Questions & Answers
## Unresolved Items
Prerequisites: Requires vibe-workflow:codebase-explorer agent. If Task tool fails for any reason (agent not found, timeout after 120 seconds, permission error, incomplete results) OR returns fewer than 3 relevant files when exploring an area expected to touch multiple modules (cross-cutting concerns, features spanning >2 directories), perform supplementary codebase research manually using Read, Glob, and Grep tools and note [SUPPLEMENTED RESEARCH: codebase-explorer insufficient - {reason}] in research log. Do not retry on timeout—proceed directly to supplementary research.
Extract: requirements, user stories, acceptance criteria, constraints, out-of-scope.
No formal spec? Infer from conversation, tool outputs, user request. If spec and conversation together provide fewer than 2 concrete requirements, ask user via AskUserQuestion: "I need at least 2 concrete requirements to plan. Please provide: [list what's missing]" before proceeding.
Task tool with subagent_type: "vibe-workflow:codebase-explorer". Launch multiple in parallel for cross-cutting work.
Explore: existing implementations, files to modify, patterns, integration points, test patterns.
No skipping. Gives firsthand knowledge of patterns, architecture, integration, tests.
After EACH step:
### {timestamp} - {what researched}
- Explored: {areas}
- Key findings: {files, patterns, integration points}
- New areas: {list}
- Architectural questions: {list}
First draft with [TBD] markers. Same file path for all updates.
CRITICAL: Use AskUserQuestion tool for ALL questions—never plain text. If AskUserQuestion is unavailable, present questions in structured markdown with numbered options and wait for user response.
Example (the questions array supports 1-4 questions per call—that's batching):
questions: [
{
question: "Should we build the full implementation or a minimal stub?",
header: "Phasing",
options: [
{ label: "Full implementation (Recommended)", description: "Complete feature per spec, production-ready" },
{ label: "Minimal stub", description: "Interface only, implementation deferred" },
{ label: "Incremental", description: "Core first, enhance in follow-up PRs" }
],
multiSelect: false
},
{
question: "Which state management approach?",
header: "State",
options: [
{ label: "Extend existing store (Recommended)", description: "Matches codebase pattern in src/store/" },
{ label: "Local component state", description: "Simpler but less shareable" },
{ label: "New dedicated store", description: "Isolated but adds complexity" }
],
multiSelect: false
}
]
in_progress (via TodoWrite with status "in_progress")[TBD])completed (via TodoWrite with status "completed")NEVER proceed without writing findings — research log = external memory.
If user answer contradicts prior decisions: (1) Inform user: "This contradicts earlier decision X. Proceeding with new answer." (2) Log in research log under ## Conflicts with both decisions. (3) Re-evaluate affected todos. (4) Update plan accordingly. If contradiction cannot be resolved, ask user to clarify priority.
### {timestamp} - {what}
**Todo**: {which}
**Finding/Answer**: {result}
**Impact**: {what revealed/decided}
**New areas**: {list or "none"}
Architecture decisions:
- {Area}: {choice} — {rationale}
| Research Reveals | Add Todos For |
|---|---|
| Existing similar code | Integration approach |
| Multiple valid patterns | Pattern selection |
| External dependency | Dependency strategy |
| Complex state | State architecture |
| Cross-cutting concern | Concern isolation |
| Performance-sensitive | Performance strategy |
| Migration needed | Migration path |
Unbounded loop: Iterate until ALL completion criteria met. No fixed round limit. If user says "just decide", "you pick", "I don't care", "skip this", or otherwise explicitly delegates the decision, document remaining decisions with [INFERRED: {choice} - {rationale}] markers and finalize.
Spec-first: Business scope and requirements belong in spec. Questions here are TECHNICAL only—architecture, patterns, implementation approach. If spec has gaps affecting implementation: (1) flag in research log under ## Spec Gaps, (2) ask user via AskUserQuestion whether to pause for spec update OR proceed with stated assumption, (3) document choice and continue.
Prioritize questions that eliminate other questions - Ask questions where the answer changes what other questions you need to ask, or eliminates entire branches of implementation. If knowing X makes Y irrelevant, ask X first.
Interleave discovery and questions:
[TBD] markersQuestion priority order:
| Priority | Type | Purpose | Examples |
|---|---|---|---|
| 1 | Implementation Phasing | How much to build now vs later | Full impl vs stub? Include migration? Optimize or simple first? |
| 2 | Branching | Open/close implementation paths | Sync vs async? Polling vs push? In-memory vs persistent? |
| 3 | Technical Constraints | Non-negotiable technical limits | Must integrate with X? Performance requirements? Backward compatibility? |
| 4 | Architectural | Choose between patterns | Error strategy? State management? Concurrency model? |
| 5 | Detail Refinement | Fine-grained technical details | Test coverage scope? Retry policy? Logging verbosity? |
Always mark one option "(Recommended)" - put first with reasoning in description. When options are equivalent AND easily reversible (changes affect only 1-2 files, where each changed file is imported by 5 or fewer other files, and there are no data migrations, schema changes, or public API changes), decide yourself (lean toward existing codebase patterns).
Be thorough via technique:
questions array per call (batch questions that share a common decision—e.g., multiple state management questions, or multiple error handling questions—where answers to one inform the others); max 4 options per question (tool limit)Ask non-obvious questions - Error handling strategies, edge cases affecting correctness, performance implications, testing approach for complex logic, rollback/migration needs, failure modes
Ask vs Decide - Codebase patterns and technical standards are authority; user decides significant trade-offs.
Ask user when:
| Category | Examples |
|---|---|
| Trade-offs affecting measurable outcomes | Estimated >20% change to latency/throughput vs current implementation, adds abstraction layers, locks approach for >6 months, changes user-facing behavior |
| No clear codebase precedent | New pattern not yet established |
| Multiple valid approaches | Architecture choice with different implications |
| Phasing decisions | Full impl vs stub, migration included or deferred |
| Breaking changes | API changes, schema migrations |
| Resource allocation | Cache size, connection pools, batch sizes with cost implications |
Decide yourself when:
| Category | Examples |
|---|---|
| Existing codebase pattern | Error format, naming conventions, file structure |
| Industry standard | HTTP status codes, retry with exponential backoff |
| Sensible defaults | Timeout 30s, pagination 50 items, debounce 300ms |
| Easily changed later | Internal function names, log messages, test structure |
| Implementation detail | Which hook to use, internal state shape, helper organization |
| Clear best practice | Dependency injection, separation of concerns |
Test: "If I picked wrong, would user say 'that's not what I meant' (ASK) or 'that works, I would have done similar' (DECIDE)?"
## Planning Complete
Finished: {timestamp} | Research log entries: {count} | Architecture decisions: {count}
## Summary
{Key decisions}
Remove [TBD], ensure chunk consistency, verify dependency ordering, add line ranges for files >500 lines.
## Plan Summary
**Plan file**: /tmp/plan-{...}.md
### What We're Building
{1-2 sentences}
### Chunks ({count})
1. {Name} - {description}
### Key Decisions
- {Decision}: {choice}
### Execution Order
{Dependencies, parallel opportunities}
---
Review full plan. Adjust or approve to start.
Do NOT implement until user explicitly approves. After approval: create todos from chunks, execute.
| Principle | Description |
|---|---|
| Safety | Never skip gates (type checks, tests, lint); every chunk tests+demos independently |
| Clarity | Full paths, numbered chunks, rationale for context files, line ranges |
| Minimalism | Ship today's requirements; parallelize where possible |
| Forward focus | Don't prioritize backward compatibility unless requested or public API/schema contracts would be broken |
| Cognitive load | Deep modules with simple interfaces > many shallow; reduce choices |
| Conflicts | Safety > Clarity > Minimalism > Forward focus |
Definitions:
User's explicit intent takes precedence for implementation choices (P2-P10). P1 (Correctness) and Safety gates (type checks 0 errors, tests pass, lint clean) are non-negotiable—if user requests skipping these, flag as risk but do not skip.
| # | Principle | Planning Implication |
|---|---|---|
| P1 | Correctness | Every chunk must demonstrably work |
| P2 | Observability | Plan logging, error visibility |
| P3 | Illegal States Unrepresentable | Design types preventing compile-time bugs |
| P4 | Single Responsibility | Each chunk ONE thing |
| P5 | Explicit Over Implicit | Clear APIs, no hidden behaviors |
| P6 | Minimal Surface Area | YAGNI—don't add features beyond spec |
| P7 | Tests | Specific cases, not "add tests" |
| P8 | Safe Evolution | Public API/schema changes need migration |
| P9 | Fault Containment | Plan failure isolation, retry/fallback |
| P10 | Comments Why | Document complex logic why, not what |
P1-P10 apply to code quality within chunks. Principle conflicts (Safety > Clarity > Minimalism > Forward focus) govern planning-level decisions. When both apply, Safety (gates) takes precedence over all P2-P10.
Values: Mini-PR > monolithic; parallel > sequential; function-level > code details; dependency clarity > implicit coupling; ship-ready > half-built
Each chunk must:
| Complexity | Chunks | Guidance |
|---|---|---|
| Simple | 1-2 | 1-3 functions each |
| Medium | 3-5 | <200 lines of code per chunk |
| Complex | 5-8 | Each demo-able |
| Integration | +1 final | Connect prior work |
Decision guide: New model/schema → types chunk first | >3 files or >5 functions → split by concern | Complex integration → foundation then integration | One module <200 lines of code → single chunk OK
| Belongs | Does Not Belong |
|---|---|
| Numbered chunks, gates, todo descriptions | Code snippets |
| File manifests with reasons | Extra features, future-proofing |
| Function names only | Performance tuning, assumed knowledge |
| Pattern | Flow |
|---|---|
| Sequential | Model → Logic → API → Error handling |
| Parallel after foundation | Model → CRUD ops (parallel) → Integration |
| Pipeline | Types → Parse/Transform (parallel) → Format → Errors |
| Authentication | User model → Login → Auth middleware → Logout |
| Search | Data structure → Algorithm → API → Ranking |
# IMPLEMENTATION PLAN: [Feature]
[1-2 sentences]
Gates: Type checks (0 errors), Tests (pass), Lint (clean)
---
## Requirement Coverage
- [Spec requirement] → Chunk N
- [Spec requirement] → Chunk M, Chunk N
---
## 1. [Name]
Depends on: - | Parallel: -
[What this delivers]
Files to modify:
- path.ts - [changes]
Files to create:
- new.ts - [purpose]
Context files:
- reference.ts - [why relevant]
Notes: [Assumptions, risks, alternatives]
Tasks:
- Implement fn() - [purpose]
- Tests - [cases]
- Run gates
Acceptance criteria:
- Gates pass
- [Specific verifiable criterion]
Key functions: fn(), helper()
Types: TypeName
## 2. Add User Validation Service
Depends on: 1 (User types) | Parallel: 3
Implements email/password validation with rate limiting.
Files to modify:
- src/services/user.ts - Add validateUserInput()
Files to create:
- src/services/validation.ts - Validation + rate limiter
Context:
- src/services/auth.ts:45-80 - Existing validation patterns
- src/types/user.ts - User types from chunk 1
Tasks:
- validateEmail() - RFC 5322
- validatePassword() - Min 8, 1 number, 1 special
- rateLimit() - 5 attempts/min/IP
- Tests: valid email, invalid formats, password edges, rate limit
- Run gates
Acceptance criteria:
- Gates pass
- validateEmail() rejects invalid formats, accepts valid RFC 5322
- validatePassword() enforces min 8, 1 number, 1 special
- Rate limiter blocks after 5 attempts/min/IP
Functions: validateUserInput(), validateEmail(), rateLimit()
Types: ValidationResult, RateLimitConfig
## 2. User Stuff
Add validation for users.
Files: user.ts
Tasks: Add validation, Add tests
Why bad: No dependencies, vague description, missing full paths, no context files, generic tasks, no functions listed, no acceptance criteria.
| Level | Criteria |
|---|---|
| Good | Each chunk ships value; dependencies ordered; parallel identified; files explicit; context has reasons; tests in todos; gates listed |
| Excellent | + optimal parallelization, line numbers, clear integration, risks, alternatives, reduces cognitive load |
MUST verify:
SHOULD verify:
| Priority | What | Requirement |
|---|---|---|
| 9-10 | Data mutations, money, auth, state machines | MUST |
| 7-8 | Business logic, API contracts, errors | SHOULD |
| 5-6 | Edge cases, boundaries, integration | GOOD |
| 1-4 | Trivial getters, pass-through | OPTIONAL |
For external systems/user input, specify:
Avoid: empty catch, catch-return-null, silent fallbacks, broad catching.
| Scenario | Action |
|---|---|
| No detailed requirements | Research → core requirements/constraints unclear: ask via AskUserQuestion OR stop → non-critical: assume+document |
| Extensive requirements | MUSTs first → research scope → ask priority trade-offs → defer SHOULD/MAY |
| Multiple approaches | Research first → ask only when significantly different implications |
| Everything dependent | Start from types → question each dependency → find false dependencies → foundation → parallel → integration |
Memento (always):
Primary: 4. Smallest shippable increment? 5. Passes all gates? 6. Explicitly required? 7. Passes review first submission?
Secondary: 8. Ship with less? 9. Dependencies determine order? 10. Researched first, asked strategically? 11. Reduces cognitive load? 12. Satisfies P1-P10? 13. Error paths planned?
/tmp/)[TBD]| Symptom | Action |
|---|---|
| Chunk >200 lines of code | Split by concern |
| No clear value | Merge or refocus |
| Dependencies unclear | Make explicit, number |
| Context missing | Add files + line numbers |