Ship Workflow
Help the user ship code while building deep, lasting understanding of their changes and codebase.
Step 1: Analyze Changes
- Run `git diff` (or `git diff --staged`) to see all changes
- Read changed files AND surrounding context:
- What files import/depend on these changes?
- What's the data flow through these components?
- What patterns/libraries are being used?
- If new libraries/APIs, WebSearch for docs and best practices
- Build a mental model: WHAT changed, WHY it matters, HOW it connects
Step 2: Code Review
Present a thorough review that checks:
- Bugs: Edge cases, null checks, error handling, race conditions
- Security: Input validation, injection risks, auth issues
- Performance: Unnecessary work, memory leaks, N+1 queries
- Design: SOLID principles, separation of concerns, testability
Provide a summary, specific issues with file:line references, and concrete suggestions.
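For example, a single finding might be reported like this (the path, line, and function name are illustrative, not from the user's code):
⚠️ `lib/features/session/session_service.dart:42` - `startSession()` awaits a network call with no error handling, so an offline user sees a spinner forever. Suggestion: wrap the call in try/catch and surface an error state to the UI.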
Step 3: Structure Review
Analyze whether the changed/new code follows the existing codebase's organizational patterns and separation of concerns.
3.1: Learn the Codebase Patterns
Before suggesting changes, understand how THIS codebase is organized:
- Map the directory structure using Glob to understand the architecture:
- What are the top-level directories? (`lib/`, `src/`, `app/`, etc.)
- How are features organized? (by feature, by layer, hybrid?)
- Where do different concerns live? (UI, business logic, data, utilities)
- Identify naming conventions:
- File naming: `snake_case.dart`, `PascalCase.tsx`, or `kebab-case.js`?
- Class/function naming patterns
- Suffix conventions: `*_screen.dart`, `*_controller.dart`, `*_repository.dart`?
- Recognize layer boundaries (reference: STRUCTURE_PATTERNS.md):
- Presentation: widgets, screens, controllers (state management)
- Application: services (orchestration logic)
- Domain: models, business logic, validation
- Data: repositories, data sources, DTOs
- Shared: common_widgets, utils, constants
- Note existing patterns:
- State management approach (Riverpod, Bloc, Redux, etc.)
- Dependency injection patterns
- How similar features are structured
3.2: Analyze Changed Files
For each new/modified file, evaluate:
Location Check
- Is this file in the right directory based on its responsibility?
- Does it follow the feature/layer organization pattern?
- Would a new developer know where to find this?
Separation of Concerns
- UI Pollution: Is business logic mixed into UI components? (see the sketch after these checks)
- Data Leakage: Is data access code mixed with presentation?
- God Files: Is this file doing too many unrelated things?
- Cross-cutting: Are concerns properly separated or tangled?
Single Responsibility (File Level)
- Does this file have ONE clear purpose?
- Could you describe what this file does in one sentence?
- If it has multiple responsibilities, how should they split?
Pattern Consistency
- Does it follow the same patterns as similar existing code?
- Are naming conventions consistent?
- Does it integrate cleanly with existing architecture?
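To make the UI Pollution and Single Responsibility checks concrete, here is a minimal Dart sketch; every name in it is hypothetical and stands in for whatever the diff actually touches:

```dart
import 'package:flutter/material.dart';

// Smell: the business rule "a session is active if it has not ended and has a
// positive duration" is buried inside the widget's build method.
class SessionListScreen extends StatelessWidget {
  const SessionListScreen({super.key, required this.sessions});

  final List<Session> sessions;

  @override
  Widget build(BuildContext context) {
    final active = sessions
        .where((s) => s.endedAt == null && s.duration > Duration.zero)
        .toList();
    return ListView(children: [for (final s in active) Text(s.title)]);
  }
}

// Fix: the rule moves into the domain model, where it is testable without
// pumping widgets; the build method shrinks to sessions.where((s) => s.isActive).
class Session {
  const Session({required this.title, required this.duration, this.endedAt});

  final String title;
  final Duration duration;
  final DateTime? endedAt;

  bool get isActive => endedAt == null && duration > Duration.zero;
}
```

The same reasoning applies to data leakage: the widget should not know whether sessions come from a REST API or a local cache; that knowledge belongs in a repository.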
3.3: Structure Recommendations
If issues are found, present:
1. Issue Summary
List each structural issue found:
⚠️ `lib/screens/session_handler.dart`
- Contains both UI widgets AND session management logic
- Session logic should live in a controller/service
⚠️ `lib/utils/api_service.dart`
- Handles auth, user data, AND analytics
- Should be split by domain
2. Proposed Structure
Show a tree diagram following the feature-first pattern (see STRUCTURE_PATTERNS.md):
📁 Proposed Structure:
lib/src/
├── features/
│ └── session/
│ ├── presentation/
│ │ └── session_screen.dart ← UI only
│ ├── application/
│ │ └── session_service.dart ← Orchestration logic
│ ├── domain/
│ │ └── session_model.dart ← Business logic
│ └── data/
│ └── session_repository.dart ← Data access (extracted)
├── common_widgets/ ← Shared UI components
└── utils/ ← Shared utilities
3. Migration Path
For each suggested change, explain:
- WHAT to move/split
- WHERE it should go
- WHY this improves the structure
- HOW it aligns with existing codebase patterns
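For example (all names hypothetical, building on the `api_service.dart` issue above): WHAT - extract the auth-related methods into their own class; WHERE - `lib/src/features/auth/data/auth_repository.dart`; WHY - callers that only need auth no longer pull in analytics code; HOW - it mirrors how the session feature keeps its repository under `data/` in the proposed tree.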
3.4: User Decision
Use AskUserQuestion to present options:
"Structure Review Complete"
Options:
- Apply all - Restructure files as suggested
- Apply some - Let me choose which changes to apply
- Skip - Keep current structure, continue to quiz
- Discuss - I have questions about the suggestions
3.5: Apply Restructuring (if approved)
If user approves:
- Execute the restructuring:
- Move files to new locations
- Split files if needed (extract classes/functions into new files)
- Update imports across the codebase (see the example after this list)
- Ensure no broken references
- Verify changes:
- Run any available linter/analyzer (`flutter analyze`, `eslint`, etc.)
- Check for import errors
- Proceed to the commit message (Step 4)
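For instance, after extracting the hypothetical `session_repository.dart`, a call site that previously imported `package:app/screens/session_handler.dart` for data access would instead import `package:app/src/features/session/data/session_repository.dart` (the `app` package name is an assumption for illustration).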
When to Skip Structure Review
Skip this step if:
- Changes are minor (single-line fixes, typos)
- Changes are only to existing files without new structural elements
- User explicitly requests to skip (`/ship --no-structure`)
Mention briefly: "No structural concerns with these changes" and proceed to commit message.
Step 4: Commit Message
Generate a concise commit message covering all changes (including any structural improvements applied):
- Imperative mood: "Add", "Fix", "Update"
- Max 50 chars first line
- Specific but concise
- Focus on WHAT the feature/fix accomplishes, not the refactoring details
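For example (illustrative only): for a diff that adds session timeout handling plus the restructuring above, a first line like `Add session timeout handling to auth flow` stays well under the 50-character limit and describes the outcome, whereas `Refactor api_service and move session files` only describes the mechanics.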
Present to user. They commit manually.
Step 5: Understanding Quiz
Research Foundation
This quiz is designed using evidence-based learning science:
- Testing Effect: Retrieving information strengthens memory more than passive review
- Elaborative Interrogation: "Why/How" questions force deeper processing and schema building
- Bloom's Taxonomy: Progress through cognitive levels from understanding → analysis → evaluation → transfer
- Mental Model Building: Help develop expert-like representations with hierarchical structure, pattern recognition, and goal mapping
- Desirable Difficulties: Effortful retrieval improves long-term retention
- Self-Explanation Effect: Translating code to explanations builds comprehension
Quiz Setup
Ask the user using AskUserQuestion:
"How deep do you want to go?"
- Quick (2-3 questions) - Hit the key points
- Standard (5 questions) - Solid understanding
- Deep (8-10 questions) - Thorough mastery, recommended for complex changes
Suggest depth based on change complexity if they say "you decide."
Question Framework: Bloom's Taxonomy for Code
Progress through these levels, spending more time on higher levels:
Level 1: TRACE (Understand)
"Can you follow what this code does?"
Questions that test comprehension of the actual execution:
- "Walk me through what happens when [function] is called with [input]"
- "What is the value of [variable] after line X executes?"
- "In what order do these operations occur?"
Why this matters: Code tracing is foundational - it must be learned before code writing.
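A concrete example of a Trace question, using a hypothetical Dart snippet rather than the user's real diff:

```dart
void main() {
  final sessionMinutes = <int>[25, 0, 40];
  final activeSeconds = sessionMinutes.where((m) => m > 0).map((m) => m * 60);
  print(activeSeconds.length); // what does this print?
}
```

Question: "What does this print, and at which point does the filtering actually run?" (It prints 2; `where` and `map` return lazy iterables, so nothing is filtered until `.length` iterates the result.)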
Level 2: CONNECT (Apply/Analyze)
"How does this fit into the bigger picture?"
Questions that build the mental model of how code connects:
- "What other files/components call this code?"
- "If this function returns an error, what happens to the UI?"
- "Trace the data flow from [user action] to [this code] to [result]"
- "What state does the app need to be in for this code to run?"
Why this matters: Experts have "explicit mapping of code to goals" and "connection of knowledge" - novices lack this.
Level 3: EXPLAIN (Analyze)
"Why does this work the way it does?"
Elaborative interrogation - the most powerful learning technique:
- "WHY did you use [pattern/approach] instead of [alternative]?"
- "WHY is this code located in [this file] rather than [elsewhere]?"
- "HOW does [library/API] work under the hood here?"
- "WHAT trade-off did this design choice make?"
Why this matters: Asking "why" forces integration of new info with prior knowledge and builds schemas.
Level 4: EVALUATE (Evaluate)
"Is this good code? How could it be better?"
Questions that develop critical judgment:
- "What's the weakest part of this implementation?"
- "Does this follow the [relevant principle - SRP, DRY, etc.]? Why or why not?"
- "What edge case could break this?"
- "Is this code easy to test? What makes it easy or hard?"
- "If a new developer read this, what would confuse them?"
Why this matters: Evaluation is a higher-order skill that separates experts from novices.
Level 5: TRANSFER (Create/Transfer)
"How would you extend or modify this?"
Questions that test ability to apply knowledge to new situations:
- "If we needed to add [new feature], what would need to change?"
- "What if [requirement X] changed - how would you modify this?"
- "How would you refactor this to support [new use case]?"
- "If this needed to handle 10x the load, what would break first?"
Why this matters: Transfer questions develop the ability to apply learning to new contexts - the ultimate goal.
How to Quiz
Execution principles based on learning science:
- One question at a time (don't overwhelm working memory)
- Use AskUserQuestion with multiple choice including:
- The correct answer
- 2-3 plausible wrong answers (things a developer might reasonably think)
- Make wrong answers educational - they should represent common misconceptions
- Interleave question types (don't group all similar questions)
- Mix Trace → Connect → Explain → Evaluate throughout
- This is harder but produces better long-term retention
- After each answer, provide rich feedback:
- If correct: Reinforce WHY, connect to broader principles, praise the reasoning
- If incorrect: Don't just give the answer - EXPLAIN it:
- Trace through the code showing why the correct answer is right
- Explain the misconception that led to the wrong answer
- Connect to the codebase: "This matters because..."
- Build progressively within a session:
- Start with Trace/Connect (foundation)
- Move to Explain/Evaluate (deeper)
- End with Transfer (synthesis)
- Encourage self-explanation:
- For complex questions, ask them to explain their reasoning
- "Why do you think that?" before revealing if correct/incorrect
- Translation exercise: "Explain in plain English what line X does"
Question Generation Guidelines
When generating questions, ensure they are:
- Specific to the actual changes - Reference real file names, function names, variables
- Connected to the codebase - Ask about how changes interact with existing code
- Progressively harder - Build from comprehension to evaluation
- Educational even if wrong - Wrong answers should teach something
- Practical - Focus on things that matter for maintaining this code
Bad question: "What design pattern is this?"
Good question: "Your AuthChecker uses Riverpod's .when() pattern. What happens to the UI if authStateChangesProvider emits an error?"
Bad question: "Is this code good?"
Good question: "The _handleStartSession function does permission checking, state management, and navigation. Does this violate Single Responsibility? What's the trade-off of keeping vs splitting it?"
Post-Quiz Summary
After all questions, provide:
- Score and patterns: What they got right/wrong, any patterns in mistakes
- Mental model reinforcement:
- "Here's how this feature fits into the app architecture..."
- Draw connections between different parts they were quizzed on
- Key insights: The 2-3 most important things to remember about these changes
- Further exploration: Areas of the codebase to explore to deepen understanding
- Spaced repetition hook: "Next time you're in this area of the code, try to recall [X]"
Step 6: Documentation (Optional)
Ask if they want to push docs to Confluence.
If yes:
- Ask which space/page
- Ask what problem this solves
- Write docs:
Problem: [1-2 sentences]
Solution: [1-2 sentences]
Overview
[Brief description]
How It Works
[Key functionality + code paths]
Usage
[Example code or steps]
Notes
[Edge cases, limitations]
- Show draft for approval
- Push to Atlassian