Executes complex tasks through sequential sub-agent orchestration with intelligent model selection, context passing between steps, Zero-shot CoT reasoning, and mandatory self-critique verification
Installation:

`/plugin marketplace add NeoLabHQ/context-engineering-kit`

`/plugin install sadd@context-engineering-kit`

Argument: task description (e.g., "Refactor UserService class and update all consumers")

CRITICAL: You are the orchestrator - you MUST NOT perform the subtasks yourself. Your role is to:
- Decompose the task into sequential subtasks (Phase 1)
- Select a model and, where useful, a specialized agent for each subtask (Phase 2)
- Dispatch sub-agents and pass relevant context between steps (Phase 3)
- Validate each step's output and synthesize the final report

NEVER:
- Implement subtasks yourself
- Read implementation files directly
- Make direct changes to the codebase

ALWAYS:
- Delegate every subtask to a sub-agent via the Task tool
- Pass forward only the context the next step needs
- Verify each step's output before proceeding
Any deviation from orchestration (attempting to implement subtasks yourself, reading implementation files, or making direct changes) will result in context pollution and, ultimately, failure - as a result, you will be fired!
Analyze the task systematically using Zero-shot Chain-of-Thought reasoning:
Let me analyze this task step by step to decompose it into sequential subtasks:
1. **Task Understanding**
"What is the overall objective?"
- What is being asked?
- What is the expected final outcome?
- What constraints exist?
2. **Identify Natural Boundaries**
"Where does the work naturally divide?"
- Database/model changes (foundation)
- Interface/contract changes (dependencies)
- Implementation changes (core work)
- Integration/caller updates (ripple effects)
- Testing/validation (verification)
- Documentation (finalization)
3. **Dependency Identification**
"What must happen before what?"
- "If I do B before A, will B break or use stale information?"
- "Does B need any output from A as input?"
- "Would doing B first require redoing work after A?"
- What is the minimal viable ordering?
4. **Define Clear Boundaries**
"What exactly does each subtask encompass?"
- Input: What does this step receive?
- Action: What transformation/change does it make?
- Output: What does this step produce?
- Verification: How do we know it succeeded?
Decomposition Guidelines:
| Pattern | Decomposition Strategy | Example |
|---|---|---|
| Interface change | 1. Update interface, 2. Update implementations, 3. Update consumers | "Change return type of getUser" |
| Feature addition | 1. Add core logic, 2. Add integration points, 3. Add API layer | "Add caching to UserService" |
| Refactoring | 1. Extract/modify core, 2. Update internal references, 3. Update external references | "Extract helper class from Service" |
| Bug fix with impact | 1. Fix root cause, 2. Fix dependent issues, 3. Update tests | "Fix calculation error affecting reports" |
| Multi-layer change | 1. Data layer, 2. Business layer, 3. API layer, 4. Client layer | "Add new field to User entity" |
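To make the first row concrete: in a "change return type of getUser" task, the interface must move first or later steps build against a stale contract. A minimal TypeScript sketch of that ordering (all names are illustrative, not from a real codebase):

```typescript
interface User { id: string; name: string }
interface UserDTO { id: string; name: string }

// Step 1 - update the contract first; if implementations or consumers were
// updated before this, they would target a stale signature and need redoing.
interface UserService {
  getUser(id: string): Promise<UserDTO>; // was: Promise<User>
}

// Step 2 - implementations follow the new contract.
class UserServiceImpl implements UserService {
  async getUser(id: string): Promise<UserDTO> {
    return { id, name: "stub" }; // mapping logic added in this step
  }
}

// Step 3 - consumers update last, now that the contract and
// implementations agree on UserDTO.
```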
Decomposition Output Format:
## Task Decomposition
### Original Task
{task_description}
### Subtasks (Sequential Order)
| Step | Subtask | Depends On | Complexity | Type | Output |
|------|---------|------------|------------|------|--------|
| 1 | {description} | - | {low/med/high} | {type} | {what it produces} |
| 2 | {description} | Step 1 | {low/med/high} | {type} | {what it produces} |
| 3 | {description} | Steps 1,2 | {low/med/high} | {type} | {what it produces} |
...
### Dependency Graph
Step 1 ─→ Step 2 ─→ Step 3 ─→ ...
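The decomposition table can also be read as structured data. A minimal TypeScript sketch of the row shape (field names are assumptions for illustration, not part of the command):

```typescript
type Complexity = "low" | "med" | "high";

interface SubtaskRow {
  step: number;           // 1-based position in the sequence
  description: string;    // what this step does
  dependsOn: number[];    // steps whose output this step consumes; [] for step 1
  complexity: Complexity; // feeds model selection in Phase 2
  type: string;           // e.g. "Implementation", "Refactoring", "Testing"
  output: string;         // what this step produces for later steps
}

// The dependency graph "Step 1 -> Step 2 -> Step 3" expressed as data:
const plan: SubtaskRow[] = [
  { step: 1, description: "Update interface", dependsOn: [], complexity: "high", type: "Refactoring", output: "Updated interfaces" },
  { step: 2, description: "Update implementations", dependsOn: [1], complexity: "low", type: "Refactoring", output: "Updated implementations" },
  { step: 3, description: "Update callers", dependsOn: [2], complexity: "low", type: "Refactoring", output: "Updated caller files" },
];
```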
For each subtask, analyze and select the optimal model:
Let me determine the optimal configuration for each subtask:
For Subtask N:
1. **Complexity Assessment**
"How complex is the reasoning required?"
- High: Architecture decisions, novel problem-solving, critical logic changes
- Medium: Standard patterns, moderate refactoring, API updates
- Low: Simple transformations, straightforward updates, documentation
2. **Scope Assessment**
"How extensive is the work?"
- Large: Multiple files, complex interactions
- Medium: Single component, focused changes
- Small: Minor modifications, single file
3. **Risk Assessment**
"What is the impact of errors?"
- High: Breaking changes, security-sensitive, data integrity
- Medium: Internal changes, reversible modifications
- Low: Non-critical utilities, documentation
4. **Domain Expertise Check**
"Does this match a specialized agent profile?"
- Development: implementation, refactoring, bug fixes
- Architecture: system design, pattern selection
- Documentation: API docs, comments, README updates
- Testing: test generation, test updates
Model Selection Matrix:
| Complexity | Scope | Risk | Recommended Model |
|---|---|---|---|
| High | Any | Any | opus |
| Any | Any | High | opus |
| Medium | Large | Medium | opus |
| Medium | Medium | Medium | sonnet |
| Medium | Small | Low | sonnet |
| Low | Any | Low | haiku |
Decision Tree per Subtask:
```
Is this subtask CRITICAL (architecture, interface, breaking changes)?
|
+-- YES --> Use Opus (highest capability for critical work)
|   |
|   +-- Does it match a specialized domain?
|       +-- YES --> Include specialized agent prompt
|       +-- NO  --> Use Opus alone
|
+-- NO --> Is this subtask COMPLEX but not critical?
    |
    +-- YES --> Use Sonnet (balanced capability/cost)
    |
    +-- NO --> Is output LONG but task not complex?
        |
        +-- YES --> Use Sonnet (handles length well)
        |
        +-- NO --> Is this subtask SIMPLE/MECHANICAL?
            |
            +-- YES --> Use Haiku (fast, cheap)
            |
            +-- NO  --> Use Sonnet (default for uncertain)
```
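The matrix and decision tree reduce to a short lookup. A hedged TypeScript sketch (the evaluation order of the rules is an assumption about how overlapping rows resolve):

```typescript
type Complexity = "low" | "medium" | "high";
type Scope = "small" | "medium" | "large";
type Risk = "low" | "medium" | "high";
type Model = "opus" | "sonnet" | "haiku";

// Encodes the Model Selection Matrix above, checked top to bottom;
// anything not covered falls through to Sonnet, the stated default.
function selectModel(complexity: Complexity, scope: Scope, risk: Risk): Model {
  if (complexity === "high" || risk === "high") return "opus";
  if (complexity === "medium" && scope === "large" && risk === "medium") return "opus";
  if (complexity === "low" && risk === "low") return "haiku";
  return "sonnet"; // default for uncertain cases
}

// Example: a medium-complexity, large-scope, medium-risk subtask -> "opus".
const model = selectModel("medium", "large", "medium");
```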
Specialized Agents: The list of available specialized agents depends on the project and which plugins are loaded.
Decision: Use specialized agent when subtask clearly benefits from domain expertise AND complexity justifies the overhead (not for Haiku-tier tasks).
Selection Output Format:
## Model/Agent Selection
| Step | Subtask | Model | Agent | Rationale |
|------|---------|-------|-------|-----------|
| 1 | Update interface | opus | developer | Complex API design |
| 2 | Update implementations | sonnet | developer | Follow patterns |
| 3 | Update callers | haiku | - | Simple find/replace |
| 4 | Update tests | sonnet | tdd-developer | Test expertise |
Execute subtasks one by one, passing relevant context forward.
After each subtask completes, extract relevant context for subsequent steps:
Context to pass forward:
- Files created or modified (full paths)
- Interfaces, types, and patterns established
- Decisions that constrain later steps
- Warnings or open issues for subsequent steps
Context filtering:
- Pass contracts and entry points, not internal implementation details
- Omit anything a later step can read from the files directly
Context Size Guideline: If cumulative context exceeds ~500 words, summarize older steps more aggressively. Sub-agents can read files directly if they need details.
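As an illustration of that guideline, a hypothetical accumulator that keeps recent steps verbatim and collapses the oldest steps to one-liners until the context fits (the exact budget and truncation strategy are assumptions):

```typescript
interface StepSummary {
  step: number;
  name: string;
  full: string;  // the "Context for Next Steps" section from the sub-agent
  brief: string; // a one-line digest used when the step ages out
}

const WORD_BUDGET = 500; // the ~500-word guideline above, not a hard limit

function wordCount(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

// Collapse the oldest steps first; the most recent step always stays verbatim.
function buildContext(summaries: StepSummary[]): string {
  const parts = summaries.map((s) => s.full);
  let oldest = 0;
  while (wordCount(parts.join("\n")) > WORD_BUDGET && oldest < parts.length - 1) {
    const s = summaries[oldest];
    parts[oldest] = `Step ${s.step} (${s.name}): ${s.brief}`;
    oldest++;
  }
  return parts.join("\n");
}
```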
Example of Context Accumulation (Concrete):
## Completed Steps Summary
### Step 1: Define UserRepository Interface
- **What was done:** Created `src/repositories/UserRepository.ts` with interface definition
- **Key outputs:**
- Interface: `IUserRepository` with methods: `findById`, `findByEmail`, `create`, `update`, `delete`
- Types: `UserCreateInput`, `UserUpdateInput` in `src/types/user.ts`
- **Relevant for next steps:**
- Implementation must fulfill `IUserRepository` interface
- Use the defined input types for method signatures
### Step 2: Implement UserRepository
- **What was done:** Created `src/repositories/UserRepositoryImpl.ts` implementing `IUserRepository`
- **Key outputs:**
- Class: `UserRepositoryImpl` with all interface methods implemented
- Uses existing database connection from `src/db/connection.ts`
- **Relevant for next steps:**
- Import repository from `src/repositories/UserRepositoryImpl`
- Constructor requires `DatabaseConnection` injection
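For concreteness, the interface the Step 1 summary describes might look as follows. Only the method names and file paths come from the summary; the field shapes are assumptions:

```typescript
// src/types/user.ts - input types referenced in the summary (shapes assumed)
export interface UserCreateInput {
  name: string;
  email: string;
}
export type UserUpdateInput = Partial<UserCreateInput>;

// Minimal entity so the sketch is self-contained.
export interface User {
  id: string;
  name: string;
  email: string;
}

// src/repositories/UserRepository.ts - the contract Step 2 must fulfill
export interface IUserRepository {
  findById(id: string): Promise<User | null>;
  findByEmail(email: string): Promise<User | null>;
  create(input: UserCreateInput): Promise<User>;
  update(id: string, input: UserUpdateInput): Promise<User>;
  delete(id: string): Promise<void>;
}
```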
For each subtask, construct the prompt with these mandatory components:
## Reasoning Approach
Before taking any action, think through this subtask systematically.
Let's approach this step by step:
1. "Let me understand what was done in previous steps..."
- What context am I building on?
- What interfaces/patterns were established?
- What constraints did previous steps introduce?
2. "Let me understand what this step requires..."
- What is the specific objective?
- What are the boundaries of this step?
- What must I NOT change (preserve from previous steps)?
3. "Let me plan my approach..."
- What specific modifications are needed?
- What order should I make them?
- What could go wrong?
4. "Let me verify my approach before implementing..."
- Does my plan achieve the objective?
- Am I consistent with previous steps' changes?
- Is there a simpler way?
Work through each step explicitly before implementing.
<task>
{Subtask description}
</task>
<subtask_context>
Step {N} of {total_steps}: {subtask_name}
</subtask_context>
<previous_steps_context>
{Summary of relevant outputs from previous steps - ONLY if this is not the first step}
- Step 1: {what was done, key files modified, relevant decisions}
- Step 2: {what was done, key files modified, relevant decisions}
...
</previous_steps_context>
<constraints>
- Focus ONLY on this specific subtask
- Build upon (do not undo) changes from previous steps
- Follow existing code patterns and conventions
- Produce output that subsequent steps can build upon
</constraints>
<input>
{What this subtask receives - files, context, dependencies}
</input>
<output>
{Expected deliverable - modified files, new files, summary of changes}
CRITICAL: At the end of your work, provide a "Context for Next Steps" section with:
- Files modified (full paths)
- Key changes summary (3-5 bullet points)
- Any decisions that affect later steps
- Warnings or considerations for subsequent steps
</output>
## Self-Critique Verification (MANDATORY)
Before completing, verify your work integrates properly with previous steps. Do not submit unverified changes.
### Verification Questions
Generate verification questions based on the subtask description and the previous steps' context. Example questions:
| # | Question | Evidence Required |
|---|----------|-------------------|
| 1 | Does my work build correctly on previous step outputs? | [Specific evidence] |
| 2 | Did I maintain consistency with established patterns/interfaces? | [Specific evidence] |
| 3 | Does my solution address ALL requirements for this step? | [Specific evidence] |
| 4 | Did I stay within my scope (not modifying unrelated code)? | [List any out-of-scope changes] |
| 5 | Is my output ready for the next step to build upon? | [Check against dependency graph] |
### Answer Each Question with Evidence
Examine your solution and provide specific evidence for each question:
[Q1] Previous Step Integration:
- Previous step output: [relevant context received]
- How I built upon it: [specific integration]
- Any conflicts: [resolved or flagged]
[Q2] Pattern Consistency:
- Patterns established: [list]
- How I followed them: [evidence]
- Any deviations: [justified or fixed]
[Q3] Requirement Completeness:
- Required: [what was asked]
- Delivered: [what you did]
- Gap analysis: [any gaps]
[Q4] Scope Adherence:
- In-scope changes: [list]
- Out-of-scope changes: [none, or justified]
[Q5] Output Readiness:
- What later steps need: [based on decomposition]
- What I provided: [specific outputs]
- Completeness: [HIGH/MEDIUM/LOW]
### Revise If Needed
If ANY verification question reveals a gap:
1. **FIX** - Address the specific gap identified
2. **RE-VERIFY** - Confirm the fix resolves the issue
3. **UPDATE** - Update the "Context for Next Steps" section
CRITICAL: Do not submit until ALL verification questions have satisfactory answers.
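Taken together, the components above are string assembly. A minimal sketch of a prompt builder (section text abbreviated; the authoritative template is the one spelled out above):

```typescript
interface PromptParts {
  step: number;
  totalSteps: number;
  name: string;
  task: string;
  previousContext?: string; // omitted for the first step
  input: string;
  output: string;
}

// Assembles CoT preamble, task, previous context, constraints, I/O spec,
// and the mandatory self-critique section into one sub-agent prompt.
function buildPrompt(p: PromptParts): string {
  const sections = [
    "## Reasoning Approach\nBefore taking any action, think through this subtask step by step.",
    `<task>\n${p.task}\n</task>`,
    `<subtask_context>\nStep ${p.step} of ${p.totalSteps}: ${p.name}\n</subtask_context>`,
    p.previousContext
      ? `<previous_steps_context>\n${p.previousContext}\n</previous_steps_context>`
      : "",
    "<constraints>\n- Focus ONLY on this subtask\n- Build upon (do not undo) changes from previous steps\n</constraints>",
    `<input>\n${p.input}\n</input>`,
    `<output>\n${p.output}\nEnd with a "Context for Next Steps" section.\n</output>`,
    "## Self-Critique Verification (MANDATORY)\nAnswer every verification question with evidence before submitting.",
  ];
  return sections.filter(Boolean).join("\n\n");
}
```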
For each subtask in sequence:
1. Dispatch sub-agent:
Use Task tool:
- description: "Step {N}/{total}: {subtask_name}"
- prompt: {constructed prompt with CoT + task + previous context + critique}
- model: {selected model for this subtask}
2. Collect output:
- Parse "Context for Next Steps" section from sub-agent response
- Validate context is complete
3. Validate completion:
- Verify subtask objective was met
- Confirm no blockers for next step
- If issues found: see Error Handling section
4. Proceed to next subtask with accumulated context
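The loop itself is small once prompts and models are chosen. The Task tool belongs to the host environment, so the sketch below models it as a hypothetical function; parsing the context section via a heading match is likewise an assumption:

```typescript
// Hypothetical stand-in for the Task tool; not a real API signature.
declare function runTask(args: {
  description: string;
  prompt: string;
  model: "opus" | "sonnet" | "haiku";
}): Promise<string>;

interface PlannedStep {
  name: string;
  prompt: string; // built per the prompt-construction phase above
  model: "opus" | "sonnet" | "haiku";
}

async function executeSequentially(steps: PlannedStep[]): Promise<string[]> {
  const contexts: string[] = [];
  for (const [i, step] of steps.entries()) {
    const response = await runTask({
      description: `Step ${i + 1}/${steps.length}: ${step.name}`,
      prompt: contexts.length
        ? `${step.prompt}\n\n## Completed Steps Summary\n${contexts.join("\n\n")}`
        : step.prompt,
      model: step.model,
    });
    // Extract the mandatory "Context for Next Steps" section for hand-off.
    const match = response.match(/## Context for Next Steps[\s\S]*/);
    if (!match) throw new Error(`Step ${i + 1} returned no context section`);
    contexts.push(match[0]);
  }
  return contexts;
}
```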
After all subtasks complete, reply with a comprehensive report:
## Sequential Execution Summary
**Overall Task:** {original task description}
**Total Steps:** {count}
**Execution Time:** {total time if tracked}
### Step-by-Step Results
| Step | Subtask | Model | Status | Key Outcomes |
|------|---------|-------|--------|--------------|
| 1 | {name} | {model} | {status} | {summary} |
| 2 | {name} | {model} | {status} | {summary} |
| ... | ... | ... | ... | ... |
### Files Modified (All Steps)
- {file1}: {what changed, which step}
- {file2}: {what changed, which step}
...
### Key Decisions Made
- Step 1: {decision and rationale}
- Step 2: {decision and rationale}
...
### Integration Points
{How the steps connected and built upon each other}
### Verification Summary
{Aggregate self-critique results across steps}
### Working Directory
Intermediate results saved to: `.steps/`
### Follow-up Recommendations
{Any remaining work, tests to run, or manual verification needed}
Error Handling:
Recovery Pattern (Recoverable):
Step N Failed (Recoverable):
1. Identify specific issue from sub-agent output
2. Construct corrected prompt addressing the issue
3. Dispatch new sub-agent for Step N (retry)
4. On success: Continue to Step N+1
5. On second failure: Escalate to user
Escalation Pattern (Approach/Foundation):
Step N Failed (Approach or Foundation):
1. Report the failure with analysis
2. Present options to user:
- Retry with different approach
- Revisit previous step
- Abort and report partial progress
3. Wait for user decision before proceeding
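As control flow, the two patterns combine into retry-once-then-escalate. A hedged sketch (the failure classification and the single-retry limit follow the patterns above; everything else is assumption):

```typescript
type FailureKind = "recoverable" | "approach" | "foundation";

interface Attempt {
  ok: boolean;
  issue?: string;
}

// One corrected retry for recoverable failures, then escalate; approach and
// foundation failures escalate immediately and wait for the user's decision.
async function runWithRecovery(
  run: (prompt: string) => Promise<Attempt>,
  prompt: string,
  classify: (issue: string) => FailureKind,
  escalate: (issue: string) => Promise<void>,
): Promise<void> {
  const first = await run(prompt);
  if (first.ok) return;

  const issue = first.issue ?? "unknown failure";
  if (classify(issue) !== "recoverable") {
    await escalate(issue); // present options; do not proceed without the user
    return;
  }

  // Recoverable: retry once with a corrected prompt naming the issue.
  const retry = await run(`${prompt}\n\nPrevious attempt failed: ${issue}`);
  if (!retry.ok) await escalate(retry.issue ?? issue); // second failure
}
```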
Never:
- Proceed to the next step while the current step has an unresolved failure
- Retry the same subtask more than once without escalating to the user
- Attempt to fix a failed subtask yourself - dispatch a corrected sub-agent instead
Input:
/do-in-steps Change the return type of UserService.getUser() from User to UserDTO and update all consumers
Phase 1 - Decomposition:
| Step | Subtask | Depends On | Complexity | Type | Output |
|---|---|---|---|---|---|
| 1 | Create UserDTO class with proper structure | - | Medium | Implementation | New UserDTO.ts file |
| 2 | Update UserService.getUser() to return UserDTO | Step 1 | High | Implementation | Modified UserService |
| 3 | Update UserController to handle UserDTO | Step 2 | Medium | Refactoring | Modified UserController |
| 4 | Update tests for UserService and UserController | Steps 2,3 | Medium | Testing | Updated test files |
Phase 2 - Model Selection:
| Step | Subtask | Model | Agent | Rationale |
|---|---|---|---|---|
| 1 | Create DTO | sonnet | developer | Medium complexity, standard pattern |
| 2 | Update Service | opus | developer | High risk, core service change |
| 3 | Update Controller | sonnet | developer | Medium complexity, follows patterns |
| 4 | Update Tests | sonnet | tdd-developer | Test expertise |
Phase 3 - Execution:
Step 1 dispatched with Sonnet...
-> Created UserDTO.ts with id, name, email, createdAt fields
-> Context passed: UserDTO interface, file path
Step 2 dispatched with Opus, including Step 1 context...
-> Updated UserService.getUser() return type
-> Added mapping logic User -> UserDTO
-> Context passed: Method signature changed, mapping pattern used
Step 3 dispatched with Sonnet, including Steps 1-2 context...
-> Updated controller to expect UserDTO
-> Modified response serialization
-> Context passed: Endpoint contracts updated
Step 4 dispatched with Sonnet + tdd-developer, including Steps 1-3 context...
-> Updated service tests for new return type
-> Updated controller tests for DTO responses
-> All tests passing
Example 2:
Input:
/do-in-steps Add email notification capability to the order processing system
Phase 1 - Decomposition:
| Step | Subtask | Depends On | Complexity | Type | Output |
|---|---|---|---|---|---|
| 1 | Create EmailService with send capability | - | Medium | Implementation | New EmailService class |
| 2 | Add notification triggers to OrderService | Step 1 | Medium | Implementation | Modified OrderService |
| 3 | Create email templates for order events | Step 2 | Low | Documentation | Template files |
| 4 | Add configuration and environment variables | Step 1 | Low | Configuration | Updated config files |
| 5 | Add integration tests for email flow | Steps 1-4 | Medium | Testing | Test files |
Phase 2 - Model Selection:
| Step | Subtask | Model | Agent | Rationale |
|---|---|---|---|---|
| 1 | EmailService | sonnet | developer | Standard implementation |
| 2 | Notification triggers | sonnet | developer | Business logic |
| 3 | Email templates | haiku | tech-writer | Simple content |
| 4 | Configuration | haiku | - | Mechanical updates |
| 5 | Integration tests | sonnet | tdd-developer | Test expertise |
Example 3:
Input:
/do-in-steps Rename 'userId' to 'accountId' across the codebase - this affects interfaces, implementations, and callers
Phase 1 - Decomposition:
| Step | Subtask | Depends On | Complexity | Type | Output |
|---|---|---|---|---|---|
| 1 | Update interface definitions | - | High | Refactoring | Updated interfaces |
| 2 | Update implementations of those interfaces | Step 1 | Low | Refactoring | Updated implementations |
| 3 | Update callers and consumers | Step 2 | Low | Refactoring | Updated caller files |
| 4 | Update tests | Step 3 | Low | Testing | Updated test files |
| 5 | Update documentation | Step 4 | Low | Documentation | Updated docs |
Phase 2 - Model Selection:
| Step | Subtask | Model | Agent | Rationale |
|---|---|---|---|---|
| 1 | Update interfaces | opus | developer | Breaking changes need careful handling |
| 2 | Update implementations | haiku | - | Mechanical rename following interface |
| 3 | Update callers | haiku | - | Mechanical updates |
| 4 | Update tests | haiku | - | Mechanical test fixes |
| 5 | Update documentation | haiku | tech-writer | Simple text updates |
Context filtering by scenario:
| Scenario | What to Pass | What to Omit |
|---|---|---|
| Interface defined in step 1 | Full interface definition | Implementation details |
| Implementation in step 2 | Key patterns, file locations | Internal logic |
| Integration in step 3 | Usage patterns, entry points | Step 2 internal details |
Keep context focused:
- Include only what downstream steps need in order to act
- Summarize rather than paste full file contents
- Prefer file paths over inline code - sub-agents can read files directly if they need details
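Read as code, the filtering rule above amounts to dropping internals before hand-off. A trivial sketch (the three categories mirror the table; field names are assumptions):

```typescript
interface StepOutput {
  contracts: string[]; // interface definitions, entry points - always pass
  locations: string[]; // file paths and key patterns - pass for orientation
  internals: string[]; // implementation details - omit; files hold the truth
}

// Pass contracts and locations forward; the next sub-agent can read
// internals from the files directly if it needs them.
function filterForNextStep(out: StepOutput): Omit<StepOutput, "internals"> {
  const { internals, ...relevant } = out;
  return relevant;
}
```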
Example of "Context for Next Steps" output from a sub-agent:
## Context for Next Steps
### Files Modified
- `src/dto/UserDTO.ts` (new file)
- `src/services/UserService.ts` (modified)
### Key Changes Summary
- Created UserDTO with fields: id (string), name (string), email (string), createdAt (Date)
- UserDTO includes static `fromUser(user: User): UserDTO` factory method
- Added `toDTO()` method to User class for convenience
### Decisions That Affect Later Steps
- Used class-based DTO (not interface) to enable transformation methods
- Opted for explicit mapping over automatic serialization for better control
### Warnings for Subsequent Steps
- UserDTO does NOT include password field - ensure no downstream code expects it
- The `createdAt` field is formatted as ISO string in JSON serialization
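The DTO this context section describes might look like the following. Only the listed fields, the class-based choice, the `fromUser` factory, and the two warnings come from the summary; the rest is assumption:

```typescript
// Assumed domain entity; note the password field that must not leak.
interface User {
  id: string;
  name: string;
  email: string;
  password: string;
  createdAt: Date;
}

// Class-based DTO (not an interface) so it can carry transformation methods.
class UserDTO {
  constructor(
    public id: string,
    public name: string,
    public email: string,
    public createdAt: Date,
  ) {}

  // Explicit mapping: password is deliberately dropped here.
  static fromUser(user: User): UserDTO {
    return new UserDTO(user.id, user.name, user.email, user.createdAt);
  }

  // Per the warning above: createdAt serializes as an ISO string in JSON.
  toJSON() {
    return { ...this, createdAt: this.createdAt.toISOString() };
  }
}
```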
Key Insight: Complex tasks with dependencies benefit from sequential execution where each step operates in a fresh context while receiving only the relevant outputs from previous steps. This prevents context pollution while maintaining necessary continuity.