Skill

Orchestrator Directives Skill

Cost-first delegation patterns and decision frameworks for multi-AI coordination

Install

npx claudepluginhub shakestzd/htmlgraph --plugin htmlgraph

Tool Access

This skill uses the workspace's default tool permissions.

Preview

Use this skill for delegation patterns and decision frameworks in orchestrator mode.

Supporting Assets

reference.md

SKILL.md

Stats

Parent Repo Stars: 3

Parent Repo Forks: 2

Last Commit: Apr 5, 2026

Orchestrator Directives Skill

Use this skill for delegation patterns and decision frameworks in orchestrator mode.

Trigger keywords: orchestrator, delegation, subagent, task coordination, parallel execution, cost-first, spawner

Quick Start - What is Orchestration?

Delegate tactical work to specialized subagents while you focus on strategic decisions. Save Claude Code context (expensive) by using FREE/CHEAP AIs for appropriate tasks.

Basic pattern:

Task(
    subagent_type="gemini",  # FREE - use for exploration
    description="Find auth patterns",
    prompt="Search codebase for authentication patterns..."
)

When to use: ALWAYS use for complex tasks requiring research, code generation, git operations, or any work that could fail and require retries.

For complete guidance: See sections below or run /multi-ai-orchestration for model selection details.

CRITICAL: Cost-First Delegation (IMPERATIVE)

Claude Code is EXPENSIVE. You MUST delegate to FREE/CHEAP AIs first.

Cost Comparison & Pre-Delegation Checklist

PRE-DELEGATION CHECKLIST (MUST EXECUTE BEFORE EVERY TASK())

Ask these questions IN ORDER:

Can Gemini do this? → Exploration, research, batch ops, file analysis
- YES = MUST try Bash("gemini ...") first (FREE - 2M tokens/min), fallback to haiku-coder
Is this code work? → Implementation, fixes, tests, refactoring
- YES = MUST try Bash("codex ...") first (70% cheaper than Claude), fallback to sonnet-coder
Is this git/GitHub? → Commits, PRs, issues, branches
- YES = MUST try Bash("copilot ...") first (60% cheaper, GitHub-native), fallback to haiku-coder
Does this need deep reasoning? → Architecture, complex planning
- YES = Use Claude Opus (expensive, but strategically needed)
Is this coordination? → Multi-agent work
- YES = Use Claude Sonnet (mid-tier)
ONLY if above fail → Haiku (fallback)
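
The checklist above amounts to an ordered lookup from task category to a first-choice tool and a fallback. A minimal sketch, assuming hypothetical category labels ("explore", "code", "git", "reasoning", "coordination"); `Route`, `ROUTES`, and `pick_delegate` are illustrative names, not part of htmlgraph:

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    first_choice: str
    fallback: Optional[str]


# Order mirrors the checklist; the category labels are hypothetical.
ROUTES = {
    "explore": Route('Bash("gemini ...")', "haiku-coder"),
    "code": Route('Bash("codex ...")', "sonnet-coder"),
    "git": Route('Bash("copilot ...")', "haiku-coder"),
    "reasoning": Route("Claude Opus", None),
    "coordination": Route("Claude Sonnet", None),
}


def pick_delegate(category: str) -> Route:
    """Return the checklist's route, or Haiku when no category matches."""
    return ROUTES.get(category, Route("Haiku", None))
```

Keeping the routes in one table means a change in tool preference is a one-line edit rather than a rewrite of branching logic.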

Cost Comparison Examples

Task	WRONG (Cost)	CORRECT (Cost)	Savings
Search 100 files	Task() ($15-25)	Gemini spawner (FREE)	100%
Generate code	Task() ($10)	Codex spawner ($3)	70%
Git commit	Task() ($5)	Copilot spawner ($2)	60%
Strategic decision	Direct task ($20)	Claude Opus ($50)	Must pay for quality
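
The Savings column follows from the two cost columns. A small sketch of the arithmetic, using the table's own example figures; `savings_percent` is an illustrative helper, not an htmlgraph function:

```python
def savings_percent(wrong_cost: float, correct_cost: float) -> float:
    """Percentage saved by choosing the cheaper option."""
    if wrong_cost <= 0:
        raise ValueError("wrong_cost must be positive")
    return round(100 * (wrong_cost - correct_cost) / wrong_cost, 1)


print(savings_percent(10, 3))  # "Generate code" row: 70.0
print(savings_percent(5, 2))   # "Git commit" row: 60.0
print(savings_percent(15, 0))  # "Search 100 files" row: 100.0
```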

WRONG vs CORRECT Examples

WRONG (wastes Claude quota):
- Code implementation → Task(haiku)               # USE Bash("codex ..."), fallback sonnet-coder
- Git commits → Task(haiku)                       # USE Bash("copilot ..."), fallback haiku-coder
- File search → Task(haiku)                       # USE Bash("gemini ...") (FREE!)
- Research → Task(haiku)                          # USE Bash("gemini ...") (FREE!)

CORRECT (cost-optimized):
- Code implementation → Bash("codex ...")         # Cheap, sandboxed; fallback sonnet-coder
- Git commits → Bash("copilot ...")               # Cheap, GitHub-native; fallback haiku-coder
- File search → Bash("gemini ...")                # FREE!; fallback haiku-coder
- Research → Bash("gemini ...")                   # FREE!; fallback haiku-coder
- Strategic decisions → Claude Opus               # Expensive, but needed
- Coder agents → FALLBACK ONLY                    # When CLI tools fail or aren't installed

Core Concepts

Orchestrator vs Executor Roles

Orchestrator (You):

Makes strategic decisions
Delegates tactical work
Tracks progress with SDK
Coordinates parallel subagents
Only executes: Task(), AskUserQuestion(), TodoWrite(), SDK operations

Executor (Subagent):

Handles tactical implementation
Researches specific problems
Fixes issues with retries
Reports findings back
Consumes resources independently (saves your context)

Why separation matters:

Context preservation (MUST prevent failures from compounding in your context)
Parallel efficiency (MUST run multiple subagents simultaneously)
Cost optimization (ALWAYS use cheaper subagents than Claude Code)
Error isolation (MUST keep failures in subagent context)

Why Delegation Matters: Context Cost Model

What looks like "one bash call" becomes many:

Initial command fails → need to retry
Test hooks break → need to fix code → retry
Push conflicts → need to pull/merge → retry
Each retry consumes tokens

Context cost comparison:

Direct execution (fails):
  bash call 1 → fails
  bash call 2 → fails
  bash call 3 → fix code
  bash call 4 → bash call 1 retry
  bash call 5 → bash call 2 retry
  = 5+ tool calls, context consumed

Delegation (cascades isolated):
  Task(subagent handles all retries) → 1 tool call
  Read result → 1 tool call
  = 2 tool calls, clean context

Token savings:

Each failed retry: 2,000-5,000 tokens wasted
Cascading failures: 10,000+ tokens wasted
Subagent isolation: None of that pollution in orchestrator context
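
To see how the per-retry figure adds up to the cascading figure, here is a rough estimate using the 2,000-5,000 token range quoted above. `wasted_tokens` is an illustrative helper, and the count of three failed retries is an assumed example:

```python
from typing import Tuple


def wasted_tokens(failed_retries: int, low: int = 2_000, high: int = 5_000) -> Tuple[int, int]:
    """Return the (minimum, maximum) tokens consumed by failed retries."""
    if failed_retries < 0:
        raise ValueError("failed_retries cannot be negative")
    return failed_retries * low, failed_retries * high


print(wasted_tokens(3))  # (6000, 15000)
```

Three failed retries already span the 10,000-token mark, which is why a short cascade is enough to pollute the orchestrator's context.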

Decision Framework: When to Delegate vs Execute

Ask yourself these questions:

Will this likely be ONE tool call?
- Uncertain → DELEGATE
- Certain → MAY do directly (single file read, quick check)
Does this require error handling?
- If yes → DELEGATE (subagent handles retries)
Could this cascade into multiple operations?
- If yes → DELEGATE
Is this strategic or tactical?
- Strategic (decisions) → Do directly
- Tactical (execution) → DELEGATE

Rule of thumb: When in doubt, ALWAYS DELEGATE. Cascading failures are expensive.
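
The four questions can be read as a single predicate. The sketch below is one interpretation, not the skill's official logic: it assumes strategic decisions always stay with the orchestrator, and that a tactical operation stays direct only when it is certainly one call, needs no error handling, and cannot cascade. `Operation` and `should_delegate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    certainly_one_call: bool
    needs_error_handling: bool
    may_cascade: bool
    is_strategic: bool


def should_delegate(op: Operation) -> bool:
    """Apply the decision framework; any doubt resolves to delegation."""
    if op.is_strategic:
        return False  # Strategic decisions are made directly.
    if not op.certainly_one_call:
        return True  # Uncertain call count: delegate.
    return op.needs_error_handling or op.may_cascade
```

For example, a single file read (`Operation(True, False, False, False)`) stays direct, while a commit that may hit pre-commit hooks (`Operation(False, True, True, False)`) is delegated.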

Three Allowed Direct Operations

Only these can be executed directly by orchestrator:

Task() - Delegation itself
- Use spawner subagent types when possible
- Example: Task(subagent_type="htmlgraph:gemini-spawner", ...)
AskUserQuestion() - Clarifying requirements
- Get user input before delegating
- Example: AskUserQuestion("Should we use Redis or PostgreSQL?")
TodoWrite() - Tracking work items
- Create/update todo lists
- Example: TodoWrite(todos=[...])

HtmlGraph CLI operations are also permitted directly (creating features and bugs):

htmlgraph feature create "title" --track <trk-id>
htmlgraph bug create "title" --track <trk-id>

Track Assignment (MANDATORY before creating work items):

Before creating ANY new track:

Run htmlgraph track list to see all existing tracks
Match the new work against existing track titles and descriptions
Only create a new track if NO existing track covers the scope
When in doubt, ask the user which track to use

This also applies when creating bugs, features, or spikes with --track:

Search existing tracks first, create a new track only as last resort

Everything else MUST be delegated.

Model Selection & Spawner Guide

Spawner Selection Decision Tree

Decision tree (check each in order):

Is this exploration/research/analysis?
- File search: YES → Gemini spawner (FREE)
- Pattern analysis: YES → Gemini spawner (FREE)
- Documentation reading: YES → Gemini spawner (FREE)
- Learning unfamiliar system: YES → Gemini spawner (FREE)
Is this code implementation/testing?
- Generate code: YES → Codex spawner (70% cheaper)
- Fix bugs: YES → Codex spawner
- Write tests: YES → Codex spawner
- Refactor code: YES → Codex spawner
Is this git/GitHub operation?
- Commit changes: YES → Copilot spawner (60% cheaper, GitHub-native)
- Create PR: YES → Copilot spawner
- Manage branches: YES → Copilot spawner
- Review code: YES → Copilot spawner
Does this need deep reasoning?
- Architecture decisions: YES → Claude Opus (expensive, but needed)
- Complex design: YES → Claude Opus
- Strategic planning: YES → Claude Opus
Is this multi-agent coordination?
- Coordinate multiple spawners: YES → Claude Sonnet (mid-tier)
- Complex workflows: YES → Claude Sonnet
All else fails → Task() with Haiku (fallback)

Delegation Pattern:

Bash("gemini ...") - FREE, 2M tokens/min, exploration & research → fallback: haiku-coder
Bash("codex ...") - Cheap code specialist, implementation & testing → fallback: sonnet-coder
Bash("copilot ...") - Cheap git specialist, GitHub integration → fallback: haiku-coder
Coder agents (haiku-coder, sonnet-coder) - Fallback only when CLI tools fail

Spawner Details & Configuration

Gemini CLI (FREE - Exploration)

gemini -p "Analyze codebase for:
- All authentication patterns
- OAuth implementations
- Session management
- JWT usage" --output-format json --yolo --include-directories . 2>&1

If gemini fails/unavailable → fallback to haiku-coder

Best for:

File searching (FREE!)
Pattern analysis (FREE!)
Documentation research (FREE!)
Understanding unfamiliar systems (FREE!)

Codex CLI (Cheap - Code)

codex exec "Implement OAuth authentication:
- Add JWT token generation
- Include error handling
- Write unit tests" --full-auto --json -m gpt-4.1-mini -C . 2>&1

If codex fails/unavailable → fallback to sonnet-coder

Best for:

Code generation
Bug fixes
Test writing
Refactoring
Sandboxed execution

Copilot CLI (Cheap - Git)

copilot -p "Commit changes:
- Message: 'feat: add OAuth authentication'
- Files: src/auth/*.py, tests/test_auth.py
- Do NOT push" --allow-all-tools --no-color --add-dir . 2>&1

If copilot fails/unavailable → fallback to haiku-coder

Best for:

Git commits (60% cheaper than Task)
PR creation
Branch management
GitHub integration
Resolving conflicts

Task() with Sonnet/Opus (Strategic)

Task(
    prompt="Design authentication architecture...",
    subagent_type="sonnet"  # or "opus" for deep reasoning
)

Sonnet (Mid-tier):

Coordinate complex workflows
Multi-agent orchestration
Fallback when spawners fail

Opus (Expensive):

Deep reasoning
Architecture decisions
Strategic planning
When quality matters more than cost

Delegation Patterns & Examples

Basic Delegation Pattern

Simple exploration (try CLI first):

gemini -p "Search codebase for authentication patterns and summarize findings" \
  --output-format json --yolo --include-directories . 2>&1
# fallback → Agent(subagent_type="htmlgraph:haiku-coder", ...)

Code implementation (try CLI first):

codex exec "Implement OAuth authentication endpoint with JWT support" \
  --full-auto --json -m gpt-4.1-mini -C . 2>&1
# fallback → Agent(subagent_type="htmlgraph:sonnet-coder", ...)

Git operations (try CLI first):

copilot -p "Commit changes with message: 'feat: add OAuth authentication'. Do NOT push." \
  --allow-all-tools --no-color --add-dir . 2>&1
# fallback → Agent(subagent_type="htmlgraph:haiku-coder", ...)

Git/Code Operations (Bash-first, haiku-coder fallback)

Try the Copilot CLI directly via Bash first, then delegate to haiku-coder if unavailable.

# Priority 1: Bash-copilot (preferred)
copilot -p "Stage files: <list>. Commit with message: '<message>'. Do NOT push." \
  --allow-all-tools --no-color --add-dir . 2>&1

# Priority 2: haiku-coder fallback (if copilot fails or not installed)
Agent(
    subagent_type="htmlgraph:haiku-coder",
    description="Commit changes",
    prompt="Stage files: <list>. Commit with message: 'feat: add X'. Do NOT push.",
)

Pattern: orchestrator tries the CLI directly, falls back to a coder agent.
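
The try-CLI-then-fall-back pattern can be sketched as a small wrapper. This is an illustration only: `run_cli_or_fallback` is a hypothetical helper, the fallback is modelled as a plain callable standing in for the coder-agent delegation, and the 600-second timeout is an assumed default.

```python
import shutil
import subprocess
from typing import Callable, Sequence


def run_cli_or_fallback(
    argv: Sequence[str],
    fallback: Callable[[], str],
    timeout: float = 600.0,
) -> str:
    """Run argv; call fallback() if the CLI is missing, fails, or times out."""
    if shutil.which(argv[0]) is None:
        return fallback()  # CLI not installed
    try:
        result = subprocess.run(
            list(argv),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # same effect as 2>&1
            text=True,
            timeout=timeout,
        )
    except (subprocess.TimeoutExpired, OSError):
        return fallback()
    if result.returncode != 0:
        return fallback()  # CLI ran but reported failure
    return result.stdout
```

Passing the command as a list avoids shell quoting problems when the prompt contains quotes or newlines.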

Code Generation (Bash-first, sonnet-coder fallback)

For implementation, refactoring, and structured output tasks:

# Priority 1: Bash-codex (preferred)
codex exec "TASK_DESCRIPTION" --full-auto --json -m gpt-4.1-mini -C . 2>&1

# Priority 2: sonnet-coder fallback (if codex fails or not installed)
Agent(
    subagent_type="htmlgraph:sonnet-coder",
    description="Implement feature X",
    prompt="Add OAuth authentication to the login endpoint.",
)

Pattern: orchestrator tries the CLI directly, falls back to a coder agent. Always pass -m gpt-4.1-mini to codex (never rely on the expensive gpt-5.4 default).

Research & Analysis (Bash-first, haiku-coder fallback)

For codebase exploration, documentation research, and large-context analysis:

# Priority 1: Bash-gemini (preferred — FREE, 2M context)
gemini -p "TASK_DESCRIPTION" --output-format json --yolo --include-directories . 2>&1

# Priority 2: haiku-coder fallback (if gemini fails or not installed)
Agent(
    subagent_type="htmlgraph:haiku-coder",
    description="Research auth patterns",
    prompt="Analyze all authentication patterns in this codebase. Find security gaps.",
)

Pattern: orchestrator tries the CLI directly, falls back to a coder agent.

Parallel Delegation (Multiple Independent Tasks)

MANDATORY: Always analyze parallelizability when 2+ tasks are identified.

Before presenting recommendations or starting multi-task work, ALWAYS:

Check dependency graph — do any tasks depend on outputs of others?
Check file overlap — do tasks touch the same files/modules?
If independent → propose parallel worktree execution as the DEFAULT
If dependent → identify the critical path and parallelize what you can

Decision matrix:

Dependency?	File Overlap?	Action
No	No	Parallel worktrees (DEFAULT)
No	Yes	Sequential (same files = merge conflicts)
Yes	No	Pipeline (parallel where deps allow)
Yes	Yes	Sequential
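
The decision matrix reduces to two checks. A minimal sketch, with `execution_mode` as an illustrative name and the returned strings chosen to match the Action column:

```python
def execution_mode(has_dependency: bool, has_file_overlap: bool) -> str:
    """Map the two matrix inputs to an execution strategy."""
    if has_file_overlap:
        # Same files mean merge conflicts, with or without dependencies.
        return "sequential"
    return "pipeline" if has_dependency else "parallel worktrees"
```

File overlap is checked first because it forces sequential execution in both rows where it appears.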

Pattern: Spawn all at once in isolated worktrees

# Launch parallel agents in worktrees — one per feature
Agent(
    subagent_type="htmlgraph:sonnet-coder",
    description="Feature A",
    prompt="Implement feature A...",
    isolation="worktree",
    run_in_background=True,
)

Agent(
    subagent_type="htmlgraph:sonnet-coder",
    description="Feature B",
    prompt="Implement feature B...",
    isolation="worktree",
    run_in_background=True,
)

Agent(
    subagent_type="htmlgraph:haiku-coder",
    description="Feature C (simple)",
    prompt="Implement feature C...",
    isolation="worktree",
    run_in_background=True,
)

Benefits:

3 tasks in parallel: time = max(T1, T2, T3) instead of T1+T2+T3
Cost optimization: Uses cheapest model for each task
Worktree isolation: No merge conflicts during execution
Independent results: Each task tracked separately
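
A worked example of the timing claim, assuming three hypothetical task durations of 12, 8, and 5 minutes; `wall_clock` is an illustrative helper:

```python
from typing import Sequence


def wall_clock(durations: Sequence[float], parallel: bool) -> float:
    """Elapsed time: the longest task if parallel, the total if sequential."""
    if not durations:
        return 0
    return max(durations) if parallel else sum(durations)


tasks = [12, 8, 5]  # minutes, assumed for illustration
print(wall_clock(tasks, parallel=True))   # 12
print(wall_clock(tasks, parallel=False))  # 25
```

The estimate ignores the merge and quality-gate time that follows parallel work, so the real gain is somewhat smaller.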

After completion: Merge worktree branches to main, run quality gates, clean up.

Sequential Delegation with Dependencies

Pattern: Chain dependent tasks in sequence

# 1. Research existing patterns
Task(
    subagent_type="gemini",
    description="Research OAuth patterns",
    prompt="Find all OAuth implementations in codebase..."
)

# 2. Wait for research, then implement
# (In next message after reading result)
research_findings = "..."  # Read from previous task result

Task(
    subagent_type="codex",
    description="Implement OAuth based on research",
    prompt=f"""
    Implement OAuth using discovered patterns:
    {research_findings}
    """
)

# 3. Wait for implementation, then commit
Task(
    subagent_type="copilot",
    description="Commit implementation",
    prompt="Commit OAuth implementation..."
)

When to use: When later tasks depend on earlier results

HtmlGraph Result Retrieval

Subagents report findings automatically:

When a Task() completes, findings are available via CLI:

# Check recent spikes
htmlgraph spike list

# View specific spike
htmlgraph spike show <id>

Pattern: Read findings after Task completes

# 1. Delegate exploration (try gemini CLI first)
gemini -p "Find all authentication patterns..." --output-format json --yolo --include-directories . 2>&1
# fallback → Agent(subagent_type="htmlgraph:haiku-coder", ...)

# 2. The subagent creates a spike with findings
# Read findings via: htmlgraph spike list (then spike show <id>)

# 3. Use findings in next delegation (try codex CLI first)
codex exec "Implement authentication based on auth pattern research findings..." --full-auto --json -m gpt-4.1-mini -C . 2>&1
# fallback → Agent(subagent_type="htmlgraph:sonnet-coder", ...)

Error Handling & Retries

Let subagents handle retries:

# WRONG - Don't retry directly as orchestrator
bash_result = Bash(command="git commit -m 'feat: new'")
if failed:
    # Retry directly (context pollution)
    Bash(command="git pull && git commit")  # More context used

# CORRECT - Subagent handles retries
Task(
    subagent_type="copilot",
    description="Commit changes with retry",
    prompt="""
    Commit changes:
    Message: "feat: new feature"

    If commit fails:
    1. Pull latest changes
    2. Resolve conflicts if any
    3. Retry commit
    4. Handle pre-commit hooks

    Report final status: success or failure
    """
)

Benefits:

Subagent context handles retries (not your context)
Cleaner error reporting
Automatic recovery attempts
You get clean success/failure
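
What the subagent does with such a prompt is, in effect, a bounded retry loop that returns one clean status. A sketch under assumptions: `attempt_with_recovery` is a hypothetical helper, `action` stands in for the commit, `recover` stands in for the pull and conflict resolution, and the limit of three attempts is arbitrary.

```python
from typing import Callable, Tuple


def attempt_with_recovery(
    action: Callable[[], bool],
    recover: Callable[[], None],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Try action up to max_attempts times, recovering between failures.

    Returns ("success" | "failure", attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        if action():
            return "success", attempt
        if attempt < max_attempts:
            recover()  # e.g. pull and resolve conflicts before retrying
    return "failure", max_attempts
```

The orchestrator only ever sees the returned tuple; the intermediate failures stay inside the subagent.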

Advanced: Post-Compact Persistence

Orchestrator Activation After Compact

How it works:

Before compact, SDK sets environment variable: CLAUDE_ORCHESTRATOR_ACTIVE=true
SessionStart hook detects post-compact state
Orchestrator Directives Skill auto-activates
This skill section appears automatically (first time post-compact)

Why: Preserve orchestration discipline after context compact

What you see:

Skill automatically activates (no manual invocation needed)
Quick start section visible by default
Expand detailed sections as needed
Full guidance available without re-reading docs

To manually trigger:

/orchestrator-directives

Environment variable:

CLAUDE_ORCHESTRATOR_ACTIVE=true  # Set by SDK
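
A hook or script could test for this variable as follows. This is a sketch, not the plugin's actual hook code: `orchestrator_active` is a hypothetical name, and the case-insensitive comparison is an assumption.

```python
import os
from typing import Mapping


def orchestrator_active(environ: Mapping[str, str] = os.environ) -> bool:
    """True when CLAUDE_ORCHESTRATOR_ACTIVE is set to "true"."""
    value = environ.get("CLAUDE_ORCHESTRATOR_ACTIVE", "")
    return value.strip().lower() == "true"
```

Accepting the environment as a parameter makes the check easy to exercise without touching the real process environment.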

Session Continuity Across Compacts

Features preserved across compact:

Work items in HtmlGraph
Feature/spike tracking
Delegation patterns
Model selection guidance
This skill's guidance

What's lost:

Your context (that's why compact happens)
Intermediate tool outputs
Local variables

Re-activation pattern:

Before compact:
- Work on features, track in HtmlGraph
- Delegate with clear prompts
- Use SDK to save progress

After compact:
- Orchestrator Skill auto-activates
- Re-read recent spikes for context
- Continue delegations
- Use Task IDs for parallel coordination

Core Development Principles (Enforce in ALL Delegations)

When delegating to ANY coder agent, include these requirements in the prompt:

Research First

Search for existing libraries before implementing from scratch
Check pyproject.toml before adding new dependencies
Prefer well-maintained packages over custom implementations

Code Design

DRY — Extract shared logic; check src/python/htmlgraph/utils/ for existing utilities before writing new ones
Single Responsibility — One clear purpose per module, class, and function
KISS — Simplest solution that satisfies current requirements
YAGNI — Only implement what is needed now, not speculative future needs
Composition over inheritance

Module Size Limits

Functions: <50 lines | Classes: <300 lines | Modules: <500 lines
If a module would exceed limits, split it as part of the work — do not defer refactoring
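
The function limit can be checked mechanically. A rough sketch using the standard `ast` module (Python 3.8+ for `end_lineno`); `oversized_functions` is an illustrative helper, and counting every line from `def` to the last body line, including blanks and comments, is an assumption about how "lines" are measured:

```python
import ast
from typing import List, Tuple


def oversized_functions(source: str, limit: int = 50) -> List[Tuple[str, int]]:
    """Return (name, line_count) for functions with limit or more lines."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length >= limit:
                found.append((node.name, length))
    return found
```

The same approach extends to classes (`ast.ClassDef`, limit 300); module length is just the line count of the file.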

Before Committing

uv run ruff check --fix && uv run ruff format && uv run mypy src/ && uv run pytest

Never commit with unresolved type errors, lint warnings, or test failures.

Core Philosophy

Core Principles Summary

Principle 1: Delegation > Direct Execution

Cascading failures consume far more context than structured delegation
One failed bash call becomes 3-5 calls with retries
Delegation isolates failures to subagent context

Principle 2: Cost-First > Capability-First

Use FREE/cheap AIs (Gemini, Codex, Copilot) before expensive Claude Code
Gemini: FREE (exploration)
Codex: 70% cheaper (code)
Copilot: 60% cheaper (git)
Claude: Expensive (strategic only)

Principle 3: You Don't Know the Outcome

What looks like "one tool call" often becomes many
Unexpected failures, conflicts, retries consume context
Delegation removes unpredictability from orchestrator context

Principle 4: Parallel > Sequential

Multiple subagents can work simultaneously
Much faster than sequential execution
Orchestrator stays available for decisions

Principle 5: Track Everything

Use HtmlGraph CLI to track delegations
Features, spikes, bugs created for all work
Clear record of who did what

Core Philosophy

Delegation > Direct Execution. Cascading failures consume far more context than structured delegation.

Cost-First > Capability-First. Use FREE/cheap AIs before expensive Claude models.

Quick Reference Table

Operation Type → Correct Delegation

Operation	MUST Use	Cost	Fallback
Search files	`Bash("gemini ...")`	FREE	haiku-coder
Pattern analysis	`Bash("gemini ...")`	FREE	haiku-coder
Documentation research	`Bash("gemini ...")`	FREE	haiku-coder
Code generation	`Bash("codex ...")`	$ (70% off)	sonnet-coder
Bug fixes	`Bash("codex ...")`	$ (70% off)	sonnet-coder
Write tests	`Bash("codex ...")`	$ (70% off)	sonnet-coder
Git commits	`Bash("copilot ...")`	$ (60% off)	haiku-coder
Create PRs	`Bash("copilot ...")`	$ (60% off)	haiku-coder
Architecture	Claude Opus	$$$$	Sonnet
Strategic decisions	Claude Opus	$$$$	Task()

Key: FREE = No cost | $ = Cheap | $$$$ = Expensive (but necessary)

Pre-Work Validation (YOLO Mode Hook)

The PreToolUse hook enforces attribution before code changes. Behavior by scenario:

Active Work Item	Tool	Action
Feature	Read	Allow
Feature	Write/Edit/Delete	Allow
Spike	Read	Allow
Spike	Write/Edit/Delete	Warn + Allow
None	Read	Allow
None	Write/Edit (1 file)	Warn + Allow
None	Write/Edit (3+ files)	Deny

When denied: Create a work item first, then retry.

htmlgraph feature create "Title" --track <trk-id>   # creates + returns feat-id
htmlgraph feature start <feat-id>                   # sets attribution for this session

Decision rule for code changes:

Single file, <30 min → direct change (warns, allows)
3+ files, or new tests, or multi-component → create feature first
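
The hook table above can be expressed as a small decision function. This is a sketch of the documented behavior, not the hook's real implementation: `hook_decision` is a hypothetical name, the lowercase labels are invented for illustration, and because the table does not say what happens for exactly two files with no work item, the sketch assumes they are treated like one file (warn).

```python
from typing import Optional


def hook_decision(work_item: Optional[str], tool: str, files_touched: int = 1) -> str:
    """Return "allow", "warn", or "deny" per the PreToolUse table.

    work_item: "feature", "spike", or None. tool: "read" or "write".
    """
    if tool == "read":
        return "allow"
    if work_item == "feature":
        return "allow"
    if work_item == "spike":
        return "warn"  # Warn + Allow
    # No active work item: deny at 3+ files; 2 files assumed to warn.
    return "deny" if files_touched >= 3 else "warn"
```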

Related Skills

/multi-ai-orchestration - Comprehensive model selection guide with detailed decision matrix
/code-quality - Quality gates and pre-commit workflows
/strategic-planning - HtmlGraph analytics for smart prioritization

Reference Documentation

Complete Rules: See orchestration.md
Advanced Patterns: See reference.md
HtmlGraph CLI: htmlgraph --help

Quick Summary

Cost-First Orchestration:

Bash("gemini ...") (FREE) → exploration, research, analysis → fallback: haiku-coder
Bash("codex ...") (70% off) → code implementation, fixes, tests → fallback: sonnet-coder
Bash("copilot ...") (60% off) → git operations, PRs → fallback: haiku-coder
Claude Opus → deep reasoning, strategy only

Orchestrator Rule: Only execute: Task(), AskUserQuestion(), TodoWrite(), SDK operations

Everything else → Delegate to appropriate spawner

When in doubt → DELEGATE

  --allow-all-tools --no-color --add-dir . 2>&1
# fallback → Agent(subagent_type="htmlgraph:haiku-coder", ...)

Git/Code Operations (Bash-first, haiku-coder fallback)

Try the Copilot CLI directly via Bash first, then delegate to haiku-coder if unavailable.

# Priority 1: Bash-copilot (preferred)
copilot -p "Stage files: <list>. Commit with message: '<message>'. Do NOT push." \
  --allow-all-tools --no-color --add-dir . 2>&1

# Priority 2: haiku-coder fallback (if copilot fails or not installed)
Agent(
    subagent_type="htmlgraph:haiku-coder",
    description="Commit changes",
    prompt="Stage files: <list>. Commit with message: 'feat: add X'. Do NOT push.",
)

Pattern: orchestrator tries the CLI directly, falls back to a coder agent.

Code Generation (Bash-first, sonnet-coder fallback)

For implementation, refactoring, and structured output tasks:

# Priority 1: Bash-codex (preferred)
codex exec "TASK_DESCRIPTION" --full-auto --json -m gpt-4.1-mini -C . 2>&1

# Priority 2: sonnet-coder fallback (if codex fails or not installed)
Agent(
    subagent_type="htmlgraph:sonnet-coder",
    description="Implement feature X",
    prompt="Add OAuth authentication to the login endpoint.",
)

Pattern: orchestrator tries the CLI directly, falls back to a coder agent. Always pass -m gpt-4.1-mini to codex; never rely on the expensive gpt-5.4 default.

Research & Analysis (Bash-first, haiku-coder fallback)

For codebase exploration, documentation research, and large-context analysis:

# Priority 1: Bash-gemini (preferred — FREE, 2M context)
gemini -p "TASK_DESCRIPTION" --output-format json --yolo --include-directories . 2>&1

# Priority 2: haiku-coder fallback (if gemini fails or not installed)
Agent(
    subagent_type="htmlgraph:haiku-coder",
    description="Research auth patterns",
    prompt="Analyze all authentication patterns in this codebase. Find security gaps.",
)

Pattern: orchestrator tries the CLI directly, falls back to a coder agent.

Parallel Delegation (Multiple Independent Tasks)

MANDATORY: Always analyze parallelizability when 2+ tasks are identified.

Before presenting recommendations or starting multi-task work, ALWAYS:

Check dependency graph — do any tasks depend on outputs of others?
Check file overlap — do tasks touch the same files/modules?
If independent → propose parallel worktree execution as the DEFAULT
If dependent → identify the critical path and parallelize what you can

Decision matrix:

Dependency?	File Overlap?	Action
No	No	Parallel worktrees (DEFAULT)
No	Yes	Sequential (same files = merge conflicts)
Yes	No	Pipeline (parallel where deps allow)
Yes	Yes	Sequential
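The decision matrix above reduces to two checks. The sketch below is illustrative only; the function names and the shape of the `task_files` mapping are assumptions.

```python
from itertools import combinations


def files_overlap(task_files):
    """True if any two tasks touch at least one common file.

    task_files: mapping of task name -> set of file paths (hypothetical shape).
    """
    return any(a & b for a, b in combinations(task_files.values(), 2))


def choose_execution(has_dependency, has_file_overlap):
    """Return the action from the decision matrix."""
    if has_file_overlap:
        return "sequential"  # same files = merge conflicts
    if has_dependency:
        return "pipeline"  # parallel where dependencies allow
    return "parallel worktrees"  # the default for independent tasks


plan = choose_execution(
    has_dependency=False,
    has_file_overlap=files_overlap({"A": {"src/a.py"}, "B": {"src/b.py"}}),
)
```

File overlap is checked first because it forces sequential execution in both rows where it appears.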

Pattern: Spawn all at once in isolated worktrees

# Launch parallel agents in worktrees — one per feature
Agent(
    subagent_type="htmlgraph:sonnet-coder",
    description="Feature A",
    prompt="Implement feature A...",
    isolation="worktree",
    run_in_background=True,
)

Agent(
    subagent_type="htmlgraph:sonnet-coder",
    description="Feature B",
    prompt="Implement feature B...",
    isolation="worktree",
    run_in_background=True,
)

Agent(
    subagent_type="htmlgraph:haiku-coder",
    description="Feature C (simple)",
    prompt="Implement feature C...",
    isolation="worktree",
    run_in_background=True,
)

Benefits:

3 tasks in parallel: time = max(T1, T2, T3) instead of T1+T2+T3
Cost optimization: Uses cheapest model for each task
Worktree isolation: No merge conflicts during execution
Independent results: Each task tracked separately

After completion: Merge worktree branches to main, run quality gates, clean up.

Sequential Delegation with Dependencies

Pattern: Chain dependent tasks in sequence

# 1. Research existing patterns
Task(
    subagent_type="gemini",
    description="Research OAuth patterns",
    prompt="Find all OAuth implementations in codebase..."
)

# 2. Wait for research, then implement
# (In next message after reading result)
research_findings = "..."  # Read from previous task result

Task(
    subagent_type="codex",
    description="Implement OAuth based on research",
    prompt=f"""
    Implement OAuth using discovered patterns:
    {research_findings}
    """
)

# 3. Wait for implementation, then commit
Task(
    subagent_type="copilot",
    description="Commit implementation",
    prompt="Commit OAuth implementation..."
)

When to use: When later tasks depend on earlier results

HtmlGraph Result Retrieval

Subagents report findings automatically:

When a Task() completes, findings are available via CLI:

# Check recent spikes
htmlgraph spike list

# View specific spike
htmlgraph spike show <id>

Pattern: Read findings after Task completes

# 1. Delegate exploration (try gemini CLI first)
gemini -p "Find all authentication patterns..." --output-format json --yolo --include-directories . 2>&1
# fallback → Agent(subagent_type="htmlgraph:haiku-coder", ...)

# 2. The subagent creates a spike with findings
# Read findings via: htmlgraph spike list (then spike show <id>)

# 3. Use findings in next delegation (try codex CLI first)
codex exec "Implement authentication based on auth pattern research findings..." --full-auto --json -m gpt-4.1-mini -C . 2>&1
# fallback → Agent(subagent_type="htmlgraph:sonnet-coder", ...)

Error Handling & Retries

Let subagents handle retries:

# WRONG - Don't retry directly as orchestrator
bash_result = Bash(command="git commit -m 'feat: new'")
if failed:  # pseudocode: the commit above reported an error
    # Retrying here pollutes the orchestrator's context
    Bash(command="git pull && git commit -m 'feat: new'")  # More context used

# CORRECT - Subagent handles retries
Task(
    subagent_type="copilot",
    description="Commit changes with retry",
    prompt="""
    Commit changes:
    Message: "feat: new feature"

    If commit fails:
    1. Pull latest changes
    2. Resolve conflicts if any
    3. Retry commit
    4. Handle pre-commit hooks

    Report final status: success or failure
    """
)

Benefits:

Subagent context handles retries (not your context)
Cleaner error reporting
Automatic recovery attempts
You get clean success/failure

Advanced: Post-Compact Persistence

Orchestrator Activation After Compact

How it works:

Before a compact, the SDK sets the environment variable CLAUDE_ORCHESTRATOR_ACTIVE=true
SessionStart hook detects post-compact state
Orchestrator Directives Skill auto-activates
This skill section appears automatically (first time post-compact)

Why: Preserve orchestration discipline after context compact

What you see:

Skill automatically activates (no manual invocation needed)
Quick start section visible by default
Expand detailed sections as needed
Full guidance available without re-reading docs

To manually trigger:

/orchestrator-directives

Environment variable:

CLAUDE_ORCHESTRATOR_ACTIVE=true  # Set by SDK
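A hook or script could check the flag roughly as follows. This is a minimal sketch: the helper name is hypothetical, and treating only the literal value "true" (case-insensitive) as active is an assumption about how the SDK sets it.

```python
import os


def orchestrator_active(env=None):
    """Return True when CLAUDE_ORCHESTRATOR_ACTIVE is set to "true"."""
    env = os.environ if env is None else env
    return env.get("CLAUDE_ORCHESTRATOR_ACTIVE", "").strip().lower() == "true"
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.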

Session Continuity Across Compacts

Features preserved across compact:

Work items in HtmlGraph
Feature/spike tracking
Delegation patterns
Model selection guidance
This skill's guidance

What's lost:

Your context (that's why compact happens)
Intermediate tool outputs
Local variables

Re-activation pattern:

Before compact:
- Work on features, track in HtmlGraph
- Delegate with clear prompts
- Use SDK to save progress

After compact:
- Orchestrator Skill auto-activates
- Re-read recent spikes for context
- Continue delegations
- Use Task IDs for parallel coordination

Core Development Principles (Enforce in ALL Delegations)

When delegating to ANY coder agent, include these requirements in the prompt:

Research First

Search for existing libraries before implementing from scratch
Check pyproject.toml before adding new dependencies
Prefer well-maintained packages over custom implementations

Code Design

DRY — Extract shared logic; check src/python/htmlgraph/utils/ for existing utilities before writing new ones
Single Responsibility — One clear purpose per module, class, and function
KISS — Simplest solution that satisfies current requirements
YAGNI — Only implement what is needed now, not speculative future needs
Composition over inheritance

Module Size Limits

Functions: <50 lines | Classes: <300 lines | Modules: <500 lines
If a module would exceed limits, split it as part of the work — do not defer refactoring
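The size limits above can be checked mechanically. The following is a sketch using only the standard library; the function name and the message format are assumptions, and it counts physical lines, including blank lines and comments.

```python
import ast

# Limits from the rule above: each length must stay strictly below these.
FUNCTION_LIMIT, CLASS_LIMIT, MODULE_LIMIT = 50, 300, 500


def size_violations(source):
    """Return a list of messages for functions, classes, or modules over the limit."""
    violations = []
    module_lines = len(source.splitlines())
    if module_lines >= MODULE_LIMIT:
        violations.append(f"module: {module_lines} lines")
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            limit = FUNCTION_LIMIT
        elif isinstance(node, ast.ClassDef):
            limit = CLASS_LIMIT
        else:
            continue
        # lineno/end_lineno span the definition from its def/class line to its last line.
        length = node.end_lineno - node.lineno + 1
        if length >= limit:
            violations.append(f"{node.name}: {length} lines")
    return violations


long_function = "def g():\n" + "    x = 1\n" * 60
report = size_violations(long_function)
```

A delegation prompt could ask the coder agent to run a check like this before committing and to split any module that is reported.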

Before Committing

uv run ruff check --fix && uv run ruff format && uv run mypy src/ && uv run pytest

Never commit with unresolved type errors, lint warnings, or test failures.

Core Philosophy

Core Principles Summary

Principle 1: Delegation > Direct Execution

Cascading failures consume exponentially more context than structured delegation
One failed bash call becomes 3-5 calls with retries
Delegation isolates failures to subagent context

Principle 2: Cost-First > Capability-First

Use FREE/cheap AIs (Gemini, Codex, Copilot) before expensive Claude Code
Gemini: FREE (exploration)
Codex: 70% cheaper (code)
Copilot: 60% cheaper (git)
Claude: Expensive (strategic only)

Principle 3: You Don't Know the Outcome

What looks like "one tool call" often becomes many
Unexpected failures, conflicts, retries consume context
Delegation removes unpredictability from orchestrator context

Principle 4: Parallel > Sequential

Multiple subagents can work simultaneously
Much faster than sequential execution
Orchestrator stays available for decisions

Principle 5: Track Everything

Use HtmlGraph CLI to track delegations
Features, spikes, bugs created for all work
Clear record of who did what

Quick Reference Table

Operation Type → Correct Delegation

Operation	MUST Use	Cost	Fallback
Search files	`Bash("gemini ...")`	FREE	haiku-coder
Pattern analysis	`Bash("gemini ...")`	FREE	haiku-coder
Documentation research	`Bash("gemini ...")`	FREE	haiku-coder
Code generation	`Bash("codex ...")`	$ (70% off)	sonnet-coder
Bug fixes	`Bash("codex ...")`	$ (70% off)	haiku-coder
Write tests	`Bash("codex ...")`	$ (70% off)	haiku-coder
Git commits	`Bash("copilot ...")`	$ (60% off)	haiku-coder
Create PRs	`Bash("copilot ...")`	$ (60% off)	haiku-coder
Architecture	Claude Opus	$$$$	Sonnet
Strategic decisions	Claude Opus	$$$$	Task()

Key: FREE = No cost | $ = Cheap | $$$$ = Expensive (but necessary)

Pre-Work Validation (YOLO Mode Hook)

The PreToolUse hook enforces attribution before code changes. Behavior by scenario:

Active Work Item	Tool	Action
Feature	Read	Allow
Feature	Write/Edit/Delete	Allow
Spike	Read	Allow
Spike	Write/Edit/Delete	Warn + Allow
None	Read	Allow
None	Write/Edit (1 file)	Warn + Allow
None	Write/Edit (3+ files)	Deny

When denied: Create a work item first, then retry.

htmlgraph feature create "Title" --track <trk-id>   # creates + returns feat-id
htmlgraph feature start <feat-id>                   # sets attribution for this session

Decision rule for code changes:

Single file, <30 min → direct change (warns, allows)
3+ files, or new tests, or multi-component → create feature first
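The hook behavior in the table can be approximated as a small function. This is a sketch of the table only, not the actual PreToolUse hook: the function name and return values are hypothetical. The table lists 1 file and 3+ files, so handling 2 files like 1 file is an assumption, as is treating Delete like Write/Edit when no work item is active.

```python
def pre_tool_decision(work_item, tool, file_count=1):
    """Return "allow", "warn" (warn + allow), or "deny".

    work_item: "feature", "spike", or None
    tool: "Read", "Write", "Edit", or "Delete"
    """
    if tool == "Read":
        return "allow"  # reads are always allowed
    if work_item == "feature":
        return "allow"
    if work_item == "spike":
        return "warn"
    # No active work item from here on.
    if file_count >= 3:
        return "deny"
    return "warn"  # assumption: 2 files is treated like 1 file
```

A "deny" result corresponds to the case where a feature must be created and started before retrying.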

Related Skills

/multi-ai-orchestration - Comprehensive model selection guide with detailed decision matrix
/code-quality - Quality gates and pre-commit workflows
/strategic-planning - HtmlGraph analytics for smart prioritization

Reference Documentation

Complete Rules: See orchestration.md
Advanced Patterns: See reference.md
HtmlGraph CLI: htmlgraph --help

Quick Summary

Cost-First Orchestration:

Bash("gemini ...") (FREE) → exploration, research, analysis → fallback: haiku-coder
Bash("codex ...") (70% off) → code implementation, fixes, tests → fallback: sonnet-coder
Bash("copilot ...") (60% off) → git operations, PRs → fallback: haiku-coder
Claude Opus → deep reasoning, strategy only

Orchestrator Rule: Execute only Task(), AskUserQuestion(), TodoWrite(), and SDK operations directly

Everything else → Delegate to appropriate spawner

When in doubt → DELEGATE