From htmlgraph
Cost-first delegation patterns and decision frameworks for multi-AI coordination
npx claudepluginhub shakestzd/htmlgraph --plugin htmlgraph
This skill uses the workspace's default tool permissions.
Use this skill for delegation patterns and decision frameworks in orchestrator mode.
Trigger keywords: orchestrator, delegation, subagent, task coordination, parallel execution, cost-first, spawner
Delegate tactical work to specialized subagents while you focus on strategic decisions. Save Claude Code context (expensive) by using FREE/CHEAP AIs for appropriate tasks.
Basic pattern:
Task(
    subagent_type="gemini",  # FREE - use for exploration
    description="Find auth patterns",
    prompt="Search codebase for authentication patterns..."
)
When to use: ALWAYS use for complex tasks requiring research, code generation, git operations, or any work that could fail and require retries.
For complete guidance: See sections below or run /multi-ai-orchestration for model selection details.
Claude Code is EXPENSIVE. You MUST delegate to FREE/CHEAP AIs first.
Ask these questions IN ORDER:
Can Gemini do this? → Exploration, research, batch ops, file analysis
Bash("gemini ...") first (FREE - 2M tokens/min), fallback to haiku-coder
Is this code work? → Implementation, fixes, tests, refactoring
Bash("codex ...") first (70% cheaper than Claude), fallback to sonnet-coder
Is this git/GitHub? → Commits, PRs, issues, branches
Bash("copilot ...") first (60% cheaper, GitHub-native), fallback to haiku-coder
Does this need deep reasoning? → Architecture, complex planning
Is this coordination? → Multi-agent work
ONLY if above fail → Haiku (fallback)
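The ordered questions above can be read as a lookup from task category to a cheap CLI plus a fallback agent. The following is a minimal sketch of that reading, not part of HtmlGraph: the `Route` type, the `ROUTES` table, the category labels, and `choose_route` are all hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    primary: str   # CLI tried first via Bash
    fallback: str  # coder agent used when the CLI fails or is not installed


# Pairings copied from the questions above; the category keys are
# hypothetical labels, not identifiers used by HtmlGraph.
ROUTES = {
    "exploration": Route("gemini", "haiku-coder"),
    "code": Route("codex", "sonnet-coder"),
    "git": Route("copilot", "haiku-coder"),
}


def choose_route(category: str) -> Optional[Route]:
    """Return the cheap-first route, or None when the list names no CLI."""
    return ROUTES.get(category)


print(choose_route("code"))
```

A `None` result corresponds to the later questions (deep reasoning, coordination), for which the list above names no CLI.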
| Task | WRONG (Cost) | CORRECT (Cost) | Savings |
|---|---|---|---|
| Search 100 files | Task() ($15-25) | Gemini spawner (FREE) | 100% |
| Generate code | Task() ($10) | Codex spawner ($3) | 70% |
| Git commit | Task() ($5) | Copilot spawner ($2) | 60% |
| Strategic decision | Direct task ($20) | Claude Opus ($50) | Must pay for quality |
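The Savings column follows from the two cost columns. As a quick check of that arithmetic, here is a small sketch; `savings_percent` is a hypothetical helper, and the dollar figures are the illustrative ones from the table, not measured prices.

```python
def savings_percent(baseline_cost: float, delegated_cost: float) -> float:
    """Percentage saved by the delegated option relative to the baseline."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return round(100 * (baseline_cost - delegated_cost) / baseline_cost, 1)


print(savings_percent(10, 3))  # code generation row: 70.0
print(savings_percent(5, 2))   # git commit row: 60.0
print(savings_percent(15, 0))  # file search row (free spawner): 100.0
```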
WRONG (wastes Claude quota):
- Code implementation → Task(haiku) # USE Bash("codex ..."), fallback sonnet-coder
- Git commits → Task(haiku) # USE Bash("copilot ..."), fallback haiku-coder
- File search → Task(haiku) # USE Bash("gemini ...") (FREE!)
- Research → Task(haiku) # USE Bash("gemini ...") (FREE!)
CORRECT (cost-optimized):
- Code implementation → Bash("codex ...") # Cheap, sandboxed; fallback sonnet-coder
- Git commits → Bash("copilot ...") # Cheap, GitHub-native; fallback haiku-coder
- File search → Bash("gemini ...") # FREE!; fallback haiku-coder
- Research → Bash("gemini ...") # FREE!; fallback haiku-coder
- Strategic decisions → Claude Opus # Expensive, but needed
- Coder agents → FALLBACK ONLY # When CLI tools fail or aren't installed
Orchestrator (You):
Executor (Subagent):
Why separation matters:
What looks like "one bash call" becomes many:
Context cost comparison:
Direct execution (fails):
bash call 1 → fails
bash call 2 → fails
bash call 3 → fix code
bash call 4 → bash call 1 retry
bash call 5 → bash call 2 retry
= 5+ tool calls, context consumed
Delegation (cascades isolated):
Task(subagent handles all retries) → 1 tool call
Read result → 1 tool call
= 2 tool calls, clean context
Token savings:
Ask yourself these questions:
Will this likely be ONE tool call?
Does this require error handling?
Could this cascade into multiple operations?
Is this strategic or tactical?
Rule of thumb: When in doubt, ALWAYS DELEGATE. Cascading failures are expensive.
Only these can be executed directly by orchestrator:
Task() - Delegation itself
Task(subagent_type="htmlgraph:gemini-spawner", ...)
AskUserQuestion() - Clarifying requirements
AskUserQuestion("Should we use Redis or PostgreSQL?")
TodoWrite() - Tracking work items
TodoWrite(todos=[...])
HtmlGraph CLI operations (create features and bugs):
htmlgraph feature create "title" --track <trk-id>
htmlgraph bug create "title" --track <trk-id>
Track Assignment (MANDATORY before creating work items):
Before creating ANY new track:
Run htmlgraph track list to see all existing tracks
This also applies when creating bugs, features, or spikes with --track:
Everything else MUST be delegated.
Decision tree (check each in order):
Is this exploration/research/analysis?
Is this code implementation/testing?
Is this git/GitHub operation?
Does this need deep reasoning?
Is this multi-agent coordination?
All else fails → Task() with Haiku (fallback)
Delegation Pattern:
Bash("gemini ...") - FREE, 2M tokens/min, exploration & research → fallback: haiku-coder
Bash("codex ...") - Cheap code specialist, implementation & testing → fallback: sonnet-coder
Bash("copilot ...") - Cheap git specialist, GitHub integration → fallback: haiku-coder
Coder agents (haiku-coder, sonnet-coder) - Fallback only when CLI tools fail
gemini -p "Analyze codebase for:
- All authentication patterns
- OAuth implementations
- Session management
- JWT usage" --output-format json --yolo --include-directories . 2>&1
If gemini fails/unavailable → fallback to haiku-coder
Best for:
codex exec "Implement OAuth authentication:
- Add JWT token generation
- Include error handling
- Write unit tests" --full-auto --json -m gpt-4.1-mini -C . 2>&1
If codex fails/unavailable → fallback to sonnet-coder
Best for:
copilot -p "Commit changes:
- Message: 'feat: add OAuth authentication'
- Files: src/auth/*.py, tests/test_auth.py
- Do NOT push" --allow-all-tools --no-color --add-dir . 2>&1
If copilot fails/unavailable → fallback to haiku-coder
Best for:
Task(
    prompt="Design authentication architecture...",
    subagent_type="sonnet"  # or "opus" for deep reasoning
)
Sonnet (Mid-tier):
Opus (Expensive):
Simple exploration (try CLI first):
gemini -p "Search codebase for authentication patterns and summarize findings" \
--output-format json --yolo --include-directories . 2>&1
# fallback → Agent(subagent_type="htmlgraph:haiku-coder", ...)
Code implementation (try CLI first):
codex exec "Implement OAuth authentication endpoint with JWT support" \
--full-auto --json -m gpt-4.1-mini -C . 2>&1
# fallback → Agent(subagent_type="htmlgraph:sonnet-coder", ...)
Git operations (try CLI first):
copilot -p "Commit changes with message: 'feat: add OAuth authentication'. Do NOT push." \
--allow-all-tools --no-color --add-dir . 2>&1
# fallback → Agent(subagent_type="htmlgraph:haiku-coder", ...)
Try the Copilot CLI directly via Bash first, then delegate to haiku-coder if unavailable.
# Priority 1: Bash-copilot (preferred)
copilot -p "Stage files: <list>. Commit with message: '<message>'. Do NOT push." \
--allow-all-tools --no-color --add-dir . 2>&1
# Priority 2: haiku-coder fallback (if copilot fails or not installed)
Agent(
    subagent_type="htmlgraph:haiku-coder",
    description="Commit changes",
    prompt="Stage files: <list>. Commit with message: 'feat: add X'. Do NOT push.",
)
Pattern: orchestrator tries the CLI directly, falls back to a coder agent.
For implementation, refactoring, and structured output tasks:
# Priority 1: Bash-codex (preferred)
codex exec "TASK_DESCRIPTION" --full-auto --json -m gpt-4.1-mini -C . 2>&1
# Priority 2: sonnet-coder fallback (if codex fails or not installed)
Agent(
    subagent_type="htmlgraph:sonnet-coder",
    description="Implement feature X",
    prompt="Add OAuth authentication to the login endpoint.",
)
Pattern: orchestrator tries the CLI directly, falls back to a coder agent.
Always pass -m gpt-4.1-mini to codex; never rely on the expensive gpt-5.4 default.
For codebase exploration, documentation research, and large-context analysis:
# Priority 1: Bash-gemini (preferred — FREE, 2M context)
gemini -p "TASK_DESCRIPTION" --output-format json --yolo --include-directories . 2>&1
# Priority 2: haiku-coder fallback (if gemini fails or not installed)
Agent(
    subagent_type="htmlgraph:haiku-coder",
    description="Research auth patterns",
    prompt="Analyze all authentication patterns in this codebase. Find security gaps.",
)
Pattern: orchestrator tries the CLI directly, falls back to a coder agent.
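The three CLI-first patterns share one control flow: check that the CLI exists, run it, and signal "fall back" on any failure. The sketch below shows that flow in Python under stated assumptions. `run_cli_first` and its `timeout_s` parameter are hypothetical, and treating a non-zero exit code as failure is an assumption, since the source does not say how failure is detected.

```python
import shutil
import subprocess
from typing import List, Optional


def run_cli_first(argv: List[str], timeout_s: float = 600.0) -> Optional[str]:
    """Run a CLI and return its combined output, or None to signal fallback."""
    if shutil.which(argv[0]) is None:
        return None  # CLI is not installed
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # same effect as the 2>&1 in the commands above
            text=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None  # CLI hung or could not be started
    # Assumption: a non-zero exit code means the CLI attempt failed.
    return proc.stdout if proc.returncode == 0 else None
```

A caller would pass, for example, `["gemini", "-p", "TASK_DESCRIPTION", "--output-format", "json", "--yolo", "--include-directories", "."]` and delegate to the fallback coder agent whenever the result is `None`.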
MANDATORY: Always analyze parallelizability when 2+ tasks are identified.
Before presenting recommendations or starting multi-task work, ALWAYS:
Decision matrix:
| Dependency? | File Overlap? | Action |
|---|---|---|
| No | No | Parallel worktrees (DEFAULT) |
| No | Yes | Sequential (same files = merge conflicts) |
| Yes | No | Pipeline (parallel where deps allow) |
| Yes | Yes | Sequential |
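The matrix reduces to two checks: any file overlap forces sequential work, and otherwise the presence of a dependency decides between a pipeline and parallel worktrees. A minimal sketch, with `plan_execution` and its return strings as hypothetical names:

```python
def plan_execution(has_dependency: bool, has_file_overlap: bool) -> str:
    """Map the two questions in the decision matrix to an execution mode."""
    if has_file_overlap:
        # Same files would cause merge conflicts, with or without dependencies.
        return "sequential"
    return "pipeline" if has_dependency else "parallel worktrees"


print(plan_execution(has_dependency=False, has_file_overlap=False))
```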
Pattern: Spawn all at once in isolated worktrees
# Launch parallel agents in worktrees — one per feature
Agent(
    subagent_type="htmlgraph:sonnet-coder",
    description="Feature A",
    prompt="Implement feature A...",
    isolation="worktree",
    run_in_background=True,
)
Agent(
    subagent_type="htmlgraph:sonnet-coder",
    description="Feature B",
    prompt="Implement feature B...",
    isolation="worktree",
    run_in_background=True,
)
Agent(
    subagent_type="htmlgraph:haiku-coder",
    description="Feature C (simple)",
    prompt="Implement feature C...",
    isolation="worktree",
    run_in_background=True,
)
Benefits:
After completion: Merge worktree branches to main, run quality gates, clean up.
Pattern: Chain dependent tasks in sequence
# 1. Research existing patterns
Task(
    subagent_type="gemini",
    description="Research OAuth patterns",
    prompt="Find all OAuth implementations in codebase..."
)
# 2. Wait for research, then implement
# (In next message after reading result)
research_findings = "..."  # Read from previous task result
Task(
    subagent_type="codex",
    description="Implement OAuth based on research",
    prompt=f"""
Implement OAuth using discovered patterns:
{research_findings}
"""
)
# 3. Wait for implementation, then commit
Task(
    subagent_type="copilot",
    description="Commit implementation",
    prompt="Commit OAuth implementation..."
)
When to use: When later tasks depend on earlier results
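When some tasks depend on others, one way to find which can run together is to group them into waves, where each wave contains only tasks whose prerequisites have finished. The sketch below uses the standard library's `graphlib` (Python 3.9+). `execution_waves` and the task names are hypothetical, and the approach is an illustration rather than something HtmlGraph is known to do.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def execution_waves(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; tasks within one wave have no pending dependencies."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


# The research -> implement -> commit chain above is fully sequential.
print(execution_waves({
    "research": set(),
    "implement": {"research"},
    "commit": {"implement"},
}))  # [['research'], ['implement'], ['commit']]
```

A wave with more than one task is a candidate for parallel worktrees, provided the tasks do not touch the same files.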
Subagents report findings automatically:
When a Task() completes, findings are available via CLI:
# Check recent spikes
htmlgraph spike list
# View specific spike
htmlgraph spike show <id>
Pattern: Read findings after Task completes
# 1. Delegate exploration (try gemini CLI first)
gemini -p "Find all authentication patterns..." --output-format json --yolo --include-directories . 2>&1
# fallback → Agent(subagent_type="htmlgraph:haiku-coder", ...)
# 2. The subagent creates a spike with findings
# Read findings via: htmlgraph spike list (then spike show <id>)
# 3. Use findings in next delegation (try codex CLI first)
codex exec "Implement authentication based on auth pattern research findings..." --full-auto --json -m gpt-4.1-mini -C . 2>&1
# fallback → Agent(subagent_type="htmlgraph:sonnet-coder", ...)
Let subagents handle retries:
# WRONG - Don't retry directly as orchestrator
bash_result = Bash(command="git commit -m 'feat: new'")
if failed:
    # Retry directly (context pollution)
    Bash(command="git pull && git commit")  # More context used
# CORRECT - Subagent handles retries
Task(
    subagent_type="copilot",
    description="Commit changes with retry",
    prompt="""
    Commit changes:
    Message: "feat: new feature"
    If commit fails:
    1. Pull latest changes
    2. Resolve conflicts if any
    3. Retry commit
    4. Handle pre-commit hooks
    Report final status: success or failure
    """
)
Benefits:
How it works:
CLAUDE_ORCHESTRATOR_ACTIVE=true
Why: Preserve orchestration discipline after context compact
What you see:
To manually trigger:
/orchestrator-directives
Environment variable:
CLAUDE_ORCHESTRATOR_ACTIVE=true # Set by SDK
Features preserved across compact:
What's lost:
Re-activation pattern:
Before compact:
- Work on features, track in HtmlGraph
- Delegate with clear prompts
- Use SDK to save progress
After compact:
- Orchestrator Skill auto-activates
- Re-read recent spikes for context
- Continue delegations
- Use Task IDs for parallel coordination
When delegating to ANY coder agent, include these requirements in the prompt:
Check pyproject.toml before adding new dependencies
Check src/python/htmlgraph/utils/ for existing utilities before writing new ones
uv run ruff check --fix && uv run ruff format && uv run mypy src/ && uv run pytest
Never commit with unresolved type errors, lint warnings, or test failures.
Principle 1: Delegation > Direct Execution
Principle 2: Cost-First > Capability-First
Principle 3: You Don't Know the Outcome
Principle 4: Parallel > Sequential
Principle 5: Track Everything
Delegation > Direct Execution. Cascading failures consume far more context than structured delegation.
Cost-First > Capability-First. Use FREE/cheap AIs before expensive Claude models.
| Operation | MUST Use | Cost | Fallback |
|---|---|---|---|
| Search files | Bash("gemini ...") | FREE | haiku-coder |
| Pattern analysis | Bash("gemini ...") | FREE | haiku-coder |
| Documentation research | Bash("gemini ...") | FREE | haiku-coder |
| Code generation | Bash("codex ...") | $ (70% off) | sonnet-coder |
| Bug fixes | Bash("codex ...") | $ (70% off) | sonnet-coder |
| Write tests | Bash("codex ...") | $ (70% off) | sonnet-coder |
| Git commits | Bash("copilot ...") | $ (60% off) | haiku-coder |
| Create PRs | Bash("copilot ...") | $ (60% off) | haiku-coder |
| Architecture | Claude Opus | $$$$ | Sonnet |
| Strategic decisions | Claude Opus | $$$$ | Task() |
Key: FREE = No cost | $ = Cheap | $$$$ = Expensive (but necessary)
The PreToolUse hook enforces attribution before code changes. Behavior by scenario:
| Active Work Item | Tool | Action |
|---|---|---|
| Feature | Read | Allow |
| Feature | Write/Edit/Delete | Allow |
| Spike | Read | Allow |
| Spike | Write/Edit/Delete | Warn + Allow |
| None | Read | Allow |
| None | Write/Edit (1 file) | Warn + Allow |
| None | Write/Edit (3+ files) | Deny |
When denied: Create a work item first, then retry.
htmlgraph feature create "Title" --track <trk-id> # creates + returns feat-id
htmlgraph feature start <feat-id> # sets attribution for this session
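The attribution table can be summarized as a small decision function. This is a sketch of the table's logic only, not the hook's actual implementation: `hook_decision`, its parameters, and its return strings are hypothetical. The table leaves two cases unspecified (exactly 2 files with no active work item, and Delete with no active work item), and this sketch assumes both behave like the nearest listed row.

```python
from typing import Optional


def hook_decision(active_item: Optional[str], tool: str, files_touched: int = 1) -> str:
    """Return 'allow', 'warn+allow', or 'deny' following the table above."""
    if tool == "Read":
        return "allow"  # reads are always allowed
    if active_item == "feature":
        return "allow"
    if active_item == "spike":
        return "warn+allow"
    # No active work item. Assumption: 2 files is treated like 1 file,
    # and Delete is treated like Write/Edit; the table does not list either.
    return "deny" if files_touched >= 3 else "warn+allow"


print(hook_decision(None, "Edit", files_touched=3))  # deny
```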
Decision rule for code changes:
htmlgraph --help
Cost-First Orchestration:
Bash("gemini ...") (FREE) → exploration, research, analysis → fallback: haiku-coder
Bash("codex ...") (70% off) → code implementation, fixes, tests → fallback: sonnet-coder
Bash("copilot ...") (60% off) → git operations, PRs → fallback: haiku-coder
Orchestrator Rule: Only execute: Task(), AskUserQuestion(), TodoWrite(), SDK operations
Everything else → Delegate to appropriate spawner
When in doubt → DELEGATE