Complete guide to managing sub-agent and teammate lifecycle in Claude Code — health checks, idle detection, cleanup, retention, and the mandatory audit loop.
Claude should prefer to orchestrate rather than do work directly. When a task has multiple components or would benefit from specialized review, delegate it to sub-agents or a team. Direct work is the fallback, not the default.
```
┌──────────┐ spawn/resume  ┌──────────┐  completes   ┌──────────┐
│ spawning │──────────────→│  active  │─────────────→│completed │
└──────────┘               └──────────┘              └──────────┘
                                │ no output               │
                                │ >60s                collected?
                                ↓                     yes /  \ no
                           ┌──────────┐                 ↓    ↓
                           │   idle   │          released    orphaned
                           └──────────┘
                                │ no response
                                │ >120s
                                ↓
                           ┌──────────┐
                           │ stalled  │
                           └──────────┘
                                │ terminate
                                ↓
                           ┌──────────┐
                           │ errored  │
                           └──────────┘
```
For foreground agents: they return results when complete; no polling is needed.

For background agents: use SendMessage to check in:

```
SendMessage(to: "agent-id", message: "Status check: progress on your task?")
```
For agent teams: Teammates communicate via shared task list. Check TaskList for stale assignments.
```
"Status check on your assigned task: [{task}].
Reply with one of:
- WORKING: {current activity}
- BLOCKED: {what's blocking}
- DONE: {summary}
- NEEDS_HELP: {what you need}"
```
| Signal | Threshold | Action |
|---|---|---|
| No output | 60s | Send check-in |
| No response to check-in | 30s | Mark stalled |
| Stalled | 180s | Terminate, reassign |
| Completed but uncollected | 60s | Collect results |
| Errored | Immediate | Collect partial, log |
```yaml
cleanup_steps:
  1_identify:
    - List all agent IDs from spawn records
    - Check each agent's last known state
    - Identify: completed, idle, stalled, errored
  2_collect:
    - For completed agents: gather results
    - For errored agents: gather partial output + error
    - For stalled agents: attempt final check-in
  3_terminate:
    - Release completed agents (results collected)
    - Terminate stalled agents (no response after 3 min)
    - Terminate errored agents (unrecoverable)
    - Keep active agents running
  4_report:
    - Log cleanup summary
    - Report any lost work
    - Suggest task reassignment if needed
```
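The four cleanup steps above can be condensed into one pass over the spawn records. This is a sketch under assumed data shapes: `agents` is a list of dicts with hypothetical `id`, `state`, `results`, and `error` fields, not a real Claude Code structure.

```python
def cleanup(agents: list[dict]) -> dict:
    """One cleanup pass: identify, collect, terminate, report."""
    report = {"collected": [], "terminated": [], "kept": [], "lost": []}
    for agent in agents:
        state = agent["state"]                        # 1_identify
        if state == "completed":
            report["collected"].append(agent.get("results"))  # 2_collect
            report["terminated"].append(agent["id"])          # 3_terminate: release
        elif state == "errored":
            report["collected"].append(agent.get("error"))    # partial output + error
            report["terminated"].append(agent["id"])
        elif state == "stalled":
            report["lost"].append(agent["id"])                # 4_report: lost work
            report["terminated"].append(agent["id"])
        else:
            report["kept"].append(agent["id"])                # active agents keep running
    return report
```

Anything in `report["lost"]` is a candidate for task reassignment, per step 4.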
Track cumulative token usage across all agents:

- Per-agent budget = total_budget / num_agents
- If an agent exceeds 150% of its per-agent budget → flag it for review
- If total usage exceeds 80% of the budget → switch to cheaper models
- If total usage exceeds 95% → terminate non-essential agents
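The budget rules above can be sketched as a function; the thresholds come from the text, while the function name and returned action labels are illustrative assumptions.

```python
def budget_actions(total_used: int, total_budget: int,
                   agent_used: int, num_agents: int) -> list[str]:
    """Apply the token-budget rules to one agent plus the overall total."""
    actions = []
    per_agent = total_budget / num_agents
    if agent_used > 1.5 * per_agent:
        actions.append("flag agent for review")
    if total_used > 0.95 * total_budget:          # most severe rule wins
        actions.append("terminate non-essential agents")
    elif total_used > 0.80 * total_budget:
        actions.append("switch to cheaper models")
    return actions
```

The 95% rule is checked before the 80% rule so that a near-exhausted budget triggers termination rather than just a model downgrade.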
Every agent's work must be audited before acceptance.
```
Agent completes task
    ↓
Orchestrator spawns audit agent (code-reviewer or audit-reviewer)
    ↓
Audit agent reviews:
  - Completeness (did it do everything asked?)
  - Correctness  (is the output right?)
  - Consistency  (does it match project style?)
  - Security     (any vulnerabilities introduced?)
  - Tests        (are new things tested?)
    ↓
Audit verdict:
  PASS            → accept, merge into deliverables
  PASS_WITH_NOTES → accept, log improvement notes
  FAIL            → send back to original agent with specific fixes
    ↓
Re-do (max 2 rework rounds)
    ↓
Re-audit
    ↓
If still fails → escalate to user
```
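The audit loop above, including the two-rework cap, can be sketched as a small driver. `do_task` and `audit` are placeholder callables standing in for spawning the worker and audit agents; neither is a real Claude Code API.

```python
def audited_run(do_task, audit, max_rework: int = 2):
    """Run a task, audit it, and allow up to `max_rework` fix rounds.

    do_task(notes) -> result   (notes is None on the first attempt)
    audit(result) -> (verdict, notes), verdict in
                     {"PASS", "PASS_WITH_NOTES", "FAIL"}
    """
    result = do_task(None)
    verdict, notes = audit(result)
    rounds = 0
    while verdict == "FAIL" and rounds < max_rework:
        result = do_task(notes)        # send back with specific fixes
        verdict, notes = audit(result) # re-audit
        rounds += 1
    if verdict == "FAIL":
        verdict = "ESCALATE"           # still failing → escalate to user
    return result, verdict, notes
```

Capping rework keeps a persistently failing agent from burning budget in an infinite fix/audit cycle.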
| Original Agent | Audit Agent | Focus |
|---|---|---|
| Builder/Implementer | code-reviewer | Correctness, completeness |
| Test writer | code-reviewer | Test quality, coverage |
| Security reviewer | audit-reviewer (opus) | False positives, missed vectors |
| Researcher | audit-reviewer (opus) | Source quality, bias |
| Doc writer | code-reviewer | Accuracy, completeness |
| Infrastructure | security-reviewer | Misconfigs, blast radius |
```yaml
audit_levels:
  quick:
    checks: [completeness, correctness]
    model: haiku
    cost: low
  standard:
    checks: [completeness, correctness, consistency, security]
    model: sonnet
    cost: medium
  thorough:
    checks: [completeness, correctness, consistency, security, testing, documentation]
    model: opus
    cost: high
```
Teams have additional lifecycle considerations:
```yaml
team_lifecycle:
  spawn:
    - Lead session creates team
    - Lead defines task list with dependencies
    - Teammates join and claim tasks
  active:
    - Teammates self-coordinate via TaskList
    - Direct messaging via SendMessage
    - Lead monitors overall progress
  idle_detection:
    - Teammate hasn't claimed new task in 120s
    - Teammate's current task hasn't progressed
    - TeammateIdle hook fires
  cleanup:
    - Lead sends final status check to all teammates
    - Collect all teammate outputs
    - TeamDelete to dissolve team
    - Prune any worktrees created
  audit:
    - Lead reviews each teammate's work
    - Cross-check: teammate A audits teammate B's output
    - Final synthesis by lead
```
In teams of 3+, use cross-auditing:
```
Agent A completes → Agent B audits A's work
Agent B completes → Agent C audits B's work
Agent C completes → Agent A audits C's work
```
This distributes audit load and provides diverse perspectives.
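The rotation above is a ring: each agent's auditor is the next agent around, so no one audits their own work. A minimal sketch (the function name is illustrative):

```python
def cross_audit_pairs(agents: list[str]) -> dict[str, str]:
    """Map each worker to its auditor: the next agent around the ring."""
    n = len(agents)
    if n < 3:
        raise ValueError("cross-auditing needs a team of 3+")
    return {agents[i]: agents[(i + 1) % n] for i in range(n)}
```

Because the mapping is a single rotation, every agent both produces work and audits exactly one peer, which is what distributes the audit load evenly.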