Multi-Agent Systems: orchestration vs choreography, tool routing, state management, agent handoffs, parallelization (fan-out/fan-in), error handling in multi-agent workflows, Claude SDK patterns (Agent/Tool/Handoff), and observability with OpenTelemetry.
Patterns for building reliable, scalable multi-agent systems with Claude.
Orchestration (Central Control)          Choreography (Decentralized Events)
───────────────────────────────          ───────────────────────────────────
        Orchestrator                     Agent A ──event──▶ Agent B
        /    |    \                      Agent B ──event──▶ Agent C
  Agent A Agent B Agent C                (no central coordinator)

Choose Orchestration when: the workflow is clear and sequential, and you want a system that is easy to debug and reason about.
Choose Choreography when: you need loose coupling, event-driven flows, or independently scaling microservices.
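Most of the examples below are orchestration-style; for contrast, here is a minimal choreography sketch using an in-process event bus. The EventBus class, topic names, and agent reactions are illustrative, not part of any SDK:

```typescript
// Minimal in-process event bus: each agent subscribes to the events it
// cares about and emits new events — no central coordinator.
type Handler = (payload: string) => void;

class EventBus {
  private handlers = new Map<string, Handler[]>();
  on(topic: string, handler: Handler): void {
    const list = this.handlers.get(topic) ?? [];
    list.push(handler);
    this.handlers.set(topic, list);
  }
  emit(topic: string, payload: string): void {
    for (const h of this.handlers.get(topic) ?? []) h(payload);
  }
}

const bus = new EventBus();
const log: string[] = [];

// Agent B reacts to Agent A's output; Agent C reacts to Agent B's.
bus.on('code.reviewed', (file) => {
  log.push(`B: scanning ${file}`);
  bus.emit('security.scanned', file);
});
bus.on('security.scanned', (file) => {
  log.push(`C: documenting ${file}`);
});

bus.emit('code.reviewed', 'main.ts');
// log: ['B: scanning main.ts', 'C: documenting main.ts']
```

Note the trade-off this sketch makes visible: no single place knows the whole flow, so debugging means tracing events rather than stepping through one orchestrator.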
The orchestrator decides which agent/tool handles a task.
// Claude as router — classify then dispatch
const AGENT_REGISTRY = {
'code-review': codeReviewAgent,
'security-scan': securityAgent,
'test-generation': tddAgent,
'documentation': docAgent,
};
async function route(task: string, context: string): Promise<AgentResult> {
const classification = await claude.messages.create({
model: 'claude-haiku-latest', // Haiku for lightweight routing
system: `Classify the task into one of: ${Object.keys(AGENT_REGISTRY).join(', ')}.
Reply with ONLY the category name.`,
messages: [{ role: 'user', content: task }],
max_tokens: 10,
});
const block = classification.content[0];
const category = block.type === 'text' ? block.text.trim() : '';
const agent = AGENT_REGISTRY[category as keyof typeof AGENT_REGISTRY];
if (!agent) throw new Error(`No agent for category: ${category}`);
return agent.run(task, context);
}
Where does state live between agent calls?
interface StepResult {
name: string;
result: unknown;
timestamp: number;
}
interface WorkflowState {
taskId: string;
input: string;
steps: StepResult[];
metadata: Record<string, unknown>;
}
class WorkflowContext {
private state: WorkflowState;
constructor(taskId: string, input: string) {
this.state = { taskId, input, steps: [], metadata: {} };
}
addStep(name: string, result: unknown): void {
this.state.steps.push({ name, result, timestamp: Date.now() });
}
getLastResult(): unknown {
return this.state.steps.at(-1)?.result;
}
toHandoffSummary(): string {
// Compress context for sub-agent handoffs
return `Task: ${this.state.input}\n` +
`Completed: ${this.state.steps.map(s => s.name).join(', ')}\n` +
`Last result: ${JSON.stringify(this.getLastResult())}`;
}
}
For durable state (Redis, DynamoDB event log) and task decomposition handoffs, see
multi-agent-patterns-advanced.
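The in-memory context above is lost if the process dies. A minimal sketch of a pluggable persistence boundary — the StateStore interface and the in-memory implementation here are illustrative; a Redis or DynamoDB version would implement the same two methods:

```typescript
interface WorkflowState {
  taskId: string;
  input: string;
  steps: { name: string; result: unknown; timestamp: number }[];
  metadata: Record<string, unknown>;
}

// Persistence boundary: workflow code only sees this interface,
// so the backing store can be swapped without touching agents.
interface StateStore {
  save(state: WorkflowState): Promise<void>;
  load(taskId: string): Promise<WorkflowState | undefined>;
}

// In-memory implementation, useful for tests.
class MemoryStateStore implements StateStore {
  private store = new Map<string, WorkflowState>();
  async save(state: WorkflowState): Promise<void> {
    // Clone so later mutations don't leak into the stored snapshot
    this.store.set(state.taskId, structuredClone(state));
  }
  async load(taskId: string): Promise<WorkflowState | undefined> {
    return this.store.get(taskId);
  }
}
```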
Pass the complete conversation history to the next agent — use when the sub-agent needs full context.
async function handoffWithFullContext(
conversation: Message[],
nextAgentSystem: string
): Promise<string> {
const response = await claude.messages.create({
model: 'claude-sonnet-latest',
system: nextAgentSystem,
messages: conversation, // Full history passed through
max_tokens: 4096,
});
return response.content[0].text;
}
Compress context before handoff — use for long workflows or to save tokens.
async function summarizeForHandoff(
context: WorkflowContext,
maxTokens = 500
): Promise<string> {
const summary = await claude.messages.create({
model: 'claude-haiku-latest', // Haiku for cheap summarization
system: 'Summarize the key findings and decisions. Be concise.',
messages: [{
role: 'user',
content: `Summarize this workflow progress for handoff to next agent:\n${context.toHandoffSummary()}`,
}],
max_tokens: maxTokens,
});
return summary.content[0].text;
}
async function parallelReview(codeFiles: string[]): Promise<ReviewResult[]> {
// Fan-out: launch all reviews concurrently
const reviewPromises = codeFiles.map(file => reviewAgent.run(file));
// Fan-in: allSettled keeps one failed review from sinking the whole batch
const results = await Promise.allSettled(reviewPromises);
return results.map((result, i) => {
if (result.status === 'fulfilled') return result.value;
return { file: codeFiles[i], error: String(result.reason), issues: [] };
});
}
// Controlled parallelism (avoid rate limits)
async function parallelWithConcurrencyLimit<T>(
tasks: (() => Promise<T>)[],
concurrency = 5
): Promise<T[]> {
const results: T[] = [];
const chunks: (() => Promise<T>)[][] = [];
for (let i = 0; i < tasks.length; i += concurrency) {
chunks.push(tasks.slice(i, i + concurrency));
}
for (const chunk of chunks) {
const chunkResults = await Promise.all(chunk.map(t => t()));
results.push(...chunkResults);
}
return results;
}
// Exponential backoff helper: 250ms, 500ms, 1s, ...
const backoff = (attempt: number) =>
new Promise(resolve => setTimeout(resolve, 2 ** attempt * 250));

async function runWithFallback<T>(
primary: () => Promise<T>,
fallback: () => Promise<T>,
maxRetries = 2
): Promise<T> {
for (let attempt = 0; attempt <= maxRetries; attempt++) {
try {
return await primary();
} catch (err) {
if (attempt === maxRetries) {
console.warn(`Primary agent failed after ${attempt + 1} attempts, using fallback`);
return fallback();
}
await backoff(attempt); // wait before retrying primary
}
}
throw new Error('Unreachable');
}
// Partial results: continue with what succeeded
async function collectPartialResults<T>(
tasks: Promise<T>[],
minRequired: number
): Promise<T[]> {
const results = await Promise.allSettled(tasks);
const successes = results
.filter((r): r is PromiseFulfilledResult<T> => r.status === 'fulfilled')
.map(r => r.value);
if (successes.length < minRequired) {
throw new Error(`Only ${successes.length}/${tasks.length} tasks succeeded (need ${minRequired})`);
}
return successes;
}
The core agentic loop: call the model, handle tool_use stop reason by executing tools and appending results, repeat until end_turn.
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
async function orchestratorLoop(goal: string): Promise<string> {
const messages: Anthropic.MessageParam[] = [{ role: 'user', content: goal }];
while (true) {
const response = await client.messages.create({
model: 'claude-sonnet-latest',
system: ORCHESTRATOR_SYSTEM_PROMPT,
tools: AVAILABLE_TOOLS,
messages,
max_tokens: 4096,
});
if (response.stop_reason === 'end_turn') {
return response.content
.filter((b): b is Anthropic.TextBlock => b.type === 'text')
.map(b => b.text)
.join('');
}
// Handle tool calls: execute all in parallel, append results
messages.push({ role: 'assistant', content: response.content });
const toolResults = await Promise.all(
response.content
.filter((b): b is Anthropic.ToolUseBlock => b.type === 'tool_use')
.map(async (t) => ({
type: 'tool_result' as const,
tool_use_id: t.id,
content: await executeTool(t.name, t.input),
}))
);
messages.push({ role: 'user', content: toolResults });
}
}
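The loop above delegates to executeTool, which the snippet leaves undefined. One way to sketch it — the tool names and stub handlers here are hypothetical placeholders for real implementations:

```typescript
// Map tool names to handlers; each returns a string for the tool_result block.
type ToolHandler = (input: Record<string, unknown>) => Promise<string>;

const TOOL_HANDLERS: Record<string, ToolHandler> = {
  read_file: async (input) => `contents of ${String(input.path)}`, // stub
  run_tests: async () => 'all tests passed',                       // stub
};

async function executeTool(name: string, input: unknown): Promise<string> {
  const handler = TOOL_HANDLERS[name];
  // Return errors as tool results so the model can observe and recover,
  // rather than crashing the orchestrator loop with a thrown exception.
  if (!handler) return `Error: unknown tool "${name}"`;
  try {
    return await handler(input as Record<string, unknown>);
  } catch (err) {
    return `Error: ${String(err)}`;
  }
}
```

Returning errors as strings, instead of throwing, matters in an agentic loop: the model sees the failure in the tool_result and can retry or choose another tool.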
import { trace, SpanStatusCode } from '@opentelemetry/api';
const tracer = trace.getTracer('multi-agent-system');
async function tracedAgentCall<T>(
agentName: string,
task: string,
fn: () => Promise<T>
): Promise<T> {
return tracer.startActiveSpan(`agent.${agentName}`, async (span) => {
span.setAttributes({
'agent.name': agentName,
'agent.task.length': task.length,
'agent.task.preview': task.slice(0, 100),
});
try {
const result = await fn();
span.setStatus({ code: SpanStatusCode.OK });
return result;
} catch (err) {
span.setStatus({ code: SpanStatusCode.ERROR, message: String(err) });
span.recordException(err as Error);
throw err;
} finally {
span.end();
}
});
}
// Log key agent metrics
function logAgentCall(event: {
agent: string;
inputTokens: number;
outputTokens: number;
latencyMs: number;
toolCalls: number;
success: boolean;
}): void {
console.log(JSON.stringify({
type: 'agent_call',
...event,
timestamp: new Date().toISOString(),
}));
}
Wrong:
// Runs one at a time — wastes latency when tasks are independent
const reviewResult = await codeReviewAgent.run(code)
const securityResult = await securityAgent.run(code)
const docsResult = await docAgent.run(code)
Correct:
// Fan-out: all three run concurrently
const [reviewResult, securityResult, docsResult] = await Promise.all([
codeReviewAgent.run(code),
securityAgent.run(code),
docAgent.run(code),
])
Why: Sequential execution of independent agents multiplies latency unnecessarily — fan-out with Promise.all reduces wall-clock time to the slowest single agent.
Wrong:
// Sub-agent receives 50 turns of unrelated orchestrator history
async function handoff(fullConversation: Message[], nextSystem: string) {
return claude.messages.create({
system: nextSystem,
messages: fullConversation, // bloats context, raises cost, degrades focus
max_tokens: 4096,
})
}
Correct:
// Compress to only what the next agent needs
const summary = await summarizeForHandoff(context, 500)
return claude.messages.create({
system: nextSystem,
messages: [{ role: 'user', content: summary }],
max_tokens: 4096,
})
Why: Passing irrelevant history inflates token costs, risks hitting context limits, and distracts sub-agents with information they don't need.
Wrong:
const classification = await claude.messages.create({
model: 'claude-opus-latest', // overkill for a one-word classification
system: 'Classify as: code-review | security-scan | documentation.',
messages: [{ role: 'user', content: task }],
max_tokens: 10,
})
Correct:
const classification = await claude.messages.create({
model: 'claude-haiku-latest', // fast and cheap for routing
system: 'Classify as: code-review | security-scan | documentation. Reply with the category only.',
messages: [{ role: 'user', content: task }],
max_tokens: 10,
})
Why: Routing is a lightweight classification task; using a heavyweight model wastes cost and latency on a decision that requires no deep reasoning.
For advanced patterns — capability registry, durable state, task decomposition, testing multi-agent systems, pattern quick-selection guide, and failure handling — see
multi-agent-patterns-advanced.