Help us improve
Share bugs, ideas, or general feedback.
Guides building production multi-agent systems with OpenAI AgentKit and Agents SDK — orchestration, handoffs, routines, and architectural patterns.
npx claudepluginhub frankxai/claude-skills-library --plugin claude-skills-libraryHow this skill is triggered — by the user, by Claude, or both
Slash command
/claude-skills-library:openai-agentkitThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
This skill provides comprehensive guidance on building production-ready multi-agent systems using OpenAI's AgentKit platform and Agents SDK, following 2025 best practices.
Orchestrates multi-agent AI systems with handoffs, routing, and workflows using AI SDK v5 in TypeScript. For agent collaboration and task delegation across providers.
Designs O-Agent orchestrators for multi-agent fleet management using Claude SDK. Guides scope definition, agent templates, prompts, and tools for create/command/monitor/delete workflows.
Provides patterns and principles for building reliable autonomous agents: agent loops (ReAct, Plan-Execute), goal decomposition, reflection, and production guardrails. Useful when designing constrained, domain-specific agents.
Share bugs, ideas, or general feedback.
This skill provides comprehensive guidance on building production-ready multi-agent systems using OpenAI's AgentKit platform and Agents SDK, following 2025 best practices.
Complete platform for building, deploying, and optimizing agents with enterprise-grade tooling.
Core Components:
The Agents SDK is the production evolution of the experimental Swarm framework. Use Agents SDK for all production work - Swarm is educational only.
Migration Note: If you encounter legacy Swarm code, migrate to Agents SDK immediately.
An Agent encapsulates:
Design Principle: Agents should be lightweight and specialized rather than monolithic and general-purpose.
A routine is a sequence of actions an agent can perform:
Think of it as: A mini-workflow that an agent owns and executes autonomously.
Handoffs enable agent-to-agent transitions in execution flow.
Key Pattern: When an agent encounters a task outside its specialization, it hands off to a more appropriate agent.
Example:
# Triage agent determines which specialist to use
if task.type == "refund":
handoff_to(refund_agent)
elif task.type == "sales":
handoff_to(sales_agent)
Use Case: Routing requests to specialized sub-agents
Structure:
User Request → Triage Agent → [Determines Category] → Specialist Agent
Example Implementation:
# Triage agent with handoff capabilities
triage_agent = Agent(
name="Customer Service Triage",
instructions="Analyze customer requests and route to appropriate specialist",
functions=[analyze_request],
handoffs=[refund_agent, sales_agent, support_agent]
)
When to Use:
Use Case: Multi-step workflows where each step has a specialist
Structure:
Step 1 Agent → [Complete] → Handoff → Step 2 Agent → ... → Final Agent
Example:
# Research → Analysis → Report Generation pipeline
research_agent → analysis_agent → report_agent
When to Use:
Use Case: Breaking complex tasks into parallel subtasks
Structure:
Coordinator Agent
↓
├─→ Subtask Agent 1
├─→ Subtask Agent 2
└─→ Subtask Agent 3
↓
Synthesis Agent (combines results)
When to Use:
DO: ✅ Keep agents focused on single responsibilities ✅ Provide clear, specific instructions in system prompts ✅ Define explicit handoff conditions ✅ Use descriptive agent names (helps with debugging) ✅ Test agents in isolation before integration
DON'T: ❌ Create monolithic "do-everything" agents ❌ Allow agents to communicate directly (use handoffs) ❌ Over-engineer with too many specialized agents ❌ Ignore error handling in handoffs ❌ Skip agent boundary testing
Effective Routines:
Example:
routine = {
"name": "Process Refund",
"entry": "User requests refund",
"steps": [
"Verify order exists",
"Check refund eligibility",
"Calculate refund amount",
"Process payment reversal",
"Send confirmation"
],
"exit": "Refund confirmed or rejection reason provided",
"error_handling": "Escalate to human agent if verification fails"
}
Critical Elements:
Example:
def handoff_condition(state):
"""Determine if handoff needed"""
if state.requires_specialized_knowledge:
return specialist_agent
if state.exceeds_authority_level:
return supervisor_agent
return None # Continue with current agent
Principle: Frameworks that limit LLM involvement and rely on predefined or direct execution flows operate more efficiently.
Strategies:
Principle: Give agents only the tools they need for their specialty.
Pattern:
# Specialized agents get targeted toolsets
refund_agent.tools = [verify_order, calculate_refund, process_payment]
sales_agent.tools = [check_inventory, create_quote, process_order]
# NOT: both agents get all 6 tools
Problem: Too many agents for simple tasks creates overhead
Solution: Start simple, add agents only when complexity demands it
Problem: Agent A → Agent B → Agent A creates loops
Solution: Design clear hierarchy or state-based termination
Problem: Agents lose context across handoffs
Solution: Implement proper state management and context passing
Problem: Overlapping agent responsibilities cause conflicts
Solution: Define explicit agent domains and decision criteria
Agent-Level:
System-Level:
Unit Testing:
# Test individual agent behaviors
def test_refund_agent():
result = refund_agent.process(valid_refund_request)
assert result.status == "approved"
assert result.amount > 0
Integration Testing:
# Test agent handoffs
def test_triage_to_refund():
initial_state = {"request": "I want a refund"}
final_state = orchestrator.run(initial_state)
assert final_state.handling_agent == "refund_agent"
assert final_state.completed == True
End-to-End Testing:
# Test full user journeys
def test_customer_journey():
scenarios = load_test_scenarios()
for scenario in scenarios:
result = system.execute(scenario)
assert result.meets_requirements()
Essential Observability:
Tools:
Agent Security:
Data Protection:
Horizontal Scaling:
Vertical Optimization:
from openai import OpenAI
client = OpenAI()
# Define specialized agent
support_agent = {
"name": "Technical Support Agent",
"model": "gpt-4o",
"instructions": """You are a technical support specialist.
Help users troubleshoot technical issues.
If issue requires refund, hand off to refund agent.
If issue is sales-related, hand off to sales agent.""",
"tools": [
{"type": "function", "function": troubleshooting_guide},
{"type": "function", "function": escalate_to_human}
]
}
def execute_agent_workflow(initial_request):
current_agent = triage_agent
context = {"request": initial_request, "history": []}
while not is_complete(context):
# Execute current agent
response = client.chat.completions.create(
model=current_agent.model,
messages=build_messages(context, current_agent),
tools=current_agent.tools
)
# Check for handoff
next_agent = determine_handoff(response)
if next_agent:
context["history"].append({
"from": current_agent.name,
"to": next_agent.name
})
current_agent = next_agent
else:
context["result"] = response
break
return context
Use AgentKit for OpenAI-based workflows, Claude SDK for Anthropic-based workflows, and MCP to bridge data sources to both.
LangGraph provides more fine-grained control flow. Use AgentKit for simpler workflows, LangGraph for complex state machines.
AgentKit agents can consume MCP servers as tools, standardizing data source connections.
Key Changes:
swarm.run() with Agents SDK orchestrationTimeline: Swarm is maintenance-only. Migrate all production code by Q2 2025.
Use OpenAI AgentKit when:
Consider alternatives when:
Official Documentation:
Community:
This skill ensures you build robust, scalable, production-ready multi-agent systems using OpenAI's latest platform capabilities in 2025.