This skill should be used when the user asks to "improve agent architecture", "assess agent maturity", "refactor agents", "evolve agent system", "scale agent architecture", or needs guidance on measuring, improving, and evolving deep agent systems over time.
From deepagents-builder. Install: `npx claudepluginhub spulido99/claude-toolkit --plugin deepagents-builder`. This skill uses the workspace's default tool permissions.
References: `references/maturity-model.md`, `references/refactoring-patterns.md`
Assess, measure, and evolve agent architectures through maturity levels.
| Level | Name | Characteristics |
|---|---|---|
| 1 | Initial | Single agent, 40-60+ tools, frequent errors |
| 2 | Managed | 2-4 subagents, basic grouping, some overlap |
| 3 | Defined | Capability-aligned, bounded contexts, documented |
| 4 | Measured | Full topologies, metrics tracked, automated testing |
| 5 | Optimizing | Self-organizing, auto-optimization, A/B testing |
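The structural columns of the table can be turned into a rough first-pass heuristic. The function below is illustrative only (its name, signature, and the cap at Level 3 are assumptions); structure alone cannot distinguish Levels 3-5, which differ in process rather than shape.

```python
def rough_level_hint(num_main_tools: int, num_subagents: int) -> int:
    """First-pass guess at maturity level from architecture shape alone.

    Levels 3-5 differ in process (bounded contexts, metrics, automation),
    not structure, so anything beyond basic grouping is capped at 3 here.
    """
    if num_subagents == 0:
        return 1  # single agent holding every tool (Level 1: Initial)
    if num_subagents <= 4:
        return 2  # basic grouping into a few subagents (Level 2: Managed)
    return 3  # capability-aligned split; confirm with the full assessment
```

Treat the result as a starting point for the full 80-point assessment, not a verdict.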
Symptoms: a single agent juggling 40-60+ tools, with frequent errors.
```python
# Level 1 example: every tool in one agent
from deepagents import create_deep_agent

agent = create_deep_agent(tools=[tool1, tool2, ..., tool60])
```
Next step: Identify tool groupings, create platform subagents
Symptoms: 2-4 subagents with ad-hoc grouping and overlapping responsibilities.
```python
# Level 2 example: basic grouping into subagents
agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-5-20250929",
    subagents=[
        {"name": "data-agent", "tools": [...]},
        {"name": "api-agent", "tools": [...]}
    ]
)
```
Next step: Map business capabilities, define bounded contexts
Symptoms: subagents aligned to business capabilities, with documented bounded contexts, but no metrics or automated testing yet.
```python
# Level 3 example: capability-aligned subagent with a documented bounded context
agent = create_deep_agent(
    model="anthropic:claude-sonnet-4-5-20250929",
    subagents=[
        {
            "name": "customer-support",
            "system_prompt": "In support context: 'ticket' = inquiry...",
            "tools": [support_kb, ticket_system]
        }
    ]
)
```
Next step: Apply Team Topologies, establish metrics
Tip: Use `/design-evals` to scaffold your first eval dataset. This is the key step in reaching Level 4 (Measured).
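The exact format `/design-evals` scaffolds is not shown here; as a minimal sketch of what an eval dataset and harness might look like, assuming a simple dict-based case format (all names, prompts, and the stub router below are hypothetical):

```python
# Hypothetical eval case format -- the real /design-evals scaffold may differ.
EVAL_CASES = [
    {"prompt": "Summarize ticket #123", "expected_subagent": "customer-support"},
    {"prompt": "Plot weekly signups", "expected_subagent": "data-visualization"},
]

def run_evals(route_fn):
    """Score routing accuracy: fraction of cases sent to the expected subagent."""
    passed = sum(route_fn(c["prompt"]) == c["expected_subagent"] for c in EVAL_CASES)
    return passed / len(EVAL_CASES)

# Stub router standing in for the real agent, for illustration only.
def stub_router(prompt):
    return "customer-support" if "ticket" in prompt else "data-visualization"

print(run_evals(stub_router))  # 1.0
```

In practice `route_fn` would invoke the real agent and inspect which subagent handled the task; the point is that a dataset plus a scoring loop is enough to start measuring.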
Symptoms: full team topologies in place, metrics tracked, automated testing running.
Metrics to track:
Next step: Implement evolutionary architecture
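A minimal sketch of how per-run metrics could be captured at Level 4. The metric names here (success, latency, cost, subagent calls) are illustrative assumptions, not a list from this skill:

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class RunMetrics:
    """One agent run's metrics -- field names are illustrative."""
    success: bool
    latency_s: float
    cost_usd: float
    subagent_calls: int

@dataclass
class MetricsLog:
    runs: list = field(default_factory=list)

    def record(self, m: RunMetrics) -> None:
        self.runs.append(m)

    def summary(self) -> dict:
        """Aggregate across runs; mean over bools gives the success rate."""
        return {
            "success_rate": mean(r.success for r in self.runs),
            "avg_latency_s": mean(r.latency_s for r in self.runs),
            "avg_cost_usd": mean(r.cost_usd for r in self.runs),
        }
```

A log like this, fed by the eval harness, is what makes regressions visible when the architecture changes.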
Symptoms: self-organizing topology, automated optimization, A/B testing of architecture variants.
Score 0-5 for each of the 16 dimensions (total 80 possible):
Score interpretation:
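The interpretation bands are not spelled out above; one plausible mapping from total score to maturity level is sketched below. The thresholds (equal 16-point bands) are an assumption to be tuned to your rubric:

```python
# Hypothetical thresholds for the 80-point assessment -- adjust to your rubric.
LEVEL_BANDS = [(0, 16, 1), (17, 32, 2), (33, 48, 3), (49, 64, 4), (65, 80, 5)]

def maturity_level(total_score: int) -> int:
    """Map an assessment total (0-80) to a maturity level (1-5)."""
    if not 0 <= total_score <= 80:
        raise ValueError("score must be in 0..80")
    for lo, hi, level in LEVEL_BANDS:
        if lo <= total_score <= hi:
            return level
```

For example, `maturity_level(33)` falls in the third band and returns 3 under these assumed thresholds.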
When the main agent is overloaded:
```python
# Before: 15 tools in main
agent = create_deep_agent(tools=[t1, t2, ..., t15])

# After: extract a platform subagent
agent = create_deep_agent(
    tools=[t1, t2, t3],
    subagents=[{"name": "platform", "tools": [t4, ..., t15]}]
)
```
When a subagent is used only once:
```python
# Before: subagent for a single use
subagents=[{"name": "calculator", "tools": [calc]}]

# After: tool in main agent
tools=[calc]
```
When a subagent covers multiple domains:
```python
# Before: mixed responsibilities
{"name": "data-handler", "tools": [ingest, clean, visualize]}

# After: separated concerns
{"name": "data-ingestion", "tools": [ingest]},
{"name": "data-visualization", "tools": [visualize]}
```
When subagents are too granular:
```python
# Before: 10 tiny subagents
subagents=[{"name": "a", "tools": [t1]}, ...]

# After: consolidated platforms
subagents=[
    {"name": "data-platform", "tools": [t1, t2, t3]},
    {"name": "analysis-platform", "tools": [t4, t5, t6]}
]
```
- `/assess` — Run the 80-point maturity assessment with level determination and next-level recommendations
- `/evolve` — Guided refactoring to the next maturity level (interactive, step-by-step, with EDD checkpoints)
- `/validate-agent` — Quick anti-pattern and security check (simplified scoring)