From engineering-advanced-skills
Use when the user asks to design a multi-agent system, pick an orchestration pattern (supervisor/swarm/pipeline), generate tool schemas for agents, or evaluate agent execution logs for cost, latency, and failure bottlenecks. Examples: 'design an agent architecture for research automation', 'generate Anthropic tool schemas from these tool descriptions', 'analyze these agent run logs for bottlenecks'. NOT for Claude Code workflow files (use workflow-builder) or single-agent prompt design (use agent-workflow-designer).
Slash command: /engineering-advanced-skills:agent-designer
Design, schema-generate, and evaluate multi-agent systems with three deterministic tools. The scripts are the workflow — do not freehand an architecture when the planner can score one from requirements.
Files in this skill: README.md, agent_evaluator.py, agent_planner.py, assets/sample_execution_logs.json, assets/sample_system_requirements.json, assets/sample_tool_descriptions.json, expected_outputs/sample_agent_architecture.json, expected_outputs/sample_evaluation_report.json, expected_outputs/sample_tool_schemas.json, references/agent_architecture_patterns.md, references/evaluation_methodology.md, references/tool_design_best_practices.md, tool_schema_generator.py
When NOT to use: Claude Code Workflow-tool automations → workflow-builder; single-agent workflow scaffolds → agent-workflow-designer; multi-agent fan-out at runtime → agenthub.
| Choose | When | Watch out for |
|---|---|---|
| Single agent | One bounded task, < ~5 tools | Don't add agents you don't need |
| Supervisor | Central decomposition, specialists report back | Supervisor becomes the bottleneck |
| Pipeline | Strictly sequential stages with handoffs | Rigid order; slowest stage gates throughput |
| Hierarchical | Multiple org layers, > ~8 agents | Communication overhead per level |
| Swarm | Parallel peers, fault tolerance over predictability | Hard to debug; needs consensus rules |
The planner applies this scoring deterministically — run it rather than picking by feel.
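As a reading aid for the table above, here is a minimal sketch of the same rules of thumb in Python. It is not the planner's scoring logic, which this page does not show. The function name, its parameters, and the order in which the rules are checked are all assumptions made for illustration.

```python
def suggest_pattern(num_agents, num_tools, sequential=False, needs_fault_tolerance=False):
    """Rule-of-thumb pattern choice mirroring the table.

    Hypothetical helper: this is NOT agent_planner.py's scoring,
    only the table's thresholds written out as code.
    """
    # One bounded task with fewer than ~5 tools: stay with a single agent.
    if num_agents <= 1 and num_tools < 5:
        return "single_agent"
    # More than ~8 agents: layers keep coordination manageable.
    if num_agents > 8:
        return "hierarchical"
    # Strictly sequential stages with handoffs.
    if sequential:
        return "pipeline"
    # Parallel peers where fault tolerance matters more than predictability.
    if needs_fault_tolerance:
        return "swarm"
    # Default: central decomposition with specialists reporting back.
    return "supervisor"


if __name__ == "__main__":
    print(suggest_pattern(num_agents=3, num_tools=6))
```

Treat the result as a sanity check against what the planner emits, not as a substitute for running it.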
All paths are relative to this skill folder. Each step's JSON output feeds the next step's design.
Write a requirements JSON (copy assets/sample_system_requirements.json — keys: goal, tasks[], constraints{max_response_time, budget_per_task, concurrent_tasks}, team_size):
python3 agent_planner.py requirements.json --format json -o arch
Emits arch.json with architecture_design (pattern, agents, communication links), mermaid_diagram, and implementation_roadmap. Read architecture_design.pattern and the per-agent role list; present the mermaid diagram to the user.
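The input and output of this step can be sketched in Python as follows. Only the key names (goal, tasks, constraints, team_size, architecture_design, pattern, mermaid_diagram) come from the text above. The values, the units, the shape of each tasks entry, and the assumption that architecture_design holds an "agents" collection are illustrative guesses, so check assets/sample_system_requirements.json and expected_outputs/sample_agent_architecture.json for the real shapes.

```python
import json

# Hypothetical values; only the key names come from the step above.
requirements = {
    "goal": "Automate literature research summaries",
    "tasks": ["search sources", "extract findings", "write summary"],
    "constraints": {
        "max_response_time": 30,  # unit is an assumption (seconds)
        "budget_per_task": 0.50,  # unit is an assumption (USD)
        "concurrent_tasks": 4,
    },
    "team_size": 3,
}


def write_requirements(path, req):
    """Write the requirements dict to disk for agent_planner.py to read."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(req, fh, indent=2)


def summarize_architecture(arch):
    """Pull out the fields the step says to read from arch.json.

    Assumes architecture_design contains "pattern" and an "agents"
    collection; the exact nesting is not confirmed by this page.
    """
    design = arch.get("architecture_design", {})
    return {
        "pattern": design.get("pattern"),
        "agent_count": len(design.get("agents", [])),
        "has_diagram": bool(arch.get("mermaid_diagram")),
    }
```

After the planner has run, `summarize_architecture(json.load(open("arch.json")))` gives a quick view of what was chosen before the diagram is presented to the user.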
Describe each agent's tools in plain JSON (copy assets/sample_tool_descriptions.json), then:
python3 tool_schema_generator.py tool_descriptions.json --validate -o tools
Emits tools.json (tool_schemas, validation_summary) plus provider-specific tools_anthropic.json / tools_openai.json. Gate: every tool must print ✓ Valid. Fix any invalid schema before proceeding — never hand an agent an unvalidated schema.
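For a second check on the Anthropic output, the sketch below tests one tool definition against the publicly documented Anthropic tool shape: a name, a description, and an input_schema that is a JSON Schema object. It does not replace --validate, and it assumes each entry in tools_anthropic.json is a single dict in that shape, which this page does not confirm. The function name is hypothetical.

```python
import re

# Anthropic tool names are limited to letters, digits, underscore and hyphen, 1 to 64 chars.
NAME_RE = re.compile(r"[a-zA-Z0-9_-]{1,64}")


def anthropic_tool_problems(tool):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []

    name = tool.get("name")
    if not isinstance(name, str) or not NAME_RE.fullmatch(name):
        problems.append("name must match [a-zA-Z0-9_-]{1,64}")

    # The description is what the model reads to decide when to call the tool.
    if not tool.get("description"):
        problems.append("description is missing or empty")

    schema = tool.get("input_schema")
    if not isinstance(schema, dict) or schema.get("type") != "object":
        problems.append("input_schema must be a JSON Schema with type 'object'")
    else:
        props = schema.get("properties", {})
        for field in schema.get("required", []):
            if field not in props:
                problems.append(f"required field '{field}' is not in properties")

    return problems
```

An empty list only means these structural checks passed; the generator's own ✓ Valid output remains the gate.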
Once the system runs (or against assets/sample_execution_logs.json for a dry run):
python3 agent_evaluator.py execution_logs.json --detailed -o eval
Emits eval.json with summary, agent_metrics, bottleneck_analysis, error_analysis, cost_breakdown, sla_compliance, and optimization_recommendations, plus split files (eval_errors.json, eval_recommendations.json).
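Before reading the report in detail, it can help to confirm that every section named above is actually present. The sketch below does that. The section names come from the text; the assumption that they are top-level keys of eval.json, and the helper names themselves, are illustrative.

```python
import json

# Section names as listed above; assumed to be top-level keys of eval.json.
EXPECTED_SECTIONS = [
    "summary",
    "agent_metrics",
    "bottleneck_analysis",
    "error_analysis",
    "cost_breakdown",
    "sla_compliance",
    "optimization_recommendations",
]


def missing_sections(report):
    """Return the expected sections that are absent from a loaded report dict."""
    return [key for key in EXPECTED_SECTIONS if key not in report]


def load_report(path):
    """Load an evaluator report from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

A non-empty result from `missing_sections(load_report("eval.json"))` suggests the output shape differs from what this page describes, which is worth resolving before acting on the recommendations.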
The design is not done until:
- tool_schema_generator.py --validate reports 0 invalid schemas.
- agent_evaluator.py on a pilot run reports 0 critical issues (the tool prints CRITICAL: N critical issues when found). If N > 0, apply the top item in eval_recommendations.json, re-run the pilot, and re-evaluate.
- Compare your outputs against expected_outputs/ to confirm the schema shape you're consuming hasn't drifted.

- references/agent_architecture_patterns.md: pattern trade-offs in depth
- references/tool_design_best_practices.md: schema, idempotency, error-handling rules
- references/evaluation_methodology.md: metric definitions the evaluator implements

npx claudepluginhub ai-integr8tor/alirezarezvani-claude-skills --plugin engineering-advanced-skills
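The pilot-run gate from the completion checklist could be scripted roughly as below. The only detail taken from this page is that the evaluator prints a line of the form CRITICAL: N critical issues. Whether that line goes to stdout or stderr, and how the rest of the output is formatted, are assumptions, so the sketch searches both streams. The helper names are hypothetical.

```python
import re
import subprocess
import sys

# Matches the message quoted in the checklist: "CRITICAL: N critical issues".
CRITICAL_RE = re.compile(r"CRITICAL:\s*(\d+)\s+critical issues?")


def critical_count(output):
    """Return N from a 'CRITICAL: N critical issues' line, or 0 if none is printed."""
    match = CRITICAL_RE.search(output)
    return int(match.group(1)) if match else 0


def run_pilot_gate(logs_path):
    """Run the evaluator on a pilot log and report whether the gate passes.

    Uses the same command line as the evaluation step; combines stdout
    and stderr because the stream used by the tool is not documented here.
    """
    result = subprocess.run(
        [sys.executable, "agent_evaluator.py", logs_path, "--detailed", "-o", "eval"],
        capture_output=True,
        text=True,
    )
    return critical_count(result.stdout + result.stderr) == 0
```

If the gate fails, the loop described in the checklist applies: take the top item in eval_recommendations.json, re-run the pilot, and evaluate again.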
2 plugins reuse this skill
First indexed Jun 23, 2026