Use when the user asks to design a multi-agent system, pick an orchestration pattern (supervisor/swarm/pipeline), generate tool schemas for agents, or evaluate agent execution logs for cost, latency, and failure bottlenecks. Examples: 'design an agent architecture for research automation', 'generate Anthropic tool schemas from these tool descriptions', 'analyze these agent run logs for bottlenecks'. NOT for Claude Code workflow files (use workflow-builder) or single-agent prompt design (use agent-workflow-designer).
Slash command: /engineering-advanced-skills:agent-designer
Design, schema-generate, and evaluate multi-agent systems with three deterministic tools. The scripts are the workflow — do not freehand an architecture when the planner can score one from requirements.
Skill files:
- README.md
- agent_evaluator.py
- agent_planner.py
- assets/sample_execution_logs.json
- assets/sample_system_requirements.json
- assets/sample_tool_descriptions.json
- expected_outputs/sample_agent_architecture.json
- expected_outputs/sample_evaluation_report.json
- expected_outputs/sample_tool_schemas.json
- references/agent_architecture_patterns.md
- references/evaluation_methodology.md
- references/tool_design_best_practices.md
- tool_schema_generator.py
When NOT to use: Claude Code Workflow-tool automations → workflow-builder; single-agent workflow scaffolds → agent-workflow-designer; multi-agent fan-out at runtime → agenthub.
| Choose | When | Watch out for |
|---|---|---|
| Single agent | One bounded task, < ~5 tools | Don't add agents you don't need |
| Supervisor | Central decomposition, specialists report back | Supervisor becomes the bottleneck |
| Pipeline | Strictly sequential stages with handoffs | Rigid order; slowest stage gates throughput |
| Hierarchical | Multiple org layers, > ~8 agents | Communication overhead per level |
| Swarm | Parallel peers, fault tolerance over predictability | Hard to debug; needs consensus rules |
The planner applies this scoring deterministically — run it rather than picking by feel.
All paths are relative to this skill folder. Each step's JSON output is the next step's design input.
Write a requirements JSON (copy assets/sample_system_requirements.json — keys: goal, tasks[], constraints{max_response_time, budget_per_task, concurrent_tasks}, team_size):
python3 agent_planner.py requirements.json --format json -o arch
Emits arch.json with architecture_design (pattern, agents, communication links), mermaid_diagram, and implementation_roadmap. Read architecture_design.pattern and the per-agent role list; present the mermaid diagram to the user.
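A minimal sketch of bracketing this step in Python: check the requirements file for the documented keys before running the planner, then pull the pattern, roles, and diagram out of the result. The helper names are hypothetical, and the assumption that each entry under `agents` is a dict with a `role` field is not confirmed by this skill; check it against expected_outputs/sample_agent_architecture.json.

```python
# Documented requirement keys; the value types are not specified here.
REQUIRED_KEYS = {"goal", "tasks", "constraints", "team_size"}
REQUIRED_CONSTRAINTS = {"max_response_time", "budget_per_task", "concurrent_tasks"}


def missing_requirement_keys(req: dict) -> list[str]:
    """Return documented keys that are absent from a requirements dict."""
    missing = sorted(REQUIRED_KEYS - set(req))
    constraints = req.get("constraints")
    if not isinstance(constraints, dict):
        constraints = {}
    missing += sorted(
        f"constraints.{key}" for key in REQUIRED_CONSTRAINTS - set(constraints)
    )
    return missing


def summarize_architecture(arch: dict) -> dict:
    """Extract the fields this step asks you to read from arch.json."""
    design = arch["architecture_design"]
    # Assumption: agents is a list of dicts, each carrying a "role" field.
    roles = [
        agent.get("role")
        for agent in design.get("agents", [])
        if isinstance(agent, dict)
    ]
    return {
        "pattern": design["pattern"],
        "roles": roles,
        "diagram": arch["mermaid_diagram"],
    }
```

In practice you would load requirements.json and arch.json with `json.load` and pass the resulting dicts to these helpers. An empty list from `missing_requirement_keys` means the file has the documented shape.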
Describe each agent's tools in plain JSON (copy assets/sample_tool_descriptions.json), then:
python3 tool_schema_generator.py tool_descriptions.json --validate -o tools
Emits tools.json (tool_schemas, validation_summary) plus provider-specific tools_anthropic.json / tools_openai.json. Gate: every tool must print ✓ Valid. Fix any invalid schema before proceeding — never hand an agent an unvalidated schema.
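The validation gate can be enforced mechanically instead of by eye. The sketch below is hedged: it assumes the generator prints one line containing `✓ Valid` per tool, which is an inference from the gate wording and not a documented output format. The function name is hypothetical.

```python
VALID_MARK = "✓ Valid"


def all_tools_valid(stdout: str, expected_tools: int) -> bool:
    """True only if the captured output has one valid mark per expected tool.

    Assumption: each validated tool produces exactly one line with VALID_MARK.
    """
    valid_lines = [line for line in stdout.splitlines() if VALID_MARK in line]
    return expected_tools > 0 and len(valid_lines) == expected_tools
```

Capture the generator's stdout (for example with `subprocess.run(..., capture_output=True, text=True)`) and pass the number of tools you described in tool_descriptions.json. Stop the workflow if this returns False.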
Once the system runs (or against assets/sample_execution_logs.json for a dry run):
python3 agent_evaluator.py execution_logs.json --detailed -o eval
Emits eval.json with summary, agent_metrics, bottleneck_analysis, error_analysis, cost_breakdown, sla_compliance, and optimization_recommendations, plus split files (eval_errors.json, eval_recommendations.json).
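A small triage sketch for this step, under stated assumptions: it confirms that eval.json carries the sections listed above, and it reads the critical-issue count from the evaluator's printed output. The `CRITICAL: N critical issues` wording comes from this skill's exit criteria. The helper names and the exact regular expression are hypothetical.

```python
import re

# Top-level sections this skill says eval.json contains.
EXPECTED_SECTIONS = (
    "summary",
    "agent_metrics",
    "bottleneck_analysis",
    "error_analysis",
    "cost_breakdown",
    "sla_compliance",
    "optimization_recommendations",
)

# Assumption: the count appears as "CRITICAL: <N> critical issue(s)".
CRITICAL_RE = re.compile(r"CRITICAL:\s*(\d+)\s+critical issues?")


def missing_sections(report: dict) -> list[str]:
    """Return expected sections that are absent from a loaded eval report."""
    return [name for name in EXPECTED_SECTIONS if name not in report]


def critical_count(stdout: str) -> int:
    """Return N from the CRITICAL line, or 0 when no such line was printed."""
    match = CRITICAL_RE.search(stdout)
    return int(match.group(1)) if match else 0
```

A non-empty result from `missing_sections` suggests the report shape has drifted from what you expect. A non-zero `critical_count` means the pilot needs another iteration before the design is done.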
The design is not done until:
- tool_schema_generator.py --validate reports 0 invalid schemas.
- agent_evaluator.py on a pilot run reports 0 critical issues (the tool prints CRITICAL: N critical issues when found). If N > 0, apply the top item in eval_recommendations.json, re-run the pilot, and re-evaluate.

Compare your outputs against expected_outputs/ to confirm the schema shape you're consuming hasn't drifted.

- references/agent_architecture_patterns.md: pattern trade-offs in depth
- references/tool_design_best_practices.md: schema, idempotency, error-handling rules
- references/evaluation_methodology.md: metric definitions the evaluator implements

Install:
npx claudepluginhub richyboy170/agentic-sdlc-internship --plugin engineering-advanced-skills