Build production-ready AI agents with LangGraph. Provides skills for architecture design, AI-friendly tool design, implementation patterns, interactive testing, and evolution strategies.
npx claudepluginhub spulido99/claude-toolkit --plugin deepagents-builder
Generate an interactive chat console (chat.py) for testing your agent with tool call logging, thread management, and optional context injection.
Add a single eval scenario to an existing dataset interactively or from a production trace.
Add a single subagent to an existing agent architecture interactively or by named capability.
Add a single tool to an existing catalog interactively or from an API endpoint.
Run a maturity assessment on your agent architecture. Scores across 4 categories and identifies the path to the next level.
Design a simple single agent interactively — role, tools, prompt, and model. Generates create_deep_agent(...) code. Escalates to /design-topology if complexity warrants multi-agent.
Design eval scenarios from Jobs To Be Done (JTBD). Scaffolds an evals directory and generates scenario datasets for new or existing agents.
Design an AI-friendly tool catalog from scratch. Detects agent code, discovers requirements, then generates tools following the 10 principles.
Interactive guide to design agent topology based on business capabilities and Team Topologies principles.
Show eval dataset health dashboard — scenario counts, snapshot staleness, last run results.
Review and accept/reject changed snapshots after intentional agent changes.
Run evals against your agent. Default snapshot mode. Use --smoke, --full, --report, or --diagnose for other modes.
Guided refactoring to evolve your agent to the next maturity level. Assesses current state, recommends a refactoring pattern, and walks through implementation.
Create and set up a new DeepAgents application with scaffolding, dependencies, and example code. Guides you through project creation interactively.
Show tool catalog quality dashboard — counts by domain, quality scores per tool, principle compliance, and eval coverage.
Validate a DeepAgents configuration for anti-patterns, security issues, and best practices compliance.
Designs deep agent architectures based on requirements. Use this agent proactively when the user needs help with agent architecture decisions, planning subagent hierarchies, or mapping business capabilities to agent structures. <example> User: I need to build an agent that handles customer support, order management, and billing Action: Use agent-architect to design the subagent topology and bounded contexts </example> <example> User: My agent has 50 tools and is getting confused Action: Use agent-architect to recommend decomposition into platform subagents </example> <example> User: How should I structure my research agent? Action: Use agent-architect to design appropriate topology </example>
Reviews DeepAgents code for anti-patterns, security issues, and best practices. Use this agent proactively when the user has written agent code that should be reviewed for quality and security. <example> User: Here's my agent code, can you review it? Action: Use code-reviewer to analyze for anti-patterns and security issues </example> <example> User: I finished implementing my customer support agent Action: Use code-reviewer to validate the implementation </example> <example> User: Is my agent setup correct? Action: Use code-reviewer to check configuration and patterns </example>
Designs eval scenarios and datasets for deep agents. Use this agent when the user needs to create eval datasets from JTBD, scaffold eval directory structure, or add scenarios. <example> User: I need to create evals for my customer support agent Action: Use eval-designer to interview about JTBD and generate scenario YAML </example> <example> User: Add a scenario for when the order is not found Action: Use eval-designer to generate and add the scenario to existing dataset </example> <example> User: I'm starting a new agent, help me define what to test Action: Use eval-designer to guide JTBD definition before agent exists </example>
Runs eval scenarios against agents, manages snapshots, and analyzes failures. Use this agent to execute evals, review changed snapshots, or diagnose test failures. <example> User: Run my evals Action: Use eval-runner to load datasets and run scenarios against the agent </example> <example> User: Why is my refund scenario failing? Action: Use eval-runner to diagnose the failure with trajectory diff and suggestions </example> <example> User: I changed the prompt, update my snapshots Action: Use eval-runner to show changed snapshots for review and acceptance </example>
Assesses agent maturity and guides architecture evolution. Use this agent when the user needs a maturity assessment, wants to identify improvement opportunities, or needs step-by-step refactoring guidance. <example> User: How mature is my agent architecture? Action: Use evolution-guide to run the 80-point assessment and report maturity level </example> <example> User: My agent has too many tools and is getting slow Action: Use evolution-guide to assess, identify refactoring pattern, and guide implementation </example> <example> User: I want to evolve my agent from Level 2 to Level 3 Action: Use evolution-guide to provide migration path with specific steps </example>
Designs and generates AI-friendly tools for agents. Use this agent proactively when the user needs to create tools for an agent, convert an API to agent tools, or design a tool catalog. <example> User: I need to create tools for a banking agent Action: Use tool-architect to discover requirements, design tools, and generate code </example> <example> User: Convert this REST API into agent tools Action: Use tool-architect to map API endpoints to AI-friendly tools </example> <example> User: Design tools for my customer support agent Action: Use tool-architect to create domain-organized tool catalog </example> <example> User: Add a refund tool to my existing banking tools Action: Use tool-architect in incremental mode to design and add the tool </example>
This skill should be used when the user asks to "design agent topology", "plan agent architecture", "create bounded contexts", "map business capabilities to agents", "organize subagents", or needs guidance on structuring multi-agent systems. Provides Team Topologies principles applied to AI agents.
This skill should be used when the user asks to "evaluate agents", "test agent", "create eval dataset", "design evals", "benchmark agent", "debug agent", "run evals", or needs guidance on testing, evaluating, and iterating on deep agent systems using an evals-driven approach.
This skill should be used when the user asks to "improve agent architecture", "assess agent maturity", "refactor agents", "evolve agent system", "scale agent architecture", or needs guidance on measuring, improving, and evolving deep agent systems over time.
This skill should be used when the user asks about "agent prompts", "system prompt design", "tool patterns", "anti-patterns", "agent best practices", "subagent prompts", or needs guidance on implementing effective prompts, tools, and avoiding common mistakes in DeepAgents.
This skill should be used when the user asks to "start a deepagent project", "create a new agent", "quickstart agent", "simple agent example", "get started with deepagents", or needs a quick introduction to building agents with LangChain's DeepAgents framework. Provides minimal setup and basic patterns for rapid prototyping.
This skill should be used when the user asks to "design tools", "create tools for agent", "tool design", "API to tools", "define tools", "convert API to tools", or needs guidance on designing AI-friendly tools for agents. Provides principles from AI-Friendly API Design, Agent Native architecture, and real-world tool catalogs.