AI agent orchestration patterns (ReAct, chain-of-thought, multi-agent), tool design, memory strategies, guardrails, and token cost modeling. Use when designing AI agent systems.
Designs AI agent systems with orchestration patterns, tool specifications, memory strategies, and cost modeling.
npx claudepluginhub navraj007in/architecture-cowork-plugin
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Patterns, decision frameworks, and best practices for designing AI agent systems. Use this skill when the project type is agent or hybrid.
| Pattern | Best For | Complexity | Example Use Case |
|---|---|---|---|
| Single-turn | Simple Q&A, classification, extraction | Low | FAQ bot, sentiment analysis, data extraction |
| ReAct | Tool-using tasks that need reasoning | Medium | Research assistant, data analysis agent |
| Chain-of-thought | Complex reasoning without tools | Medium | Math problems, logic puzzles, decision analysis |
| Multi-agent router | Multiple specialist domains | High | Customer support (billing + technical + sales agents) |
| Multi-agent parallel | Independent subtasks that can run simultaneously | High | Content pipeline (research + write + review in parallel) |
| Plan-and-execute | Multi-step tasks with dependencies | High | Project management bot, complex workflow automation |
- Single-turn: User Message → LLM → Response
- ReAct: User Message → LLM thinks → Calls tool → Observes result → LLM thinks → ... → Final response
- Chain-of-thought: User Message → LLM step 1 → LLM step 2 → ... → Final response
- Multi-agent router: User Message → Router Agent → Specialist Agent A or B or C → Response
- Multi-agent parallel: User Message → Coordinator → [Agent A, Agent B, Agent C] (parallel) → Merger → Response
- Plan-and-execute: User Message → Planner creates step list → Executor runs step 1 → ... → step N → Response
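The ReAct flow (think, call a tool, observe, repeat) can be sketched as a bounded loop. This is a minimal illustration, not a reference implementation: `Model`, `Tool`, `runReact`, and the scripted model are hypothetical stand-ins, and a real system would call an LLM API and asynchronous tools instead.

```typescript
// One step decided by the model: call a tool, or finish with an answer.
type Step =
  | { kind: "tool"; tool: string; input: string }
  | { kind: "final"; answer: string };

// Hypothetical stand-ins for an LLM call and for tool implementations.
type Model = (transcript: string[]) => Step;
type Tool = (input: string) => string;

function runReact(
  userMessage: string,
  model: Model,
  tools: Record<string, Tool>,
  maxSteps: number
): string {
  const transcript: string[] = ["user: " + userMessage];
  for (let i = 0; i < maxSteps; i++) {
    const step = model(transcript); // "LLM thinks"
    if (step.kind === "final") {
      return step.answer;
    }
    const tool: Tool | undefined = tools[step.tool];
    const observation = tool
      ? tool(step.input) // "Calls tool"
      : "error: unknown tool " + step.tool;
    transcript.push("action: " + step.tool + "(" + step.input + ")");
    transcript.push("observation: " + observation); // "Observes result"
  }
  // The step cap keeps a confused model from looping forever.
  return "Stopped after " + maxSteps + " steps without a final answer.";
}

// Scripted model for illustration: look something up once, then answer.
const scriptedModel: Model = (transcript) => {
  const last = transcript[transcript.length - 1] || "";
  if (last.indexOf("observation: ") === 0) {
    return { kind: "final", answer: "Found: " + last.slice("observation: ".length) };
  }
  return { kind: "tool", tool: "search-knowledge-base", input: "refund policy" };
};

const demoTools: Record<string, Tool> = {
  "search-knowledge-base": (query) => "doc about " + query,
};

const reactAnswer = runReact("What is the refund policy?", scriptedModel, demoTools, 5);
```

The `maxSteps` cap is the design choice worth keeping in any variant: it bounds both latency and token spend per conversation.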
Every agent tool should follow this structure:
```yaml
Tool:
  name: descriptive-verb-noun (e.g., "search-knowledge-base", "create-ticket")
  type: <agent_tool_type>
  description: What it does in one sentence
  input:
    - parameter: name
      type: string | number | boolean | object
      required: true | false
      description: What this parameter controls
  output:
    - field: name
      type: string | number | object | array
      description: What this field contains
  errors:
    - error: name
      description: When this error occurs
```
Naming: `search-products` not `tool1`, `create-user` not `do-thing`.

| Tool | Purpose | Input | Output |
|---|---|---|---|
| search-knowledge-base | Find relevant documents | query string, top_k | Array of {document, score} |
| query-database | Execute a database query | collection, filter, fields | Array of documents |
| call-api | Make an HTTP request | url, method, headers, body | Response body + status |
| create-record | Insert data | collection, data | Created record ID |
| update-record | Modify data | collection, id, updates | Updated record |
| send-notification | Notify user/system | channel, recipient, message | Delivery status |
| generate-content | Create text/images | prompt, format, constraints | Generated content |
| human-handoff | Escalate to human | reason, context, priority | Handoff confirmation |
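A tool specification in this shape can be checked mechanically before a call reaches the tool. The sketch below is one possible approach under stated assumptions: `ToolSpec`, `isDescriptiveName`, and `validateCall` are hypothetical names, and the name check only enforces the hyphenated lowercase format (it cannot tell that `do-thing` is uninformative).

```typescript
type ParamType = "string" | "number" | "boolean" | "object";

interface ParamSpec {
  name: string;
  type: ParamType;
  required: boolean;
  description: string;
}

interface ToolSpec {
  name: string;
  description: string;
  input: ParamSpec[];
}

// Lowercase words joined by hyphens, at least two words (verb-noun).
function isDescriptiveName(name: string): boolean {
  return /^[a-z]+(-[a-z]+)+$/.test(name);
}

// Returns a list of problems; an empty list means the call matches the spec.
function validateCall(spec: ToolSpec, args: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const param of spec.input) {
    const value = args[param.name];
    if (value === undefined) {
      if (param.required) {
        problems.push("missing required parameter: " + param.name);
      }
      continue;
    }
    if (typeof value !== param.type) {
      problems.push(param.name + " should be " + param.type);
    }
  }
  return problems;
}

// Example spec modeled on the search-knowledge-base row above.
const searchSpec: ToolSpec = {
  name: "search-knowledge-base",
  description: "Find relevant documents",
  input: [
    { name: "query", type: "string", required: true, description: "Search text" },
    { name: "top_k", type: "number", required: false, description: "Maximum results" },
  ],
};

const okCall = validateCall(searchSpec, { query: "refund policy", top_k: 3 });
const badCall = validateCall(searchSpec, { top_k: "3" });
```

Returning a list of problems rather than throwing lets the orchestration layer feed all of them back to the model in one observation.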
| Strategy | When to Use | Implementation | Cost Impact |
|---|---|---|---|
| Session | Short conversations, stateless tasks | Include last N messages in context | Low — bounded context |
| Persistent | Multi-session relationships, user preferences | Store in database, load relevant history | Medium — database reads |
| Vector Store | Large knowledge bases, semantic retrieval | Embed documents, retrieve top-K similar | Medium-High — embedding + storage costs |
| Hybrid | Complex agents needing both history and knowledge | Session memory + vector retrieval | High — multiple systems |
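The "retrieve top-K similar" step of the vector store strategy can be illustrated with plain cosine similarity over in-memory embeddings. This is a sketch only: `StoredChunk` and `retrieveTopK` are hypothetical names, the embeddings are assumed to come from some embedding model, and a production system would normally delegate the search to a vector database.

```typescript
interface StoredChunk {
  text: string;
  embedding: number[];
}

// Cosine similarity over the dimensions both vectors share.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // a zero vector has no direction to compare
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and keep the k best.
function retrieveTopK(query: number[], store: StoredChunk[], k: number): StoredChunk[] {
  return store
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}
```

Only the k retrieved chunks enter the prompt, which is what keeps context size (and input cost) bounded as the knowledge base grows.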
Every agent needs explicit guardrails. Define these in the system prompt and enforce them in the orchestration layer.
For each guardrail, document:
Guardrail: [Name]
Trigger: When does this guardrail activate?
Action: What does the agent do?
Message: What does the agent say to the user?
Fallback: What happens if the guardrail can't be enforced?
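One way to enforce such a record in the orchestration layer is to run every guardrail before the message reaches the model. The sketch below is illustrative: the `Guardrail` shape mirrors the template fields, `enforceGuardrails` is a hypothetical name, and the card-number pattern is a deliberately crude example trigger. The fallback chosen here is to fail closed when a trigger itself errors.

```typescript
type GuardrailAction = "block" | "escalate";

interface Guardrail {
  name: string;
  trigger: (userMessage: string) => boolean; // Trigger
  action: GuardrailAction;                   // Action
  message: string;                           // Message shown to the user
}

interface GuardrailResult {
  allowed: boolean;
  guardrail?: string;
  action?: GuardrailAction;
  reply?: string;
}

function enforceGuardrails(userMessage: string, guardrails: Guardrail[]): GuardrailResult {
  for (const g of guardrails) {
    let triggered: boolean;
    try {
      triggered = g.trigger(userMessage);
    } catch (_err) {
      triggered = true; // Fallback: if detection fails, treat it as triggered
    }
    if (triggered) {
      return { allowed: false, guardrail: g.name, action: g.action, reply: g.message };
    }
  }
  return { allowed: true };
}

// Example guardrail (hypothetical): refuse messages containing card-like numbers.
const noCardNumbers: Guardrail = {
  name: "no-payment-card-data",
  trigger: (text) => /\b\d{13,16}\b/.test(text),
  action: "block",
  message: "Please do not share card numbers in chat.",
};
```

Checking in code as well as in the system prompt means the guardrail still holds when the model ignores its instructions.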
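Monthly cost = conversations/mo × turns/conversation × (input_tokens × input_price + output_tokens × output_price), where input_tokens and output_tokens are averages per turn and prices are per token.
Typical figures per conversation (token counts are totals across all turns, so they are not multiplied by the turn count again):
| Pattern | Avg Turns | Avg Input Tokens (per conversation) | Avg Output Tokens (per conversation) | Cost per Conversation (Sonnet) |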
|---|---|---|---|---|
| Single-turn | 1 | 1,500 | 500 | ~$0.01 |
| ReAct (simple) | 3 | 3,000 | 1,500 | ~$0.03 |
| ReAct (complex) | 7 | 8,000 | 3,000 | ~$0.07 |
| Multi-agent router | 3 | 4,000 | 2,000 | ~$0.04 |
| Plan-and-execute | 8 | 10,000 | 5,000 | ~$0.11 |
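The arithmetic behind the table can be sketched in a few lines. Two assumptions are made here and are not stated in the source: the token columns are read as per-conversation totals, and prices of $3 per million input tokens and $15 per million output tokens are used because they reproduce the listed costs. Check current provider pricing before relying on any figure.

```typescript
interface Pricing {
  inputPerMTok: number;  // USD per million input tokens (assumed value below)
  outputPerMTok: number; // USD per million output tokens (assumed value below)
}

// Cost of one conversation, given total tokens across all of its turns.
function costPerConversation(inputTokens: number, outputTokens: number, p: Pricing): number {
  return (inputTokens * p.inputPerMTok + outputTokens * p.outputPerMTok) / 1e6;
}

function monthlyCost(conversationsPerMonth: number, perConversation: number): number {
  return conversationsPerMonth * perConversation;
}

const assumedPricing: Pricing = { inputPerMTok: 3, outputPerMTok: 15 };

// "ReAct (simple)" row: 3,000 input and 1,500 output tokens.
const reactSimpleCost = costPerConversation(3000, 1500, assumedPricing); // about 0.0315 USD
const reactSimpleMonthly = monthlyCost(10000, reactSimpleCost);          // about 315 USD
```

Output tokens dominate in every row under these prices, so trimming response length usually saves more than trimming the prompt by the same number of tokens.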
| Provider | Best For | Strengths | Weaknesses |
|---|---|---|---|
| Anthropic (Claude) | Tool use, long context, safety | Best tool use, 200K context, strong guardrails | Higher cost at Opus tier |
| OpenAI (GPT-4o) | General purpose, ecosystem | Huge ecosystem, function calling, vision | Slightly weaker at complex reasoning vs Opus |
| Google (Gemini) | Long context, multimodal | 1M+ context window, good vision | Smaller ecosystem |
| Mistral | EU data residency, multilingual | Strong European language support, fast | Smaller model selection |
| Groq | Speed-critical applications | Ultra-fast inference | Limited model selection |
| Local (Ollama) | Privacy, offline, cost control | No API costs, full data control | Requires GPU, lower quality |
When designing agent architecture, provide comprehensive specifications:
### 1. Visual Flow Diagram
Always provide a visual flow diagram showing:
Format (use Mermaid):
```mermaid
graph TD
    A[User Message] --> B{Router Agent}
    B -->|Technical Question| C[Technical Agent]
    B -->|Billing Question| D[Billing Agent]
    B -->|General| E[General Agent]
    C --> F[Search Knowledge Base]
    F --> G{Found Answer?}
    G -->|Yes| H[Format Response]
    G -->|No| I[Escalate to Human]
    D --> J[Query Billing DB]
    J --> K[Format Invoice Data]
    E --> L[Generate Response]
    H --> M[Return to User]
    I --> M
    K --> M
    L --> M
```
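The routing in the diagram can be sketched as a classifier plus a dispatch table. Everything here is illustrative: the keyword classifier stands in for an LLM routing call, and `searchKnowledgeBase`, the sample knowledge base entry, and the reply strings are hypothetical.

```typescript
type Route = "technical" | "billing" | "general";
type Agent = (message: string) => string;

// Keyword-based stand-in for the router agent's classification step.
function classify(message: string): Route {
  const text = message.toLowerCase();
  if (/\b(invoice|refund|charge|billing|payment)\b/.test(text)) return "billing";
  if (/\b(error|bug|crash|install|api)\b/.test(text)) return "technical";
  return "general";
}

// Hypothetical in-memory knowledge base; returns null when nothing matches.
const knowledgeBase = [{ keyword: "install", answer: "Run the installer and restart." }];

function searchKnowledgeBase(message: string): string | null {
  const text = message.toLowerCase();
  for (const entry of knowledgeBase) {
    if (text.indexOf(entry.keyword) >= 0) return entry.answer;
  }
  return null;
}

const agents: Record<Route, Agent> = {
  // Technical branch: answer from the knowledge base or escalate to a human.
  technical: (message) => {
    const hit = searchKnowledgeBase(message);
    return hit !== null ? "Answer: " + hit : "Escalating to a human specialist.";
  },
  billing: () => "Billing: looking up invoices for your account.",
  general: () => "General: happy to help.",
};

function handleMessage(message: string): string {
  return agents[classify(message)](message);
}
```

Because `agents` is typed as `Record<Route, Agent>`, adding a new route to the `Route` union without adding a matching agent fails at compile time.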
### 2. Agent Specifications
For each agent in the system, provide:
Agent: [Agent Name]
Purpose:
[1-2 sentences describing what this agent does and when it's invoked]
Pattern:
[Single-turn / ReAct / Multi-agent router / etc.]
System Prompt:
You are a [role] agent. Your job is to [specific task].
Guidelines:
Available tools:
Output format: [Expected output structure]
Guardrails:
Tools Available:
| Tool Name | Purpose | When to Use |
|-----------|---------|-------------|
| tool-1 | [Brief description] | [Trigger condition] |
| tool-2 | [Brief description] | [Trigger condition] |
Memory Configuration:
- Type: [Session / Persistent / Vector / Hybrid]
- Storage: [Where data is stored]
- Retention: [How long data is kept]
- Load strategy: [When/how to load memory]
Expected Input:
```json
{
  "user_message": "string",
  "conversation_id": "string",
  "user_context": {
    "user_id": "string",
    "preferences": {}
  }
}
```
Expected Output (`confidence` is a number from 0.0 to 1.0; `needs_escalation` is a boolean):
```json
{
  "response": "string",
  "confidence": 0.85,
  "sources": ["source1", "source2"],
  "needs_escalation": false,
  "metadata": {}
}
```
Error Handling:
| Error Type | Trigger | Action | User Message |
|---|---|---|---|
| Tool failure | API returns 500 | Retry 3x with backoff | "I'm having trouble accessing [system]. Please try again." |
| Low confidence | confidence < 0.7 | Escalate to human | "I'm not confident in my answer. Let me connect you with a specialist." |
| Out of scope | Topic outside domain | Redirect | "I specialize in [domain]. For [other topic], please contact [resource]." |
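The rules in the error-handling table can be sketched as two small functions: one that picks the next action from confidence and scope, and one that retries a failing tool call with exponential backoff. The names are hypothetical, and the retry helper is synchronous and only records the delays it would wait, so the logic is easy to test; a real implementation would await a timer between attempts.

```typescript
type NextAction = "respond" | "escalate" | "redirect";

interface Draft {
  confidence: number; // 0.0 to 1.0
  inScope: boolean;
}

const CONFIDENCE_THRESHOLD = 0.7; // from the "Low confidence" row

function chooseAction(draft: Draft): NextAction {
  if (!draft.inScope) return "redirect";
  if (draft.confidence < CONFIDENCE_THRESHOLD) return "escalate";
  return "respond";
}

interface RetryResult<T> {
  ok: boolean;
  value?: T;
  plannedDelaysMs: number[]; // delays a real implementation would sleep for
}

// One initial attempt plus up to maxRetries retries, doubling the delay each time.
function callWithRetry<T>(call: () => T, maxRetries: number, baseDelayMs: number): RetryResult<T> {
  const plannedDelaysMs: number[] = [];
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return { ok: true, value: call(), plannedDelaysMs };
    } catch (_err) {
      if (attempt < maxRetries) {
        plannedDelaysMs.push(baseDelayMs * Math.pow(2, attempt));
      }
    }
  }
  return { ok: false, plannedDelaysMs };
}
```

With `maxRetries = 3` and `baseDelayMs = 100`, a call that always fails would wait 100, 200, and 400 ms before the agent falls back to the user message from the table.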
Token Estimate:
Performance Targets:
### 3. Tool Implementation Specs (REQUIRED for each tool)
**For each custom tool, provide a complete specification:**
Tool: [tool-name]
Type: [API call / Database query / Function / External service]
Description:
[2-3 sentences on what this tool does and why it exists]
Input Schema:
```typescript
interface ToolInput {
  param1: string;   // Description of param1
  param2?: number;  // Optional: Description of param2
  param3: {         // Nested object
    field1: string;
    field2: boolean;
  };
}
```
Output Schema:
```typescript
interface ToolOutput {
  success: boolean;
  data?: {
    // Success case structure
  };
  error?: {
    code: string;
    message: string;
  };
}
```
Implementation: [Detailed description of how the tool works internally]
External Dependencies:
Error Cases:
| Error | Trigger | Code | Message | Retry? |
|---|---|---|---|---|
| [Error name] | [When it happens] | ERROR_CODE | "User-friendly message" | Yes/No |
Example Usage:
Input:
```json
{
  "param1": "example value",
  "param3": {
    "field1": "test",
    "field2": true
  }
}
```
Output (Success):
```json
{
  "success": true,
  "data": {
    "result": "example result"
  }
}
```
Output (Failure):
```json
{
  "success": false,
  "error": {
    "code": "INVALID_INPUT",
    "message": "param1 must not be empty"
  }
}
```
Performance:
Testing Notes:
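A tool that honors the input and output schemas above might look like the following sketch. `runExampleTool` and `doWork` are hypothetical, the `result` field and the `INTERNAL_ERROR` code are assumptions added for illustration, and only the `INVALID_INPUT` case comes from the example in the template.

```typescript
interface ToolInput {
  param1: string;
  param2?: number;
  param3: { field1: string; field2: boolean };
}

interface ToolOutput {
  success: boolean;
  data?: { result: string };
  error?: { code: string; message: string };
}

// Hypothetical body of the tool: repeats param1 and prefixes it with field1.
function doWork(input: ToolInput): string {
  const repeat = input.param2 === undefined ? 1 : input.param2;
  if (repeat < 1) {
    throw new Error("param2 must be at least 1");
  }
  const parts: string[] = [];
  for (let i = 0; i < repeat; i++) {
    parts.push(input.param1);
  }
  return input.param3.field1 + ": " + parts.join(" ");
}

// Validates input, runs the work, and always returns a ToolOutput (never throws).
function runExampleTool(input: ToolInput): ToolOutput {
  if (input.param1.trim() === "") {
    return {
      success: false,
      error: { code: "INVALID_INPUT", message: "param1 must not be empty" },
    };
  }
  try {
    return { success: true, data: { result: doWork(input) } };
  } catch (err) {
    const message = err instanceof Error ? err.message : "unknown error";
    return { success: false, error: { code: "INTERNAL_ERROR", message } };
  }
}
```

Returning errors as data instead of throwing lets the agent read the error code and decide whether to retry, rephrase, or escalate.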
### 4. Memory Architecture Specification (REQUIRED if using memory)
**Provide a detailed memory implementation:**
Memory Strategy: [Session / Persistent / Vector / Hybrid]
Storage Backend:
- Technology: [PostgreSQL / MongoDB / Pinecone / Redis]
- Connection: [How to connect]
- Schema: [Data structure]
Session Memory (if applicable):
- Window size: Last N messages
- Summarization: After M messages, summarize older messages
- Retention: Until session ends (X minutes of inactivity)
Persistent Memory (if applicable):
Database Schema:
```sql
CREATE TABLE conversations (
  id UUID PRIMARY KEY,
  user_id UUID NOT NULL,
  created_at TIMESTAMP,
  last_message_at TIMESTAMP,
  summary TEXT,
  metadata JSONB
);

CREATE TABLE messages (
  id UUID PRIMARY KEY,
  conversation_id UUID REFERENCES conversations(id),
  role VARCHAR(50), -- 'user' or 'assistant'
  content TEXT,
  tokens INTEGER,
  created_at TIMESTAMP
);
```
Load Strategy:
Vector Memory (if applicable):
Chunk Strategy (for document embeddings):
Cost Impact:
Privacy Considerations:
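The session memory settings in this template (keep the last N messages, summarize older ones after M messages) can be sketched as a single context-building function. The names are hypothetical, and the `Summarizer` is a stand-in for what would normally be an LLM call.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

interface SessionContext {
  summary: string | null; // summary of older messages, if any
  recent: Message[];      // the last N messages, verbatim
}

// Hypothetical summarizer; in practice this would be a model call.
type Summarizer = (older: Message[]) => string;

function buildSessionContext(
  history: Message[],
  windowSize: number,      // N: messages kept verbatim
  summarizeAfter: number,  // M: history length that triggers summarization
  summarize: Summarizer
): SessionContext {
  const recent = windowSize > 0 ? history.slice(-windowSize) : [];
  const older = history.slice(0, history.length - recent.length);
  const needsSummary = history.length > summarizeAfter && older.length > 0;
  return { summary: needsSummary ? summarize(older) : null, recent };
}
```

Below the threshold M, older messages are simply dropped from the prompt; above it, they are replaced by a summary so that context stays bounded without losing earlier facts entirely.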
### 5. Guardrail Implementation (REQUIRED)
**For each guardrail, provide an enforceable specification:**
Guardrail: [Name]
Priority: [Critical / High / Medium / Low]
Trigger:
[Specific, measurable condition that activates this guardrail]
Detection Method:
[How the system detects the trigger — pattern matching, classifier, heuristic, etc.]
Action:
1. [First action taken by system]
2. [Second action taken by system]
3. [Final outcome]
User Message:
[Exact message shown to user when guardrail activates]
Bypass Conditions:
[When is it OK to bypass this guardrail? Usually: "Never" or very specific exception]
Logging:
- Log event: Yes/No
- Alert team: Yes/No
- Include in analytics: Yes/No
Example:
User input: "[Example input that triggers guardrail]"
System detects: [What pattern/condition is detected]
System action: [What the system does]
User sees: "[Message shown]"
Testing:
- Test case 1: [Input that SHOULD trigger guardrail]
- Test case 2: [Input that should NOT trigger guardrail]
- Edge case: [Tricky input to test boundary]
### 6. Token and Cost Analysis
Provide a detailed token and cost analysis:
Agent Cost Analysis
Model: [Selected LLM model and tier]
Pricing: Input $X / MTok, Output $Y / MTok
Token Breakdown (per conversation):
| Component | Tokens | Cost |
|-----------|--------|------|
| System prompt | X | $X.XX |
| Average user message | X | $X.XX |
| Average agent response | X | $X.XX |
| Tool results | X | $X.XX |
| Context/memory | X | $X.XX |
| **Total per turn** | **X** | **$X.XX** |
Turns per conversation:
- Simple queries: A turns = $X
- Medium complexity: B turns = $Y
- Complex queries: C turns = $Z
- **Average: D turns = $W**
Monthly projection:
| Usage Level | Conversations/mo | Cost/mo |
|-------------|:----------------:|:-------:|
| Low | 1,000 | $X |
| Medium | 10,000 | $Y |
| High | 100,000 | $Z |
Cost Optimization Strategies:
✅ Strategy #1: [Name]
- Current cost: $X per conversation
- Optimized cost: $Y per conversation
- Savings: Z%
- How: [Specific implementation]
- Trade-off: [What you lose, if anything]
✅ Strategy #2: [Name]
- Current cost: $X per conversation
- Optimized cost: $Y per conversation
- Savings: Z%
- How: [Specific implementation]
- Trade-off: [What you lose, if anything]
[Continue for 3-5 strategies...]
Recommended optimization path:
1. [Start with this optimization — easiest/highest impact]
2. [Then this one]
3. [Finally this one if needed]
Cost with all optimizations:
- Before: $X per conversation
- After: $Y per conversation
- Savings: Z% ($W/month at medium usage)
### 7. Testing Plan
Provide a comprehensive testing plan:
Agent Testing Plan
Unit Tests (per tool):
| Tool | Test Case | Expected Output | Edge Cases |
|------|-----------|----------------|------------|
| tool-1 | [Normal input] | [Expected result] | [3-5 edge cases to test] |
| tool-2 | [Normal input] | [Expected result] | [3-5 edge cases to test] |
Integration Tests (agent workflows):
1. **Happy path test**: [Describe complete successful flow]
- Input: [User message]
- Expected: Agent uses [tools], returns [response]
- Success criteria: [Measurable criteria]
2. **Tool failure test**: [Describe tool failure scenario]
- Input: [User message]
- Failure: [Which tool fails and how]
- Expected: Agent gracefully handles, returns [response]
- Success criteria: [No crash, appropriate fallback]
3. **Guardrail test**: [Describe guardrail trigger]
- Input: [User message that should trigger guardrail]
- Expected: Guardrail activates, agent [action]
- Success criteria: [Specific guardrail behavior]
4. **Multi-turn conversation test**: [Describe conversation flow]
- Turn 1: [User message] → [Agent response]
- Turn 2: [User message] → [Agent response]
- Turn 3: [User message] → [Agent response]
- Success criteria: [Agent maintains context, proper memory usage]
Evaluation Metrics:
| Metric | Target | How to Measure |
|--------|--------|----------------|
| Success rate | >X% | % of conversations that resolve without escalation |
| Response time (p95) | <Y sec | p95 latency from user message to response |
| Tool call accuracy | >Z% | % of tool calls that return successful results |
| User satisfaction | >W rating | Post-conversation survey (1-5 scale) |
| Cost per conversation | <$X | Track actual token usage vs estimate |
Red Team Tests (adversarial):
1. **Jailbreak attempt**: [Try to make agent ignore guardrails]
2. **Infinite loop attempt**: [Try to make agent loop forever]
3. **PII extraction**: [Try to make agent reveal sensitive data]
4. **Out-of-scope task**: [Request something agent shouldn't do]
5. **Ambiguous input**: [Intentionally vague or unclear request]
Testing Timeline:
- Unit tests: Complete before integration
- Integration tests: Complete before user testing
- Evaluation metrics: Track from beta launch onwards
- Red team tests: Run monthly or after major changes
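Several of the evaluation metrics above can be computed directly from per-conversation statistics. The sketch below assumes a hypothetical `ConversationStat` record and uses the nearest-rank method for the p95 latency; other percentile definitions would give slightly different values on small samples.

```typescript
interface ConversationStat {
  escalated: boolean;
  latencyMs: number;
  toolCalls: number;
  toolFailures: number;
}

interface MetricsSummary {
  successRatePct: number;      // % of conversations resolved without escalation
  p95LatencyMs: number;
  toolCallAccuracyPct: number; // % of tool calls that succeeded
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

function summarize(stats: ConversationStat[]): MetricsSummary {
  let resolved = 0;
  let calls = 0;
  let failures = 0;
  const latencies: number[] = [];
  for (const s of stats) {
    if (!s.escalated) resolved += 1;
    calls += s.toolCalls;
    failures += s.toolFailures;
    latencies.push(s.latencyMs);
  }
  return {
    successRatePct: stats.length === 0 ? 0 : (100 * resolved) / stats.length,
    p95LatencyMs: percentile(latencies, 95),
    toolCallAccuracyPct: calls === 0 ? 100 : (100 * (calls - failures)) / calls,
  };
}
```

Tracking p95 rather than the mean keeps a handful of very slow tool calls visible instead of averaging them away.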
### 8. Production Readiness Checklist
Provide a production readiness checklist:
Production Deployment Checklist
Pre-Launch:
- [ ] All tools tested with success/failure cases
- [ ] Guardrails validated with red team tests
- [ ] Cost per conversation confirmed within budget
- [ ] Response time meets p95 target (<X sec)
- [ ] Error handling tested (all tools can fail gracefully)
- [ ] Memory/persistence layer load tested
- [ ] Rate limiting configured (per user and global)
- [ ] Monitoring and logging configured
- [ ] Escalation workflow tested (human handoff works)
- [ ] Documentation complete (runbook, system prompt, tool specs)
Monitoring Setup:
| Metric | Tool | Alert Threshold | Action |
|--------|------|-----------------|--------|
| Error rate | [Sentry/Datadog] | >X% in 5 min | Page on-call engineer |
| Response time | [Monitoring tool] | p95 >Y sec | Investigate slow tools |
| Cost per conversation | [Custom dashboard] | >$Z | Review recent conversations |
| Tool call failures | [Logging] | >W% for any tool | Check external service status |
| Escalation rate | [Analytics] | >X% | Review agent quality |
Logging Requirements:
```json
{
  "conversation_id": "uuid",
  "timestamp": "ISO 8601",
  "user_id": "hashed_user_id",
  "agent_name": "string",
  "turn": 1,
  "action": "tool_call | response | escalation",
  "tool_name": "string | null",
  "latency_ms": 123,
  "tokens_used": {
    "input": 1234,
    "output": 567
  },
  "cost": 0.045,
  "confidence": 0.85,
  "error": null
}
```
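Log records in this shape are enough to drive the error-rate alert from the monitoring table. The following is a sketch under assumptions: `LogRecord` keeps only a subset of the fields, `errorRatePct` and `shouldPage` are hypothetical names, and the window and threshold values are placeholders for the X% and 5-minute figures a team would choose.

```typescript
interface LogRecord {
  conversation_id: string;
  timestamp: string; // ISO 8601
  action: "tool_call" | "response" | "escalation";
  tool_name: string | null;
  latency_ms: number;
  error: string | null;
}

// Percentage of records inside the window [nowMs - windowMs, nowMs] that carry an error.
function errorRatePct(records: LogRecord[], nowMs: number, windowMs: number): number {
  let total = 0;
  let errors = 0;
  for (const r of records) {
    const t = Date.parse(r.timestamp);
    if (isNaN(t) || t > nowMs || t < nowMs - windowMs) {
      continue; // unparseable or outside the window
    }
    total += 1;
    if (r.error !== null) errors += 1;
  }
  return total === 0 ? 0 : (100 * errors) / total;
}

// Alert rule: page the on-call engineer when the rate exceeds the threshold.
function shouldPage(ratePct: number, thresholdPct: number): boolean {
  return ratePct > thresholdPct;
}
```

An empty window reports 0% rather than an error, so a quiet period never pages anyone; a separate "no traffic" alert would be needed to catch a full outage.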
Incident Response:
High error rate detected
Slow response times
Cost spike
Guardrail breach
Runbook Location: [Link to detailed runbook]
On-Call Rotation: [How to contact on-call engineer]