Perform 12-Factor Agents compliance analysis on any codebase. Use when evaluating agent architecture, reviewing LLM-powered systems, or auditing agentic applications against the 12-Factor methodology.
```
/plugin marketplace add anderskev/beagle
/plugin install anderskev-beagle@anderskev/beagle
```

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Reference: 12-Factor Agents
| Parameter | Description | Required |
|---|---|---|
| codebase_path | Root path of the codebase to analyze | Required |
| docs_path | Path to documentation directory (for existing analyses) | Optional |
Principle: Convert natural language inputs into structured, deterministic tool calls using schema-validated outputs.
Search Patterns:
```bash
# Look for Pydantic schemas
grep -r "class.*BaseModel" --include="*.py"
grep -r "TaskDAG\|TaskResponse\|ToolCall" --include="*.py"

# Look for JSON schema generation
grep -r "model_json_schema\|json_schema" --include="*.py"

# Look for structured output generation
grep -r "output_type\|response_model" --include="*.py"
```

File Patterns: `**/agents/*.py`, `**/schemas/*.py`, `**/models/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | All LLM outputs use Pydantic/dataclass schemas with validators |
| Partial | Some outputs typed, but dict returns or unvalidated strings exist |
| Weak | LLM returns raw strings parsed manually or with regex |
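As a sketch of the Strong level, an LLM response can be parsed into a typed, validated object before any business logic runs (stdlib dataclasses stand in for Pydantic here; `ToolCall` and the field names are illustrative):

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolCall:
    """Typed, validated tool call parsed from an LLM response."""
    tool: str
    args: dict

def parse_tool_call(raw: str) -> ToolCall:
    # Validate structure before handing off to deterministic code,
    # instead of trusting json.loads() output as-is.
    data = json.loads(raw)
    if not isinstance(data.get("tool"), str):
        raise ValueError("'tool' must be a string")
    if not isinstance(data.get("args"), dict):
        raise ValueError("'args' must be an object")
    return ToolCall(data["tool"], data["args"])

call = parse_tool_call('{"tool": "search", "args": {"query": "login bug"}}')
```

A malformed response fails loudly at the parse boundary rather than deep inside a handler.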
Anti-patterns:
- `json.loads(llm_response)` without schema validation
- `output.split()` or regex parsing of LLM responses
- `dict[str, Any]` return types from agents

Principle: Treat prompts as first-class code you control, version, and iterate on.
Search Patterns:
```bash
# Look for embedded prompts
grep -r "SYSTEM_PROMPT\|system_prompt" --include="*.py"
grep -r '""".*You are' --include="*.py"

# Look for template systems
grep -r "jinja\|Jinja\|render_template" --include="*.py"
find . -name "*.jinja2" -o -name "*.j2"

# Look for prompt directories
find . -type d -name "prompts"
```

File Patterns: `**/prompts/**`, `**/templates/**`, `**/agents/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | Prompts in separate files, templated (Jinja2), versioned |
| Partial | Prompts as module constants, some parameterization |
| Weak | Prompts hardcoded inline in functions, f-strings only |
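A minimal sketch of the Strong pattern: prompts live as template files in their own directory and are rendered with parameters (`string.Template` stands in for Jinja2 to keep this stdlib-only; the `prompts/triage.txt` layout is hypothetical):

```python
import tempfile
from pathlib import Path
from string import Template

# Simulate a versioned prompts/ directory containing template files.
with tempfile.TemporaryDirectory() as d:
    prompts = Path(d)
    (prompts / "triage.txt").write_text("You are a $role. Classify: $issue")

    def render_prompt(name: str, **params: str) -> str:
        """Load a prompt template from disk and fill in its parameters."""
        return Template((prompts / f"{name}.txt").read_text()).substitute(params)

    prompt = render_prompt("triage", role="support engineer", issue="login fails")
```

Because the prompt is a file, it can be diffed, reviewed, and A/B tested independently of agent code.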
Anti-patterns:
f"You are a {role}..." inline in agent methodsPrinciple: Control how history, state, and tool results are formatted for the LLM.
Search Patterns:
```bash
# Look for context/message management
grep -r "AgentMessage\|ChatMessage\|messages" --include="*.py"
grep -r "context_window\|context_compiler" --include="*.py"

# Look for custom serialization
grep -r "to_xml\|to_context\|serialize" --include="*.py"

# Look for token management
grep -r "token_count\|max_tokens\|truncate" --include="*.py"
```

File Patterns: `**/context/*.py`, `**/state/*.py`, `**/core/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | Custom context format, token optimization, typed events, compaction |
| Partial | Basic message history with some structure |
| Weak | Raw message accumulation, standard OpenAI format only |
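One possible shape for a Strong implementation: typed events rendered into a custom context format under a budget, keeping the newest events when space runs out (the `Event` type, tag format, and character budget are all illustrative; real systems would budget tokens, not characters):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str      # e.g. "user_message", "tool_result", "error"
    content: str

def to_context(events: list[Event], max_chars: int = 2000) -> str:
    """Render typed events in a custom tagged format, dropping the
    oldest events first once the budget is exhausted."""
    rendered, used = [], 0
    for ev in reversed(events):          # walk newest-first
        block = f"<{ev.kind}>{ev.content}</{ev.kind}>"
        if used + len(block) > max_chars:
            break
        rendered.append(block)
        used += len(block)
    return "\n".join(reversed(rendered))  # restore chronological order

ctx = to_context([Event("user_message", "deploy v2"), Event("tool_result", "ok")])
```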
Anti-patterns:
Principle: Tools produce schema-validated JSON that triggers deterministic code, not magic function calls.
Search Patterns:
```bash
# Look for tool/response schemas
grep -r "class.*Response.*BaseModel" --include="*.py"
grep -r "ToolResult\|ToolOutput" --include="*.py"

# Look for deterministic handlers
grep -r "def handle_\|def execute_" --include="*.py"

# Look for validation layer
grep -r "model_validate\|parse_obj" --include="*.py"
```

File Patterns: `**/tools/*.py`, `**/handlers/*.py`, `**/agents/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | All tool outputs schema-validated, handlers type-safe |
| Partial | Most tools typed, some loose dict returns |
| Weak | Tools return arbitrary dicts, no validation layer |
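To illustrate, a Strong setup routes validated LLM JSON to a deterministic, type-safe handler (stdlib sketch; `CreateTicket`, the `intent` field, and the priority vocabulary are assumptions, not a real API):

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class CreateTicket:
    title: str
    priority: str

def handle_create_ticket(call: CreateTicket) -> str:
    # Deterministic code: no LLM involved past this point.
    return f"ticket created: {call.title} ({call.priority})"

def dispatch(raw: str) -> str:
    """Validate the structured output, then invoke the matching handler."""
    data = json.loads(raw)
    if data.get("intent") == "create_ticket":
        if not isinstance(data.get("title"), str) or data.get("priority") not in {"low", "high"}:
            raise ValueError("invalid create_ticket payload")
        return handle_create_ticket(CreateTicket(data["title"], data["priority"]))
    raise ValueError(f"unknown intent: {data.get('intent')}")

result = dispatch('{"intent": "create_ticket", "title": "Login bug", "priority": "high"}')
```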
Anti-patterns:
- `eval()` or `exec()` on LLM-generated code

Principle: Merge execution state (step, retries) with business state (messages, results).
Search Patterns:
```bash
# Look for state models
grep -r "ExecutionState\|WorkflowState\|Thread" --include="*.py"

# Look for dual state systems
grep -r "checkpoint\|MemorySaver" --include="*.py"
grep -r "sqlite\|database\|repository" --include="*.py"

# Look for state reconstruction
grep -r "load_state\|restore\|reconstruct" --include="*.py"
```

File Patterns: `**/state/*.py`, `**/models/*.py`, `**/database/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | Single serializable state object with all execution metadata |
| Partial | State exists but split across systems (memory + DB) |
| Weak | Execution state scattered, requires multiple queries to reconstruct |
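For example, a Strong design keeps everything in one serializable object that round-trips through JSON, so a thread can be restored from a single record (`ThreadState` and its fields are illustrative):

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class ThreadState:
    # Execution state and business state live in ONE serializable object,
    # so one read restores the whole thread.
    step: int = 0
    retries: int = 0
    messages: list = field(default_factory=list)
    results: list = field(default_factory=list)

state = ThreadState(step=3, messages=["plan", "act"])

# Serialize to a single JSON blob and reconstruct without extra queries.
restored = ThreadState(**json.loads(json.dumps(asdict(state))))
```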
Anti-patterns:
Principle: Agents support simple APIs for launching, pausing at any point, and resuming.
Search Patterns:
```bash
# Look for REST endpoints
grep -r "@router.post\|@app.post" --include="*.py"
grep -r "start_workflow\|pause\|resume" --include="*.py"

# Look for interrupt mechanisms
grep -r "interrupt_before\|interrupt_after" --include="*.py"

# Look for webhook handlers
grep -r "webhook\|callback" --include="*.py"
```

File Patterns: `**/routes/*.py`, `**/api/*.py`, `**/orchestrator/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | REST API + webhook resume, pause at any point including mid-tool |
| Partial | Launch/pause/resume exists but only at coarse-grained points |
| Weak | CLI-only launch, no pause/resume capability |
Anti-patterns:
- `input()` or `confirm()` calls

Principle: Human contact is a tool call with question, options, and urgency.
Search Patterns:
```bash
# Look for human input mechanisms
grep -r "typer.confirm\|input(\|prompt(" --include="*.py"
grep -r "request_human_input\|human_contact" --include="*.py"

# Look for approval patterns
grep -r "approval\|approve\|reject" --include="*.py"

# Look for structured question formats
grep -r "question.*options\|HumanInputRequest" --include="*.py"
```

File Patterns: `**/agents/*.py`, `**/tools/*.py`, `**/orchestrator/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | request_human_input tool with question/options/urgency/format |
| Partial | Approval gates exist but hardcoded in graph structure |
| Weak | Blocking CLI prompts, no tool-based human contact |
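A sketch of the Strong pattern: human contact is itself a structured tool call carrying question, options, and urgency, rather than a blocking prompt (`HumanInputRequest` and the payload shape are hypothetical):

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class HumanInputRequest:
    question: str
    options: tuple
    urgency: str = "normal"
    response_format: str = "choice"

def request_human_input(req: HumanInputRequest) -> dict:
    # In a real system this would enqueue the request (Slack, email, web UI)
    # and the workflow would pause until a webhook delivers the answer.
    return {"type": "human_input_request", **asdict(req)}

payload = request_human_input(HumanInputRequest(
    question="Deploy to production?",
    options=("approve", "reject"),
    urgency="high",
))
```

Because the request is data, it can be routed to any channel and answered asynchronously.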
Anti-patterns:
- `typer.confirm()` in agent code

Principle: Custom control flow, not framework defaults. Full control over routing, retries, compaction.
Search Patterns:
```bash
# Look for routing logic
grep -r "add_conditional_edges\|route_\|should_continue" --include="*.py"

# Look for custom loops
grep -r "while True\|for.*in.*range" --include="*.py" | grep -v test

# Look for execution mode control
grep -r "execution_mode\|agentic\|structured" --include="*.py"
```

File Patterns: `**/orchestrator/*.py`, `**/graph/*.py`, `**/core/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | Custom routing functions, conditional edges, execution mode control |
| Partial | Framework control flow with some customization |
| Weak | Default framework loop with no custom routing |
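Concretely, owning control flow can be as simple as a routing function the orchestrator calls between steps, instead of a fixed framework loop (the state keys and node names here are illustrative):

```python
def route(state: dict) -> str:
    """Custom routing: inspect state and decide the next node explicitly."""
    if state.get("consecutive_errors", 0) >= 3:
        return "escalate_to_human"   # break the loop, don't retry forever
    if state.get("awaiting_approval"):
        return "wait_for_approval"   # pause for a human gate
    if state.get("done"):
        return "finish"
    return "call_llm"                # default: another model turn

next_node = route({"consecutive_errors": 3})
```

Every transition is ordinary code you can test, log, and change without touching the framework.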
Anti-patterns:
Principle: Errors in context enable self-healing. Track consecutive errors, escalate after threshold.
Search Patterns:
```bash
# Look for error handling
grep -r "except.*Exception\|error_history\|consecutive_errors" --include="*.py"

# Look for retry logic
grep -r "retry\|backoff\|max_attempts" --include="*.py"

# Look for escalation
grep -r "escalate\|human_escalation" --include="*.py"
```

File Patterns: `**/agents/*.py`, `**/orchestrator/*.py`, `**/core/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | Errors in context, retry with threshold, automatic escalation |
| Partial | Errors logged and returned, no automatic retry loop |
| Weak | Errors logged only, not fed back to LLM, task fails immediately |
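The Strong level might look like this sketch: failures are appended to the context the LLM sees next turn, a consecutive-error counter is tracked, and the thread escalates past a threshold (the state dict shape and threshold of 3 are assumptions):

```python
def run_step(state: dict, step_fn) -> dict:
    """Execute one step; on failure, compact the error into context
    so the LLM can self-correct, and escalate after 3 failures in a row."""
    try:
        state["results"].append(step_fn())
        state["consecutive_errors"] = 0          # success resets the counter
    except Exception as exc:
        state["consecutive_errors"] = state.get("consecutive_errors", 0) + 1
        state["context"].append(f"<error>{exc}</error>")  # visible to the LLM next turn
        if state["consecutive_errors"] >= 3:
            state["status"] = "escalated"        # stop looping, involve a human
    return state

def failing():
    raise RuntimeError("timeout")

state = {"results": [], "context": [], "consecutive_errors": 2}
state = run_step(state, failing)
```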
Anti-patterns:
- `logger.error()` without adding to context

Principle: Each agent has narrow responsibility, 3-10 steps max.
Search Patterns:
```bash
# Look for agent classes
grep -r "class.*Agent\|class.*Architect\|class.*Developer" --include="*.py"

# Look for step definitions
grep -r "steps\|tasks" --include="*.py" | head -20

# Count methods per agent
grep -r "async def\|def " agents/*.py 2>/dev/null | wc -l
```

File Patterns: `**/agents/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | 3+ specialized agents, each with single responsibility, step limits |
| Partial | Multiple agents but some have broad scope |
| Weak | Single "god" agent that handles everything |
Anti-patterns:
Principle: Workflows triggerable from CLI, REST, WebSocket, Slack, webhooks, etc.
Search Patterns:
```bash
# Look for entry points
grep -r "@cli.command\|@router.post\|@app.post" --include="*.py"

# Look for WebSocket support
grep -r "WebSocket\|websocket" --include="*.py"

# Look for external integrations
grep -r "slack\|discord\|webhook" --include="*.py" -i
```

File Patterns: `**/routes/*.py`, `**/cli/*.py`, `**/main.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | CLI + REST + WebSocket + webhooks + chat integrations |
| Partial | CLI + REST API available |
| Weak | CLI only, no programmatic access |
Anti-patterns:
- `if __name__ == "__main__"` entry point only

Principle: Agents as pure functions: (state, input) -> (state, output). No side effects in agent logic.
Search Patterns:
```bash
# Look for state mutation patterns
grep -r "\.status = \|\.field = " --include="*.py"

# Look for immutable updates
grep -r "model_copy\|\.copy(\|with_" --include="*.py"

# Look for side effects in agents
grep -r "write_file\|subprocess\|requests\." agents/*.py 2>/dev/null
```

File Patterns: `**/agents/*.py`, `**/nodes/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | Immutable state updates, side effects isolated to tools/handlers |
| Partial | Mostly immutable, some in-place mutations |
| Weak | State mutated in place, side effects mixed with agent logic |
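As an illustration, the stateless-reducer shape can be enforced with frozen dataclasses and `dataclasses.replace`, so every step returns a new state and mutation is a `TypeError` (the `State` fields and event strings are illustrative):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class State:
    step: int
    log: tuple = ()   # tuple, not list: immutable history

def agent_step(state: State, event: str) -> State:
    """Pure reducer: (state, input) -> new state. Never mutates the old one."""
    return replace(state, step=state.step + 1, log=state.log + (event,))

s0 = State(step=0)
s1 = agent_step(s0, "planned")
```

Side effects (file writes, network calls) then live only in tools and handlers, which makes agent logic trivially replayable.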
Anti-patterns:
- `state.field = new_value` (mutation)

Principle: Fetch likely-needed data upfront rather than mid-workflow.
Search Patterns:
```bash
# Look for context pre-fetching
grep -r "pre_fetch\|prefetch\|fetch_context" --include="*.py"

# Look for RAG/embedding systems
grep -r "embedding\|vector\|semantic_search" --include="*.py"

# Look for related file discovery
grep -r "related_tests\|similar_\|find_relevant" --include="*.py"
```

File Patterns: `**/context/*.py`, `**/retrieval/*.py`, `**/rag/*.py`
Compliance Criteria:
| Level | Criteria |
|---|---|
| Strong | Automatic pre-fetch of related tests, files, docs before planning |
| Partial | Manual context passing, design doc support |
| Weak | No pre-fetching, LLM must request all context via tools |
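A toy sketch of pre-fetching: before planning starts, gather the likely-related test files by a naming heuristic so the LLM does not have to request them one tool call at a time (the `test_<stem>*.py` convention is an assumption; real systems would add embedding or import-graph search):

```python
import tempfile
from pathlib import Path

def prefetch_context(changed_file: str, root: Path) -> list[str]:
    """Find test files likely related to a changed source file, upfront."""
    stem = Path(changed_file).stem
    return sorted(p.name for p in root.rglob(f"test_{stem}*.py"))

# Demo against a throwaway directory standing in for a repository.
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    (root / "test_auth.py").write_text("def test_login(): ...")
    (root / "test_billing.py").write_text("def test_invoice(): ...")
    related = prefetch_context("auth.py", root)
```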
Anti-patterns:
| Factor | Status | Notes |
|--------|--------|-------|
| 1. Natural Language -> Tool Calls | **Strong/Partial/Weak** | [Key finding] |
| 2. Own Your Prompts | **Strong/Partial/Weak** | [Key finding] |
| ... | ... | ... |
| 13. Pre-fetch Context | **Strong/Partial/Weak** | [Key finding] |
**Overall**: X Strong, Y Partial, Z Weak
For each factor, provide:
- Current Implementation
- Compliance Level
- Gaps
- Recommendations
1. Initial Scan
2. Deep Dive (per factor)
3. Gap Analysis
4. Recommendations
5. Summary
| Score | Meaning | Action |
|---|---|---|
| Strong | Fully implements principle | Maintain, minor optimizations |
| Partial | Some implementation, significant gaps | Planned improvements |
| Weak | Minimal or no implementation | High priority for roadmap |