Build applications where agents are first-class citizens. Use this skill when designing autonomous agents, creating MCP tools, implementing self-modifying systems, or building apps where features are outcomes achieved by agents operating in a loop.
From aimi-engineering: `npx claudepluginhub aimi-so/aimi-engineering-plugin --plugin aimi-engineering`. This skill uses the workspace's default tool permissions.
<why_now>
Software agents work reliably now. Claude Code proved that an LLM with bash and file tools, looping until an objective is met, can accomplish complex tasks autonomously. The same architecture applies beyond coding — file management, workflows, any domain. The Claude Code SDK makes this accessible: features aren't code you write, they're outcomes agents achieve. </why_now>
<core_principles>
Whatever the user can do through the UI, the agent must be able to achieve through tools.
This doesn't require a 1:1 mapping of UI buttons to tools; it requires that the agent can achieve the same outcomes. Sometimes that's a single tool (create_note), sometimes it's composing primitives (write_file to a notes directory).
Discipline: When adding any UI capability, ask: can the agent achieve this outcome? If not, add the necessary tools or primitives.
Read action-parity-discipline.md for capability mapping workflow.
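One way to enforce this discipline is a capability map checked in CI. The sketch below assumes nothing from the skill itself: the `Capability` shape, the outcomes, and `findOrphans` are all illustrative.

```typescript
// Hypothetical capability map: every UI outcome names the tool path an
// agent would use to reach the same result. Names are illustrative.
type Capability = { outcome: string; agentPath: string[] };

const uiCapabilities: Capability[] = [
  { outcome: "create a note",  agentPath: ["write_file"] },
  { outcome: "archive a note", agentPath: ["move_file"] },
  { outcome: "search notes",   agentPath: ["list_files", "read_file"] },
];

// Returns UI outcomes the agent cannot reach with the current tool set,
// suitable for a test that fails on any orphan UI action.
function findOrphans(caps: Capability[], toolNames: string[]): string[] {
  const available = new Set(toolNames);
  return caps
    .filter(c => c.agentPath.some(t => !available.has(t)))
    .map(c => c.outcome);
}
```

Run this check whenever a UI capability is added: an empty result means parity holds; any entry names the outcome that needs a new tool or primitive.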
Prefer atomic primitives. Features are outcomes achieved by an agent operating in a loop.
A tool is a primitive capability (read file, write file, run bash). A feature is an outcome described in a prompt, achieved by an agent with tools looping until done.
Less granular: classify_and_organize_files(files) → You wrote the logic
More granular: read_file, write_file, move_file → Agent makes decisions
Prompt: "Organize downloads by content and recency"
Key shift: To change how a feature behaves, you edit prose, not refactor code.
With atomic tools and parity, new features are just new prompts.
Want a "weekly review" feature? Write a prompt: "Review files modified this week, summarize changes, suggest three priorities." The agent uses list_files, read_file, and judgment. No weekly-review code needed. Users can extend behavior the same way.
Constraint: Only works if tools are atomic enough for unanticipated composition and the agent has parity.
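Concretely, a "feature" can be nothing more than a named prompt over the same atomic tool set. The feature names and prompt strings below are illustrative, not part of the skill:

```typescript
// Features become named prompts over the same atomic tools.
// Adding a feature is adding an entry, not a new code path.
const features: Record<string, string> = {
  weeklyReview:
    "Review files modified this week, summarize changes, suggest three priorities.",
  organizeDownloads:
    "Organize downloads by content and recency.",
};

// In a real app this would hand the prompt to the agent loop,
// e.g. agent.run({ prompt: features[name], tools, systemPrompt }).
function featurePrompt(name: keyof typeof features): string {
  return features[name];
}
```

Users extending behavior means appending to this record (or its file equivalent), with no deploy in between.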
The agent can accomplish things you didn't explicitly design for.
When tools are atomic and parity is maintained, users ask for unanticipated things — and the agent figures them out. This reveals latent demand: observe what users ask, then optimize common patterns with domain tools or dedicated prompts.
Flywheel: Atomic tools + parity → users request unexpected things → agent composes solutions → you observe patterns → optimize → repeat.
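Observing the flywheel can be as simple as logging every request before the agent runs. The file name, record shape, and helper names here are assumptions:

```typescript
import { appendFileSync, readFileSync, existsSync } from "node:fs";

// Append each user request to a JSONL log so recurring, unanticipated
// asks surface later as candidates for domain tools or dedicated prompts.
function logRequest(logPath: string, userMessage: string): void {
  const record = { ts: new Date().toISOString(), request: userMessage };
  appendFileSync(logPath, JSON.stringify(record) + "\n");
}

// Read the log back to spot repeated request patterns.
function loggedRequests(logPath: string): string[] {
  if (!existsSync(logPath)) return [];
  return readFileSync(logPath, "utf8").trim().split("\n")
    .map(line => JSON.parse(line).request);
}
```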
Agent-native apps get better through accumulated context and prompt refinement.
Persist learnings in `context.md` or structured memory. Read self-modification.md for guardrails.
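A minimal sketch of accumulated context, assuming a `context.md` in the workspace that is injected into every run; the function names and prompt wording are illustrative, not a prescribed format:

```typescript
import { readFileSync, appendFileSync, existsSync } from "node:fs";

// Inject accumulated learnings into the system prompt. The agent (or
// the user) appends new learnings over time, so behavior improves
// without code changes.
function buildSystemPrompt(basePrompt: string, contextPath = "context.md"): string {
  const memory = existsSync(contextPath) ? readFileSync(contextPath, "utf8") : "";
  return memory ? `${basePrompt}\n\nAccumulated context:\n${memory}` : basePrompt;
}

// Record a learning as a bullet the next run will see.
function recordLearning(learning: string, contextPath = "context.md"): void {
  appendFileSync(contextPath, `- ${learning}\n`);
}
```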
Test for each principle: verify your app passes it before shipping. </core_principles>
<intake>
Ask which area the user needs help with (see the routing table below). Wait for response before proceeding. </intake>
<routing>
| Response | Action |
|----------|--------|
| 1, "design", "architecture", "plan" | Read [architecture-patterns.md](./references/architecture-patterns.md), then apply Architecture Checklist below |
| 2, "files", "workspace", "filesystem" | Read [files-universal-interface.md](./references/files-universal-interface.md) and [shared-workspace-architecture.md](./references/shared-workspace-architecture.md) |
| 3, "tool", "mcp", "primitive", "crud" | Read [mcp-tool-design.md](./references/mcp-tool-design.md) |
| 4, "domain tool", "when to add" | Read [from-primitives-to-domain-tools.md](./references/from-primitives-to-domain-tools.md) |
| 5, "execution", "completion", "loop" | Read [agent-execution-patterns.md](./references/agent-execution-patterns.md) |
| 6, "prompt", "system prompt", "behavior" | Read [system-prompt-design.md](./references/system-prompt-design.md) |
| 7, "context", "inject", "runtime", "dynamic" | Read [dynamic-context-injection.md](./references/dynamic-context-injection.md) |
| 8, "parity", "ui action", "capability map" | Read [action-parity-discipline.md](./references/action-parity-discipline.md) |
| 9, "self-modify", "evolve", "git" | Read [self-modification.md](./references/self-modification.md) |
| 10, "product", "progressive", "approval", "latent demand" | Read [product-implications.md](./references/product-implications.md) |
| 11, "mobile", "ios", "android", "background", "checkpoint" | Read [mobile-patterns.md](./references/mobile-patterns.md) |
| 12, "test", "testing", "verify", "validate" | Read [agent-native-testing.md](./references/agent-native-testing.md) |
| 13, "review", "refactor", "existing" | Read [refactoring-to-prompt-native.md](./references/refactoring-to-prompt-native.md) |

After reading the reference, apply those patterns to the user's specific context. </routing>
<architecture_checklist>
Verify these before implementation:
- [ ] Atomic tools with action parity for every UI capability
- [ ] Behavior defined in the system prompt, not hardcoded workflows
- [ ] Explicit `complete_task` tool (not heuristic detection)

When designing architecture, explicitly address each checkbox. </architecture_checklist>
<quick_start>
```typescript
import { z } from "zod";

// Step 1: Atomic tools
const tools = [
  tool("read_file", "Read any file", { path: z.string() }, ...),
  tool("write_file", "Write any file", { path: z.string(), content: z.string() }, ...),
  tool("list_files", "List directory", { path: z.string() }, ...),
  tool("complete_task", "Signal completion", { summary: z.string() }, ...),
];

// Step 2: Behavior in system prompt
const systemPrompt = `When asked to organize content:
1. Read existing files to understand structure
2. Analyze what organization makes sense
3. Create/move files using your tools
4. Call complete_task when done. You decide the structure.`;

// Step 3: Agent loops until complete_task is called
const result = await agent.run({ prompt: userMessage, tools, systemPrompt });
```
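If your SDK does not provide the loop, its shape is roughly the sketch below. `callModel`, the `ToolCall`/`Tool` types, and the history format are placeholders, not a real API:

```typescript
// Skeleton of the agent loop: ask the model for the next action, execute
// the requested tool, feed the observation back, and stop only when the
// explicit complete_task signal arrives.
type ToolCall = { name: string; args: Record<string, string> };
type Tool = { name: string; run: (args: Record<string, string>) => Promise<string> };

async function runAgent(
  callModel: (history: string[]) => Promise<ToolCall>,
  tools: Tool[],
): Promise<string> {
  const history: string[] = [];
  while (true) {
    const call = await callModel(history);
    if (call.name === "complete_task") return call.args.summary; // explicit stop
    const tool = tools.find(t => t.name === call.name);
    const result = tool ? await tool.run(call.args) : `unknown tool: ${call.name}`;
    history.push(`${call.name} -> ${result}`); // observation for the next turn
  }
}
```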
</quick_start>
<reference_index>
All references in references/:

Core Patterns:
- [architecture-patterns.md](./references/architecture-patterns.md)
- [files-universal-interface.md](./references/files-universal-interface.md)
- [shared-workspace-architecture.md](./references/shared-workspace-architecture.md)
- [mcp-tool-design.md](./references/mcp-tool-design.md)
- [from-primitives-to-domain-tools.md](./references/from-primitives-to-domain-tools.md)
- [agent-execution-patterns.md](./references/agent-execution-patterns.md)
- [system-prompt-design.md](./references/system-prompt-design.md)
- [dynamic-context-injection.md](./references/dynamic-context-injection.md)

Agent-Native Disciplines:
- [action-parity-discipline.md](./references/action-parity-discipline.md)
- [self-modification.md](./references/self-modification.md)
- [agent-native-testing.md](./references/agent-native-testing.md)
- [refactoring-to-prompt-native.md](./references/refactoring-to-prompt-native.md)
- [product-implications.md](./references/product-implications.md)

Platform-Specific:
- [mobile-patterns.md](./references/mobile-patterns.md)
<anti_patterns>
Agent as router — Agent figures out intent, calls a function. Uses intelligence to route, not to act. You're using a fraction of agent capability.
Build app, then add agent — Features built as code, then exposed to agent. No emergent capability possible.
Request/response thinking — Agent does one thing and returns. Misses the loop: agent pursues an outcome, handles unexpected situations along the way.
Defensive tool design — Over-constrained inputs (strict enums, excessive validation) prevent unanticipated usage.
Happy path in code — Edge cases handled in code means the agent is just a caller, not using judgment.
Cardinal sin: Agent executes your code instead of figuring things out
```typescript
// WRONG: You wrote the workflow
tool("process_feedback", async ({ message }) => {
  const cat = categorize(message);
  const pri = calculatePriority(message);
  await store(message, cat, pri);
  if (pri > 3) await notify();
});

// RIGHT: Agent decides
// tools: store_item, send_message
// prompt: "Rate importance 1-5, store feedback, notify if >= 4"
```
Workflow-shaped tools — analyze_and_organize bundles judgment. Break into primitives.
Context starvation — Agent doesn't know what resources exist. Fix: inject available resources and vocabulary into system prompt.
Orphan UI actions — User can do something the agent can't. Fix: maintain parity.
Silent actions — Agent changes state, UI doesn't update. Fix: shared data stores with reactive binding.
Heuristic completion — Detecting completion through heuristics is fragile. Fix: explicit complete_task tool.
Static tool mapping — 50 tools for 50 API endpoints. Fix: discover + access pattern for dynamic APIs.
Incomplete CRUD — Agent can create but not update or delete. Fix: every entity needs full CRUD.
Sandbox isolation — Agent works in separate data space. Fix: shared workspace.
Gates without reason — Domain tool is the only path, restricting access unintentionally. Keep primitives available. </anti_patterns>
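The discover + access fix for static tool mapping collapses N endpoint tools into two. The registry and handlers below are illustrative stand-ins for a real API spec and HTTP client:

```typescript
// Instead of one tool per endpoint, the agent first discovers what
// exists, then calls it. New endpoints become available without new tools.
const endpoints: Record<string, (body: string) => string> = {
  "GET /notes": () => '["a.md","b.md"]',
  "POST /notes": (body) => `created ${body}`,
};

// Tool 1: discover — the agent learns which endpoints exist.
const discoverEndpoints = (): string[] => Object.keys(endpoints);

// Tool 2: access — the agent calls any discovered endpoint.
const callEndpoint = (name: string, body = ""): string => {
  const handler = endpoints[name];
  return handler ? handler(body) : `unknown endpoint: ${name}`;
};
```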
<success_criteria>
Architecture:
Implementation:
The Ultimate Test: Describe an outcome within your domain that you didn't build a feature for. Can the agent figure it out, looping until it succeeds? If yes — agent-native. If "I don't have a feature for that" — too constrained. </success_criteria>