Improvement
- Rebuilt the skill to follow the Skill Forge section cadence with explicit guardrails, validation hooks, and confidence ceilings.
- Added prompt-architect style constraint extraction, contract-first design, and adversarial testing steps to avoid shallow agent definitions.
LIBRARY-FIRST PROTOCOL (MANDATORY)
Before writing ANY code, you MUST check:
Step 1: Library Catalog
- Location: .claude/library/catalog.json
- If match >70%: REUSE or ADAPT
Step 2: Patterns Guide
- Location: .claude/docs/inventories/LIBRARY-PATTERNS-GUIDE.md
- If pattern exists: FOLLOW documented approach
Step 3: Existing Projects
- Location: D:\Projects\*
- If found: EXTRACT and adapt
Decision Matrix
| Match | Action |
|---|---|
| Library >90% | REUSE directly |
| Library 70-90% | ADAPT minimally |
| Pattern exists | FOLLOW pattern |
| In project | EXTRACT |
| No match | BUILD (add to library after) |
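The decision matrix above can be sketched as a single dispatch function. This is a minimal illustration, not the skill's actual implementation; the similarity score is assumed to be computed elsewhere (e.g. against catalog.json entries), and the function only encodes the thresholds from the table.

```python
from typing import Optional

def decide(library_match: Optional[float], pattern_exists: bool, in_project: bool) -> str:
    """Return the library-first action for a proposed component."""
    if library_match is not None and library_match > 0.90:
        return "REUSE"    # Library >90%: reuse directly
    if library_match is not None and library_match >= 0.70:
        return "ADAPT"    # Library 70-90%: adapt minimally
    if pattern_exists:
        return "FOLLOW"   # documented pattern: follow it
    if in_project:
        return "EXTRACT"  # found in an existing project
    return "BUILD"        # no match: build, then add to the library
```

Note the precedence: a strong library match wins even when a pattern or project copy also exists, matching the top-to-bottom order of the table.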
STANDARD OPERATING PROCEDURE
Purpose
Create production-ready specialist agents with clear personas, tool wiring, evaluation scaffolding, and documentation that can be routed by orchestration layers.
Trigger Conditions
- Positive: requests to create or refine an agent, multi-agent system design, agent prompt hardening, adding new tools to an agent.
- Negative/reroute: single-turn tasks better handled by micro-skill-creator, standalone prompt tuning handled by prompt-architect, or skill structure requests handled by skill-builder/skill-forge.
Guardrails
- Use English-first outputs with explicit confidence ceilings following prompt-architect rules.
- Structure-first: deliver the agent spec (frontmatter + body), examples, and validation notes; avoid "created" confirmations without artifacts.
- Avoid duplication: check existing registry entries; prefer specialization over overlap.
- Run adversarial probes (boundary, failure, misuse) before considering the agent production-ready.
- Respect hook latency targets from Skill Forge: pre_hook_target_ms 20 (hard max 100); post_hook_target_ms 100 (hard max 1000).
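One way to enforce the hook latency guardrails above is to time each hook against its budget. This is a hedged sketch: the hook callables, budget names, and error-handling policy are illustrative, not part of the Skill Forge API; only the 20/100 and 100/1000 millisecond figures come from the text.

```python
import time

# Targets and hard maxima (milliseconds) from the Skill Forge guardrail.
HOOK_BUDGETS_MS = {
    "pre_hook": {"target": 20, "max": 100},
    "post_hook": {"target": 100, "max": 1000},
}

def run_hook(name, hook, *args):
    """Run a hook, warn past its target, fail past its hard maximum."""
    start = time.perf_counter()
    result = hook(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    budget = HOOK_BUDGETS_MS[name]
    if elapsed_ms > budget["max"]:
        raise RuntimeError(f"{name} exceeded {budget['max']} ms hard limit")
    if elapsed_ms > budget["target"]:
        print(f"warning: {name} over {budget['target']} ms target")
    return result
```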
Execution Phases
- Discovery: Capture intent, domain, constraints (hard/soft/inferred), tools, and expected outputs.
- Design: Define persona, responsibilities, tool permissions, memory strategy, and decision checkpoints.
- Prompt Construction: Author the system prompt with role, inputs, outputs, style, and escalation rules; add few-shot patterns if helpful.
- Validation: Create test cases (happy path + edge), run self-consistency checks, and document evaluation results with confidence ceilings.
- Delivery: Package the agent spec, integration notes, and next-step risks; register metadata for agent-selector routing.
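The Validation phase above calls for happy-path and edge test cases with documented confidence ceilings. A minimal record for such cases might look like the sketch below; all field names are illustrative assumptions, and the 0.70 default mirrors the inference ceiling used elsewhere in this document.

```python
from dataclasses import dataclass

@dataclass
class AgentTestCase:
    """One validation case for a newly built agent."""
    name: str
    kind: str                 # "happy" | "edge" | "adversarial"
    prompt: str
    expected_behavior: str
    confidence_ceiling: float = 0.70

cases = [
    AgentTestCase("basic_request", "happy", "Summarize this API spec.",
                  "Returns a structured summary with sources"),
    AgentTestCase("empty_input", "edge", "",
                  "Asks for clarification instead of guessing"),
]
```

Keeping cases as structured records lets skill-forge or CI rerun them mechanically, as the Advanced Techniques section suggests.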
Pattern Recognition
- Greenfield agent for new domain → emphasize constraint extraction and baseline tests.
- Refining underperforming agent → capture failure cases and add targeted guardrails.
- Multi-agent topology → design interaction contracts, escalation paths, and boundaries between agents.
Advanced Techniques
- Use program-of-thought to decompose complex roles into capabilities and checkpoints.
- Apply contrastive prompting to defend against prompt drift and misuse.
- Capture evaluation harness steps so skill-forge or CI can rerun them.
Common Anti-Patterns
- Shipping an agent without tool permissions and I/O contracts defined.
- Using generic personas without domain evidence or examples.
- Skipping adversarial tests or confidence ceilings.
Practical Guidelines
- Prefer specific tool verbs over open-ended "help" instructions.
- Keep output formats deterministic (JSON or structured text) to ease downstream parsing.
- Record INFERRED constraints and seek confirmation before finalizing.
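The "deterministic output formats" guideline above can be made concrete with canonical JSON serialization: sorted keys and fixed separators mean identical content always yields byte-identical output, which simplifies downstream parsing and diffing. The payload shape is illustrative.

```python
import json

def render_output(payload: dict) -> str:
    """Serialize an agent response canonically for deterministic parsing."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))

# Key insertion order no longer affects the serialized form.
a = render_output({"status": "ok", "agent": "api-tester"})
b = render_output({"agent": "api-tester", "status": "ok"})
```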
Cross-Skill Coordination
- Upstream: prompt-architect for constraint extraction and clarity; skill-builder for directory scaffolding.
- Parallel: cognitive-lensing for alternative reasoning frames; meta-tools for tooling composition.
- Downstream: agent-selector for routing; recursive-improvement for iterative hardening.
MCP Requirements
- Optional memory/vector MCP for loading prior agent outcomes; tag sessions with WHO=agent-creation-{session}, WHY=skill-execution.
- If external tools are added, document authentication and rate-limit assumptions in the agent spec.
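The session-tagging convention above (WHO=agent-creation-{session}, WHY=skill-execution) can be centralized in a tiny helper so every memory-MCP write uses the same tags. Only the tag format comes from the text; the helper itself is an assumption.

```python
def session_tags(session_id: str) -> dict:
    """Build the WHO/WHY tags for a memory-MCP session."""
    return {
        "WHO": f"agent-creation-{session_id}",
        "WHY": "skill-execution",
    }
```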
Input/Output Contracts
```yaml
inputs:
  task: string              # required: description of the agent's job
  domain: string            # required: domain or vertical
  tools: list[string]       # optional: tools or MCP servers to integrate
  constraints: list[string] # optional: hard/soft/inferred constraints
outputs:
  agent_spec: file          # system prompt with frontmatter and body
  eval_notes: file          # tests, adversarial cases, and results
  integration: summary      # how to register and route the new agent
```
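A validator for the input side of this contract could run before the Discovery phase. The field names mirror the contract; the validator function itself is a sketch, and stricter schema tooling could replace it.

```python
def validate_inputs(payload: dict) -> list:
    """Return a list of contract violations (empty if the payload is valid)."""
    errors = []
    for required in ("task", "domain"):
        if not isinstance(payload.get(required), str) or not payload[required]:
            errors.append(f"missing required string field: {required}")
    for optional in ("tools", "constraints"):
        if optional in payload and not isinstance(payload[optional], list):
            errors.append(f"{optional} must be a list of strings")
    return errors
```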
Recursive Improvement
- Run recursive-improvement with failure cases and evaluation deltas; iterate until improvement delta < 2% or risks are documented.
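The stop condition above can be sketched as a bounded loop. Here the 2% delta is interpreted as an absolute change of 0.02 in the evaluation score (an assumption), improve() stands in for one recursive-improvement round, and the round cap is illustrative.

```python
def harden(improve, initial_score: float, max_rounds: int = 5) -> float:
    """Iterate improvement rounds until the score delta falls below 2%."""
    score = initial_score
    for _ in range(max_rounds):
        new_score = improve(score)
        if abs(new_score - score) < 0.02:  # improvement delta < 2%
            return new_score
        score = new_score
    return score  # cap reached: document remaining risks instead
```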
Examples
- Create a compliance-review agent with SOC2/GDPR guardrails and audit-ready outputs.
- Refine an existing API-tester agent to add contract testing and retry logic for flaky endpoints.
Troubleshooting
- Ambiguous scope → rerun constraint extraction and confirm hard vs soft requirements.
- Overlapping agents → consult registry and propose consolidation or specialization.
- Tool misfires → add error-handling branches and safe defaults in the prompt body.
Completion Verification
Confidence: 0.70 (ceiling: inference 0.70) - SOP rewritten with Skill Forge structure and prompt-architect guardrails.