Audit runtime controls for tool permissions, approvals, memory, telemetry, evals, rollout, and containment. Use when reviewing tool-bearing agent systems. NOT for security scans, prompt-only work, or static code review.
npx claudepluginhub wyattowalsh/agents --plugin agents

This skill uses the workspace's default tool permissions.
Design and audit the controls that keep tool-bearing agent systems predictable, observable, and safe to operate.
Scope: Runtime governance for agents that use tools, memory, approvals,
subagents, evals, or external systems. NOT for generic vulnerability scanning
(security-scanner), normal code review (honest-review), prompt-only
optimization (prompt-engineer), or MCP implementation details (mcp-creator).
| `$ARGUMENTS` | Mode | Action |
|---|---|---|
| Empty | menu | Show governance modes and required inputs |
| `design <system>` | design | Define runtime policies for a new or changing agent system |
| `audit <path-or-system>` | audit | Review existing tool, approval, memory, telemetry, and eval controls |
| `permissions <agent-or-tools>` | permissions | Design allowlists, denylists, approval modes, and escalation rules |
| `memory <agent-or-system>` | memory | Define memory scope, retention, privacy, and invalidation policy |
| `evals <workflow>` | evals | Plan regression, adversarial, and runtime acceptance eval loops |
| `rollout <system>` | rollout | Define staged release, monitoring, rollback, and operator readiness controls |
| `incident <failure-mode>` | incident | Define containment and recovery controls for agent failures |
| Natural language about agent tools, permissions, memory, evals, or containment | auto-detect | Auto-detect the closest mode |
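
The dispatch table above can be sketched as a small routing function. This is a minimal illustration, not the skill's actual implementation: the `route` function, the `KEYWORD_HINTS` mapping, and the fallback to `menu` when nothing matches are all assumptions made for the example.

```python
# Modes listed in the dispatch table.
MODES = ("design", "audit", "permissions", "memory", "evals", "rollout", "incident")

# Hypothetical keyword hints for natural-language requests (not from the skill).
KEYWORD_HINTS = {
    "permissions": ("permission", "allowlist", "denylist", "approval"),
    "memory": ("memory", "retention", "privacy"),
    "evals": ("eval", "regression", "adversarial"),
    "rollout": ("rollout", "rollback", "release"),
    "incident": ("incident", "containment", "failure"),
    "audit": ("audit", "review"),
    "design": ("design",),
}


def route(arguments: str) -> tuple[str, str]:
    """Return (mode, target) for a raw $ARGUMENTS string."""
    text = arguments.strip()
    if not text:
        return ("menu", "")  # empty arguments show the menu
    first, _, rest = text.partition(" ")
    if first.lower() in MODES:
        return (first.lower(), rest.strip())  # explicit mode keyword
    lowered = text.lower()
    for mode, hints in KEYWORD_HINTS.items():
        if any(hint in lowered for hint in hints):
            return (mode, text)  # closest mode by keyword match
    return ("menu", text)  # assumed fallback when nothing matches


print(route("audit ./agents/billing"))
```

An explicit mode keyword always wins over keyword detection, so `audit memory-service` is routed to `audit` rather than `memory`.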
| Surface | Review Questions |
|---|---|
| Tools | Which tools can read, write, spend money, deploy, message users, or delete data? |
| Approvals | Which operations require explicit user approval or human review? |
| Memory | What can be stored, for how long, and at what scope? |
| State | What is durable, replayable, idempotent, and auditable? |
| Telemetry | Which traces, decisions, tool calls, and failures are observable? |
| Evals | Which scenarios prevent regression before rollout? |
| Containment | How does the system stop, rollback, quarantine, or degrade safely? |
Use these canonical terms exactly when producing governance reports.
| Term | Meaning |
|---|---|
| tool consequence | The real-world effect a tool call can have: read, write, deploy, message, spend, delete, or expose |
| approval gate | Explicit human or policy checkpoint before a higher-risk action |
| runtime guard | Hook, wrapper, allowlist, denylist, test, or platform policy that enforces a governance rule |
| memory boundary | Scope, retention, redaction, and invalidation policy for stored agent context |
| containment | Stop, rollback, quarantine, or degrade action after unsafe or failed behavior |
| shadow mode | Runtime mode that records proposed actions without executing them |
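
To make the vocabulary concrete, here is a minimal sketch of a runtime guard that combines an allowlist, an approval gate, and shadow mode. The `RuntimeGuard` class, the `GATED` set, and the log labels are hypothetical names chosen for illustration; which tool consequences need approval is a policy decision the skill leaves to the report.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Tool consequences that require an approval gate in this sketch (an assumption).
GATED = {"write", "deploy", "message", "spend", "delete", "expose"}


@dataclass
class RuntimeGuard:
    allowlist: dict[str, str]             # tool name -> tool consequence
    approve: Callable[[str, str], bool]   # approval gate: (tool, consequence) -> bool
    shadow_mode: bool = False             # record proposed actions without executing
    log: list[tuple[str, str]] = field(default_factory=list)

    def call(self, tool: str, action: Callable[[], object]) -> Optional[object]:
        consequence = self.allowlist.get(tool)
        if consequence is None:
            self.log.append((tool, "denied:not-allowlisted"))
            return None
        if self.shadow_mode:
            self.log.append((tool, "shadow:recorded"))
            return None
        if consequence in GATED and not self.approve(tool, consequence):
            self.log.append((tool, "denied:approval"))
            return None
        self.log.append((tool, "executed"))
        return action()


guard = RuntimeGuard(
    allowlist={"read_file": "read", "delete_rows": "delete"},
    approve=lambda tool, consequence: False,  # stand-in for a human reviewer
)
guard.call("read_file", lambda: "contents")
guard.call("delete_rows", lambda: "rows deleted")
print(guard.log)
```

Every decision is appended to `log`, so the same object doubles as a simple telemetry record of which calls ran, which were blocked, and why.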
Classify the request before choosing a mode:
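- Generic vulnerability scanning: route to security-scanner.
- Normal code review: route to honest-review.
- Prompt-only optimization: route to prompt-engineer.
- MCP implementation details: route to mcp-creator.

| Scope | Strategy |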
|---|---|
| Single agent or workflow | Produce one control matrix and one eval/monitoring set |
| Multiple agents sharing tools | Group by tool consequence and shared approval gates |
| Platform-wide governance | Define baseline policy first, then exceptions by agent class |
| Live production rollout | Add staged rollout, rollback, monitoring, and owner review gates |
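
For the "multiple agents sharing tools" row, grouping by tool consequence might look like the sketch below. The input shape (agent name mapped to a tool-to-consequence dictionary) and the agent and tool names are hypothetical; a real inventory would come from the system under review.

```python
from collections import defaultdict


def group_by_consequence(agent_tools: dict[str, dict[str, str]]) -> dict[str, list[tuple[str, str]]]:
    """Map each tool consequence to the sorted (agent, tool) pairs that carry it."""
    groups: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for agent, tools in agent_tools.items():
        for tool, consequence in tools.items():
            groups[consequence].append((agent, tool))
    return {consequence: sorted(pairs) for consequence, pairs in sorted(groups.items())}


# Hypothetical inventory of two agents that share a consequence class.
inventory = {
    "billing-agent": {"charge_card": "spend", "read_invoice": "read"},
    "support-agent": {"send_email": "message", "refund": "spend"},
}
for consequence, pairs in group_by_consequence(inventory).items():
    print(consequence, pairs)
```

Tools that land in the same group are candidates for a shared approval gate, since they carry the same kind of real-world effect regardless of which agent calls them.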
- SKILL.md for routing and control surfaces.
- references/control-matrix.md for permissions, memory, telemetry, and eval controls.
- references/rollout-governance.md only when release, rollback, monitoring, or production readiness is in scope.

| File | Read When |
|---|---|
| references/control-matrix.md | Designing or auditing runtime control surfaces |
| references/rollout-governance.md | Planning staged release, rollback, monitoring, and operator readiness |
## Agent Governance Report
- System:
- Mode:
- Risk tier:
### Control Matrix
| Surface | Current | Required | Enforcement | Evidence |
|---|---|---|---|---|
### Required Changes
- ...
### Evals And Monitoring
- ...
### Rollout And Containment
- ...
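
The Control Matrix section of the report template could be filled programmatically. The sketch below is one possible approach, not part of the skill: `ControlRow` and `render_matrix` are hypothetical names, and the sample row is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ControlRow:
    surface: str
    current: str
    required: str
    enforcement: str
    evidence: str


def render_matrix(rows: list[ControlRow]) -> str:
    """Render rows as a Markdown table using the template's column order."""
    lines = [
        "| Surface | Current | Required | Enforcement | Evidence |",
        "|---|---|---|---|---|",
    ]
    for row in rows:
        cells = (row.surface, row.current, row.required, row.enforcement, row.evidence)
        # Escape pipes so cell text cannot break the table layout.
        lines.append("| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |")
    return "\n".join(lines)


# Invented sample row, for illustration only.
print(render_matrix([
    ControlRow("Tools", "No allowlist", "Allowlist per agent", "runtime guard", "config diff"),
]))
```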
When the request is outside runtime governance, route it to security-scanner, honest-review, prompt-engineer, or mcp-creator.

Before declaring this skill complete after edits:
uv run wagents validate
uv run wagents eval validate
uv run python audit.py skills/agent-runtime-governance
uv run wagents package agent-runtime-governance --dry-run
Completion criteria: