Comprehensive audit skill for agents and skills across the plugin ecosystem. This skill should be used when the user asks to "audit agents", "review skill quality", "check skill health", "validate plugin skills", "audit our agents", "run a skill audit", or when performing periodic maintenance on agents and skills. Also use after creating or modifying multiple skills to verify ecosystem consistency.
From the `bopen-tools` plugin. Install: `npx claudepluginhub b-open-io/claude-plugins --plugin bopen-tools`. This skill uses the workspace's default tool permissions.
Bundled reference files:
- `references/skill-quality-guide.md`
- `references/testing-strategies.md`
- `references/workflow-patterns.md`
Systematic audit methodology for evaluating the health, quality, and consistency of agents and skills across the plugin ecosystem. Produces actionable findings with severity ratings and recommended fixes.
## Audit Dimensions

Every audit evaluates skills across seven dimensions. For each skill, score pass/warn/fail per dimension.
### Scope & Invocation

Verify the invocation control fields are set correctly.
Check against the invocation matrix:
| Scenario | user-invocable | disable-model-invocation |
|---|---|---|
| Default (user + Claude can invoke) | omit (default true) | omit (default false) |
| Agent-only (hidden from / menu) | false | omit |
| User-only (Claude cannot auto-invoke) | omit | true |
| Agent-only + no auto-invoke | false | true |
Checks:
- Should Claude be blocked from auto-invoking this skill? If so, it needs `disable-model-invocation: true`.
- Would a user ever type `/skill-name` directly? If not, the skill needs `user-invocable: false`.
- Is the skill listed in any agent's `tools:` frontmatter? Does that match the intended audience?

Common failure: skills that are agent-internal but missing `user-invocable: false`, cluttering the user's `/` menu.
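As a sketch, an agent-internal skill that should neither appear in the `/` menu nor be auto-invoked would carry frontmatter like this (the name and description are illustrative; the two control fields come from the matrix above):

```yaml
---
name: internal-refactor-helper        # hypothetical skill name
description: Internal helper used by agents during refactors.  # illustrative
user-invocable: false                 # hide from the user's / menu
disable-model-invocation: true        # Claude cannot auto-invoke it
---
```

This corresponds to the "Agent-only + no auto-invoke" row of the invocation matrix.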
### Location & Cross-Client

Checks:
- Does the skill directory name match the `name` field in frontmatter exactly?
- Is the file named `SKILL.md` (case-sensitive)?

### Description Quality

The description is the single most important field -- it determines whether Claude loads the skill.
Structure: [What it does] + [When to use it] + [Key capabilities]
Checks:
- Does the description avoid angle brackets (`<` or `>`)?

Test the description: Ask Claude "When would you use the [skill name] skill?" -- Claude should quote the description back accurately. If it can't, the triggers are weak.
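Some of these checks can be linted mechanically. A minimal sketch, assuming a single-line `description:` field in the YAML frontmatter and the angle-bracket rule from the checklist above (the parsing is deliberately simplified):

```shell
# check_description FILE -- fail when the description is missing or contains < / >
check_description() {
  # extract the description: value from frontmatter (simplified single-line parse)
  desc=$(sed -n 's/^description:[[:space:]]*//p' "$1" | head -n 1)
  if [ -z "$desc" ]; then
    echo "FAIL: $1 has no description"
    return 1
  fi
  case $desc in
    *'<'*|*'>'*)
      echo "WARN: $1 description contains angle brackets"
      return 1
      ;;
  esac
}
```

Usage: `for f in skills/*/SKILL.md; do check_description "$f"; done`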
### Structure

Skills use a three-level system to minimize token usage:
1. Metadata (`name` + `description`) -- always in context
2. SKILL.md body -- loaded when the skill triggers
3. `references/` and `scripts/` files -- loaded on demand
Checks:
- Is the SKILL.md body within the word budget? Run `wc -w` to verify.
- Is detailed reference material in `references/`, not inline?
- Are executable helpers in `scripts/`?
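The `wc -w` check is easy to script. A minimal sketch, assuming a 500-word budget (the actual limit is not stated here; adjust to yours):

```shell
# check_word_budget FILE [BUDGET] -- warn when a SKILL.md exceeds the budget
# 500 is an assumed default, not a value from this guide
check_word_budget() {
  file=$1
  budget=${2:-500}
  words=$(wc -w < "$file")
  if [ "$words" -gt "$budget" ]; then
    echo "WARN: $file is $words words (budget $budget)"
    return 1
  fi
}
```

Usage: `for f in skills/*/SKILL.md; do check_word_budget "$f"; done`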
### Testing

Checks:
- Does the skill ship an `evals/evals.json` with trigger and functional test cases?

Consult `references/testing-strategies.md` for the full testing methodology.
### Agent Equipment

Agents that create or modify skills should have access to the right toolkit:
| Required Skill | Purpose |
|---|---|
| `Skill(skill-creator:skill-creator)` | Interactive skill creation workflow |
| `Skill(plugin-dev:skill-development)` | Skill writing best practices |
| `Skill(bopen-tools:benchmark-skills)` | Eval/benchmark harness |
| `Skill(bopen-tools:agent-auditor)` | This audit skill |
Check the agent's `tools:` frontmatter to verify these are listed.
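This check can also be scripted. A sketch, assuming agent definitions live in `agents/*.md` and reference skills with the `Skill(plugin:name)` syntax used in the table above:

```shell
# check_equipment AGENT_FILE -- report required toolkit skills missing from an agent file
# the required list mirrors the Agent Equipment table
check_equipment() {
  agent=$1
  missing=0
  for skill in \
    'Skill(skill-creator:skill-creator)' \
    'Skill(plugin-dev:skill-development)' \
    'Skill(bopen-tools:benchmark-skills)' \
    'Skill(bopen-tools:agent-auditor)'
  do
    # -F treats the pattern as a fixed string so the parens are literal
    if ! grep -qF "$skill" "$agent"; then
      echo "MISSING: $agent lacks $skill"
      missing=1
    fi
  done
  return $missing
}
```

Usage: `for a in agents/*.md; do check_equipment "$a"; done`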
### Generative UI

If the agent's domain involves UI generation, rendering, or cross-platform output, check for generative UI readiness.
Checks:
- Is `Skill(bopen-tools:generative-ui)` listed in the agent's `tools:`?
- Does the agent use `@json-render/react-native` where cross-platform output is required?

Applicable agents: designer, agent-builder, nextjs, mobile, integration-expert
Not applicable (skip this dimension): code-auditor, documentation-writer, researcher, devops, database, payments
## Workflow

### Step 1: Enumerate and Classify

Delegate enumeration and classification to a subagent to keep the main context clean:
Agent(prompt: "Enumerate and classify all skills in the target plugin.
1. Run: ls skills/*/SKILL.md and count total
2. For each skill, read the YAML frontmatter and classify:
- Type: agent-only (user-invocable: false), user-only (disable-model-invocation: true), or default
- Plugin it belongs in
- Which agents reference it (grep agents/*.md for Skill(name))
3. Return a table: | Skill | Type | Referenced By | Notes |
Target directory: skills/",
subagent_type: "general-purpose")
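For a quick local pass without dispatching a subagent, the frontmatter classification above can be sketched in shell. Note the simplification: a skill that sets both fields is reported as agent-only here, whereas the invocation matrix treats that as a distinct combined case:

```shell
# classify_skill SKILL_MD -- print agent-only, user-only, or default
classify_skill() {
  f=$1
  if grep -q 'user-invocable: false' "$f"; then
    echo agent-only
  elif grep -q 'disable-model-invocation: true' "$f"; then
    echo user-only
  else
    echo default
  fi
}
```

Usage: `for f in skills/*/SKILL.md; do echo "$(dirname "$f"): $(classify_skill "$f")"; done`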
### Step 2: Audit in Batches

For multi-plugin audits, dispatch one subagent per plugin in parallel. For single-plugin audits, dispatch one subagent per batch of 5-10 skills:
Agent(prompt: "Audit these skills against the seven-dimension checklist:
<list of skills from Step 1>
For each skill, evaluate: Scope & Invocation, Location & Cross-Client, Description Quality, Structure, Testing, Agent Equipment, Generative UI.
Score each dimension as pass/warn/fail. Return findings in the report format.",
subagent_type: "general-purpose")
The main context receives only the formatted audit report, not raw skill file contents.
### Step 3: Report

Record a pass/warn/fail status and notes for each dimension.
Format findings as:
## Audit Report: [plugin-name]
### Summary
- Total skills: N
- Pass: N | Warn: N | Fail: N
### Findings
#### [skill-name]
| Dimension | Status | Notes |
|-----------|--------|-------|
| Scope & Invocation | pass/warn/fail | details |
| Location & Cross-Client | pass/warn/fail | details |
| Description Quality | pass/warn/fail | details |
| Structure | pass/warn/fail | details |
| Testing | pass/warn/fail | details |
| Agent Equipment | pass/warn/fail | details |
| Generative UI | pass/warn/fail/skip | details |
**Recommended fixes:**
1. [specific, actionable fix]
### Step 4: Fix and Re-audit

Apply fixes, then re-run the audit on changed skills only. Use the evaluator-optimizer loop from `references/workflow-patterns.md` for iterative improvement.
For multi-plugin audits, use parallelization -- dispatch one subagent per plugin. See `references/workflow-patterns.md` for planning parallel audits and iterative fix cycles.
See `references/testing-strategies.md` for creating evals, running benchmarks, and measuring skill effectiveness.
| File | When to Consult |
|---|---|
| `references/skill-quality-guide.md` | Writing or reviewing description, structure, and instructions |
| `references/workflow-patterns.md` | Planning multi-plugin audits or iterative fix cycles |
| `references/testing-strategies.md` | Creating evals, running benchmarks, measuring effectiveness |