# skill-optimizer

Audits and optimizes Agent Skills (SKILL.md files) across 8 dimensions using session transcripts and static analysis, prioritizing P0-P2 fixes for better triggering.

Install: `npx claudepluginhub hqhq1025/skill-optimizer`. This skill uses the workspace's default tool permissions.
- **Read-only**: never modify skill files; only output the report.
Analyze skills using historical session data + static quality checks, output a diagnostic report with P0/P1/P2 prioritized fixes. Scores each skill on a 5-point composite scale across 8 dimensions.
CSO (Claude/Agent Search Optimization) = writing skill descriptions so agents select the right skill at the right time. This skill checks for CSO violations.
- `/optimize-skill` → scan all skills
- `/optimize-skill my-skill` → single skill
- `/optimize-skill skill-a skill-b` → multiple specified skills

Auto-detect the current agent platform and scan the corresponding paths:
| Source | Claude Code | Codex | Shared |
|---|---|---|---|
| Session transcripts | `~/.claude/projects/**/*.jsonl` | `~/.codex/sessions/**/*.jsonl` | — |
| Skill files | `~/.claude/skills/*/SKILL.md` | `~/.codex/skills/*/SKILL.md` | `~/.agents/skills/*/SKILL.md` |
Platform detection: Check which directories exist. Scan all available sources — a user may have both Claude Code and Codex installed.
Identify target skills
↓
Collect session data (python3 scripts scan JSONL transcripts)
↓
Run 8 analysis dimensions
↓
Compute composite scores
↓
Output report with P0/P1/P2
Scan skill directories in order: `~/.claude/skills/`, `~/.codex/skills/`, `~/.agents/skills/`. Deduplicate by skill name (the same name in multiple locations = the same skill). For each skill, read SKILL.md and extract its frontmatter (name, description) and body content.
If the user specified skill names, filter to only those.
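The discovery and dedup steps above can be sketched as follows. A minimal illustration: `discover_skills` is this sketch's own name, and the directory list is taken directly from the paths table above — adjust if your layout differs.

```python
from pathlib import Path

# Scan locations in priority order; on a name collision the first hit wins.
SKILL_DIRS = [
    Path.home() / ".claude" / "skills",
    Path.home() / ".codex" / "skills",
    Path.home() / ".agents" / "skills",
]

def discover_skills(requested=None):
    """Return {skill_name: path to SKILL.md}, deduplicated by name."""
    skills = {}
    for root in SKILL_DIRS:
        if not root.is_dir():
            continue  # platform not installed -- skip
        for skill_md in sorted(root.glob("*/SKILL.md")):
            skills.setdefault(skill_md.parent.name, skill_md)  # keep first
    if requested:  # user specified skill names -> filter to those
        skills = {n: p for n, p in skills.items() if n in requested}
    return skills
```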
Use python3 scripts via Bash to scan session JSONL files. Extract:

Claude Code sessions (`~/.claude/projects/**/*.jsonl`):

- `Skill` tool_use calls (which skills were invoked)

Codex sessions (`~/.codex/sessions/**/*.jsonl`):

- `session_meta` events → extract `base_instructions` for skill loading evidence
- `response_item` events → assistant outputs (workflow tracking)
- `event_msg` events → tool execution and skill-related events
- `turn_context` events (for reaction analysis)

Note: Codex injects skills via context rather than explicit Skill tool calls. Skill loading (presence in `base_instructions`) does NOT equal active invocation. To detect actual use, search for skill-specific workflow markers (step headers, output formats) in `response_item` content within that session. A skill is "invoked" only if the agent produced output following the skill's defined workflow.
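The loading-vs-invocation distinction can be sketched like this. The event type names come from the source; the exact payload shape is an assumption, so marker matching is done on the serialized event rather than a specific field:

```python
import json

def codex_skill_invoked(session_path, markers):
    """True only if the agent's output follows the skill's workflow
    markers (step headers, output formats) -- loading alone is not use."""
    loaded = invoked = False
    with open(session_path, encoding="utf-8") as f:
        for line in f:
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines
            blob = json.dumps(event)
            if event.get("type") == "session_meta" and any(m in blob for m in markers):
                loaded = True   # skill text present in base_instructions
            elif event.get("type") == "response_item" and any(m in blob for m in markers):
                invoked = True  # agent actually produced workflow output
    return invoked  # loaded-but-not-invoked deliberately counts as False
```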
Aggregated:
You MUST run ALL 8 dimensions. The baseline behavior without this skill is to skip dimensions 4.2, 4.3, 4.5b, and 4.8. These are the most valuable dimensions — do not skip them.
Count how many times each skill was actually invoked vs how many times its trigger keywords appeared in user messages.
Claude Code: count Skill tool_use calls in transcripts.
Codex: count sessions where the agent produced output following the skill's workflow markers (not merely loaded in context).
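For Claude Code, the invoked-vs-mentioned comparison can be sketched as below. The JSONL record shape (`{"type": "assistant"|"user", "message": {...}}` with `tool_use` content blocks) is an assumption about the transcript format; adjust the field access if yours differs.

```python
import json
from pathlib import Path

def trigger_stats(transcript_dir, skill_name, keywords):
    """Count actual Skill tool_use invocations vs. user messages
    mentioning the skill's trigger keywords."""
    invocations = keyword_hits = 0
    for path in Path(transcript_dir).glob("**/*.jsonl"):
        for line in path.read_text(encoding="utf-8").splitlines():
            try:
                rec = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines
            if rec.get("type") == "assistant":
                for block in rec.get("message", {}).get("content", []):
                    if (isinstance(block, dict)
                            and block.get("type") == "tool_use"
                            and block.get("name") == "Skill"
                            and skill_name in json.dumps(block.get("input", {}))):
                        invocations += 1
            elif rec.get("type") == "user":
                text = json.dumps(rec.get("message", "")).lower()
                if any(k.lower() in text for k in keywords):
                    keyword_hits += 1
    return invocations, keyword_hits
```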
Diagnose:
This dimension is critical and easy to skip. Do not skip it.
After a skill is invoked in a session, read the user's next 3 messages. Classify:
Report per-skill satisfaction rate.
This dimension is critical and easy to skip. Do not skip it.
For each skill invocation found in session data:
Report: `{skill-name} (N steps): avg completed Step X/N (Y%)`
If a specific step is frequently where execution stops, flag it.
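The report line and the stall flag can be computed as below. A sketch under assumptions: the inputs (the last step reached per invocation) and the 50% stall threshold are this example's choices, not fixed by the source.

```python
from collections import Counter

def completion_report(skill_name, total_steps, completed_per_session):
    """Format the per-skill line; completed_per_session holds the last
    step reached in each observed invocation."""
    if not completed_per_session:
        return f"{skill_name}: no invocations (N/A)"
    avg = sum(completed_per_session) / len(completed_per_session)
    pct = round(100 * avg / total_steps)
    line = (f"{skill_name} ({total_steps} steps): "
            f"avg completed Step {avg:.1f}/{total_steps} ({pct}%)")
    # Flag a specific step where at least half the runs stopped short.
    stalls = Counter(s for s in completed_per_session if s < total_steps)
    if stalls:
        step, count = stalls.most_common(1)[0]
        if count / len(completed_per_session) >= 0.5:
            line += f" -- frequent stall after Step {step}"
    return line
```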
Check each SKILL.md against these 14 rules:
| Check | Pass Criteria |
|---|---|
| Frontmatter format | Only name + description, total < 1024 chars |
| Name format | Letters, numbers, hyphens only |
| Description trigger | Starts with "Use when..." or has explicit trigger conditions |
| Description workflow leak | Description does NOT summarize the skill's workflow steps (CSO violation) |
| Description pushiness | Description actively claims the scenarios where the skill should be used, rather than passively listing capabilities |
| Overview section | Present |
| Rules section | Present |
| MUST/NEVER density | Count ALL-CAPS directive words; >5 per 100 words = flag. Note: Meincke et al. (2025) found persuasion directives have inconsistent effects across models. Suggest converting to concrete bright-line rules with rationale, not mere emphasis. |
| Word count | < 500 words (flag if over) |
| Narrative anti-pattern | No "In session X, we found..." storytelling — skills should be instructions, not post-hoc reports |
| YAML quoting safety | A `description` containing `:` must be wrapped in double quotes; otherwise a YAML parse failure makes the skill invisible |
| Critical info position | Core trigger conditions and primary actions must be in the first 20% of SKILL.md, not buried in the middle (Lost in the Middle, Liu et al. TACL 2024: U-shaped attention curve) |
| Description 250-char check | Primary trigger keywords must appear within the first 250 characters of description (skill listing truncation point in most agents) |
| Trigger condition count | ≤ 2 trigger conditions in description is ideal; consistent with IFEval (Zhou et al. 2023) finding that LLMs struggle with multi-constraint prompts |
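A few of the 14 checks can be expressed as simple predicates. This is an illustrative subset, not the full rule set; the frontmatter is assumed to be already parsed into a dict:

```python
import re

def static_checks(frontmatter, body, trigger_keywords):
    """Run a handful of the static checks; returns {check_name: passed}."""
    name = frontmatter.get("name", "")
    desc = frontmatter.get("description", "")
    results = {
        # Only name + description, total under 1024 chars.
        "frontmatter_format": set(frontmatter) <= {"name", "description"}
                              and len(name) + len(desc) < 1024,
        # Letters, numbers, hyphens only.
        "name_format": bool(re.fullmatch(r"[A-Za-z0-9-]+", name)),
        "word_count": len(body.split()) < 500,
        # Primary trigger keywords must land in the first 250 chars.
        "desc_250_char": all(k.lower() in desc[:250].lower()
                             for k in trigger_keywords),
    }
    # MUST/NEVER density: >5 all-caps directives per 100 words is flagged.
    words = body.split()
    caps = sum(1 for w in words if w.strip(".,:;") in {"MUST", "NEVER", "ALWAYS"})
    results["directive_density"] = not words or caps / len(words) * 100 <= 5
    return results
```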
Overtrigger: the skill was invoked but the user immediately rejected or ignored it.
This is the highest-value dimension. Memento-Skills (arXiv:2603.18743) demonstrates that skills stored as structured files require accurate retrieval/routing to be effective — skills that are never retrieved cannot improve through their read-write learning loop, making undertriggering a compounding problem.
For each skill, extract its capability keywords (not just trigger keywords — what the skill CAN do). Then scan user messages for tasks that match those capabilities but where the skill was NOT invoked.
Example: user says "run these tasks in parallel" but parallel-runner was not triggered → undertrigger.
Report: which user messages SHOULD have triggered the skill but didn't, and suggest description improvements.
Compounding Risk Assessment: For skills with chronic undertriggering (0 triggers across 5+ sessions where relevant tasks appeared), flag as "compounding risk" — undertriggered skills cannot self-improve through usage feedback, causing the gap to widen over time. Recommend immediate description rewrite as P0.
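The undertrigger scan and the compounding-risk flag above can be sketched as below. The session tuple shape and both function names are this sketch's own; the 5-session threshold comes from the source.

```python
def find_undertriggers(skill_capabilities, sessions):
    """sessions: list of (user_messages, skill_was_invoked) tuples.
    Return user messages that match the skill's capability keywords
    in sessions where the skill was never invoked."""
    misses = []
    for messages, invoked in sessions:
        if invoked:
            continue  # skill fired -- not an undertrigger
        for msg in messages:
            if any(cap.lower() in msg.lower() for cap in skill_capabilities):
                misses.append(msg)
    return misses

def compounding_risk(misses, sessions_with_relevant_tasks):
    """P0 flag: chronic undertriggering across 5+ relevant sessions."""
    return len(misses) > 0 and sessions_with_relevant_tasks >= 5
```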
Compare all skill pairs:
For each skill, extract referenced file paths and shell commands, then verify them (paths with `test -e`, commands with `which`). Flag any broken references.
This dimension is critical and easy to skip. Do not skip it.
For each skill:
Progressive Disclosure Tier Check: Evaluate each skill against the 3-tier loading model (Agent Skills spec): metadata (name + description) is always in context; the SKILL.md body loads when the skill triggers; bundled reference files load only on demand.
Flag skills that put 500+ words in SKILL.md without using reference files as "poor progressive disclosure".
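The flag above reduces to one predicate. A sketch, assuming any non-SKILL.md file in the skill directory counts as a reference/resource file:

```python
from pathlib import Path

def progressive_disclosure_ok(skill_dir):
    """False for skills packing 500+ words into SKILL.md with no
    reference files to defer detail into."""
    skill_dir = Path(skill_dir)
    body = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    word_count = len(body.split())
    # Any sibling file besides SKILL.md counts as a reference/resource.
    has_refs = any(p.name != "SKILL.md"
                   for p in skill_dir.rglob("*") if p.is_file())
    return word_count < 500 or has_refs
```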
Rate each skill on a 5-point scale:
| Score | Meaning |
|---|---|
| 5 | Healthy: high trigger rate, positive reactions, complete workflows, clean static |
| 4 | Good: minor issues in 1-2 dimensions |
| 3 | Needs attention: significant gap in 1 dimension or minor gaps in 3+ |
| 2 | Problematic: never triggered, or negative user reactions, or major static issues |
| 1 | Broken: doesn't work, references missing, or fundamentally misaligned |
Scored dimensions (weighted average):
Qualitative dimensions (reported but not scored — no reliable numeric metric):
(If a scored dimension has no data — e.g., skill was never invoked so no user reaction — mark as "N/A" and redistribute weight.)
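The N/A redistribution rule amounts to a weighted average that drops missing dimensions and renormalizes. The dimension names and weights in the example are illustrative; the source does not fix them.

```python
def composite_score(dimension_scores, weights):
    """Weighted average on the 5-point scale. Dimensions with no data
    (None) are dropped and their weight redistributed proportionally."""
    scored = {d: s for d, s in dimension_scores.items() if s is not None}
    if not scored:
        return None  # nothing to score
    total_w = sum(weights[d] for d in scored)
    return round(sum(s * weights[d] for d, s in scored.items()) / total_w, 1)
```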
# Skill Optimization Report
**Date**: {date}
**Scope**: {all / specified skills}
**Session data**: {N} sessions, {date range}
## Overview
| Skill | Triggers | Reaction | Completion | Static | Undertrigger | Token | Score |
|-------|----------|----------|------------|--------|--------------|-------|-------|
| example-skill | 2 | 100% | 86% | B+ | 1 miss | 486w | 4/5 |
## P0 Fixes (blocking usage)
1. ...
## P1 Improvements (better experience)
1. ...
## P2 Optional Optimizations
1. ...
## Per-Skill Diagnostics
### {skill-name}
#### 4.1 Trigger Rate
...
#### 4.2 User Reaction
...
(all 8 dimensions)
The analysis dimensions in this report are grounded in the following research: