Evaluates Claude Skills for description quality, content organization, writing style, and structural integrity. Generates weighted scores, grades, and improvement plans in score-only, remediation, or batch modes.
From claude-scholar

`npx claudepluginhub galaxy-dawn/claude-scholar --plugin claude-scholar`

This skill uses the workspace's default tool permissions.
- references/batch-review-template.md
- references/examples-bad.md
- references/examples-good.md
- references/scoring-criteria.md
- scripts/extract-yaml.sh
- scripts/skill-audit.py
A meta-skill for evaluating the quality of Claude Skills. Perform a comprehensive analysis across four key dimensions (description quality 25%, content organization 30%, writing style 20%, structural integrity 25%) to generate weighted scores, letter grades, and actionable improvement plans.
Use this skill to validate skills before sharing, identify improvement opportunities, or ensure compliance with skill development best practices.
Invoke this skill when:
Trigger phrases:
Use one of three review modes depending on the task: score-only, remediation-backlog, or batch-portfolio.
Prefer remediation-backlog when the user asks what to fix next.
Prefer batch-portfolio when auditing many skills at once.
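For batch-portfolio mode, the skills to audit first have to be enumerated. A minimal sketch, assuming each skill is a direct subdirectory of the given root and is identified by a `SKILL.md` file; `discover_skills` is a hypothetical helper, not part of this skill's scripts:

```python
from pathlib import Path


def discover_skills(root: str) -> list[Path]:
    """Return skill directories directly under root that contain SKILL.md.

    Assumption: skills are not nested more than one level below root.
    """
    base = Path(root).expanduser()
    return sorted(path.parent for path in base.glob("*/SKILL.md"))
```

Each returned directory can then be passed through the single-skill workflow in turn.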
Accept skill path as input. Verify the path exists and contains SKILL.md. Read the complete skill directory structure.
```bash
# Example invocation
ls -la ~/.claude/skills/target-skill/
```
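The same load step can be sketched in Python. This is an illustration under assumptions, not the skill's actual implementation: `load_skill` and the shape of its return value are hypothetical.

```python
from pathlib import Path


def load_skill(skill_path: str) -> dict:
    """Verify the skill path and list every file in the skill directory."""
    root = Path(skill_path).expanduser()
    if not root.is_dir():
        raise FileNotFoundError(f"Skill path is not a directory: {root}")

    skill_md = root / "SKILL.md"
    if not skill_md.is_file():
        raise FileNotFoundError(f"SKILL.md not found in: {root}")

    # Relative POSIX-style paths keep the listing stable across platforms.
    files = sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )
    return {"root": root, "skill_md": skill_md, "files": files}
```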
Validate:
Extract and validate the YAML frontmatter from SKILL.md.
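A minimal sketch of the extraction, assuming the frontmatter is a leading block delimited by `---` lines and contains only simple single-line `key: value` pairs. It deliberately avoids a YAML library, so multi-line values (for example block scalars) are not handled. The function names are hypothetical and are not taken from `scripts/extract-yaml.sh`; the required-field check covers `name` and `description`.

```python
def extract_frontmatter(text: str) -> dict[str, str]:
    """Parse top-level `key: value` pairs from a leading `---` block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}

    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Skip comments, indented continuation lines, and lines without a key.
        if line[:1] in (" ", "\t", "#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip("\"'")
    return {}  # No closing delimiter: treat the frontmatter as invalid.


def missing_required(fields: dict[str, str],
                     required: tuple[str, ...] = ("name", "description")) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [key for key in required if not fields.get(key)]
```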
Required fields:
- `name` - Skill identifier
- `description` - Trigger description with phrases

Check for:
Assess the quality and effectiveness of the frontmatter description.
Scoring breakdown:
| Criterion | Points | Evaluation |
|---|---|---|
| Trigger phrases clarity | 25 | 3-5 specific user phrases present |
| Third-person format | 25 | Uses "This skill should be used when..." |
| Description length | 25 | 100-300 characters optimal |
| Specific scenarios | 25 | Concrete use cases, not vague |
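The table above can be approximated in code. This sketch rests on several assumptions that the source does not confirm: each criterion is scored all-or-nothing, trigger phrases are counted as double-quoted spans, and the "specific scenarios" criterion is judged by the reviewer and passed in as `scenario_points`. The detailed rubric in `references/scoring-criteria.md` may award partial credit instead.

```python
import re


def score_description(description: str, scenario_points: int = 0) -> dict[str, int]:
    """Score a frontmatter description against the four 25-point criteria."""
    quoted_phrases = re.findall(r'"([^"]+)"', description)
    third_person = description.strip().startswith("This skill should be used when")

    scores = {
        "trigger_phrases": 25 if 3 <= len(quoted_phrases) <= 5 else 0,
        "third_person": 25 if third_person else 0,
        "length": 25 if 100 <= len(description) <= 300 else 0,
        # Manual judgment, clamped to the criterion's 0-25 range.
        "specific_scenarios": max(0, min(25, scenario_points)),
    }
    scores["total"] = sum(scores.values())
    return scores
```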
Red flags:
Reference: references/examples-good.md for exemplary descriptions
Assess adherence to progressive disclosure principles.
Scoring breakdown:
| Criterion | Points | Evaluation |
|---|---|---|
| Progressive disclosure | 30 | SKILL.md lean, details in references/ |
| SKILL.md length | 25 | Under 5,000 words (1,500-2,000 ideal) |
| References/ usage | 25 | Detailed content properly moved |
| Logical organization | 20 | Clear sections, good flow |
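The SKILL.md length criterion is the easiest one to check mechanically. A small sketch, assuming words are whitespace-separated tokens; the three labels are hypothetical, and only the 5,000-word limit and the 1,500-2,000 ideal range come from the table:

```python
def classify_length(text: str) -> tuple[int, str]:
    """Return the word count of SKILL.md and a label for the length criterion."""
    words = len(text.split())
    if words >= 5000:
        label = "too long"
    elif 1500 <= words <= 2000:
        label = "ideal"
    else:
        label = "acceptable"
    return words, label
```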
Check:
Reference: references/scoring-criteria.md for detailed rubrics
Verify adherence to skill writing conventions.
Scoring breakdown:
| Criterion | Points | Evaluation |
|---|---|---|
| Imperative form | 40 | Verb-first instructions throughout |
| No second person in body | 30 | Avoids conversational second person in the main workflow body |
| Objective language | 30 | Factual, instructional tone |
Check for:
Good examples:

- Create the skill directory structure.
- Validate the YAML frontmatter.
- Check for required fields.

Bad examples:

- You should create the directory.
- You need to validate the frontmatter.
- Check if the fields are there.
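Second-person wording is the one style criterion that a simple scan can surface. A hedged sketch: the pronoun list and the choice to skip fenced code blocks are assumptions, and the hits still need a human decision about whether they sit in the main workflow body.

```python
import re

SECOND_PERSON = re.compile(r"\b(you|your|yours|yourself)\b", re.IGNORECASE)


def find_second_person(body: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that use second-person pronouns."""
    hits = []
    in_fence = False
    for number, line in enumerate(body.splitlines(), start=1):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence  # Toggle on fence open and close.
            continue
        if not in_fence and SECOND_PERSON.search(line):
            hits.append((number, line.strip()))
    return hits
```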
Verify the skill's physical structure and completeness.
Scoring breakdown:
| Criterion | Points | Evaluation |
|---|---|---|
| YAML frontmatter | 30 | All required fields present |
| Directory structure | 30 | Proper organization |
| Resource references | 40 | All referenced files exist |
Validate:
name and description

```
skill-name/
├── SKILL.md
├── references/ (optional)
├── examples/ (optional)
└── scripts/ (optional)
```
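The "all referenced files exist" criterion can be sketched as a path scan over SKILL.md. This is an illustration only, not the logic of `scripts/skill-audit.py`: the regular expression and the `missing_resources` helper are assumptions, and only paths under `references/`, `scripts/`, and `examples/` are considered.

```python
import re
from pathlib import Path

# Matches relative paths such as references/scoring-criteria.md,
# ignoring bare directory mentions and trailing punctuation.
RESOURCE_PATTERN = re.compile(r"\b(?:references|scripts|examples)/[\w./-]*\w")


def missing_resources(skill_dir: str) -> list[str]:
    """Return resource paths mentioned in SKILL.md that do not exist on disk."""
    root = Path(skill_dir).expanduser()
    text = (root / "SKILL.md").read_text(encoding="utf-8")
    referenced = sorted(set(RESOURCE_PATTERN.findall(text)))
    return [rel for rel in referenced if not (root / rel).exists()]
```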
Compute the overall quality score using weighted dimensions.
Formula:
Overall Score = (Description × 0.25) + (Organization × 0.30) +
(Style × 0.20) + (Structure × 0.25)
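As a worked example of the formula, the sketch below computes the weighted total for hypothetical dimension scores of 80, 90, 70, and 100. The dictionary keys and the rounding to one decimal place are assumptions; the weights are the ones given above.

```python
WEIGHTS = {
    "description": 0.25,
    "organization": 0.30,
    "style": 0.20,
    "structure": 0.25,
}


def overall_score(scores: dict[str, float]) -> float:
    """Combine the four 0-100 dimension scores into one weighted score."""
    total = sum(scores[name] * weight for name, weight in WEIGHTS.items())
    return round(total, 1)


example = {"description": 80, "organization": 90, "style": 70, "structure": 100}
print(overall_score(example))  # 20 + 27 + 14 + 25 = 86.0
```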
Letter grade mapping:
| Score Range | Grade | Meaning |
|---|---|---|
| 97-100 | A+ | Exemplary |
| 93-96 | A | Excellent |
| 90-92 | A- | Very Good |
| 87-89 | B+ | Good |
| 83-86 | B | Above Average |
| 80-82 | B- | Solid |
| 77-79 | C+ | Acceptable |
| 73-76 | C | Satisfactory |
| 70-72 | C- | Minimal Acceptable |
| 67-69 | D+ | Below Standard |
| 63-66 | D | Poor |
| 60-62 | D- | Very Poor |
| 0-59 | F | Fail |
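The grade table translates directly into a lookup on lower bounds. One assumption is needed because the table lists integer ranges: a fractional score such as 92.5 is graded by the highest lower bound it reaches, so it maps to A-.

```python
# (lower bound, grade) pairs in descending order, taken from the table above.
GRADE_FLOORS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def letter_grade(score: float) -> str:
    """Map a 0-100 overall score to its letter grade."""
    for floor, grade in GRADE_FLOORS:
        if score >= floor:
            return grade
    return "F"
```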
Create two output documents in the current working directory.
1. Quality Report (quality-report-{skill-name}.md)
2. Improvement Plan (improvement-plan-{skill-name}.md)
# Skill Quality Report: {skill-name}
## Executive Summary
- **Overall Score**: X/100 ({Grade})
- **Evaluated**: {Date}
- **Skill Path**: {path}
## Dimension Scores
### 1. Description Quality (25%)
**Score**: X/100
**Strengths**:
- ✅ {specific strength}
**Weaknesses**:
- ❌ {specific weakness}
**Recommendations**:
1. {actionable recommendation}
[Repeat for other dimensions...]
## Grade Breakdown
| Dimension | Score | Weight | Contribution |
|-----------|-------|--------|--------------|
| Description | X/100 | 25% | X.X |
| Organization | X/100 | 30% | X.X |
| Style | X/100 | 20% | X.X |
| Structure | X/100 | 25% | X.X |
| **Overall** | **X/100** | **100%** | **X.X ({Grade})** |
## Next Steps
See `improvement-plan-{skill-name}.md` for detailed improvement suggestions.
# Skill Improvement Plan: {skill-name}
## Priority Summary
- **High Priority**: {count} items
- **Medium Priority**: {count} items
- **Low Priority**: {count} items
## High Priority Improvements
### 1. [Issue Title]
**File**: SKILL.md:line:line
**Dimension**: Description Quality
**Impact**: +X points
**Current**:
```yaml
{current content}
```
**Suggested**:
```yaml
{suggested content}
```
**Reason**: {why this improves quality}

[Continue with all issues...]
## Additional Resources
### Reference Files
For detailed evaluation criteria and examples, consult:
- **`references/scoring-criteria.md`** - Comprehensive scoring rubrics for each dimension
- **`references/examples-good.md`** - Exemplary skills demonstrating best practices
- **`references/examples-bad.md`** - Common anti-patterns to avoid
### Scripts
- **`scripts/extract-yaml.sh`** - Utility for extracting YAML frontmatter from SKILL.md
- **`scripts/skill-audit.py`** - Lightweight integrity audit for missing references, word count, and sibling-path checks
### Related Skills
- **`skill-development`** - Comprehensive guide for creating skills
- **`code-review-excellence`** - Best practices for code review
## Best Practices
### When Analyzing Skills
1. **Be objective and specific** - Base scores on observable criteria, not opinions
2. **Provide actionable feedback** - Each recommendation should be concrete and implementable
3. **Include examples** - Show current vs. suggested content for clarity
4. **Estimate impact** - Help users understand which changes matter most
5. **Be constructive** - Frame feedback as opportunities for improvement
### Common Quality Issues
**Description Quality:**
- Vague or generic trigger phrases
- Second-person descriptions
- Missing concrete use cases
**Content Organization:**
- SKILL.md too long (>5,000 words)
- Detailed content not moved to references/
- Poor information hierarchy
**Writing Style:**
- Second-person language ("you", "your")
- Mixed imperative and descriptive styles
- Subjective or conversational tone
**Structural Integrity:**
- Missing required YAML fields
- Referenced files don't exist
- Incomplete examples or broken scripts
### Grade Benchmarks
**A grade (90-100)**: Exemplary skills serving as templates for others
- All dimensions score 85+
- Clear, specific descriptions
- Excellent progressive disclosure
- Consistent imperative style
- Complete, well-organized structure
**B grade (80-89)**: High-quality skills with minor improvements needed
- Most dimensions score 75+
- Good descriptions and organization
- Generally follows best practices
- May have minor style inconsistencies
**C grade (70-79)**: Acceptable skills requiring moderate improvements
- Key areas meet minimum standards
- Some weaknesses in organization or style
- Functional but not exemplary
**D/F grade (below 70)**: Skills needing significant work
- Multiple dimensions below 70
- Major structural or style issues
- Requires comprehensive revision
## Usage Examples
**Example 1: Analyze a local skill**
User: "Analyze skill quality for ~/.claude/skills/git-workflow"
[Claude executes the 8-step workflow and generates:]
**Example 2: Review before sharing**
User: "Review my new skill before I publish it"
[Claude analyzes the skill and provides:]
**Example 3: Quality check for existing skill**
User: "Check skill quality of api-helper"
[Claude evaluates and reports:]
**Example 4: Batch portfolio review**
User: "Review all skills in ~/.claude/skills and tell me what to fix first"
[Claude evaluates and reports:]