Validate and grade Claude Code Skills, Commands, Subagents, and Hooks for quality and correctness. Check YAML syntax, verify naming conventions, validate required fields, test activation patterns, assess description quality. Generate quality scores using Q = 0.40R + 0.30C + 0.20S + 0.10E framework with specific improvement recommendations. Use when validating artifacts, checking quality, troubleshooting activation issues, or ensuring artifact correctness before deployment.
Comprehensive validation and quality grading system for Claude Code Skills, Commands, Subagents, and Hooks. Ensures artifacts meet technical requirements, follow best practices, and will activate/execute reliably.
First, determine what type of artifact to validate:
Detection rules:
- Skill: .claude/skills/*/SKILL.md
- Command: .claude/commands/*.md
- Subagent: .claude/agents/*.yaml
- Hook: settings.json under the hooks key

Example:
# Auto-detect artifact type
if [[ -f ".claude/skills/my-skill/SKILL.md" ]]; then
TYPE="skill"
elif [[ -f ".claude/commands/my-command.md" ]]; then
TYPE="command"
elif [[ -f ".claude/agents/my-agent.yaml" ]]; then
TYPE="subagent"
fi
Validate YAML frontmatter is syntactically correct:
For Skills and Commands:
# Extract YAML frontmatter (between --- delimiters)
python3 -c "
import yaml
import sys

with open('SKILL.md') as f:
    content = f.read()

parts = content.split('---')
if len(parts) < 3:
    print('ERROR: Missing YAML frontmatter delimiters')
    sys.exit(1)

try:
    data = yaml.safe_load(parts[1])
    print('✅ YAML syntax valid')
    print(f'Fields: {list(data.keys())}')
except yaml.YAMLError as e:
    print(f'❌ YAML syntax error: {e}')
    sys.exit(1)
"
For Subagents:
# Full YAML file
python3 -c "
import yaml

with open('agent.yaml') as f:
    try:
        data = yaml.safe_load(f)
        print('✅ YAML syntax valid')
    except yaml.YAMLError as e:
        print(f'❌ YAML error: {e}')
"
Common YAML errors:
- Tabs instead of spaces for indentation
- Smart quotes instead of straight quotes
- Missing or unbalanced --- delimiters

Check that all required fields are present and valid:
Skill required fields:

- name (string, matches directory name)
- description (string, ≤1024 characters)

Skill optional fields:

- allowed-tools (comma-separated string)

Command required fields:

- description (string)

Command optional fields:

- argument-hint (string)
- allowed-tools (comma-separated string)
- disable-model-invocation (boolean)

Subagent required fields:

- name (string)
- description (string)

Subagent optional fields:

- tools (array)
- model (enum: sonnet, opus, haiku)

Validation example:
import yaml

with open('SKILL.md') as f:
    parts = f.read().split('---')
data = yaml.safe_load(parts[1])

# Check required fields
required = ['name', 'description']
missing = [f for f in required if f not in data]
if missing:
    print(f"❌ Missing required fields: {missing}")
else:
    print("✅ All required fields present")

# Validate field types and constraints
if 'name' in data and not isinstance(data['name'], str):
    print("❌ 'name' must be a string")

if 'description' in data:
    if not isinstance(data['description'], str):
        print("❌ 'description' must be a string")
    elif len(data['description']) > 1024:
        print(f"❌ 'description' exceeds 1024 chars: {len(data['description'])}")
    else:
        print(f"✅ description length: {len(data['description'])}/1024")
Verify directory/file names match YAML fields:
For Skills:
# Directory name must match YAML 'name' field
SKILL_FILE=".claude/skills/my-skill/SKILL.md"
SKILL_DIR=$(basename "$(dirname "$SKILL_FILE")")
YAML_NAME=$(grep "^name:" "$SKILL_FILE" | cut -d: -f2 | xargs)
if [ "$SKILL_DIR" == "$YAML_NAME" ]; then
  echo "✅ Name matches: $SKILL_DIR"
else
  echo "❌ Mismatch: directory=$SKILL_DIR, YAML=$YAML_NAME"
fi
Naming rules:

- Lowercase letters, digits, and hyphens only
- Maximum 64 characters
- Cannot start or end with a hyphen
- No consecutive hyphens

Validation:
import re

def validate_name(name, artifact_type):
    """Validate artifact name follows conventions."""
    # Check length
    if len(name) > 64:
        return f"❌ Name too long: {len(name)} chars (max 64)"

    # Check pattern
    if not re.match(r'^[a-z0-9-]+$', name):
        return "❌ Name must be lowercase alphanumeric with hyphens only"

    # Check doesn't start/end with hyphen
    if name.startswith('-') or name.endswith('-'):
        return "❌ Name cannot start or end with hyphen"

    # Check no consecutive hyphens
    if '--' in name:
        return "❌ No consecutive hyphens allowed"

    return f"✅ Name '{name}' is valid"
Grade Skill descriptions using activation quality criteria:
Quality dimensions:
1. Action Verbs (0-10 points)
2. Capabilities Specificity (0-10 points)
3. Technology Mentions (0-10 points)
4. Trigger Keywords (0-10 points)
5. Character Efficiency (0-10 points)
Scoring formula:
Description Score = (Verbs + Specificity + Tech + Keywords + Efficiency) / 50 * 100
Grade:

- A: 90-100
- B: 80-89
- C: 70-79
- D: 60-69
- F: below 60
Example assessment:
def assess_description(description):
    """Grade Skill description quality."""
    # 1. Count action verbs
    action_verbs = ['extract', 'parse', 'test', 'validate', 'generate',
                    'optimize', 'convert', 'analyze', 'detect', 'compare']
    verb_count = sum(1 for verb in action_verbs if verb in description.lower())
    verb_score = min(10, verb_count * 2)

    # 2. Check specificity (heuristic: no vague terms)
    vague_terms = ['helps', 'various', 'multiple', 'different']
    has_vague = any(term in description.lower() for term in vague_terms)
    specificity_score = 5 if has_vague else 10

    # 3. Count technologies (file extensions, tool names)
    tech_indicators = ['.pdf', '.json', '.csv', 'api', 'sql', 'http',
                       'rest', 'openapi', 'yaml']
    tech_count = sum(1 for tech in tech_indicators if tech in description.lower())
    tech_score = min(10, tech_count * 3)

    # 4. Check for "Use when/for"
    has_triggers = 'use when' in description.lower() or 'use for' in description.lower()
    trigger_score = 10 if has_triggers else 0

    # 5. Character efficiency
    length = len(description)
    if 400 <= length <= 700:
        efficiency_score = 10
    elif 200 <= length < 400 or 700 < length <= 900:
        efficiency_score = 7
    else:
        efficiency_score = 3

    total = verb_score + specificity_score + tech_score + trigger_score + efficiency_score
    percentage = (total / 50) * 100

    if percentage >= 90:
        grade = 'A'
    elif percentage >= 80:
        grade = 'B'
    elif percentage >= 70:
        grade = 'C'
    elif percentage >= 60:
        grade = 'D'
    else:
        grade = 'F'

    return {
        'grade': grade,
        'score': percentage,
        'breakdown': {
            'verbs': verb_score,
            'specificity': specificity_score,
            'technologies': tech_score,
            'triggers': trigger_score,
            'efficiency': efficiency_score
        }
    }
Check the artifact's content beyond YAML frontmatter:
For Skills (SKILL.md): required sections present, at least 3 complete examples, no leftover placeholders, and every code block tagged with a language.

For Commands: a command body present that uses its arguments ($1, $2, defaults) correctly, with no placeholders.

For Subagents: a substantive system prompt with no placeholders.
Validation approach:
import re

def validate_skill_content(filepath):
    """Check Skill content quality."""
    with open(filepath) as f:
        content = f.read()

    issues = []

    # Check for required sections
    required_sections = [
        'Core Capabilities',
        'Methodology',
        'Examples'
    ]
    for section in required_sections:
        if section not in content:
            issues.append(f"Missing section: {section}")

    # Count examples
    example_count = content.count('### Example')
    if example_count < 3:
        issues.append(f"Only {example_count} examples (need 3+)")

    # Check for placeholders
    placeholders = ['TODO', 'FIXME', 'PLACEHOLDER', 'Lorem ipsum']
    for placeholder in placeholders:
        if placeholder in content:
            issues.append(f"Contains placeholder: {placeholder}")

    # Check code blocks have language
    code_blocks = re.findall(r'```(\w*)\n', content)
    unnamed_blocks = [i for i, lang in enumerate(code_blocks) if not lang]
    if unnamed_blocks:
        issues.append(f"{len(unnamed_blocks)} code blocks missing language")

    if not issues:
        return "✅ Content quality: PASS"
    else:
        return "⚠️ Content issues:\n" + "\n".join(f" - {issue}" for issue in issues)
Simulate activation scenarios to predict reliability:
Test categories:

- Direct keyword matches
- Technology mentions
- Contextual triggers
- File-extension triggers
Example test suite:
def test_activation_potential(description):
    """Predict if Skill will activate for common scenarios."""
    # Extract keywords from description
    keywords = set(description.lower().split())

    test_scenarios = [
        {
            'name': 'Direct keyword match',
            'request': 'extract tables from pdf',
            'expected_keywords': ['extract', 'tables', 'pdf']
        },
        {
            'name': 'Technology mention',
            'request': 'use pdfplumber to get data',
            'expected_keywords': ['pdfplumber', 'data']
        }
    ]

    results = []
    for scenario in test_scenarios:
        matches = sum(1 for kw in scenario['expected_keywords']
                      if kw in keywords)
        if matches >= 2:
            results.append(f"✅ {scenario['name']}: Likely activates")
        else:
            results.append(f"⚠️ {scenario['name']}: May not activate")

    return results
Compile all validation results into comprehensive report:
Report structure:
# Artifact Validation Report
**Artifact:** [name]
**Type:** [Skill/Command/Subagent/Hook]
**Date:** [YYYY-MM-DD]
## Overall Grade: [A/B/C/D/F]
### YAML Validation
- Syntax: [✅/❌]
- Required fields: [✅/❌]
- Optional fields: [list present fields]
### Naming Convention
- Directory/file name: [✅/❌]
- YAML name field: [✅/❌]
- Consistency: [✅/❌]
### Description Quality (Skills only)
- Grade: [A/B/C/D/F]
- Score: [X/100]
- Breakdown:
- Action verbs: [X/10]
- Specificity: [X/10]
- Technologies: [X/10]
- Triggers: [X/10]
- Efficiency: [X/10]
### Content Quality
- Structure: [✅/❌/⚠️]
- Examples: [count] ([✅ if ≥3, ❌ if <3])
- Placeholders: [✅ none / ❌ found]
- Code blocks: [✅/⚠️]
### Activation Potential (Skills only)
- Direct keywords: [✅/⚠️/❌]
- Contextual triggers: [✅/⚠️/❌]
- File extensions: [✅/⚠️/❌/N/A]
- Predicted success rate: [X%]
## Issues Found
[List of all issues with severity]
## Recommendations
[Specific, actionable improvements]
## Quality Framework Score
Using Q = 0.40·R + 0.30·C + 0.20·S + 0.10·E:
- Relevance (R): [score] - [explanation]
- Completeness (C): [score] - [explanation]
- Consistency (S): [score] - [explanation]
- Efficiency (E): [score] - [explanation]
**Final Q score: [X.XX]**
**Grade: [A/B+/B/C/D/F]**
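A minimal sketch of how the earlier checks could be stitched into this report; it reuses assess_description and validate_name from the examples above, and the report fields are abbreviated rather than the full template:

def build_report(name, artifact_type, description):
    """Assemble a condensed validation report (sketch, not the full template)."""
    assessment = assess_description(description)
    lines = [
        "# Artifact Validation Report",
        f"**Artifact:** {name}",
        f"**Type:** {artifact_type}",
        f"## Overall Grade: {assessment['grade']}",
        f"- Description score: {assessment['score']:.0f}/100",
        f"- Name check: {validate_name(name, artifact_type)}",
    ]
    return "\n".join(lines)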
Scenario: Check pdf-processor Skill for quality
Validation steps:
# Step 1: Check YAML syntax
python3 -c "
import yaml

with open('.claude/skills/pdf-processor/SKILL.md') as f:
    content = f.read()

parts = content.split('---')
yaml.safe_load(parts[1])
"
# Output: (no error = valid)
# Step 2: Verify name consistency
DIRNAME=$(basename .claude/skills/pdf-processor)
YAMLNAME=$(grep "^name:" .claude/skills/pdf-processor/SKILL.md | cut -d: -f2 | xargs)
if [ "$DIRNAME" == "$YAMLNAME" ]; then
echo "✅ Names match: $DIRNAME"
else
echo "❌ Mismatch: dir=$DIRNAME, yaml=$YAMLNAME"
fi
Issues found:
⚠️ Description only 45 characters (too short)
❌ Missing "Use when..." triggers
⚠️ Only 2 action verbs (need 5+)
❌ Missing Examples section
Recommendations:
1. Expand description to 400-700 characters
2. Add "Use when working with PDF files, document extraction..." clause
3. Add action verbs: parse, merge, split, convert
4. Create Examples section with 3-5 complete examples
Quality grade: D (65/100)
Scenario: Validate artifact-advisor Skill
Results:
✅ YAML syntax: Valid
✅ Name consistency: artifact-advisor (matches)
✅ Description: 586 characters (optimal range)
✅ Required fields: All present
✅ Action verbs: 5 (analyze, recommend, advise, choose, justify)
✅ Technologies: 4 (Skills, Commands, Subagents, Hooks)
✅ Triggers: Present ("Use when user asks...")
✅ Examples: 7 examples found
✅ No placeholders
✅ Code blocks: All have language specified
Quality grade: A (94/100)
Quality framework:
Q = 0.40(0.95) + 0.30(0.92) + 0.20(0.90) + 0.10(0.88) = 0.924
Final grade: A
Scenario: Check deployment command
Command file: .claude/commands/deploy.md
---
description: Deploy application to specified environment with pre-deployment checks
argument-hint: <environment> [branch]
allowed-tools: Bash
disable-model-invocation: true
---
Deploy to $1 environment from ${2:-main} branch.
Run tests, build, and deploy with validation.
Validation:
✅ YAML syntax: Valid
✅ Description: Present and descriptive
✅ argument-hint: Properly formatted
✅ disable-model-invocation: Correctly set (prevents accidental deployment)
✅ Command body: Uses $1, $2 arguments correctly
✅ No placeholders
Quality grade: A (92/100)
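To reproduce these checks programmatically, the command's frontmatter can go through the same YAML and field validation as Skills; a short sketch, where the boolean check on disable-model-invocation is the command-specific part:

import yaml

with open('.claude/commands/deploy.md') as f:
    data = yaml.safe_load(f.read().split('---')[1])

assert isinstance(data['description'], str), "description must be a string"
if 'disable-model-invocation' in data:
    assert isinstance(data['disable-model-invocation'], bool), \
        "disable-model-invocation must be a boolean"
print("✅ Command frontmatter valid")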
Scenario: Validate all Skills in project
# Find all Skills
for skill_dir in .claude/skills/*/; do
  skill_name=$(basename "$skill_dir")
  skill_file="$skill_dir/SKILL.md"

  if [ -f "$skill_file" ]; then
    echo "Validating: $skill_name"
    # Run validation
    python3 validate.py "$skill_file"
    echo "---"
  fi
done
Output:
Validating: artifact-advisor
✅ Grade: A (94/100)
---
Validating: skill-builder
✅ Grade: A (92/100)
---
Validating: pdf-processor
⚠️ Grade: C (75/100)
Issues: Description too short, missing triggers
---
Summary:
Total Skills: 3
Grade A: 2 (67%)
Grade B: 0 (0%)
Grade C: 1 (33%)
Grade D or below: 0 (0%)
Recommendation: Fix pdf-processor description
Scenario: Validate Skill before git commit
Workflow:
# 1. Run full validation
python3 .claude/scripts/validate-artifact.py .claude/skills/new-skill/SKILL.md

# 2. Check for critical issues
if [ $? -ne 0 ]; then
  echo "❌ Validation failed - fix issues before committing"
  exit 1
fi

# 3. Check grade threshold
GRADE=$(python3 validate.py SKILL.md | grep "Grade:" | cut -d: -f2 | xargs | cut -c1)
if [ "$GRADE" != "A" ] && [ "$GRADE" != "B" ]; then
  echo "⚠️ Grade $GRADE - consider improving before deployment"
  echo "Proceed anyway? (y/n)"
  read response
  if [ "$response" != "y" ]; then
    exit 1
  fi
fi

# 4. All checks passed
echo "✅ Validation passed - ready to commit"
Run a fast check for critical issues only:
validate_quick() {
  local file=$1

  # Check YAML syntax
  python3 -c "import yaml; yaml.safe_load(open('$file').read().split('---')[1])" 2>&1

  # Check name field exists
  grep -q "^name:" "$file" || echo "❌ Missing 'name' field"

  # Check description exists
  grep -q "^description:" "$file" || echo "❌ Missing 'description' field"

  # Check description length
  DESC_LEN=$(grep "^description:" "$file" | cut -d: -f2- | wc -c)
  if [ "$DESC_LEN" -gt 1024 ]; then
    echo "❌ Description too long: $DESC_LEN chars"
  fi
}
Validate all artifacts in CI/CD pipeline:
# .github/workflows/validate-artifacts.yml
name: Validate Claude Code Artifacts

on: [push, pull_request]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Validate Skills
        run: |
          for skill in .claude/skills/*/SKILL.md; do
            python3 scripts/validate.py "$skill" || exit 1
          done
      - name: Validate Commands
        run: |
          for cmd in .claude/commands/*.md; do
            python3 scripts/validate.py "$cmd" || exit 1
          done
      - name: Check grades
        run: |
          python3 scripts/grade-report.py
Validate and offer to fix common issues:
def validate_with_fixes(filepath):
    """Validate and optionally fix issues."""
    issues = validate_artifact(filepath)

    for issue in issues:
        print(f"\n{issue['severity']} {issue['message']}")

        if issue['fixable']:
            response = input("Apply automatic fix? (y/n): ")
            if response.lower() == 'y':
                apply_fix(filepath, issue)
                print("✅ Fixed")
Symptoms: Validator reports syntax error but YAML looks correct
Cause: Hidden characters (tabs, smart quotes, BOM)
Solution:
# Check for tabs
grep -P '\t' SKILL.md
# Replace tabs with spaces
sed -i 's/\t/ /g' SKILL.md
# Check for smart quotes
grep -P '[“”‘’]' SKILL.md
# Replace with straight quotes manually
# Remove BOM if present
sed -i '1s/^\xEF\xBB\xBF//' SKILL.md
Symptoms: Validator says names don't match but they appear identical
Cause: Whitespace in YAML name field
Solution:
# Check exact value
grep "^name:" SKILL.md | cat -A
# Should see: name: skill-name$
# If you see: name: skill-name $ (space before $)
# Fix by trimming trailing whitespace on the name line:
sed -i '/^name:/s/[[:space:]]*$//' SKILL.md
Symptoms: Skill validated successfully but doesn't activate in practice
Cause: Validation checks structure, not real-world activation
Solution: test the Skill against realistic user requests and strengthen the description's trigger keywords ("Use when...", technology names, file extensions), as scored in the description quality assessment above.
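One way to sanity-check this before shipping is to run the activation test from earlier against the real description field; a sketch reusing test_activation_potential and the pdf-processor path from Example 1:

import yaml

with open('.claude/skills/pdf-processor/SKILL.md') as f:
    frontmatter = yaml.safe_load(f.read().split('---')[1])

for result in test_activation_potential(frontmatter['description']):
    print(result)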
Symptoms: Skill feels high quality but gets low grade
Cause: Grading is strict and focuses on specific metrics
Diagnosis:
Check breakdown:
- Verbs: Low score? Add more action verbs to description
- Technologies: Low score? Mention specific tools/file types
- Triggers: 0 points? Missing "Use when..." clause
- Efficiency: Low score? Description too short or too long
Fix: Address lowest-scoring dimension first
The validation system uses the Centauro quality framework:
Q = 0.40·Relevance + 0.30·Completeness + 0.20·Consistency + 0.10·Efficiency
Dimensions:

- Relevance (R) - 40% weight: the artifact does what its description promises and activates for the intended requests
- Completeness (C) - 30% weight: all required fields, sections, and examples are present
- Consistency (S) - 20% weight: naming, structure, and formatting agree internally
- Efficiency (E) - 10% weight: descriptions and content are concise and within character limits
Grading scale: the final Q score maps to a letter grade (A, B+, B, C, D, F), as in the report template above.
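As a sketch of how the weighted combination works in code, with the weights taken from the formula above and the dimension scores supplied as inputs:

def quality_score(r: float, c: float, s: float, e: float) -> float:
    """Combine the four dimension scores (each 0.0-1.0) into Q."""
    return 0.40 * r + 0.30 * c + 0.20 * s + 0.10 * e

# The artifact-advisor example from earlier:
q = quality_score(0.95, 0.92, 0.90, 0.88)
print(f"Q = {q:.3f}")  # Q = 0.924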
Use with:
External tools:
Validate early, validate often. Quality artifacts are the foundation of effective Claude Code workflows.