Use when creating new skills, editing existing skills, or verifying skills work before deployment - applies TDD to process documentation with Shannon quantitative validation by testing with subagents before writing, iterating until bulletproof against rationalization
```
/plugin marketplace add krzemienski/shannon-framework
/plugin install shannon@shannon-framework
```
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Writing skills IS Test-Driven Development applied to process documentation.
You write test cases (pressure scenarios with subagents), watch them fail (baseline behavior), write the skill (documentation), watch tests pass (agents comply), and refactor (close loopholes).
Core principle: If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.
Shannon enhancement: Quantitative validation of skill effectiveness.
A skill is a reference guide for proven techniques, patterns, or tools. Skills help future Claude instances find and apply effective approaches.
Skills are: Reusable techniques, patterns, tools, reference guides
Skills are NOT: Narratives about how you solved a problem once
| TDD Concept | Skill Creation |
|---|---|
| Test case | Pressure scenario with subagent |
| Production code | Skill document (SKILL.md) |
| Test fails (RED) | Agent violates rule without skill (baseline) |
| Test passes (GREEN) | Agent complies with skill present |
| Refactor | Close loopholes while maintaining compliance |
Create when:
- The technique is proven and reusable across tasks
- Future Claude instances would benefit from finding it

Don't create for:
- One-off solutions to a single problem
- Narratives about how you solved something once
Frontmatter (YAML):
```yaml
---
name: skill-name-with-hyphens
description: Use when [specific triggers and symptoms] - [what it does and how it helps, third person]
---
```
Constraints:
- Max 1024 characters total
- Name: letters, numbers, and hyphens only
- Description: third person, starts with "Use when..."
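These constraints can be checked mechanically before testing. A minimal sketch (the helper is illustrative, not a real API, and reading "total" as name plus description is an assumption):

```python
import re

def validate_frontmatter(name: str, description: str) -> list[str]:
    """Check skill frontmatter against the constraints above (illustrative helper)."""
    errors = []
    if not re.fullmatch(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", name):
        errors.append("name must use letters, numbers, and hyphens only")
    if not description.startswith("Use when"):
        errors.append('description must start with "Use when..."')
    if len(name) + len(description) > 1024:  # assumption: "total" means name + description
        errors.append("frontmatter exceeds 1024 characters total")
    return errors
```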
Body structure:
```markdown
# Skill Name

## Overview
What is this? Core principle in 1-2 sentences.

## When to Use
Bullet list with SYMPTOMS and use cases
When NOT to use

## The Iron Law (if discipline-enforcing skill)
NO [THING] WITHOUT [REQUIREMENT] FIRST

## [Core Process/Pattern]
Step-by-step or before/after code

## Shannon Enhancement (if applicable)
Quantitative tracking, Serena integration, validation gates

## Known Violations and Counters
Baseline testing captured rationalizations

## Red Flags - STOP
If you catch yourself thinking...

## Common Rationalizations
| Excuse | Reality |
|--------|---------|

## Integration with Other Skills
Required, complementary, Shannon-specific

## The Bottom Line
Summary with Shannon quantitative angle
```
Run pressure scenarios WITHOUT the skill:
```python
from datetime import datetime, timezone

# Create a pressure scenario (run_subagent_test and serena are illustrative APIs)
scenario = {
    "task": "Implement feature without TDD",
    "pressures": ["time_pressure", "sunk_cost", "exhaustion"],
    "expected_violation": "code_before_test",
}

# Run with a subagent, no skill loaded
result = run_subagent_test(scenario, skills=[])

# Document violations
violations = {
    "violated": True,
    "violation_type": "code_before_test",
    "rationalization": "I'll write tests after to verify it works",
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
serena.write_memory(f"skill_testing/{skill_name}/red_baseline", violations)
```
Capture: the violation type, the agent's exact rationalization (quoted verbatim), and the pressures that triggered it.
Shannon tracking: Quantify violation rate:
```python
baseline_metrics = {
    "scenarios_tested": 5,
    "violations": 5,
    "violation_rate": 1.00,  # 100% violated without skill
    "common_rationalizations": [
        "I'll test after",
        "Too simple to test",
        "Deleting X hours is wasteful",
    ],
}
```
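Violation rate is just violations over scenarios tested; a sketch of folding the per-scenario violation records shown above into these metrics (field names follow those dicts):

```python
def aggregate_baseline(results: list[dict]) -> dict:
    """Fold per-scenario violation records into baseline metrics (illustrative)."""
    violations = [r for r in results if r["violated"]]
    return {
        "scenarios_tested": len(results),
        "violations": len(violations),
        "violation_rate": len(violations) / len(results),
        "common_rationalizations": sorted({v["rationalization"] for v in violations}),
    }
```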
Write skill addressing baseline violations:
Run same scenarios WITH skill:
```python
# Run the same scenarios with the skill loaded
result = run_subagent_test(scenario, skills=[skill_name])

# Verify compliance
compliance = {
    "violated": False,
    "complied": True,
    "skill_effectiveness": 1.00,  # 100% compliance
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
serena.write_memory(f"skill_testing/{skill_name}/green_compliance", compliance)
```
Shannon validation: Quantify improvement:
```python
green_metrics = {
    "scenarios_tested": 5,
    "violations": 0,
    "violation_rate": 0.00,  # 0% violated with skill
    "improvement": 1.00,     # 100% improvement over baseline
    "skill_effective": True,
}
```
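Improvement is the drop in violation rate between the RED baseline and the GREEN run; a small sketch:

```python
def measure_improvement(baseline: dict, green: dict) -> dict:
    """Improvement is the drop in violation rate from RED to GREEN."""
    return {
        "improvement": baseline["violation_rate"] - green["violation_rate"],
        "skill_effective": green["violation_rate"] == 0.0,
    }
```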
Find new rationalizations: rerun the scenarios under heavier pressure and record any novel excuses.
Add explicit counters: for each new rationalization, add a row to the Common Rationalizations table and a matching red flag, then rerun.
Shannon tracking: Measure robustness:
```python
refactor_metrics = {
    "iterations": 3,
    "loopholes_found": 2,
    "loopholes_closed": 2,
    "final_violation_rate": 0.00,
    "pressure_scenarios_passed": 10,
    "robustness_score": 1.00,  # Bulletproof
}
serena.write_memory(f"skill_testing/{skill_name}/final", refactor_metrics)
```
Skills should include Serena tracking:
```python
# Example from the systematic-debugging skill
debugging_session = {
    "attempts": 2,
    "success": True,
    "duration_minutes": 30,
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
serena.write_memory(f"debugging/sessions/{session_id}", debugging_session)
```
Skills should reference Shannon's 3-tier validation:
```markdown
## Verification

**Shannon 3-Tier Validation**:
- Tier 1 (Flow): Code compiles
- Tier 2 (Artifacts): Tests pass
- Tier 3 (Functional): Real systems verified (NO MOCKS)
```
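A hedged sketch of a mechanical gate over those tiers (the tier keys here are made up for illustration):

```python
def three_tier_gate(results: dict) -> bool:
    """All three Shannon tiers must pass; tier 3 means real systems, not mocks."""
    tiers = ("flow_code_compiles", "artifacts_tests_pass", "functional_real_systems")
    return all(results.get(tier, False) for tier in tiers)
```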
Skills should declare the MCP servers they rely on:
```markdown
## Shannon Integration

**MCP Requirements**:
- Sequential: Deep analysis
- Serena: Pattern tracking
- Context7: Framework docs (if applicable)
```
NO SKILL WITHOUT A FAILING TEST FIRST
This applies to NEW skills AND EDITS to existing skills.
No exceptions: not for "simple" skills, not for small edits, not under time pressure.
RED Phase: baseline scenarios run without the skill; violations and rationalizations documented.
GREEN Phase: skill written against the baseline; same scenarios rerun with 0% violation rate.
REFACTOR Phase: new rationalizations found; explicit counters added; scenarios rerun until clean.
Shannon Validation: metrics written to Serena at each phase (red_baseline, green_compliance, final).
Quality Checks: frontmatter within limits, description starts with "Use when...", body follows the structure above.
This is a meta-skill for creating other skills.
Required understanding: TDD (RED-GREEN-REFACTOR) and subagent pressure testing.
Produces: a SKILL.md plus Serena-tracked metrics proving the skill works.
Shannon integration: quantitative validation gates and Serena memory at every phase.
Creating skills IS TDD for process documentation.
Shannon enhancement: Quantify everything.
Not "skill seems robust" - "skill: 0% violation rate across 10 pressure scenarios, robustness score 1.00".
Test with subagents. Measure effectiveness. Close loopholes systematically.
This skill should be used when the user asks to "create a slash command", "add a command", "write a custom command", "define command arguments", "use command frontmatter", "organize commands", "create command with file references", "interactive command", "use AskUserQuestion in command", or needs guidance on slash command structure, YAML frontmatter fields, dynamic arguments, bash execution in commands, user interaction patterns, or command development best practices for Claude Code.
This skill should be used when the user asks to "create an agent", "add an agent", "write a subagent", "agent frontmatter", "when to use description", "agent examples", "agent tools", "agent colors", "autonomous agent", or needs guidance on agent structure, system prompts, triggering conditions, or agent development best practices for Claude Code plugins.
This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-based hooks", "use ${CLAUDE_PLUGIN_ROOT}", "set up event-driven automation", "block dangerous commands", or mentions hook events (PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart, SessionEnd, UserPromptSubmit, PreCompact, Notification). Provides comprehensive guidance for creating and implementing Claude Code plugin hooks with focus on advanced prompt-based hooks API.