Use when creating new skills, editing existing skills, or verifying skills work before deployment - applies TDD to process documentation with Shannon quantitative validation by testing with subagents before writing, iterating until bulletproof against rationalization
```
/plugin marketplace add krzemienski/shannon-framework
/plugin install shannon@shannon-framework
```
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Writing skills IS Test-Driven Development applied to process documentation.
You write test cases (pressure scenarios with subagents), watch them fail (baseline behavior), write the skill (documentation), watch tests pass (agents comply), and refactor (close loopholes).
Core principle: If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.
Shannon enhancement: Quantitative validation of skill effectiveness.
A skill is a reference guide for proven techniques, patterns, or tools. Skills help future Claude instances find and apply effective approaches.
Skills are: Reusable techniques, patterns, tools, reference guides
Skills are NOT: Narratives about how you solved a problem once
| TDD Concept | Skill Creation |
|---|---|
| Test case | Pressure scenario with subagent |
| Production code | Skill document (SKILL.md) |
| Test fails (RED) | Agent violates rule without skill (baseline) |
| Test passes (GREEN) | Agent complies with skill present |
| Refactor | Close loopholes while maintaining compliance |
Create when:
- The technique is proven and reusable across tasks
- Future Claude instances would benefit from finding it

Don't create for:
- One-off solutions to a single problem
- Narratives about how you solved something once
Frontmatter (YAML):
```yaml
---
name: skill-name-with-hyphens
description: Use when [specific triggers and symptoms] - [what it does and how it helps, third person]
---
```
Constraints:
- Max 1024 characters total
- Name: letters, numbers, and hyphens only
- Description: third person, starts with "Use when..."
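These constraints can be checked mechanically before testing. A minimal sketch (the helper is illustrative, not a real API, and reading "total" as name plus description is an assumption):

```python
import re

def validate_frontmatter(name: str, description: str) -> list[str]:
    """Check skill frontmatter against the constraints above (illustrative helper)."""
    errors = []
    if not re.fullmatch(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", name):
        errors.append("name must use letters, numbers, and hyphens only")
    if not description.startswith("Use when"):
        errors.append('description must start with "Use when..."')
    if len(name) + len(description) > 1024:  # assumption: "total" means name + description
        errors.append("frontmatter exceeds 1024 characters total")
    return errors
```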
Body structure:
```markdown
# Skill Name

## Overview
What is this? Core principle in 1-2 sentences.

## When to Use
Bullet list with SYMPTOMS and use cases
When NOT to use

## The Iron Law (if discipline-enforcing skill)
NO [THING] WITHOUT [REQUIREMENT] FIRST

## [Core Process/Pattern]
Step-by-step or before/after code

## Shannon Enhancement (if applicable)
Quantitative tracking, Serena integration, validation gates

## Known Violations and Counters
Baseline testing captured rationalizations

## Red Flags - STOP
If you catch yourself thinking...

## Common Rationalizations
| Excuse | Reality |
|--------|---------|

## Integration with Other Skills
Required, complementary, Shannon-specific

## The Bottom Line
Summary with Shannon quantitative angle
```
Run pressure scenarios WITHOUT the skill:
```python
from datetime import datetime, timezone

# Create a pressure scenario (run_subagent_test and serena are illustrative APIs)
scenario = {
    "task": "Implement feature without TDD",
    "pressures": ["time_pressure", "sunk_cost", "exhaustion"],
    "expected_violation": "code_before_test",
}

# Run with a subagent, no skill loaded
result = run_subagent_test(scenario, skills=[])

# Document violations
violations = {
    "violated": True,
    "violation_type": "code_before_test",
    "rationalization": "I'll write tests after to verify it works",
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
serena.write_memory(f"skill_testing/{skill_name}/red_baseline", violations)
```
Capture: the violation type, the agent's exact rationalization (quoted verbatim), and the pressures that triggered it.
Shannon tracking: Quantify violation rate:
```python
baseline_metrics = {
    "scenarios_tested": 5,
    "violations": 5,
    "violation_rate": 1.00,  # 100% violated without skill
    "common_rationalizations": [
        "I'll test after",
        "Too simple to test",
        "Deleting X hours is wasteful",
    ],
}
```
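Violation rate is just violations over scenarios tested; a sketch of folding the per-scenario violation records shown above into these metrics (field names follow those dicts):

```python
def aggregate_baseline(results: list[dict]) -> dict:
    """Fold per-scenario violation records into baseline metrics (illustrative)."""
    violations = [r for r in results if r["violated"]]
    return {
        "scenarios_tested": len(results),
        "violations": len(violations),
        "violation_rate": len(violations) / len(results),
        "common_rationalizations": sorted({v["rationalization"] for v in violations}),
    }
```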
Write skill addressing baseline violations:
Run same scenarios WITH skill:
```python
# Run the same scenarios with the skill loaded
result = run_subagent_test(scenario, skills=[skill_name])

# Verify compliance
compliance = {
    "violated": False,
    "complied": True,
    "skill_effectiveness": 1.00,  # 100% compliance
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
serena.write_memory(f"skill_testing/{skill_name}/green_compliance", compliance)
```
Shannon validation: Quantify improvement:
```python
green_metrics = {
    "scenarios_tested": 5,
    "violations": 0,
    "violation_rate": 0.00,  # 0% violated with skill
    "improvement": 1.00,     # 100% improvement over baseline
    "skill_effective": True,
}
```
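Improvement is the drop in violation rate between the RED baseline and the GREEN run; a small sketch:

```python
def measure_improvement(baseline: dict, green: dict) -> dict:
    """Improvement is the drop in violation rate from RED to GREEN."""
    return {
        "improvement": baseline["violation_rate"] - green["violation_rate"],
        "skill_effective": green["violation_rate"] == 0.0,
    }
```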
Find new rationalizations: rerun the scenarios under heavier pressure and record any novel excuses.
Add explicit counters: for each new rationalization, add a row to the Common Rationalizations table and a matching red flag, then rerun.
Shannon tracking: Measure robustness:
```python
refactor_metrics = {
    "iterations": 3,
    "loopholes_found": 2,
    "loopholes_closed": 2,
    "final_violation_rate": 0.00,
    "pressure_scenarios_passed": 10,
    "robustness_score": 1.00,  # Bulletproof
}
serena.write_memory(f"skill_testing/{skill_name}/final", refactor_metrics)
```
Skills should include Serena tracking:
```python
# Example from the systematic-debugging skill
debugging_session = {
    "attempts": 2,
    "success": True,
    "duration_minutes": 30,
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
serena.write_memory(f"debugging/sessions/{session_id}", debugging_session)
```
Skills should reference Shannon's 3-tier validation:
```markdown
## Verification

**Shannon 3-Tier Validation**:
- Tier 1 (Flow): Code compiles
- Tier 2 (Artifacts): Tests pass
- Tier 3 (Functional): Real systems verified (NO MOCKS)
```
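A hedged sketch of a mechanical gate over those tiers (the tier keys here are made up for illustration):

```python
def three_tier_gate(results: dict) -> bool:
    """All three Shannon tiers must pass; tier 3 means real systems, not mocks."""
    tiers = ("flow_code_compiles", "artifacts_tests_pass", "functional_real_systems")
    return all(results.get(tier, False) for tier in tiers)
```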
Skills should declare the MCP servers they rely on:
```markdown
## Shannon Integration

**MCP Requirements**:
- Sequential: Deep analysis
- Serena: Pattern tracking
- Context7: Framework docs (if applicable)
```
NO SKILL WITHOUT A FAILING TEST FIRST
This applies to NEW skills AND EDITS to existing skills.
No exceptions: not for "simple" skills, not for small edits, not under time pressure.
RED Phase: baseline scenarios run without the skill; violations and rationalizations documented.
GREEN Phase: skill written against the baseline; same scenarios rerun with 0% violation rate.
REFACTOR Phase: new rationalizations found; explicit counters added; scenarios rerun until clean.
Shannon Validation: metrics written to Serena at each phase (red_baseline, green_compliance, final).
Quality Checks: frontmatter within limits, description starts with "Use when...", body follows the structure above.
This is a meta-skill for creating other skills.
Required understanding: TDD (RED-GREEN-REFACTOR) and subagent pressure testing.
Produces: a SKILL.md plus Serena-tracked metrics proving the skill works.
Shannon integration: quantitative validation gates and Serena memory at every phase.
Creating skills IS TDD for process documentation.
Shannon enhancement: Quantify everything.
Not "skill seems robust" - "skill: 0% violation rate across 10 pressure scenarios, robustness score 1.00".
Test with subagents. Measure effectiveness. Close loopholes systematically.
This skill should be used when the user asks to "create a slash command", "add a command", "write a custom command", "define command arguments", "use command frontmatter", "organize commands", "create command with file references", "interactive command", "use AskUserQuestion in command", or needs guidance on slash command structure, YAML frontmatter fields, dynamic arguments, bash execution in commands, user interaction patterns, or command development best practices for Claude Code.
This skill should be used when the user asks to "create an agent", "add an agent", "write a subagent", "agent frontmatter", "when to use description", "agent examples", "agent tools", "agent colors", "autonomous agent", or needs guidance on agent structure, system prompts, triggering conditions, or agent development best practices for Claude Code plugins.
This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-based hooks", "use ${CLAUDE_PLUGIN_ROOT}", "set up event-driven automation", "block dangerous commands", or mentions hook events (PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart, SessionEnd, UserPromptSubmit, PreCompact, Notification). Provides comprehensive guidance for creating and implementing Claude Code plugin hooks with focus on advanced prompt-based hooks API.