This skill should be used when the user asks to "create spec tests", "generate markdown tests", "specification-as-tests", "TDD with markdown", "agent-driven testing", "port code to another language", or mentions test suites where markdown IS the test format. NOT for pytest/jest/unittest. Creates .md test files with inline assertions and uses a bundled custom test runner. Includes cookbook for porting code between languages.
Install: `/plugin marketplace add ianphil/my-skills`, then `/plugin install all-skills@ian-skills`.

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Bundled files:

- `references/assertion-syntax.md`
- `references/creating-tests-for-porting-code.md`
- `scripts/RunTests.cs`
- `scripts/run_tests.js`
- `scripts/run_tests.ps1`
- `scripts/run_tests.py`
- `scripts/run_tests.rs`
- `scripts/run_tests.sh`
- `scripts/run_tests.ts`

Example prompt: "Create a test suite for [DOMAIN] that an agent can run autonomously."
Porting Code? If your goal is to port code from one language to another (e.g., PowerShell → bash), see the Porting Code Cookbook for a comprehensive guide on creating layered test suites that define the contract both implementations must satisfy.
CRITICAL: This is NOT pytest/jest/unittest/etc. This creates a CUSTOM test format.
Generate two artifacts:
**1. Test file** (`tests/<feature>.md`):

# Feature Name
Prose explaining what this tests and why.
## Test Group
### Specific Behavior
\`\`\`py
result = do_thing("input")
result.value # expect: 42
\`\`\`
### Error Case
\`\`\`py
do_thing(None) # error: [null-input]
\`\`\`
The markdown file IS the test. Code blocks contain executable code. Comments are assertions.
**2. Runner.** Copy the appropriate bundled runner for your language:
| Language | Runner | Copy Command |
|---|---|---|
| Python | run_tests.py | cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.py" tests/ |
| JavaScript | run_tests.js | cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.js" tests/ |
| TypeScript | run_tests.ts | cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.ts" tests/ |
| Bash/Shell | run_tests.sh | cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.sh" tests/ |
| PowerShell | run_tests.ps1 | cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.ps1" tests/ |
| Rust | run_tests.rs | cp "${CLAUDE_PLUGIN_ROOT}/scripts/run_tests.rs" tests/ |
| C# | RunTests.cs | cp "${CLAUDE_PLUGIN_ROOT}/scripts/RunTests.cs" tests/ |
All runners:
- Parse `# expect:` / `// expect:` and `# error:` / `// error:` assertions
- Support matchers such as `approx()` and `contains()`

Do NOT use pytest, jest, xunit, or any standard test framework.
Tests are the oracle for agent-driven development: the agent runs `python run_tests.py`, reads the failures, edits the code, and repeats until everything is green.

The test file is both the human-readable specification and the executable test suite.
This pattern enabled Simon Willison to port an entire HTML5 parser in 4.5 hours—the agent ran 9,200 tests autonomously.
CRITICAL: Tests MUST actually test the real script/module, not self-contained snippets.
Tests should fail (red) when:

- The script/module under test doesn't exist yet
- The real function is missing or behaves incorrectly

Tests should pass (green) only when:

- The actual implementation exists and produces the expected behavior
**Bad:**

### Normalizes to lowercase
\`\`\`bash
# This tests bash itself, NOT your script!
input="HELLO"
echo "$input" | tr '[:upper:]' '[:lower:]'
# expect: hello
\`\`\`
This passes regardless of whether your script exists or works.
**Good:**

### Normalizes to lowercase
\`\`\`bash
# Source the actual script being tested
source "../../tools/scripts/my-script.sh" --source-only
# Call the real function
result=$(normalize_input "HELLO")
echo "$result"
# expect: hello
\`\`\`
This fails until my-script.sh exists with a working normalize_input function.
**Bash**: use the `--source-only` flag pattern:
# In test file
source "$(dirname "${BASH_SOURCE[0]}")/../../path/to/script.sh" --source-only
# In script being tested (at bottom)
if [[ "${1:-}" != "--source-only" ]]; then
main "$@"
fi
**PowerShell**: dot-source the script:
. "$PSScriptRoot/../../path/to/script.ps1"
**Python**: import the module:
from mypackage.module import function_under_test
When porting from Language A to Language B, the same markdown suite defines the contract: run it against the original (e.g., the `.ps1` script), then against the port (the `.sh` script) until both pass.

Tests = executable docs that stay accurate (lies fail CI). Each test answers: "What can the user now do?"
Intent structure per component:
Example:
## Buildout Intent
**Purpose:** Provision shared infrastructure
**Why it matters:** Without this, no Service Bus/KeyVault/storage
**After Buildout, developer can:** Connect to Service Bus, store secrets, upload artifacts
### Service Bus Is Accessible
\`\`\`py
namespace.is_accessible # expect: True
\`\`\`
Bad: `config["buildout"].priority # expect: 1` (implementation detail)

Good: "After Buildout, developer can connect to Service Bus" (user outcome)
project/
├── src/
│ └── <module>.<ext> # Code under test
└── tests/
├── <feature>.md # Markdown test files
└── run_tests.<ext> # Language-specific runner (copied from skill)
The `# Test Configuration` block must be the FIRST thing in the file:
# Test Configuration
module = "mypackage.validator"
import = ["validate", "normalize", "ValidationError"]
isolation = "per-block"
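As a rough sketch of what this configuration implies (assumed runner behavior, not the bundled runner's actual code), `isolation = "per-block"` means each code block executes in a fresh namespace seeded only with the configured imports. A stub module stands in for `mypackage.validator` here:

```python
import types

# Hypothetical sketch of per-block isolation; the bundled runners may
# differ in detail. A stub module stands in for the configured module.
def make_stub_module() -> types.ModuleType:
    mod = types.ModuleType("mypackage.validator")
    mod.validate = lambda s: bool(s)  # toy stand-in for the real function
    return mod

def run_block(source: str, mod: types.ModuleType, names: list[str]) -> dict:
    """Execute one markdown code block in a fresh namespace."""
    namespace = {n: getattr(mod, n) for n in names}
    exec(source, namespace)  # fresh dict each call: no state leaks
    return namespace

mod = make_stub_module()
ns1 = run_block("x = validate('hello')", mod, ["validate"])
ns2 = run_block("y = 1", mod, ["validate"])
print(ns1["x"])    # True
print("x" in ns2)  # False: state from the first block did not leak
```

The fresh-namespace-per-call design is what keeps one test block from silently depending on another.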
List ALL error codes at the top, before any tests:
**Error codes:**
- `[empty-input]` — User provided empty string
- `[null-input]` — User provided null/None
- `[invalid-format]` — String doesn't match expected pattern
Headers create test groups. Prose explains intent. Code blocks are tests:
## Input Validation
Users paste data from spreadsheets which may contain invisible whitespace.
The validator normalizes input before checking length.
### Accepts Valid Input
\`\`\`py
validate("hello") # expect: True
\`\`\`
### Rejects Empty String
Empty means "user submitted nothing" — distinct from whitespace-only:
\`\`\`py
validate("") # error: [empty-input]
\`\`\`
Assertions are inline comments that define expected behavior:
validate("hello") # expect: True
validate("") # error: [empty-input]
pi_value() # expect: approx(3.14159, tol=0.0001)
For complete syntax details including matchers, CLI tests, and language-specific formats, see Assertion Syntax Reference.
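To make the mechanics concrete, here is a hedged sketch of how a runner might split out and check an inline `# expect:` assertion, including the `approx()` matcher (assumed logic; the bundled runners' parsing may differ):

```python
import math
import re

# Hypothetical sketch of inline-assertion checking. eval() is acceptable
# here because the spec file is trusted test input.
ASSERT_RE = re.compile(r"(.*?)\s*#\s*expect:\s*(.+)$")

def check_line(line: str, namespace: dict) -> bool:
    m = ASSERT_RE.match(line)
    if not m:
        return True  # no assertion on this line
    expr, expected = m.groups()
    actual = eval(expr, namespace)
    approx_m = re.match(r"approx\(([^,]+),\s*tol=([^)]+)\)", expected)
    if approx_m:
        target, tol = float(approx_m.group(1)), float(approx_m.group(2))
        return math.isclose(actual, target, abs_tol=tol)
    return repr(actual) == expected or str(actual) == expected

print(check_line("1 + 1 # expect: 2", {}))                         # True
print(check_line("3.14159 # expect: approx(3.14, tol=0.01)", {}))  # True
print(check_line("1 + 1 # expect: 3", {}))                         # False
```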
Copy runner: `cp "${CLAUDE_PLUGIN_ROOT}/scripts/<runner>" tests/`
| Language | Run Command | Code Blocks | Assertions | Requirements |
|---|---|---|---|---|
| Python | python tests/run_tests.py | py, python | # expect:, # error: | - |
| JavaScript | node tests/run_tests.js | js, javascript | // expect:, // error:, // throws: | - |
| TypeScript | npx tsx tests/run_tests.ts | ts, typescript | // expect:, // error:, // throws: | npm install -D tsx |
| Bash | bash tests/run_tests.sh | sh, bash | # exit:, # stdout:, # stderr: | - |
| PowerShell | pwsh tests/run_tests.ps1 | ps1, powershell, bash, sh | # exit:, # stdout:, # stderr:, # throws:, # expect: | Supports both PS & bash |
| Rust | rust-script tests/run_tests.rs | rs, rust | // expect:, // error:, // compiles, // compile_fails: | cargo install rust-script |
| C# | dotnet script tests/RunTests.cs | cs, csharp | // expect:, // error: | dotnet tool install -g dotnet-script |
PowerShell runner note: the `# expect:` assertion compares against stdout: `echo "hello" # expect: hello`
Every error must have a stable code the runner can match:
| Language | How to Expose Code |
|---|---|
| Python | Exception with .code property |
| Rust | Error enum variant |
| CLI | [code] in stderr + exit code |
| C# | Exception type name |
Example Python exception:
class ValidationError(Exception):
    def __init__(self, code: str, message: str):
        self.code = code
        super().__init__(message)

# Usage
raise ValidationError("empty-input", "Input cannot be empty")
The runner matches # error: [empty-input] against exception.code.
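A minimal sketch of that matching step (the `check_error` helper is hypothetical; the bundled runner's internals may differ):

```python
# Hypothetical sketch: how a runner might match "# error: [code]"
# against a raised exception's .code attribute.
class ValidationError(Exception):
    def __init__(self, code: str, message: str):
        self.code = code
        super().__init__(message)

def check_error(callable_, expected_code: str) -> bool:
    try:
        callable_()
    except Exception as exc:
        return getattr(exc, "code", None) == expected_code
    return False  # expected an error but none was raised

def validate(s):
    if s == "":
        raise ValidationError("empty-input", "Input cannot be empty")
    return True

print(check_error(lambda: validate(""), "empty-input"))    # True
print(check_error(lambda: validate("hi"), "empty-input"))  # False
```

Matching on a stable code string, rather than the message text, is what lets error messages improve without breaking the suite.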
# Test Configuration
module = "temperature"
import = ["celsius_to_fahrenheit", "TemperatureError"]
isolation = "per-block"
Converts between Celsius and Fahrenheit with validation.
Problem: Users need temperature conversion, but invalid inputs (below absolute zero) must fail clearly—not return nonsense values.
**Error codes:**

- `[below-absolute-zero]` — Temperature violates the laws of physics

Standard formula: F = (C × 9/5) + 32
celsius_to_fahrenheit(0) # expect: 32.0
celsius_to_fahrenheit(100) # expect: 212.0
-273.15°C is the physical limit. Below that is impossible:
celsius_to_fahrenheit(-300) # error: [below-absolute-zero]
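For reference, here is one possible implementation this suite would drive an agent toward. This is a sketch: the function and error-class names come from the Test Configuration above, and the `.code` attribute shape is an assumption consistent with the error-code section:

```python
class TemperatureError(Exception):
    def __init__(self, code: str, message: str):
        self.code = code
        super().__init__(message)

ABSOLUTE_ZERO_C = -273.15

def celsius_to_fahrenheit(celsius: float) -> float:
    # Reject physically impossible inputs with a stable error code
    if celsius < ABSOLUTE_ZERO_C:
        raise TemperatureError("below-absolute-zero",
                               f"{celsius} C is below absolute zero")
    return (celsius * 9 / 5) + 32

print(celsius_to_fahrenheit(0))    # 32.0
print(celsius_to_fahrenheit(100))  # 212.0
```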
When asked to create spec tests:

1. Copy the appropriate runner from `${CLAUDE_PLUGIN_ROOT}/scripts/`
2. Write `.md` files in `tests/`, NOT language-specific test files
3. Use the comment syntax that matches the language (`#` for Python/Bash, `//` for Rust/C#)

Fill in when generating:
For each major component, document:
## [Component] Intent
**Purpose:** [One sentence: what does this component do?]
**Why it matters:** [2-3 sentences: business/technical value, what breaks without it]
**After [Component] completes, a user can:**
- [Concrete outcome 1]
- [Concrete outcome 2]
- [Concrete outcome 3]
**Error codes:**
- `[error-code-1]` — [When this happens and what it means]
- `[error-code-2]` — [When this happens and what it means]
This ensures your tests serve as both executable specifications and onboarding documentation.