<objective>
This skill WRITES tests. It does not just design or plan.
</objective>
<mode_detection>
Determine which mode you're in:

- WRITE mode - Tests don't exist yet or you're starting fresh; ls {node_path}/tests/*.py returns nothing or minimal files.
- FIX mode - Tests exist but were rejected by the reviewer; /auditing-python-tests output shows REJECT with specific issues.

Always check which mode applies before proceeding.
</mode_detection>
<quick_start>
Input: Node spec path (e.g., spx/21-infra.enabler/43-parser.outcome/)
Output: Test files written to {node}/tests/ directory
Workflow:
Check mode → WRITE or FIX → Execute → Verify → Report
</quick_start>
<write_mode_workflow>
Read the node spec and related files:
# Read node spec
cat {node_path}/{slug}.outcome.md
# Read parent node for context (if nested)
cat {parent_path}/{slug}.enabler.md
# Check for ADRs/PDRs that constrain testing approach
ls {node_path}/../*.adr.md {node_path}/../*.pdr.md 2>/dev/null
Extract the outcomes and assertions from the spec.
Note on Analysis sections: The Analysis section documents what the spec author examined. It provides context but is not binding — implementation may diverge as understanding deepens. Use it as a starting point, not a contract.
For each assertion, apply the /testing methodology:
| Evidence Type | Minimum Level |
|---|---|
| Pure computation/algorithm | 1 |
| File I/O with temp dirs | 1 |
| Standard dev tools (git, curl) | 1 |
| Project-specific binary | 2 |
| Database, Docker | 2 |
| Real credentials, external APIs | 3 |
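To ground the table, a Level 1 test exercises pure computation with no environment at all. A minimal sketch, assuming a hypothetical `parse_version` function (the name and stand-in body are illustrative, not a project API):

```python
# Hypothetical Level 1 (pure computation) test.
# `parse_version` is an illustrative stand-in, included so the sketch runs on its own.
from __future__ import annotations


def parse_version(raw: str) -> tuple[int, int, int]:
    # Stand-in implementation: split a dotted version string into integers.
    major, minor, patch = raw.split(".")
    return (int(major), int(minor), int(patch))


EXPECTED_VERSION = (1, 2, 3)  # named constant, not a magic value


def test_parse_version_splits_dotted_string() -> None:
    assert parse_version("1.2.3") == EXPECTED_VERSION
```

No temp dirs, no binaries, no network: exactly the profile the first table row assigns to Level 1.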
Create test files following /standardizing-python-testing.
Mandatory elements:
- -> None return type on every test function
- Property-based tests where applicable (@given)
- Correct filename suffix (.unit.py, .integration.py, .e2e.py)
Then run the suite:
# Run tests (RED phase)
uv run --extra dev pytest {node_path}/tests/ -v
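Where the node includes a parser or serializer, the @given requirement is typically satisfied by a roundtrip property. A sketch with hypothetical `encode`/`decode` stand-ins (not project APIs):

```python
# Hypothetical property-based roundtrip test for a serializer/parser pair.
# `encode` and `decode` are illustrative stand-ins, not project APIs.
from hypothesis import given, strategies as st


def encode(value: str) -> bytes:
    return value.encode("utf-8")


def decode(raw: bytes) -> str:
    return raw.decode("utf-8")


@given(st.text())
def test_roundtrip_preserves_value(value: str) -> None:
    # Property: decoding an encoded value yields the original for any text input.
    assert decode(encode(value)) == value
```

A single roundtrip property covers the input space far more thoroughly than a handful of hand-picked examples.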
Tests should FAIL with ImportError or AssertionError (implementation doesn't exist yet).
If the implementation module doesn't exist yet, tests fail on import — breaking the quality gate. Add the node to spx/EXCLUDE and run the project's sync command:
# Add node path to spx/EXCLUDE (paths relative to spx/)
echo "76-risc-v.outcome" >> spx/EXCLUDE
# Sync to pyproject.toml
just sync-exclude
This excludes the node's tests from pytest, mypy, and pyright until the implementation exists. Ruff still checks style. See the spec-tree /understanding skill's references/excluded-nodes.md for the full convention.
Remove the entry from spx/EXCLUDE when implementation begins.
</write_mode_workflow>
<fix_mode_workflow>
Find the most recent /auditing-python-tests output and note each rejection reason it lists.
For each rejection reason, apply the matching fix:
| Rejection Category | Fix Action |
|---|---|
| Missing -> None | Add return type to test functions |
| Evidentiary gap | Rewrite test to actually verify the assertion |
| Mocking detected | Replace with dependency injection |
| Missing property tests | Add @given tests for parsers/serializers |
| Silent skip | Change skipif to pytest.fail() for required deps |
| Magic values | Extract to named constants |
| Wrong filename suffix | Use .unit.py, .integration.py, or .e2e.py |
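For instance, the "Mocking detected" fix replaces a patched collaborator with an injected test double. A minimal sketch, assuming hypothetical `fetch_config`/`HttpClient` names (not project APIs):

```python
# Hypothetical sketch: dependency injection instead of mocking.
# `fetch_config`, `HttpClient`, and `FakeClient` are illustrative names.
from __future__ import annotations

from typing import Protocol


class HttpClient(Protocol):
    def get(self, url: str) -> str: ...


def fetch_config(client: HttpClient, url: str) -> dict[str, str]:
    # Parse "key=value" lines from the fetched body.
    body = client.get(url)
    return dict(line.split("=", 1) for line in body.splitlines() if line)


class FakeClient:
    """In-memory test double injected in place of patching the real client."""

    def get(self, url: str) -> str:
        return "host=localhost\nport=5432"


def test_fetch_config_parses_pairs() -> None:
    config = fetch_config(FakeClient(), "https://example.invalid/config")
    assert config == {"host": "localhost", "port": "5432"}
```

Because the collaborator arrives as a parameter, the test needs no patching machinery and exercises the real parsing logic end to end.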
# Run tests again
uv run --extra dev pytest {node_path}/tests/ -v
# Check types
uv run --extra dev mypy {node_path}/tests/
# Check linting
uv run --extra dev ruff check {node_path}/tests/
## Tests Fixed
### Issues Addressed
| Issue | Location | Fix Applied |
| --------------- | -------------- | ------------------------------------ |
| Missing -> None | test_foo.py:15 | Added return type |
| Magic value | test_foo.py:23 | Extracted to EXPECTED_VALUE constant |
### Verification
Tests run and fail for expected reasons (RED phase complete).
</fix_mode_workflow>
<test_writing_checklist>
Before declaring tests complete, verify:
- Evidence levels meet the table minimums (/testing Stage 2)
- Filenames use the correct suffix (.unit.py, .integration.py, .e2e.py)
- Every test function declares a -> None return type
- Property-based tests cover parsers/serializers (@given)
</test_writing_checklist>
<patterns_reference>
See /standardizing-python-testing for the shared patterns and conventions referenced above.
</patterns_reference>
<output_format>
WRITE mode output:
## Tests Written
### Node: {node_path}
### Test Files Created
| File | Level | Outcomes Covered |
| ------------------------ | ----- | ---------------- |
| `tests/test_foo.unit.py` | 1 | Outcome 1, 2 |
### Test Run (RED Phase)
Tests fail as expected. Ready for review.
FIX mode output:
## Tests Fixed
### Issues Addressed
| Issue | Location | Fix Applied |
| ------- | ----------- | ----------- |
| {issue} | {file:line} | {fix} |
### Verification
Tests pass checklist. Ready for re-review.
</output_format>
<success_criteria>
Task is complete when:
- Test files are written to the {node}/tests/ directory
- Tests follow /standardizing-python-testing standards
</success_criteria>