# test-writer
Write tests that verify behavior (not implementation), use table-driven/parameterized patterns, and minimize mocking. Triggers when asked to write tests, add test coverage, or create test files. Also triggers when reviewing existing tests for quality.
```
npx claudepluginhub smykla-skalski/sai --plugin test-writer
```

This skill is limited to using the following tools:
Write tests that survive refactoring, catch real bugs, and don't waste maintenance effort.
Philosophy: Test what the code does, not how it does it. If you refactor internals and tests break — the tests are wrong, not the code.
Parse from $ARGUMENTS:
- `--review` — review existing tests for anti-patterns instead of writing new ones
- `--lang` — override language detection (go, python, ts, java, rust)

If no arguments: ask what to test.
Detect the project language automatically unless overridden by the `--lang` flag.

Behavior identification checklist:
Read the knowledge base before writing:
```
cat "$(dirname "$0")/references/testing-principles.md" 2>/dev/null || cat references/testing-principles.md
```
Use table-driven when:
Use individual tests when:
Mock only external boundaries:
- Database
- Network/HTTP clients
- Clock (`time.Now()`)
- Filesystem

Do not mock:
Preference hierarchy (try in order):
If you need >2 mocks, stop and reconsider — the code may need restructuring, not more mocks.
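As a sketch of preferring fakes over mocks, the Python below uses a hypothetical `InMemoryAccountRepo` fake in place of a mocked repository; the service name and method names are illustrative, not from this skill's references.

```python
# A hypothetical repository boundary. Rather than mocking it, we hand the
# code under test an in-memory fake that honors the same contract.
class InMemoryAccountRepo:
    def __init__(self, balances):
        self.balances = dict(balances)

    def balance(self, account_id):
        return self.balances[account_id]

    def set_balance(self, account_id, amount):
        self.balances[account_id] = amount


def transfer(repo, src, dst, amount):
    """Hypothetical code under test: move funds, rejecting overdrafts."""
    if repo.balance(src) < amount:
        raise ValueError("insufficient balance")
    repo.set_balance(src, repo.balance(src) - amount)
    repo.set_balance(dst, repo.balance(dst) + amount)


def test_transfer_moves_funds_between_accounts():
    repo = InMemoryAccountRepo({"a": 100, "b": 0})

    transfer(repo, "a", "b", 30)

    # Assert outcomes on state, not which repo methods were called.
    assert repo.balance("a") == 70
    assert repo.balance("b") == 30
```

The fake keeps the test readable and refactor-proof: swapping `transfer` to batch its repository calls would not break it.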
Every test follows AAA with blank line separation:
<example>
```
// Arrange — set up test data and preconditions

// Act — execute the single behavior under test

// Assert — verify the expected outcome
```
</example>
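A minimal concrete AAA test in Python, using a hypothetical `apply_discount` function invented for illustration:

```python
def apply_discount(price, percent):
    """Hypothetical function under test: reduce price by a percentage."""
    return round(price * (1 - percent / 100), 2)


def test_applies_ten_percent_discount():
    # Arrange — set up test data and preconditions
    price = 200.0

    # Act — execute the single behavior under test
    discounted = apply_discount(price, 10)

    # Assert — verify the expected outcome
    assert discounted == 180.0
```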
### Naming Convention
Test names describe behavior, not method names:
<example>
- `TestTransferFunds_RejectsInsufficientBalance` (Go)
- `test_rejects_withdrawal_when_balance_insufficient` (Python)
- `it("rejects withdrawal when balance is insufficient")` (JS/TS)
</example>
**Format:** `[action]_[scenario]_[expected outcome]` or `should_[behavior]_when_[condition]`
### Table-Driven Patterns by Language
Read [references/language-patterns.md](references/language-patterns.md) for idiomatic table-driven test patterns in Go, Python, TypeScript, Java, and Rust.
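That reference file is not reproduced here, but the shape of the pattern can be sketched in Python with plain asserts (no pytest required); `classify_age` is a hypothetical function invented for the example:

```python
def classify_age(age):
    """Hypothetical function under test: bucket an age into a label."""
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 13:
        return "child"
    if age < 18:
        return "teen"
    return "adult"


# Every case has a descriptive name; expected values are concrete literals.
CASES = [
    ("newborn_is_child", 0, "child"),
    ("twelve_is_child", 12, "child"),
    ("thirteen_is_teen", 13, "teen"),
    ("eighteen_is_adult", 18, "adult"),
]


def test_classify_age_table():
    for name, age, want in CASES:
        got = classify_age(age)
        assert got == want, f"{name}: classify_age({age}) = {got!r}, want {want!r}"
```

Note the loop contains no conditional logic: each row is data only, and the failure message names the offending case.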
### Assertions
- **Assert outcomes** — return values, state changes, observable side-effects
- **Never assert interactions** — don't verify internal method call order
- **Use concrete literals** — `want: "Hello, Alice"` not `want: fmt.Sprintf("Hello, %s", name)`
- **Multiple assertions OK** if they verify facets of the same behavior
- **No logic in assertions** — no string concatenation, no computation, no conditionals
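The "concrete literals" rule can be illustrated with a small hypothetical `greet` function (not from the skill's references):

```python
def greet(name):
    return f"Hello, {name}"


def test_greets_user_by_name():
    got = greet("Alice")

    # Concrete literal: if greet() changes its format, this fails loudly.
    assert got == "Hello, Alice"

    # Anti-pattern (do NOT do this): the expectation duplicates the
    # production logic, so it passes even when the format is wrong.
    # assert got == f"Hello, {'Alice'}"
```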
### Edge Cases Checklist
Always consider:
- `null`/`nil`/`undefined` inputs
- Empty string, empty slice/array, empty map
- Boundary values (0, -1, max int, min int)
- Unicode, emoji, special characters in strings
- Duplicate entries where uniqueness expected
- Concurrent access if applicable
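Several of the items above fold naturally into one table. A sketch, using a hypothetical `normalize_tags` helper invented for this example:

```python
def normalize_tags(tags):
    """Hypothetical function: lowercase, strip, dedupe while keeping order."""
    seen = []
    for tag in tags or []:
        cleaned = tag.strip().lower()
        if cleaned and cleaned not in seen:
            seen.append(cleaned)
    return seen


# Edge cases as table rows: None input, empties, duplicates, unicode.
EDGE_CASES = [
    ("none_input_yields_empty", None, []),
    ("empty_list_yields_empty", [], []),
    ("blank_strings_dropped", ["  ", ""], []),
    ("duplicates_collapsed", ["Go", "go", "GO"], ["go"]),
    ("unicode_preserved", ["café"], ["café"]),
]


def test_normalize_tags_edge_cases():
    for name, tags, want in EDGE_CASES:
        got = normalize_tags(tags)
        assert got == want, f"{name}: got {got!r}, want {want!r}"
```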
## Phase 4: Quality Check
Before finishing, verify each test against this checklist:
### Behavior Tests (must pass ALL)
- [ ] Test name describes a behavior/requirement, not a method name
- [ ] Assertions check outcomes (state, return values), not interactions
- [ ] Test would survive internal refactoring without changes
- [ ] No `verify()` on internal method calls
- [ ] Can explain what this tests without reading production code
### Table Quality (if table-driven)
- [ ] Every case has a descriptive name (not "case 1")
- [ ] No conditional logic in the test loop
- [ ] Expected values are concrete literals, not computed
- [ ] One table = one behavior (not mixing validation + formatting + error handling)
- [ ] Table struct has <=8 fields (otherwise restructure)
### Mock Discipline
- [ ] Only external boundaries are mocked (DB, HTTP, clock, filesystem)
- [ ] No internal collaborators mocked
- [ ] No data structures/value objects mocked
- [ ] <=2 mocks per test (if more: reconsider design)
- [ ] Using real objects or fakes where possible
### General Quality
- [ ] AAA structure with blank line separation
- [ ] No logic in test code (no if/for/switch)
- [ ] Each test is independent — runs in any order
- [ ] No flakiness sources (time, randomness, network)
- [ ] Error paths tested, not just happy path
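One common flakiness source, wall-clock time, can be removed by injecting the clock. A sketch with a hypothetical `is_expired` helper (the parameter design is an assumption, not from the references):

```python
import datetime


def is_expired(created_at, ttl_seconds, now=None):
    """Hypothetical helper: the clock is injectable so tests stay deterministic."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    return (now - created_at).total_seconds() > ttl_seconds


def test_token_expires_after_ttl():
    created = datetime.datetime(2024, 1, 1, 12, 0, 0, tzinfo=datetime.timezone.utc)
    fixed_now = created + datetime.timedelta(seconds=61)

    # Deterministic: the test supplies "now" instead of reading the real clock.
    assert is_expired(created, ttl_seconds=60, now=fixed_now)
    assert not is_expired(created, ttl_seconds=60, now=created)
```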
## Phase 5: Review Mode (--review)
When `--review` flag is set, analyze existing tests for anti-patterns:
### Anti-Pattern Detection
Use Grep to scan for these smells and report with file:line references:
1. **Change detectors** — tests that mirror implementation structure, verify internal call order
2. **Mock explosion** — tests with 3+ mocks, especially mocking internal collaborators
3. **Missing table opportunities** — 3+ tests with identical structure differing only in data
4. **Obscure tests** — hard to understand what's being tested (magic numbers, unclear names)
5. **Conditional test logic** — if/switch inside test methods
6. **General fixture** — shared setup with fields most tests don't use
7. **Fragile tests** — coupled to implementation (private field access, internal API calls)
8. **Missing edge cases** — no error path testing, no boundary values
9. **Computed expected values** — expected values derived from same logic as production code
10. **Interaction verification** — `verify()`/`assert_called_with()` on non-boundary dependencies
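Anti-patterns 9 and 10 can be contrasted in one Python sketch; `total_price` and `FlatTax` are hypothetical names invented for illustration:

```python
from unittest.mock import Mock


def total_price(items, tax_calculator):
    """Hypothetical code under test: subtotal plus calculated tax."""
    subtotal = sum(items)
    return subtotal + tax_calculator.tax_for(subtotal)


def test_interaction_style_is_fragile():
    # Anti-pattern: verifying the call, not the result. This would keep
    # passing even if total_price() returned the wrong number.
    calc = Mock()
    calc.tax_for.return_value = 5
    total_price([10, 20], calc)
    calc.tax_for.assert_called_once_with(30)


def test_outcome_style_catches_bugs():
    # Better: a tiny fake plus an assertion on the observable outcome.
    class FlatTax:
        def tax_for(self, subtotal):
            return 5

    assert total_price([10, 20], FlatTax()) == 35
```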
### Review Output Format
<example>
```markdown
## Test Review: [file]
### Critical (must fix)
- **[anti-pattern]** at line N: [explanation + fix suggestion]
### Improvement (should fix)
- **[anti-pattern]** at line N: [explanation + fix suggestion]
### Opportunities
- Lines N-M: could consolidate into table-driven test
- Missing coverage: [behavior not tested]
```
</example>
If a test calls `verify(mock.someMethod())` on an internal dependency, stop: that is interaction verification, and the test should be flagged as a change detector.