npx claudepluginhub oribarilan/97 --plugin 97

This skill uses the workspace's default tool permissions.
A test that compiles and goes green is not the same as a *good* test. **A good test pins the contract (not the implementation), states a concrete example a reader can check by eye, names the scenario in the language of the domain, and is safe for the test data to leak in front of a customer.** This skill enforces a short, ordered set of decisions to make every time you write a new test, design a fixture, or name a test method. It draws on eight contributors to 97 Things Every Programmer Should Know (CC-BY-3.0; see principles.md for citations and links).
This is a rigid skill. Run the checklist in order. If you can't satisfy a step, stop and tell your human partner what's blocking you.
Invoke when you're about to:

- write a new test, design a fixture, or name a test method
- review or expand existing test coverage
- run the test suite (npm test, pytest)

If you're not sure whether a change counts as "writing a test," invoke anyway — the checklist is short and skipping it produces tests that pass forever while protecting nothing.
- superpowers/test-driven-development decides whether and when to write a test (the process: red, green, refactor; test first; one failing test at a time). This skill decides what makes the test itself good (the quality: what to assert, how to name it, what data to use). When mocking gets harder than the production code, fall back to GOOS/ListenToTestPain — the design is wrong, not the test.
- before-you-refactor precedes this skill when the trigger is "I want to restructure existing test code" rather than "I'm writing a new test."
- domain-modeling governs the names of the types under test; this skill governs the names of the tests.

Run every step in order when writing a new test. Do not commit until step 7 is satisfied.
- Assert precisely and concretely. A loose postcondition like "same length, every element in range" is satisfied by [3,3,3,3,3,3] against an input of [3,1,4,1,5,9]. The full postcondition is sorted and a permutation of the input — but expressing that as a generic checker is often more code than the function under test. Prefer a concrete example pair: input [3,1,4,1,5,9], expected [1,1,3,4,5,9]. "Adding to an empty collection" is not "now non-empty" — it's "now contains exactly one item, and that item is X." (Henney, 97/81.)
- Name the scenario and the entry point. Stack_pop_on_empty_throws reads better than test17. Then test the test: introduce a deliberate bug into the code under test on a private branch and verify the failure message tells you what went wrong. (Meszaros, 97/95.)
- Keep the test data boring. Do not ship "don't click that again, you moron" as a placeholder dialog, do not use four-letter words as fake stock tickers. Use boring, obviously-fake, professional data. (Begbie, 97/25.)
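Before the red-flag table, a minimal sketch of what these steps produce, assuming pytest (sort_numbers and Stack are hypothetical stand-ins for the code under test): one concrete input/output pair, and names that read as failure reports.

```python
import pytest


def sort_numbers(values):
    """Hypothetical function under test: returns the values in ascending order."""
    return sorted(values)


class Stack:
    """Hypothetical stack under test."""

    def __init__(self):
        self._items = []

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()


def test_sort_returns_elements_in_ascending_order_keeping_duplicates():
    # Concrete example pair: the only correct answer is the one in the assertion.
    assert sort_numbers([3, 1, 4, 1, 5, 9]) == [1, 1, 3, 4, 5, 9]


def test_stack_pop_on_empty_throws():
    # Scenario and entry point live in the name, so the failure report reads like a sentence.
    with pytest.raises(IndexError):
        Stack().pop()
```

If the assertion were only "same length and every element in range", the wrong output [3, 3, 3, 3, 3, 3] would also pass, which is exactly the trap the table below calls out.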
These thoughts mean STOP — restart the checklist:

| Thought | Reality |
|---|---|
| "I'll just assert the function returns exactly -1." | The contract says "negative" — ±1 is an implementation incidental. The test will go red on a valid refactor and tell you nothing about the requirement. (97/80) |
| "Same length, all elements in range — that's enough for a sort test." | [3,3,3,3,3,3] satisfies that and is wrong. Use a concrete input/output pair so the only correct answer is the one in the assertion. (97/81) |
"I'll seed the database with band members and song titles — funnier than User1." | Cute test data ends up in customer screenshots, demos, and leaked source. Use boring, obviously-fake, professional data. (97/25) |
| "QA is too picky — those bugs aren't real." | The picky tester is the reason users do not see those bugs. The accumulation of small defects is what makes users distrust the product. (97/60) |
| "I'll throw it over the wall — testers will find anything I missed." | Adversarial handoff burns time in the defect tracker. Share testing ideas before coding; accept acceptance tests as input. (97/92) |
| "No time to write the test — we'll add it later." | "Later" rarely arrives, and shipping without verification is professionally irresponsible. The test is part of the change. (97/83) |
| "I know what this test means — I just wrote it." | You will not, in six months. Tests are read more than they are written. Name the scenario, structure as context/act/assert, hide scaffolding. (97/95) |
| "The soak test takes 8 hours — we'll skip it." | Schedule it overnight. The build server is idle from 6pm to 8am and all weekend; that is free coverage you are throwing away. (97/82) |
"test17 is a fine name — the body explains it." | Test names are scanned to verify coverage and to read failure reports. Encode the scenario and the entry point in the name. (97/95) |
| "The test setup is fifty lines, then a one-line assert — but it works." | Test pain is design pressure. If the setup dwarfs the body, reshape the production code. Do not mock harder. (GOOS/ListenToTestPain) |
"I'll assert that repo.save was called exactly three times." | Over-specifying mock interactions makes the test red on innocent refactors. Assert on the observable contract, not the call shape. (xUnit/FragileTest) |
"The test reads from /tmp/fixtures/users.json set up in conftest.py." | Mystery Guest. The test cannot be read in isolation. Build the fixture in the test or in a function named for what it returns. (xUnit/MysteryGuest) |
| "I'll branch on the return value and assert different things in each branch." | One test, two scenarios fighting for one name. Split into two tests, or use named parameterized cases. (xUnit/ConditionalTestLogic) |
A single well-written test is done when all of the following are true:

- [ ] The assertion pins required behavior, not an implementation incidental (97/80).
- [ ] The example is concrete: a specific input and the one expected output (97/81).
- [ ] The name states the scenario and the entry point in the language of the domain (97/95).
- [ ] The test data is boring, obviously fake, and safe to leak in front of a customer (97/25).
- [ ] The test reads in isolation: no Mystery Guest, no conditional logic, scaffolding hidden (xUnit).
- [ ] A deliberately introduced bug makes it fail, and the failure message says what went wrong (97/95).

If any box is unchecked, the test is not done — it is mid-written. Either finish, or delete it and start over.
| # | Principle | Author |
|---|---|---|
| 97/25 | Don't Be Cute with Your Test Data | Rod Begbie |
| 97/60 | News of the Weird: Testers Are Your Friends | Burk Hufnagel |
| 97/80 | Test for Required Behavior, Not Incidental Behavior | Kevlin Henney |
| 97/81 | Test Precisely and Concretely | Kevlin Henney |
| 97/82 | Test While You Sleep (and over Weekends) | Rajith Attapattu |
| 97/83 | Testing Is the Engineering Rigor of Software Development | Neal Ford |
| 97/92 | When Programmers and Testers Collaborate | Janet Gregory |
| 97/95 | Write Tests for People | Gerard Meszaros |
| GOOS/ListenToTestPain | Listen to Test Pain | Steve Freeman & Nat Pryce |
| xUnit/ObscureTest | Obscure Test | Gerard Meszaros |
| xUnit/FragileTest | Fragile Test | Gerard Meszaros |
| xUnit/MysteryGuest | Mystery Guest | Gerard Meszaros |
| xUnit/ConditionalTestLogic | Conditional Test Logic | Gerard Meszaros |
See principles.md for the long-form distillations, citations, and source links.