Testing methodology and strategy. Tests prove behavior, not implementation. Triggers: test, tests, coverage, assertion, mock, fixture, spec, verify.
reference/testing-research.md
<core_principles>
<test_hierarchy>
Invert the pyramid at your peril. More E2E = slower feedback = less testing. </test_hierarchy>
<anti_patterns> <block id="coverage_theater">High coverage, weak assertions. 100% coverage with .toBeTruthy() catches nothing.</block> <block id="implementation_coupling">Test breaks when refactoring. Test behavior, not structure.</block> <block id="happy_path_only">Normal inputs rarely fail. Test edges, nulls, boundaries, concurrent access.</block> <block id="ai_test_trust">AI generates tests that validate bugs. Review AI tests for what they ACTUALLY assert.</block> <block id="flaky_tolerance">Flaky test = broken test. Fix or delete. Never ignore.</block> </anti_patterns>
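The coverage_theater block above can be sketched concretely. This is a hypothetical example (the `applyDiscount` function and its bug are invented for illustration), written in plain JavaScript rather than a Jest runner:

```javascript
// Hypothetical function under test, containing a deliberate bug.
function applyDiscount(price, pct) {
  return price * (1 - pct); // bug: should be price * (1 - pct / 100)
}

// Coverage theater: a truthiness-only check (the .toBeTruthy() pattern)
// executes every line — 100% coverage — yet passes on a wildly wrong result.
const weakAssertionPasses = Boolean(applyDiscount(100, 10)); // -900 is truthy

// A strong assertion pins the exact expected value and exposes the bug.
const strongAssertionPasses = applyDiscount(100, 10) === 90;
```

Both tests "cover" the same line; only the strong assertion can fail when the math is wrong.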
<verification_gate>
<!-- Updated 2026-03-28: https://code.claude.com/docs/en/best-practices -->Always provide verification. If you can't verify it, don't ship it. AI writes tests prolifically, but tends to test where the code IS, not where it matters most. Review AI-generated tests: do they cover the actual risk area, or just the happy path the code already handles? </verification_gate>
<jit_testing> From Meta (Feb 2026): tests generated on-the-fly before code lands. No maintenance burden (tests don't persist). Mutation-based fault injection. Consider for high-churn code where traditional test suites rot faster than they help. </jit_testing>
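A minimal sketch of what mutation-based fault injection means here (a toy example; the real JIT tooling described above generates both tests and mutants automatically):

```javascript
// Original code and an injected mutant (fault: operator flipped).
const original = (a, b) => a + b;
const mutant = (a, b) => a - b;

// A test "kills" a mutant when it passes on the original but fails on the mutant.
const strongTest = (fn) => fn(2, 3) === 5;
const mutantKilled = strongTest(original) && !strongTest(mutant);

// A weak test that only checks the result's type lets the mutant survive —
// exactly what mutation score measures and coverage % does not.
const weakTest = (fn) => typeof fn(2, 3) === "number";
const mutantSurvivesWeakTest = weakTest(original) && weakTest(mutant);
```

A surviving mutant means the suite would not notice that fault in real code.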
<!-- Updated 2026-03-30: Claude Code best practices, AI code review research --><golden_ratio_principle> Test coverage follows diminishing returns past ~80%. Beyond that, invest in assertion strength, edge-case depth, and mutation testing rather than chasing raw line coverage.
Coverage % is a vanity metric without mutation score. </golden_ratio_principle>
<ai_generated_test_review> When reviewing AI-generated tests, specifically check what they ACTUALLY assert: do they pin concrete expected values, cover the real risk area, and exercise edge cases, or do they merely restate whatever the implementation currently returns?
AI-generated tests frequently validate bugs because they're synthesized FROM the code. Read them as if the implementation might be wrong, because it might be. </ai_generated_test_review>
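A toy illustration of a test that validates a bug because it was synthesized FROM the code (the `clamp` function and its fault are hypothetical):

```javascript
// Hypothetical buggy implementation: clamp value into [min, max].
function clamp(value, min, max) {
  // bug: uses max at the lower bound; correct is Math.max(min, value)
  return Math.min(max, Math.max(max, value));
}

// A code-derived test asserts what the code currently returns — it encodes
// the bug (clamp(5, 0, 10) yields 10 here) and passes.
const codeDerivedTestPasses = clamp(5, 0, 10) === 10;

// A spec-derived test asserts intended behavior (5 is already in range,
// so it should come back unchanged) and fails, exposing the bug.
const specDerivedTestPasses = clamp(5, 0, 10) === 5;
```

This is why the test-writing agent must see only the specification: tests derived from the implementation cannot disagree with it.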
<!-- Updated 2026-04-02: https://code.claude.com/docs/en/best-practices, https://testdino.com/blog/claude-code-with-playwright/ --><multi_agent_test_patterns> Writer/Reviewer split: Have one agent write tests, a separate agent write code to pass them. This prevents the common failure where AI writes tests FROM the code (validating bugs, not behavior). The test-writing agent must only see the SPECIFICATION, not the implementation.
Parallel hypothesis testing: When coverage gaps exist across multiple modules, spawn agents per module boundary. Each agent tests its domain independently. Prevents agents from stepping on each other's test state or sharing fixtures incorrectly.
Test-before-code in agentic context: When issuing a contract to a surgeon agent, include acceptance criteria as executable test cases. The surgeon's done-when is tests passing, not code written. This forces behavioral specification before implementation.
Verification criteria = highest leverage: Claude performs dramatically better when it can verify its own work by running tests. Always give agents runnable verification — a test suite they can execute — not just a written description of done. </multi_agent_test_patterns>
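The surgeon-agent contract above can be sketched as executable acceptance criteria. This assumes a hypothetical `slugify` specification (lowercase, trim, hyphenate words); the checks are written before any implementation exists and the candidate implementation is only "done" when every check passes:

```javascript
// Acceptance criteria derived from the SPECIFICATION, not the code.
const acceptance = [
  (fn) => fn("Hello World") === "hello-world",
  (fn) => fn("  spaced  out  ") === "spaced-out",
  (fn) => fn("") === "",
];

// Candidate implementation delivered by the surgeon agent.
const slugify = (s) =>
  s.trim().toLowerCase().split(/\s+/).filter(Boolean).join("-");

// Done-when: all acceptance checks pass — runnable verification,
// not a written description of done.
const done = acceptance.every((check) => check(slugify));
```

The agent can rerun `acceptance` after every change, which is exactly the self-verification loop the pattern calls for.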
<on_complete> agentdb write-end '{"skill":"testing","tests_added":<N>,"coverage_delta":"<+X%>","edge_cases":["<list>"],"assertions":"<strong|weak>"}'
Record what you tested and WHY. Prevent duplicate coverage. </on_complete>
</core_principles>
</skill>