Use when writing or reviewing tests - covers test philosophy, condition-based waiting, mocking strategy, and test isolation
Generates robust tests with condition-based waiting, strategic mocking, and proper isolation.
`npx claudepluginhub ed3dai/ed3d-plugins-testing`

This skill inherits all available tools. When active, it can use any tool Claude has access to.
"Write tests. Not too many. Mostly integration." — Kent C. Dodds
Tests verify real behavior, not implementation details. The goal is confidence that your code works, not coverage numbers.
Core principles:
- Test behavior, not implementation details
- Wait for conditions, not time
- Mock strategically, not reflexively
- Keep tests isolated and order-independent
Use Arrange-Act-Assert (or Given-When-Then):
```typescript
test('user can cancel reservation', async () => {
  // Arrange
  const reservation = await createReservation({ userId: 'user-1', roomId: 'room-1' });

  // Act
  const result = await cancelReservation(reservation.id);

  // Assert
  expect(result.status).toBe('cancelled');
  expect(await getReservation(reservation.id)).toBeNull();
});
```
One action per test. Multiple assertions are fine if they verify the same behavior.
Flaky tests often guess at timing. This creates race conditions where tests pass locally but fail in CI.
Wait for conditions, not time:
```typescript
// BAD: Guessing at timing
await new Promise(r => setTimeout(r, 50));
const result = getResult();

// GOOD: Waiting for condition
await waitFor(() => getResult() !== undefined);
const result = getResult();
```
```typescript
async function waitFor<T>(
  condition: () => T | undefined | null | false,
  description = 'condition',
  timeoutMs = 5000
): Promise<T> {
  const startTime = Date.now();
  while (true) {
    const result = condition();
    if (result) return result;
    if (Date.now() - startTime > timeoutMs) {
      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
    }
    await new Promise(r => setTimeout(r, 10)); // Poll every 10ms
  }
}
```
| Scenario | Pattern |
|---|---|
| Wait for event | `waitFor(() => events.find(e => e.type === 'DONE'))` |
| Wait for state | `waitFor(() => machine.state === 'ready')` |
| Wait for count | `waitFor(() => items.length >= 5)` |
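The patterns above can be sketched end to end. This is a minimal runnable example of condition-based waiting; `AppEvent` and the delayed producer are hypothetical stand-ins for whatever your system emits:

```typescript
// Minimal runnable sketch of waiting for an event by condition, not by sleep.
type AppEvent = { type: string };

async function waitFor<T>(
  condition: () => T | undefined | null | false,
  description = 'condition',
  timeoutMs = 5000
): Promise<T> {
  const startTime = Date.now();
  while (true) {
    const result = condition();
    if (result) return result;
    if (Date.now() - startTime > timeoutMs) {
      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
    }
    await new Promise(r => setTimeout(r, 10)); // Poll every 10ms
  }
}

async function main(): Promise<string> {
  const events: AppEvent[] = [];
  // Simulate an async producer that finishes after 50ms
  setTimeout(() => events.push({ type: 'DONE' }), 50);

  const done = await waitFor(() => events.find(e => e.type === 'DONE'), 'DONE event');
  return done.type;
}

main().then(type => console.log(type)); // DONE
```

The test passes whether the producer takes 5ms or 4 seconds — the timeout only bounds the worst case.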
Only when testing actual timing behavior (debounce, throttle, intervals):
```typescript
// Testing tool that ticks every 100ms
await waitForEvent(manager, 'TOOL_STARTED'); // First: wait for condition
await new Promise(r => setTimeout(r, 200));  // Then: wait for 2 ticks
// Comment explains WHY: 200ms = 2 ticks at 100ms intervals
```
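As a runnable sketch of the legitimate case — a hand-rolled `debounce` (hypothetical, for illustration) where the timed wait *is* the behavior under test, so real time is justified:

```typescript
// Testing a debounce: the wait IS the behavior, so a real delay is justified.
function debounce(fn: () => void, ms: number): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return () => {
    if (timer) clearTimeout(timer);
    timer = setTimeout(fn, ms);
  };
}

async function testDebounce(): Promise<number> {
  let calls = 0;
  const debounced = debounce(() => { calls++; }, 100);

  debounced();
  debounced();
  debounced(); // Three rapid calls should collapse into one

  // 250ms = 100ms debounce window + margin; the comment explains WHY
  await new Promise(r => setTimeout(r, 250));
  return calls;
}

testDebounce().then(calls => console.log(calls)); // 1
```

When the framework supports it, fake timers (e.g. vitest's `vi.useFakeTimers()`) make such tests instant and deterministic.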
"You don't hate mocks; you hate side-effects." — J.B. Rainsberger
Mocks reveal where side-effects complicate your code. Use them strategically, not reflexively.
Create thin wrappers around third-party libraries. Mock YOUR wrapper, not the library.
```typescript
// BAD: Mock the third-party HTTP client directly
vi.mock('axios');
const mockClient = vi.mocked(axios);

// GOOD: Create your own wrapper
class RegistryClient {
  constructor(private client: HttpClient) {}
  async getRepos() {
    return this.client.get('https://registry.example.com/v2/_catalog');
  }
}

// Mock your wrapper
vi.mock('./registry-client');
```
This simplifies tests AND improves your design.
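Because the wrapper takes its client through the constructor, a test can often skip module mocking entirely and pass a hand-written fake. A sketch — `HttpClient`, `RegistryClient`, and the registry URL are illustrative, not a real API:

```typescript
// Constructor injection lets tests substitute a fake without mocking machinery.
interface HttpClient {
  get(url: string): Promise<{ repositories: string[] }>;
}

class RegistryClient {
  constructor(private client: HttpClient) {}
  async getRepos() {
    return this.client.get('https://registry.example.com/v2/_catalog');
  }
}

// A hand-written fake standing in for the real HTTP layer
const fakeHttp: HttpClient = {
  async get(_url) {
    return { repositories: ['app/api', 'app/web'] };
  },
};

const registry = new RegistryClient(fakeHttp);
registry.getRepos().then(r => console.log(r.repositories.length)); // 2
```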
| Dependency Type | Example | Strategy |
|---|---|---|
| Managed (you control it) | Your database, your file system | Use REAL instances |
| Unmanaged (external) | Third-party APIs, SMTP, message bus | Use MOCKS |
Communications with managed dependencies are implementation details — you can refactor them freely. Communications with unmanaged dependencies are observable behavior — mocking protects against external changes.
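A sketch of the split in one test: a real in-memory repository stands in for the managed dependency, while the unmanaged payment gateway is stubbed. All names here (`OrderRepo`, `PaymentGateway`, `placeOrder`) are hypothetical:

```typescript
// Managed dependency (your data store): use the real thing.
// Unmanaged dependency (external payment API): stub it.
interface PaymentGateway {
  charge(userId: string, cents: number): Promise<{ ok: boolean }>;
}

class OrderRepo {
  private orders = new Map<string, number>();
  save(id: string, cents: number) { this.orders.set(id, cents); }
  get(id: string) { return this.orders.get(id); }
}

async function placeOrder(
  repo: OrderRepo, gateway: PaymentGateway, id: string, cents: number
): Promise<boolean> {
  const result = await gateway.charge('user-1', cents);
  if (!result.ok) return false;
  repo.save(id, cents); // Real repo: free to refactor how this is stored
  return true;
}

const repo = new OrderRepo();               // real, managed
const gateway: PaymentGateway = {           // stubbed, unmanaged
  async charge() { return { ok: true }; },
};

placeOrder(repo, gateway, 'order-1', 999)
  .then(ok => console.log(ok, repo.get('order-1'))); // true 999
```

Refactoring how orders are persisted never breaks this test; changing what is sent to the gateway would, which is exactly what you want.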
```tsx
// BAD: Testing that the mock exists
test('renders sidebar', () => {
  render(<Page />);
  expect(screen.getByTestId('sidebar-mock')).toBeInTheDocument();
});

// GOOD: Test real behavior
test('renders sidebar', () => {
  render(<Page />);
  expect(screen.getByRole('navigation')).toBeInTheDocument();
});
```
Gate: Before asserting on any mock element, ask: "Am I testing real behavior or mock existence?"
```typescript
// BAD: Mock breaks test logic
test('detects duplicate server', async () => {
  // Mock prevents the config write that the test depends on!
  vi.mock('ToolCatalog', () => ({
    discoverAndCacheTools: vi.fn().mockResolvedValue(undefined)
  }));
  await addServer(config);
  await addServer(config); // Should throw - but won't!
});

// GOOD: Mock at the correct level
test('detects duplicate server', async () => {
  vi.mock('MCPServerManager'); // Just mock slow server startup
  await addServer(config); // Config written
  await addServer(config); // Duplicate detected
});
```
Gate: Before mocking, ask: "What side effects does this have? Does my test depend on them?"
Mock the COMPLETE data structure as it exists in reality:
```typescript
// BAD: Partial mock
const mockResponse = {
  status: 'success',
  data: { userId: '123' }
  // Missing: metadata that downstream code uses
};

// GOOD: Mirror real API
const mockResponse = {
  status: 'success',
  data: { userId: '123', name: 'Alice' },
  metadata: { requestId: 'req-789', timestamp: 1234567890 }
};
```
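One way to keep mocks complete is to let the compiler enforce it: annotate the mock with the real response type, so a forgotten field is a compile error rather than a runtime surprise. `ApiResponse` here is a hypothetical shape:

```typescript
// Typing the mock against the real response shape catches missing fields
// at compile time instead of deep in downstream code.
interface ApiResponse {
  status: 'success' | 'error';
  data: { userId: string; name: string };
  metadata: { requestId: string; timestamp: number };
}

const mockResponse: ApiResponse = {
  status: 'success',
  data: { userId: '123', name: 'Alice' },
  metadata: { requestId: 'req-789', timestamp: 1234567890 },
  // Omitting `metadata` above would be a TypeScript error
};

console.log(mockResponse.metadata.requestId); // req-789
```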
The warning sign is mock setup that keeps growing: "As the number of mocks grows, the probability of testing the mock instead of the desired code goes up." — Codurance
Consider integration tests with real components — often simpler than elaborate mocks.
```typescript
// BAD: destroy() only used in tests
class Session {
  async destroy() { /* cleanup */ }
}

// GOOD: Test utilities handle cleanup
// test-utils/session-helpers.ts
export async function cleanupSession(session: Session) {
  const workspace = session.getWorkspaceInfo();
  if (workspace) {
    await workspaceManager.destroyWorkspace(workspace.id);
  }
}
```
Gate: Before adding any method to a production class, ask: "Is this only used by tests?" If yes, put it in test utilities.
Tests should not depend on execution order. But isolation doesn't mean cleaning up everything.
Long-lived resources (VMs, Kubernetes jobs, cloud resources) MUST be cleaned up.
Prefer product tools for cleanup when possible:
```typescript
afterAll(async () => {
  // Use the product's own cleanup mechanisms
  await deployment.delete();
  await job.terminate();
});
```
Side-channel cleanup when product tools aren't available:
```typescript
afterAll(async () => {
  // Direct cleanup when the product doesn't provide it
  await exec('kubectl delete job test-job-123');
});
```
Database artifacts are fine to leave around. Trying to clean up test data perfectly is a fool's errand and makes multi-step integration tests nearly impossible.
The database should handle its own lifecycle. Tests that require pristine state should create unique identifiers, not depend on cleanup.
```typescript
// Use unique identifiers instead of depending on clean state
const testId = `test-${Date.now()}-${Math.random()}`;
const user = await createUser({ email: `${testId}@test.com` });
```
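A slightly more collision-resistant variant uses Node's built-in `crypto` module; the helper name is illustrative:

```typescript
import { randomUUID } from 'node:crypto';

// Each test mints its own namespace, so it never depends on cleanup
// having run or on the order other tests executed in.
function uniqueTestId(prefix = 'test'): string {
  return `${prefix}-${randomUUID()}`;
}

const a = uniqueTestId();
const b = uniqueTestId();
console.log(a !== b); // true
```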
| Problem | Fix |
|---|---|
| Arbitrary `setTimeout` in tests | Use condition-based waiting |
| Assert on mock elements | Test real component or unmock |
| Mock third-party directly | Create wrapper, mock wrapper |
| Test-only methods in production | Move to test utilities |
| Mock without understanding | Understand dependencies first |
| Incomplete mocks | Mirror real API completely |
| Over-complex mocks | Consider integration tests |
| Long-lived resources left running | Clean up VMs, k8s jobs, cloud resources |
Stop and reconsider when you see:
- `setTimeout`/`sleep` without justification

TDD prevents most testing anti-patterns.
For certain patterns, property-based testing provides stronger coverage than example-based tests. See property-based-testing skill for complete reference.
| Pattern | Example | Why PBT |
|---|---|---|
| Serialization pairs | `encode`/`decode`, `toJSON`/`fromJSON` | Roundtrip property catches edge cases |
| Normalizers | `sanitize`, `canonicalize`, `format` | Idempotence property ensures stability |
| Validators | `is_valid`, `validate` | Valid-after-normalize property |
| Pure functions | Business logic, calculations | Multiple properties verify contract |
| Sorting/ordering | `sort`, `rank`, `compare` | Ordering + idempotence properties |
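The roundtrip row can be sketched by hand. This is a minimal illustration of the property, not a real PBT setup — a real suite would use a library such as fast-check, whose input shrinking makes failures far easier to diagnose. `encode`/`decode` here are trivial JSON stand-ins for your serialization pair:

```typescript
// Roundtrip property: decode(encode(x)) must equal x for many random inputs.
const encode = (xs: number[]): string => JSON.stringify(xs);
const decode = (s: string): number[] => JSON.parse(s);

// Crude random generator standing in for a PBT library's arbitraries
function randomArray(): number[] {
  const len = Math.floor(Math.random() * 20);
  return Array.from({ length: len }, () => Math.floor(Math.random() * 1000) - 500);
}

let failures = 0;
for (let i = 0; i < 200; i++) {
  const xs = randomArray();
  const roundtripped = decode(encode(xs));
  if (JSON.stringify(roundtripped) !== JSON.stringify(xs)) failures++;
}
console.log(failures); // 0
```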
Before committing property-based tests, verify:
- The property is non-trivial (`sorted(xs) == sorted(xs)` tests nothing)
- `assume()` calls don't filter out most inputs
- Known edge cases are pinned with `@example([])`, `@example([1])` decorators
- The property doesn't just restate the implementation (`assert add(a,b) == a+b`)