Phase 3: Write tests FIRST (TDD RED phase)
Write all tests BEFORE implementation (TDD RED phase). Creates failing unit, integration, and E2E tests based on your design, ensuring comprehensive coverage and proper test isolation.
Install: /plugin marketplace add kenotron-ms/amplifier-setup, then /plugin install dev-kit@amplifier-setup
Command group: new-feature
Write all tests BEFORE implementation. Tests should fail initially.
This template MUST be followed when creating 03-test-plan.md.
# Test Plan: [Feature]
**Based on**: 02-design.md
**Created**: YYYY-MM-DD
## Test Strategy
- Unit tests: 60% coverage target
- Integration tests: 30% coverage target
- E2E tests: 10% coverage target
## Test Categories
### New Feature Tests
Tests for NEW functionality being added. These MUST fail in Phase 3 (RED).
### Regression Tests
Tests for EXISTING functionality to ensure it doesn't break. These should PASS in Phase 3.
## Unit Tests To Write
### New Feature Tests (will FAIL in Phase 3 - RED)
#### Module: [Module Name]
- [ ] test_new_function_with_valid_input
- [ ] test_new_function_with_invalid_input
- [ ] test_new_function_edge_case
- [ ] test_new_function_error_handling
**Total**: 0/[X] unit tests (expecting FAIL)
### Regression Tests (should PASS in Phase 3 - protecting existing code)
- [ ] test_existing_feature_still_works
- [ ] test_existing_api_not_broken
- [ ] test_existing_data_flow_preserved
**Total**: 0/[X] regression tests (expecting PASS)
## Integration Tests To Write
### New Feature Tests (will FAIL)
- [ ] test_new_module_integration
- [ ] test_new_data_flow
### Regression Tests (should PASS)
- [ ] test_existing_integration_not_broken
**Total**: 0/[X] integration tests
## E2E Tests To Write
### New Feature Tests (will FAIL)
- [ ] test_new_user_workflow
- [ ] test_new_feature_error_handling
### Regression Tests (should PASS)
- [ ] test_existing_user_flows_still_work
**Total**: 0/[X] e2e tests
## Current Status
Tests planned. Ready to write tests.
**Test Files To Create**:
- tests/unit/test_[module].py
- tests/integration/test_[feature].py
- tests/e2e/test_[feature].py
Inputs: ai_working/<feature>-<date>/02-design.md, progress.md
Update TodoWrite:
- [ ] Test design checklist reviewed
- [ ] Test plan created (what tests to write)
- [ ] Unit tests written per plan (failing)
- [ ] Integration tests written per plan (failing)
- [ ] E2E tests written per plan (failing)
- [ ] Tests verified to fail correctly (RED)
- [ ] Test cleanup verified
CRITICAL: All tests MUST follow these principles.
Before writing tests, ensure the test design includes:
UNIT TEST - Tests single unit in isolation with mocked dependencies:
Example UNIT test:
// Testing MinimizedDock component in isolation (store hook is mocked)
const mockStore = { dockExpanded: true, setDockExpanded: vi.fn() };
// useStore is mocked (e.g., via vi.mock) to return mockStore
render(<MinimizedDock />);
await user.click(screen.getByRole('button', { name: 'Collapse sidebar' }));
expect(mockStore.setDockExpanded).toHaveBeenCalledWith(false); // Mocked function, not real state
INTEGRATION TEST - Tests multiple units working together with real dependencies:
Example INTEGRATION test:
// Testing MinimizedDock with REAL store and localStorage
render(<App />); // Real Zustand provider, real localStorage
await user.click(screen.getByRole('button', { name: 'Collapse sidebar' }));
expect(useStore.getState().dockExpanded).toBe(false); // Real store state
expect(localStorage.getItem('settings')).toContain('"dockExpanded":false'); // Real persistence
E2E TEST - Tests complete user journey through full system:
Example E2E test:
// Testing complete user flow
await page.goto('/'); // Real app
await page.getByRole('button', { name: 'Minimize Project' }).click();
await page.getByRole('button', { name: 'Collapse sidebar' }).click();
await page.reload(); // Test persistence across reload
await expect(page.getByRole('region', { name: 'Dock' })).toHaveAttribute('data-expanded', 'false');
Quick Classification Guide:
Run a single test with: pytest test_file.py::test_name
Selector Priority (most to least resilient):
Accessible roles and labels (BEST - mirrors user interaction)
getByRole('button', { name: 'Submit' })
getByLabelText('Email address')
Use for: All tests
Test IDs (explicit contracts)
data-testid="workspace-create-button"
getByTestId('workspace-create-button')
Use for: Complex scenarios, dynamic content, when semantic queries fail
User-visible text (natural but fragile)
getByText('Welcome back')
Use for: Simple unit tests
AVOID (brittle, implementation-coupled):
- .btn-primary (styling details)
- #header (implementation detail)
- //div[@class='foo']/span[2] (brittle XPath)
- div > span:nth-child(2) (breaks easily)
- screen.queryByText('1') (ambiguous, multiple matches)
Guidelines:
- Name test IDs as [component]-[action]-[element]
- Prefer accessible queries, e.g. getByRole('button', { name: /submit/i })
- Avoid text queries that could match multiple elements (e.g., queryByText('1'))
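For example, the same interaction can be located several ways; the role-based query survives refactors that the others do not (a sketch - component and label names are illustrative):
// BEST: accessible role + name, mirrors what the user sees
await user.click(screen.getByRole('button', { name: /create workspace/i }));
// OK: explicit test-id contract, useful for dynamic or non-semantic content
await user.click(screen.getByTestId('workspace-create-button'));
// AVOID: positional/styling selectors - they break on any markup or CSS change
// document.querySelector('div.toolbar > button.btn-primary')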
AVOID arbitrary timeouts (flaky, slow, unreliable):
- await page.waitForTimeout(1000) (arbitrary wait)
- await sleep(500) (arbitrary delay)
- setTimeout() in tests (timing-based)
USE condition-based waits (reliable, fast, deterministic):
Playwright:
✅ await page.waitForSelector('#element')
✅ await page.waitForLoadState('networkidle')
✅ await page.waitForResponse(url => url.includes('/api'))
✅ await expect(locator).toBeVisible()
Testing Library:
✅ await waitFor(() => expect(element).toBeInTheDocument())
✅ await findByRole('button', { name: 'Submit' })
✅ await waitForElementToBeRemoved(() => screen.getByText('Loading'))
Cypress:
✅ cy.get('[data-testid="item"]').should('be.visible')
✅ cy.contains('Success').should('exist')
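As a concrete before/after in Playwright (the URL and text are illustrative):
// BEFORE (flaky): hope the data has arrived after a fixed delay
await page.waitForTimeout(1000);
// AFTER (deterministic): wait for the condition the test actually depends on
await page.waitForResponse(resp => resp.url().includes('/api/items') && resp.ok());
await expect(page.getByText('Items loaded')).toBeVisible();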
Guidelines:
When instructing agents to write tests, explicitly require:
REQUIRED: Create test plan BEFORE writing any tests.
Review 02-design.md to identify:
- Modules and functions that need unit tests
- Integration points between modules
- User workflows that need E2E coverage
Save the plan as ai_working/<feature>-<date>/03-test-plan.md. This plan is your checklist - follow it exactly when writing tests!
Use 03-test-plan.md as your checklist. Write each test listed in the plan.
Use test-coverage agent:
Task test-coverage: "Create unit tests for [feature] following the test plan in
03-test-plan.md. Write EXACTLY the tests listed in the plan - no more, no less.
Tests should FAIL initially (no implementation yet).
Reference 02-design.md for module specifications.
CRITICAL TEST REQUIREMENTS:
- Include proper cleanup in teardown/afterEach/afterAll
- Use test fixtures with automatic cleanup
- Tests must be isolated (no shared state, no order dependencies)
- Database: Use transactions that rollback OR delete records in teardown
- Files: Delete any temp files created
- State: Reset mocks, clear caches after each test
- Each test must be self-contained and leave no trace
WRITE the actual test files using Write tool. Create:
- tests/unit/test_[module].py (or .js/.ts based on project)
- Include imports, test functions, assertions
- Include setup/teardown hooks
As you create each test, check it off in 03-test-plan.md!"
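As a reference for the cleanup requirements above, a minimal isolated unit test might look like this (Vitest/TypeScript sketch; the module under test and its API are hypothetical placeholders):
import { describe, it, expect, vi, afterEach } from 'vitest';
import { parseConfig } from '../../src/config'; // hypothetical module under test
describe('parseConfig', () => {
  afterEach(() => {
    vi.restoreAllMocks(); // reset mocks so no state leaks between tests
  });
  it('throws on invalid input', () => {
    // Will FAIL until parseConfig is implemented (RED phase)
    expect(() => parseConfig('not-json')).toThrow();
  });
});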
Verify tests were created:
ls -la tests/unit/test_*.*
Check off tests in 03-test-plan.md as created.
Use 03-test-plan.md as your checklist. Write each integration test listed.
Use integration-specialist agent:
Task integration-specialist: "Create integration tests for [feature] following
the test plan in 03-test-plan.md. Write EXACTLY the tests listed - no more, no less.
Tests should FAIL initially.
Reference 02-design.md for integration specifications.
CRITICAL TEST REQUIREMENTS:
- Include proper cleanup in teardown
- Use test database with cleanup/rollback
- Tests must be isolated, self-contained, and idempotent
- No execution order dependencies
WRITE the actual test files using Write tool. Create:
- tests/integration/test_[feature]_integration.py (or .js/.ts)
- Include setup/teardown with database cleanup
Check off each test in 03-test-plan.md as created!"
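The isolation pattern for integration tests might look like this (Vitest/TypeScript sketch; the test-database helpers and feature module are hypothetical stand-ins for whatever the project provides):
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { createTestDb, dropTestDb } from './helpers/test-db'; // hypothetical helpers
import { createFeatureRecord } from '../../src/feature'; // hypothetical module under test
describe('feature integration', () => {
  let db: Awaited<ReturnType<typeof createTestDb>>;
  beforeEach(async () => {
    db = await createTestDb(); // fresh, isolated database per test
  });
  afterEach(async () => {
    await dropTestDb(db); // remove everything this test created - leave no trace
  });
  it('persists a new record through the real data flow', async () => {
    // Will FAIL until the feature is implemented (RED phase)
    const record = await createFeatureRecord(db, { name: 'example' });
    expect(record.id).toBeDefined();
  });
});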
Verify tests created and check off in plan.
Use 03-test-plan.md as your checklist. Write each E2E test listed.
Use test-coverage agent:
Task test-coverage: "Create E2E tests for [feature] following the test plan in
03-test-plan.md. Write EXACTLY the tests listed in the plan.
Tests should FAIL initially.
Reference 02-design.md for user flow specifications.
CRITICAL TEST REQUIREMENTS:
- Include proper cleanup in teardown/afterAll
- Clear localStorage/sessionStorage after tests
- Tests must be isolated, self-contained, and repeatable
- No execution order dependencies
WRITE the actual test files using Write tool. Create:
- tests/e2e/test_[feature]_e2e.py (or .spec.js/.spec.ts)
- Include browser setup/teardown and data cleanup
Check off each test in 03-test-plan.md as created!"
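For the E2E storage cleanup requirement, a Playwright sketch might look like this (route and button name are illustrative):
import { test, expect } from '@playwright/test';
test.afterEach(async ({ page }) => {
  // Leave no trace: clear browser storage after every test
  await page.evaluate(() => {
    localStorage.clear();
    sessionStorage.clear();
  });
});
test('new user workflow', async ({ page }) => {
  await page.goto('/'); // baseURL comes from playwright.config
  // Will FAIL until the feature exists (RED phase)
  await expect(page.getByRole('button', { name: 'New Feature' })).toBeVisible();
});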
Verify tests created and check off in plan.
REQUIRED: Verify all tests are syntactically correct and properly structured BEFORE running them.
Check each test file:
# Check syntax
[language-specific syntax check command]
# Python: python -m py_compile tests/**/*.py
# JavaScript: npx eslint tests/ --max-warnings 0
# TypeScript: npx tsc --noEmit
# Verify test framework can discover tests
[test discovery command]
# pytest: pytest --collect-only
# jest: npm test -- --listTests
# vitest: npx vitest list
Verify each test has:
- Required imports
- Setup/teardown hooks where needed
- At least one meaningful assertion
Verify logical correctness:
- Assertions exercise real behavior, not trivial truths (e.g., assert True or assert 1 == 1)
If issues found:
- Fix them (syntax, imports, structure) before running anything
Only proceed to run tests after ALL test files are verified correct.
If issues can't be auto-fixed, ask user:
Test verification found issues that need your input:
[List of issues]
What would you like to do?
1. Let me fix these issues manually
2. Help me debug [specific issue]
3. Review the test files together
Your choice: _
Check if E2E framework supports auto-starting dev server:
Review 00-discovery.md for E2E framework in use:
- Playwright: webServer config in playwright.config
- Cypress: baseUrl plus start-server-and-test
If the framework supports auto-starting the server, configure it:
For Playwright:
// playwright.config.ts
import { defineConfig } from '@playwright/test';
export default defineConfig({
  webServer: {
    command: '[dev server command from discovery]',
    port: [port], // replace with the actual dev server port
    reuseExistingServer: !process.env.CI,
  },
});
For Cypress:
// cypress.config.js
const { defineConfig } = require('cypress');
module.exports = defineConfig({
  e2e: {
    baseUrl: 'http://localhost:[port]',
  },
  // Cypress does not auto-start the dev server - use start-server-and-test in package.json
});
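If the project uses Cypress, the package.json wiring for start-server-and-test might look like this (script names and port are assumptions - match them to the project's actual dev command):
{
  "scripts": {
    "dev": "vite",
    "cy:run": "cypress run",
    "test:e2e": "start-server-and-test dev http://localhost:5173 cy:run"
  }
}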
If configured, E2E tests will auto-start server. Skip Step 4b.
If E2E framework doesn't auto-start server, start it manually:
Get dev server command from 00-discovery.md (Build Command section).
# Start dev server in background
[dev server command from discovery] &
DEV_SERVER_PID=$!
# Wait for server to be ready
sleep 5 # Or use wait-on/wait-for-it
# Check if server started successfully
if ! curl -s http://localhost:[port] > /dev/null; then
echo "❌ Dev server failed to start"
fi
If server fails to start, ask user:
Dev server failed to start.
Possible issues:
- Port [port] already in use
- Missing dependencies
- Configuration error
- [Error message from logs]
What would you like to do?
1. Fix the issue and retry
2. Use a different port
3. Skip E2E tests for now (NOT RECOMMENDED - breaks TDD)
4. Debug the issue together
Your choice: _
Do NOT automatically skip E2E tests without user decision.
Keep server running for test execution if started successfully.
CRITICAL: Must actually run tests and confirm they fail!
Run unit and integration tests first:
# Run without E2E initially
[unit test command]
[integration test command]
Then run E2E tests (app should be running):
[test command from discovery] # e.g., pytest -v, npm test
Expected outcomes:
- NEW FEATURE tests: FAIL (RED) - no implementation exists yet
- REGRESSION tests: PASS - existing behavior is untouched
Verify test results:
[test command with verbose] | tee test_output.txt
# Analyze results
grep "PASSED" test_output.txt # Should be regression tests only
grep "FAILED" test_output.txt # Should be new feature tests only
Check test output for:
- New feature tests failing for the RIGHT reason (missing implementation), not syntax or import errors
- Regression tests passing cleanly
If NEW FEATURE tests PASS:
- Something is wrong: the behavior may already exist or the test isn't exercising new functionality. Investigate before proceeding.
If REGRESSION tests FAIL:
- Existing functionality is broken (or a regression test is wrong). Stop and fix before continuing.
If tests have errors (syntax, import issues, config problems):
Ask user instead of assuming:
Tests failed to execute (not failed as in RED, but couldn't run):
Error: [error message]
Possible issues:
- Syntax errors in test files
- Missing test dependencies
- Test framework not configured
- Import errors
What would you like to do?
1. Fix the issue and retry
2. Debug the test configuration
3. Review test files for errors
4. Other (explain)
Your choice: _
Do NOT skip tests or mark phase complete if tests can't execute.
Document test results:
Create summary showing both categories:
Phase 3 Test Results (RED Phase):
NEW FEATURE TESTS (expecting FAIL):
✓ Unit tests: X/X failing (expected RED)
✓ Integration tests: X/X failing (expected RED)
✓ E2E tests: X/X failing (expected RED)
REGRESSION TESTS (expecting PASS):
✓ Unit tests: X/X passing (existing code works)
✓ Integration tests: X/X passing (existing code works)
✓ E2E tests: X/X passing (existing code works)
Total: X new feature tests FAILING (RED ✓), X regression tests PASSING ✓
RED phase confirmed for new feature - ready for implementation.
Only proceed to Phase 4 if:
- ALL new feature tests FAIL (RED)
- ALL regression tests PASS
- Failures are due to missing implementation, not errors in the tests themselves
This is proper TDD discipline with regression protection.
If you started the dev server manually in Step 4b:
# Stop the dev server
kill $DEV_SERVER_PID
# Or if PID not available
pkill -f "[dev server process name]"
echo "Dev server stopped"
If E2E framework auto-manages server, skip this step.
CRITICAL: Verify tests are self-contained and clean up properly.
Test 1: Run twice (verify cleanup)
# Run tests twice - both should pass/fail the same way
[test command]
[test command]
Check for:
- Identical results on both runs
- No leftover test data, temp files, or cached state from the first run affecting the second
Test 2: Run in different orders (verify independence)
# Run tests in random order (if framework supports)
[test command with random order flag] # e.g., pytest --random-order, npm test -- --randomize
# Or run individual tests
[test command] test_file.py::test_1
[test command] test_file.py::test_2
[test command] test_file.py::test_1 # Run test_1 again
Check for:
- Same results regardless of execution order
- No test relying on data or state created by another test
Test 3: Run in parallel (verify true isolation)
# Run tests with maximum parallelization
[test command with parallel workers]
# pytest: pytest -n auto (requires pytest-xdist)
# jest: npm test -- --maxWorkers=100%
# vitest: npx vitest --threads
# playwright: npx playwright test --workers=4
Check for:
- Same results as sequential runs
- No conflicts over shared resources (ports, files, database rows, global state)
If tests fail in parallel but pass sequentially:
- They share state; fix the isolation issue rather than disabling parallelism
Tests MUST pass in parallel execution.
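A common fix is to give every test its own uniquely named data instead of sharing fixtures, so parallel workers never collide (TypeScript sketch; the factory and client are hypothetical):
import { randomUUID } from 'node:crypto';
// Each test builds its own workspace, so parallel workers never fight over
// the same record, file, or cache key.
function makeTestWorkspace() {
  return { id: randomUUID(), name: `test-workspace-${randomUUID()}` };
}
// Usage inside a test:
// const workspace = makeTestWorkspace();
// await api.createWorkspace(workspace); // 'api' is a placeholder for the project's client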
Test 4: Verify cleanup in CI
# Run full suite
[test command with verbose output]
Check for:
- No leftover files, database records, or background processes after the run
- No warnings about unclosed handles, open connections, or unfinished async work
If issues found:
- Add or fix cleanup in the offending tests, then re-run the verification above
Update 03-test-plan.md: Check off all tests as created and mark with FAILING status. Add "Current Status" noting all tests are RED and ready for implementation.
Update progress.md:
Mark Phase 3 complete [✓] and set progress to 35%.
Present to user:
Phase 3 Complete: Test Planning (RED Phase)
Created:
- X unit tests (all FAILING ✓)
- X integration tests (all FAILING ✓)
- X e2e tests (all FAILING ✓)
Total: X tests created, X failing (100% RED) ✓
Test plan: ai_working/<feature>-<date>/03-test-plan.md
All tests checked off and verified RED.
Ready to proceed to implementation (make tests GREEN).
Next: /new-feature:4-implement
Outputs: ai_working/<feature>-<date>/03-test-plan.md, progress.md (updated)
Next: /new-feature:4-implement