From ironclaude
Detect and eliminate testing theatre - tests that can't prevent regressions
npx claudepluginhub robertphyatt/ironclaude --plugin ironclaude
This skill uses the workspace's default tool permissions.
Enforce zero tolerance for testing theatre across all test types.
Reviews test suites for coverage completeness, quality, and best practices. Checks happy/sad paths, edge cases, assertions, isolation, AAA patterns, and compliance with RSpec, Minitest, Jest, and Playwright conventions.
Detects test smells such as overmocking, flaky tests, fragile tests, poor assertions, and coverage gaps. Analyzes test correctness, reliability, and maintainability when reviewing or improving tests.
Reviews test files for bug-catching quality, grading them on six dimensions such as assertion depth, input coverage, and mock health, with an actionable scorecard.
Whenever soliciting user input — choices, confirmations, or selections — ALWAYS use the AskUserQuestion tool. NEVER ask via prose. Follow the format in .claude/rules/ask-user-question-format.md: Re-ground context, Predict, Options.
Announce professional mode status:
Using testing-theatre-detection skill. Professional mode is ACTIVE - architect mode enforced (no code changes).
If invoked with exact file path:
Skip scope determination. Analyze the provided file directly.
Example: /testing-theatre-detection src/auth/login.test.js
If invoked from plan execution (subagent with task context): Identify tests related to the specific task/feature:
If manual invocation (no file path provided): Use AskUserQuestion tool:
If auto-invoked (from code-review): Automatically use current changes (git diff --name-only)
Run the appropriate command based on scope:
- Task/feature scope: search test files for imports of the code under test, e.g. import.*{ComponentName} or import.*'path/to/file'
- Current changes: git diff --name-only | grep -E '\.(test|spec)\.(js|ts|tsx|java|py)$'
- Full scan: glob **/*.test.js, **/*Test.java, **/test_*.py

For each test file, determine framework:
Jest Detection:
- File patterns: .test.js, .spec.js, .test.ts, .spec.ts, .test.tsx, .spec.tsx
- Import patterns: import { test, expect } from, import { describe, it } from
- Framework: jest

JUnit Detection:
- File patterns: *Test.java, *Tests.java
- Markers: import org.junit, @Test, @Disabled
- Framework: junit

pytest Detection:
- File patterns: test_*.py, *_test.py
- Markers: import pytest, @pytest.mark
- Framework: pytest

React Testing Library Detection:
- File patterns: .test.tsx, .spec.tsx
- Import pattern: import { render } from '@testing-library/react'
- Framework: jest + react-testing-library

If framework cannot be determined:
Detection approach: Search for skipped or disabled tests.

Jest patterns to detect:
- Markers: it.skip(, test.skip(, xit(, xtest(, describe.skip(
- Search regex: \.(skip|xit|xtest)\(

JUnit patterns to detect:
- Markers: @Disabled, @Ignore annotations
- Search regex: @(Disabled|Ignore)

pytest patterns to detect:
- Markers: @pytest.mark.skip, @pytest.mark.xfail
- Search regex: @pytest\.mark\.(skip|xfail)

For each match found:
Issue: Skipped Test (Critical)
Line {number}: {test name}
Problem: Test is disabled - known broken behavior being ignored
Risk: Production bug hiding behind disabled test
Fix: Remove skip/disable annotation and fix the failing test
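For example (isTokenValid and makeToken are hypothetical names), a skipped Jest test and its re-enabled form:

```js
// Skipped: the regression this test guarded against can now ship unnoticed.
test.skip('rejects expired tokens', () => {
  const token = makeToken({ expiresAt: Date.now() - 1000 });
  expect(isTokenValid(token)).toBe(false);
});

// Re-enabled: fix whatever made it fail rather than hiding it.
test('rejects expired tokens', () => {
  const token = makeToken({ expiresAt: Date.now() - 1000 });
  expect(isTokenValid(token)).toBe(false);
});
```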
Detection approach: For each test function, count assertion statements.
Jest assertions to count:
- expect( statements
- assert( statements
- Count regex: expect\(|assert\(

JUnit assertions to count:
- assert, assertEquals, assertTrue, assertFalse, etc.
- Count regex: assert[A-Z]\w+\(

pytest assertions to count:
- Bare assert statements
- Count regex: ^\s+assert\s

Tautological assertion patterns:
- Jest: expect(x).toBe(x), expect(true).toBe(true)
- JUnit: assertTrue(true), assertEquals(x, x)
- pytest: assert True, assert x == x

For each test with zero assertions:
Issue: No Assertions (Critical)
Line {number}: test "{name}"
Problem: Test has no expect() calls - always passes regardless of implementation
Risk: Cannot detect regressions in {component} behavior
Fix:
```js
test('{name}', () => {
  // Arrange & Act
  const result = functionUnderTest(input);
  // Assert
  expect(result).toBe(expectedValue);
  expect(result.property).toEqual(expectedProperty);
});
```
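Tautological assertions are just as useless as missing ones: they compare a value to itself or assert a constant, so they can never fail. A minimal sketch (formatPrice is a hypothetical function under test):

```js
// Tautology: compares the result to itself, so any implementation passes.
test('formats the price', () => {
  const result = formatPrice(1999);
  expect(result).toBe(result);
});

// Behavioral: pins the expected output, so a regression breaks the test.
test('formats the price', () => {
  expect(formatPrice(1999)).toBe('$19.99');
});
```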
Detection approach: Count mock statements vs real code invocations in each test.
Mock patterns to count:
Jest:
- Markers: jest.mock(, jest.spyOn(, mockImplementation, mockReturnValue
- Count regex: jest\.(mock|spyOn)|mock(Implementation|ReturnValue)

JUnit:
- Markers: @Mock, Mockito.mock(, when(, verify(
- Count regex: @Mock|Mockito\.(mock|when|verify)

pytest:
- Markers: @patch, Mock(), MagicMock()
- Count regex: @patch|Mock\(\)|MagicMock\(\)

Calculate the ratio of mock statements to real code invocations:
For each over-mocked test:
Issue: Over-Mocking (Critical)
Line {number}: test "{name}"
Problem: {percentage}% of test is mocking - not testing real behavior
Risk: Tests pass but production code may be broken
Fix: Reduce mocking. Test real integrations when possible:
- Mock external dependencies (APIs, databases) at boundaries
- Use real implementations for internal logic
- Integration tests should test actual integration
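As an illustrative sketch (applyDiscount and the price API are hypothetical), an over-mocked test versus one that mocks only at the boundary:

```js
// Over-mocked: the real discount logic never runs; the assertion only
// echoes the value the mock was configured to return.
test('applies the discount', () => {
  const applyDiscount = jest.fn().mockReturnValue(80);
  expect(applyDiscount(100, 'SAVE20')).toBe(80);
});

// Boundary mock: the real applyDiscount runs; only the external
// price-lookup API is faked.
test('applies the discount', async () => {
  const priceApi = { getPrice: jest.fn().mockResolvedValue(100) };
  const total = await applyDiscount(priceApi, 'SAVE20');
  expect(total).toBe(80);
});
```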
Detection approach: Find tests that ONLY use snapshot assertions without behavioral assertions.
Snapshot patterns:
- Markers: toMatchSnapshot(), toMatchInlineSnapshot()
- Search regex: toMatchSnapshot|toMatchInlineSnapshot

Check logic: a test counts as snapshot-only when it contains a snapshot matcher and no other assertions.
For each snapshot-only test:
Issue: Snapshot Only (Critical)
Line {number}: test "{name}"
Problem: Only uses toMatchSnapshot() with no behavior validation
Risk: Doesn't verify {component} actually works
Fix: Add assertions for interactive behavior:
- Test click handlers are called with correct arguments
- Test state changes correctly
- Test props are applied correctly
- Test accessibility attributes
- Then use snapshots as supplementary check
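For instance (Counter is a hypothetical component; React Testing Library and @testing-library/jest-dom matchers are assumed to be set up):

```jsx
import React from 'react';
import { render, screen, fireEvent } from '@testing-library/react';
import Counter from './Counter';

// Snapshot only: renders the component and stores a blob nobody reviews.
test('renders the counter', () => {
  const { container } = render(<Counter />);
  expect(container).toMatchSnapshot();
});

// Behavioral: clicking actually increments; the snapshot is only a supplement.
test('increments on click', () => {
  const { container } = render(<Counter />);
  fireEvent.click(screen.getByRole('button', { name: /increment/i }));
  expect(screen.getByText('Count: 1')).toBeInTheDocument();
  expect(container).toMatchSnapshot();
});
```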
Detection approach: Find try/catch blocks or conditional logic that can prevent test failures.
Patterns to detect:
Try/catch with no rethrow:
- Jest: try { ... } catch (e) { } or catch (e) { console.log(e) }
- JUnit: catch (Exception e) { } with no throw/fail
- pytest: except: pass or except Exception: pass

Conditional assertions:
- if (condition) { expect(...) } - the assertion might not run

Search patterns:
- catch\s*\([^)]+\)\s*\{\s*\}

For each error swallowing pattern:
Issue: Error Swallowing (Critical)
Line {number}: test "{name}"
Problem: Try/catch or conditional logic can prevent test failure
Risk: Test passes even when code throws errors
Fix: Either:
- Remove try/catch and let test fail on error
- If testing error handling, assert the error: expect(() => fn()).toThrow()
- Remove conditional logic around assertions
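A before/after sketch (parseConfig is a hypothetical function under test):

```js
// Swallows the failure: if parseConfig throws, the catch hides it,
// no assertion runs, and the test still passes.
test('parses the config', () => {
  try {
    const config = parseConfig('{"port": 8080}');
    expect(config.port).toBe(8080);
  } catch (e) {
    console.log(e);
  }
});

// Let errors fail the test, and assert explicitly when an error is expected.
test('parses the config', () => {
  expect(parseConfig('{"port": 8080}').port).toBe(8080);
});

test('rejects malformed config', () => {
  expect(() => parseConfig('not json')).toThrow();
});
```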
Find test command:
Check package.json (for JS/TS projects):
- Look for: scripts.test or scripts.test-ci
- Run: npm test or yarn test

Check Makefile (for any project):
- Look for: ^test:|^test-.*:
- Run: make test

Check build.gradle (for Java projects):
- Look for: task test
- Run: ./gradlew test

Direct framework invocation (fallback):
- Jest: npx jest --coverage
- JUnit: ./gradlew test or mvn test
- pytest: pytest --cov

If ambiguous, use AskUserQuestion tool:
Execute the test command with the Bash tool:
Example:
npm test 2>&1
Check 1: Exit Code
If the exit code is non-zero, report:
❌ Test Failures (Critical)
Problem: {count} tests failed
Risk: Cannot assess test quality when tests don't pass
Fix: Address test failures first:
{list of failed tests from output}
Check 2: Warning Messages
Scan output for warning patterns:
- WARN:, WARNING:, Warning:
- deprecated, deprecation
- (node:) prefixed warning lines

For each warning:
⚠️ Test Warning (Critical)
Problem: Test output contains warning
Warning: {warning text}
Risk: Warnings indicate unreliable test behavior
Fix: Address the warning - update deprecated APIs, fix configuration
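Beyond fixing the existing warnings, one way to keep new ones from slipping back in is to turn console warnings into failures in a Jest setup file. A hedged sketch; the file name is an assumption, and the file must be registered via the setupFilesAfterEnv option:

```js
// jest.setup.js - make any console.warn fail the test that produced it.
beforeEach(() => {
  jest.spyOn(console, 'warn').mockImplementation((message) => {
    throw new Error(`Unexpected console.warn during test: ${message}`);
  });
});

afterEach(() => {
  jest.restoreAllMocks();
});
```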
Check 3: Error Messages (even if tests pass)
Scan output for error patterns:
- Error:, ERROR:
- Exception:
- failed to

For each error in passing tests:
❌ Hidden Error (Critical)
Problem: Test output contains errors but tests still pass
Error: {error text}
Risk: Test is swallowing errors
Fix: Update test to fail on errors or fix the underlying issue
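A common source of this pattern is an unawaited promise: the rejection lands in the output while the test passes. A sketch (saveProfile is hypothetical):

```js
// Passes even though saveProfile rejects - the promise is never awaited,
// so the error only appears in the log as an unhandled rejection.
test('saves the profile', () => {
  saveProfile({ name: 'Ada' });
});

// Fails loudly when the save breaks.
test('saves the profile', async () => {
  await expect(saveProfile({ name: 'Ada' })).resolves.toMatchObject({ name: 'Ada' });
});
```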
Check 4: Flaky Indicators
Scan for flaky patterns:
- timeout, timed out
- ETIMEDOUT, ECONNREFUSED
- UnhandledPromiseRejection
- race condition

For each flaky indicator:
🔀 Flaky Test Indicator (Critical)
Problem: Test output suggests flaky/unreliable behavior
Indicator: {text}
Risk: Test may pass/fail randomly
Fix:
- Add proper async/await handling
- Increase timeouts if necessary
- Fix race conditions
- Mock unstable dependencies
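For timer-driven flakiness, fake timers usually remove the race entirely. A sketch (onSearchInput is hypothetical and fetchResults is assumed to be a jest.fn() it calls after a short debounce):

```js
// Flaky: waits a real 100 ms and hopes the debounced call has fired by then;
// CI load decides the outcome.
test('debounced search fires', (done) => {
  onSearchInput('cats');
  setTimeout(() => {
    expect(fetchResults).toHaveBeenCalledWith('cats');
    done();
  }, 100);
});

// Deterministic: fake timers advance virtual time instead of waiting.
test('debounced search fires', () => {
  jest.useFakeTimers();
  onSearchInput('cats');
  jest.advanceTimersByTime(1000);
  expect(fetchResults).toHaveBeenCalledWith('cats');
  jest.useRealTimers();
});
```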
Find coverage reports:
Jest (JSON format):
- coverage/coverage-final.json

JUnit (JaCoCo XML format):
- build/reports/jacoco/test/jacocoTestReport.xml

pytest (JSON format via pytest-cov):
- .coverage or coverage.json

Coverage metrics to extract:
Correlation with static analysis:
⚠️ High Coverage, Low Assertions (Critical)
Problem: {coverage}% code coverage but only {count} assertions
Risk: Executing code without verifying behavior - false confidence
Fix: Add meaningful assertions that verify:
- Return values
- State changes
- Side effects
- Error conditions
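A rough sketch of the correlation step, assuming the Istanbul-format JSON that jest --coverage writes to coverage/coverage-final.json; the 80% and 3-assertion thresholds and the sibling test-file naming convention are illustrative assumptions:

```js
const fs = require('fs');

// Map of absolute source paths to Istanbul coverage data; "s" holds statement hit counts.
const coverage = JSON.parse(fs.readFileSync('coverage/coverage-final.json', 'utf8'));

for (const [file, data] of Object.entries(coverage)) {
  const hits = Object.values(data.s);
  if (hits.length === 0) continue;
  const pct = (hits.filter((count) => count > 0).length / hits.length) * 100;

  // Assume a sibling test file, e.g. src/foo.js -> src/foo.test.js.
  const testFile = file.replace(/\.(js|jsx|ts|tsx)$/, '.test.$1');
  if (!fs.existsSync(testFile)) continue;

  // Crude assertion count: occurrences of expect( in the test file.
  const assertions = (fs.readFileSync(testFile, 'utf8').match(/expect\(/g) || []).length;

  if (pct >= 80 && assertions < 3) {
    console.log(`${file}: ${pct.toFixed(0)}% coverage but only ${assertions} assertions`);
  }
}
```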
Report structure:
Testing Theatre Audit Report
=============================
Scope: {scope description} ({count} test files analyzed)
Status: {✅ CLEAN or ❌ {count} issues found (MUST FIX)}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
📁 {file_path} ({issue_count} issues)
{for each issue in file:}
❌ {Issue Type} (Critical)
Line {number}: {test name}
Problem: {description}
Risk: {impact}
Fix: {guidance with code example if applicable}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Summary: {total_issues} critical issues across {file_count} files
All issues must be fixed before production readiness.
Grouping logic:
Issue formatting: