Skill

analyze-test-failures

Analyzes failing tests and classifies each as a test bug, an implementation bug, or ambiguous. Investigates test intent versus code behavior with evidence and recommends targeted fixes.

Python

testing

npx claudepluginhub jamie-bitflight/claude_skills --plugin python3-development

Tool Access

This skill uses the workspace's default tool permissions.

Stats

Parent Repo Stars: 34

Parent Repo Forks: 5

Last Commit: Mar 23, 2026

Analyze Test Failures

Analyze failing test cases with a balanced, investigative approach.

Context

Consult ../python3-development/references/python3-standards.md when shared testing or quality rules from this plugin apply; full standards, graphs, and amendment process are documented there.

When tests fail, there are two primary possibilities:

False positive: The test itself is incorrect
True positive: The test discovered a genuine bug

Assuming tests are wrong by default is a dangerous anti-pattern that defeats the purpose of testing.

Analysis Process

1. Initial Analysis

Read the failing test carefully, understanding its intent
Examine the test's assertions and expected behavior
Review the error message and stack trace

2. Investigate the Implementation

Check the actual implementation being tested
Trace through the code path that leads to the failure
Verify that the implementation matches documented behavior

3. Apply Critical Thinking

For each failing test, ask:

What behavior is the test trying to verify?
Is this behavior clearly documented or implied by the API design?
Does the current implementation actually provide this behavior?
Could this be an edge case the implementation missed?

4. Make a Determination

Classify the failure as one of:

Classification	Meaning
Test Bug	Test's expectations are incorrect
Implementation Bug	Code doesn't behave as it should
Ambiguous	Intended behavior is unclear
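
One way to read the table is as a decision on a single piece of evidence: whether the documentation or API design supports what the test expects. The sketch below assumes that framing. `Determination` and `classify` are hypothetical names introduced for illustration, not part of this skill.

```python
from enum import Enum
from typing import Optional


class Determination(Enum):
    TEST_BUG = "Test Bug"
    IMPLEMENTATION_BUG = "Implementation Bug"
    AMBIGUOUS = "Ambiguous"


def classify(docs_support_test: Optional[bool]) -> Determination:
    """Hypothetical helper: map investigation evidence to a classification.

    docs_support_test:
      True  -> docs or API design support what the failing test expects
      False -> docs or API design contradict what the failing test expects
      None  -> intended behavior is undocumented or unclear
    """
    if docs_support_test is None:
        return Determination.AMBIGUOUS
    if docs_support_test:
        # The expectation is justified and the test fails: the code is at fault.
        return Determination.IMPLEMENTATION_BUG
    # The expectation is not justified: the test is at fault.
    return Determination.TEST_BUG


print(classify(None).value)
```

A real investigation rarely reduces to one flag. The point is only that "Ambiguous" is the outcome when evidence is missing, not a default.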

5. Document Reasoning

Provide a clear explanation, including:

Evidence supporting the conclusion
Specific mismatch between expectation and reality
Recommended fix (to test or implementation)

Example Analyses

Example 1: Ambiguous Behavior

Scenario: Test expects calculateDiscount(100, 0.2) to return 20, but it returns 80

Analysis:

Test assumes function returns discount amount
Implementation returns price after discount
Function name is ambiguous

Determination: Ambiguous
Recommendation: Check documentation or clarify intended behavior
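
The ambiguity in Example 1 can be made concrete with two Python functions, one per reading of the name. This is a sketch only: `discount_amount` and `price_after_discount` are hypothetical names chosen to show how a rename removes the ambiguity, and neither is claimed to be the intended implementation.

```python
def discount_amount(price: float, rate: float) -> float:
    """Reading 1 (what the test assumes): the amount taken off the price."""
    return price * rate


def price_after_discount(price: float, rate: float) -> float:
    """Reading 2 (what the implementation does): the price after the discount."""
    return price * (1 - rate)


# Same inputs, two defensible answers. Only documentation or the
# function's callers can say which one was intended.
print(discount_amount(100, 0.2))
print(price_after_discount(100, 0.2))
```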

Example 2: Implementation Bug

Scenario: Test expects validateEmail("user@example.com") to return true, but it returns false

Analysis:

Test provides a valid email format
Implementation regex is missing support for dots in domain
Other valid emails also fail

Determination: Implementation Bug
Recommendation: Fix the regex to accept dots in the domain part, following the address syntax in the relevant RFC (for example, RFC 5322)
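
A Python sketch of what the bug in Example 2 might look like. Both patterns are hypothetical and written for illustration. The "fixed" pattern is a deliberate simplification and does not implement the full RFC address grammar.

```python
import re

# Hypothetical buggy pattern: the domain part allows no ".",
# so "example.com" can never match.
BUGGY = re.compile(r"[\w.+-]+@[A-Za-z0-9-]+")

# Simplified fix: require one or more dot-separated labels after the first.
FIXED = re.compile(r"[\w.+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def validate_email(address: str, pattern=FIXED) -> bool:
    """Return True when the whole string matches the given pattern."""
    return pattern.fullmatch(address) is not None


print(validate_email("user@example.com", BUGGY))
print(validate_email("user@example.com", FIXED))
```

Checking a second valid address such as `first.last@mail.example.org` against both patterns mirrors the "other valid emails also fail" evidence in the analysis.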

Example 3: Test Bug

Scenario: Test expects divide(10, 0) to return 0, but it throws an error

Analysis:

Test assumes division by zero returns 0
Implementation throws DivisionByZeroError
Division by zero is mathematically undefined, so raising an error is the standard behavior

Determination: Test Bug
Recommendation: Update test to expect an error, not 0
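
In Python, the corrected test for Example 3 might look like the sketch below. It assumes `divide` is a thin wrapper over `/`, in which case the built-in exception raised is `ZeroDivisionError`, not the `DivisionByZeroError` named in the example. The test uses plain `try`/`except` to stay dependency-free. With pytest, the same check is usually written as `with pytest.raises(ZeroDivisionError):`.

```python
def divide(a: float, b: float) -> float:
    # Assumed implementation: Python raises ZeroDivisionError when b == 0.
    return a / b


def test_divide_by_zero_raises() -> None:
    """Corrected test: expect an error instead of a return value of 0."""
    try:
        divide(10, 0)
    except ZeroDivisionError:
        return  # expected path
    raise AssertionError("divide(10, 0) should raise ZeroDivisionError")


test_divide_by_zero_raises()
```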

Output Format

For each failing test, provide:

Test: [test name/description]
Failure: [what failed and how]

Investigation:
- Test expects: [expected behavior]
- Implementation does: [actual behavior]
- Root cause: [why they differ]

Determination: [Test Bug | Implementation Bug | Ambiguous]

Recommendation:
[Specific fix to either test or implementation]
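
If reports are generated programmatically, the template above could be filled from a small data structure. The following is a minimal sketch. `FailureReport` and its field names are hypothetical and not defined by this skill.

```python
from dataclasses import dataclass


@dataclass
class FailureReport:
    test: str
    failure: str
    expects: str
    actual: str
    root_cause: str
    determination: str  # "Test Bug", "Implementation Bug", or "Ambiguous"
    recommendation: str

    def render(self) -> str:
        """Format the report using the layout of the template."""
        return (
            f"Test: {self.test}\n"
            f"Failure: {self.failure}\n"
            "\n"
            "Investigation:\n"
            f"- Test expects: {self.expects}\n"
            f"- Implementation does: {self.actual}\n"
            f"- Root cause: {self.root_cause}\n"
            "\n"
            f"Determination: {self.determination}\n"
            "\n"
            "Recommendation:\n"
            f"{self.recommendation}"
        )


report = FailureReport(
    test="test_divide_by_zero",
    failure="divide(10, 0) raised an error instead of returning 0",
    expects="divide(10, 0) returns 0",
    actual="raises an error on division by zero",
    root_cause="test assumes division by zero yields 0",
    determination="Test Bug",
    recommendation="Update test to expect an error, not 0",
)
print(report.render())
```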

Key Principles

NEVER automatically assume the test is wrong
ALWAYS consider that the test might have found a real bug
When uncertain, lean toward investigating the implementation
Tests are often your specification - they define expected behavior
A failing test is a gift - it's either catching a bug or clarifying requirements

Related Skills

test-failure-mindset: Set investigative approach for session
comprehensive-test-review: Full test suite review