Skill

ring:dev-chaos-testing

Verifies chaos tests using Toxiproxy exist for Go and TypeScript services with external dependencies, ensuring graceful handling of connection loss, latency, and partitions after integration testing.

Typescript

testing

devops

npx claudepluginhub lerianstudio/ring --plugin ring-dev-team

Tool Access

This skill uses the workspace's default tool permissions.

Stats

Parent Repo Stars: 169

Parent Repo Forks: 18

Last Commit: Apr 12, 2026

Dev Chaos Testing (Gate 7)

Overview

Ensure code handles failure conditions gracefully by injecting faults using Toxiproxy. Verify connection loss, latency, and network partitions don't cause crashes.

Core principle: All infrastructure fails. Chaos testing ensures your code handles it gracefully.

<block_condition>

No chaos tests = FAIL
Any dependency without failure test = FAIL
Recovery not verified = FAIL
System crashes on failure = FAIL

</block_condition>

CRITICAL: Role Clarification

This skill ORCHESTRATES. QA Analyst Agent (chaos mode) EXECUTES.

| Who | Responsibility |
|-----|----------------|
| This Skill | Gather requirements, dispatch agent, track iterations |
| QA Analyst Agent | Write chaos tests, set up Toxiproxy, verify recovery |

Standards Reference

MANDATORY: Load testing-chaos.md standards via WebFetch.

<fetch_required> https://raw.githubusercontent.com/LerianStudio/ring/main/dev-team/docs/standards/golang/testing-chaos.md </fetch_required>

Step 0: Detect External Dependencies (Auto-Detection)

MANDATORY: When external_dependencies is empty or not provided, scan the codebase to detect them automatically before validation.

if external_dependencies is empty or not provided:

  detected_dependencies = []

  1. Scan docker-compose.yml / docker-compose.yaml for service images:
     - Grep tool: pattern "postgres" in docker-compose* files → add "postgres"
     - Grep tool: pattern "mongo" in docker-compose* files → add "mongodb"
     - Grep tool: pattern "valkey" in docker-compose* files → add "valkey"
     - Grep tool: pattern "redis" in docker-compose* files → add "redis"
     - Grep tool: pattern "rabbitmq" in docker-compose* files → add "rabbitmq"

  2. Scan dependency manifests:
     if language == "go":
       - Grep tool: pattern "github.com/lib/pq" in go.mod → add "postgres"
       - Grep tool: pattern "github.com/jackc/pgx" in go.mod → add "postgres"
       - Grep tool: pattern "go.mongodb.org/mongo-driver" in go.mod → add "mongodb"
       - Grep tool: pattern "github.com/redis/go-redis" in go.mod → add "redis"
       - Grep tool: pattern "github.com/valkey-io/valkey-go" in go.mod → add "valkey"
       - Grep tool: pattern "github.com/rabbitmq/amqp091-go" in go.mod → add "rabbitmq"

     if language == "typescript":
       - Grep tool: pattern "\"pg\"" in package.json → add "postgres"
       - Grep tool: pattern "@prisma/client" in package.json → add "postgres"
       - Grep tool: pattern "\"mongodb\"" in package.json → add "mongodb"
       - Grep tool: pattern "\"mongoose\"" in package.json → add "mongodb"
       - Grep tool: pattern "\"redis\"" in package.json → add "redis"
       - Grep tool: pattern "\"ioredis\"" in package.json → add "redis"
       - Grep tool: pattern "@valkey" in package.json → add "valkey"
       - Grep tool: pattern "\"amqplib\"" in package.json → add "rabbitmq"
       - Grep tool: pattern "amqp-connection-manager" in package.json → add "rabbitmq"

  3. Deduplicate detected_dependencies
  4. Set external_dependencies = detected_dependencies

  Log: "Auto-detected external dependencies: [detected_dependencies]"

<auto_detect_reason> PM team task files often omit external_dependencies. If the codebase uses postgres, mongodb, valkey, redis, or rabbitmq, these are external dependencies that MUST have chaos tests. Auto-detection prevents silent skips. </auto_detect_reason>
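
The scan in Step 0 amounts to substring matching followed by deduplication. A minimal Go sketch of that logic, assuming the file contents have already been read into strings and covering only the docker-compose and go.mod rules (the `rule` type and `detectDependencies` function are hypothetical names; the package.json rules would follow the same shape):

```go
package main

import (
	"sort"
	"strings"
)

// rule maps a substring to the dependency it indicates (hypothetical type).
type rule struct {
	pattern    string
	dependency string
}

// Patterns copied from the docker-compose list in Step 0.
var composeRules = []rule{
	{"postgres", "postgres"},
	{"mongo", "mongodb"},
	{"valkey", "valkey"},
	{"redis", "redis"},
	{"rabbitmq", "rabbitmq"},
}

// Patterns copied from the go.mod list in Step 0.
var goModRules = []rule{
	{"github.com/lib/pq", "postgres"},
	{"github.com/jackc/pgx", "postgres"},
	{"go.mongodb.org/mongo-driver", "mongodb"},
	{"github.com/redis/go-redis", "redis"},
	{"github.com/valkey-io/valkey-go", "valkey"},
	{"github.com/rabbitmq/amqp091-go", "rabbitmq"},
}

// detectDependencies applies both rule sets and returns a sorted,
// deduplicated list of dependency names.
func detectDependencies(composeFile, goMod string) []string {
	seen := map[string]bool{}
	apply := func(content string, rules []rule) {
		for _, r := range rules {
			if strings.Contains(content, r.pattern) {
				seen[r.dependency] = true
			}
		}
	}
	apply(composeFile, composeRules)
	apply(goMod, goModRules)

	deps := make([]string, 0, len(seen))
	for d := range seen {
		deps = append(deps, d)
	}
	sort.Strings(deps) // stable order for logging and comparison
	return deps
}
```

Using a set keyed by dependency name makes the deduplication step implicit: two drivers for the same database (for example `lib/pq` and `pgx`) still yield a single "postgres" entry.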

Step 1: Validate Input

REQUIRED INPUT:
- unit_id: [task/subtask being tested]
- external_dependencies: [postgres, mongodb, valkey, redis, rabbitmq, etc.] (from input OR auto-detected in Step 0)
- language: [go|typescript]

OPTIONAL INPUT:
- gate6_handoff: [full Gate 6 output]

if any REQUIRED input is missing:
  → STOP and report: "Missing required input: [field]"

if external_dependencies is empty (AFTER auto-detection in Step 0):
  → STOP and report: "No external dependencies found after codebase scan - chaos testing requires dependencies"
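
The validation rules above can be sketched as a single function that returns the first problem found. This is an illustration only: the `gateInput` struct and `validateInput` function are hypothetical, and the rejection of languages other than go or typescript is an assumption drawn from the `[go|typescript]` hint.

```go
package main

import (
	"errors"
	"fmt"
)

// gateInput mirrors the Step 1 input fields (hypothetical struct).
type gateInput struct {
	UnitID               string
	ExternalDependencies []string // from input or Step 0 auto-detection
	Language             string
	Gate6Handoff         string // optional
}

// validateInput returns an error carrying the message the skill should
// report before stopping, or nil when the input is usable.
// Messages are capitalized to match the wording in Step 1.
func validateInput(in gateInput) error {
	if in.UnitID == "" {
		return fmt.Errorf("Missing required input: %s", "unit_id")
	}
	if in.Language == "" {
		return fmt.Errorf("Missing required input: %s", "language")
	}
	// Assumption: only the two languages named in Step 1 are accepted.
	if in.Language != "go" && in.Language != "typescript" {
		return fmt.Errorf("Unsupported language: %q", in.Language)
	}
	if len(in.ExternalDependencies) == 0 {
		return errors.New("No external dependencies found after codebase scan - chaos testing requires dependencies")
	}
	return nil
}
```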

Step 2: Dispatch QA Analyst Agent (Chaos Mode)

Task tool:
  subagent_type: "ring:qa-analyst"
  prompt: |
    **MODE:** CHAOS TESTING (Gate 7)

    **Standards:** Load testing-chaos.md

    **Input:**
    - Unit ID: {unit_id}
    - External Dependencies: {external_dependencies}
    - Language: {language}

    **Requirements:**
    1. Set up Toxiproxy infrastructure in tests/utils/chaos/
    2. Create chaos tests (TestIntegration_Chaos_{Component}_{Scenario} naming)
    3. Use dual-gate pattern (CHAOS=1 env + testing.Short())
    4. Test failure scenarios: Connection Loss, High Latency, Network Partition
    5. Verify 5-phase structure: Normal → Inject → Verify → Restore → Recovery

    **Output Sections Required:**
    - ## Chaos Testing Summary
    - ## Failure Scenarios
    - ## Handoff to Next Gate
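
To make requirements 3 and 5 concrete, here is a minimal Go sketch of the dual-gate check and the five-phase flow. It is not the standard's implementation: it assumes the dual gate means "run only when CHAOS=1 and short mode is off", and it replaces Toxiproxy with an in-memory `fakeDependency` so the control flow can be read in isolation. All names are hypothetical.

```go
package main

import "errors"

// chaosEnabled implements the assumed dual gate: the CHAOS variable must
// be "1" and the test run must not be in short mode.
func chaosEnabled(chaosEnv string, short bool) bool {
	return chaosEnv == "1" && !short
}

var errUnavailable = errors.New("dependency unavailable")

// fakeDependency stands in for a service reached through a proxy.
type fakeDependency struct{ down bool }

// Query fails with an error while the fault is active.
func (d *fakeDependency) Query() error {
	if d.down {
		return errUnavailable
	}
	return nil
}

// runFivePhases walks Normal → Inject → Verify → Restore → Recovery.
// It returns the name of the first phase that failed, or "" on success.
func runFivePhases(call func() error, inject, restore func()) string {
	// Phase 1, Normal: the call must succeed before any fault.
	if call() != nil {
		return "Normal"
	}
	// Phase 2, Inject: activate the fault.
	inject()
	// Phase 3, Verify: the call must fail with an error, not succeed silently.
	if call() == nil {
		restore()
		return "Verify"
	}
	// Phase 4, Restore: remove the fault.
	restore()
	// Phase 5, Recovery: the call must succeed again.
	if call() != nil {
		return "Recovery"
	}
	return ""
}
```

In a real Go test the gate would read `os.Getenv("CHAOS")` and `testing.Short()` and call `t.Skip` when it is closed, and `inject` and `restore` would add and remove a fault on the Toxiproxy proxy in front of the dependency.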

Step 3: Evaluate Results

Parse agent output:

if "Status: PASS" in output:
  → Gate 7 PASSED
  → Return success with metrics

if "Status: FAIL" in output:
  → Dispatch fix to implementation agent
  → Re-run chaos tests (max 3 iterations)
  → If still failing: ESCALATE to user
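
The evaluation loop can be sketched as follows. Two assumptions are labeled in the code: "max 3 iterations" is read as three chaos runs in total, and asterisks are stripped before matching because the Step 4 template renders the status line as `**Status:** PASS`, which a literal search for "Status: PASS" would miss. The `runChaos` and `dispatchFix` callbacks are hypothetical stand-ins for the agent dispatches.

```go
package main

import "strings"

// Assumption: the limit counts total chaos runs, including the first.
const maxIterations = 3

// gateOutcome summarizes how Gate 7 ended (hypothetical struct).
type gateOutcome struct {
	Passed     bool
	Iterations int
	Escalate   bool
}

// evaluateGate runs chaos tests, dispatches a fix after each failure,
// and escalates when the final allowed run still fails.
func evaluateGate(runChaos func() string, dispatchFix func()) gateOutcome {
	for i := 1; i <= maxIterations; i++ {
		// Strip markdown bold so "**Status:** PASS" also matches.
		output := strings.ReplaceAll(runChaos(), "*", "")
		if strings.Contains(output, "Status: PASS") {
			return gateOutcome{Passed: true, Iterations: i}
		}
		// Anything other than PASS is treated as a failure.
		if i < maxIterations {
			dispatchFix()
		}
	}
	return gateOutcome{Iterations: maxIterations, Escalate: true}
}
```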

Step 4: Generate Output

## Chaos Testing Summary
**Status:** {PASS|FAIL}
**Dependencies Tested:** {count}
**Scenarios Tested:** {count}
**Recovery Verified:** {Yes|No}

## Failure Scenarios
| Component | Scenario | Status | Recovery |
|-----------|----------|--------|----------|
| {component} | {scenario} | {PASS|FAIL} | {Yes|No} |

## Handoff to Next Gate
- Ready for Gate 8 (Code Review): {YES|NO}
- Iterations: {count}

Failure Scenarios by Dependency

| Dependency | Required Scenarios |
|------------|--------------------|
| PostgreSQL | Connection Loss, High Latency, Network Partition |
| MongoDB | Connection Loss, High Latency, Network Partition |
| Valkey | Connection Loss, High Latency, Timeout |
| Redis | Connection Loss, High Latency, Timeout |
| RabbitMQ | Connection Loss, Network Partition, Slow Consumer |
| HTTP APIs | Timeout, 5xx Errors, Connection Refused |
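
The table above can be encoded as a lookup so that missing scenarios are reported mechanically. A sketch, assuming dependency keys use the lowercase names from external_dependencies (the `http` key and the `missingScenarios` function are hypothetical):

```go
package main

// requiredScenarios encodes the table of required scenarios per dependency.
var requiredScenarios = map[string][]string{
	"postgres": {"Connection Loss", "High Latency", "Network Partition"},
	"mongodb":  {"Connection Loss", "High Latency", "Network Partition"},
	"valkey":   {"Connection Loss", "High Latency", "Timeout"},
	"redis":    {"Connection Loss", "High Latency", "Timeout"},
	"rabbitmq": {"Connection Loss", "Network Partition", "Slow Consumer"},
	"http":     {"Timeout", "5xx Errors", "Connection Refused"},
}

// missingScenarios returns the required scenarios for a dependency that
// do not appear in the tested list, in table order.
func missingScenarios(dependency string, tested []string) []string {
	done := map[string]bool{}
	for _, s := range tested {
		done[s] = true
	}
	var missing []string
	for _, s := range requiredScenarios[dependency] {
		if !done[s] {
			missing = append(missing, s)
		}
	}
	return missing
}
```

For example, a RabbitMQ suite that only covers "Connection Loss" would be reported as missing "Network Partition" and "Slow Consumer".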

Severity Calibration

| Severity | Criteria | Examples |
|----------|----------|----------|
| CRITICAL | System crashes on failure, data loss | Panic on connection loss, corrupted state on partition |
| HIGH | No recovery, missing dependency tests | System doesn't recover after failure, untested dependency |
| MEDIUM | Partial recovery, missing scenarios | Recovery takes too long, missing latency test |
| LOW | Cleanup issues, documentation | Test artifacts not cleaned, missing chaos docs |

Report all severities. CRITICAL = immediate fix (production risk). HIGH = fix before gate pass. MEDIUM = fix in iteration. LOW = document.
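
The severity-to-action mapping can be sketched as a lookup plus a gate check. The reading that only CRITICAL and HIGH findings block the gate is an assumption ("fix in iteration" for MEDIUM is ambiguous), and the `finding` type and both function names are hypothetical.

```go
package main

// finding is one reported issue (hypothetical struct).
type finding struct {
	Severity    string
	Description string
}

// actions copies the required action per severity from the text above.
var actions = map[string]string{
	"CRITICAL": "immediate fix (production risk)",
	"HIGH":     "fix before gate pass",
	"MEDIUM":   "fix in iteration",
	"LOW":      "document",
}

// requiredAction returns the action for a severity, or a marker for
// values outside the four defined levels.
func requiredAction(severity string) string {
	if a, ok := actions[severity]; ok {
		return a
	}
	return "unknown severity"
}

// blocksGate reports whether any finding prevents the gate from passing.
// Assumption: only CRITICAL and HIGH findings block.
func blocksGate(findings []finding) bool {
	for _, f := range findings {
		if f.Severity == "CRITICAL" || f.Severity == "HIGH" {
			return true
		}
	}
	return false
}
```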

Anti-Rationalization Table

| Rationalization | Why It's WRONG | Required Action |
|-----------------|----------------|-----------------|
| "Infrastructure is reliable" | AWS, GCP, and Azure all have outages. Your code must handle them. | Write chaos tests |
| "Integration tests cover failures" | Integration tests verify the happy path. Chaos tests verify failure handling. | Write chaos tests |
| "Toxiproxy is complex" | One container. 20 minutes of setup. Prevents production incidents. | Write chaos tests |
| "We have monitoring" | Monitoring detects problems. Chaos testing prevents them. | Write chaos tests |
| "Circuit breakers handle it" | Circuit breakers need testing too. Chaos tests verify they work. | Write chaos tests |