Plugin

stress-test

Name: stress-test
Author: gbasin

Adversarially stress-test technical plans by verifying claims against real documentation, running proof-of-concept code in .poc-stress-test/, and iteratively updating the plan to catch issues before building.

testing

developer-tools

npx claudepluginhub gbasin/stress-test-skill --plugin stress-test

Component Overview

Skills

Component Details

Skills (1)

stress-test

/stress-test

Adversarially stress-test a technical plan by verifying claims against real docs, running POC code, and updating the plan before you build.

README

stress-test-skill

An agent skill that stress-tests technical plans before you build them.

Models are lazy about verification. They'll write a plan that says "use SQLite for concurrent writes" or "Y.js supports persistence out of the box" and move on without checking. These unchecked assumptions become mid-build surprises that force architectural pivots, messy workarounds, and wasted context.

This skill forces the model to actually verify its claims: it searches real docs, ranks evidence quality, runs proof-of-concept code when search is not enough, and fixes the plan before implementation starts. Each verification runs in a fresh sub-agent context, which reduces confirmation bias carried over from the planning conversation. The result is fewer hidden assumptions, less mid-build churn, and a clearer line between what's confirmed and what's still a risk.

In action

A plan claimed bash + sqlite3 would be fast enough for git hooks. The skill spun up parallel agents to research alternatives and run an actual latency POC:

POC running

The POC disproved the assumption — bash was 4-5x slower than estimated — and surfaced the real tradeoffs across runtimes:

POC results
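For a sense of what a latency POC of this kind might look like, here is a minimal sketch. It is not the code the skill generated: the table name, the snippets, and the helper names (median_ms, run_poc) are all hypothetical, and it assumes the hook's work is a single row insert. It times a full process launch per run, since that is what a git hook pays for each invocation.

```python
import shutil
import sqlite3
import statistics
import subprocess
import sys
import tempfile
import time
from pathlib import Path

# Hypothetical hook body: append one row to a SQLite table.
INSERT_SQL = "INSERT INTO events VALUES ('commit')"
PY_SNIPPET = (
    "import sqlite3, sys; c = sqlite3.connect(sys.argv[1]); "
    f'c.execute("{INSERT_SQL}"); c.commit(); c.close()'
)
BASH_SNIPPET = f'sqlite3 "$1" "{INSERT_SQL}"'


def median_ms(cmd, runs=10):
    """Run cmd `runs` times; return the median wall-clock time in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, capture_output=True)
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)


def run_poc(runs=10):
    """Time each available runtime against a throwaway database."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        db = str(Path(tmp) / "hook.db")
        conn = sqlite3.connect(db)
        conn.execute("CREATE TABLE events (msg TEXT)")
        conn.commit()
        conn.close()

        # Includes interpreter startup, which a real hook would also pay.
        results["python"] = median_ms([sys.executable, "-c", PY_SNIPPET, db], runs)
        # Only measured when both binaries are on PATH.
        if shutil.which("bash") and shutil.which("sqlite3"):
            results["bash+sqlite3"] = median_ms(
                ["bash", "-c", BASH_SNIPPET, "_", db], runs
            )
    return results


if __name__ == "__main__":
    for name, ms in run_poc().items():
        print(f"{name}: {ms:.1f} ms median")
```

The absolute numbers depend on the machine and filesystem, so a sketch like this is only useful for comparing runtimes side by side in the environment where the hook will actually run.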

Install

npx skills add gbasin/stress-test-skill --all -g

Works with Claude Code, Codex, Cursor, Gemini CLI, GitHub Copilot, Windsurf, and other supported agents.

How it works

Six phases, each building on the last:

Decompose — Extracts decisions, assumptions, dependencies, interfaces, invariants, recovery paths, and observability gaps. Activates the relevant risk lenses for the plan.
Verify — Launches parallel sub-agents to search docs, repos, and the web for evidence, while ranking source quality and tracking contradictions instead of smoothing them over.
Triage — Separates what's resolved by evidence from what needs hands-on testing. Drafts minimal POC specs for unresolved items.
Approve — Presents proposed POCs and lets you choose which to run, skip, or modify. Any runtime validation requires approval first.
Test — Runs approved POCs in parallel in an isolated .poc-stress-test/ directory using the smallest representative setup in the most production-like environment available.
Update — Buckets results into Confirmed, Unresolved, and Accepted Risks; walks through plan-changing findings individually; applies approved updates inline; and cleans up after itself.
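The Triage and Update phases above can be pictured as a small data model. The sketch below is an illustration under stated assumptions, not the skill's actual implementation: the class names, the source-quality ranks, and the rule that only top-ranked, non-contradicting evidence resolves a claim are all hypothetical. Only the three bucket names come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Bucket(Enum):
    CONFIRMED = "Confirmed"
    UNRESOLVED = "Unresolved"
    ACCEPTED_RISK = "Accepted Risks"


# Hypothetical source-quality ranking: higher means more trustworthy.
SOURCE_RANK = {"official_docs": 3, "source_code": 3, "issue_tracker": 2, "blog_post": 1}


@dataclass
class Evidence:
    source_type: str
    supports: bool  # True if the source backs the claim, False if it contradicts it


@dataclass
class Claim:
    text: str
    evidence: List[Evidence] = field(default_factory=list)
    poc_passed: Optional[bool] = None  # None means no POC was run
    risk_accepted: bool = False


def strong_evidence(claim, min_rank=3):
    """Keep only evidence from sources at or above the assumed quality bar."""
    return [e for e in claim.evidence if SOURCE_RANK.get(e.source_type, 0) >= min_rank]


def needs_poc(claim):
    """Triage: a claim needs hands-on testing when strong evidence is
    missing or when strong sources contradict each other."""
    strong = strong_evidence(claim)
    if not strong:
        return True
    return len({e.supports for e in strong}) > 1


def bucket(claim):
    """Update: sort a claim into one of the three result buckets."""
    if claim.poc_passed is True:
        return Bucket.CONFIRMED
    strong = strong_evidence(claim)
    if claim.poc_passed is None and not needs_poc(claim) and all(e.supports for e in strong):
        return Bucket.CONFIRMED
    if claim.risk_accepted:
        return Bucket.ACCEPTED_RISK
    # Assumption: refuted or untested claims stay here until the plan changes.
    return Bucket.UNRESOLVED
```

For example, a claim backed only by a blog post would be flagged by needs_poc and stay Unresolved until a POC passes or the risk is explicitly accepted.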

When to use it

After writing a technical plan or architecture doc, before you start building
When evaluating a new library, framework, or integration approach
Before committing to decisions that are expensive to reverse
Anytime a plan has claims you haven't personally verified

License

MIT

Similar Plugins

plan-review

Reviews Plan files in parallel with three AIs: Codex, Gemini, and Claude. Analyzes the implementation plan for soundness, gaps, and risks.

2mo

v1.0.0

Stats

Version: 1.1.0

Stars: 39

Forks: 1

Maintenance: Excellent

License: MIT

Added: Mar 16, 2026

Available In

stress-test-skill (39)

Similar Plugins

plan-review

deep-plan

axiom-planning

beagle-testing

beast-forge

code-foundations