From antigravity-awesome-skills
Enforces A/B test setup with gates for hypothesis locking, metrics definition, sample size calculation, assumptions checks, and execution readiness before implementation.
npx claudepluginhub mit-network/antigravity-awesome-skillsThis skill uses the workspace's default tool permissions.
Ensure every A/B test is **valid, rigorous, and safe** before a single line of code is written.
Guides A/B test setup with mandatory gates for hypothesis validation, metrics definition, sample size calculation, and execution readiness checks.
Guides planning, design, and implementation of A/B tests and experiments using hypothesis frameworks, sample size calculators, test types, and metric selection for statistical validity.
Produces A/B test experiment specs with hypothesis, primary metric, MDE, sample size, run time, and decision rules. Determines when not to test and suggests alternatives.
Share bugs, ideas, or general feedback.
Ensure every A/B test is valid, rigorous, and safe before a single line of code is written.
You must have:
A valid hypothesis includes:
Before designing variants or metrics, you MUST:
Ask explicitly:
“Is this the final hypothesis we are committing to for this test?”
Do NOT proceed until confirmed.
Explicitly list assumptions about:
If assumptions are weak or violated:
Choose the simplest valid test:
Default to A/B unless there is a clear reason otherwise.
Define upfront:
Estimate:
Do NOT proceed without a realistic sample size estimate.
You may proceed to implementation only if all are true:
If any item is missing, stop and resolve it.
DO:
DO NOT:
When interpreting results:
| Result | Action |
|---|---|
| Significant positive | Consider rollout |
| Significant negative | Reject variant, document learning |
| Inconclusive | Consider more traffic or bolder change |
| Guardrail failure | Do not ship, even if primary wins |
Document:
Store records in a shared, searchable location to avoid repeated failures.
Refuse to proceed if:
Explain why and recommend next steps.
A/B testing is not about proving ideas right. It is about learning the truth with confidence.
If you feel tempted to rush, simplify, or “just try it” — that is the signal to slow down and re-check the design.
This skill is applicable to execute the workflow or actions described in the overview.