From product-skills
Designs hypothesis-driven A/B tests and experiments, including hypothesis templates, primary/guardrail metrics, sample size calculations, duration planning, and common pitfalls to avoid.
```
npx claudepluginhub assimovt/productskills --plugin product-skills
```

This skill uses the workspace's default tool permissions.
Design experiments that actually prove something. Most A/B tests fail because they test vague ideas, run too short, or peek at results. A well-designed experiment has a clear hypothesis, adequate power, and a pre-committed analysis plan.
Invoke it via the /design-experiment command or phrases like "design experiment".
Every experiment starts with a written hypothesis before any work begins:
"If we [make this specific change] for [this audience], then [this metric] will [change in this direction] by [this amount], because [this reason based on evidence]."
Example:
"If we replace the 5-step onboarding wizard with a single guided first-project flow for new signups, then 7-day activation rate will increase from 23% to 35%, because 4/6 interviewed users said they wanted to 'just start using it' not 'set everything up first.'"
Every part of the plan matters:
Primary metric: one metric the experiment is designed to move. Not three. One. Additional metrics are guardrails.
Guardrail metrics: metrics that must NOT degrade. These prevent "winning" by breaking something else.
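As a rough sketch (the metric names and margins below are hypothetical, not part of this skill), a guardrail check can be as simple as comparing each guardrail metric against an allowed degradation margin:

```python
# Hypothetical guardrails: each metric maps to the relative
# degradation tolerated before the test is flagged.
GUARDRAILS = {
    "p95_page_load_ms": 0.05,   # at most 5% slower
    "unsubscribe_rate": 0.10,   # at most 10% higher
}

def guardrails_ok(control: dict, treatment: dict) -> bool:
    """Return False if any guardrail degrades past its margin."""
    for metric, margin in GUARDRAILS.items():
        if treatment[metric] > control[metric] * (1 + margin):
            return False
    return True
```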
Sample size: calculate BEFORE running. Use a sample size calculator with your baseline conversion rate, the minimum detectable effect (MDE), the significance level (typically 5%), and the statistical power (typically 80%).
If you need 50,000 users and you get 500/week, the experiment will take 100 weeks. Either increase the MDE or don't run the experiment.
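To make the calculation concrete, here is a minimal sketch using the standard normal-approximation formula for a two-proportion test, applied to the 23% to 35% activation example above (scipy assumed available; a dedicated calculator may differ slightly due to continuity corrections):

```python
from math import ceil
from scipy.stats import norm

def sample_size_per_arm(p_baseline, p_target, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided two-proportion z-test."""
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for significance
    z_power = norm.ppf(power)           # critical value for power
    variance = p_baseline * (1 - p_baseline) + p_target * (1 - p_target)
    return ceil((z_alpha + z_power) ** 2 * variance
                / (p_target - p_baseline) ** 2)

print(sample_size_per_arm(0.23, 0.35))  # ≈ 221 users per arm
```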
Duration: run for at least one full business cycle (usually 1-2 weeks minimum) to capture day-of-week effects. NEVER run for less than 7 days.
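Both duration rules are easy to encode. A sketch, with traffic figures that are placeholders for your own funnel:

```python
from math import ceil

def run_length_days(total_sample_needed: int, daily_eligible_users: int) -> int:
    """Enough days for the sample, never under 7, rounded up
    to whole weeks to balance day-of-week effects."""
    days = max(ceil(total_sample_needed / daily_eligible_users), 7)
    return ceil(days / 7) * 7

print(run_length_days(2 * 221, 50))  # 442 users at 50/day -> 14 days
```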
Decision criteria: write BEFORE launching which metric, which threshold, and what you will do if the test wins, loses, or is inconclusive. Pre-commit to avoid post-hoc storytelling.
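The plan can literally be a checked-in artifact that a script applies at the end of the run. A sketch, assuming statsmodels and made-up thresholds:

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical plan, written down BEFORE launch.
PLAN = {
    "primary_metric": "7d_activation_rate",
    "alpha": 0.05,
    "min_lift": 0.05,  # ship only if lift >= 5 percentage points
}

def decide(successes_a, n_a, successes_b, n_b):
    """Apply the pre-committed rule (A = control, B = treatment)."""
    _, p_value = proportions_ztest([successes_b, successes_a], [n_b, n_a])
    lift = successes_b / n_b - successes_a / n_a
    if p_value < PLAN["alpha"] and lift >= PLAN["min_lift"]:
        return "ship"
    if p_value < PLAN["alpha"] and lift < 0:
        return "roll back"
    return "inconclusive: keep control, revisit the hypothesis"
```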
Built on the controlled experimentation methodology of Kohavi, Tang, and Xu (*Trustworthy Online Controlled Experiments*). Skills from productskills.