From rampstack-skills
Runs conversion rate optimization through hypothesis-driven testing: audit, hypothesis generation, test design, statistical analysis, and rollout decisions.
npx claudepluginhub rampstackco/claude-skills --plugin rampstack-skills
Slash command: /rampstack-skills:cro-optimization
Run conversion rate optimization as a structured discipline: audit → hypothesize → test → decide. Stack-agnostic. Tool-agnostic.
This skill is for running tests against existing pages and flows. For writing landing page copy from scratch, use landing-page-copy. For setting up the analytics that make CRO possible, use analytics-strategy.
Audit: diagnose before treating.
Quantitative audit:
Qualitative audit:
Heuristic audit:
The audit produces a list of suspected friction points. Each becomes a hypothesis candidate.
Hypothesize: turn each candidate into a testable statement.
Hypothesis structure:
Because [observation from audit], we believe that [change] will produce [predicted outcome] for [user segment], because [reason].
Example:
Because session replays show users abandoning at the shipping step (audit), we believe that adding visible shipping cost to the product page (change) will increase add-to-cart conversion by 5 percent (outcome) for desktop users (segment), because users are surprised by shipping cost and abandon (reason).
Hypothesis quality criteria:
Hypothesis prioritization (ICE: impact, confidence, ease; or PIE: potential, importance, ease):
Score each dimension 1 to 10. Test the hypotheses with the highest combined scores first.
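The scoring step can be sketched in a few lines. This is a minimal sketch assuming ICE with the three scores averaged (some teams multiply them instead); the `Hypothesis` class, its field names, and the backlog entries are hypothetical and only illustrate the ranking.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    impact: int      # 1-10: expected effect on the primary metric
    confidence: int  # 1-10: strength of the audit evidence
    ease: int        # 1-10: how cheap the test is to build and run

    def ice_score(self) -> float:
        # Assumption: combine the three scores with a simple average.
        return (self.impact + self.confidence + self.ease) / 3


def prioritize(backlog: list[Hypothesis]) -> list[Hypothesis]:
    # Highest combined score is tested first.
    return sorted(backlog, key=lambda h: h.ice_score(), reverse=True)


# Hypothetical backlog, for illustration only.
backlog = [
    Hypothesis("Show shipping cost on product page", impact=8, confidence=7, ease=6),
    Hypothesis("Shorten checkout form", impact=6, confidence=5, ease=3),
    Hypothesis("Rewrite hero headline", impact=5, confidence=4, ease=9),
]

for h in prioritize(backlog):
    print(f"{h.ice_score():.1f}  {h.name}")
```

The same structure works for PIE by renaming the three fields.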
Test: design a test that produces an unambiguous answer.
Sample size and duration:
Use a sample size calculator (most A/B tools have one) before launching. Inputs: baseline conversion rate, minimum detectable effect, significance level, and statistical power.
This produces the required sample size per variant. Run the test until that sample is reached and for at least one full business cycle (typically 2 weeks minimum, to cover weekends and weekly patterns), whichever takes longer.
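The arithmetic behind such a calculator can be sketched as follows. This is a minimal sketch assuming a two-sided two-proportion z-test at 95% significance and 80% power; the power value is an assumption, not something this skill specifies. The function names and the `min_days` parameter are illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline: float, relative_lift: float,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Visitors needed per variant to detect the given relative lift."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # assumed power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_reach(sample_per_variant: int, daily_visitors: int,
                  variants: int = 2, min_days: int = 14) -> int:
    """Test duration: days to reach the sample, but never below min_days."""
    days_for_sample = ceil(sample_per_variant * variants / daily_visitors)
    return max(days_for_sample, min_days)


# Hypothetical inputs: 5% baseline, 20% relative lift, 1,000 visitors per day.
n = sample_size_per_variant(baseline=0.05, relative_lift=0.20)
print(f"sample per variant: {n}")
print(f"planned duration: {days_to_reach(n, daily_visitors=1000)} days")
```

Calculators differ in their assumptions (power, one-sided vs. two-sided, variance formula), so treat the output as an order-of-magnitude check rather than an exact target.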
Common test setup mistakes:
Test parameters to define before launch:
Decide: this step happens after the test concludes.
Decision framework:
| Outcome | Decision |
|---|---|
| Variant clearly wins (>95% significance, exceeds minimum effect) | Ship variant. Document. Continue testing. |
| Variant clearly loses | Kill. Capture the lesson. Iterate hypothesis. |
| Inconclusive (neither significant) | Larger test, different angle, or move on. Don't ship "tied" variants. |
| Small lift, lots of variance | Probably not worth shipping. Even a nominal "winner" may not replicate. |
| Wins overall, loses for important segment | Investigate segment. Consider segment-specific solution. |
Anti-patterns:
A 95% significance level means: if there were truly no difference between variants, there's only a 5% chance you'd see results this extreme by chance.
That's not the same as "95% chance the variant wins."
Some CRO tools report Bayesian probabilities ("95% chance of being best") instead, which is a different quantity from a frequentist significance level. Read the methodology your tool uses.
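To make the frequentist reading concrete, here is a minimal sketch of a two-sided two-proportion z-test using only the Python standard library. The function name and the visitor and conversion counts are hypothetical; your testing tool may use a different method.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate: the best estimate if there were truly no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts: control 500/10,000, variant 580/10,000.
p = two_proportion_p_value(500, 10_000, 580, 10_000)
print(f"p-value: {p:.4f}")
print("significant at 95%" if p < 0.05 else "not significant at 95%")
```

The p-value is the probability of a gap at least this large if the variants were truly identical. It is not the probability that the variant is better.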
Conversion testing needs more sample than people intuit. Quick reference:
| Baseline rate | Minimum detectable effect | Sample per variant |
|---|---|---|
| 2% | 10% relative lift | ~75,000 |
| 2% | 20% relative lift | ~19,000 |
| 5% | 10% relative lift | ~30,000 |
| 5% | 20% relative lift | ~7,500 |
| 10% | 10% relative lift | ~14,000 |
| 10% | 20% relative lift | ~3,500 |
(Approximate. Use a calculator.)
If your monthly visitors per variant don't reach these numbers, A/B testing won't produce reliable results. Iterate via design and qualitative research instead.
The more variants and metrics tested simultaneously, the more false positives. Adjust significance thresholds for multiple comparisons (Bonferroni or similar).
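A minimal sketch of the Bonferroni adjustment, assuming each comparison already has a p-value: divide the significance threshold by the number of comparisons. The function name and the p-values below are hypothetical.

```python
def bonferroni_significant(p_values: dict[str, float],
                           alpha: float = 0.05) -> dict[str, bool]:
    """Flag which comparisons stay significant after Bonferroni correction."""
    # Each comparison must clear alpha divided by the number of comparisons.
    threshold = alpha / len(p_values)
    return {name: p < threshold for name, p in p_values.items()}


# Hypothetical p-values for three variants tested against one control.
results = {"variant_b": 0.030, "variant_c": 0.012, "variant_d": 0.400}
print(bonferroni_significant(results))
```

With three comparisons the threshold drops from 0.05 to about 0.0167, so `variant_b` would pass an unadjusted check but fails the adjusted one. Bonferroni is conservative; less strict corrections exist.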
Default output: a markdown test plan at cro-test-[hypothesis-slug].md per test. After the test runs, append the results section.
Structure:
# Test: [Hypothesis short name]
## Hypothesis
Because [observation], we believe that [change] will produce [outcome] for [segment], because [reason].
## Audit evidence
[What evidence supports this hypothesis]
## Test design
- Primary metric:
- Guardrail metrics:
- Sample size required:
- Duration: minimum X, maximum Y
- Variant traffic split:
- Segments to analyze:
## Decision criteria
- Ship if: [conditions]
- Kill if: [conditions]
- Extend if: [conditions]
## Results (filled after test)
- Sample reached:
- Duration actual:
- Primary metric: [variant vs control + significance]
- Guardrail metrics: [results]
- Segment analysis: [findings]
## Decision
[Ship / Kill / Extend / Iterate] - [Why]
## Lesson
[What this teaches us, regardless of outcome]
references/hypothesis-library.md - Common high-impact hypothesis patterns by funnel stage.