Plans, designs, and implements A/B tests with statistical rigor, hypothesis frameworks, and sample size calculations.
Slash command: /gtm-skills:ab-test-setup
Read bootstrap context before asking questions: strategy/brand.md for brand, audience, offer, channels, tools, constraints, and metrics; about/me.md for personal voice; content/ideas.md and content/calendar.md for content planning. Use legacy product-marketing context files only as fallback. Save generated drafts to content/<platform>/drafts/YYYY-MM-DD_short-topic-slug.md, and route durable learnings back to strategy/brand.md, about/me.md, or content/ideas.md.
This skill is self-contained for its frontmatter scope: use its local instructions, references, scripts, and assets as the playbook; ask only for missing task-specific inputs; hand off to adjacent skills instead of expanding scope; and return an actionable artifact, decision, plan, draft, or diagnostic.
You are an expert in experimentation and A/B testing. Your goal is to help design tests that produce statistically valid, actionable results.
Before designing a test, understand:
Because [observation/data],
we believe [change]
will cause [expected outcome]
for [audience].
We'll know this is true when [metrics].
Weak: "Changing the button color might increase clicks."
Strong: "Because users report difficulty finding the CTA (per heatmaps and feedback), we believe making the button larger and using contrasting color will increase CTA clicks by 15%+ for new visitors. We'll measure click-through rate from page view to signup start."
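One way to keep hypotheses consistent across tests is to store the five template slots as structured fields and render the sentence from them. The sketch below is only an illustration: the `Hypothesis` class and its field names are hypothetical, not part of this skill.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Hypothetical container for the five slots of the hypothesis template."""
    observation: str       # the data or observation motivating the test
    change: str            # what will be changed
    expected_outcome: str  # the effect the change should cause
    audience: str          # who the change targets
    metrics: str           # how success will be recognized

    def render(self) -> str:
        # Fill the template in the same order as the framework above.
        return (
            f"Because {self.observation}, "
            f"we believe {self.change} "
            f"will cause {self.expected_outcome} "
            f"for {self.audience}. "
            f"We'll know this is true when {self.metrics}."
        )


hypothesis = Hypothesis(
    observation="users report difficulty finding the CTA",
    change="making the button larger and using a contrasting color",
    expected_outcome="a 15%+ increase in CTA clicks",
    audience="new visitors",
    metrics="click-through rate from page view to signup start rises",
)
print(hypothesis.render())
```

An empty field is easy to spot in this form, which is a quick check that a hypothesis is closer to the strong example than the weak one.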
| Type | Description | Traffic Needed |
|---|---|---|
| A/B | Two versions, single change | Moderate |
| A/B/n | Multiple variants | Higher |
| MVT | Multiple changes in combinations | Very high |
| Split URL | Different URLs for variants | Moderate |
| Baseline Conversion Rate | 10% Relative Lift | 20% Relative Lift | 50% Relative Lift |
|---|---|---|---|
| 1% | 150k/variant | 39k/variant | 6k/variant |
| 3% | 47k/variant | 12k/variant | 2k/variant |
| 5% | 27k/variant | 7k/variant | 1.2k/variant |
| 10% | 12k/variant | 3k/variant | 550/variant |
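Figures like these can be approximated with the standard normal-approximation formula for comparing two proportions. The sketch below assumes a two-sided 5% significance level, 80% power, and lift expressed relative to the baseline. The table does not state its assumptions, so the results land in the same order of magnitude but will not match it exactly. The function name is illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate users needed per variant for a two-sided two-proportion z-test.

    baseline:      control conversion rate, e.g. 0.05 for 5%
    relative_lift: minimum detectable effect relative to baseline, e.g. 0.20 for +20%
    alpha, power:  assumed defaults, not values taken from the table above
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of per-arm Bernoulli variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


for baseline in (0.01, 0.03, 0.05, 0.10):
    row = [sample_size_per_variant(baseline, lift) for lift in (0.10, 0.20, 0.50)]
    print(f"{baseline:.0%}: {row}")
```

The main pattern is the same as in the table: halving the detectable lift roughly quadruples the required sample, and lower baselines need far more traffic.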
Calculators:
For detailed sample size tables and duration calculations: See references/sample-size-guide.md
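As a rough illustration of a duration calculation (a sketch under simple assumptions, not the contents of the referenced guide), the required sample can be divided by the traffic that actually enters the test. The function and parameter names are hypothetical, and the daily traffic is assumed to be steady.

```python
from math import ceil


def estimate_duration_days(sample_per_variant, num_variants, daily_visitors,
                           traffic_share=1.0):
    """Days needed to reach the planned sample, assuming steady daily traffic.

    traffic_share is the fraction of visitors enrolled in the test (1.0 = all).
    """
    total_needed = sample_per_variant * num_variants
    enrolled_per_day = daily_visitors * traffic_share
    return ceil(total_needed / enrolled_per_day)


# Example: 27k per variant, two variants, 3,000 eligible visitors per day.
print(estimate_duration_days(27_000, 2, 3_000))
```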
| Category | Examples |
|---|---|
| Headlines/Copy | Message angle, value prop, specificity, tone |
| Visual Design | Layout, color, images, hierarchy |
| CTA | Button copy, size, placement, number |
| Content | Information included, order, amount, social proof |
| Approach | Split | When to Use |
|---|---|---|
| Standard | 50/50 | Default for A/B |
| Conservative | 90/10, 80/20 | Limit risk of bad variant |
| Ramping | Start small, increase | Technical risk mitigation |
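The splits in the table can be implemented with deterministic hash-based bucketing, so that a returning user always sees the same variant. This is a minimal sketch of one common approach, not a description of any particular testing tool. The function name, the experiment key, and the weight dictionaries are all assumptions made for illustration.

```python
import hashlib


def assign_variant(user_id, experiment, weights):
    """Deterministically map a user to a variant according to traffic weights.

    weights: dict of variant name -> share of traffic, expected to sum to 1.0,
             e.g. {"control": 0.9, "variant": 0.1} for a conservative split.
    """
    # Hash the experiment and user together so buckets differ between experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)

    cumulative = 0.0
    for name, share in weights.items():
        cumulative += share
        if bucket < cumulative:
            return name
    return list(weights)[-1]  # guard against floating-point rounding in the weights


standard = {"control": 0.5, "variant": 0.5}
conservative = {"control": 0.9, "variant": 0.1}
print(assign_variant("user-123", "cta-size", standard))
print(assign_variant("user-123", "cta-size", conservative))
```

A caveat when ramping: in this sketch, raising the variant's share moves some users from control to the variant. They keep a stable assignment afterwards, but their earlier control exposure is part of the data.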
Considerations:
DO:
DON'T:
Checking results before the planned sample size is reached, and stopping as soon as they look significant, inflates the false-positive rate and leads to wrong decisions. Pre-commit to a sample size and wait until it is reached before drawing conclusions.
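The effect of peeking can be seen in a small A/A simulation, where both arms share the same true conversion rate and every "significant" result is therefore a false positive. The sketch below is illustrative only: the conversion rate, sample size, number of looks, and seed are arbitrary assumptions, and a pooled two-proportion z-test stands in for whatever test a real tool uses.

```python
import random
from math import sqrt
from statistics import NormalDist


def z_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or all conversions): nothing to compare
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate_aa(rate=0.10, n=2000, looks=10, sims=300, alpha=0.05, seed=1):
    """Return (false-positive rate with peeking, false-positive rate at the end only)."""
    rng = random.Random(seed)
    checkpoints = [n * (i + 1) // looks for i in range(looks)]
    peek_hits = fixed_hits = 0
    for _sim in range(sims):
        conv_a = conv_b = seen = 0
        any_look_significant = False
        final_significant = False
        for checkpoint in checkpoints:
            for _user in range(checkpoint - seen):
                conv_a += rng.random() < rate  # both arms have the same true rate
                conv_b += rng.random() < rate
            seen = checkpoint
            final_significant = z_test_p_value(conv_a, seen, conv_b, seen) < alpha
            any_look_significant = any_look_significant or final_significant
        peek_hits += any_look_significant  # a peeker would have stopped at some look
        fixed_hits += final_significant    # only the planned final analysis counts
    return peek_hits / sims, fixed_hits / sims


peeking_rate, fixed_rate = simulate_aa()
print(f"false positives with peeking: {peeking_rate:.1%}, without: {fixed_rate:.1%}")
```

Because the final look is one of the peeks, the peeking rate can never be lower than the fixed-horizon rate; with many looks it is typically well above the nominal alpha.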
| Result | Conclusion |
|---|---|
| Significant winner | Implement variant |
| Significant loser | Keep control, learn why |
| No significant difference | Need more traffic or bolder test |
| Mixed signals | Dig deeper, maybe segment |
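The first three rows of the table can be expressed as a simple decision rule on a single primary metric. The sketch below assumes a pooled two-proportion z-test at a 5% significance level and should be run only once the planned sample size is reached. The function name is hypothetical, and "mixed signals" is not covered because it requires comparing several metrics or segments.

```python
from math import sqrt
from statistics import NormalDist


def evaluate_test(conv_control, n_control, conv_variant, n_variant, alpha=0.05):
    """Return (p_value, conclusion) for one primary conversion metric."""
    rate_control = conv_control / n_control
    rate_variant = conv_variant / n_variant
    pooled = (conv_control + conv_variant) / (n_control + n_variant)
    se = sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_variant))
    z = (rate_variant - rate_control) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided test

    if p_value >= alpha:
        conclusion = "No significant difference"
    elif rate_variant > rate_control:
        conclusion = "Significant winner"
    else:
        conclusion = "Significant loser"
    return p_value, conclusion


# Hypothetical counts: 500/10,000 conversions for control, 600/10,000 for the variant.
p_value, conclusion = evaluate_test(500, 10_000, 600, 10_000)
print(f"p = {p_value:.4f}: {conclusion}")
```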
Document every test with:
For templates: See references/test-templates.md
npx claudepluginhub manojbajaj95/claude-gtm-plugin --plugin gtm-skills