Skill

a-b-test-design

Designs A/B tests with hypotheses, variants, metrics, sample size calculations, duration, pitfalls, and best practices. Use it to statistically validate product changes.

testing

design

Invocation

Slash command

/prototyping-testing:a-b-test-design

User invocable

Model invocable

Inline context

Default effort

SKILL.md

42 lines · ~445 tokens

Stats

Parent stars: 13

Parent forks: 2

Maintenance: Fair

Last Commit: Mar 11, 2026

A/B Test Design

You are an expert in designing rigorous A/B experiments that produce actionable results.

What You Do

You design A/B tests with clear hypotheses, controlled variants, appropriate metrics, and statistical rigor.

Test Structure

1. Hypothesis

Structured as: 'If we [change], then [outcome] will [improve/decrease] because [rationale].'

2. Variants

Control (A): current design
Treatment (B): proposed change
Keep changes isolated: test one variable at a time

3. Primary Metric

The single most important measure of success. Must be measurable, relevant, and sensitive to the change.

4. Secondary Metrics

Supporting measures and guardrail metrics to detect unintended consequences.

5. Sample Size

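Based on: baseline conversion rate, minimum detectable effect (MDE), significance level (typically α = 0.05, i.e. 95% confidence), and statistical power (typically 80%). A smaller MDE requires a larger sample: halving the MDE roughly quadruples the required sample size.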
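As a rough illustration of how these four inputs combine, here is a minimal sketch of the standard two-sided, two-proportion sample size approximation. The function name and its parameters are hypothetical, the MDE is assumed to be an absolute difference in conversion rate, and equal traffic per variant is assumed. Treat it as a sanity check rather than a replacement for your experimentation platform's calculator.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per variant for a two-sided two-proportion test.

    baseline_rate: control conversion rate, e.g. 0.10 for 10%
    mde_abs: minimum detectable effect as an absolute difference, e.g. 0.02
    alpha: significance level (0.05 corresponds to 95% confidence)
    power: probability of detecting a true effect of size mde_abs
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    if not (0 < p1 < 1 and 0 < p2 < 1) or mde_abs == 0:
        raise ValueError("rates must lie in (0, 1) and the MDE must be non-zero")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    p_bar = (p1 + p2) / 2                          # pooled rate under the null

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


# Illustrative inputs only: 10% baseline, 2 percentage point MDE.
n = sample_size_per_variant(baseline_rate=0.10, mde_abs=0.02)
print(f"Users needed per variant: {n}")
```

With these illustrative inputs the result is a few thousand users per variant. Relative MDEs (for example "a 10% lift") need to be converted to an absolute difference before calling the function.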

6. Duration

Run until the planned sample size is reached, and run in full weeks to account for weekly cycles. The minimum is typically 1-2 weeks, even if the sample size is reached sooner.
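The duration rule can be sketched as a small calculation. The function and parameter names below are hypothetical, and the sketch assumes traffic is steady and split evenly across variants.

```python
import math


def duration_in_full_weeks(sample_size_per_variant, num_variants,
                           daily_eligible_users, min_weeks=1):
    """Weeks needed to reach the planned sample, rounded up to full weeks.

    daily_eligible_users: users entering the experiment per day, all variants combined
    min_weeks: floor on the duration, so at least one full weekly cycle is covered
    """
    if daily_eligible_users <= 0:
        raise ValueError("daily_eligible_users must be positive")
    total_users = sample_size_per_variant * num_variants
    days_needed = math.ceil(total_users / daily_eligible_users)
    weeks_needed = math.ceil(days_needed / 7)  # round up to whole weeks
    return max(min_weeks, weeks_needed)


# Illustrative inputs only: 4,000 users per variant, 2 variants, 1,000 users per day.
print(duration_in_full_weeks(4000, 2, 1000))  # 8 days of traffic, so 2 full weeks
```

If the result is impractically long, that is a signal to revisit the minimum detectable effect or to reconsider whether an A/B test fits the available traffic.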

Common Pitfalls

Peeking at results before completion
Too many variants at once
Metric not sensitive enough to detect change
Sample size too small
Not accounting for novelty effects
Ignoring segmentation effects
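To see why too many variants at once is a pitfall, the sketch below computes the chance of at least one false positive when several treatments are each compared against control. It assumes the comparisons are independent, which is a simplification, and the function names are hypothetical. The Bonferroni correction shown is one common, conservative remedy; it is not prescribed by this skill.

```python
def family_wise_error_rate(alpha, num_comparisons):
    """Probability of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** num_comparisons


def bonferroni_alpha(alpha, num_comparisons):
    """Per-comparison significance level that keeps the overall rate at or below alpha."""
    return alpha / num_comparisons


# Illustrative: treatment counts from 1 to 5, each compared against control.
for k in (1, 2, 3, 5):
    fwer = family_wise_error_rate(0.05, k)
    print(f"{k} comparisons: false positive risk {fwer:.1%}, "
          f"Bonferroni alpha {bonferroni_alpha(0.05, k):.4f}")
```

A stricter per-comparison alpha also raises the sample size each variant needs, which is another reason to keep the number of variants small.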

When Not to A/B Test

Very low traffic (insufficient sample)
Ethical concerns with withholding improvement
Foundational changes that affect everything
When qualitative insight is more valuable

Best Practices

One hypothesis per test
Document everything before starting
Don't stop early on positive results
Analyze segments after overall results
Share learnings broadly regardless of outcome
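One way to document a test before starting is a small immutable record covering the six parts of Test Structure. The class, its field names, and the example values below are all hypothetical; this is a sketch of the idea, not a format this skill requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the plan should not change once the test starts
class ABTestPlan:
    """Hypothetical pre-launch record of an A/B test design."""
    change: str
    outcome: str
    direction: str        # "improve" or "decrease"
    rationale: str
    control: str
    treatment: str
    primary_metric: str
    sample_size_per_variant: int
    duration_weeks: int
    secondary_metrics: tuple[str, ...] = ()

    def hypothesis(self) -> str:
        """Render the hypothesis in the 'If we ..., then ... because ...' structure."""
        return (f"If we {self.change}, then {self.outcome} will "
                f"{self.direction} because {self.rationale}.")


# Example values are made up for illustration.
plan = ABTestPlan(
    change="shorten the signup form to three fields",
    outcome="signup completion rate",
    direction="improve",
    rationale="fewer fields reduce effort",
    control="current seven-field form",
    treatment="three-field form",
    primary_metric="signup completion rate",
    sample_size_per_variant=4000,
    duration_weeks=2,
    secondary_metrics=("support tickets per signup",),
)
print(plan.hypothesis())
```

Writing the plan down as data makes it easy to review before launch and to compare against what was actually analyzed afterward.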
