From pm-data-analytics
Analyzes A/B test results for statistical significance, sample size validation, confidence intervals, lift, guardrails, and ship/extend/stop recommendations. Handles CSV/Excel data via Python scripts.
npx claudepluginhub phuryn/pm-skills --plugin pm-data-analytics
How this skill is triggered: by the user, by Claude, or both
Slash command
/pm-data-analytics:ab-test-analysis
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Evaluate A/B test results with statistical rigor and translate findings into clear product decisions.
You are analyzing A/B test results for $ARGUMENTS.
If the user provides data files (CSV, Excel, or analytics exports), read and analyze them directly. Generate Python scripts for statistical calculations when needed.
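As a rough illustration of reading an export directly, the sketch below counts users and conversions per variant using only the standard library. The column names `variant` and `converted`, the helper name `summarize_by_variant`, and the inline sample rows are assumptions made for the example; a real export will likely use different headers and would be opened with `open(path, newline="")`.

```python
import csv
import io
from collections import defaultdict


def summarize_by_variant(rows, variant_col="variant", converted_col="converted"):
    """Count users and conversions per variant from an iterable of dict rows.

    Column names are hypothetical defaults; adjust them to the actual export.
    """
    totals = defaultdict(lambda: {"users": 0, "conversions": 0})
    for row in rows:
        group = totals[row[variant_col]]
        group["users"] += 1
        # Assumes the conversion flag is stored as 0 or 1.
        group["conversions"] += int(row[converted_col])
    return dict(totals)


# Inline sample standing in for a CSV export (made-up rows for illustration).
sample = io.StringIO(
    "user_id,variant,converted\n"
    "1,control,0\n"
    "2,control,1\n"
    "3,variant,1\n"
    "4,variant,1\n"
)
counts = summarize_by_variant(csv.DictReader(sample))
print(counts)
```

The per-variant counts are the only inputs the later significance calculations need, so aggregating first keeps the statistical code independent of the file layout.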
Understand the experiment:
Validate the test setup:
Calculate statistical significance:
If the user provides raw data, generate and run a Python script to calculate the significance, confidence interval, and lift figures.
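One possible shape for such a script, assuming the primary metric is a conversion rate, is a two-sided two-proportion z-test. This is a minimal sketch, not the skill's prescribed method: the function name `two_proportion_test`, the default `alpha=0.05`, and the example counts are all assumptions, and other metric types (revenue per user, for instance) would need a different test.

```python
import math
from statistics import NormalDist


def two_proportion_test(conv_c, n_c, conv_v, n_v, alpha=0.05):
    """Two-sided z-test for a difference in conversion rates.

    conv_* are conversion counts, n_* are sample sizes (assumed non-zero).
    """
    p_c = conv_c / n_c
    p_v = conv_v / n_v
    diff = p_v - p_c

    # Pooled rate and standard error under the null hypothesis of no difference.
    p_pool = (conv_c + conv_v) / (n_c + n_v)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_v))
    z = diff / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the confidence interval on the difference.
    se_diff = math.sqrt(p_c * (1 - p_c) / n_c + p_v * (1 - p_v) / n_v)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    return {
        "control_rate": p_c,
        "variant_rate": p_v,
        "relative_lift": diff / p_c,
        "z": z,
        "p_value": p_value,
        "ci_low": diff - z_crit * se_diff,
        "ci_high": diff + z_crit * se_diff,
        "significant": p_value < alpha,
    }


# Made-up counts for illustration: 500/10,000 control vs 560/10,000 variant.
result = two_proportion_test(500, 10_000, 560, 10_000)
print(result)
```

With these illustrative counts the relative lift is 12%, yet the p-value lands just above 0.05 and the confidence interval on the absolute difference includes zero, which is the "not significant, positive trend" situation.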
Check guardrail metrics:
Interpret results:
| Outcome | Recommendation |
|---|---|
| Significant positive lift, no guardrail issues | Ship it — roll out to 100% |
| Significant positive lift, guardrail concerns | Investigate — understand trade-offs before shipping |
| Not significant, positive trend | Extend the test — need more data or larger effect |
| Not significant, flat | Stop the test — no meaningful difference detected |
| Significant negative lift | Don't ship — revert to control, analyze why |
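The table above can be sketched as a small lookup function. The name `recommend` and the `flat_band` threshold (the relative lift below which a non-significant result counts as "flat") are assumptions for illustration; the table does not define where "positive trend" ends and "flat" begins. The table also has no row for a non-significant negative trend, so this sketch treats that case like "flat", which is likewise an assumption.

```python
def recommend(relative_lift, significant, guardrails_ok, flat_band=0.01):
    """Map a test outcome to a recommendation from the decision table.

    flat_band is a hypothetical cut-off, not something the table specifies.
    """
    if significant and relative_lift > 0:
        # Significant positive lift: the guardrails decide between ship and investigate.
        return "Ship" if guardrails_ok else "Investigate"
    if significant and relative_lift < 0:
        return "Don't ship"
    if relative_lift > flat_band:
        # Not significant, but trending positive.
        return "Extend"
    # Not significant and flat (or trending negative, by assumption).
    return "Stop"


print(recommend(0.12, significant=False, guardrails_ok=True))
```

Keeping the mapping in one function makes the threshold explicit, so it can be agreed on before the test is read rather than after.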
Provide the analysis summary:
## A/B Test Results: [Test Name]
**Hypothesis**: [What we expected]
**Duration**: [X days] | **Sample**: [N control / M variant]
| Metric | Control | Variant | Lift | p-value | Significant? |
|---|---|---|---|---|---|
| [Primary] | X% | Y% | +Z% | 0.0X | Yes/No |
| [Guardrail] | ... | ... | ... | ... | ... |
**Recommendation**: [Ship / Extend / Stop / Investigate]
**Reasoning**: [Why]
**Next steps**: [What to do]
Think step by step. Save the analysis summary as a markdown file. Generate Python scripts for calculations if raw data is provided.
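For the markdown output, a generated script might format each metric as one row of the results table in the template. This is only a sketch: the helper name `summary_row`, the number formatting, and the example values are assumptions, not part of the skill.

```python
def summary_row(metric, control_rate, variant_rate, p_value, alpha=0.05):
    """Format one metric as a markdown row matching the results table columns."""
    lift = (variant_rate - control_rate) / control_rate
    significant = "Yes" if p_value < alpha else "No"
    return (
        f"| {metric} | {control_rate:.2%} | {variant_rate:.2%} "
        f"| {lift:+.1%} | {p_value:.3f} | {significant} |"
    )


# Illustrative values only, not real experiment data.
row = summary_row("Checkout conversion", 0.05, 0.056, 0.058)
print(row)
# | Checkout conversion | 5.00% | 5.60% | +12.0% | 0.058 | No |
```

Rows built this way can be joined under the table header from the template and written to a `.md` file alongside the recommendation and reasoning.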