Statistical analysis, hypothesis testing, A/B testing, cohort analysis, segmentation, trend detection, business metrics, pre-delivery validation, and data visualization. Use when the user asks to "analyze this data", "run a statistical test", "compare groups", "find trends", "do A/B test analysis", "segment customers", "calculate KPIs", "validate this analysis", "check my work", "sanity check", "review my numbers", "make a chart", "create a dashboard", "plot the data", "visualize results", or mentions hypothesis testing, cohort analysis, business analytics, data validation, bar charts, line charts, heatmaps, scatter plots, or data storytelling.
Install: npx claudepluginhub damionrashford/mlx --plugin mlx
Frameworks for answering business questions with data: descriptive statistics, hypothesis testing, cohort analysis, segmentation, trend detection, KPI calculation, and pre-delivery QA.
Skill contents:
- evals/evals.json
- evals/files/sales.csv
- references/analysis-methods.md
- references/chart-selection.md
- scripts/ab_test.py
- scripts/chart_templates.py
- scripts/cohort_analysis.py
- scripts/descriptive_stats.py
- scripts/format_number.py
- scripts/hypothesis_test.py
- scripts/rfm_segmentation.py
- scripts/trend_analysis.py
- scripts/validate.py
| Script | Usage |
|---|---|
| descriptive_stats.py | uv run ${CLAUDE_SKILL_DIR}/scripts/descriptive_stats.py data.csv --group segment --value revenue |
| hypothesis_test.py | uv run ${CLAUDE_SKILL_DIR}/scripts/hypothesis_test.py data.csv --col value --group segment --a control --b treatment |
| ab_test.py | uv run ${CLAUDE_SKILL_DIR}/scripts/ab_test.py data.csv --col converted --group variant --control A --treatment B |
| cohort_analysis.py | uv run ${CLAUDE_SKILL_DIR}/scripts/cohort_analysis.py data.csv --user user_id --date order_date |
| rfm_segmentation.py | uv run ${CLAUDE_SKILL_DIR}/scripts/rfm_segmentation.py data.csv --customer customer_id --date order_date --value revenue |
| trend_analysis.py | uv run ${CLAUDE_SKILL_DIR}/scripts/trend_analysis.py data.csv --date date --value revenue --window 30 |
| validate.py | uv run ${CLAUDE_SKILL_DIR}/scripts/validate.py data.csv |
| chart_templates.py | uv run ${CLAUDE_SKILL_DIR}/scripts/chart_templates.py data.csv --type bar --x category --y value -o chart.png |
| Question | Analysis type | Script |
|---|---|---|
| What happened? | Descriptive statistics, aggregations | descriptive_stats.py |
| Why did it happen? | Diagnostic analysis, drill-downs, segmentation | rfm_segmentation.py |
| Is this difference real? | Hypothesis testing (t-test, chi-square) | hypothesis_test.py |
| Did the change work? | A/B test analysis | ab_test.py |
| How do groups behave over time? | Cohort analysis | cohort_analysis.py |
| What are the natural groupings? | Segmentation / clustering | rfm_segmentation.py |
| What are the trends? | Time series decomposition, rolling averages | trend_analysis.py |
| What should we track? | KPI definition and dashboarding | descriptive_stats.py |
| Is this ready to share? | Pre-delivery QA, sanity checking | validate.py |
| Situation | Use | Why |
|---|---|---|
| Symmetric distribution, no outliers | Mean | Most efficient estimator |
| Skewed distribution (revenue, duration) | Median | Robust to outliers |
| Categorical or ordinal data | Mode | Only option for non-numeric |
| Highly skewed with outliers | Median + mean | The gap shows skew |
Always report mean and median together for business metrics. If they diverge significantly, the data is skewed and the mean alone is misleading.
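A quick way to see that skew signal in practice, using made-up revenue figures with one outlier:

```python
import statistics

# Hypothetical daily revenue sample with one large outlier
revenue = [120, 95, 110, 130, 105, 98, 2500]

mean = statistics.mean(revenue)      # pulled far upward by the outlier
median = statistics.median(revenue)  # robust to it

# A large mean/median gap signals skew; report both so the pull is visible.
print(f"mean={mean:.0f}, median={median:.0f}")
```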
| Scenario | Test |
|---|---|
| Compare 2 group means (normal) | Independent t-test |
| Compare 2 group means (non-normal) | Mann-Whitney U |
| Compare 2 paired measurements | Paired t-test |
| Compare 3+ group means | One-way ANOVA |
| Compare proportions | Chi-square test |
| Test correlation | Pearson / Spearman |
| Test normality | Shapiro-Wilk |
The hypothesis_test.py script auto-selects the right test based on normality checks and reports p-value, effect size (Cohen's d), and confidence interval.
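A minimal sketch of that selection logic, assuming SciPy is available; the sample data is invented and the real script's implementation may differ:

```python
from scipy import stats

# Hypothetical samples for two groups
control = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3, 12.2, 11.7]
treatment = [12.9, 13.1, 12.7, 13.0, 12.8, 13.3, 12.6, 13.2]

# Check normality first (Shapiro-Wilk), then pick the test
normal = all(stats.shapiro(g).pvalue > 0.05 for g in (control, treatment))
if normal:
    result = stats.ttest_ind(control, treatment)      # independent t-test
else:
    result = stats.mannwhitneyu(control, treatment)   # non-parametric fallback

print(f"p-value: {result.pvalue:.4f}")
```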
| Cohen's d | Interpretation |
|---|---|
| < 0.2 | Negligible |
| 0.2 - 0.5 | Small |
| 0.5 - 0.8 | Medium |
| > 0.8 | Large |
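Cohen's d is just the mean difference divided by the pooled standard deviation. A pure-Python sketch with hypothetical groups:

```python
import statistics

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a) +
                  (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5

control = [10, 12, 11, 13, 12, 11]
treatment = [14, 15, 13, 16, 15, 14]
d = cohens_d(treatment, control)   # well above 0.8: a large effect
```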
| Category | KPI | Formula |
|---|---|---|
| Revenue | MRR | Sum of monthly recurring revenue |
| Revenue | ARPU | Total revenue / active users |
| Growth | MoM Growth | (this_month - last_month) / last_month |
| Retention | Churn Rate | Lost customers / start customers |
| Retention | Retention Rate | 1 - churn rate |
| Engagement | DAU/MAU | Daily active / monthly active |
| Efficiency | CAC | Marketing spend / new customers |
| Efficiency | LTV | ARPU * avg lifetime months |
| Efficiency | LTV:CAC | LTV / CAC (target: > 3:1) |
| Conversion | Conversion Rate | Conversions / visitors |
| Conversion | Funnel Drop-off | Lost at each stage / entered stage |
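The formulas above translate directly to code. All the input figures here are hypothetical, chosen only to exercise each formula:

```python
# Hypothetical monthly figures
revenue = 480_000.0
active_users = 12_000
start_customers = 2_000
lost_customers = 90
marketing_spend = 150_000.0
new_customers = 300
avg_lifetime_months = 18

arpu = revenue / active_users                   # ARPU
churn_rate = lost_customers / start_customers   # Churn Rate
retention_rate = 1 - churn_rate                 # Retention Rate
cac = marketing_spend / new_customers           # CAC
ltv = arpu * avg_lifetime_months                # LTV
ltv_cac = ltv / cac                             # target: > 3:1
```

With these numbers LTV:CAC comes out below the 3:1 target, which is exactly the kind of finding the report template below should surface.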
=== Analysis Report ===
Question: [What business question are we answering?]
Data: [Dataset, date range, filters applied]
Method: [Statistical test / analysis type used]
Key Findings:
1. [Most important finding with numbers]
2. [Second finding]
3. [Third finding]
Statistical Evidence:
- Test: [name], p-value: [value], effect size: [value]
- Confidence interval: [range]
Caveats:
- [Sample size limitations]
- [Selection bias concerns]
- [Missing data impact]
Recommendation:
[Actionable next step based on findings]
| Method | How | When |
|---|---|---|
| Naive | Tomorrow = today | Baseline |
| Seasonal naive | Tomorrow = same day last week/year | Seasonal data |
| Linear trend | Fit a line to historical data | Clearly linear trends |
| Moving average | Trailing average as forecast | Noisy data |
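Three of the baselines above fit in a few lines; the daily series here is invented:

```python
# Hypothetical daily series, most recent value last
daily = [100, 104, 98, 110, 107, 103, 112, 109, 111, 108, 114, 110, 113, 116]

naive = daily[-1]                            # tomorrow = today
seasonal_naive = daily[-7]                   # same day last week
window = 7
moving_avg = sum(daily[-window:]) / window   # trailing average as forecast
```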
Always communicate uncertainty: provide a range, not a point estimate.
When to escalate to a data scientist: Non-linear trends, multiple seasonalities, external factors, or when forecast accuracy matters for resource allocation.
A trend in aggregated data can reverse when segmented. Always check whether conclusions hold across key segments.
Testing 20 metrics at p=0.05 means ~1 will be falsely significant. Apply Bonferroni correction (alpha / number of tests) or report how many tests were run.
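The correction itself is one division; the p-values below are hypothetical:

```python
# Bonferroni correction: divide alpha by the number of tests run
alpha = 0.05
n_tests = 20
corrected_alpha = alpha / n_tests   # 0.0025

# A p-value must clear the corrected threshold to count as significant
p_values = [0.001, 0.02, 0.04, 0.30]
significant = [p for p in p_values if p < corrected_alpha]
```

At the uncorrected 0.05 threshold three of these four would look significant; after correction only one survives.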
Aggregate trends may not apply to individuals. "Countries with higher X have higher Y" does NOT mean individuals with higher X have higher Y.
When you find a correlation, consider confounding variables, reverse causality, and selection effects before claiming causation.
What you can say: "Users who use feature X have 30% higher retention."
What you cannot say: "Feature X causes 30% higher retention."
uv run ${CLAUDE_SKILL_DIR}/scripts/chart_templates.py data.csv --type bar --x category --y value -o chart.png
uv run ${CLAUDE_SKILL_DIR}/scripts/chart_templates.py data.csv --type line --x date --y value --hue segment -o trend.png
uv run ${CLAUDE_SKILL_DIR}/scripts/chart_templates.py data.csv --type hist --x value -o dist.png
uv run ${CLAUDE_SKILL_DIR}/scripts/chart_templates.py data.csv --type heatmap -o correlations.png
uv run ${CLAUDE_SKILL_DIR}/scripts/chart_templates.py data.csv --type scatter --x feature_a --y target -o scatter.png
uv run ${CLAUDE_SKILL_DIR}/scripts/chart_templates.py data.csv --type box --x group --y value -o box.png
| Question | Chart type |
|---|---|
| How does X change over time? | Line chart |
| How do categories compare? | Bar chart (horizontal if many categories) |
| What is the distribution? | Histogram, box plot, violin plot |
| How do two variables relate? | Scatter plot |
| What are the correlations? | Heatmap |
| What is the composition? | Stacked bar |
| How do groups differ? | Grouped bar, box plot by group |
| What are the top/bottom N? | Horizontal bar, sorted |
| Multi-dimensional? | Pair plot |
| Framework | Best for | Output |
|---|---|---|
| matplotlib | Static charts, publications, fine control | PNG, PDF, SVG |
| seaborn | Statistical plots, quick EDA visuals | PNG, PDF, SVG |
| plotly | Interactive charts, dashboards, web | HTML, JSON |
| altair | Declarative, concise, notebooks | HTML, JSON |
Default: matplotlib + seaborn. Interactive: plotly (self-contained HTML).
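A minimal sketch of the default static-chart path, assuming matplotlib is installed; the data and filename are hypothetical:

```python
import matplotlib
matplotlib.use("Agg")          # render off-screen, no display needed
import matplotlib.pyplot as plt

categories = ["North", "South", "East", "West"]
revenue = [420, 380, 510, 295]

fig, ax = plt.subplots(figsize=(6, 4))
ax.barh(categories, revenue)           # horizontal bars read easily
ax.set_xlabel("Revenue ($K)")
ax.set_title("Revenue by Region")
fig.tight_layout()
fig.savefig("revenue_by_region.png", dpi=150)
plt.close(fig)                         # free memory after saving
```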
- Use a colorblind-safe palette: sns.color_palette("colorblind")
- Save charts to a figures/ directory with descriptive filenames (revenue_by_quarter.png)
- Call plt.close() after saving to avoid memory leaks

See references/chart-selection.md for the full chart reference.
Pre-delivery QA checklist, common data analysis pitfalls, result sanity checking, and documentation standards.
Run through before sharing any analysis with stakeholders.
A many-to-many join silently multiplies rows, inflating counts and sums. Always check row counts after joins. Use COUNT(DISTINCT id) instead of COUNT(*) when counting entities through joins.
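A small pandas illustration of the fan-out, assuming pandas is available; the tables are hypothetical:

```python
import pandas as pd

# One order can match multiple shipment rows
orders = pd.DataFrame({"order_id": [1, 2, 3], "amount": [100, 200, 300]})
shipments = pd.DataFrame({"order_id": [1, 1, 2, 3],
                          "carrier": ["a", "b", "a", "c"]})

joined = orders.merge(shipments, on="order_id")

# Row count grew, so summing `amount` now double-counts order 1
total_naive = joined["amount"].sum()                                 # inflated
total_correct = joined.drop_duplicates("order_id")["amount"].sum()   # true total
n_orders = joined["order_id"].nunique()   # COUNT(DISTINCT id) equivalent
```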
Analyzing only entities that exist today, ignoring those that churned, failed, or were deleted. Ask "who is NOT in this dataset?" before drawing conclusions.
Comparing a partial period to a full period. "January revenue is $500K vs December's $800K" — but January isn't over yet. Filter to complete periods, or compare same-number-of-days.
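Both fixes are a few lines of date arithmetic; the "today" below is hypothetical:

```python
from datetime import date, timedelta

today = date(2024, 1, 20)   # hypothetical: January is incomplete

# Option 1: compare complete periods only
first_of_month = today.replace(day=1)
last_complete_month_end = first_of_month - timedelta(days=1)   # Dec 31

# Option 2: compare same-number-of-days windows (Jan 1-19 vs Dec 1-19)
days_elapsed = (today - first_of_month).days
dec_window_end = date(2023, 12, 1) + timedelta(days=days_elapsed - 1)
```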
The denominator changes between periods, making rates incomparable. Use consistent definitions across all compared periods. Document any changes.
Averaging pre-computed averages gives wrong results when group sizes differ. Always aggregate from raw data. Never average pre-aggregated averages.
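The error is easy to demonstrate with two hypothetical segments of very different sizes:

```python
# Two segments with different sizes (hypothetical)
segments = [
    {"name": "enterprise", "n": 10,  "avg_revenue": 5000},
    {"name": "self_serve", "n": 990, "avg_revenue": 50},
]

# Wrong: unweighted average of the pre-computed averages
naive = sum(s["avg_revenue"] for s in segments) / len(segments)

# Right: weight by group size (equivalent to aggregating from raw data)
total = sum(s["n"] * s["avg_revenue"] for s in segments)
weighted = total / sum(s["n"] for s in segments)
```

Here the naive figure overstates the true per-customer average by more than 25x.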
Different data sources use different timezones, causing misalignment. Standardize all timestamps to a single timezone (UTC recommended) before analysis.
Segments defined by the outcome you're measuring, creating circular logic. Define segments based on pre-treatment characteristics, not outcomes.
| Metric Type | Sanity Check |
|---|---|
| User counts | Match known MAU/DAU figures? |
| Revenue | Right order of magnitude vs known totals? |
| Rates | Between 0% and 100%? Match dashboard? |
| Growth rates | Is 50%+ MoM realistic or a data issue? |
| Averages | Reasonable given the distribution? |
| Percentages | Segment percentages sum to ~100%? |
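Several of these checks can run as plain assertions before an analysis ships; the figures below are hypothetical:

```python
# A few of the sanity checks above, expressed as assertions
conversion_rate = 0.034
segment_shares = {"new": 0.41, "returning": 0.47, "reactivated": 0.12}
mom_growth = 0.08

assert 0.0 <= conversion_rate <= 1.0                    # rates stay in [0, 1]
assert abs(sum(segment_shares.values()) - 1.0) < 0.01   # shares sum to ~100%
assert mom_growth < 0.5                                 # 50%+ MoM needs scrutiny
```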
Every non-trivial analysis should include:
## Analysis: [Title]
### Question
[The specific question being answered]
### Data Sources
- Table/file: [name] (as of [date])
### Definitions
- [Metric A]: [How it's calculated]
- [Segment X]: [How membership is determined]
- [Time period]: [Start] to [end], [timezone]
### Methodology
1. [Step 1]
2. [Step 2]
### Assumptions and Limitations
- [Assumption and why it's reasonable]
- [Limitation and its impact on conclusions]
### Key Findings
1. [Finding with evidence]
### Caveats
- [Things the reader should know before acting on this]