From claude-data-analyst
Describes and assesses the sample size of a dataset — not just row count, but the effective sample size for each question the user wants to answer. It flags underpowered segments, imbalanced classes, and small-n group cells, and gives a concrete "you can / cannot reliably claim X from this data" verdict.
npx claudepluginhub danielrosehill/claude-code-plugins --plugin claude-data-analyst

This skill uses the workspace's default tool permissions.
Row count is not sample size. This skill characterises the *effective* sample the user has for the *specific* analysis they want to run, and tells them whether that's enough.
- `duckdb` — counts, group-by cardinality, null rates.
- `uv run --with statsmodels python -c '...'` — power calculations (`statsmodels.stats.power`), sample size for proportions, sample size for regression.

Report:
Ask (or infer): what is one row? Customer? Transaction? Country-year? Sensor reading?
The honest sample size for most claims is the number of independent units, not the number of rows. If the dataset has 1M transactions from 50 customers and the question is about customer behaviour, n = 50, not 1,000,000. Surface this distinction — it's the single biggest source of misstated power.
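The distinction can be sketched in a few lines of plain Python (the transactions data and `customer_id` column name are hypothetical, purely for illustration):

```python
# Effective sample size: distinct independent units vs. raw rows.
# Hypothetical data: several transactions from only two customers.
transactions = [
    {"customer_id": "c1", "amount": 12.0},
    {"customer_id": "c1", "amount": 8.5},
    {"customer_id": "c2", "amount": 30.0},
    {"customer_id": "c1", "amount": 4.2},
    {"customer_id": "c2", "amount": 11.9},
]

n_rows = len(transactions)                               # what a row count reports
n_units = len({t["customer_id"] for t in transactions})  # honest n for customer-level claims

print(f"rows = {n_rows}, independent customers = {n_units}")  # rows = 5, independent customers = 2
```

For customer-level claims, any power calculation should use `n_units`, not `n_rows`.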
For each grouping column or categorical target the user mentions:
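A sketch of the per-group check, assuming a hypothetical `segments` column and an illustrative minimum-cell threshold of 30:

```python
from collections import Counter

# Hypothetical category labels for a grouping column the user mentioned.
segments = ["A"] * 180 + ["B"] * 45 + ["C"] * 12

MIN_CELL = 30  # illustrative threshold; tune to the planned analysis
counts = Counter(segments)
underpowered = {k: v for k, v in counts.items() if v < MIN_CELL}

print("group sizes:", dict(counts))        # {'A': 180, 'B': 45, 'C': 12}
print("underpowered cells:", underpowered)  # {'C': 12}
```

Any cell below the threshold gets flagged before per-group estimates are reported.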
Compute, per planned analysis column-set, the count of rows with all required columns non-null. This is frequently much smaller than the total row count — flag it loudly if drop rate > 20%.
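A sketch of the complete-case count for one planned column set (the rows and column names are hypothetical), flagging the drop rate when it crosses 20%:

```python
# Hypothetical rows with missing values (None) in required analysis columns.
rows = [
    {"age": 34, "income": 52_000},
    {"age": None, "income": 48_000},
    {"age": 29, "income": None},
    {"age": 41, "income": 61_000},
    {"age": 37, "income": 55_500},
]
required = ["age", "income"]

complete = [r for r in rows if all(r[c] is not None for c in required)]
drop_rate = 1 - len(complete) / len(rows)

print(f"usable n = {len(complete)} of {len(rows)} (drop rate {drop_rate:.0%})")
if drop_rate > 0.20:
    print("WARNING: complete-case n is much smaller than row count")
```

Here the usable n is 3 of 5 rows (40% dropped), so the warning fires.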
Apply standard rules of thumb, tuned to the user's stated question:
| Question type | Rough minimum n | Notes |
|---|---|---|
| Single-group mean with SE ≈ SD/10 | ~100 | For ±10%-of-SD precision. |
| Proportion estimate (±5 pp, 95% CI) | ~385 | Classic survey number. Scales down for wider CIs. |
| Two-group comparison, medium effect (Cohen's d = 0.5) | ~64 per group | 80% power, two-sided α = 0.05. |
| Two-group comparison, small effect (d = 0.2) | ~394 per group | 80% power. |
| Chi-square 2×2 | Min expected cell ≥ 5 | Use Fisher's exact below that. |
| Linear regression | ≥ 10–20 rows per predictor | Harrell's rule. |
| Logistic regression | ≥ 10–20 events per predictor (EPV) | Rare-event datasets bite here. |
| Any ML model | ≥ 10× predictors, more for non-linear / high-variance methods | Validate via CV, not rules of thumb alone. |
| Time series forecasting | ≥ 2 full seasonal cycles | Otherwise seasonality can't be separated from trend. |
For any quantitative question the user states with an effect size or precision target, run the matching power calculation in statsmodels and report the required n alongside the observed n.
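When statsmodels is unavailable, the table's numbers can be approximated in closed form with standard-library normal quantiles. This is a sketch, not the skill's actual statsmodels call; the normal approximation lands a row or two below the t-based statsmodels answers:

```python
import math
from statistics import NormalDist

z = NormalDist().inv_cdf  # standard-normal quantile function


def n_per_group(d, alpha=0.05, power=0.80):
    """Normal-approximation n per group for a two-sided two-sample comparison."""
    return math.ceil(2 * (z(1 - alpha / 2) + z(power)) ** 2 / d ** 2)


def n_proportion(margin=0.05, alpha=0.05, p=0.5):
    """Sample size to estimate a proportion to +/- margin at a (1-alpha) CI."""
    return math.ceil(z(1 - alpha / 2) ** 2 * p * (1 - p) / margin ** 2)


print(n_per_group(0.5))    # 63 (t-based statsmodels TTestIndPower gives ~64)
print(n_per_group(0.2))    # 393 (statsmodels gives ~394)
print(n_proportion(0.05))  # 385 — the classic survey number
```

Compare the required n from these formulas against the observed effective n, not the raw row count.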
For each stated analysis:
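A sketch of how the verdict could be phrased, comparing the observed effective n with the required n for the stated question (the thresholds and wording here are illustrative, not the skill's exact output):

```python
def verdict(observed_n, required_n):
    """Map observed vs. required sample size to a plain-language claim verdict."""
    if observed_n >= required_n:
        return "can reliably claim"
    if observed_n >= 0.5 * required_n:
        return "can claim only with wide uncertainty"
    return "cannot reliably claim"


print(verdict(120, 64))  # can reliably claim
print(verdict(40, 64))   # can claim only with wide uncertainty
print(verdict(20, 64))   # cannot reliably claim
```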
Write outputs/sample-size/report.md:
Keep it short. This skill's job is to prevent overclaiming, not to teach statistics.
Pairs with hypothesis-testing (which consumes the verdict to decide whether the test is worth running) and multivariate-analysis (where the 10-EPV / 10-rows-per-predictor rule bites hardest).