From awesome-cognitive-and-neuroscience-skills
Guides simulation-based sample size planning for neuroimaging studies (fMRI, EEG, MEG) using effect-size maps. For grant proposals, registered reports, or pilot data evaluation.
```shell
npx claudepluginhub neuroaihub/awesome_cognitive_and_neuroscience_skills --plugin awesome-cognitive-and-neuroscience-skills
```

This skill uses the workspace's default tool permissions.
Traditional power analysis (e.g., using G*Power for a t-test) fails for neuroimaging because it cannot account for the massive multiple comparisons problem, spatial correlation structure, or the multi-level nature of neuroimaging inference. Neuroimaging requires simulation-based approaches that generate synthetic datasets, apply the full analysis pipeline including multiple comparison correction, and estimate power as the proportion of simulations detecting the effect.
A competent programmer without neuroimaging training would use standard power formulas and dramatically overestimate the power of a whole-brain analysis. They would not know that cluster-extent thresholds, random field theory corrections, and spatial smoothness all affect the effective number of tests, nor that pilot-data-based simulation is the gold standard for neuroimaging power analysis. This skill encodes the domain-specific methodology for simulation-based sample size planning.
Before executing the domain-specific steps below, you MUST review the detailed methodology guidance in the research-literacy skill.
This skill was generated by AI from academic literature. All parameters, thresholds, and citations require independent verification before use in research. If you find errors, please open an issue.
Standard power analysis computes the sample size for a single statistical test at a given effect size, alpha, and power. Neuroimaging violates every assumption of this framework:
| Standard Assumption | Neuroimaging Reality | Consequence |
|---|---|---|
| Single test | ~100,000 voxels tested | Alpha must be corrected, dramatically reducing per-test sensitivity |
| Independent tests | Voxels are spatially correlated (due to smoothing and neural organization) | Effective number of tests is much less than 100,000, but hard to compute analytically |
| Known effect size | Effect size varies across voxels and depends on ROI definition | No single "effect size" characterizes a study |
| Simple test statistic | Cluster-based, TFCE, and permutation tests have complex null distributions | Power depends on the specific inference method used |
| One-level inference | Subject-level estimation + group-level test | Within-subject variance and between-subject variance both affect power |
Source: Mumford & Nichols, 2008; Poldrack et al., 2017.
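To make the multiplicity penalty in the first row concrete, here is a stdlib-only sketch using a one-sample z-test approximation. The sample size, effect size, and voxel count are illustrative, and the Bonferroni correction ignores spatial correlation, so the true penalty is somewhat smaller than shown:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_ppf(p, lo=-10.0, hi=10.0):
    """Inverse normal CDF by bisection (ample precision for this demo)."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def power_one_sample(d, n, alpha):
    """Approximate power of a two-sided one-sample z-test."""
    z_crit = norm_ppf(1.0 - alpha / 2.0)
    ncp = d * math.sqrt(n)  # noncentrality under the alternative
    return 1.0 - norm_cdf(z_crit - ncp)

n, d = 25, 0.5
p_single = power_one_sample(d, n, alpha=0.05)          # one planned test
p_bonf = power_one_sample(d, n, alpha=0.05 / 100_000)  # Bonferroni over ~100k voxels
print(f"single test: {p_single:.2f}, Bonferroni-corrected: {p_bonf:.3f}")
```

The same design that is adequately powered for one planned test has almost no power after naive whole-brain correction, which is why cluster-based and simulation-based methods exist.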
The gold standard for neuroimaging power analysis uses pilot data to simulate full datasets at varying sample sizes (Mumford & Nichols, 2008).
1. Obtain pilot data or published effect-size maps
2. Estimate expected effect sizes at regions of interest
3. Simulate datasets with varying N
4. Apply the full analysis pipeline (including multiple comparison correction)
5. Compute power = proportion of simulations detecting the effect
6. Find the N that achieves target power (typically 80% or 90%)
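The steps above can be sketched for a single a priori ROI with stdlib Python. This is a minimal illustration, not a substitute for full map-based simulation: the 0.6 deflation factor, the pilot effect size, and the p < .001 threshold (approximated by its normal critical value) are assumptions for the example:

```python
import math
import random
import statistics

def simulate_power(d_true, n, n_sims=2000, seed=0):
    """Steps 3-5: simulate one-sample datasets of size n and count detections.

    One-sample t statistic on Gaussian subject effects (mean d_true, SD 1),
    thresholded at ~p < .001 one-sided (normal-approximation critical value).
    """
    rng = random.Random(seed)
    t_crit = 3.09  # approx. z for one-sided alpha = .001
    hits = 0
    for _ in range(n_sims):
        sample = [rng.gauss(d_true, 1.0) for _ in range(n)]
        t = statistics.mean(sample) / (statistics.stdev(sample) / math.sqrt(n))
        if t > t_crit:
            hits += 1
    return hits / n_sims

# Step 2: deflate the pilot effect (winner's curse) before simulating;
# the 0.6 factor is an assumption within the 50-75% range recommended below
d_pilot = 0.8
d_planning = 0.6 * d_pilot

# Step 6: scan candidate sample sizes for the smallest N reaching ~80% power
for n in (20, 40, 60, 80):
    print(n, round(simulate_power(d_planning, n), 3))
```

In a real analysis, each simulated dataset would pass through the full preprocessing and group-level pipeline, including the planned multiple comparison correction.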
| Source | Quality | Requirements | Caveats |
|---|---|---|---|
| Own pilot study | Best | At least 10-15 subjects for stable variance estimates | Effect sizes from small pilots are inflated; use conservative estimates |
| Published group map | Good | Unthresholded statistical map (t-map or z-map) | May not match your exact paradigm or population |
| NeuroVault repository | Good | Search for comparable paradigms | Maps may use different preprocessing/analysis pipelines |
| Meta-analytic map (NeuroSynth, NiMARE) | Moderate | Coordinate-based or image-based meta-analysis | Provides average effect across studies, may underestimate for specific paradigms |
Source: Mumford & Nichols, 2008; Poldrack et al., 2017.
Critical warning: Effect sizes from small pilot studies (N < 20) are inflated due to the winner's curse. Assume the true effect is 50-75% of the pilot estimate (Button et al., 2013).
For ROI-based analysis:
For whole-brain analysis:
For each candidate sample size N:
Generate 1,000-5,000 simulated group maps by:

   a. Sampling N subjects from a population with the estimated effect size and variance
   b. Adding realistic noise (estimated from pilot residuals, or assumed Gaussian with spatial smoothness matching the pilot data)
   c. Creating a group-level statistical map
Apply the smoothness estimate from the pilot data (or the planned smoothing kernel) to each simulated map
For each simulated dataset:
Report the power metric most relevant to your planned analysis (Mumford & Nichols, 2008).
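A toy version of this simulation loop, using a 1-D "brain" and a maximum-statistic FWE correction as a stand-in for cluster-based inference. Spatial smoothing is omitted for brevity, and the voxel counts, effect size, and sample size are all illustrative:

```python
import math
import random

random.seed(1)
V = 500                   # voxels in a toy 1-D search space
SIGNAL = range(240, 260)  # 20 voxels carrying a true effect

def group_z_map(n, d):
    """Group-level z map: N(d*sqrt(n), 1) in signal voxels, N(0, 1) elsewhere."""
    return [random.gauss(d * math.sqrt(n), 1.0) if v in SIGNAL
            else random.gauss(0.0, 1.0) for v in range(V)]

def fwe_threshold(n_null=500, q=0.95):
    """FWE threshold: 95th percentile of the maximum statistic under the null."""
    maxima = sorted(max(random.gauss(0.0, 1.0) for _ in range(V))
                    for _ in range(n_null))
    return maxima[int(q * n_null)]

thr = fwe_threshold()
n_sims = 200
detections = sum(
    any(zmap[v] > thr for v in SIGNAL)
    for zmap in (group_z_map(n=30, d=0.5) for _ in range(n_sims))
)
print(f"FWE threshold ~{thr:.2f}, power: {detections / n_sims:.2f}")
```

The power metric here is "probability of detecting any true signal voxel after FWE correction"; depending on the planned analysis, cluster-level or ROI-level detection rates may be more appropriate to report.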
| Feature | Description |
|---|---|
| Input | Pilot group-level statistical maps (from FSL) |
| Method | Resamples from pilot to estimate power at varying N |
| Output | Power curves for specified ROIs at different sample sizes |
| Requirements | FSL, R; pilot data from at least 10-15 subjects |
| Strengths | Uses actual pilot data; accounts for design-specific temporal autocorrelation |
| Limitations | Assumes pilot effect sizes are representative; FSL-specific |
| Feature | Description |
|---|---|
| Input | Unthresholded statistical map (any software) |
| Method | Fits mixture model to peak distribution; estimates prevalence and effect size |
| Output | Power estimates at varying N; optimal sample size for target power |
| Access | Web-based: https://neuropowertools.org |
| Strengths | Does not require individual subject data; works with published maps |
| Limitations | Peak-based approximation; may underestimate power for distributed effects |
| Feature | Description |
|---|---|
| Input | Assumed effect size map, noise model, smoothness |
| Method | Full simulation with parametric statistical testing |
| Output | Voxelwise power maps at specified N |
| Requirements | MATLAB |
| Strengths | Voxel-level power visualization; flexible correction methods |
| Limitations | Computationally intensive; requires specification of noise model |
| Feature | Description |
|---|---|
| Input | Smoothness estimates (from 3dFWHMx), voxel dimensions, mask |
| Method | Monte Carlo simulation of random fields |
| Output | Cluster-size thresholds for a given alpha level |
| Use for power | Estimate minimum detectable cluster size at a given sample size; not a full power tool |
| Strengths | Fast, accounts for non-Gaussian smoothness (ACF model; Cox et al., 2017) |
| Limitations | Does not compute power directly; only provides cluster-extent thresholds |
When full simulation is impractical, ROI-based power analysis provides a reasonable alternative:
| Published Statistic | Conversion to Cohen's d | Source |
|---|---|---|
| t-value (within-subject) | d = t / sqrt(N) | Standard formula |
| t-value (between-group) | d = 2t / sqrt(df) | Standard formula |
| z-value | d = z / sqrt(N) (approximate) | Approximate for large N |
| Percent signal change + SD | d = mean_PSC / SD_PSC | Direct computation |
| Partial eta-squared | d = sqrt(eta^2 / (1 - eta^2)) | Conversion formula |
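The conversions in the table can be sketched as small helper functions (the example t-value and N in the usage line are hypothetical):

```python
import math

def d_from_t_within(t, n):
    """Within-subject (one-sample or paired) t to Cohen's d: d = t / sqrt(N)."""
    return t / math.sqrt(n)

def d_from_t_between(t, df):
    """Between-group t to Cohen's d: d = 2t / sqrt(df), assuming equal group sizes."""
    return 2.0 * t / math.sqrt(df)

def d_from_eta_sq(eta_sq):
    """Partial eta-squared to Cohen's d: d = sqrt(eta^2 / (1 - eta^2))."""
    return math.sqrt(eta_sq / (1.0 - eta_sq))

# Example: a published within-subject peak t = 4.2 from N = 18
print(round(d_from_t_within(4.2, 18), 2))  # -> 0.99
```

Remember to deflate any converted estimate before planning, per the winner's curse warning above.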
Use coordinate-based meta-analysis tools to estimate effect sizes at specific brain locations:
| Tool | Method | Output | Source |
|---|---|---|---|
| NiMARE | ALE, MKDA, or other CBMA | Meta-analytic map; extract effect at ROI | Salo et al., 2023 |
| NeuroSynth | Automated term-based meta-analysis | Association maps; extract effect at coordinates | Yarkoni et al., 2011 |
| BrainMap | ALE meta-analysis | Coordinate-based likelihood maps | Laird et al., 2005 |
Caveat: Meta-analytic effect sizes aggregate across many studies with different designs, populations, and analysis pipelines. They provide a reasonable lower bound but may not match your specific paradigm (Yarkoni et al., 2011).
| Finding | Recommendation | Source |
|---|---|---|
| Brain-behavior associations require massive samples for replicability | N > 2,000 for whole-brain brain-behavior correlations | Marek et al., 2022 |
| N = 20 gives ~50% power for medium fMRI effects | N = 40+ for 80% power with medium effects | Poldrack et al., 2017 |
| 80% power at uncorrected p < 0.001 requires N ~ 40 for d = 0.8 | N = 40 per group for large between-group effects | Turner et al., 2018 |
| Cluster-based inference with CDT p < 0.01 produces inflated false positives | Use CDT p < 0.001 and increase N to compensate for reduced sensitivity | Eklund et al., 2016 |
| Within-subject designs are much more powerful than between-subject | Prefer within-subject designs when scientifically appropriate | Mumford & Nichols, 2008 |
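The within- vs between-subject gap in the last row follows directly from the standard normal-approximation sample-size formulas; a quick sketch for d = 0.5 at two-sided alpha = .05 and 80% power (a back-of-envelope comparison, not a full neuroimaging power analysis):

```python
import math

Z_ALPHA = 1.96   # two-sided alpha = .05
Z_BETA = 0.8416  # 80% power

def n_within(d):
    """One-sample/paired design: subjects needed (normal approximation)."""
    return math.ceil(((Z_ALPHA + Z_BETA) / d) ** 2)

def n_between(d):
    """Two-group design: subjects needed PER GROUP (normal approximation)."""
    return math.ceil(2 * ((Z_ALPHA + Z_BETA) / d) ** 2)

d = 0.5
print(n_within(d), n_between(d))  # within-subject needs ~1/4 the total subjects
```

For d = 0.5 the within-subject design needs about 32 subjects total, while the between-group design needs about 63 per group (126 total), roughly a fourfold difference.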
| Analysis Type | Minimum N (80% Power) | Effect Size Assumed | Correction Method | Source |
|---|---|---|---|---|
| Within-subject activation (whole-brain) | 25-30 | d = 0.8 (large) | Cluster-based, CDT p < 0.001 | Desmond & Glover, 2002 |
| Between-group (whole-brain, large effect) | 20-25 per group | d = 0.8 | Cluster-based, CDT p < 0.001 | Thirion et al., 2007 |
| Between-group (whole-brain, medium effect) | 40-50 per group | d = 0.5 | Cluster-based, CDT p < 0.001 | Poldrack et al., 2017 |
| ROI-based (single a priori ROI) | 15-25 | d = 0.5-0.8 | Uncorrected (single test) | Desmond & Glover, 2002 |
| Resting-state connectivity (group mean) | 25-40 | r = 0.3-0.5 | FDR or NBS | Smith et al., 2011 |
| Brain-behavior correlation (whole-brain) | 2,000+ | r < 0.1 (replicable) | Permutation | Marek et al., 2022 |
| Brain-behavior correlation (single ROI) | 80-200 | r = 0.2-0.3 | Uncorrected | Standard formula |
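For the single-ROI correlation row, the Fisher z approximation reproduces the 80-200 range analytically (a sketch using fixed z quantiles for alpha = .05 two-sided and 80% power; not a substitute for simulation):

```python
import math

def n_for_correlation(r, z_alpha=1.96, z_beta=0.8416):
    """Sample size to detect correlation r via the Fisher z approximation:
    N = ((z_alpha + z_beta) / atanh(r))^2 + 3."""
    fz = math.atanh(r)
    return math.ceil(((z_alpha + z_beta) / fz) ** 2 + 3)

for r in (0.1, 0.2, 0.3):
    print(r, n_for_correlation(r))
```

Note how steeply N grows as r shrinks: detecting r = 0.1 already requires several hundred subjects, consistent with the N > 2,000 recommendation for replicable whole-brain brain-behavior correlations (Marek et al., 2022).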
Registered reports require pre-specification of sample size with a formal power analysis. For neuroimaging registered reports:
Domain insight: Reviewers will be suspicious of power analyses based on large effect sizes from small pilot studies. Use conservative (deflated) effect size estimates and show power curves across a range of plausible effect sizes.
See references/ for worked examples and simulation code templates.