npx claudepluginhub data-wise/scholarThis skill uses the workspace's default tool permissions.
You are an expert in designing Monte Carlo simulation studies for statistical methodology research.
Designs and executes Monte Carlo simulations to evaluate finite-sample properties of statistical estimators including bias, RMSE, coverage, size, and power.
Provides Python code patterns for reproducible experiments: random seeds, environment logging, train/test splits, cross-validation, A/B testing, and power analysis. For ML/statistical designs.
Audits and drafts methods sections for experimental social science, covering pre-analysis plans, pre-registrations, conjoint designs, CONSORT flows, and APSA/JARS/DA-RT compliance with 45-item checklist.
Share bugs, ideas, or general feedback.
You are an expert in designing Monte Carlo simulation studies for statistical methodology research.
The definitive guide for simulation study design requires five components:
| Component | Question | Documentation Required |
|---|---|---|
| Aims | What are we trying to learn? | Clear research questions |
| Data-generating mechanisms | How do we create data? | Full DGP specification |
| Estimands | What are we estimating? | Mathematical definition |
| Methods | What estimators do we compare? | Complete algorithm description |
| Performance measures | How do we evaluate? | Bias, variance, coverage |
□ Aims stated clearly
□ DGP fully specified (all parameters, distributions)
□ Estimand(s) defined mathematically
□ All methods described with sufficient detail for replication
□ Performance measures defined
□ Number of replications justified
□ Monte Carlo standard errors reported
□ Random seed documented for reproducibility
□ Software and version documented
□ Computational time reported
Monte Carlo Standard Error (MCSE) formula:
$$\text{MCSE}(\hat{\theta}) = \frac{\hat{\sigma}}{\sqrt{B}}$$
where $B$ is the number of replications and $\hat{\sigma}$ is the estimated standard deviation.
| Purpose | Minimum B | Recommended B | MCSE for proportion |
|---|---|---|---|
| Exploratory | 500 | 1,000 | ~1.4% at 95% coverage |
| Publication | 1,000 | 2,000 | ~1.0% at 95% coverage |
| Definitive | 5,000 | 10,000 | ~0.4% at 95% coverage |
| Precision | 10,000+ | 50,000 | ~0.2% at 95% coverage |
# Calculate Monte Carlo standard errors
calculate_mcse <- function(estimates, coverage_indicators = NULL) {
B <- length(estimates)
list(
# MCSE for mean (bias)
mcse_mean = sd(estimates) / sqrt(B),
# MCSE for standard deviation
mcse_sd = sd(estimates) / sqrt(2 * (B - 1)),
# MCSE for coverage (proportion)
mcse_coverage = if (!is.null(coverage_indicators)) {
p <- mean(coverage_indicators)
sqrt(p * (1 - p) / B)
} else NA
)
}
# Rule of thumb: B needed for desired MCSE
replications_needed <- function(desired_mcse, estimated_sd) {
ceiling((estimated_sd / desired_mcse)^2)
}
# Full simulation study template following Morris et al. guidelines
run_simulation_study <- function(
n_sims = 2000,
n_vec = c(200, 500, 1000),
seed = 42,
parallel = TRUE,
n_cores = parallel::detectCores() - 1
) {
set.seed(seed)
# Define parameter grid
params <- expand.grid(
n = n_vec,
effect_size = c(0, 0.14, 0.39),
model_spec = c("correct", "misspecified")
)
# Setup parallel processing
if (parallel) {
cl <- parallel::makeCluster(n_cores)
doParallel::registerDoParallel(cl)
on.exit(parallel::stopCluster(cl))
}
# Run simulations
results <- foreach(
i = 1:nrow(params),
.combine = rbind,
.packages = c("tidyverse", "mediation")
) %dopar% {
scenario <- params[i, ]
sim_results <- replicate(n_sims, {
data <- generate_dgp(scenario)
estimates <- apply_methods(data)
evaluate_performance(estimates, truth = scenario$effect_size)
}, simplify = FALSE)
summarize_scenario(sim_results, scenario)
}
# Add MCSE
results <- add_monte_carlo_errors(results, n_sims)
results
}
# Summarize with MCSE
add_monte_carlo_errors <- function(results, B) {
results %>%
mutate(
mcse_bias = empirical_se / sqrt(B),
mcse_coverage = sqrt(coverage * (1 - coverage) / B),
mcse_rmse = rmse / sqrt(2 * B)
)
}
# Memory-efficient parallel simulation
run_parallel_simulation <- function(scenario, n_sims, n_cores = 4) {
library(future)
library(future.apply)
plan(multisession, workers = n_cores)
results <- future_replicate(n_sims, {
data <- generate_dgp(scenario$n, scenario$params)
est <- estimate_effect(data)
list(
estimate = est$point,
se = est$se,
covered = abs(est$point - scenario$truth) < 1.96 * est$se
)
}, simplify = FALSE)
plan(sequential) # Reset
# Aggregate
estimates <- sapply(results, `[[`, "estimate")
ses <- sapply(results, `[[`, "se")
covered <- sapply(results, `[[`, "covered")
list(
bias = mean(estimates) - scenario$truth,
empirical_se = sd(estimates),
mean_se = mean(ses),
coverage = mean(covered),
mcse_bias = sd(estimates) / sqrt(n_sims),
mcse_coverage = sqrt(mean(covered) * (1 - mean(covered)) / n_sims)
)
}
generate_mediation_data <- function(n, params) {
# Confounders
X <- rnorm(n)
# Treatment (binary)
ps <- plogis(params$gamma0 + params$gamma1 * X)
A <- rbinom(n, 1, ps)
# Mediator
M <- params$alpha0 + params$alpha1 * A + params$alpha2 * X +
rnorm(n, sd = params$sigma_m)
# Outcome
Y <- params$beta0 + params$beta1 * A + params$beta2 * M +
params$beta3 * X + params$beta4 * A * M +
rnorm(n, sd = params$sigma_y)
data.frame(Y = Y, A = A, M = M, X = X)
}
| Size | Label | Purpose |
|---|---|---|
| 100-200 | Small | Stress test |
| 500 | Medium | Typical study |
| 1000-2000 | Large | Asymptotic behavior |
| 5000+ | Very large | Efficiency comparison |
| Effect | Interpretation |
|---|---|
| 0 | Null (Type I error) |
| 0.1 | Small |
| 0.3 | Medium |
| 0.5 | Large |
params <- expand.grid(
n = c(200, 500, 1000, 2000),
effect = c(0, 0.14, 0.39, 0.59), # Small/medium/large per Cohen
confounding = c(0, 0.3, 0.6),
misspecification = c(FALSE, TRUE)
)
| Metric | Formula | Target | MCSE Formula |
|---|---|---|---|
| Bias | $\bar{\hat\psi} - \psi_0$ | ≈ 0 | $\sqrt{\text{Var}(\hat\psi)/n_{sim}}$ |
| Empirical SE | $\text{SD}(\hat\psi)$ | — | Complex |
| Average SE | $\bar{\widehat{SE}}$ | ≈ Emp SE | $\text{SD}(\widehat{SE})/\sqrt{n_{sim}}$ |
| Coverage | $\frac{1}{n_{sim}}\sum I(\psi_0 \in CI)$ | ≈ 0.95 | $\sqrt{p(1-p)/n_{sim}}$ |
| MSE | $\text{Bias}^2 + \text{Var}$ | Minimize | — |
| Power | % rejecting $H_0$ | Context-dependent | $\sqrt{p(1-p)/n_{sim}}$ |
| Metric | Minimum | Recommended |
|---|---|---|
| Bias | 1000 | 2000 |
| Coverage | 2000 | 5000 |
| Power | 1000 | 2000 |
Always report MCSE for key metrics:
#' Run simulation study
#' @param scenario Parameter list for this scenario
#' @param n_rep Number of replications
#' @param seed Random seed
run_simulation <- function(scenario, n_rep = 2000, seed = 42) {
set.seed(seed)
results <- future_map(1:n_rep, function(i) {
# Generate data
data <- generate_data(scenario$n, scenario$params)
# Fit methods
fit1 <- method1(data)
fit2 <- method2(data)
# Extract estimates
tibble(
rep = i,
method = c("method1", "method2"),
estimate = c(fit1$est, fit2$est),
se = c(fit1$se, fit2$se),
ci_lower = estimate - 1.96 * se,
ci_upper = estimate + 1.96 * se
)
}, .options = furrr_options(seed = TRUE)) %>%
bind_rows()
# Summarize
results %>%
group_by(method) %>%
summarize(
bias = mean(estimate) - scenario$true_value,
emp_se = sd(estimate),
avg_se = mean(se),
coverage = mean(ci_lower <= scenario$true_value &
ci_upper >= scenario$true_value),
mse = bias^2 + emp_se^2,
.groups = "drop"
)
}
Table X: Simulation Results (n_rep = 2000)
Method 1 Method 2
n Bias SE Cov MSE Bias SE Cov MSE
-----------------------------------------------------------
200 0.02 0.15 0.94 0.023 0.01 0.12 0.95 0.014
500 0.01 0.09 0.95 0.008 0.00 0.08 0.95 0.006
1000 0.00 0.06 0.95 0.004 0.00 0.05 0.95 0.003
Note: Cov = 95% CI coverage. MCSE for coverage ≈ 0.005.
# Save results incrementally
if (i %% 100 == 0) {
saveRDS(results_so_far,
file = sprintf("checkpoint_%s_rep%d.rds", scenario_id, i))
}
sessionInfo())furrr_options(seed = TRUE) for parallel