Meta-analysis frameworks and methods for mediation studies
Performs meta-analysis of mediation effects including random effects pooling, heterogeneity assessment, and publication bias detection.
Methods for synthesizing mediation effects across multiple studies
Use this skill when working on: meta-analysis of indirect effects, cross-study effect aggregation, heterogeneity assessment in mediation, individual participant data (IPD) meta-analysis, or systematic reviews of mediation studies.
| Challenge | Description | Solution Approach |
|---|---|---|
| Non-normal effects | Product $ab$ is not normally distributed | Use appropriate pooling methods |
| Correlated paths | $a$ and $b$ may be correlated | Model correlation structure |
| Heterogeneity | Effects vary across studies | Random effects models |
| Missing information | Studies report different statistics | Imputation or subset analysis |
| Publication bias | Small studies with null effects unpublished | Sensitivity analysis |
| Measure | Formula | Use Case |
|---|---|---|
| Unstandardized $ab$ | $a \times b$ | Same scales across studies |
| Partially standardized | $a \times b / SD_Y$ | Standardize by outcome only |
| Fully standardized | $a' \times b'$ (standardized coefficients) | Different scales |
| $R^2_{med}$ | Proportion of effect mediated | Bounded measure |
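As a hedged illustration of the measures in the table, they can be computed from study-level summary statistics; all numbers below (paths and standard deviations) are invented for illustration, not drawn from any real study.

```r
# Illustrative sketch: effect-size measures for a single hypothetical study.
a    <- 0.50   # X -> M path
b    <- 0.40   # M -> Y path, controlling for X
sd_x <- 1.2    # SD of the exposure
sd_m <- 2.0    # SD of the mediator
sd_y <- 3.0    # SD of the outcome

ab_unstd <- a * b                 # unstandardized indirect effect
ab_ps    <- a * b / sd_y          # partially standardized (by outcome SD)
a_std    <- a * sd_x / sd_m       # standardized a path
b_std    <- b * sd_m / sd_y       # standardized b path
ab_std   <- a_std * b_std         # fully standardized indirect effect
```

Note that the fully standardized indirect effect simplifies to $ab \cdot SD_X / SD_Y$, since the mediator's SD cancels across the two paths.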
When assuming homogeneous true effects, the pooled estimate is:
$$\hat{\theta}_{FE} = \frac{\sum_i w_i \hat{\theta}_i}{\sum_i w_i}, \quad w_i = \frac{1}{\text{SE}_i^2}$$
This pooled estimate uses inverse-variance weights to optimally combine study-specific estimates.
When true effects vary across studies, the pooled estimate incorporates between-study variance:
$$\hat{\theta}_{RE} = \frac{\sum_i w_i^* \hat{\theta}_i}{\sum_i w_i^*}, \quad w_i^* = \frac{1}{\text{SE}_i^2 + \hat{\tau}^2}$$
where $\hat{\tau}^2$ is the between-study variance (tau-squared). The pooled estimate under random effects provides a more generalizable result when heterogeneity is present.
Key statistics for assessing heterogeneity:
| Statistic | Formula | Interpretation |
|---|---|---|
| Q statistic | $Q = \sum_i w_i(\hat{\theta}_i - \hat{\theta})^2$ | Test for heterogeneity |
| I-squared ($I^2$) | $I^2 = \max(0, \frac{Q - (k-1)}{Q})$ | % variance due to heterogeneity |
| tau-squared ($\tau^2$) | Between-study variance | Absolute heterogeneity |
| H-squared | $H^2 = Q/(k-1)$ | Relative excess variance |
The I-squared statistic ranges from 0% to 100%: values below 25% indicate low heterogeneity, 25-75% moderate heterogeneity, and above 75% high heterogeneity. Tau-squared gives the absolute magnitude of the between-study variance on the effect-size scale.
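A minimal numeric sketch of the pooling and heterogeneity formulas above, in base R, using invented effect estimates and standard errors; the full implementation follows.

```r
# Minimal sketch: FE/RE pooling with DerSimonian-Laird tau^2 (toy data).
effects <- c(0.12, 0.25, 0.08, 0.30)   # invented indirect-effect estimates
se      <- c(0.05, 0.08, 0.04, 0.10)   # invented standard errors
k <- length(effects)

w        <- 1 / se^2                         # inverse-variance (FE) weights
theta_fe <- sum(w * effects) / sum(w)        # fixed-effects pooled estimate

Q    <- sum(w * (effects - theta_fe)^2)      # Cochran's Q
I2   <- max(0, (Q - (k - 1)) / Q)            # I-squared
C    <- sum(w) - sum(w^2) / sum(w)
tau2 <- max(0, (Q - (k - 1)) / C)            # DerSimonian-Laird tau^2

w_re     <- 1 / (se^2 + tau2)                # RE weights incorporate tau^2
theta_re <- sum(w_re * effects) / sum(w_re)  # random-effects pooled estimate
```

Because the random-effects weights add the same tau-squared to every study's variance, they are more equal across studies than the fixed-effects weights, so small studies get relatively more influence under RE.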
#' Meta-Analysis of Indirect Effects
#'
#' @param effects Vector of indirect effect estimates
#' @param se Vector of standard errors
#' @param study_names Optional study identifiers
#' @param method "FE" for fixed effects, "RE" for random effects
#' @return Meta-analysis results
meta_indirect <- function(effects, se, study_names = NULL,
method = c("RE", "FE")) {
method <- match.arg(method)
k <- length(effects)
if (is.null(study_names)) {
study_names <- paste0("Study ", 1:k)
}
# Fixed effects weights
w_fe <- 1 / se^2
# Q statistic
theta_fe <- sum(w_fe * effects) / sum(w_fe)
Q <- sum(w_fe * (effects - theta_fe)^2)
df <- k - 1
  # Heterogeneity (guard against Q = 0 to avoid division by zero)
  I2 <- if (Q > 0) max(0, (Q - df) / Q) else 0
  # DerSimonian-Laird tau^2 estimate
  C <- sum(w_fe) - sum(w_fe^2) / sum(w_fe)
  tau2 <- max(0, (Q - df) / C)
if (method == "FE") {
weights <- w_fe
pooled <- theta_fe
se_pooled <- sqrt(1 / sum(w_fe))
} else {
# Random effects weights
weights <- 1 / (se^2 + tau2)
pooled <- sum(weights * effects) / sum(weights)
se_pooled <- sqrt(1 / sum(weights))
}
# Confidence interval
ci <- pooled + c(-1.96, 1.96) * se_pooled
# Test for heterogeneity
p_het <- 1 - pchisq(Q, df)
list(
pooled_effect = pooled,
se = se_pooled,
ci = ci,
z = pooled / se_pooled,
p_value = 2 * pnorm(-abs(pooled / se_pooled)),
heterogeneity = list(
Q = Q,
df = df,
p = p_het,
I2 = I2,
tau2 = tau2
),
study_data = data.frame(
study = study_names,
effect = effects,
se = se,
weight = weights / sum(weights)
),
method = method
)
}
When studies report both $a$ and $b$ paths:
$$\begin{pmatrix} \hat{a} \\ \hat{b} \end{pmatrix} \sim N\left(\begin{pmatrix} a \\ b \end{pmatrix}, \Sigma\right)$$
Stage 1: extract the path coefficients from each study.
Stage 2: pool them with a multivariate random-effects model.
#' Multivariate Meta-Analysis of Mediation Paths
#'
#' @param a_effects Vector of a path estimates
#' @param b_effects Vector of b path estimates
#' @param a_se Standard errors for a
#' @param b_se Standard errors for b
#' @param ab_cor Correlation between a and b estimates (often assumed 0)
#' @return Multivariate meta-analysis results
multivariate_meta_mediation <- function(a_effects, b_effects,
a_se, b_se, ab_cor = 0) {
library(metafor)
k <- length(a_effects)
# Construct variance-covariance matrices for each study
V_list <- lapply(1:k, function(i) {
cov_ab <- ab_cor * a_se[i] * b_se[i]
matrix(c(a_se[i]^2, cov_ab, cov_ab, b_se[i]^2), 2, 2)
})
  # Stack effects study by study: (a_1, b_1, a_2, b_2, ...)
  yi <- c(rbind(a_effects, b_effects))
# Create V matrix (block diagonal)
V <- bldiag(V_list)
# Fit multivariate model
# Effect type indicator
effect_type <- rep(c("a", "b"), k)
study_id <- rep(1:k, each = 2)
fit <- rma.mv(yi = yi, V = V,
mods = ~ effect_type - 1,
random = ~ effect_type | study_id,
struct = "UN",
data = data.frame(yi, effect_type, study_id))
# Extract pooled estimates
pooled_a <- coef(fit)["effect_typea"]
pooled_b <- coef(fit)["effect_typeb"]
# Compute indirect effect and CI via delta method
vcov_pooled <- vcov(fit)
indirect <- pooled_a * pooled_b
# Delta method SE
grad <- c(pooled_b, pooled_a)
se_indirect <- sqrt(t(grad) %*% vcov_pooled %*% grad)
list(
pooled_a = pooled_a,
pooled_b = pooled_b,
pooled_indirect = indirect,
se_indirect = as.numeric(se_indirect),
ci_indirect = indirect + c(-1.96, 1.96) * as.numeric(se_indirect),
model_fit = fit
)
}
One-stage approach: pool all individual-level data and fit a single model with study-level random effects:
#' IPD Meta-Analysis for Mediation
#'
#' @param data Combined dataset with study indicator
#' @param study_var Name of study variable
#' @param treatment Name of treatment variable
#' @param mediator Name of mediator variable
#' @param outcome Name of outcome variable
#' @return IPD meta-analysis results
ipd_meta_mediation <- function(data, study_var, treatment, mediator, outcome) {
library(lme4)
# Mediator model with random slopes
m_formula <- as.formula(paste(
mediator, "~", treatment, "+ (1 +", treatment, "|", study_var, ")"
))
m_model <- lmer(m_formula, data = data)
# Outcome model with random slopes
y_formula <- as.formula(paste(
outcome, "~", treatment, "+", mediator,
"+ (1 +", treatment, "+", mediator, "|", study_var, ")"
))
y_model <- lmer(y_formula, data = data)
# Extract fixed effects (pooled estimates)
a <- fixef(m_model)[treatment]
b <- fixef(y_model)[mediator]
c_prime <- fixef(y_model)[treatment]
# Indirect effect
indirect <- a * b
  # Bootstrap for CI (simple row resampling; a cluster bootstrap by study
  # would better respect the hierarchical structure)
boot_indirect <- replicate(1000, {
boot_idx <- sample(nrow(data), replace = TRUE)
boot_data <- data[boot_idx, ]
m_boot <- tryCatch(
lmer(m_formula, data = boot_data),
error = function(e) NULL
)
y_boot <- tryCatch(
lmer(y_formula, data = boot_data),
error = function(e) NULL
)
if (is.null(m_boot) || is.null(y_boot)) return(NA)
fixef(m_boot)[treatment] * fixef(y_boot)[mediator]
})
boot_indirect <- boot_indirect[!is.na(boot_indirect)]
list(
pooled_a = a,
pooled_b = b,
pooled_c_prime = c_prime,
pooled_indirect = indirect,
pooled_total = indirect + c_prime,
se_indirect = sd(boot_indirect),
ci_indirect = quantile(boot_indirect, c(0.025, 0.975)),
n_studies = length(unique(data[[study_var]])),
n_total = nrow(data),
m_model = m_model,
y_model = y_model
)
}
Estimate effects within each study, then pool:
#' Two-Stage IPD Meta-Analysis
#'
#' @param data Combined dataset
#' @param study_var Study identifier
#' @return Two-stage meta-analysis results
two_stage_ipd <- function(data, study_var, treatment, mediator, outcome) {
studies <- unique(data[[study_var]])
k <- length(studies)
# Stage 1: Study-specific estimates
study_results <- lapply(studies, function(s) {
study_data <- data[data[[study_var]] == s, ]
# Fit models
m_model <- lm(as.formula(paste(mediator, "~", treatment)), data = study_data)
y_model <- lm(as.formula(paste(outcome, "~", treatment, "+", mediator)),
data = study_data)
a <- coef(m_model)[treatment]
b <- coef(y_model)[mediator]
# Delta method SE for indirect
se_a <- sqrt(vcov(m_model)[treatment, treatment])
se_b <- sqrt(vcov(y_model)[mediator, mediator])
se_indirect <- sqrt(a^2 * se_b^2 + b^2 * se_a^2)
data.frame(
study = s,
n = nrow(study_data),
a = a,
b = b,
indirect = a * b,
se_a = se_a,
se_b = se_b,
se_indirect = se_indirect
)
})
study_df <- do.call(rbind, study_results)
# Stage 2: Pool indirect effects
meta_result <- meta_indirect(
effects = study_df$indirect,
se = study_df$se_indirect,
study_names = study_df$study,
method = "RE"
)
list(
stage1 = study_df,
stage2 = meta_result,
pooled_indirect = meta_result$pooled_effect,
ci = meta_result$ci,
I2 = meta_result$heterogeneity$I2
)
}
| Method | Description | Limitation |
|---|---|---|
| Funnel plot | SE vs effect plot | Visual, subjective |
| Egger's test | Regression of effect on SE | Low power |
| Trim-and-fill | Impute missing studies | Assumes specific mechanism |
| PET-PEESE | Conditional regression | Requires assumptions |
| Selection models | Model publication process | Complex, sensitive |
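As a hedged sketch of the regression-based methods in the table, an Egger/PET-style check can be run in base R alone: regress the effect estimates on their standard errors with inverse-variance weights. A slope clearly different from zero suggests small-study asymmetry, and the intercept is the PET-style estimate of the effect at SE = 0. The data below are invented.

```r
# Egger/PET-style asymmetry sketch in base R (toy, invented data).
effects <- c(0.10, 0.22, 0.15, 0.35, 0.28)
se      <- c(0.04, 0.08, 0.06, 0.14, 0.11)

fit <- lm(effects ~ se, weights = 1 / se^2)        # effect regressed on SE
slope_p   <- coef(summary(fit))["se", "Pr(>|t|)"]  # small-study-effect test
intercept <- unname(coef(fit)[1])                  # PET estimate at SE = 0
```

With only a handful of studies, such tests have very low power, so a non-significant slope should not be read as evidence of no bias.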
#' Publication Bias Assessment for Mediation Meta-Analysis
#'
#' @param effects Indirect effect estimates
#' @param se Standard errors
#' @return Publication bias diagnostics
publication_bias_mediation <- function(effects, se) {
library(metafor)
# Fit random effects model
res <- rma(yi = effects, sei = se, method = "REML")
# Funnel plot data
funnel_data <- data.frame(
effect = effects,
se = se,
precision = 1/se
)
# Egger's test
egger <- regtest(res, model = "lm")
# Trim and fill
tf <- trimfill(res)
  # PET-PEESE: regress effects on SE (PET) or SE^2 (PEESE)
  pet <- lm(effects ~ se, weights = 1/se^2)
  peese <- lm(effects ~ I(se^2), weights = 1/se^2)
  pet_est <- coef(pet)[1]
  peese_est <- coef(peese)[1]
  # Conditional estimator: if the PET intercept differs from zero, use PEESE
  if (coef(summary(pet))["(Intercept)", "Pr(>|t|)"] < 0.10) {
adjusted_estimate <- peese_est
method_used <- "PEESE"
} else {
adjusted_estimate <- pet_est
method_used <- "PET"
}
list(
original_estimate = coef(res),
egger_test = list(
z = egger$zval,
p = egger$pval,
interpretation = ifelse(egger$pval < 0.10,
"Evidence of funnel asymmetry",
"No strong evidence of asymmetry")
),
trim_fill = list(
original_k = res$k,
imputed_k = tf$k0,
adjusted_estimate = coef(tf),
adjusted_ci = c(tf$ci.lb, tf$ci.ub)
),
pet_peese = list(
pet_estimate = pet_est,
peese_estimate = peese_est,
method_used = method_used,
adjusted_estimate = adjusted_estimate
),
funnel_data = funnel_data
)
}
Test whether study-level characteristics explain heterogeneity:
#' Meta-Regression for Mediation Effects
#'
#' @param effects Indirect effect estimates
#' @param se Standard errors
#' @param moderators Data frame of moderator variables
#' @return Meta-regression results
meta_regression_mediation <- function(effects, se, moderators) {
library(metafor)
# Build formula from moderator names
mod_formula <- as.formula(paste("~", paste(names(moderators), collapse = " + ")))
# Fit mixed-effects meta-regression
res <- rma(yi = effects, sei = se,
mods = mod_formula,
data = moderators,
method = "REML")
# R^2 analog
res_null <- rma(yi = effects, sei = se, method = "REML")
R2 <- max(0, (res_null$tau2 - res$tau2) / res_null$tau2)
# Test for residual heterogeneity
QE_test <- list(
QE = res$QE,
df = res$k - res$p,
p = res$QEp
)
list(
coefficients = coef(summary(res)),
tau2_residual = res$tau2,
I2_residual = res$I2,
R2 = R2,
residual_heterogeneity = QE_test,
model = res
)
}
#' Subgroup Analysis for Mediation Meta-Analysis
#'
#' @param effects Indirect effect estimates
#' @param se Standard errors
#' @param subgroup Factor variable defining subgroups
#' @return Subgroup analysis results
subgroup_analysis <- function(effects, se, subgroup) {
groups <- unique(subgroup)
# Within-group estimates
group_results <- lapply(groups, function(g) {
idx <- subgroup == g
meta_indirect(effects[idx], se[idx], method = "RE")
})
names(group_results) <- groups
# Extract pooled estimates
group_effects <- sapply(group_results, function(x) x$pooled_effect)
group_se <- sapply(group_results, function(x) x$se)
group_k <- sapply(group_results, function(x) length(x$study_data$effect))
# Test for subgroup differences
# Q_between = Q_total - sum(Q_within)
overall <- meta_indirect(effects, se, method = "RE")
Q_total <- overall$heterogeneity$Q
Q_within <- sapply(group_results, function(x) x$heterogeneity$Q)
Q_between <- Q_total - sum(Q_within)
df_between <- length(groups) - 1
p_between <- 1 - pchisq(Q_between, df_between)
list(
subgroup_estimates = data.frame(
subgroup = groups,
k = group_k,
effect = group_effects,
se = group_se,
ci_lower = group_effects - 1.96 * group_se,
ci_upper = group_effects + 1.96 * group_se
),
test_for_differences = list(
Q_between = Q_between,
df = df_between,
p = p_between,
interpretation = ifelse(p_between < 0.05,
"Significant subgroup differences",
"No significant subgroup differences")
),
group_results = group_results
)
}
#' Create Forest Plot for Mediation Meta-Analysis
#'
#' @param meta_result Result from meta_indirect()
#' @return ggplot2 forest plot
forest_plot_mediation <- function(meta_result) {
library(ggplot2)
df <- meta_result$study_data
df$ci_lower <- df$effect - 1.96 * df$se
df$ci_upper <- df$effect + 1.96 * df$se
# Add pooled estimate
pooled <- data.frame(
study = "Pooled",
effect = meta_result$pooled_effect,
se = meta_result$se,
weight = NA,
ci_lower = meta_result$ci[1],
ci_upper = meta_result$ci[2]
)
df <- rbind(df, pooled)
df$study <- factor(df$study, levels = rev(df$study))
ggplot(df, aes(x = effect, y = study)) +
geom_vline(xintercept = 0, linetype = "dashed", color = "gray50") +
geom_point(aes(size = weight)) +
geom_errorbarh(aes(xmin = ci_lower, xmax = ci_upper), height = 0.2) +
geom_point(data = df[df$study == "Pooled", ],
shape = 18, size = 5, color = "darkred") +
labs(
x = "Indirect Effect",
y = "",
title = "Forest Plot: Meta-Analysis of Indirect Effects",
subtitle = sprintf("I² = %.1f%%, τ² = %.4f",
meta_result$heterogeneity$I2 * 100,
meta_result$heterogeneity$tau2)
) +
theme_minimal() +
theme(legend.position = "none")
}
Version: 1.0.0 Created: 2025-12-09 Domain: Meta-analysis of mediation effects Applications: Systematic reviews, research synthesis, evidence aggregation