JASA/Biometrika manuscript structure with VanderWeele notation standards
Generates statistical methodology manuscripts following JASA/Biometrika standards with VanderWeele notation.
npx claudepluginhub data-wise/scholar
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Comprehensive guide for writing statistical methodology manuscripts
Use this skill when working on: methodology manuscripts, journal submissions, methods sections, simulation study write-ups, theoretical results presentation, or adapting papers for specific journals (JASA, Biometrika, Biostatistics).
| Element | JASA Requirement |
|---|---|
| Page limit | ~25 pages main text + unlimited supplement |
| Abstract | 150-200 words, no math symbols |
| Keywords | 3-6 keywords after abstract |
| Sections | Standard: Intro, Methods, Theory, Simulation, Application, Discussion |
| References | Author-year format (natbib) |
| Figures | High resolution, grayscale-compatible |
| Code | Reproducibility materials required |
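# JASA-compliant simulation results table
library(dplyr)       # mutate(), across(), %>%
library(knitr)       # kable()
library(kableExtra)  # kable_styling(), add_header_above()

create_jasa_table <- function(results_df) {
  # Format for JASA: clean, no vertical lines, right-aligned numeric columns.
  # The grouped header assumes one label column followed by three metric
  # columns for each of n = 200 and n = 500 (seven columns in total).
  results_df %>%
    mutate(across(where(is.numeric), ~sprintf("%.3f", .x))) %>%
    kable(format = "latex",
          booktabs = TRUE,
          align = c("l", rep("r", ncol(.) - 1)),
          caption = "Simulation results: Bias, SE, and Coverage") %>%
    kable_styling(latex_options = "hold_position") %>%
    add_header_above(c(" " = 1, "n = 200" = 3, "n = 500" = 3))
}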
\documentclass[12pt]{article}
\usepackage{natbib}
\usepackage{amsmath,amssymb}
\usepackage{graphicx}
\usepackage{booktabs}
\title{Your Title Here}
\author{Author One\thanks{Department, University, email} \and
Author Two\thanks{Department, University, email}}
\date{}
\begin{document}
\maketitle
\begin{abstract}
Your abstract here (150-200 words, no math symbols).
\end{abstract}
\noindent\textbf{Keywords:} keyword1; keyword2; keyword3
% Main text goes here
\end{document}
| Paragraph | Purpose | Word Count |
|---|---|---|
| 1 | Hook + Scientific Problem | 100-150 |
| 2 | Existing Methods | 150-200 |
| 3 | Gap/Limitation | 100-150 |
| 4 | Our Contribution | 150-200 |
| 5 | Results Preview | 100-150 |
| 6 | Paper Organization | 50-100 |
# Template for tracking introduction components
intro_checklist <- function() {
data.frame(
paragraph = 1:6,
element = c("Hook + Problem", "Literature", "Gap",
"Contribution", "Results", "Organization"),
key_phrases = c(
"is fundamental to..., has important implications for...",
"Existing methods include..., Prior work has...",
"However, current approaches cannot..., A key limitation is...",
"We propose..., Our method..., We develop...",
"We show that..., Simulations demonstrate..., Application reveals...",
"The remainder of this paper is organized as follows..."
),
status = rep("pending", 6)
)
}
1. Simulation Design
- Data generating process (DGP)
- Sample sizes
- Number of replications
- Scenarios/conditions
2. Methods Compared
- Proposed method
- Competing methods (2-4)
- Oracle/benchmark
3. Performance Metrics
- Bias
- Standard error / RMSE
- Coverage probability
- Efficiency (relative to oracle)
4. Results
- Tables by scenario
- Figures for key patterns
- Sensitivity analyses
# Complete simulation template for mediation methods paper.
# generate_dgp(), the four method functions, true_effect_for(), and
# summarize_simulation() are user-supplied placeholders.
library(purrr)  # map_dfr()

run_simulation_study <- function(n_sims = 1000, n_vec = c(200, 500, 1000)) {
  scenarios <- expand.grid(
    n = n_vec,
    misspecification = c("none", "outcome", "mediator", "both"),
    effect_size = c("small", "medium", "large"),
    stringsAsFactors = FALSE
  )
  results <- map_dfr(seq_len(nrow(scenarios)), function(i) {
    scenario <- scenarios[i, ]
    # True effect implied by this scenario's DGP (placeholder lookup);
    # the original template used true_effect without defining it
    true_effect <- true_effect_for(scenario$effect_size)
    replicate_results <- replicate(n_sims, {
      # Generate data under scenario
      data <- generate_dgp(
        n = scenario$n,
        misspec = scenario$misspecification,
        effect = scenario$effect_size
      )
      # Apply all methods
      list(
        proposed = proposed_method(data),
        baron_kenny = baron_kenny(data),
        product = product_method(data),
        bootstrap = bootstrap_method(data)
      )
    }, simplify = FALSE)
    # Summarize across replications
    summarize_simulation(replicate_results, true_effect)
  })
  results
}
# Standard metrics calculation
calculate_metrics <- function(estimates, true_value, ses) {
list(
bias = mean(estimates) - true_value,
empirical_se = sd(estimates),
mean_se = mean(ses),
rmse = sqrt(mean((estimates - true_value)^2)),
coverage = mean(abs(estimates - true_value) < 1.96 * ses)
)
}
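As a quick sanity check of the metrics function, the sketch below runs it on a toy Monte Carlo experiment in which the estimator is a sample mean. The seed, sample size, number of replications, and true value are illustrative assumptions and are not part of the original template.

```r
# Metrics function as given above, repeated so the example is self-contained
calculate_metrics <- function(estimates, true_value, ses) {
  list(
    bias = mean(estimates) - true_value,
    empirical_se = sd(estimates),
    mean_se = mean(ses),
    rmse = sqrt(mean((estimates - true_value)^2)),
    coverage = mean(abs(estimates - true_value) < 1.96 * ses)
  )
}

set.seed(2024)     # arbitrary seed, for reproducibility only
n_sims <- 2000     # illustrative number of replications
n <- 100           # illustrative sample size per replication
true_mean <- 0.5   # illustrative true parameter value

# Each replication: draw a sample, estimate the mean and its standard error
sims <- replicate(n_sims, {
  y <- rnorm(n, mean = true_mean, sd = 1)
  c(estimate = mean(y), se = sd(y) / sqrt(n))
})

metrics <- calculate_metrics(
  estimates = sims["estimate", ],
  true_value = true_mean,
  ses = sims["se", ]
)
str(metrics)
```

For a well-behaved estimator such as this one, the bias should be near zero, the empirical and mean standard errors should roughly agree, and coverage should sit close to the nominal 95%.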
| Symbol | Meaning | Usage |
|---|---|---|
| $Y$ | Outcome | Capital for random variable |
| $y$ | Observed value | Lowercase for realization |
| $A$ | Treatment | Binary: $A \in \{0,1\}$ |
| $M$ | Mediator | Can be vector $\mathbf{M}$ |
| $X$ | Covariates | Often $\mathbf{X}$ for vector |
| $\theta$ | Parameter | Target of estimation |
| $\hat{\theta}$ | Estimator | Hat for estimate |
| $P, \mathbb{P}$ | Probability | Distribution |
| $E, \mathbb{E}$ | Expectation | Expected value |
% Standard potential outcomes notation
Y(a) % Outcome under treatment a
M(a) % Mediator under treatment a
Y(a,m) % Outcome under treatment a and mediator m
% Mediation effects
NDE(a) = E[Y(1,M(a)) - Y(0,M(a))] % Natural direct effect
NIE(a) = E[Y(a,M(1)) - Y(a,M(0))] % Natural indirect effect
TE = NDE(0) + NIE(1) % Total effect decomposition: TE = E[Y(1,M(1)) - Y(0,M(0))]
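The total effect decomposition depends on which reference levels are paired. The following worked step, written with the definitions of $NDE(a)$ and $NIE(a)$ given above, shows why the direct effect at $a = 0$ pairs with the indirect effect at $a = 1$; it is a standard telescoping argument, not a new result.

```latex
\begin{align*}
TE &= E[Y(1, M(1)) - Y(0, M(0))] \\
   &= E[Y(1, M(1)) - Y(1, M(0))] + E[Y(1, M(0)) - Y(0, M(0))] \\
   &= NIE(1) + NDE(0).
\end{align*}
% Adding and subtracting E[Y(0, M(1))] instead gives the other pairing:
% TE = NDE(1) + NIE(0).
```

Mixing the two pairings (for example, $NDE(0) + NIE(0)$) does not in general recover the total effect, so a manuscript should state which pairing it uses.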
| Aspect | Requirement |
|---|---|
| Resolution | 300+ DPI for print |
| Format | PDF or EPS preferred |
| Colors | Must work in grayscale |
| Font size | Legible at print size (8pt minimum) |
| Legends | Inside figure, not separate |
| Captions | Below figure, complete description |
# JASA-compliant ggplot theme
theme_jasa <- function() {
theme_bw(base_size = 11) +
theme(
panel.grid.minor = element_blank(),
panel.grid.major = element_line(color = "gray90"),
strip.background = element_rect(fill = "gray95"),
legend.position = "bottom",
legend.box = "horizontal",
axis.text = element_text(size = 9),
axis.title = element_text(size = 10),
plot.title = element_text(size = 11, face = "bold")
)
}
# Create publication-ready figure
library(ggplot2)

create_simulation_figure <- function(results) {
  p <- ggplot(results, aes(x = n, y = bias, shape = method, linetype = method)) +
    geom_point(size = 2) +
    geom_line() +
    geom_hline(yintercept = 0, linetype = "dashed", color = "gray50") +
    facet_wrap(~scenario, scales = "free_y") +
    scale_shape_manual(values = c(16, 17, 15, 18)) +
    scale_linetype_manual(values = c("solid", "dashed", "dotted", "dotdash")) +
    labs(
      x = "Sample Size",
      y = "Bias",
      shape = "Method",
      linetype = "Method"
    ) +
    theme_jasa()
  # Pass the plot explicitly: inside a function the plot is never printed,
  # so ggsave()'s default last_plot() would not refer to it.
  ggsave("figure1.pdf", plot = p, width = 7, height = 5, dpi = 300)
  invisible(p)
}
1. Title
2. Abstract (structured or unstructured)
3. Introduction
4. Methods / Methodology
- Notation and Setup
- Identification
- Estimation
- Inference
5. Simulation Study
6. Application / Data Analysis
7. Discussion
8. Acknowledgments
9. References
10. Appendix / Supplementary Materials
- Proofs
- Additional simulations
- Implementation details
Formula: [Method/Approach] for [Problem/Setting]
Examples:
Tips:
Structure (150-250 words):
[1-2 sentences: Problem/motivation]
[1-2 sentences: Gap in existing methods]
[2-3 sentences: Our contribution/approach]
[1-2 sentences: Key results - theory + empirical]
[1 sentence: Implications/availability]
Example:
Mediation analysis is fundamental for understanding causal mechanisms in health research. Existing methods for sequential mediation assume correctly specified parametric models and cannot accommodate high-dimensional confounders. We develop a doubly robust estimator for sequential mediation effects that remains consistent when either the outcome model or the mediator model is correctly specified. We derive the efficient influence function and show that our estimator achieves the semiparametric efficiency bound. Simulations demonstrate substantial efficiency gains over existing approaches, particularly under model misspecification. We apply our method to study the pathway from childhood adversity through inflammation to adult depression using MIDUS data. Software is available in the R package medrobust.
Structure (4-6 paragraphs):
Paragraph 1: Problem and Motivation
Paragraph 2: Existing Approaches
Paragraph 3: Gap/Limitation
Paragraph 4: Our Contribution
Paragraph 5: Results Preview
Paragraph 6: Paper Organization
Tips:
Template:
\section{Notation and Setup}
\label{sec:setup}
Let $O = (Y, A, M, X)$ denote the observed data, where:
\begin{itemize}
\item $Y \in \mathcal{Y}$ is the outcome of interest
\item $A \in \{0,1\}$ is the binary treatment
\item $M \in \mathcal{M}$ is the mediator
\item $X \in \mathcal{X}$ is a vector of pre-treatment confounders
\end{itemize}
We assume $n$ i.i.d. copies $O_1, \ldots, O_n$ from distribution $P$.
\subsection{Causal Framework}
We adopt the potential outcomes framework \citep{Rubin1974}. Let $Y(a)$
denote the potential outcome under treatment $A=a$, and $Y(a,m)$ the
potential outcome when treatment is set to $a$ and mediator to $m$.
Tips:
Structure:
\section{Identification}
\label{sec:identification}
\subsection{Target Estimand}
Our target estimand is [precise definition with formula].
\subsection{Identification Assumptions}
We require the following assumptions:
\begin{assumption}[Consistency]
\label{A:consistency}
$Y = Y(A, M)$ and $M = M(A)$.
\end{assumption}
[... additional assumptions ...]
\subsection{Identification Result}
\begin{theorem}[Identification]
\label{thm:identification}
Under Assumptions \ref{A:consistency}--\ref{A:positivity},
the estimand $\psi$ is identified by [formula].
\end{theorem}
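As one illustration of what could fill the [formula] placeholder, the block below states the familiar mediation formula for the nested counterfactual mean with discrete $M$ and $X$. It assumes consistency, positivity, and the usual no-unmeasured-confounding conditions for natural effects (see VanderWeele, 2015); it is an example of the expected level of precision, not the identification result of any particular paper.

```latex
\begin{equation}
E[Y(a, M(a^*))]
  = \sum_{x} \sum_{m}
    E[Y \mid A = a, M = m, X = x] \,
    P(M = m \mid A = a^*, X = x) \,
    P(X = x).
\end{equation}
% The natural direct and indirect effects follow as contrasts of this
% quantity, e.g. NDE(0) = E[Y(1, M(0))] - E[Y(0, M(0))].
```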
Tips:
Structure:
\section{Estimation}
\label{sec:estimation}
\subsection{Proposed Estimator}
Based on the identification result, we propose the estimator:
\begin{equation}
\hat{\psi}_n = [estimator formula]
\end{equation}
\subsection{Nuisance Estimation}
The estimator depends on nuisance functions $\eta = (\mu, \pi, \ldots)$.
We estimate these using [approach].
\subsection{Algorithm}
[Pseudocode or step-by-step procedure]
Tips:
Structure:
\section{Asymptotic Properties}
\label{sec:theory}
\subsection{Regularity Conditions}
We impose the following regularity conditions:
\begin{condition}
\label{C1}
[Condition statement]
\end{condition}
\subsection{Main Result}
\begin{theorem}[Asymptotic Normality]
\label{thm:asymptotics}
Under Conditions \ref{C1}--\ref{Cn}, as $n \to \infty$:
\[
\sqrt{n}(\hat{\psi}_n - \psi_0) \xrightarrow{d} N(0, V)
\]
where $V = E[\phi(O)^2]$ and $\phi$ is the influence function given by [formula].
\end{theorem}
\subsection{Variance Estimation}
Consistent variance estimation via [approach].
\subsection{Efficiency} % optional
\begin{theorem}[Semiparametric Efficiency]
The estimator $\hat{\psi}_n$ achieves the semiparametric efficiency bound.
\end{theorem}
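The variance formula $V = E[\phi(O)^2]$ in the template suggests a plug-in variance estimator: average the squared estimated influence function values and divide by $n$. The sketch below shows this under the assumption that the estimated influence function has already been evaluated at each observation; `if_inference` is a hypothetical helper name, and the sample mean is used only as a toy estimator.

```r
# Influence-function-based standard error and Wald interval.
# `phi_hat` holds the estimated influence function evaluated at each observation.
if_inference <- function(estimate, phi_hat, level = 0.95) {
  n <- length(phi_hat)
  v_hat <- mean(phi_hat^2)          # plug-in estimate of V = E[phi(O)^2]
  se <- sqrt(v_hat / n)
  z <- qnorm(1 - (1 - level) / 2)   # normal critical value
  c(estimate = estimate, se = se,
    lower = estimate - z * se, upper = estimate + z * se)
}

# Toy check with the sample mean, whose influence function is Y - E[Y]
set.seed(1)
y <- rexp(500, rate = 2)            # illustrative data; true mean is 0.5
psi_hat <- mean(y)
res <- if_inference(psi_hat, phi_hat = y - psi_hat)
print(round(res, 3))
```

When nuisance functions are estimated flexibly, the values in `phi_hat` would typically come from cross-fitted predictions; that step is outside this sketch.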
Tips:
Structure:
\section{Simulation Study}
\label{sec:simulation}
\subsection{Design}
We assess finite-sample performance through Monte Carlo simulation.
\paragraph{Data Generation.}
[Describe DGP with formulas]
\paragraph{Parameter Grid.}
\begin{itemize}
\item Sample size: $n \in \{200, 500, 1000, 2000\}$
\item Effect size: $\psi \in \{0, 0.1, 0.3\}$
\item [Other factors]
\end{itemize}
\paragraph{Estimators.}
We compare:
\begin{enumerate}
\item Proposed estimator
\item [Competitor 1] \citep{...}
\item [Competitor 2] \citep{...}
\item Oracle (if applicable)
\end{enumerate}
\paragraph{Performance Metrics.}
\begin{itemize}
\item Bias: $\text{Bias} = \bar{\hat{\psi}} - \psi_0$
\item Empirical SE: $\text{ESE} = \text{SD}(\hat{\psi})$
\item Average SE: $\text{ASE} = \bar{\widehat{SE}}$
\item Coverage: $\text{Cov} = \text{proportion of CIs containing } \psi_0$
\item MSE: $\text{MSE} = \text{Bias}^2 + \text{ESE}^2$
\end{itemize}
Each scenario: 1000 replications.
\subsection{Results}
[Tables and interpretation]
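The choice of 1000 replications can be justified by reporting Monte Carlo standard errors for the performance metrics, as Morris et al. (2019) recommend. The sketch below uses the standard binomial and sample-mean formulas; the helper names are hypothetical.

```r
# Monte Carlo standard error of an estimated coverage probability
mcse_coverage <- function(coverage, n_sims) {
  sqrt(coverage * (1 - coverage) / n_sims)
}

# Monte Carlo standard error of estimated bias, given the empirical SE
mcse_bias <- function(empirical_se, n_sims) {
  empirical_se / sqrt(n_sims)
}

# Replications needed so that the coverage MCSE does not exceed a target
n_sims_for_coverage <- function(coverage, target_mcse) {
  coverage * (1 - coverage) / target_mcse^2
}

mcse_coverage(0.95, 1000)          # about 0.0069
mcse_bias(0.15, 1000)              # about 0.0047
n_sims_for_coverage(0.95, 0.005)   # about 1900
```

With 1000 replications and true coverage of 95%, estimated coverage is therefore accurate to roughly plus or minus 1.4 percentage points (two Monte Carlo standard errors), which is worth stating alongside the results table.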
Tips:
Structure:
\section{Application}
\label{sec:application}
\subsection{Data Description}
We apply our method to [dataset] to study [scientific question].
[Describe sample, variables, missingness]
\subsection{Analysis}
[Model specification, covariate selection, etc.]
\subsection{Results}
[Point estimates, CIs, interpretation]
\subsection{Sensitivity Analysis}
[Robustness to assumptions]
Tips:
Structure (4-5 paragraphs):
Paragraph 1: Summary
Paragraph 2: Implications
Paragraph 3: Limitations
Paragraph 4: Future Directions
Paragraph 5: Conclusion
Format:
Abstract: ~150 words, unstructured
Sections: Standard methods paper structure
Key reviewer expectations:
Page limit: ~25-30 pages (main), unlimited supplement
Format:
Abstract: ~100-150 words
Emphasis:
Page limit: ~20-25 pages
Format:
Abstract: 250 words max
Emphasis:
Page limit: ~30 pages
Format:
Emphasis:
| Symbol | Meaning |
|---|---|
| $Y(a)$ | Potential outcome under $A=a$ |
| $Y(a,m)$ | Potential outcome under $A=a$, $M=m$ |
| $M(a)$ | Potential mediator under $A=a$ |
| $NDE$ | Natural Direct Effect |
| $NIE$ | Natural Indirect Effect |
| $CDE(m)$ | Controlled Direct Effect at $M=m$ |
| $TE$ | Total Effect |
| $P_M$ | Proportion Mediated |
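For reference, the two symbols in this table that are not defined elsewhere in the guide can be written out as follows. These are the usual difference-scale definitions for a binary treatment; the pairing $NIE(1)$ with the total effect is an assumption consistent with the decomposition $TE = NDE(0) + NIE(1)$.

```latex
\begin{align*}
CDE(m) &= E[Y(1, m) - Y(0, m)], \\
P_M    &= \frac{NIE(1)}{TE}
        = \frac{E[Y(1, M(1)) - Y(1, M(0))]}{E[Y(1, M(1)) - Y(0, M(0))]}.
\end{align*}
```

The proportion mediated is most interpretable when the direct and indirect effects have the same sign, since otherwise $P_M$ can fall outside $[0, 1]$.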
| Symbol | Meaning |
|---|---|
| $\theta_0$ | True parameter value |
| $\hat{\theta}_n$ | Estimator based on $n$ observations |
| $\phi(O)$ | Influence function |
| $\mathbb{P}_n$ | Empirical measure: $n^{-1}\sum_i \delta_{O_i}$ |
| $\mathbb{G}_n$ | Empirical process: $\sqrt{n}(\mathbb{P}_n - P)$ |
| $\xrightarrow{p}$ | Convergence in probability |
| $\xrightarrow{d}$ | Convergence in distribution |
| $O_p(\cdot)$, $o_p(\cdot)$ | Stochastic order |
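These symbols usually appear together in an asymptotic linearity statement. The display below is a generic sketch of that argument, assuming the influence function $\phi$ has mean zero and finite variance under $P$; it is meant to show the notation in use rather than to state conditions for any specific estimator.

```latex
\begin{align*}
\hat{\theta}_n - \theta_0
  &= \mathbb{P}_n \phi + o_p(n^{-1/2})
   = \frac{1}{n} \sum_{i=1}^{n} \phi(O_i) + o_p(n^{-1/2}), \\
\sqrt{n}\,(\hat{\theta}_n - \theta_0)
  &= \mathbb{G}_n \phi + o_p(1)
   \xrightarrow{d} N\bigl(0, E[\phi(O)^2]\bigr).
\end{align*}
% The second line uses P\phi = 0, so that
% \sqrt{n}\,\mathbb{P}_n \phi = \sqrt{n}\,(\mathbb{P}_n - P)\phi = \mathbb{G}_n \phi.
```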
We require the following assumptions for identification:
\begin{assumption}[Name]
\label{A:name}
[Mathematical statement]
\end{assumption}
Assumption \ref{A:name} requires that [plain language explanation]. This is plausible when [conditions]. It would be violated if [counter-examples].
Our main theoretical result establishes the asymptotic properties of $\hat{\psi}_n$.
\begin{theorem}[Title]
\label{thm:main}
Under Conditions \ref{C1}--\ref{Cn}, [statement].
\end{theorem}
Theorem \ref{thm:main} shows that [interpretation]. The key insight is [intuition]. Compared to [existing result], our result [improvement].
Our approach differs from \citet{Author2020} in several ways. First, [difference 1]. Second, [difference 2]. Whereas their method requires [strong assumption], our estimator only needs [weaker assumption]. In the simulation study, we demonstrate [empirical comparison].
Several limitations deserve mention. First, our method assumes [assumption], which may not hold in settings where [violation scenario]. Second, the asymptotic approximation requires [sample size consideration]. Future work could address these by [potential solutions].
\documentclass[12pt]{article}
\usepackage{amsmath,amsthm,amssymb}
\usepackage{natbib}
\usepackage{graphicx}
\usepackage{booktabs}
% Theorem environments
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{corollary}[theorem]{Corollary}
\newtheorem{proposition}[theorem]{Proposition}
\newtheorem{assumption}{Assumption}
\newtheorem{condition}{Condition}
% Custom commands
\newcommand{\E}{\mathbb{E}}
\newcommand{\Var}{\operatorname{Var}}
\newcommand{\Cov}{\operatorname{Cov}}
\newcommand{\indep}{\perp\!\!\!\perp}
\begin{document}
...
\end{document}
\begin{table}[ht]
\centering
\caption{Simulation results: Bias ($\times 100$), SE, and Coverage (\%)}
\label{tab:sim}
\begin{tabular}{lcccccc}
\toprule
& \multicolumn{3}{c}{$n=500$} & \multicolumn{3}{c}{$n=1000$} \\
\cmidrule(lr){2-4} \cmidrule(lr){5-7}
Method & Bias & SE & Cov & Bias & SE & Cov \\
\midrule
Proposed & 0.2 & 0.15 & 94.8 & 0.1 & 0.11 & 95.2 \\
Naive & 5.3 & 0.12 & 82.1 & 5.1 & 0.09 & 71.3 \\
\bottomrule
\end{tabular}
\end{table}
\begin{figure}[ht]
\centering
\includegraphics[width=0.8\textwidth]{figures/sim_results.pdf}
\caption{Simulation results across sample sizes. Left: Bias. Right: Coverage.
Dashed line indicates nominal 95\% level.}
\label{fig:sim}
\end{figure}
Content:
Writing:
Formatting:
Reproducibility:
This skill works with:
VanderWeele notation
JASA style guide
APA citations
Morris, T.P. et al. (2019). Using simulation studies to evaluate statistical methods. Statistics in Medicine.
VanderWeele, T.J. (2015). Explanation in Causal Inference. Oxford.
van der Laan, M.J. & Rose, S. (2018). Targeted Learning in Data Science. Springer.
Version: 1.0 Created: 2025-12-08 Domain: Statistical Methods, Scientific Writing