Design and document statistical algorithms with pseudocode and complexity analysis
Designs statistical algorithms with pseudocode, complexity analysis, and formal documentation.
npx claudepluginhub data-wise/scholar

This skill inherits all available tools. When active, it can use any tool Claude has access to.
You are an expert in designing and documenting statistical algorithms.
Every algorithm must have precise input/output documentation:
INPUT SPECIFICATION:
- Data: D = {(Y_i, A_i, M_i, X_i)}_{i=1}^n where:
- Y_i ∈ ℝ (continuous outcome)
- A_i ∈ {0,1} (binary treatment)
- M_i ∈ ℝ^d (d-dimensional mediator)
- X_i ∈ ℝ^p (p covariates)
- Parameters: θ ∈ Θ ⊆ ℝ^k (parameter space)
- Tolerance: ε > 0 (convergence criterion)
- Max iterations: T_max ∈ ℕ
OUTPUT SPECIFICATION:
- Estimate: θ̂ ∈ ℝ^k (point estimate)
- Variance: V̂ ∈ ℝ^{k×k} (covariance matrix)
- Convergence: boolean (did algorithm converge?)
- Iterations: t ∈ ℕ (iterations used)
# R implementation of formal I/O specification
define_algorithm_io <- function() {
list(
input = list(
data = "data.frame with columns Y, A, M, X",
params = "list(tol = 1e-6, max_iter = 1000)",
models = "list(outcome_formula, mediator_formula, propensity_formula)"
),
output = list(
estimate = "numeric vector of parameter estimates",
se = "numeric vector of standard errors",
vcov = "variance-covariance matrix",
converged = "logical indicating convergence",
iterations = "integer count of iterations"
),
complexity = list(
time = "O(n * p^2) per iteration",
space = "O(n * p)",
iterations = "O(log(1/epsilon)) for Newton-type"
)
)
}
| Criterion | Formula | Use Case |
|---|---|---|
| Absolute | $\lVert \theta^{(t+1)} - \theta^{(t)} \rVert < \varepsilon$ | Parameter convergence |
| Relative | $\lVert \theta^{(t+1)} - \theta^{(t)} \rVert / \lVert \theta^{(t)} \rVert < \varepsilon$ | Scale-invariant |
| Gradient | $\lVert \nabla L(\theta^{(t)}) \rVert < \varepsilon$ | Optimization |
| Function | $\lvert L(\theta^{(t+1)}) - L(\theta^{(t)}) \rvert < \varepsilon$ | Objective convergence |
| Cauchy | $\max_i \lvert \theta_i^{(t+1)} - \theta_i^{(t)} \rvert < \varepsilon$ | Component-wise convergence |
Convergence tolerance: $\varepsilon = 10^{-6}$ (typical default)
Standard tolerances by application (rules of thumb): $10^{-4}$ for exploratory fits or early stopping, $10^{-6}$ for routine parameter estimation, $10^{-8}$ (roughly $\sqrt{\epsilon_{\text{machine}}}$) for high-precision optimization.
Linear complexity $O(n)$: Operations grow proportionally to input size $$T(n) = c \cdot n + O(1)$$
Quadratic complexity $O(n^2)$: Nested iterations over input $$T(n) = c \cdot n^2 + O(n)$$
Linearithmic complexity $O(n \log n)$: Divide-and-conquer with linear work per level $$T(n) = c \cdot n \log_2 n + O(n)$$
Space-Time Tradeoff: $$\text{Time} \times \text{Space} \geq \Omega(\text{Information Content})$$
Convergence rate analysis:
# Comprehensive convergence checking
check_convergence <- function(theta_new, theta_old, gradient = NULL,
objective_new = NULL, objective_old = NULL,
tol = 1e-6, method = "relative") {
switch(method,
"absolute" = {
# |θ^(t+1) - θ^t| < ε
converged <- max(abs(theta_new - theta_old)) < tol
criterion <- max(abs(theta_new - theta_old))
},
"relative" = {
# |θ^(t+1) - θ^t| / |θ^t| < ε
denom <- pmax(abs(theta_old), 1) # Avoid division by zero
converged <- max(abs(theta_new - theta_old) / denom) < tol
criterion <- max(abs(theta_new - theta_old) / denom)
},
"gradient" = {
# |∇L(θ)| < ε
stopifnot(!is.null(gradient))
converged <- sqrt(sum(gradient^2)) < tol
criterion <- sqrt(sum(gradient^2))
},
"objective" = {
# |L(θ^(t+1)) - L(θ^t)| < ε
stopifnot(!is.null(objective_new), !is.null(objective_old))
converged <- abs(objective_new - objective_old) < tol
criterion <- abs(objective_new - objective_old)
}
)
list(converged = converged, criterion = criterion, method = method)
}
# Newton-Raphson with convergence monitoring
newton_raphson <- function(f, grad, hess, theta0, tol = 1e-6, max_iter = 100) {
theta <- theta0
history <- list()
for (t in 1:max_iter) {
g <- grad(theta)
H <- hess(theta)
# Newton step: θ^(t+1) = θ^t - H^(-1) * g
# Time complexity: O(p^3) for matrix inversion
delta <- solve(H, g)
theta_new <- theta - delta
# Check convergence
conv <- check_convergence(theta_new, theta, gradient = g, tol = tol)
history[[t]] <- list(theta = theta, gradient_norm = sqrt(sum(g^2)))
if (conv$converged) {
return(list(
estimate = theta_new,
iterations = t,
converged = TRUE,
history = history
))
}
theta <- theta_new
}
list(estimate = theta, iterations = max_iter, converged = FALSE, history = history)
}
| Algorithm | Convergence Rate | Iterations to $\varepsilon$ |
|---|---|---|
| Gradient Descent | $O(1/t)$ | $O(1/\varepsilon)$ |
| Accelerated GD | $O(1/t^2)$ | $O(1/\sqrt{\varepsilon})$ |
| Newton-Raphson | Quadratic | $O(\log\log(1/\varepsilon))$ |
| EM Algorithm | Linear | $O(\log(1/\varepsilon))$ |
| Coordinate Descent | Linear | $O(p \cdot \log(1/\varepsilon))$ |
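The rate differences in the table can be seen numerically. A minimal sketch comparing linear convergence (fixed-point iteration) against quadratic convergence (Newton) on the toy root-finding problem cos(x) = x; the function names `fp` and `newton1d` are illustrative:

```r
# Linear vs quadratic convergence on the fixed point cos(x) = x
fp <- function(x0, n) { x <- x0; for (i in 1:n) x <- cos(x); x }  # linear rate ~ |sin(x*)|
newton1d <- function(x0, n) {                                     # Newton on g(x) = cos(x) - x: quadratic
  x <- x0
  for (i in 1:n) x <- x - (cos(x) - x) / (-sin(x) - 1)
  x
}
xstar <- 0.7390851332151607                 # the fixed point (Dottie number)
err_fp <- abs(fp(1, 10) - xstar)            # still far from 1e-6 after 10 iterations
err_newton <- abs(newton1d(1, 4) - xstar)   # near machine precision after 4
```

Ten linear-rate iterations leave a visible error, while four Newton steps exhaust double precision, matching the $O(\log(1/\varepsilon))$ versus $O(\log\log(1/\varepsilon))$ iteration counts above.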
ALGORITHM: [Name]
INPUT: [List inputs with types]
OUTPUT: [List outputs with types]
1. [Initialize]
2. [Main loop or procedure]
2.1 [Sub-step]
2.2 [Sub-step]
3. [Return]
ALGORITHM: Augmented IPW for Mediation
INPUT:
- Data (Y, A, M, X) of size n
- Propensity model specification
- Outcome model specification
- Mediator model specification
OUTPUT:
- Point estimate ψ̂
- Standard error SE(ψ̂)
- 95% confidence interval
1. ESTIMATE NUISANCE FUNCTIONS
1.1 Fit propensity score: π̂(x) = P̂(A=1|X=x)
1.2 Fit mediator density: f̂(m|a,x)
1.3 Fit outcome regression: μ̂(a,m,x) = Ê[Y|A=a,M=m,X=x]
2. COMPUTE PSEUDO-OUTCOMES
For i = 1 to n:
2.1 Compute IPW weight: w_i = A_i/π̂(X_i) + (1-A_i)/(1-π̂(X_i))
2.2 Compute outcome prediction: μ̂_i = μ̂(A_i, M_i, X_i)
2.3 Compute augmentation term
2.4 φ_i = w_i(Y_i - μ̂_i) + [integration term]
3. ESTIMATE AND INFERENCE
3.1 ψ̂ = n⁻¹ Σᵢ φ_i
   3.2 SE = √(n⁻² Σᵢ (φ_i - ψ̂)²)   [the SD of the φ_i divided by √n]
3.3 CI = [ψ̂ - 1.96·SE, ψ̂ + 1.96·SE]
4. RETURN (ψ̂, SE, CI)
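The inference pattern in Steps 2-3 (average an influence function, divide its SD by √n) can be sketched in R for the simpler AIPW estimator of an average treatment effect; the mediation-specific integration term in Step 2.4 is omitted, and the simulated data and model formulas are illustrative:

```r
# AIPW sketch for the ATE (mediation integration term omitted)
set.seed(1)
n <- 2000
X <- rnorm(n)
A <- rbinom(n, 1, plogis(0.5 * X))
Y <- 1 + A + 0.5 * X + rnorm(n)                        # true effect = 1
ps <- fitted(glm(A ~ X, family = binomial))            # step 1.1: propensity
m1 <- predict(lm(Y ~ X, subset = A == 1), newdata = data.frame(X = X))  # step 1.3
m0 <- predict(lm(Y ~ X, subset = A == 0), newdata = data.frame(X = X))
phi <- (m1 - m0) + A / ps * (Y - m1) - (1 - A) / (1 - ps) * (Y - m0)    # step 2
psi_hat <- mean(phi)                                   # step 3.1
se <- sd(phi) / sqrt(n)                                # step 3.2
ci <- psi_hat + c(-1.96, 1.96) * se                    # step 3.3
```

With correctly specified nuisance models, `psi_hat` lands near the true effect of 1 and the CI has approximately nominal coverage.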
Formal Definition: $f(n) = O(g(n))$ if $\exists c, n_0$ such that $f(n) \leq c \cdot g(n)$ for all $n \geq n_0$
| Complexity | Name | Example | Operations at n=1000 |
|---|---|---|---|
| $O(1)$ | Constant | Array access | 1 |
| $O(\log n)$ | Logarithmic | Binary search | ~10 |
| $O(n)$ | Linear | Single loop | 1,000 |
| $O(n \log n)$ | Linearithmic | Merge sort, FFT | ~10,000 |
| $O(n^2)$ | Quadratic | Nested loops | 1,000,000 |
| $O(n^3)$ | Cubic | Matrix multiplication | 1,000,000,000 |
| $O(2^n)$ | Exponential | Subset enumeration | ~10^301 |
Master Theorem for recurrences $T(n) = aT(n/b) + f(n)$ with $a \geq 1$, $b > 1$:
- If $f(n) = O(n^{\log_b a - \epsilon})$ for some $\epsilon > 0$: $T(n) = \Theta(n^{\log_b a})$
- If $f(n) = \Theta(n^{\log_b a})$: $T(n) = \Theta(n^{\log_b a} \log n)$
- If $f(n) = \Omega(n^{\log_b a + \epsilon})$ and the regularity condition $a f(n/b) \leq c f(n)$ holds for some $c < 1$: $T(n) = \Theta(f(n))$
Sorting lower bound: Any comparison-based sort requires $\Omega(n \log n)$ comparisons
Matrix operations: matrix-vector product $O(n^2)$; triangular solve $O(n^2)$; multiplication, inversion, LU, and Cholesky $O(n^3)$ (naive multiplication; $O(n^{2.81})$ via Strassen)
# Complexity analysis helper
analyze_complexity <- function(f, n_values = c(100, 500, 1000, 5000)) {
times <- sapply(n_values, function(n) {
system.time(f(n))[["elapsed"]]
})
  # Fit log-log regression to estimate complexity
  # (requires all times > 0; use larger n_values if any elapsed time is 0)
fit <- lm(log(times) ~ log(n_values))
estimated_power <- coef(fit)[2]
list(
times = data.frame(n = n_values, time = times),
estimated_complexity = paste0("O(n^", round(estimated_power, 2), ")"),
power = estimated_power
)
}
| Algorithm | Time | Space |
|---|---|---|
| OLS | O(np² + p³) | O(np) |
| Logistic (Newton) | O(np² + p³) per iter | O(np) |
| Bootstrap (B reps) | O(B × base) | O(n) |
| MCMC (T iters) | O(T × per_iter) | O(n + T) |
| Cross-validation (K) | O(K × base) | O(n) |
| Random forest | O(n log n × B × p) | O(n × B) |
TIME COMPLEXITY:
- Initialization: O(...)
- Per iteration: O(...)
- Total (T iterations): O(...)
- Convergence typically in T = O(...) iterations
SPACE COMPLEXITY:
- Data storage: O(n × p)
- Working memory: O(...)
- Output: O(...)
CONVERGENCE:
- Type: [Linear/Superlinear/Quadratic]
- Rate: [Expression]
- Conditions: [What must hold]
- Stopping criterion: [When to stop]
- Typical iterations: [Order of magnitude]
ALGORITHM: Gradient Descent
INPUT: f (objective), ∇f (gradient), x₀ (initial), η (step size), ε (tolerance)
OUTPUT: x* (minimizer)
1. k ← 0
2. WHILE ‖∇f(xₖ)‖ > ε:
2.1 xₖ₊₁ ← xₖ - η∇f(xₖ)
2.2 k ← k + 1
3. RETURN xₖ
COMPLEXITY: O(iterations × gradient_cost)
CONVERGENCE: Linear with rate (1 - η·μ) for μ-strongly convex f
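The pseudocode above translates directly to R. A minimal sketch on a strongly convex quadratic; the matrix `A`, vector `b`, and step size `eta = 0.4` are illustrative (chosen so that η < 2/L for the largest eigenvalue L = 2):

```r
# Gradient descent sketch: minimize 0.5 * x'Ax - b'x, gradient A x - b
gradient_descent <- function(grad, x0, eta, eps = 1e-6, max_iter = 1e4) {
  x <- x0
  for (k in 1:max_iter) {
    g <- grad(x)
    if (sqrt(sum(g^2)) <= eps) return(list(x = x, iterations = k - 1))  # step 2 test
    x <- x - eta * g                                                    # step 2.1
  }
  list(x = x, iterations = max_iter)
}
A <- matrix(c(2, 0, 0, 1), 2, 2)
b <- c(1, 1)
res <- gradient_descent(function(x) A %*% x - b, x0 = c(0, 0), eta = 0.4)
res$x  # converges to solve(A, b) = (0.5, 1)
```

The iteration count scales with the condition number of `A`, consistent with the linear rate $(1 - \eta\mu)$ stated above.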
ALGORITHM: Newton-Raphson
INPUT: f, ∇f, ∇²f, x₀, ε
OUTPUT: x*
1. k ← 0
2. WHILE ‖∇f(xₖ)‖ > ε:
2.1 Solve ∇²f(xₖ)·d = -∇f(xₖ) for direction d
2.2 xₖ₊₁ ← xₖ + d
2.3 k ← k + 1
3. RETURN xₖ
COMPLEXITY: O(iterations × p³) for p-dimensional
CONVERGENCE: Quadratic near solution
ALGORITHM: Expectation-Maximization
INPUT: Data Y, model parameters θ₀, tolerance ε
OUTPUT: MLE θ̂
1. θ ← θ₀
2. REPEAT:
2.1 E-STEP: Compute Q(θ'|θ) = E[log L(θ'|Y,Z) | Y, θ]
2.2 M-STEP: θ_new ← argmax_θ' Q(θ'|θ)
2.3 Δ ← |θ_new - θ|
2.4 θ ← θ_new
3. UNTIL Δ < ε
4. RETURN θ
CONVERGENCE: Monotonic increase in likelihood
Linear rate near optimum
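The E- and M-steps above can be sketched for a two-component Gaussian mixture with unit variances, where the M-step maximizer of Q is available in closed form; the simulated data and starting values are illustrative:

```r
# EM sketch: two-component Gaussian mixture, sd = 1, unknown means and weight
set.seed(42)
y <- c(rnorm(300, -2), rnorm(300, 2))
p <- 0.5; mu <- c(-1, 1)                              # theta_0
for (iter in 1:200) {
  # E-step: posterior responsibility of component 1 for each observation
  d1 <- p * dnorm(y, mu[1]); d2 <- (1 - p) * dnorm(y, mu[2])
  r <- d1 / (d1 + d2)
  # M-step: closed-form argmax of Q (weighted means, average responsibility)
  mu_new <- c(sum(r * y) / sum(r), sum((1 - r) * y) / sum(1 - r))
  p_new <- mean(r)
  if (max(abs(c(mu_new - mu, p_new - p))) < 1e-8) break   # step 3
  mu <- mu_new; p <- p_new
}
mu  # close to the true means (-2, 2)
```

Each sweep increases the observed-data likelihood monotonically, and the parameter changes shrink at a linear rate near the optimum, as noted above.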
ALGORITHM: Nonparametric Bootstrap
INPUT: Data X of size n, statistic T, B (number of replicates)
OUTPUT: SE estimate, CI
1. FOR b = 1 to B:
1.1 Draw X*_b by sampling n observations with replacement from X
1.2 Compute T*_b = T(X*_b)
2. SE_boot ← SD({T*_1, ..., T*_B})
3. CI_percentile ← [quantile(T*, 0.025), quantile(T*, 0.975)]
4. RETURN (SE_boot, CI_percentile)
COMPLEXITY: O(B × cost(T))
NOTES: B ≥ 1000 for SE, B ≥ 10000 for percentile CI
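The algorithm above is a few lines of R. A sketch for the sample median (the data are illustrative, and `B` is kept below the recommended size for a percentile CI to keep the example fast):

```r
# Nonparametric bootstrap sketch: SE and percentile CI for the median
set.seed(7)
x <- rexp(200)                                             # illustrative data
B <- 2000
t_star <- replicate(B, median(sample(x, replace = TRUE)))  # steps 1.1-1.2
se_boot <- sd(t_star)                                      # step 2
ci_pct <- unname(quantile(t_star, c(0.025, 0.975)))        # step 3
```

The cost is B evaluations of the statistic, matching the O(B × cost(T)) complexity stated above.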
ALGORITHM: Parametric Bootstrap
INPUT: Data X, parametric model M, B replicates
OUTPUT: SE estimate
1. Fit θ̂ = MLE(X, M)
2. FOR b = 1 to B:
2.1 Generate X*_b ~ M(θ̂)
2.2 Compute θ̂*_b = MLE(X*_b, M)
3. SE_boot ← SD({θ̂*_1, ..., θ̂*_B})
4. RETURN SE_boot
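A sketch of the parametric bootstrap for an Exponential model, where the MLE of the rate is 1/mean and the asymptotic SE is known, so the bootstrap answer can be sanity-checked; data and sample sizes are illustrative:

```r
# Parametric bootstrap sketch: SE of the Exponential rate MLE
set.seed(11)
x <- rexp(100, rate = 2)                  # illustrative data
rate_hat <- 1 / mean(x)                   # step 1: MLE of the rate
B <- 2000
rate_star <- replicate(B, 1 / mean(rexp(length(x), rate = rate_hat)))  # steps 2.1-2.2
se_boot <- sd(rate_star)                  # step 3; compare to rate_hat / sqrt(n)
```

Unlike the nonparametric version, resamples are drawn from the fitted model M(θ̂), so the result is only trustworthy when M is correctly specified.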
# Log-sum-exp trick
log_sum_exp <- function(x) {
max_x <- max(x)
max_x + log(sum(exp(x - max_x)))
}
# Numerically stable variance
stable_var <- function(x) {
n <- length(x)
m <- mean(x)
  sum((x - m)^2) / (n - 1)  # Two-pass: mean first, then centered sum of squares
}
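When observations arrive as a stream and cannot be stored, a true one-pass alternative is Welford's algorithm, which updates the mean and centered sum of squares incrementally (a sketch; the function name is illustrative):

```r
# Welford's one-pass variance: numerically stable, no second pass over x
welford_var <- function(x) {
  n <- 0; m <- 0; M2 <- 0
  for (xi in x) {
    n <- n + 1
    delta <- xi - m
    m <- m + delta / n              # running mean
    M2 <- M2 + delta * (xi - m)     # running centered sum of squares
  }
  M2 / (n - 1)
}
```

It agrees with the two-pass formula to machine precision while avoiding the catastrophic cancellation of the naive E[X²] - (E[X])² form.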