Models censored time-to-event data with scikit-survival: Cox PH (elastic net), Random Survival Forests, boosting, survival SVMs. Computes C-index, Brier score, time-dependent AUC; Kaplan-Meier, competing risks.
npx claudepluginhub jaechang-hits/sciagent-skills --plugin sciagent-skills

This skill uses the workspace's default tool permissions.
scikit-survival is a Python library for time-to-event analysis built on scikit-learn. It handles right-censored data (observations where the event has not yet occurred) using Cox models, ensemble methods, survival SVMs, and non-parametric estimators. All models follow the scikit-learn `fit/predict` API and integrate with Pipelines, cross-validation, and GridSearchCV.
Alternatives: lifelines, pycox, or torchlife.

pip install scikit-survival scikit-learn pandas numpy matplotlib
Python: >= 3.9. Dependencies: scikit-learn, numpy, scipy, pandas, joblib, osqp (for some SVM solvers).
Data format: Survival outcomes are NumPy structured arrays with (event, time) fields. Events are boolean (True = event occurred, False = censored). Times are positive floats.
from sksurv.datasets import load_breast_cancer
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import concordance_index_ipcw
from sklearn.model_selection import train_test_split
from sksurv.preprocessing import encode_categorical
X, y = load_breast_cancer()
X = encode_categorical(X)  # dataset contains categorical columns; encode before fitting
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
rsf = RandomSurvivalForest(n_estimators=100, random_state=42)
rsf.fit(X_train, y_train)
risk_scores = rsf.predict(X_test)
c_index = concordance_index_ipcw(y_train, y_test, risk_scores)[0]
print(f"C-index: {c_index:.3f}") # e.g., 0.68
# Individual survival curves
surv_fns = rsf.predict_survival_function(X_test[:2])
for fn in surv_fns:
    print(f"5-year survival: {fn(365 * 5):.3f}")
Create structured survival arrays and preprocess features.
import numpy as np
import pandas as pd
from sksurv.util import Surv
from sksurv.preprocessing import OneHotEncoder, encode_categorical
from sksurv.datasets import load_gbsg2, load_breast_cancer
from sklearn.preprocessing import StandardScaler
# Create survival outcome from arrays
event = np.array([True, False, True, True, False])
time = np.array([120.0, 365.0, 200.0, 90.0, 400.0])
y = Surv.from_arrays(event=event, time=time)
print(y.dtype) # [('event', '?'), ('time', '<f8')]
# From DataFrame columns
# y = Surv.from_dataframe("event_col", "time_col", df)
# Load built-in datasets
# Available: load_gbsg2, load_breast_cancer, load_veterans_lung_cancer,
# load_whas500, load_aids, load_flchain
X, y = load_gbsg2()
print(f"Shape: {X.shape}, Events: {y['event'].sum()}, "
f"Censoring rate: {1 - y['event'].mean():.1%}")
# Encode categoricals (survival-aware one-hot)
X_encoded = encode_categorical(X) # auto-detect and encode all categorical cols
# Standardize (critical for Cox and SVM models)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_encoded)
from sksurv.io import loadarff
from sksurv.datasets import get_x_y
# Load ARFF format (Weka format); loadarff returns a single DataFrame
data = loadarff("survival_data.arff")
# Split into features and outcome; attr_labels/pos_label must match
# your file's outcome columns and event coding
X_arff, y_arff = get_x_y(data, attr_labels=["event", "time"], pos_label="1")
Semi-parametric model: h(t|x) = h_0(t) * exp(beta^T x). Interpretable coefficients as log hazard ratios.
import numpy as np
from sksurv.linear_model import CoxPHSurvivalAnalysis, CoxnetSurvivalAnalysis, IPCRidge
# Standard Cox PH model
cox = CoxPHSurvivalAnalysis(alpha=0.0, ties="breslow")
cox.fit(X_train, y_train)
print(f"Coefficients: {cox.coef_}") # log hazard ratios
# Hazard ratio interpretation: exp(coef) = HR for 1-unit increase
risk_scores = cox.predict(X_test) # Higher = higher risk
# Survival function for individual patients
surv_funcs = cox.predict_survival_function(X_test[:3])
for fn in surv_funcs:
    print(f"5-year survival: {fn(365 * 5):.3f}")
# Penalized Cox (elastic net) -- for high-dimensional data (p > n)
coxnet = CoxnetSurvivalAnalysis(
l1_ratio=0.9, # 0=Ridge, 1=Lasso, between=Elastic Net
alpha_min_ratio=0.01, # smallest alpha / largest alpha ratio
n_alphas=100, # steps in regularization path
)
coxnet.fit(X_train, y_train)
# Feature selection: non-zero coefficients -- coef_ has one column per alpha
# on the regularization path; inspect the last (smallest) alpha
selected = np.where(coxnet.coef_[:, -1] != 0)[0]
print(f"Selected {len(selected)} / {X_train.shape[1]} features")
# IPCRidge: accelerated failure time model (predicts log survival time)
ipcridge = IPCRidge(alpha=1.0)
ipcridge.fit(X_train, y_train)
log_survival_time = ipcridge.predict(X_test)
Non-parametric tree-based models for complex non-linear relationships.
from sksurv.ensemble import (
RandomSurvivalForest,
GradientBoostingSurvivalAnalysis,
ComponentwiseGradientBoostingSurvivalAnalysis,
ExtraSurvivalTrees,
)
# Random Survival Forest -- robust, minimal tuning
rsf = RandomSurvivalForest(
n_estimators=200, min_samples_split=10, min_samples_leaf=15,
max_features="sqrt", random_state=42, n_jobs=-1,
)
rsf.fit(X_train, y_train)
risk = rsf.predict(X_test)
# Gradient Boosting -- best performance, needs tuning
gbs = GradientBoostingSurvivalAnalysis(
loss="coxph", # "coxph" or "ipcwls" (AFT)
n_estimators=300, learning_rate=0.05, max_depth=3,
subsample=0.8, dropout_rate=0.1, random_state=42,
)
gbs.fit(X_train, y_train)
# ComponentwiseGB -- linear model with automatic feature selection
cgbs = ComponentwiseGradientBoostingSurvivalAnalysis(
n_estimators=100, learning_rate=0.1,
)
cgbs.fit(X_train, y_train)
print(f"Non-zero coefficients: {np.sum(cgbs.coef_ != 0)}")
# ExtraSurvivalTrees -- more regularized than RSF, faster training
est = ExtraSurvivalTrees(n_estimators=100, random_state=42)
est.fit(X_train, y_train)
# Survival curves from any ensemble model
surv_funcs = rsf.predict_survival_function(X_test[:1])
chf_funcs = rsf.predict_cumulative_hazard_function(X_test[:1])
Margin-based learning for survival ranking. Always standardize features.
from sksurv.svm import FastSurvivalSVM, FastKernelSurvivalSVM, HingeLossSurvivalSVM
# Linear SVM -- fast, for linear relationships
lsvm = FastSurvivalSVM(alpha=1.0, rank_ratio=1.0, max_iter=100, random_state=42)
lsvm.fit(X_train_scaled, y_train)
risk = lsvm.predict(X_test_scaled)
# Kernel SVM -- for non-linear relationships (rbf, poly, sigmoid)
ksvm = FastKernelSurvivalSVM(
    alpha=1.0, kernel="rbf", gamma=None,  # None uses 1 / n_features for rbf
    max_iter=50, random_state=42,
)
ksvm.fit(X_train_scaled, y_train)
# Hinge loss variant
hsvm = HingeLossSurvivalSVM(alpha=1.0, random_state=42)
hsvm.fit(X_train_scaled, y_train)
# NaiveSurvivalSVM also available but slower (O(n^3))
from sksurv.kernels import ClinicalKernelTransform
# Clinical kernel: combines clinical + molecular features
# Weighs clinical variables separately from high-dimensional molecular data
transform = ClinicalKernelTransform()  # auto-detects nominal vs continuous columns
X_kern = transform.fit_transform(X_train)  # pairwise clinical kernel matrix
# For reuse inside GridSearchCV, construct with fit_once=True and call
# transform.prepare(X) once before cloning
Estimate survival and hazard curves without model assumptions.
from sksurv.nonparametric import kaplan_meier_estimator, nelson_aalen_estimator
import matplotlib.pyplot as plt
# Kaplan-Meier survival curve
time_km, surv_prob = kaplan_meier_estimator(y["event"], y["time"])
plt.step(time_km, surv_prob, where="post")
plt.xlabel("Time (days)")
plt.ylabel("Survival probability")
plt.title("Kaplan-Meier Estimate")
plt.savefig("km_curve.png", dpi=150)
# With confidence intervals
time_km, surv_prob, conf_int = kaplan_meier_estimator(
y["event"], y["time"], conf_type="log-log",
)
# Nelson-Aalen cumulative hazard
time_na, cum_hazard = nelson_aalen_estimator(y["event"], y["time"])
# Stratified KM by group (treated_mask / control_mask: boolean group indicators)
for group_name, mask in [("Treated", treated_mask), ("Control", control_mask)]:
    t, s = kaplan_meier_estimator(y[mask]["event"], y[mask]["time"])
    plt.step(t, s, where="post", label=group_name)
plt.legend()
Censoring-aware metrics for discrimination and calibration.
from sksurv.metrics import (
concordance_index_censored,
concordance_index_ipcw,
cumulative_dynamic_auc,
integrated_brier_score,
brier_score,
as_concordance_index_ipcw_scorer,
as_integrated_brier_score_scorer,
)
import numpy as np
risk_scores = model.predict(X_test)
# Harrell's C-index (simple, biased with high censoring)
c_harrell = concordance_index_censored(
y_test["event"], y_test["time"], risk_scores
)[0]
# Uno's C-index (recommended -- robust to censoring)
c_uno = concordance_index_ipcw(y_train, y_test, risk_scores)[0]
print(f"C-index (Harrell): {c_harrell:.3f}, (Uno): {c_uno:.3f}")
# Time-dependent AUC at clinically relevant timepoints
times = np.array([365, 730, 1095]) # 1, 2, 3 years
auc, mean_auc = cumulative_dynamic_auc(y_train, y_test, risk_scores, times)
print(f"AUC at 1/2/3yr: {auc}, Mean: {mean_auc:.3f}")
# Integrated Brier Score (measures discrimination + calibration)
surv_funcs = model.predict_survival_function(X_test)
preds = np.vstack([fn(times) for fn in surv_funcs])  # np.row_stack was removed in NumPy 2.0
ibs = integrated_brier_score(y_train, y_test, preds, times)
print(f"IBS: {ibs:.3f}") # Lower is better; compare vs KM baseline
Multiple mutually exclusive event types (e.g., death from cancer vs cardiovascular).
from sksurv.nonparametric import cumulative_incidence_competing_risks
from sksurv.linear_model import CoxPHSurvivalAnalysis
from sksurv.util import Surv
import numpy as np
# Cumulative Incidence Function (CIF) estimation
# y_competing: structured array with integer event codes
# (0=censored, 1=relapse, 2=death_in_remission)
# Returns time points plus a (n_risks + 1, n_times) array:
# row 0 is the overall CIF, rows 1..k are the per-risk CIFs
time_pts, cif = cumulative_incidence_competing_risks(
    y_competing["event"], y_competing["time"]
)
import matplotlib.pyplot as plt
plt.step(time_pts, cif[1], where="post", label="Relapse")
plt.step(time_pts, cif[2], where="post", label="Death in remission")
plt.xlabel("Time")
plt.ylabel("Cumulative Incidence")
plt.legend()
# Cause-specific hazard models: fit separate Cox per event type
event_types = np.array([0, 1, 2, 1, 0, 2, 1])
times = np.array([10.2, 5.3, 8.1, 3.7, 12.5, 6.8, 4.2])
# X: feature matrix for the same subjects (first column assumed to be age)
# Model for relapse (event type 1); treat type 2 as censored
y_relapse = Surv.from_arrays(event=(event_types == 1), time=times)
cox_relapse = CoxPHSurvivalAnalysis()
cox_relapse.fit(X, y_relapse)
# Model for death (event type 2); treat type 1 as censored
y_death = Surv.from_arrays(event=(event_types == 2), time=times)
cox_death = CoxPHSurvivalAnalysis()
cox_death.fit(X, y_death)
print(f"Relapse HR (age): {np.exp(cox_relapse.coef_[0]):.3f}")
print(f"Death HR (age): {np.exp(cox_death.coef_[0]):.3f}")
High-dimensional (p > n)?
├── Yes → CoxnetSurvivalAnalysis (elastic net)
└── No
    ├── Need interpretable coefficients?
    │   ├── Yes → CoxPHSurvivalAnalysis or ComponentwiseGB
    │   └── No
    │       ├── Large data (n > 1000) → GradientBoostingSurvivalAnalysis
    │       ├── Medium data → RandomSurvivalForest or FastKernelSurvivalSVM
    │       └── Small data → RandomSurvivalForest
    └── For max performance → compare multiple models
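The same heuristic can be expressed in code; a minimal sketch, assuming only the data shape and an interpretability flag are known (the helper name is hypothetical):

from sksurv.ensemble import RandomSurvivalForest, GradientBoostingSurvivalAnalysis
from sksurv.linear_model import CoxnetSurvivalAnalysis, CoxPHSurvivalAnalysis

def suggest_model(n_samples, n_features, interpretable=False):
    # Hypothetical helper mirroring the decision tree above
    if n_features > n_samples:                     # high-dimensional
        return CoxnetSurvivalAnalysis(l1_ratio=0.9)
    if interpretable:
        return CoxPHSurvivalAnalysis()
    if n_samples > 1000:                           # large data
        return GradientBoostingSurvivalAnalysis()
    return RandomSurvivalForest(n_estimators=200)  # medium/small data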
scikit-survival primarily handles right-censoring (event not observed by end of study). Left-censoring and interval-censoring require other tools (e.g., lifelines). Censoring rate affects metric choice: >40% censoring favors Uno's C-index over Harrell's.
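A quick check of the censoring rate, following the 40% guidance above (the threshold is a rule of thumb, not a library constant):

cens_rate = 1 - y["event"].mean()  # fraction of censored observations
print(f"Censoring rate: {cens_rate:.1%}")
prefer_uno = cens_rate > 0.40  # favor Uno's IPCW C-index under heavy censoring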
| Metric | Measures | Best For |
|---|---|---|
| Harrell's C-index | Ranking only | Quick development, low censoring |
| Uno's C-index | Ranking (censoring-adjusted) | Publication, any censoring rate |
| Time-dependent AUC | Discrimination at fixed t | Clinical decision at specific horizons |
| Brier Score (single t) | Calibration at fixed t | Probability accuracy at timepoint |
| Integrated Brier Score | Overall calibration | Comprehensive model assessment |
Goal: End-to-end analysis with preprocessing, model fitting, and evaluation.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV, train_test_split
from sksurv.datasets import load_breast_cancer
from sksurv.ensemble import GradientBoostingSurvivalAnalysis
from sksurv.metrics import (
concordance_index_ipcw, cumulative_dynamic_auc,
integrated_brier_score, as_concordance_index_ipcw_scorer,
)
import numpy as np
X, y = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Build pipeline
pipe = Pipeline([
("scaler", StandardScaler()),
("model", GradientBoostingSurvivalAnalysis(random_state=42)),
])
# Hyperparameter search -- as_concordance_index_ipcw_scorer wraps an estimator
# (it is not a scoring callable), so parameter names gain an "estimator__" prefix
param_grid = {
    "estimator__model__learning_rate": [0.01, 0.05, 0.1],
    "estimator__model__n_estimators": [100, 200],
    "estimator__model__max_depth": [3, 5],
}
cv = GridSearchCV(as_concordance_index_ipcw_scorer(pipe),
                  param_grid, cv=5, n_jobs=-1)
cv.fit(X_train, y_train)
print(f"Best CV C-index: {cv.best_score_:.3f}, Params: {cv.best_params_}")
# Final evaluation on test set
best = cv.best_estimator_
risk = best.predict(X_test)
c_uno = concordance_index_ipcw(y_train, y_test, risk)[0]
times = np.percentile(y_test["time"][y_test["event"]], [25, 50, 75])
auc, mean_auc = cumulative_dynamic_auc(y_train, y_test, risk, times)
print(f"Test C-index: {c_uno:.3f}, Mean AUC: {mean_auc:.3f}")
Goal: Compare multiple model families and select the best.
from sksurv.linear_model import CoxPHSurvivalAnalysis
from sksurv.ensemble import RandomSurvivalForest, GradientBoostingSurvivalAnalysis
from sksurv.svm import FastSurvivalSVM
from sksurv.metrics import concordance_index_ipcw, integrated_brier_score
from sklearn.preprocessing import StandardScaler
import numpy as np
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)
models = {
"CoxPH": CoxPHSurvivalAnalysis(),
"RSF": RandomSurvivalForest(n_estimators=200, random_state=42),
"GBS": GradientBoostingSurvivalAnalysis(n_estimators=200, random_state=42),
"LinearSVM": FastSurvivalSVM(alpha=1.0, random_state=42),
}
times = np.percentile(y_test["time"][y_test["event"]], [25, 50, 75])
for name, model in models.items():
    X_tr = X_train_s if name in ("CoxPH", "LinearSVM") else X_train
    X_te = X_test_s if name in ("CoxPH", "LinearSVM") else X_test
    model.fit(X_tr, y_train)
    risk = model.predict(X_te)
    c = concordance_index_ipcw(y_train, y_test, risk)[0]
    print(f"{name:12s}: C-index = {c:.3f}")
| Parameter | Module | Default | Range | Effect |
|---|---|---|---|---|
| alpha | CoxPH | 0.0 | >= 0 | Regularization strength (0 = none) |
| l1_ratio | CoxNet | 0.5 | 0-1 | L1 vs L2 penalty (1 = Lasso, 0 = Ridge) |
| n_estimators | RSF, GBS | 100 | 50-1000 | Number of trees/iterations |
| max_depth | RSF, GBS | None / 3 | 1-20 | Tree depth (GBS: 3-5 recommended) |
| learning_rate | GBS | 0.1 | 0.001-0.3 | Step size shrinkage per iteration |
| subsample | GBS | 1.0 | 0.5-1.0 | Fraction of samples per tree |
| dropout_rate | GBS | 0.0 | 0-0.3 | Drop previous learners during training |
| max_features | RSF | None | "sqrt", "log2", int | Features per split |
| alpha | FastSurvivalSVM | 1.0 | 0.01-100 | SVM regularization |
| kernel | FastKernelSVM | "rbf" | "linear", "rbf", "poly", "sigmoid" | Kernel function |
| gamma | FastKernelSVM | None | float or None | Kernel coefficient (None = 1/n_features) |
Always standardize features for Cox and SVM models: Coefficients and margins are scale-dependent. Use StandardScaler inside a Pipeline to prevent leakage.
Use Uno's C-index over Harrell's: More robust to censoring distribution differences; required for comparing models across datasets. Pass y_train as the first argument.
Anti-pattern -- using built-in RSF feature importance: It is biased toward continuous and high-cardinality features. Use sklearn.inspection.permutation_importance() instead.
Report multiple metrics: C-index alone measures only discrimination. Add Brier score for calibration and time-dependent AUC for time-specific performance.
Anti-pattern -- ignoring proportional hazards assumption: Cox models assume constant hazard ratios over time. Validate with Schoenfeld residuals (available in lifelines) or use non-parametric alternatives (RSF, GBS).
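A minimal sketch of the Schoenfeld residual check, assuming the separate lifelines package and a DataFrame df with "time" and "event" columns (names hypothetical):

from lifelines import CoxPHFitter

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")
cph.check_assumptions(df, p_value_threshold=0.05)  # Schoenfeld residual tests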
Ensure sufficient events per feature: Rule of thumb: 10+ events per feature for Cox. With fewer events, use CoxNet (l1_ratio=0.9) or ensemble methods.
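The events-per-feature check is a one-liner on the structured outcome; a sketch assuming X_train/y_train from earlier:

epv = y_train["event"].sum() / X_train.shape[1]  # events per feature
if epv < 10:
    # too few events for a stable unpenalized Cox fit; fall back to CoxNet
    from sksurv.linear_model import CoxnetSurvivalAnalysis
    model = CoxnetSurvivalAnalysis(l1_ratio=0.9)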
Use scikit-learn Pipelines: Wrapping preprocessing + model prevents data leakage and simplifies cross-validation and deployment.
Anti-pattern -- treating competing events as censored for KM: Kaplan-Meier overestimates event probability when competing risks exist. Use cumulative_incidence_competing_risks() instead.
GBS early stopping: For GradientBoosting, set n_estimators=1000 with n_iter_no_change=10 to automatically stop when validation performance plateaus.
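A sketch of that early-stopping configuration (parameter names follow the scikit-learn gradient boosting API):

from sksurv.ensemble import GradientBoostingSurvivalAnalysis

gbs = GradientBoostingSurvivalAnalysis(
    n_estimators=1000, learning_rate=0.05,
    validation_fraction=0.1,  # internal hold-out used to monitor the loss
    n_iter_no_change=10,      # stop after 10 iterations without improvement
    random_state=42,
)
gbs.fit(X_train, y_train)
print(f"Boosting iterations actually run: {gbs.n_estimators_}")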
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import as_concordance_index_ipcw_scorer
pipe = Pipeline([
("scaler", StandardScaler()),
("model", RandomSurvivalForest(random_state=42)),
])
# Wrap the pipeline so cross-validation scores with Uno's C-index;
# parameter names gain the wrapper's "estimator__" prefix
param_grid = {
    "estimator__model__n_estimators": [100, 300, 500],
    "estimator__model__min_samples_split": [6, 10, 20],
    "estimator__model__max_depth": [None, 10, 20],
}
cv = GridSearchCV(as_concordance_index_ipcw_scorer(pipe),
                  param_grid, cv=5, n_jobs=-1)
cv.fit(X, y)
print(f"Best: {cv.best_score_:.3f}, Params: {cv.best_params_}")
from sklearn.inspection import permutation_importance
# With no scoring argument, permutation_importance falls back to the model's
# own .score(), which for scikit-survival estimators is Harrell's C-index
result = permutation_importance(
    rsf, X_test, y_test, n_repeats=15, random_state=42, n_jobs=-1,
)
sorted_idx = result.importances_mean.argsort()[::-1]
for i in sorted_idx[:10]:
    print(f"{X.columns[i]:20s}: {result.importances_mean[i]:.4f} "
          f"+/- {result.importances_std[i]:.4f}")
from sksurv.nonparametric import kaplan_meier_estimator
import matplotlib.pyplot as plt
time, surv, conf_int = kaplan_meier_estimator(
y["event"], y["time"], conf_type="log-log",
)
plt.step(time, surv, where="post", label="KM estimate")
plt.fill_between(time, conf_int[0], conf_int[1], alpha=0.25, step="post")
plt.xlabel("Time")
plt.ylabel("Survival probability")
plt.legend()
plt.savefig("km_with_ci.png", dpi=150)
| Problem | Cause | Solution |
|---|---|---|
| ValueError: y must be a structured array | Wrong outcome format | Use Surv.from_arrays(event, time) -- event must be bool, time must be float |
| concordance_index_ipcw raises error | Missing y_train argument | IPCW needs the training set to estimate the censoring distribution; pass y_train as the first argument |
| Cox model won't converge | Correlated or unstandardized features | Standardize with StandardScaler; switch to CoxnetSurvivalAnalysis |
| Very low C-index (< 0.5) | Wrong sign or non-predictive features | Verify event/time coding; check feature engineering |
| as_concordance_index_ipcw_scorer error in CV | Time range mismatch between folds | Ensure all folds contain events at the evaluated timepoints |
| SVM extremely slow | Large dataset with kernel SVM | Use FastSurvivalSVM (linear, O(np)) for large n; subsample for kernel methods |
| RSF feature importance misleading | Built-in impurity importance is biased | Use permutation_importance() from scikit-learn |
| Kaplan-Meier overestimates | Competing events treated as censoring | Use cumulative_incidence_competing_risks() instead |
Covers: Detailed Cox model family (CoxPH, CoxNet, IPCRidge with parameter guidance and interpretation), ensemble methods (RSF, GBS, ComponentwiseGB, ExtraSurvivalTrees with hyperparameter tuning, early stopping, loss functions), SVM models (FastSurvivalSVM, FastKernelSurvivalSVM, HingeLossSurvivalSVM, NaiveSurvivalSVM, MinlipSurvivalAnalysis with kernel selection), evaluation metrics (Harrell vs Uno C-index selection, time-dependent AUC plotting, Brier score with null model comparison, comprehensive evaluation pipeline), model comparison tables.
Relocated inline: Core usage patterns for all model families consolidated into Core API Modules 2-4 and Module 6. Model selection decision tree relocated to Key Concepts.
Omitted: CoxNet cross-validation alpha selection via scoring string (not a valid sklearn scorer usage -- use as_concordance_index_ipcw_scorer() as shown in Core API). NaiveSurvivalSVM detailed usage (slow, primarily for benchmarking). MinlipSurvivalAnalysis detailed usage (research-only). Redundant "when to use" prose duplicated across original reference files.
Consolidates 4 originals: cox-models.md (182 lines), ensemble-models.md (327 lines), svm-models.md (411 lines), evaluation-metrics.md (378 lines). Total: 1,298 lines.
Covers: Complete data preprocessing workflows (Surv arrays, loading custom CSV, ARFF I/O, categorical encoding methods, standardization, missing data imputation strategies, feature selection), data validation checks (negative times, censoring rate, events-per-feature), train-test splitting (random and stratified by event/time), complete preprocessing pipeline with sklearn, competing risks analysis (CIF estimation, data format with event types, stratified group comparison, cause-specific hazard modeling, complete practical example with simulated data).
Relocated inline: Core Surv array creation, encode_categorical, StandardScaler usage consolidated into Core API Module 1. CIF estimation and cause-specific Cox models relocated to Core API Module 7.
Omitted: Custom preprocessing function with docstring (duplicates sklearn Pipeline pattern shown in Workflow 1). Time-varying covariates note (scikit-survival does not support them -- mentioned only as a 1-line note in original). Fine-Gray sub-distribution model (not available in scikit-survival; noted as lifelines alternative).
Consolidates 2 originals: data-handling.md (494 lines), competing-risks.md (397 lines). Total: 891 lines.