From research-ops-skills
Use when designing a prospective clinical study before submission — selecting and classifying endpoints (primary / key-secondary / exploratory, with surrogate-endpoint flagging), estimating sample size and power for two-arm designs (means / proportions / survival), or scoring a study plan for feasibility and a GO / GO-WITH-CONDITIONS / REDESIGN / NO-GO phase-gate decision. Every output is an ESTIMATE plus a named human owner (clinician / biostatistician / regulatory owner) — never clinical fact, never a finished protocol. Distinct from ra-qm-team, which handles the regulatory/QM submission (ISO 13485, EU MDR, FDA 510(k)/PMA/QSR), not the study design.
Slash command
/research-ops-skills:clinical-research
Prospective clinical study DESIGN: endpoints, sample size / power, and phase-gate feasibility. Every output is an **estimate with stated assumptions** routed to a **named human owner**. This skill never gives clinical advice as fact and never substitutes for a biostatistician or regulatory affairs.
R&D clinical teams, medical monitors, and biostatistics functions live at the moment between we-have-a-hypothesis and we-have-a-protocol-ready-for-submission. This skill structures three of the hardest design decisions:
Three deterministic tools:
- sample_size_estimator.py: Closed-form power / sample-size for two-arm means (Cohen's d), proportions (normal approximation), and survival (Schoenfeld events). Inflates for dropout. Prints an "ESTIMATE — confirm with a biostatistician" banner.
- endpoint_selector.py: Scores candidate endpoints across 5 weighted dimensions (clinical relevance, measurability, regulatory acceptance, sensitivity-to-change, burden) and classifies each as PRIMARY / KEY-SECONDARY / EXPLORATORY. Penalizes unvalidated surrogate endpoints.
- phase_gate_scorer.py: Scores a study plan 0-100 across recruitment feasibility, endpoint readiness, statistical power, operational complexity, and budget fit; returns GO / GO-WITH-CONDITIONS / REDESIGN / NO-GO plus the named owners who must sign.

Invoke this skill when:
Do NOT use this skill to: prepare a regulatory submission or clinical evaluation report (use ra-qm-team), find or position a grant (use research/grants), design a live product A/B experiment (use product-team/experiment-designer), or replace a biostatistician's final sample-size justification.
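The endpoint scoring described in the tool list above can be pictured with a minimal sketch. The source names the five dimensions and the three classes but not the weights, penalty, or cutoffs, so every number below is a hypothetical placeholder, as are the `Endpoint` class and the example scores. Burden is assumed to be scored so that a higher value means lower patient burden. This is an illustration of the idea, not the logic of endpoint_selector.py.

```python
from dataclasses import dataclass

# Hypothetical weights (sum to 1.0); the source lists the dimensions only.
WEIGHTS = {
    "clinical_relevance": 0.30,
    "measurability": 0.20,
    "regulatory_acceptance": 0.25,
    "sensitivity_to_change": 0.15,
    "burden": 0.10,  # assumed orientation: higher score = lower burden
}
SURROGATE_PENALTY = 20.0      # hypothetical, in composite points
PRIMARY_CUTOFF = 75.0         # hypothetical
KEY_SECONDARY_CUTOFF = 55.0   # hypothetical


@dataclass
class Endpoint:
    name: str
    scores: dict            # dimension name -> score on a 0..10 scale
    surrogate: bool = False
    validated: bool = False


def composite(ep: Endpoint) -> float:
    """Weighted 0..100 score, penalized for an unvalidated surrogate."""
    raw = 10 * sum(w * ep.scores[dim] for dim, w in WEIGHTS.items())
    if ep.surrogate and not ep.validated:
        raw -= SURROGATE_PENALTY
    return max(raw, 0.0)


def classify(ep: Endpoint) -> str:
    """Map the composite score to a class; unvalidated surrogates never go primary."""
    score = composite(ep)
    unvalidated_surrogate = ep.surrogate and not ep.validated
    if score >= PRIMARY_CUTOFF and not unvalidated_surrogate:
        return "PRIMARY"
    if score >= KEY_SECONDARY_CUTOFF:
        return "KEY-SECONDARY"
    return "EXPLORATORY"


clinical = Endpoint("clinical outcome", dict.fromkeys(WEIGHTS, 9))
biomarker = Endpoint(
    "unvalidated biomarker",
    {"clinical_relevance": 8, "measurability": 9, "regulatory_acceptance": 3,
     "sensitivity_to_change": 8, "burden": 9},
    surrogate=True,
)
print(classify(clinical), classify(biomarker))
```

The explicit `unvalidated_surrogate` guard is a design choice in this sketch: the penalty alone lowers the score, while the guard makes "cannot be primary" a hard rule regardless of how well the surrogate scores elsewhere.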
1. assets/protocol_synopsis_template.md (objectives, design, population, endpoints, statistical plan placeholder, owners-to-sign).
2. endpoint_selector.py --input endpoints.json --profile {drug|device|biologic|diagnostic|digital-therapeutic}. Read the classification + surrogate flags. If >1 primary, plan multiplicity control.
3. sample_size_estimator.py --design {means|proportions|survival} .... Trace the effect/difference/HR to a published or anchor-based source; inflate for dropout.
4. phase_gate_scorer.py --input study.json --profile <same> --phase {1|2|3|4}. Read the verdict + blockers + named owners.

| Script | Purpose | Profiles |
|---|---|---|
| scripts/sample_size_estimator.py | Power / sample-size for means, proportions, survival | n/a (design-driven) |
| scripts/endpoint_selector.py | 5-dimension endpoint scoring + classification + surrogate flag | drug, device, biologic, diagnostic, digital-therapeutic |
| scripts/phase_gate_scorer.py | Feasibility 0-100 + GO/GO-WITH-CONDITIONS/REDESIGN/NO-GO + owners | drug, device, biologic, diagnostic, digital-therapeutic |
All three scripts are stdlib-only and accept --help, --sample, and --output {human,json}.
Run the onboarding questionnaire once before you start — it captures your defaults and named owners so every tool in this skill is pre-configured. Customization is the point: the answers actually change tool behavior.
python3 scripts/onboard.py # interactive (also: --defaults, --set key=value, --reset)
python3 scripts/onboard.py --show # see the questions + current effective config
Answers are saved to ~/.config/research-ops/clinical-research.json (global) or ./.research-ops/clinical-research.json (--scope project) and are read automatically by config_loader.py. They set the default development-area profile, default alpha / power / dropout, and the named biostatistician / medical monitor / regulatory owner printed on outputs. CLI flags always override saved config; RESEARCH_OPS_NO_CONFIG=1 ignores it entirely.
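The precedence rules in the paragraph above can be sketched as a small merge function. This is not config_loader.py itself: the built-in defaults, the JSON key names, the `effective_config` function, and the choice that a project file beats the global file are all assumptions made for illustration. Only the two file locations, the "CLI flags override saved config" rule, and the RESEARCH_OPS_NO_CONFIG=1 switch come from the text.

```python
import json
import os
from pathlib import Path

GLOBAL_PATH = Path.home() / ".config" / "research-ops" / "clinical-research.json"
PROJECT_PATH = Path(".research-ops") / "clinical-research.json"

# Hypothetical built-in defaults; the source does not state the shipped values.
BUILTIN_DEFAULTS = {"alpha": 0.05, "power": 0.80, "dropout": 0.10}


def _read(path: Path) -> dict:
    """Return the JSON object stored at path, or {} if missing or unreadable."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def effective_config(cli_overrides, env=None,
                     global_path=GLOBAL_PATH, project_path=PROJECT_PATH):
    """Merge defaults < global file < project file < CLI flags."""
    env = os.environ if env is None else env
    cfg = dict(BUILTIN_DEFAULTS)
    if env.get("RESEARCH_OPS_NO_CONFIG") != "1":
        cfg.update(_read(global_path))
        cfg.update(_read(project_path))  # assumption: project scope wins
    # A flag left unset on the command line (None) does not override anything.
    cfg.update({k: v for k, v in cli_overrides.items() if v is not None})
    return cfg
```

Calling `effective_config({"power": 0.9})` would return the saved settings with only `power` replaced, which mirrors the stated rule that CLI flags always win.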
The seven questions: development area · alpha · power · dropout · biostatistician · medical monitor · regulatory owner.
This skill ships an isolated, opt-in bridge to engineering/autoresearch-agent. Only when you ask to "optimize" / "run a loop" does an autoresearch experiment iteratively improve a study plan against this skill's own feasibility score. scripts/ar_evaluator.py is the ground-truth evaluator; it prints feasibility_composite: <0-100> (higher is better).
/ar:setup --domain custom --name trial-feasibility \
--target study.json \
--eval "python3 ar_evaluator.py --target study.json" \
--metric feasibility_composite --direction higher
/ar:loop custom/trial-feasibility
Isolated: no hard dependency — autoresearch runs only on demand, and the loop edits study.json, never the evaluator (locked ground truth).
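Because the loop optimizes a single printed number, it can help to check the evaluator output by hand before starting a run. The sketch below parses a `feasibility_composite: <0-100>` line from captured stdout. The `parse_metric` helper and the regular expression are hypothetical, and nothing here describes how autoresearch itself reads the metric.

```python
import re

# Matches a line such as "feasibility_composite: 72.5" anywhere in the output.
METRIC_LINE = re.compile(
    r"^feasibility_composite:\s*(\d+(?:\.\d+)?)\s*$", re.MULTILINE
)


def parse_metric(stdout: str) -> float:
    """Extract the 0-100 composite score; raise if it is absent or out of range."""
    match = METRIC_LINE.search(stdout)
    if match is None:
        raise ValueError("no feasibility_composite line found")
    value = float(match.group(1))
    if not 0 <= value <= 100:
        raise ValueError(f"score out of range: {value}")
    return value


print(parse_metric("ESTIMATE banner\nfeasibility_composite: 72.5\n"))
```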
- references/study_design_canon.md: ICH E8(R1) general considerations; ICH E9 + E9(R1) estimand addendum; CONSORT 2010; SPIRIT 2013; FDA Multiple Endpoints guidance (2022).
- references/endpoint_and_power.md: Cohen Statistical Power Analysis; Schoenfeld (1983) survival sample size; FDA Surrogate Endpoint Table / BEST glossary; FDA PRO guidance (2009); Chow, Shao & Wang Sample Size Calculations in Clinical Research.
- references/trial_operations.md: ICH E6(R2/R3) GCP; TransCelerate risk-based monitoring; FDA RBM guidance; CTTI recruitment best practices; site-feasibility scoring literature.

--profile. Company- or indication-specific precedent overrides the prior.

| Sibling / neighbor | Scope | Difference |
|---|---|---|
| ra-qm-team | ISO 13485 QMS, ISO 14971 risk, EU MDR tech docs + clinical evaluation, FDA 510(k)/PMA/De Novo/QSR submission | That is the submission; clinical-research designs the study beforehand |
| research/grants | NIH funding discovery + positioning | That finds funding; this designs the trial |
| product-team/experiment-designer | Live product A/B hypothesis + sample size | That is a product experiment; this is a clinical trial |
| research-finance (sibling) | R&D program budget + burn | That funds the program; this scopes the study |
python3 scripts/sample_size_estimator.py --sample
python3 scripts/sample_size_estimator.py --design proportions --p1 0.30 --p2 0.45 --dropout 0.15
python3 scripts/endpoint_selector.py --sample
python3 scripts/phase_gate_scorer.py --sample --output json
The sample correctly flags an unvalidated serum-cytokine surrogate (cannot be primary) and ranks PASI-75 as the PRIMARY endpoint; the phase-gate sample returns a verdict with a named owner chain.
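For orientation, the three closed-form calculations the estimator is described as covering can be sketched with textbook normal-approximation formulas. The function names, the two-sided alpha of 0.05, the power of 0.80, and the unpooled variance for proportions are assumptions of this sketch. The shipped script may use a different variant (pooled variance, continuity correction, t-based adjustment) and print slightly different numbers, so treat the results as estimates to confirm with a biostatistician.

```python
from math import ceil, log
from statistics import NormalDist


def _z_sum(alpha: float, power: float) -> float:
    """z_{1-alpha/2} + z_{power} for a two-sided test."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def n_per_arm_means(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Two-arm comparison of means with standardized effect size d (Cohen's d)."""
    return ceil(2 * (_z_sum(alpha, power) / d) ** 2)


def n_per_arm_proportions(p1: float, p2: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Two-arm comparison of proportions, unpooled normal approximation."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(_z_sum(alpha, power) ** 2 * variance / (p1 - p2) ** 2)


def events_schoenfeld(hr: float, alpha: float = 0.05, power: float = 0.80,
                      allocation: float = 0.5) -> int:
    """Total events needed to detect hazard ratio hr (Schoenfeld approximation)."""
    return ceil(_z_sum(alpha, power) ** 2
                / (allocation * (1 - allocation) * log(hr) ** 2))


def inflate_for_dropout(n: int, dropout: float) -> int:
    """Enroll n / (1 - dropout) so that about n participants complete."""
    return ceil(n / (1 - dropout))


n = n_per_arm_proportions(0.30, 0.45)
print(n, inflate_for_dropout(n, 0.15))
```

Under these assumptions the proportions example (0.30 vs 0.45, 15% dropout) comes to 160 per arm before inflation and 189 after; a standardized effect of 0.5 gives 63 per arm, and a hazard ratio of 0.7 needs about 247 events.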
The questions below are walked one at a time by /cs:grill-research-ops or the orchestrator, each with a recommended answer and a canon citation. They are never bundled.
1. "Is your primary endpoint a clinical outcome or a surrogate, and if surrogate, is it on FDA's validated table?" Recommended: clinical outcome unless the surrogate is validated for this indication. Canon: FDA Surrogate Endpoint Table; BEST (Biomarkers, EndpointS, and other Tools) glossary.
2. "What's the minimal clinically important difference you're powering for, and where did that number come from?" Recommended: a published or anchor-based MCID, cited; never a convenience effect size. Canon: ICH E9; Cohen Statistical Power Analysis.
3. "What dropout rate are you assuming, and is the sample size inflated for it?" Recommended: inflate n by 1/(1 − dropout) using a justified rate. Canon: Chow, Shao & Wang; ICH E9(R1).
4. "Single primary endpoint or multiple, and if multiple, what's the multiplicity control?" Recommended: pre-specify alpha allocation (hierarchical / Bonferroni). Canon: FDA Multiple Endpoints guidance (2022).
5. "Who is the named biostatistician / medical monitor / regulatory owner signing this synopsis?" Recommended: name them now; this output is a recommendation, not a protocol. Canon: ICH E6(R2) GCP roles & responsibilities.
Walk depth-first. Lock 1-2 before opening 3-5. After all are answered, invoke endpoint_selector.py → sample_size_estimator.py → phase_gate_scorer.py.
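The two multiplicity controls named in the fourth question can be illustrated with a short sketch. Bonferroni splits alpha evenly across the primary endpoints, while hierarchical (fixed-sequence) testing spends the full alpha on each endpoint in a pre-specified order and stops at the first failure. The function names and the p-values are hypothetical, and the choice between the two approaches belongs to the named biostatistician.

```python
def bonferroni_alphas(alpha: float, k: int) -> list:
    """Equal alpha allocation across k primary endpoints."""
    return [alpha / k] * k


def fixed_sequence(p_values: list, alpha: float = 0.05) -> list:
    """Hierarchical testing in the given pre-specified order.

    Each endpoint is tested at the full alpha, but only while every
    earlier endpoint in the sequence has been rejected.
    """
    results = []
    gate_open = True
    for p in p_values:
        rejected = gate_open and p <= alpha
        results.append(rejected)
        gate_open = rejected  # the first failure closes the gate for the rest
    return results


print(bonferroni_alphas(0.05, 2))
print(fixed_sequence([0.01, 0.20, 0.001]))
```

In the second call the third endpoint is not declared significant even though its p-value is small, because the second endpoint in the sequence failed. A smaller per-endpoint alpha under Bonferroni also raises the required sample size, so the allocation is worth settling before the sample-size step.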
npx claudepluginhub ai-integr8tor/alirezarezvani-claude-skills --plugin research-ops-skills
First indexed Jun 3, 2026