From probabl-skills
Opinionated Python data science/ML stack: one library per job organized by tier. Triggers on missing imports, library decisions, new projects, or substitute libraries.
npx claudepluginhub probabl-ai/skills --plugin probabl-skills
How this skill is triggered (by the user, by Claude, or both)
Slash command
/probabl-skills:data-science-python-stack
The summary Claude sees in its skill listing (used to decide when to auto-load this skill)
Opinionated stack — one library per job, organized into four tiers
references/ipykernel.md
references/jupyterlab.md
references/jupytext.md
references/keras.md
references/matplotlib.md
references/mlflow.md
references/numpy.md
references/pandas.md
references/plotly.md
references/polars.md
references/pyarrow.md
references/pytest.md
references/pytorch.md
references/ruff.md
references/scikit-learn.md
references/scipy.md
references/seaborn.md
references/skorch.md
references/skore.md
references/skrub.md
Opinionated stack: one library per job, organized into four tiers plus an orthogonal agent feature:
- Tier 1 (mandatory): always installed.
- Tier 2 (competing libraries): the user picks via AskUserQuestion before any
  import lands.
- Tier 3 (optional): added only when the task calls for it.
- Tier 4 (transitive): pulled in as runtime dependencies of the mandatory tier.
- Agent feature: agent-only tooling (ipython, pyright), kept out of the
  production-shape runtime via a manager-specific scope. Install logistics owned
  by python-env-manager § "Agent feature"; consumed by
  audit-ml-pipeline and the opencode LSP integration.

Stop conditions:
1. Every competing-library job requires an AskUserQuestion before any Write
   that imports the library and before any install command runs. "Already
   pulled in transitively" / "user said 'quick'" / "the folder has no
   preference signalled" are not waivers. A silent pick is a
   Stop-condition violation, full stop.
2. If an import fails, install it; do not rewrite to a
   non-stack equivalent (see § "Missing dependency"). The most
   common silent-rewrite path
   (import skrub fails → rewrite as sklearn.Pipeline,
   import skore fails → rewrite as cross_val_score)
   silently undoes the workflow skills' contract.
3. […] AskUserQuestion. The Tier 2 pick is an
   operating-contract gate, not a clarifying question. The same
   applies to user urgency phrasing: "quick baseline", "just do it",
   "go fast", "you pick", "whatever" do NOT resolve a
   competing-library gate. See § "Free-text resolution" in the general rule
   below for what does resolve a gate.
4. Every competing-library job needs either (a) an AskUserQuestion answer
   recorded this session, or (b) a matching row in
   journal/JOURNAL.md Status Workspace decisions. If any
   competing-library job ran without one of those, surface the
   non-compliance to the user explicitly as part of your final
   message; do not hide it.

| Shortcut | Why it feels right | Why it's wrong |
|---|---|---|
| pandas is already pulled in by skore → skip the Tier 2 ask | "Free" library, no install needed | Tier 2 is a project-shape decision (every data.py signature, every fixture); transitive presence is not a pick |
| User said "quick baseline" → assume pandas | Task urgency reads as permission | Urgency phrasing never waives a competing-library gate (Stop conditions above) |
| Folder has no existing tabular code → infer pandas | "No preference signalled" | Inference is a silent pick; the gate requires a structured ask or a recorded JOURNAL.md decision |
| One competing option requires a new pixi add → pick the "free" one | Avoids an install step | Install cost is not the criterion; project fit is |
| User picked pytorch last project → reuse without asking | Continuity is friendly | Each workspace records its own Workspace decisions; cross-project memory is forbidden |

This is the meta-rule that governs every "user choice" entry in
this skill. It applies to the Tier 2 table below and to any new
competing-library job added in the future. It also applies inside
Tier 3 when two optional libraries cover the same job (e.g.
pytorch vs keras as the deep-learning framework).
Whenever the stack offers two or more libraries for the same job:
1. Ask via AskUserQuestion before any import or install. Use the
   options listed for the job in the competing-jobs table; do not
   editorialize the option labels.
2. Record the pick in journal/JOURNAL.md Status under
   Workspace decisions. This block is immutable until the user
   explicitly pivots. On future sessions, read Status first;
   do not re-ask a recorded decision. The persistence contract
   lives in iterate-ml-experiment's JOURNAL.md template: the
   Workspace decisions block is the source of truth for
   cross-session continuity.
3. […] AskUserQuestion.

A user message resolves a competing-library gate only if it names one of the listed options for the job. Apply in priority order:
- […] AskUserQuestion.

When a new job appears in the stack with two viable libraries,
add a row to the Tier 2 competing-jobs table. Every row must
name an explicit Default-on-no-preference — rows without one
are forbidden, because they re-create the silent-pick loophole
this rule exists to close. If a sensible default cannot be
named, the job does not belong in the table; surface the gap to
the user and pick per-project via a free-form AskUserQuestion.
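
The gate behaviour described above (a recorded decision is read first, a message resolves the gate only when it names a listed option, urgency phrasing resolves nothing) can be sketched as below. This is a minimal illustration and not part of any skill: the function resolve_gate, its arguments, the whole-word matching, and the choice to treat a message naming two options as unresolved are all assumptions made for the example.

```python
import re


def resolve_gate(options, recorded=None, message=""):
    """Return the chosen library, or None when AskUserQuestion is still required.

    options  -- listed single-word options for the job, e.g. ["pandas", "polars"]
    recorded -- pick already stored under Workspace decisions, if any
    message  -- the user's free-text message (hypothetical input)
    """
    # A recorded decision is read first and is never re-asked.
    if recorded in options:
        return recorded
    # The message resolves the gate only if it names exactly one listed option.
    # (Multi-word options such as "FastAPI + joblib" are out of scope here.)
    words = set(re.findall(r"[a-z0-9_]+", message.lower()))
    named = [opt for opt in options if opt.lower() in words]
    if len(named) == 1:
        return named[0]
    # "just do it", "you pick", "whatever" name no option: the gate stays open.
    return None
```

A None result means the structured ask still has to happen; the sketch never falls back to a default on its own.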
Two events trigger this skill before any other action:
In both cases, read the whole SKILL.md before deciding. The tier structure below determines whether a library should already be present, needs a user prompt, or is opt-in — that decision can't be made from a single index entry.
When code in this stack needs a library but import fails, the answer
is to install it, not to substitute. Specifically:
- Use python-env-manager to detect the project's
  environment manager (pixi / uv / poetry / hatch / conda / pip+venv)
  and produce the right install command. Don't infer the manager
  from memory; the project may not use the default. Stop and wait
  for confirmation before doing anything else.
- Do not substitute sklearn.Pipeline for skrub, or cross_val_score + handwritten
  metric prints for skore. Substitution silently breaks the contract
  that the workflow skills (build-ml-pipeline,
  evaluate-ml-pipeline, organize-ml-workspace) rely on.

- Read references/<library>.md for the chosen library's
  scope and tradeoffs before introducing it.
- Use pixi by default. If the project already uses a
  different manager (pip+venv, uv, conda), follow that instead.

These five libraries are always installed in a data-science / ML
project. The first three co-own the modeling workflow:
scikit-learn provides the estimators, skrub provides the
data-cleaning + DataOps layer that sits before them, skore
evaluates the result and persists it as a project on disk. The
fourth, ruff, owns lint + format and is non-negotiable: every
project Claude touches should pass ruff check. The fifth,
pytest, runs the smoke test that every approved experiment
must have per the test-ml-pipeline / smoke-test-ml-pipeline
contract — without pytest the smoke-test gate can't enforce
predict-time correctness, so pytest stays mandatory even when
no other tests have been written yet. Each is named explicitly
even when transitively present, because the workflow skills
(build-ml-pipeline, evaluate-ml-pipeline,
python-code-style, test-ml-pipeline) depend on them
directly and should not silently lose them if upstream packaging
changes.
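
Because each Tier 1 library is named explicitly, its presence can be checked directly instead of being assumed from transitive installs. The sketch below is a hypothetical helper, not something shipped by these skills: the TIER1 mapping of distribution names to import names and the function missing_tier1 are assumptions, and ruff is checked as an importable module purely for simplicity. It only reports gaps so they can be installed, never substituted.

```python
from importlib.util import find_spec

# Distribution name -> top-level import name (assumed mapping for this sketch).
TIER1 = {
    "scikit-learn": "sklearn",
    "skrub": "skrub",
    "skore": "skore",
    "ruff": "ruff",
    "pytest": "pytest",
}


def _is_importable(module):
    # find_spec returns None when a top-level module cannot be found.
    return find_spec(module) is not None


def missing_tier1(is_importable=_is_importable):
    """List Tier 1 distributions that are absent from the current environment."""
    return [dist for dist, module in TIER1.items() if not is_importable(module)]
```

Calling missing_tier1() in a project environment returns the distributions to hand to the environment manager; the is_importable parameter exists only so the check can be exercised without touching the real environment.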
scikit-learn — tabular ML
algorithms, preprocessing, model-selection helpers. Use
HistGradientBoosting{Classifier,Regressor} instead of pulling in
xgboost or lightgbm. Evaluation, cross-validation reports, and
model comparison are owned by skore — don't inline
cross_val_score / classification_report for analysis output.
skrub — wrap custom dataframe operations
in a sklearn-compatible computation graph that replays
deterministically across train and test splits. Use for the
data-cleaning + feature-engineering layer that sits before the
sklearn pipeline.
skore — predictive-model evaluation built
on top of scikit-learn (evaluate, EstimatorReport,
CrossValidationReport, ComparisonReport) and experiment
tracking via the Project API (skore.Project(...),
project.put(...), project.get(...)). Replaces ad-hoc
cross_val_score + handwritten metric printouts; replaces
mlflow for tracking. Brings numpy, pandas, matplotlib,
seaborn, plotly, joblib, and others transitively (see
Tier 4) — so static and interactive plotting are available
without any extra install.
Install variant per mode. skore.Project(...) supports three
mutually exclusive modes: local (artifacts on disk; no extra
deps), hub (artifacts on Skore Hub; requires skore[hub]
extra + skore.login() before first use), mlflow (artifacts
in an MLflow tracking server; requires skore[mlflow] extra).
The choice is a workspace-level decision owned by
organize-ml-workspace § "G-SKORE-MODE" — fired at scaffold
alongside G-PKG-NAME / G-TABULAR / G-ENV-MGR. python-env-manager
§ "Tier 1 install: skore variant per mode" maps the recorded
decision to the right install command per env manager.
Default-on-no-preference: local.
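
The mode-to-extra relationship stated above fits in a small lookup. This is a sketch only: the helper name skore_requirement is hypothetical, and turning the requirement string into an actual install command per environment manager remains the job of python-env-manager.

```python
# Mode -> requirement string, taken from the paragraph above.
SKORE_REQUIREMENT = {
    "local": "skore",            # artifacts on disk, no extra deps
    "hub": "skore[hub]",         # Skore Hub; skore.login() needed before first use
    "mlflow": "skore[mlflow]",   # MLflow tracking server
}


def skore_requirement(mode="local"):
    """Return the requirement string for the recorded skore mode."""
    try:
        return SKORE_REQUIREMENT[mode]
    except KeyError:
        raise ValueError(f"unknown skore mode: {mode!r}") from None
```

The default argument mirrors the Default-on-no-preference value; an unknown mode raises instead of silently falling back.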
ruff — single-tool lint + format,
replaces black / isort / flake8 / pydocstyle. Install in
the same feature/env as the rest of the Tier 1 stack so
pixi run ruff works without extra activation. The
configuration (rule selection, numpydoc convention, per-file
ignores) and the rule "Claude runs ruff after generating code"
are owned by the python-code-style skill, which also ships the
canonical ruff.toml template.
pytest — test runner for the
smoke-test gate enforced by test-ml-pipeline /
smoke-test-ml-pipeline. Every approved experiment must have a
passing tests/smoke/test_NN_<short_name>.py before its row
in JOURNAL.md can flip to done; pytest is what runs that
test, so the dependency is non-negotiable even on workspaces
that haven't authored any tests yet. Install in the same
feature/env as the rest of the Tier 1 stack.
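
The smoke-test naming rule above can be illustrated with a small path helper. It is a sketch under stated assumptions: smoke_test_path and has_smoke_test are hypothetical names, and reading NN as a two-digit zero-padded experiment number is a guess, since the text does not spell out the format. The helper only checks that the file exists; whether the test passes is still decided by running pytest.

```python
from pathlib import Path


def smoke_test_path(number, short_name, root="."):
    """Build tests/smoke/test_NN_<short_name>.py for an experiment."""
    # Two-digit zero padding for NN is an assumption of this sketch.
    return Path(root) / "tests" / "smoke" / f"test_{number:02d}_{short_name}.py"


def has_smoke_test(number, short_name, root="."):
    """True when the smoke-test file for the experiment exists on disk."""
    return smoke_test_path(number, short_name, root).is_file()
```

For example, smoke_test_path(3, "baseline") points at tests/smoke/test_03_baseline.py relative to the workspace root.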
Jobs in this tier have more than one valid library in the
stack. The user picks via AskUserQuestion before any import or
install (see § "Competing libraries — general rule" above).
Recorded picks live in journal/JOURNAL.md Status Workspace decisions and are read first on every subsequent session.
| Job | Options | Default-on-no-preference |
|---|---|---|
| Tabular dataframe | pandas (+ pyarrow), polars | pandas (free via skore) |
| Deep-learning framework | pytorch, keras (multi-backend) | pytorch |
| sklearn-compatible DL wrapper | skorch (pytorch-only), keras (sklearn-compatible API) | skorch |
| Static vs interactive plotting | matplotlib / seaborn, plotly | task-driven — ask which output shape the user wants |
| Model serving / registry | mlflow.pyfunc + registry, FastAPI + joblib | mlflow |
Per-option detail:
- pandas (+ pyarrow): established tabular library;
  pyarrow is the recommended Parquet engine + Arrow-backed
  dtype backend. pandas is already pulled in by skore
  (Tier 4), so the only explicit install for this option is
  pyarrow if Parquet IO is in scope. See
  references/pandas.md / references/pyarrow.md.
- polars: Arrow-native tabular library; faster on
  large frames, stricter type system. Requires an explicit
  install (not pulled in by anything in Tier 1). See
  references/polars.md.
- Static vs interactive plotting: every option arrives with skore (matplotlib, seaborn,
  plotly all land without an explicit install). The ask is
  which output shape the user wants for this project, not
  which install to run. Pick by output medium: static reports
  / papers / static skore reports → matplotlib + seaborn;
  interactive notebooks / dashboards → plotly.
- Model serving / registry: mlflow is the default (registry + REST out of the
  box); FastAPI + joblib is the lighter custom path.

- Each pick is recorded under Workspace decisions
  before the matching code is written. The tabular gate fires
  on every project; the others fire only when the project's
  roadmap brings them in scope.
- […] Workspace decisions block is amended with the new row.

Add these only when the task calls for them. Do not pre-install.
For NLP, computer vision, or any task where deep learning is the right tool. None of these are mandatory; reach for them only when the project's modeling task requires DL.
The which-library pick (pytorch vs keras as framework, skorch
vs keras as sklearn-compatible wrapper) is a competing-library
job — owned by Tier 2. This section only covers when to
reach for DL at all; the framework choice has its own row in the
Tier 2 competing-jobs table and fires an AskUserQuestion the
first time DL comes into the project's scope.
Per-library reference material:
- pytorch: tensor library with GPU /
  MPS support and autograd; also the GPU alternative to numpy
  for raw numerical work.
- keras: high-level, layer-oriented
  deep-learning API; multi-backend (pytorch / TensorFlow / JAX).
- skorch: wraps a PyTorch nn.Module
  so it behaves like a sklearn estimator (fit / predict,
  GridSearchCV, pipelines).

The which-library pick (mlflow.pyfunc vs FastAPI +
joblib) is a competing-library job, owned by Tier 2. This
section only covers when serving is in scope.
Per-library reference material:
- mlflow: model packaging, registry,
  and REST serving (mlflow.pyfunc, mlflow models serve).
  Use only for serving and registry concerns; tracking
  belongs to skore.

For notebook-based work, prefer Python files with # %% cell
markers (jupytext percent format) over .ipynb files. Python
files are diffable and version-control friendly; jupytext handles
the conversion to/from notebook format when needed.
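
For illustration, a percent-format script looks like the sketch below. The content is invented for the example (the variable names and values are not from any project); the point is that the file is ordinary Python, with # %% lines marking cell boundaries and # %% [markdown] marking a text cell.

```python
# %% [markdown]
# # Quick look at a few values

# %%
import statistics

values = [1.0, 2.0, 3.0]

# %%
mean_value = statistics.mean(values)
print(f"mean = {mean_value}")
```

The file runs top to bottom with a plain python interpreter and diffs line by line under version control.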
- jupyterlab + ipykernel:
  ambient in the dev feature (alongside ruff + pytest,
  per python-env-manager § "Where does the package belong?").
  Always installed; no per-project ask. The reference pages
  describe the tools' role, not an opt-in install.
- jupytext: Tier 3 opt-in. Syncs
  .ipynb ↔ .py (# %% markers) so the notebook source of
  truth stays version-control friendly. Install only when the
  project wants .ipynb interop with the # %% scripts.

These land in the env as runtime dependencies of the mandatory tier (or of the chosen tabular library). Documented here so you don't add a redundant explicit dependency, and so you know what's available without an extra install.
- numpy: N-d arrays, numerical
  primitives. Pulled in by scikit-learn and skore.
- scipy: scientific computing on top of
  numpy (stats, optimize, sparse, signal). Supports the array API.
  Pulled in by scikit-learn.
- matplotlib: static plotting
  foundation. Pulled in by skore (via seaborn).
- seaborn: static statistical plots
  (distributions, regression, faceting). Pulled in by skore.
- plotly: interactive plots (hover,
  zoom, pan); browser-based, suited for dashboards and exploratory
  notebooks. Pulled in by skore, so interactive viz is free, no
  extra install needed.

The audit flow owned by audit-ml-pipeline and the editor LSP
integration both need agent-only tooling (ipython + pyright).
These deps don't fit cleanly into the four tiers above:
So they live in their own bucket: the agent feature, a manager-scoped install that composes alongside (not replaces) the data-science deps.
| Library | Role |
|---|---|
| ipython | Powers audit-ml-pipeline/scripts/run_audit.py via IPython.core.interactiveshell.InteractiveShell.run_cell. Executes the audit # %% cells in-process, captures plain-text repr per cell. |
| pyright | Powers the opencode LSP integration for Python files. Surfaces import / type / undefined-symbol diagnostics in the editor. Configured via the bundled pyrightconfig.json template (shipped by python-env-manager). |
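
To make the idea of running # %% cells one at a time concrete, the sketch below splits a percent-format script into cell bodies. It is not run_audit.py and does not use IPython: it swaps the real runner for plain string splitting, executes nothing, and the name split_cells is hypothetical.

```python
def split_cells(source):
    """Split percent-format source into a list of non-empty cell bodies."""
    cells, current = [], []
    for line in source.splitlines():
        if line.startswith("# %%"):
            # A marker closes the previous cell (if it has content) and opens a new one.
            if any(part.strip() for part in current):
                cells.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    # Flush whatever follows the last marker.
    if any(part.strip() for part in current):
        cells.append("\n".join(current).strip())
    return cells
```

Each returned string could then be handed to whatever executes a single cell; markdown cells come back as comment-only bodies because only the marker line is dropped.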
Install + config drop: owned by python-env-manager § "Agent
feature". That skill carries the per-manager install table
(pixi features / uv groups / poetry groups / hatch envs / conda
envs / pip+venv extras) and the pyrightconfig.json placement
step.
Consumed by audit-ml-pipeline and the LSP. When either
consumer fires and the agent feature isn't present, the calling
skill routes through python-env-manager's G-AGENT-FEATURE
gate before proceeding.
No kernel registration. The in-process runner doesn't need a Jupyter kernel.
Distinct from the dev feature's notebook tooling. jupyterlab +
ipykernel are ambient in dev for interactive notebook
editing; jupytext stays Tier 3 opt-in. The agent feature
(ipython + pyright) is its own bucket — ipython powers the
in-process audit runner (no kernel), not user-facing notebook work.
A workspace may have any combination of the three concerns.

- Use the python-env-manager skill: invoke it for any add / remove /
  upgrade. Default recommendation is pixi; if the project
  already uses a different manager (uv / poetry / hatch / conda /
  pip+venv), python-env-manager's detection table picks it up
  and never substitutes one manager for another.
- skore and skrub must always be
  the latest available release.
- One library per job. (skore covers both evaluation and tracking.
  The rule forbids piling a second tool onto a covered job, not a
  single tool covering multiple jobs.)