Skill

mlflow-python

MLflow experiment tracking via Python API. TRIGGERS - MLflow metrics, log backtest, experiment tracking, search runs.

Install

Run in your terminal

npx claudepluginhub terrylica/cc-skills --plugin devops-tools

Tool Access

This skill is limited to using the following tools:

Read, Bash, Grep, Glob

Supporting Assets

references/authentication.md

references/evolution-log.md

references/migration-from-cli.md

references/quantstats-metrics.md

references/query-patterns.md

scripts/create_experiment.py

scripts/get_metric_history.py

scripts/log_backtest.py

scripts/query_experiments.py

Stats

Parent Repo Stars: 28

Parent Repo Forks: 4

Last Commit: Apr 1, 2026

MLflow Python Skill

Unified read/write MLflow operations via Python API with QuantStats integration for comprehensive trading metrics.

ADR: 2025-12-12-mlflow-python-skill

Note: This skill uses Pandas (MLflow API requires it). The mlflow-python path is auto-skipped by the Polars preference hook.

Self-Evolving Skill: This skill improves through use. If instructions are wrong, parameters have drifted, or a workaround was needed, fix this file immediately rather than deferring. Only update for real, reproducible issues.

When to Use This Skill

CAN Do:

Log backtest metrics (Sharpe, max_drawdown, total_return, etc.)
Log experiment parameters (strategy config, timeframes)
Create and manage experiments
Query runs with SQL-like filtering
Calculate 70+ trading metrics via QuantStats
Retrieve metric history (time-series data)

CANNOT Do:

Direct database access to MLflow backend
Artifact storage management (S3/GCS configuration)
MLflow server administration

Prerequisites

Authentication Setup

MLflow uses separate environment variables for credentials (NOT embedded in URI):

# Option 1: mise + .env.local (recommended)
# Create .env.local in skill directory with:
MLFLOW_TRACKING_URI=http://mlflow.eonlabs.com:5000
MLFLOW_TRACKING_USERNAME=eonlabs
MLFLOW_TRACKING_PASSWORD=<password>

# Option 2: Direct environment variables
export MLFLOW_TRACKING_URI="http://mlflow.eonlabs.com:5000"
export MLFLOW_TRACKING_USERNAME="eonlabs"
export MLFLOW_TRACKING_PASSWORD="<password>"
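
A minimal sketch of a pre-flight check for the variables above. The helper name `check_mlflow_env` is hypothetical and is not one of the bundled scripts; it only confirms that the three variables are present and that the URI does not carry credentials in `user:pass@host` form.

```python
import os
from urllib.parse import urlsplit

REQUIRED_VARS = (
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
)


def check_mlflow_env(env=None):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration, not part of the skill.
    """
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]

    parts = urlsplit(env.get("MLFLOW_TRACKING_URI", ""))
    # Credentials belong in the separate variables, not inside the URI.
    if parts.username or parts.password:
        problems.append("MLFLOW_TRACKING_URI embeds credentials")
    return problems
```

Calling `check_mlflow_env()` with no argument inspects the current process environment, so it can be run before any of the scripts below.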

Verify Connection

/usr/bin/env bash << 'SKILL_SCRIPT_EOF'
cd "${CLAUDE_PLUGIN_ROOT}/skills/mlflow-python"
uv run scripts/query_experiments.py experiments
SKILL_SCRIPT_EOF

Quick Start Workflows

A. Log Backtest Results (Primary Use Case)

/usr/bin/env bash << 'SKILL_SCRIPT_EOF_2'
cd "${CLAUDE_PLUGIN_ROOT}/skills/mlflow-python"
uv run scripts/log_backtest.py \
  --experiment "crypto-backtests" \
  --run-name "btc_momentum_v2" \
  --returns path/to/returns.csv \
  --params '{"strategy": "momentum", "timeframe": "1h"}'
SKILL_SCRIPT_EOF_2
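
For orientation, the logging step can be sketched with the public MLflow API roughly as follows. This is an assumption about the general shape of such a script, not the actual contents of `log_backtest.py`; the helper names `parse_params`, `finite_metrics`, and `log_run` are hypothetical.

```python
import json
import math


def parse_params(raw):
    """Parse a --params style JSON string into a dict of strings."""
    params = json.loads(raw)
    if not isinstance(params, dict):
        raise ValueError("--params must be a JSON object")
    # MLflow stores parameter values as strings, so convert up front.
    return {str(k): str(v) for k, v in params.items()}


def finite_metrics(metrics):
    """Drop NaN and infinite values before logging (a choice of this sketch)."""
    return {k: float(v) for k, v in metrics.items() if math.isfinite(float(v))}


def log_run(experiment, run_name, params, metrics):
    """Log one run and return its run ID."""
    import mlflow  # imported lazily; requires the mlflow package

    mlflow.set_experiment(experiment)
    with mlflow.start_run(run_name=run_name) as run:
        mlflow.log_params(params)
        mlflow.log_metrics(finite_metrics(metrics))
        return run.info.run_id
```

With the tracking variables set, `log_run("crypto-backtests", "btc_momentum_v2", parse_params('{"strategy": "momentum"}'), {"sharpe_ratio": 1.8})` would create a single run carrying one parameter and one metric.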

B. Search Experiments

uv run scripts/query_experiments.py experiments

C. Query Runs with Filter

uv run scripts/query_experiments.py runs \
  --experiment "crypto-backtests" \
  --filter "metrics.sharpe_ratio > 1.5" \
  --order-by "metrics.sharpe_ratio DESC"
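
The same query can be expressed directly against the Python API. The sketch below assumes an MLflow version whose `mlflow.search_runs` accepts `experiment_names`; the wrapper `top_runs` and its `search_runs` injection parameter are hypothetical and exist only to keep the example testable without a server.

```python
def top_runs(experiment, filter_string, order_by, search_runs=None):
    """Return matching runs (a pandas DataFrame when backed by MLflow)."""
    if search_runs is None:
        import mlflow  # imported lazily; requires the mlflow package

        search_runs = mlflow.search_runs
    return search_runs(
        experiment_names=[experiment],
        filter_string=filter_string,
        order_by=[order_by],  # MLflow expects a list of order clauses
    )
```

Against a live server, `top_runs("crypto-backtests", "metrics.sharpe_ratio > 1.5", "metrics.sharpe_ratio DESC")` returns one row per run, with metrics and parameters in columns prefixed `metrics.` and `params.`.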

D. Create New Experiment

uv run scripts/create_experiment.py \
  --name "crypto-backtests-2025" \
  --description "Q1 2025 cryptocurrency trading strategy backtests"
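
A hedged sketch of an idempotent create step using `MlflowClient`. Storing the description under the `mlflow.note.content` tag is an assumption about how the description is persisted, and `ensure_experiment` is a hypothetical name; `create_experiment.py` may do this differently.

```python
def ensure_experiment(name, description, client=None):
    """Return the ID of the named experiment, creating it if it is missing."""
    if client is None:
        from mlflow.tracking import MlflowClient  # requires the mlflow package

        client = MlflowClient()

    existing = client.get_experiment_by_name(name)  # None when not found
    if existing is not None:
        return existing.experiment_id
    # Assumption: the description is kept in the mlflow.note.content tag.
    return client.create_experiment(name, tags={"mlflow.note.content": description})
```

Checking for an existing experiment first avoids the error MLflow raises when an experiment name is already taken.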

E. Get Metric History

uv run scripts/get_metric_history.py \
  --run-id abc123 \
  --metrics sharpe_ratio,cumulative_return
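
Metric history maps onto `MlflowClient.get_metric_history`, which returns one entry per logged value. The following sketch flattens those entries into plain rows; `history_rows` and its `client` injection parameter are hypothetical, and the output shape is not necessarily what `get_metric_history.py` prints.

```python
from datetime import datetime, timezone


def history_rows(run_id, metric_names, client=None):
    """Collect every logged value of the given metrics as a list of dicts."""
    if client is None:
        from mlflow.tracking import MlflowClient  # requires the mlflow package

        client = MlflowClient()

    rows = []
    for name in metric_names:
        for m in client.get_metric_history(run_id, name):
            rows.append(
                {
                    "metric": m.key,
                    "step": m.step,
                    "value": m.value,
                    # MLflow timestamps are milliseconds since the Unix epoch.
                    "time": datetime.fromtimestamp(m.timestamp / 1000, tz=timezone.utc),
                }
            )
    return rows
```

The comma-separated `--metrics` argument corresponds to `metric_names`, for example `"sharpe_ratio,cumulative_return".split(",")`.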

QuantStats Metrics Available

The log_backtest.py script calculates 70+ metrics via QuantStats, including:

Category	Metrics
Ratios	sharpe, sortino, calmar, omega, treynor
Returns	cagr, total_return, avg_return, best, worst
Drawdown	max_drawdown, avg_drawdown, drawdown_days
Trade	win_rate, profit_factor, payoff_ratio, consecutive_wins/losses
Risk	volatility, var, cvar, ulcer_index, serenity_index
Advanced	kelly_criterion, recovery_factor, risk_of_ruin, information_ratio

See quantstats-metrics.md for the full list.
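
To make two of these metrics concrete, here is a dependency-free sketch of the conventional definitions of the Sharpe ratio and maximum drawdown. It is a stand-in for illustration, assuming a zero risk-free rate and 252 periods per year; QuantStats' own implementations may differ in annualization, risk-free handling, and edge cases.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized mean over sample stdev of periodic returns (risk-free = 0)."""
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)  # sample standard deviation
    return mean / stdev * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst
```

For instance, a series that gains 10% and then loses 50% has a maximum drawdown of -0.5, because the equity curve falls from 1.1 to 0.55.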

Bundled Scripts

Script	Purpose
`log_backtest.py`	Log backtest returns with QuantStats metrics
`query_experiments.py`	Search experiments and runs (replaces CLI)
`create_experiment.py`	Create new experiment with metadata
`get_metric_history.py`	Retrieve metric time-series data

Configuration

The skill uses the mise [env] pattern for configuration. See .mise.toml for defaults.

Create .env.local (gitignored) for credentials:

MLFLOW_TRACKING_URI=http://mlflow.eonlabs.com:5000
MLFLOW_TRACKING_USERNAME=eonlabs
MLFLOW_TRACKING_PASSWORD=<password>

Reference Documentation

Authentication Patterns - Idiomatic MLflow auth
QuantStats Metrics - Full list of 70+ metrics
Query Patterns - DataFrame operations
Migration from CLI - CLI to Python API mapping

Migration from mlflow-query

This skill replaces the CLI-based mlflow-query skill. Key differences:

Feature	mlflow-query (old)	mlflow-python (new)
Log metrics	Not supported	`mlflow.log_metrics()`
Log params	Not supported	`mlflow.log_params()`
Query runs	CLI text parsing	DataFrame output
Metric history	Workaround only	Native support
Auth pattern	Embedded in URI	Separate env vars

See migration-from-cli.md for detailed mapping.

Troubleshooting

Issue	Cause	Solution
Connection refused	MLflow server not running	Verify MLFLOW_TRACKING_URI and server status
Authentication failed	Wrong credentials	Check MLFLOW_TRACKING_USERNAME and MLFLOW_TRACKING_PASSWORD in .env.local
Experiment not found	Experiment name typo	Run `query_experiments.py experiments` to list all
QuantStats import error	Missing dependency	`uv add quantstats` in skill directory
Pandas import warning	Expected for this skill	Ignore - MLflow requires Pandas (hook-excluded)
Run creation fails	Experiment doesn't exist	Use `create_experiment.py` to create first
Metric history empty	Wrong run_id or metric name	Verify run_id with `query_experiments.py runs`
Returns CSV parse error	Wrong date format or columns	Check CSV has date index and returns column
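
For the last row, a small validator can localize CSV problems before logging. This sketch assumes the columns are literally named `date` and `returns` and that dates are ISO formatted (`YYYY-MM-DD`); both are assumptions, so adjust them to whatever `log_backtest.py` actually expects.

```python
import csv
import io
from datetime import date


def validate_returns_csv(text, date_col="date", returns_col="returns"):
    """Return a list of problems found in CSV text; empty means it parsed."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [c for c in (date_col, returns_col) if c not in header]
    if missing:
        return [f"missing column: {c}" for c in missing]

    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            date.fromisoformat(row[date_col])
            float(row[returns_col])
        except (TypeError, ValueError) as exc:
            problems.append(f"line {line_no}: {exc}")
    return problems
```

Passing the file contents, for example `validate_returns_csv(open("path/to/returns.csv").read())`, yields one message per offending line.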

Post-Execution Reflection

After this skill completes, check before closing:

Did the command succeed? If not, fix the instruction or the Troubleshooting row that caused the failure.
Did parameters or output change? If the underlying tool's interface drifted, update the Quick Start examples and the tables to match.
Was a workaround needed? If you had to improvise (different flags, extra steps), update this SKILL.md so the next invocation doesn't need the same workaround.

Only update if the issue is real and reproducible — not speculative.