Registers trained models in MLflow Model Registry with versioning, stage transitions (Staging, Production, Archived), approval workflows, and lineage tracking. Use for promoting models to production, governance, rollback, and compliance.
npx claudepluginhub pjt222/agent-almanac
This skill is limited to using the following tools:
> See [Extended Examples](references/EXAMPLES.md) for complete configuration files and templates.
Manages model registry operations for ML deployment, providing step-by-step guidance, production-ready code, and configurations for model serving, MLOps pipelines, monitoring, and optimization.
Tracks AI/ML model versions using MLflow: logs hyperparameters/metrics, registers models, manages Staging/Production stages, compares performance, generates model cards.
Guides MLOps workflows for ML model deployment: readiness checklists, serving infrastructure (FastAPI, SageMaker, Triton), inference optimization, versioning, A/B testing, drift detection, retraining, and monitoring.
Implement MLflow Model Registry for systematic model versioning, stage management, and deployment governance.
Set up MLflow Model Registry with database backend (file-based registry not recommended for production).
# Start MLflow server with Model Registry support
mlflow server \
  --backend-store-uri postgresql://user:pass@localhost:5432/mlflow \
  --default-artifact-root s3://mlflow-artifacts/models \
  --host 0.0.0.0 \
  --port 5000
Python configuration:
# model_registry_config.py
import mlflow
from mlflow.tracking import MlflowClient
# Set tracking URI (must support Model Registry)
MLFLOW_TRACKING_URI = "http://mlflow-server.company.com:5000"
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)
# ... (see EXAMPLES.md for complete implementation)
Expected: Model Registry UI tab appears in MLflow, search_registered_models() returns successfully (even if empty), database contains registered_models table.
On failure: Verify MLflow version ≥1.4 (Model Registry introduced in 1.4), check database backend (SQLite not fully supported for Model Registry), ensure --backend-store-uri points to a database (not file://), verify database user has CREATE TABLE permissions, check MLflow server logs for migration errors.
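As a quick sanity check, a minimal probe along the following lines (assuming the tracking URI configured above) confirms that the backend store actually supports the Model Registry before anything is registered:

```python
# verify_registry.py -- minimal sketch; assumes the tracking URI configured above
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://mlflow-server.company.com:5000")
client = MlflowClient()

# Raises MlflowException if the backend store has no Model Registry support
models = client.search_registered_models()
print(f"Registry reachable; {len(models)} registered model(s) found.")
```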
Register a logged model to the Model Registry with comprehensive metadata.
# register_model.py
import mlflow
from mlflow.tracking import MlflowClient
from model_registry_config import MLFLOW_TRACKING_URI
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)
client = MlflowClient()
# ... (see EXAMPLES.md for complete implementation)
Expected: New model version appears in Model Registry UI, version includes description and tags, model artifacts are accessible via models:/<model-name>/<version> URI, model signature and input example are preserved.
On failure: Verify run_id exists and has completed (client.get_run(run_id)), check model artifact path matches logged artifact (mlflow.search_runs() to inspect), ensure model was logged with proper framework flavor (mlflow.sklearn.log_model not mlflow.log_artifact), verify no special characters in model name (use hyphens not underscores), check artifact storage accessibility.
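For orientation, a stripped-down registration flow might look like the sketch below; the run ID, model name, and metadata values are placeholders, and the complete implementation is in EXAMPLES.md:

```python
# minimal registration sketch -- run ID, model name, and tag values are placeholders
import mlflow
from mlflow.tracking import MlflowClient

mlflow.set_tracking_uri("http://mlflow-server.company.com:5000")
client = MlflowClient()

run_id = "<run-id-of-a-completed-training-run>"

# Register the model artifact logged under "model" in that run as a new version
model_version = mlflow.register_model(model_uri=f"runs:/{run_id}/model", name="churn-classifier")

# Attach a description and lineage-style tags to the new version
client.update_model_version(
    name="churn-classifier",
    version=model_version.version,
    description="Example description of the training data and objective.",
)
client.set_model_version_tag("churn-classifier", model_version.version, "training_dataset", "customers_2024q1")
```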
Move model versions through stages (None → Staging → Production → Archived) with validation checks.
# stage_management.py
import mlflow
from mlflow.tracking import MlflowClient
from datetime import datetime
client = MlflowClient()
class ModelStageManager:
    # ... (see EXAMPLES.md for complete implementation)
Expected: Model version stage updates in registry, old versions archived automatically, transition timestamps recorded in tags, rollback restores previous production version.
On failure: Check version exists and is in expected stage, verify archive_existing_versions flag behavior (may not archive if only one version), ensure database supports concurrent transactions for stage updates, check for stage transition locks (only one transition per version at a time), verify approval workflow integration.
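The call at the core of such a manager is transition_model_version_stage; a minimal sketch (model name and version are placeholders) looks like this:

```python
# minimal stage-transition sketch -- model name and version are placeholders
from datetime import datetime, timezone
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Promote version 3 and demote whatever currently sits in Production
client.transition_model_version_stage(
    name="churn-classifier",
    version="3",
    stage="Production",
    archive_existing_versions=True,
)

# Record the transition timestamp as a tag for later auditing and rollback
client.set_model_version_tag(
    "churn-classifier", "3",
    "promoted_at", datetime.now(timezone.utc).isoformat(),
)
```

A full manager wraps these calls with validation checks and rollback, as outlined in EXAMPLES.md.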
Use model aliases for stable deployment references (MLflow ≥2.3).
# model_aliases.py
from mlflow.tracking import MlflowClient
client = MlflowClient()
def set_model_alias(model_name, version, alias):
    """Set an alias for a model version (MLflow 2.3+)."""
    # ... (see EXAMPLES.md for complete implementation)
Expected: Aliases appear in Model Registry UI, loading models by alias works (models:/name@alias), updating alias immediately affects new loads, A/B test infrastructure functional.
On failure: Upgrade MLflow to ≥2.3 for native alias support, use tag-based fallback for older versions, verify alias naming (alphanumeric and hyphens only), check for alias conflicts (an alias points to exactly one version at a time).
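Assuming MLflow 2.3 or later, the native alias API comes down to a couple of client calls; the model and alias names below are placeholders:

```python
# minimal alias sketch -- model name and alias are placeholders
import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()  # assumes the tracking URI is already configured

# Point the "champion" alias at version 3; re-pointing it later is a single call
client.set_registered_model_alias(name="churn-classifier", alias="champion", version="3")

# Consumers load by alias, so promotions need no downstream code changes
model = mlflow.pyfunc.load_model("models:/churn-classifier@champion")

# Resolve which concrete version the alias currently targets
mv = client.get_model_version_by_alias(name="churn-classifier", alias="champion")
print(mv.version)
```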
Track full lineage from data to deployment with comprehensive metadata.
# model_lineage.py
import mlflow
from mlflow.tracking import MlflowClient
import json
client = MlflowClient()
def enrich_model_metadata(model_name, version, lineage_data):
    # ... (see EXAMPLES.md for complete implementation)
Expected: Model version tags include comprehensive lineage information, get_model_lineage() returns full history, JSON report contains data source, training details, and deployment info.
On failure: Verify tag values are strings (convert dicts to JSON), check tag key naming (no spaces or special chars), ensure lineage data captured during training, verify run_id is valid and accessible.
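Because registry tag values must be strings, structured lineage records are usually serialized to JSON before being attached; a minimal sketch (all field values are illustrative) follows:

```python
# minimal lineage-tagging sketch -- all values are illustrative placeholders
import json
from mlflow.tracking import MlflowClient

client = MlflowClient()

lineage = {
    "data_source": "s3://datalake/customers/2024q1/",
    "git_commit": "abc1234",
    "training_run_id": "<run-id>",
}

# Tag values must be strings, so serialize the structured record to JSON
client.set_model_version_tag(
    name="churn-classifier",
    version="3",
    key="lineage",
    value=json.dumps(lineage),
)

# Read it back when assembling a lineage report
mv = client.get_model_version(name="churn-classifier", version="3")
print(json.loads(mv.tags["lineage"]))
```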
Integrate model registration into CI/CD pipelines for automated promotion.
# .github/workflows/model_promotion.yml
name: Model Promotion Pipeline
on:
  workflow_dispatch:
    inputs:
      model_name:
        description: 'Model name to promote'
# ... (see EXAMPLES.md for complete implementation)
Python automation script:
# scripts/promote_model.py
import argparse
from stage_management import ModelStageManager
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-name", required=True)
    parser.add_argument("--version", type=int, required=True)
    # ... (see EXAMPLES.md for complete implementation)
Expected: GitHub Actions workflow triggers on manual dispatch, validation tests pass, model promoted to target stage, Slack notification sent, deployment pipeline triggered automatically.
On failure: Check GitHub secrets configuration for MLFLOW_TRACKING_URI, verify network access from GitHub Actions to MLflow server (may need VPN or IP allowlist), ensure validation script has correct metric thresholds, check Slack webhook configuration, verify Python script executable permissions.
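As a rough illustration of what the elided promotion script can boil down to (the metric name and threshold below are assumptions, not part of this skill), the core logic is only a few calls:

```python
# promotion sketch -- the metric name "val_accuracy" and the 0.90 threshold are assumptions
import argparse
from mlflow.tracking import MlflowClient

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--model-name", required=True)
    parser.add_argument("--version", required=True)
    parser.add_argument("--target-stage", default="Production")
    args = parser.parse_args()

    client = MlflowClient()
    mv = client.get_model_version(args.model_name, args.version)

    # Gate promotion on a metric logged in the source training run
    run = client.get_run(mv.run_id)
    if run.data.metrics.get("val_accuracy", 0.0) < 0.90:
        raise SystemExit("Validation metric below threshold; refusing to promote.")

    client.transition_model_version_stage(
        name=args.model_name,
        version=args.version,
        stage=args.target_stage,
        archive_existing_versions=True,
    )
    print(f"Promoted {args.model_name} v{args.version} to {args.target_stage}.")

if __name__ == "__main__":
    main()
```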
Set archive_existing_versions=True to auto-archive the previous version when promoting a new one.

Related skills:
- track-ml-experiments - Log models to MLflow before registering them
- deploy-ml-model-serving - Deploy registered models to serving infrastructure
- run-ab-test-models - A/B test models using registry aliases
- orchestrate-ml-pipeline - Automate model training and registration
- version-ml-data - Version training data for model lineage