Implements data and concept drift monitoring for production ML models using Evidently AI and PSI/KS statistical tests, with alerting workflows. Use for performance degradation, data shifts, or regulatory needs.
```bash
npx claudepluginhub pjt222/agent-almanac
```
> See [Extended Examples](references/EXAMPLES.md) for complete configuration files and templates.
Detect and alert on data drift and concept drift in production ML models using statistical tests and automated monitoring.
Set up the monitoring framework with appropriate dependencies.
```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install Evidently and dependencies
pip install evidently pandas scikit-learn prometheus-client

# Create monitoring directory structure
mkdir -p monitoring/{reports,config,alerts}
```
Create a configuration file:
```python
# monitoring/config/drift_config.py
from evidently.metric_preset import DataDriftPreset, TargetDriftPreset
from evidently.metrics import (
    DatasetDriftMetric,
    DatasetMissingValuesMetric,
    ColumnDriftMetric,
)

# ... (see EXAMPLES.md for complete implementation)
```
Expected: Configuration file created with thresholds matching your model's tolerance.
On failure: Start with conservative thresholds (PSI > 0.2, KS p-value < 0.01) and tune based on false positive rate.
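The complete configuration is in EXAMPLES.md. As a rough sketch, a minimal `drift_config.py` might simply centralize the thresholds mentioned above; every name, path, and value below is an illustrative starting point, not the skill's actual file:

```python
# monitoring/config/drift_config.py -- hypothetical minimal layout, not the
# skill's actual file; tune the values to your model's tolerance.
PSI_THRESHOLD = 0.2          # features with PSI above this are flagged as drifted
KS_P_VALUE_THRESHOLD = 0.01  # KS p-values below this flag numerical drift
DRIFT_SHARE_ALERT = 0.3      # alert when more than 30% of features have drifted

# Hypothetical: features whose drift should always raise a critical alert.
CRITICAL_FEATURES = ["age", "income"]

REFERENCE_DATA_PATH = "data/reference.parquet"  # hypothetical path
REPORT_DIR = "monitoring/reports"
```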
Create a drift detection pipeline with multiple statistical tests.
```python
# monitoring/drift_detector.py
import pandas as pd
import numpy as np
from scipy.stats import ks_2samp, chi2_contingency
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset
from evidently.metrics import ColumnDriftMetric, DatasetDriftMetric
from datetime import datetime, timedelta

# ... (see EXAMPLES.md for complete implementation)
```
Expected: Drift detection runs successfully, produces JSON report with per-feature statistics, and identifies drifted features.
On failure: Check for missing values (impute or drop), ensure the reference and current datasets have the same columns, and verify that data types match between them.
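To make the two statistics behind those thresholds concrete, here is a self-contained sketch: a hand-rolled PSI and SciPy's two-sample KS test. The `psi` helper, its bin count, and the synthetic data are illustrative, not the skill's implementation:

```python
# Minimal PSI/KS sketch; `psi` is a hypothetical helper, not the skill's code.
import numpy as np
from scipy.stats import ks_2samp

def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between two numerical samples."""
    # Bin edges from reference quantiles, so each bin holds ~1/bins of the reference.
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    # Clip current values into the reference range so none fall outside the bins.
    clipped = np.clip(current, edges[0], edges[-1])
    ref_pct = np.histogram(reference, edges)[0] / len(reference)
    cur_pct = np.histogram(clipped, edges)[0] / len(current)
    # A small floor keeps the log defined when a bin is empty.
    ref_pct = np.clip(ref_pct, 1e-6, None)
    cur_pct = np.clip(cur_pct, 1e-6, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

rng = np.random.default_rng(42)
reference = rng.normal(0.0, 1.0, 5000)
current = rng.normal(0.5, 1.0, 5000)  # mean-shifted to simulate drift

print(f"PSI: {psi(reference, current):.3f}")                     # > 0.2 suggests drift
print(f"KS p-value: {ks_2samp(reference, current).pvalue:.2e}")  # < 0.01 suggests drift
```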
Create visual HTML reports for human review and debugging.
```python
# monitoring/generate_reports.py
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset, TargetDriftPreset
from evidently.metrics import (
    ColumnDriftMetric,
    DatasetDriftMetric,
    DatasetMissingValuesMetric,
)

# ... (see EXAMPLES.md for complete implementation)
```
Expected: HTML reports generated in monitoring/reports/, viewable in browser with interactive charts showing distribution comparisons.
On failure: Verify write permissions to output directory, check that Evidently version is >= 0.4.0, ensure data frames have sufficient rows (>100 recommended).
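As a minimal illustration of this step, the Evidently 0.4.x `Report` API imported above can produce a standalone HTML file in a few lines; the synthetic data and the timestamped file name are only for the example:

```python
# Minimal HTML drift report; the data and file naming scheme are illustrative.
from datetime import datetime

import numpy as np
import pandas as pd
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

rng = np.random.default_rng(0)
reference = pd.DataFrame({"f1": rng.normal(0, 1, 500), "f2": rng.normal(5, 2, 500)})
current = pd.DataFrame({"f1": rng.normal(0.4, 1, 500), "f2": rng.normal(5, 2, 500)})

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)

# Timestamped name so successive runs do not overwrite each other.
out = f"monitoring/reports/drift_{datetime.now():%Y%m%d_%H%M%S}.html"
report.save_html(out)
print(f"Report written to {out}")
```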
Monitor prediction performance to detect concept drift (a change in the relationship between features and the target).
```python
# monitoring/concept_drift.py
import pandas as pd
import numpy as np
from sklearn.metrics import roc_auc_score, mean_squared_error, accuracy_score
from typing import Dict, List
import json

# ... (see EXAMPLES.md for complete implementation)
```
Expected: Performance monitoring detects when model accuracy/AUC drops below threshold, signaling potential concept drift.
On failure: Ensure ground truth labels are available (may require delayed validation batch job), verify prediction scores are properly calibrated (0-1 range for classification), check for label leakage in features.
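A bare-bones version of that check might compare the current AUC on freshly labeled production data against a baseline stored at deployment time; the baseline value, tolerance, and helper name below are hypothetical:

```python
# Sketch of performance-based concept drift detection; thresholds are hypothetical.
import numpy as np
from sklearn.metrics import roc_auc_score

BASELINE_AUC = 0.85        # hypothetical: AUC measured at deployment time
MAX_RELATIVE_DROP = 0.05   # flag concept drift when AUC falls >5% below baseline

def check_concept_drift(y_true: np.ndarray, y_score: np.ndarray) -> dict:
    """Compare current AUC on labeled production data against the baseline."""
    auc = roc_auc_score(y_true, y_score)
    drifted = auc < BASELINE_AUC * (1 - MAX_RELATIVE_DROP)
    return {"auc": auc, "baseline": BASELINE_AUC, "concept_drift": bool(drifted)}

# Synthetic example with deliberately noisy scores.
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 1000)
y_score = np.clip(y_true * 0.6 + rng.normal(0.2, 0.3, 1000), 0, 1)
print(check_concept_drift(y_true, y_score))
```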
Integrate drift detection with alerting systems (Slack, PagerDuty, email).
```python
# monitoring/alerting.py
import requests
import json
from typing import Dict, List
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# ... (see EXAMPLES.md for complete implementation)
```
Expected: Alerts sent to Slack/PagerDuty when drift detected, with severity based on drift share and critical feature involvement.
On failure: Test webhook URLs with curl first, verify PagerDuty integration key has correct permissions, check firewall rules for outbound HTTPS, implement retry logic for transient network failures.
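As a sketch of the Slack path (the simplest of the three), an alert can be a single POST to an incoming webhook, with the retry logic suggested above; the webhook URL is a placeholder and `send_drift_alert` is an illustrative helper, not the skill's API:

```python
# Minimal Slack webhook alert with retries; URL and helper name are placeholders.
import logging
import time

import requests

logger = logging.getLogger(__name__)
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # replace with yours

def send_drift_alert(drift_share: float, drifted_features: list[str]) -> bool:
    text = (
        f":warning: Data drift detected: {drift_share:.0%} of features drifted.\n"
        f"Drifted features: {', '.join(drifted_features)}"
    )
    # Simple exponential backoff for transient network failures.
    for attempt in range(3):
        try:
            resp = requests.post(SLACK_WEBHOOK_URL, json={"text": text}, timeout=10)
            resp.raise_for_status()
            return True
        except requests.RequestException as exc:
            logger.warning("Alert attempt %d failed: %s", attempt + 1, exc)
            time.sleep(2 ** attempt)
    return False
```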
Automate drift detection to run on schedule (daily or weekly).
```python
# monitoring/scheduler.py
import schedule
import time
import logging
from datetime import datetime, timedelta
import pandas as pd

logging.basicConfig(
    # ... (see EXAMPLES.md for complete implementation)
```
Alternatively, use cron:
```bash
# Add to crontab (crontab -e)
# Run daily at 2 AM
0 2 * * * cd /path/to/monitoring && /path/to/venv/bin/python scheduler.py >> logs/cron.log 2>&1
```
Or use an Airflow DAG:
```python
# airflow/dags/drift_monitoring_dag.py
from airflow import DAG
from airflow.operators.python import PythonOperator
from datetime import datetime, timedelta

default_args = {
    'owner': 'ml-team',
    'depends_on_past': False,
    # ... (see EXAMPLES.md for complete implementation)
```
Expected: Monitoring runs automatically on schedule, generates reports, sends alerts only when drift exceeds thresholds, logs all activity.
On failure: Check that the scheduler process is running (ps aux | grep scheduler), verify the cron service is active, ensure data sources are accessible, review logs for exceptions, and set up a dead man's switch alert in case the job silently stops running.
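For reference, a minimal in-process version of the `schedule`-based loop elided above could look like this; `run_drift_check` is a stand-in for your real pipeline entry point:

```python
# Minimal daily scheduler sketch; run_drift_check is a hypothetical entry point.
import logging
import time

import schedule

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def run_drift_check() -> None:
    logger.info("Starting drift check")
    # ... load current data, run the detector, send alerts (see steps above) ...
    logger.info("Drift check finished")

# Run every day at 2 AM, matching the cron example above.
schedule.every().day.at("02:00").do(run_drift_check)

while True:
    schedule.run_pending()
    time.sleep(60)
```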
Related skills:
- detect-anomalies-aiops - Time series anomaly detection for operational metrics
- deploy-ml-model-serving - Model deployment patterns and versioning
- setup-prometheus-monitoring - Infrastructure metrics collection
- review-data-analysis - Statistical analysis validation and peer review