Skill

mlops

Guides MLOps workflows for ML model deployment: readiness checklists, serving infrastructure (FastAPI, SageMaker, Triton), inference optimization, versioning, A/B testing, drift detection, retraining, and monitoring.

Python

npx claudepluginhub arbazkhan971/godmode


Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/godmode:mlops

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

- `/godmode:mlops`, "deploy model", "model serving"

SKILL.md

150 lines · ~963 tokens

Similar Skills

deploy-ml-model-serving

Deploys ML models to production serving infrastructure using MLflow, BentoML, or Seldon Core with REST/gRPC endpoints. Implements autoscaling, monitoring, and A/B testing for real-time inference.

1 file · 6 tools

agent-almanac

ml-engineer

40.4k

Builds production ML systems using PyTorch, TensorFlow, and modern frameworks. Covers model serving, feature engineering, A/B testing, and monitoring.

antigravity-awesome-skills

Harness ML Ops

Audits ML pipeline reproducibility, experiment tracking hygiene, and model versioning. Advises on serving patterns and prompt evaluation across MLflow, W&B, SageMaker, Vertex AI.

1 file

harness-claude

Stats

Language: Shell

Stars: 18

Forks: 8

Maintenance: Excellent

Last Commit: Apr 25, 2026


Model: <name and version>
Source: EXP-<ID>
Checklist:
[ ] Evaluation complete (test metrics documented)
[ ] Bias/fairness check passed
[ ] Artifacts saved (weights, config, preprocessor)
[ ] Input/output schema documented
[ ] Latency benchmarked (< target p99 ms)
[ ] Size acceptable (< N MB)
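The "Latency benchmarked (< target p99 ms)" item can be checked with a small timing harness. The following is a minimal sketch, not part of the skill itself: `percentile` and `benchmark_p99_ms` are hypothetical helper names, the nearest-rank percentile definition is one of several in common use, and the run and warmup counts are arbitrary assumptions.

```python
import math
import time
from typing import Callable, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def benchmark_p99_ms(predict: Callable[[], object], runs: int = 200, warmup: int = 10) -> float:
    """Call `predict` repeatedly and return the p99 latency in milliseconds.

    `predict` is a placeholder for one inference call; the defaults for
    `runs` and `warmup` are assumptions, not values from the skill.
    """
    for _ in range(warmup):  # warm caches and lazy initialisation before measuring
        predict()
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        predict()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return percentile(latencies_ms, 99)
```

A readiness gate would then compare the returned value against the target, for example `benchmark_p99_ms(lambda: model.predict(sample)) < target_ms`, where `model`, `sample`, and `target_ms` are supplied by the caller.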

| Optimization | Latency | Size | Accuracy |
|---|---|---|---|
| Baseline FP32 | <ms> | <MB> | <val> |
| FP16 quant | <ms> | <MB> | <val> |
| INT8 quant | <ms> | <MB> | <val> |
| ONNX | <ms> | <MB> | <val> |
| Distillation | <ms> | <MB> | <val> |

Champion: v<N>
Challenger: v<N>
Split: <champion%>/<challenger%>
Routing: random|user-hash|feature-flag
Duration: <minimum days>
Sample size: <minimum per variant>
Success: primary metric >= <threshold> improvement
Guardrails: latency p99, error rate, business KPIs
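Of the routing options listed, `user-hash` is the one that keeps a given user on the same variant across requests. A minimal sketch of that idea follows; the function name `assign_variant`, the salt value, and the 100-bucket scheme are illustrative assumptions rather than the skill's own implementation.

```python
import hashlib


def assign_variant(user_id: str, challenger_pct: int, salt: str = "experiment-1") -> str:
    """Deterministically map a user to 'champion' or 'challenger'.

    The salt (a hypothetical experiment identifier) keeps assignments
    independent between experiments that share the same user IDs.
    """
    if not 0 <= challenger_pct <= 100:
        raise ValueError("challenger_pct must be between 0 and 100")
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 100  # stable bucket in [0, 100)
    return "challenger" if bucket < challenger_pct else "champion"
```

Because the bucket depends only on the salt and the user ID, no assignment table needs to be stored, and changing the salt reshuffles users for the next experiment.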

Feature drift (PSI):
  < 0.1: no drift
  0.1-0.2: moderate, monitor closely
  > 0.2: significant, trigger retraining
Performance:
  < 2% drop: normal variance
  2-5% drop: warning, schedule review
  > 5% drop: alert, trigger retraining
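The Population Stability Index (PSI) compares the binned distribution of a feature in production against its training baseline: PSI = Σ (aᵢ − eᵢ) · ln(aᵢ / eᵢ), where eᵢ and aᵢ are the baseline and production proportions in bin i. The sketch below is one common way to compute it; the equal-width binning over the baseline range, the bin count of 10, and the `eps` floor for empty bins are assumptions, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index of `actual` against the `expected` baseline."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when the baseline is constant

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values to edge bins
        # Floor at eps so empty bins do not produce log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def drift_level(score: float) -> str:
    """Map a PSI score to the thresholds listed above."""
    if score < 0.1:
        return "no drift"
    if score <= 0.2:
        return "moderate"
    return "significant"
```

Quantile bins derived from the baseline are a frequent alternative to equal-width bins when the feature is heavily skewed.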

Trigger: scheduled|drift-based|performance-based
Frequency: daily|weekly|monthly
Data window: last N days
Auto_deploy: false (requires A/B or human gate)
Cooldown: minimum time between retraining runs
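A drift-based or performance-based trigger combined with a cooldown can be expressed as a small gate function. This is a sketch under stated assumptions: `should_retrain` is a hypothetical name, the 7-day default cooldown is illustrative, and the 0.2 PSI and 5% performance-drop thresholds are taken from the drift thresholds listed earlier. It only decides whether to start a retraining run; deployment still goes through the A/B or human gate.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_retrain(
    psi_score: float,
    perf_drop_pct: float,
    last_run: Optional[datetime],
    now: datetime,
    cooldown: timedelta = timedelta(days=7),  # assumed default, not from the skill
) -> bool:
    """Return True when drift or a performance drop warrants a retraining run."""
    # Respect the cooldown first so repeated alerts do not stack up runs.
    if last_run is not None and now - last_run < cooldown:
        return False
    return psi_score > 0.2 or perf_drop_pct > 5.0
```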

Failure	Action
OOM during training	Resume checkpoint, reduce batch
Performance degrades	Check drift, trigger retrain
A/B no difference	Verify sample size, document null



Activate When

Workflow

1. Model Readiness

2. Serving Infrastructure

3. Inference Optimization

4. Model Versioning

5. A/B Testing

6. Drift Detection

7. Retraining

8. Monitoring Dashboard

Hard Rules

TSV Logging

Keep/Discard

Stop Conditions

Autonomous Operation

Error Recovery
