From jeremylongshore-claude-code-plugins-plus-skills
Guides streaming inference setup for ML deployment with step-by-step instructions, production-ready code, configurations, and best practices for model serving, MLOps pipelines, monitoring, and optimization.
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin langchain-py-pack
This skill is limited to using the following tools:
This skill provides automated assistance for streaming inference setup tasks within the ML Deployment domain.
This skill activates automatically when you:
Example: Basic Usage
Request: "Help me with streaming inference setup"
Result: Provides step-by-step guidance and generates appropriate configurations
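To make "streaming inference" concrete, here is a minimal Python sketch of one common pattern: scoring records in small micro-batches and yielding each prediction as soon as it is available, rather than waiting for the whole input. It is not taken from the skill itself. The names `stream_predictions`, `predict_batch`, and `double` are hypothetical, and `double` stands in for a real model's batch predict call.

```python
from typing import Callable, Iterable, Iterator, List


def stream_predictions(
    records: Iterable[float],
    predict_batch: Callable[[List[float]], List[float]],
    batch_size: int = 2,
) -> Iterator[float]:
    """Yield predictions as soon as each micro-batch has been scored."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[float] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            # Score the full micro-batch and emit its results immediately.
            yield from predict_batch(batch)
            batch = []
    if batch:
        # Flush the final, partially filled micro-batch.
        yield from predict_batch(batch)


def double(batch: List[float]) -> List[float]:
    """Hypothetical stand-in for a real model's batch predict call."""
    return [2 * x for x in batch]


if __name__ == "__main__":
    for prediction in stream_predictions([1.0, 2.0, 3.0], double):
        print(prediction)
```

Because `stream_predictions` is a generator, it also accepts an unbounded input iterator. A smaller `batch_size` lowers per-record latency, and a larger one usually improves throughput.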
| Error | Cause | Solution |
|---|---|---|
| Configuration invalid | Missing required fields | Check documentation for required parameters |
| Tool not found | Dependency not installed | Install required tools per prerequisites |
| Permission denied | Insufficient access | Verify credentials and permissions |
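The three rows above can be checked before a deployment is attempted. The sketch below shows one possible set of preflight checks using only the Python standard library. The field names in `REQUIRED_FIELDS` and all function names are hypothetical, since the skill does not list its required parameters or tools, and the permission check covers only local file readability, not service credentials.

```python
import os
import shutil
from typing import Dict, Iterable, List

# Hypothetical required fields; substitute the ones your serving stack documents.
REQUIRED_FIELDS = ("model_path", "port")


def missing_fields(
    config: Dict[str, object], required: Iterable[str] = REQUIRED_FIELDS
) -> List[str]:
    """'Configuration invalid': return required keys absent from the config."""
    return [name for name in required if name not in config]


def missing_tools(tools: Iterable[str]) -> List[str]:
    """'Tool not found': return executables that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def unreadable_paths(paths: Iterable[str]) -> List[str]:
    """'Permission denied': return paths the current user cannot read."""
    return [path for path in paths if not os.access(path, os.R_OK)]


if __name__ == "__main__":
    config = {"model_path": "model.bin"}
    print("missing fields:", missing_fields(config))
    print("missing tools:", missing_tools(["python3"]))
    print("unreadable paths:", unreadable_paths([str(config["model_path"])]))
```

Each helper returns a list of problems instead of raising, so a caller can report every failing check in a single pass.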
Part of the ML Deployment skill category. Tags: mlops, serving, inference, monitoring, production