Plugins listed here are tagged for this technology stack and auto-indexed from public GitHub repositories.
Claude Code plugins tagged for Prometheus development. Browse commands, agents, skills, and more.
Diagnose performance bottlenecks, implement distributed tracing, and manage incident response with Prometheus, Grafana, OpenTelemetry, and Datadog. Define SLIs/SLOs, run blameless postmortems, and build production-ready observability pipelines for microservices and infrastructure.
Orchestrate multi-agent teams for complex AI-driven projects: decompose tasks, match capabilities, coordinate workflows, manage shared context and errors, distribute workloads, monitor performance with Prometheus and OpenTelemetry, and synthesize insights from interactions. Integrates PowerShell, .NET, and Azure ops via specialist subagents.
Generate alerting rules for Prometheus, Grafana, PagerDuty, and Datadog to monitor performance metrics like latency, errors, throughput, resources, availability, and SLO violations. Produces configs with thresholds and rationale, routing, escalation policies, runbooks, and testing steps.
Centralize performance metrics from apps, systems, databases, caches, and services into Prometheus, StatsD, or CloudWatch using unified naming. Generate instrumentation code, Prometheus configs, Grafana dashboards, retention policies, and alerts for comprehensive monitoring workflows.
Deploy full monitoring stacks like Prometheus, Grafana, or Datadog to Kubernetes or Docker environments, configuring exporters, scrape targets, alerting rules, and Grafana dashboards. Generate production-ready DevOps setup code and configurations tailored to your infrastructure requirements.
Use slash commands to set up performance monitoring with New Relic or Datadog APM in Node.js apps, including instrumentation and custom metrics, and deploy full observability stacks with Prometheus metrics, Jaeger or Zipkin tracing, ELK or Fluentd logging, alerting, and Grafana or Kibana dashboards.
Apply 28 prioritized best-practice rules for ClickHouse schema design, query optimization, and data ingestion, with companion skills for running ClickHouse SQL in Python, reviewing schemas and queries, writing Node.js client code, troubleshooting performance issues, and setting up local or cloud ClickHouse environments.
Use the gcx CLI to debug Grafana observability stacks: investigate alerts, SLO breaches, and synthetic check failures via Prometheus metrics and Loki logs; manage dashboards, SLOs, and resources with GitOps; scaffold Go projects; and automate setups and code generation for resources-as-code.
Organize knowledge spatially with memory palace techniques—build, navigate, and maintain virtual structures for enhanced recall. Capture PR review insights and manage digital garden health to preserve context and decisions across projects.
Optimize, scale, and maintain Qdrant vector search deployments across installation, embedding model migration, performance tuning, monitoring with Prometheus/Grafana, and zero-downtime upgrades.
Build and manage apps on the Grafana App Platform using the grafana-app-sdk — define CUE kind schemas, implement reconciliation logic, and configure admission webhooks for custom resources. Also covers Grafana Cloud configuration, monitoring, cost optimization, dashboards, alerting, observability pipelines, and performance testing.
Adopt OpenTelemetry observability across your stack: configure and deploy the Collector, instrument applications in multiple languages, write and debug OTTL transformations, and validate attribute conventions.
Conduct specialized code reviews on Go projects, auditing web server architecture, middleware, concurrency patterns, data persistence with PostgreSQL, BubbleTea TUIs, Wish SSH servers, Prometheus instrumentation, and testing practices to ensure idiomatic, secure, high-performance code.
Delegate observability implementation to expert agents that handle OpenTelemetry instrumentation for distributed tracing, structured logging pipelines with tools like Vector and Loki, Prometheus metrics and alerting, Grafana dashboards, SLO definitions, and incident response workflows for optimized system debugging.
Delegate SRE expertise to an agent for production incident response with triage, roles, and templates; generate Prometheus queries for golden signals, SLIs, alerting rules, and dashboards; define SLOs, error budgets, and capacity plans; implement JavaScript patterns like circuit breakers and retries for reliable distributed systems.
Manage Cloud SQL for PostgreSQL on GCP: provision instances, explore databases, audit health, monitor performance via PromQL, manage replication, optimize vector search, and tune configurations.
Run syncable CLI skills to analyze project tech stacks and monorepos, audit dependencies for CVEs/licenses/copyleft, scan code for secrets/vulnerabilities/insecure patterns, validate IaC (Dockerfiles/Compose/Terraform/K8s manifests), optimize K8s clusters for cost/resources, and execute secure deployments to GCP/Azure with audits.
Delegate SDLC workflows to specialist AI agents that architect cloud-native systems, design databases, conduct deep web research, optimize performance and observability, distill repo knowledge, and build production agents via orchestrated pipelines.
Manage the full lifecycle of AlloyDB for PostgreSQL databases on GCP: provision clusters and instances, create and manage IAM or built-in users with role grants, explore schemas and run SQL queries, monitor health and replication, and troubleshoot performance using Cloud Monitoring metrics.
Automate Rootly incident management in Claude: create/triage/resolve incidents, manage alerts/workflows/services/on-call schedules, generate blameless postmortems with AI analysis, track action items, and check service health/status.
Implement production observability and SRE reliability: configure dashboards, metrics, alerts, SLOs, tracing in Datadog, CloudWatch, Prometheus, Grafana; orchestrate incident response from triage to postmortems; audit logs for SOC2, GDPR compliance; leverage specialist agents for log analysis, performance optimization, and cost-effective monitoring.
Process and transform data using jq, SQL, or pandas; design ETL/ELT pipelines for batch or streaming; perform time series forecasting, anomaly detection, and analytics; architect streaming systems with Kafka; generate insights and visualizations via natural language commands and specialist agents.
Investigate observability stacks by querying traces, logs, and metrics in OpenSearch with PPL and in Prometheus with PromQL, correlating signals via OTel conventions from metric spikes to error logs, checking component health, and defining SLOs/SLIs.
Diagnose VictoriaMetrics performance issues by analyzing query execution traces for bottlenecks, cardinality bloat, and unused metrics, and by orchestrating investigations across metrics, logs, traces, and alerts in Kubernetes environments.
Automatically discover Grafana Cloud stacks via gcx, configure an analysis environment with a Python venv, analyze Prometheus metric DPM rates, and identify top cost drivers through per-series breakdowns in sorted tables.
Manage D&D 5e campaigns as a Dungeon Master by creating modules, NPCs, characters, and encounters; auditing plot continuity, encounter balance, and loot distribution; generating procedural Dungeondraft battle maps; mapping NPC networks; pressure-testing for exploits; searching monster and spell catalogs; running session prep checklists; and querying the local Mimir database.
Query the full VictoriaMetrics observability stack directly from your editor: run PromQL/MetricsQL metric queries on VictoriaMetrics, search and analyze logs with LogsQL in VictoriaLogs, discover and retrieve distributed traces via Jaeger API in VictoriaTraces, and manage AlertManager alerts and silences using curl-based bash skills.
Automate AIDLC operations on AWS: execute self-improving loops via continuous trace evaluation and PR proposals, run 4-stage canary deployments on Kubernetes with SLO gates and human approvals, handle incident response from alarms, enforce cost budgets with model recommendations, and log audits for compliance.
Scaffold Grafana v12.x plugin projects (panels, data sources, apps, backends) with @grafana/create-plugin and Docker hot-reload dev environments. Cover the full development lifecycle with the React/Go/TypeScript SDKs: build, test, sign, and publish. Query Prometheus/Loki billing metrics (active series, ingestion, storage, cardinality, costs) via the Grafana API.
Diagnose Kubernetes cluster health comprehensively with dynamic API discovery, run kubectl operations for debugging pods/services/deployments, and monitor operator-specific status for ArgoCD, Prometheus, Crossplane, and Cert-Manager using specialized agents.
Manage Cloud SQL for SQL Server instances on GCP: provision instances, create databases and users, clone environments, take backups, and monitor performance with PromQL queries for slow queries, CPU, and memory.
Orchestrate enterprise DevOps pipelines — enforce code quality across JS, Python, Go, Rust, and Java with test pyramid strategies; deploy to multi-cloud Kubernetes with blue-green strategies; generate OpenAPI specs, Mermaid diagrams, and ADRs; monitor performance via Prometheus, Grafana, and k6; coordinate review, docs, and API agents for automated team workflows.
Assume the Senior DevOps Engineer role to architect production infrastructure on AWS, GCP, and Azure using Docker, Kubernetes, and Terraform; design GitOps CI/CD pipelines with ArgoCD and GitHub Actions; configure Prometheus/Grafana monitoring stacks; implement Vault secrets management; conduct incident response with runbooks and postmortems; and optimize cloud costs through FinOps practices.
Architect end-to-end IoT systems from embedded firmware development and protocol selection (MQTT, CoAP) to edge computing on Docker/Kubernetes, device security with TLS/secure boot, time-series data pipelines using ClickHouse/Prometheus/Grafana, cloud integrations with AWS IoT Core/Azure IoT Hub/GCP, and digital twin modeling.
Analyze Kubernetes cluster resource efficiency across nodes, workloads, Karpenter provisioning, OOM events, and costs. Generate reports with utilization stats, detected issues, actionable recommendations, and historical comparisons using Prometheus metrics.
Deploy and manage OpenTelemetry Collector pipelines shipping to Coralogix, instrument applications with OTel SDKs, write and debug OTTL transformations, and resolve telemetry semantic issues across Kubernetes and cloud environments.
Act as an expert vmkteam Go developer handling the full SDLC for API services: scaffold projects with PostgreSQL repos and zenrpc, decompose and resolve YouTrack tasks end-to-end, perform multi-persona GitLab MR code reviews, automate CI/CD deploys to Nomad, monitor Prometheus/Sentry/Grafana/Loki metrics/logs/errors, investigate production incidents, generate RPC clients, and run Playwright browser automation.
Guide production investigations, SLO management, OpenTelemetry instrumentation, and Beeline migration for Honeycomb observability: query trace/event datasets, detect instrumentation gaps, and run multi-step debugging workflows.
Automate DevOps workflows by generating GitHub Actions CI/CD pipelines, Dockerfiles with multi-stage builds and security scans, docker-compose setups, and Kubernetes YAMLs for zero-downtime deployments using rolling, blue-green, or canary strategies with rollbacks and monitoring.