By grafana
Build Grafana App Platform apps using Go SDK with CUE schemas, reconcilers, admission webhooks, and Kubernetes operators; configure Grafana Cloud observability stacks including Alloy pipelines, Prometheus metrics, Loki logs, Tempo traces, Beyla eBPF instrumentation, cost optimization, synthetic testing, on-call IRM, and AI/ML integrations for full-stack monitoring and incident management.
npx claudepluginhub grafana/skills --plugin grafana-app-sdk
Use when the user asks to "write a validator", "add validation", "implement admission control", "write a mutating webhook", "add a mutation handler", "validate incoming resources", "implement admission logic", "add admission webhooks", "write ingress validation", or asks how to validate or mutate resources before they are persisted in a grafana-app-sdk app. Provides guidance on implementing validation and mutation admission handlers for grafana-app-sdk apps.
Use when the user asks to create a Grafana app, initialize a grafana-app-sdk project, set up a Grafana App Platform app, scaffold a new app, or asks about deployment modes (standalone operator, grafana/apps, frontend-only), how grafana-app-sdk works, or the overall development workflow. Provides foundational knowledge of the grafana-app-sdk CLI, project structure, deployment modes, and overall workflow.
Use when working with CUE kind definitions, schemas, or versioning in grafana-app-sdk projects (app platform apps). This skill should be used when the user asks to "define a kind", "add a CUE kind", "write a kind schema", "create a CUE schema", "model a resource", "add a new resource type", "edit kinds/", "what is a kind in grafana-app-sdk", "add a version to a kind", or asks about CUE kind structure, versioning, schema fields, validation constraints, or the codegen configuration section. Provides guidance on authoring CUE kind definitions for grafana-app-sdk projects.
Use when the user asks to "write a reconciler", "implement a reconciler", "add business logic", "handle resource changes", "process resource events", "implement the reconcile loop", "add async processing", "write a controller", "handle create/update/delete events", "use TypedReconciler", "use a Watcher", or asks how to respond to resource state changes in a grafana-app-sdk app. Provides guidance on implementing reconciler and watcher business logic for grafana-app-sdk apps.
Reduce Grafana Cloud Metrics costs by managing cardinality with Adaptive Metrics aggregation rules. Use when the user asks to reduce metrics costs, manage cardinality, create aggregation rules, apply label dropping, analyse unused metrics, understand Active Series, or optimise Prometheus storage. Triggers on phrases like "adaptive metrics", "reduce cardinality", "aggregation rules", "metrics cost", "too many series", "Active Series", "label dropping", "unused metrics", "cardinality reduction", or "metrics spend".
Grafana Cloud account management — organizations, stacks, RBAC, SSO/SAML/OAuth, service accounts, API keys, team management, billing, and cloud-level provisioning. Use when managing Grafana Cloud access, configuring SSO, setting up service accounts for CI/CD, assigning roles, managing multiple stacks or organizations, or provisioning cloud resources via API.
Grafana Cloud Application Observability (APM), Frontend Observability (RUM/Faro), and AI Observability. Covers RED metrics (Rate/Error/Duration), service maps, span metrics from traces, Faro JavaScript/React SDK for browser instrumentation, session replay, AI/LLM model monitoring, and integration with traces/logs/profiles for full-stack correlation. Use when setting up APM, configuring frontend monitoring, analyzing service performance, or monitoring AI/LLM applications.
Connect AI coding agents (Claude Code, Cursor, VS Code, OpenAI Codex) to Grafana Cloud via the Model Context Protocol (MCP) server. Use when the user asks to connect Claude Code to Grafana, set up MCP for Grafana, use Grafana tools in Cursor, query Grafana from an AI agent, configure the Grafana MCP server, or make AI agents interact with Grafana Cloud APIs. Triggers on phrases like "MCP server", "connect Claude Code to Grafana", "Grafana MCP", "AI agent Grafana", "Claude Grafana tools", "Cursor Grafana", or "agent observability".
Set up, configure, and troubleshoot Grafana Cloud integrations for AWS, Azure, and other cloud providers. Use when the user asks to connect AWS CloudWatch, set up Azure Monitor, configure Confluent Cloud observability, install a Grafana integration, set up hosted exporters, use AWS Firehose for CloudWatch logs, or troubleshoot a cloud integration. Triggers on phrases like "AWS CloudWatch", "Azure Monitor", "Confluent integration", "cloud integration", "hosted exporter", "AWS Firehose", "install integration", "cloud metrics", or "cloud logs".
Grafana Cloud cost management — usage monitoring, cost attribution by label, usage alerts, invoice management, and optimization strategies. Covers Adaptive Metrics (cardinality reduction), Adaptive Logs (log filtering), cost attribution labels, and the FOCUS-compliant billing application. Use when analyzing Grafana Cloud spending, setting up cost alerts, attributing costs to teams, reducing metric/log cardinality, or forecasting observability budgets.
Grafana Cloud Database Observability — query-level performance insights for MySQL and PostgreSQL. Covers setup with Grafana Alloy, query samples, visual explain plans, RED metrics, pg_stat_statements and Performance Schema integration, and correlation with application traces. Use when monitoring database performance, diagnosing slow queries, setting up database observability for MySQL or PostgreSQL (self-managed, RDS, Aurora, Azure, Cloud SQL), or correlating DB metrics with APM data.
Grafana Professional Services tool for identifying which Prometheus metrics drive high Data Points per Minute (DPM). Analyzes metric-level DPM with per-label breakdown to help optimize Grafana Cloud costs. Use when the user asks about DPM analysis, high-cardinality metrics, metric cost optimization, finding noisy metrics, or running dpm-finder against a Grafana Cloud Prometheus endpoint.
Install, configure, and manage Grafana Alloy collector fleets using Fleet Management and remote configuration pipelines. Use when the user asks to configure Alloy, manage collector pipelines, deploy remote configurations, troubleshoot collector health, work with OpAMP, set up pipeline matchers, or manage collector attributes. Triggers on phrases like "configure Alloy", "fleet management", "remote configuration", "collector pipeline", "OpAMP", "pipeline matcher", "collector attributes", "deploy pipeline", "collector is unhealthy", or "Alloy pipeline YAML".
Grafana Cloud infrastructure monitoring — Kubernetes monitoring, cloud provider integrations (AWS, Azure, GCP), host and container monitoring, infrastructure dashboards, and collector setup. Use when setting up Kubernetes monitoring, connecting cloud provider metrics, configuring node exporter or cAdvisor, setting up infrastructure dashboards, or using the k8s-monitoring Helm chart.
Grafana Cloud AI and ML features — Grafana Assistant (natural language queries, dashboard generation, incident investigations), Dynamic Alerting (ML forecasting and outlier detection), Sift (automated root cause analysis with 8 analysis types), Knowledge Graph (entity discovery and RCA Workbench), and the LLM Plugin (OpenAI/Anthropic/Azure integration). Use when setting up AI-powered alerting, using natural language to query metrics/logs, automating incident investigation, or integrating LLMs with Grafana panels and workflows.
Grafana OnCall and Incident Response Management (IRM) — alert routing, escalation chains, on-call schedules, Jinja2 routing templates, Slack/mobile notifications, integrations (Alertmanager, Grafana Alerting, webhooks, PagerDuty), and incident lifecycle management. Use when setting up on-call rotations, configuring escalation policies, routing alerts to the right team, declaring and managing incidents, integrating with Alertmanager or Grafana Alerting, or configuring Slack-based alert workflows.
Grafana Cloud private network connectivity — AWS PrivateLink, Azure Private Link, and GCP Private Service Connect. Send telemetry (metrics, logs, traces, profiles) to Grafana Cloud without traversing the public internet. Eliminates cloud egress costs, meets compliance requirements (PCI-DSS, HIPAA). Use when setting up secure private telemetry ingestion from AWS/Azure/GCP, reducing egress costs, or meeting data residency/compliance requirements.
Sending telemetry data to Grafana Cloud — metrics via Prometheus remote write or OTLP, logs via Loki push or Alloy, traces via OTLP to Tempo, profiles via Pyroscope. Covers Alloy-based pipelines, direct SDK/agent integrations, cloud integrations catalog, and credentials management. Use when connecting an application or infrastructure to Grafana Cloud, setting up data ingestion, configuring remote write, or choosing between ingestion methods.
Grafana Cloud testing capabilities — Synthetic Monitoring (probing URLs, DNS, TCP, ping from multiple regions), k6 Cloud (managed load testing with distributed execution), and Frontend Observability (Faro, real user monitoring). Use when setting up uptime checks, external probes, configuring k6 cloud runs, monitoring frontend performance, or testing APIs from multiple locations.
Grafana Alerting, Incident Response Management (IRM), and SLOs. Covers Grafana-managed and data source-managed alert rules, notification policies, contact points (Slack/PagerDuty/email/webhook), silences, muting, on-call scheduling, incident management workflows, and SLO configuration with burn-rate alerts. Use when configuring alerts, debugging notification routing, setting up on-call rotations, managing incidents, defining SLOs, or provisioning alerting via YAML/API.
Grafana Alloy OpenTelemetry collector and telemetry pipeline configuration. Covers the Alloy configuration language (blocks, attributes, expressions), components for collecting metrics/logs/traces/profiles, sending data to Grafana Cloud/Prometheus/Loki/Tempo, clustering, Fleet Management remote config, and building telemetry pipelines. Use when configuring Alloy, writing Alloy config files (.alloy), building data collection pipelines, setting up scraping, or troubleshooting Alloy deployments.
Grafana Beyla eBPF auto-instrumentation for application observability without code changes. Covers supported languages/runtimes, requirements, installation, configuration (discovery, eBPF settings, OTLP traces export, Prometheus metrics export), Kubernetes deployment, and integration with Grafana Cloud. Use when setting up zero-code instrumentation, configuring eBPF probes, deploying Beyla to Kubernetes, connecting to Tempo/Prometheus, or troubleshooting instrumentation issues.
Create, modify, and organise Grafana dashboards including panels, variables, transformations, and alerting. Use when the user asks to create a Grafana dashboard, add a panel, configure a time series or stat panel, add template variables, set up dashboard linking, use transformations, configure thresholds, build a dashboard for a service, or export dashboard JSON. Triggers on phrases like "create dashboard", "add panel", "time series panel", "Grafana dashboard JSON", "template variables", "dashboard variable", "panel transformation", "threshold", "stat panel", "table panel", "Grafana annotations", or "dashboard folder".
Grafana OSS core features — dashboards, panels, visualization types, data sources, template variables, alerting, annotations, provisioning, RBAC, service accounts, and configuration. Use when building dashboards, configuring data sources, setting up provisioning YAML, managing users and permissions, writing PromQL/LogQL/TraceQL in panels, or configuring Grafana server settings.
OpenTelemetry with Grafana stack. Covers OTel SDK instrumentation for Go/Java/Python/Node.js/.NET, OTLP protocol and endpoint configuration, sending telemetry to Grafana Cloud via OTLP endpoint, Grafana Alloy as OTel collector, sampling strategies, Kubernetes OTel Operator, and migration from other observability tools. Use when instrumenting apps with OTel, configuring OTLP endpoints, setting up collectors, or migrating to OpenTelemetry.
Write, validate, and optimise PromQL queries for Prometheus and Grafana Cloud Metrics. Use when the user asks to query metrics, write a PromQL expression, calculate rates, aggregate across labels, build histogram quantiles, create recording rules, debug query performance, or understand metric cardinality. Triggers on phrases like "PromQL", "Prometheus query", "write a metric query", "calculate rate", "histogram_quantile", "recording rule", "metric cardinality", "sum by", "rate vs irate", "absent()", or "query is slow".
Optimise Grafana app plugin bundle size using React.lazy, Suspense, and webpack code splitting. Use when the user asks to reduce plugin bundle size, optimise module.js, add code splitting, improve initial plugin load performance, split plugin chunks, lazy load plugin pages, or help implement lazy loading in a Grafana app plugin. Triggers on phrases like "optimise plugin bundle size", "module.js is too large", "plugin is slow to load", "code split the plugin", "reduce initial JS payload", or "help me with Suspense in my plugin".
Migrate a Grafana plugin to React 19 compatibility. Use when the user asks to update a plugin for React 19, prepare for React 19, fix React 19 compatibility, upgrade to React 19, migrate to React 19, bump grafanaDependency to 12.3.0, externalize jsx-runtime, or run react-detect. Triggers on phrases like "update plugin for React 19", "React 19 migration", "prepare for React 19", "plugin React 19 compat", "grafanaDependency 12.3.0", "JSX runtime externals", "react-detect", "SECRET_INTERNALS", "ReactCurrentOwner", or "ReactCurrentDispatcher".
Use when writing or reviewing k6 documentation across TypeScript types, user docs, and release notes.
k6 performance and load testing. Covers writing test scripts in JavaScript/TypeScript, all test types (load/stress/spike/soak/smoke/breakpoint), thresholds, checks, scenarios, executors, extensions, result analysis, k6 Cloud execution, and CI/CD integration. Use when writing k6 tests, debugging test failures, setting up load testing pipelines, choosing executors/scenarios, or interpreting k6 results.
Grafana Loki log aggregation and LogQL query language. Covers LogQL syntax (log queries, metric queries, label matchers, line filters, parsers: json/logfmt/pattern/regexp/unpack, label filters, line_format), Loki architecture, log ingestion via Alloy/Promtail/Fluent Bit, structured metadata, and Logs Drilldown. Use when writing LogQL queries, configuring Loki, troubleshooting log pipelines, or analyzing logs.
Grafana Mimir scalable long-term metrics storage. Covers architecture (distributor/ingester/compactor/querier/query-frontend/store-gateway/ruler), deployment modes (monolithic/microservices), configuration, Prometheus remote write, PromQL querying, multi-tenancy, compaction, and operations. Use when working with Mimir for metrics storage, scaling Prometheus, configuring Mimir clusters, writing PromQL, or debugging Mimir.
Prometheus and Grafana Cloud Metrics overview including PromQL query language, Metrics Drilldown, alerting, recording rules, and integration patterns. Use when working with Prometheus, writing PromQL queries, configuring alerting, or discussing metrics architecture and best practices.
Grafana Pyroscope continuous profiling platform. Covers instrumentation of Go/Java/Python/Ruby/Node.js/.NET/Rust apps via SDKs or eBPF (Alloy), flame graph analysis, ProfileQL queries, server configuration and architecture, Grafana Cloud Profiles integration, and trace-profile linking (Span Profiles). Use when working with profiling data, instrumenting apps for Pyroscope, analyzing performance profiles, or deploying Pyroscope server.
Grafana Tempo distributed tracing backend. Covers TraceQL query language (span selectors, attribute scopes, pipeline operators, structural operators, metrics functions), trace ingestion via OTLP/Jaeger/Zipkin, Tempo architecture (distributor/ingester/compactor/querier/metrics-generator), full configuration reference with YAML, metrics-from-traces (span metrics, service graphs, TraceQL metrics), deployment modes (monolithic/microservices/Helm/Kubernetes), multi-tenancy, performance tuning, caching, and HTTP API. Use when working with distributed traces, writing TraceQL queries, deploying Tempo, configuring trace pipelines, or setting up Grafana-Tempo integrations (traces-to-logs, traces-to-metrics, traces-to-profiles).
Build Grafana plugin pages using the @grafana/scenes framework. Use this skill when creating new scene pages, adding panels/visualizations, setting up drilldown navigation, defining variables, configuring query runners, building table/timeseries/stat panels, or extending SceneObjectBase for custom scene objects. Triggers on any work involving SceneApp, SceneAppPage, EmbeddedScene, SceneQueryRunner, SceneDataTransformer, PanelBuilders, SceneFlexLayout, QueryVariable, or drilldown/tab configuration in Grafana plugins.
Debug, explore, and instrument with Grafana using the gcx CLI
Deploy monitoring stacks (Prometheus, Grafana, Datadog)
Editorial "Observability & Monitoring" bundle for Claude Code from Antigravity Awesome Skills.
Monitoring and alerting configuration with dashboard generation
Query and investigate traces, logs, and metrics from an OpenSearch-based observability stack using PPL and PromQL
Analyze Prometheus metric DPM rates with per-series breakdown to identify cost drivers in Grafana Cloud. Supports gcx-based stack discovery and automatic environment setup.