Guides setup of Grafana Cloud APM with RED metrics from OTel traces, Frontend Observability via Faro JS/React SDK, and AI/LLM monitoring. Use for service maps, browser instrumentation, performance analysis, full-stack correlation.
npx claudepluginhub grafana/skills --plugin grafana-app-sdk
Grafana Cloud provides three tightly related application monitoring products:
- Application Observability - backend APM built on OpenTelemetry
- Frontend Observability - browser RUM via the Faro SDK
- AI Observability - generative AI / LLM monitoring
All three integrate with Grafana Tempo (traces), Loki (logs), and Pyroscope (profiles) for full-stack correlation.
Application Observability is a pre-built APM experience in Grafana Cloud built on top of OpenTelemetry. It generates RED (Rate, Error, Duration) metrics from distributed traces via span metrics, then surfaces them in:
Application Observability does NOT rely on traditional Prometheus scraping. Metrics come from span metrics - aggregations computed from OTel trace data:
- Generated by the `spanmetrics` connector in Alloy/OTel Collector

Key generated metric names:
- `traces_spanmetrics_calls_total`, `traces_spanmetrics_duration_seconds`
- `traces_span_metrics_calls_total`, `traces_span_metrics_duration_seconds`

These attributes MUST be present on all spans for Application Observability to work:
| Attribute | Grafana Label | Purpose |
|---|---|---|
| `service.name` | `service_name` / part of `job` | Identifies the service |
| `service.namespace` | part of `job` label | Groups services; `job` = `namespace/service.name` |
| `deployment.environment` | `deployment_environment` | Env filter (prod/dev/staging) |
The job label is constructed as:
- `service.namespace/service.name` when namespace is set
- `service.name` alone when no namespace

Additional recommended attributes:
- `service.version` - shown in service overview
- `k8s.cluster.name` - for K8s environments
- `k8s.namespace.name` - Kubernetes namespace
- `cloud.region` - for multi-region setups

export OTEL_SERVICE_NAME="my-api"
export OTEL_RESOURCE_ATTRIBUTES="service.namespace=myteam,deployment.environment=production,service.version=1.2.3"
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
export OTEL_EXPORTER_OTLP_PROTOCOL="grpc"
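The job-label rule above is simple enough to sketch in code. This helper (`job_label` is a hypothetical name, not part of any SDK) mirrors how Grafana Cloud derives `job` from the resource attributes:

```python
from typing import Optional


def job_label(service_name: str, service_namespace: Optional[str] = None) -> str:
    """Mirror how Grafana Cloud builds the job label from OTel resource attributes."""
    if service_namespace:
        # job = service.namespace/service.name when a namespace is set
        return f"{service_namespace}/{service_name}"
    # job = service.name alone when no namespace is present
    return service_name


print(job_label("my-api", "myteam"))  # myteam/my-api
print(job_label("my-api"))            # my-api
```

Matching this construction exactly matters: queries and dashboard filters keyed on `job` will silently miss a service whose namespace is set in one place but not the other.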
Alloy acts as a local OTel Collector and forwards data to Grafana Cloud:
// Receive traces, metrics, logs from instrumented apps
otelcol.receiver.otlp "default" {
grpc {
endpoint = "0.0.0.0:4317"
}
http {
endpoint = "0.0.0.0:4318"
}
output {
metrics = [otelcol.processor.resourcedetection.default.input]
logs = [otelcol.processor.resourcedetection.default.input]
traces = [otelcol.processor.resourcedetection.default.input]
}
}
// Auto-detect host/cloud metadata
otelcol.processor.resourcedetection "default" {
detectors = ["env", "system", "gcp", "aws", "azure"]
output {
metrics = [otelcol.processor.batch.default.input]
logs = [otelcol.processor.batch.default.input]
traces = [otelcol.processor.batch.default.input]
}
}
// Batch for efficiency
otelcol.processor.batch "default" {
output {
metrics = [otelcol.exporter.otlphttp.grafana_cloud.input]
logs = [otelcol.exporter.otlphttp.grafana_cloud.input]
traces = [otelcol.exporter.otlphttp.grafana_cloud.input]
}
}
// Auth
otelcol.auth.basic "grafana_cloud" {
username = env("GRAFANA_CLOUD_INSTANCE_ID")
password = env("GRAFANA_CLOUD_API_KEY")
}
// Export to Grafana Cloud OTLP endpoint
otelcol.exporter.otlphttp "grafana_cloud" {
client {
endpoint = env("GRAFANA_CLOUD_OTLP_ENDPOINT")
auth = otelcol.auth.basic.grafana_cloud.handler
}
}
Required environment variables for Alloy:
GRAFANA_CLOUD_OTLP_ENDPOINT=https://otlp-gateway-<region>.grafana.net/otlp
GRAFANA_CLOUD_INSTANCE_ID=<your-instance-id>
GRAFANA_CLOUD_API_KEY=<your-api-key>
The Service Map uses Tempo's metrics-generator to produce service graph metrics:
- Requires `span.kind` (CLIENT/SERVER) on spans for directional edges

Enable in Tempo (managed by Grafana Cloud automatically):
- The `service-graphs` metrics generator is enabled by default in Grafana Cloud Tempo
- Produces `traces_service_graph_request_total` and `traces_service_graph_request_failed_total` metrics

Application Observability provides one-click correlation:
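The generated metric names lend themselves to standard PromQL. A sketch of typical RED queries built in Python (`red_queries` is a hypothetical helper; the `status_code="STATUS_CODE_ERROR"` label value is an assumption from common span-metrics setups, so verify it against your stored series):

```python
def red_queries(service_name: str, service_namespace: str) -> dict:
    """Build RED-style PromQL queries over span metrics for one service."""
    job = f"{service_namespace}/{service_name}"
    return {
        # Rate: requests per second over 5 minutes
        "rate": f'sum(rate(traces_spanmetrics_calls_total{{job="{job}"}}[5m]))',
        # Errors: failed requests per second (status_code label value assumed)
        "errors": (
            f'sum(rate(traces_spanmetrics_calls_total'
            f'{{job="{job}", status_code="STATUS_CODE_ERROR"}}[5m]))'
        ),
        # Duration: p95 latency from the duration histogram buckets
        "p95": (
            "histogram_quantile(0.95, "
            f'sum(rate(traces_spanmetrics_duration_seconds_bucket{{job="{job}"}}[5m])) by (le))'
        ),
    }


for name, query in red_queries("my-api", "myteam").items():
    print(name, "->", query)
```

These are the same shapes Application Observability renders for you; having them as plain PromQL is mainly useful for custom dashboards and alert rules.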
- Correlated signals are linked by the `service.name` label

Grafana Faro is an open-source JavaScript/TypeScript SDK for Real User Monitoring (RUM). It instruments browser applications to capture:
Data flows: Faro SDK -> Grafana Alloy (faro receiver) OR Grafana Cloud OTLP endpoint -> Loki (logs) + Tempo (traces) + Mimir (metrics)
@grafana/faro-core # Core SDK - signals, transports, API
@grafana/faro-web-sdk # Web instrumentations + transports
@grafana/faro-web-tracing # OpenTelemetry-JS distributed tracing
@grafana/faro-react # React-specific integrations (error boundary, router)
npm install @grafana/faro-web-sdk
# or
yarn add @grafana/faro-web-sdk
import {
initializeFaro,
getWebInstrumentations,
} from '@grafana/faro-web-sdk';
const faro = initializeFaro({
url: 'https://faro-collector-prod-<region>.grafana.net/collect/<app-key>',
app: {
name: 'my-frontend-app',
version: '1.0.0',
environment: 'production',
},
instrumentations: [
...getWebInstrumentations({
captureConsole: true,
}),
],
});
// Manual API usage
faro.api.pushLog(['User clicked checkout button']);
faro.api.pushError(new Error('Payment failed'));
faro.api.pushEvent('button_click', { button: 'checkout' });
<script src="https://unpkg.com/@grafana/faro-web-sdk@latest/dist/library/faro-web-sdk.iife.js"></script>
<script>
const { initializeFaro, getWebInstrumentations } = GrafanaFaroWebSdk;
initializeFaro({
url: 'https://faro-collector-prod-<region>.grafana.net/collect/<app-key>',
app: { name: 'my-app', version: '1.0.0' },
instrumentations: [...getWebInstrumentations()],
});
</script>
npm install @grafana/faro-react @grafana/faro-web-tracing
import { initializeFaro, getWebInstrumentations } from '@grafana/faro-web-sdk';
import { TracingInstrumentation } from '@grafana/faro-web-tracing';
import {
createReactRouterV6DataOptions,
ReactIntegration,
withFaroRouterInstrumentation,
} from '@grafana/faro-react';
import { createBrowserRouter, matchRoutes, RouterProvider } from 'react-router-dom';
const faro = initializeFaro({
  url: 'https://faro-collector-prod-<region>.grafana.net/collect/<app-key>',
  app: {
    name: 'my-react-app',
    version: '1.0.0',
    environment: 'production',
  },
  instrumentations: [
    ...getWebInstrumentations({ captureConsole: true }),
    new TracingInstrumentation(),
    new ReactIntegration({
      router: createReactRouterV6DataOptions({ matchRoutes }),
    }),
],
});
const router = withFaroRouterInstrumentation(
createBrowserRouter([
{ path: '/', element: <Home /> },
{ path: '/about', element: <About /> },
])
);
function App() {
return <RouterProvider router={router} />;
}
initializeFaro({
url: '...',
app: { name: 'my-app' },
sessionTracking: {
enabled: true,
persistent: true,
maxSessionPersistenceTime: 4 * 60 * 60 * 1000, // 4 hours in ms
samplingRate: 1, // 1 = 100%, 0.5 = 50% of sessions
onSessionChange: (oldSession, newSession) => {
console.log('Session changed', newSession.id);
},
},
instrumentations: [...getWebInstrumentations()],
});
To wire the SDK to Grafana Cloud Frontend Observability:
- Copy the `url` value - this is your unique collector endpoint
- Paste it into the `initializeFaro({ url: '...' })` call

When using `getWebInstrumentations()`:
- Errors, web vitals, and session events are captured automatically
- Console logs are opt-in (`captureConsole: true`)

When `TracingInstrumentation` is included, Faro:
- Injects `traceparent` / `tracestate` headers into outgoing fetch/XHR requests

AI Observability monitors generative AI and LLM applications in production. Built on OTel GenAI semantic conventions and the OpenLIT instrumentation library.
Monitors:
| Metric | Description |
|---|---|
| `gen_ai_usage_input_tokens_total` | Total input/prompt tokens consumed |
| `gen_ai_usage_output_tokens_total` | Total output/completion tokens consumed |
| `gen_ai_usage_cost_USD_sum` | Total cost in USD |
| `gen_ai_client_operation_duration` | Latency per LLM call (histogram) |
| `gen_ai_client_token_usage` | Token usage histogram |
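The cost metric is the one most teams chart first. A hedged sketch of a cost-by-model PromQL builder (`cost_by_model_query` is a hypothetical helper; the grouping label `gen_ai_request_model` is an assumption, so confirm the exact label name against the series stored in your stack):

```python
def cost_by_model_query(window: str = "24h") -> str:
    """PromQL: total LLM spend in USD per model over a time window.

    The grouping label `gen_ai_request_model` is an assumption; inspect
    your stored series in Grafana Cloud to confirm the label name.
    """
    return f"sum by (gen_ai_request_model) (increase(gen_ai_usage_cost_USD_sum[{window}]))"


print(cost_by_model_query("7d"))
```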
Trace spans capture:
- Requested model (`gen_ai.request.model`)
- Provider (`gen_ai.system`: openai, anthropic, etc.)

pip install openlit openai anthropic cohere
import openlit
import openai
# One-line initialization - auto-instruments all supported LLM libraries
openlit.init()
# Optional parameters
openlit.init(
application_name="my-ai-app",
environment="production",
)
# Your existing code works unchanged - OpenLIT intercepts all LLM calls
client = openai.OpenAI()
response = client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "Hello!"}]
)
export OTEL_SERVICE_NAME="my-ai-app"
export OTEL_DEPLOYMENT_ENVIRONMENT="production"
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp-gateway-<region>.grafana.net/otlp"
# Base64 encode "instanceID:apiToken"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic <base64-encoded-instanceid:apitoken>"
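The Base64 step above can be done with a few lines of Python (`otlp_basic_auth` is a hypothetical helper name, shown only to make the header format concrete):

```python
import base64


def otlp_basic_auth(instance_id: str, api_token: str) -> str:
    """Encode Grafana Cloud credentials as an OTEL_EXPORTER_OTLP_HEADERS value."""
    encoded = base64.b64encode(f"{instance_id}:{api_token}".encode()).decode()
    return f"Authorization=Basic {encoded}"


print(otlp_basic_auth("123456", "glc_example_token"))
```

Equivalently, `echo -n "instanceID:apiToken" | base64` on the command line; note the `-n` so no trailing newline is encoded.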
To get the credentials: in the Grafana Cloud portal, find your stack's instance ID and create an API token, then Base64-encode `instanceID:apiToken` as shown above.
# Hallucination detection
evals = openlit.evals.Hallucination(
provider="openai",
api_key=os.getenv("OPENAI_API_KEY")
)
result = evals.measure(
prompt=user_message,
contexts=["Your knowledge base content here"],
text=llm_answer
)
# Content safety guard
guard = openlit.guard.All(
provider="openai",
api_key=os.getenv("OPENAI_API_KEY")
)
guard.detect(text=user_message)
Once metrics arrive, Grafana Cloud auto-populates five dashboards.

To get started: `pip install openlit` and call `openlit.init()` at app startup.

| Signal | Product | Storage | Query Language |
|---|---|---|---|
| Metrics (RED) | App Observability | Mimir | PromQL |
| Traces | Tempo | Tempo | TraceQL |
| Logs | Loki | Loki | LogQL |
| Profiles | Pyroscope | Pyroscope | - |
| Browser RUM | Faro/Frontend Obs | Loki + Tempo | - |
| LLM metrics | AI Observability | Mimir | PromQL |
Correlation keys:
- `service.name` / `service_name` links all signals for a service
- `traceID` in logs enables log-to-trace correlation
- `profileID` / time range enables trace-to-profile correlation
- `traceparent` headers link browser sessions to backend traces once `TracingInstrumentation` is configured
- The `gen_ai_usage_cost_USD_sum` metric surfaces cost by model/provider
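The log-to-trace key is easy to exercise locally. A minimal sketch, assuming JSON-formatted log lines that carry a `traceID` field as the correlation expects (`trace_id_from_log` is a hypothetical helper name):

```python
import json
from typing import Optional


def trace_id_from_log(line: str) -> Optional[str]:
    """Pull the traceID out of a JSON log line for log-to-trace correlation."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None  # not a structured log line; nothing to correlate
    return record.get("traceID")


print(trace_id_from_log('{"msg": "payment failed", "traceID": "0af7651916cd43dd8448eb211c80319c"}'))
```

If your logger emits a differently named field (`trace_id`, `TraceID`), configure a Loki derived field so Grafana can still build the trace link.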