Queries Prometheus using PromQL for HTTP request rates, latency percentiles, error rates, active connections, and GenAI token usage via curl on localhost:9090.
npx claudepluginhub opensearch-project/observability-stack --plugin observability
Slash command: /opensearch@observability:metrics
This skill provides PromQL query templates for querying metrics from Prometheus. All queries use the Prometheus HTTP API at `http://localhost:9090/api/v1/query`. No authentication is needed for local Prometheus.
Prometheus runs on port 9090 using HTTP (not HTTPS).
| Variable | Default | Description |
|---|---|---|
| `PROMETHEUS_ENDPOINT` | `http://localhost:9090` | Prometheus base URL |
Different OTel SDK versions and languages emit HTTP metrics under different names. Before querying, discover which metric names are active in your stack:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/label/__name__/values" | python3 -c "
import json, sys
# Print only the HTTP, GenAI, and database client metric names.
for m in json.load(sys.stdin).get('data', []):
    if any(k in m for k in ['http_server', 'gen_ai', 'db_client']):
        print(m)"
Common HTTP metric name variants:
| Metric Name | Unit | Emitted By |
|---|---|---|
| `http_server_duration_milliseconds` | milliseconds | Python OTel SDK (older semconv) |
| `http_server_duration_seconds` | seconds | .NET, Java OTel SDKs |
| `http_server_request_duration_seconds` | seconds | Stable HTTP semconv (newer SDKs) |
Important: Replace the metric name in the queries below with whichever variant is active in your stack. The query patterns (rate, histogram_quantile, etc.) are identical; only the metric name changes. For histogram bucket queries, replace `_seconds_bucket` with `_milliseconds_bucket` as appropriate, and adjust latency thresholds accordingly (e.g., `le="0.25"` for seconds vs `le="250"` for milliseconds).
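As a minimal sketch of the substitution described above, the following Python picks whichever HTTP duration variant appears in the discovered metric names and converts a threshold into that metric's unit. The function names and the preference order (newest naming first) are assumptions made for illustration, not part of the skill.

```python
# Known HTTP server duration histograms from the table above.
# The ordering (newest naming first) is an assumption for this sketch.
HTTP_DURATION_VARIANTS = [
    ("http_server_request_duration_seconds", "seconds"),
    ("http_server_duration_seconds", "seconds"),
    ("http_server_duration_milliseconds", "milliseconds"),
]


def pick_http_duration_metric(metric_names):
    """Return (base_name, unit) for the first known variant present, else None."""
    present = set(metric_names)
    for base, unit in HTTP_DURATION_VARIANTS:
        # Histograms are exposed as <base>_bucket, <base>_count and <base>_sum.
        if base + "_bucket" in present:
            return base, unit
    return None


def threshold_in_unit(seconds, unit):
    """Convert a latency threshold given in seconds to the metric's unit."""
    return seconds * 1000 if unit == "milliseconds" else seconds
```

The list passed to `pick_http_duration_metric` would be the `data` array returned by the discovery request; a result of `None` means none of the three variants is present.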
Calculate the per-second HTTP request rate over a 5-minute window, grouped by service:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
--data-urlencode 'query=sum(rate(http_server_duration_seconds_count[5m])) by (service_name)'
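The instant query endpoint returns JSON with a `status` field and, on success, a `data.result` list in which each series carries a `metric` label map and a `[timestamp, "value"]` pair. A rough sketch of turning that into a per-service dictionary might look like this; `vector_to_dict` is a hypothetical helper and the `checkout` service in the sample is invented for illustration.

```python
import json


def vector_to_dict(response_text, label="service_name"):
    """Map each series' label value to its float sample value."""
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise RuntimeError(body.get("error", "query failed"))
    result = {}
    for series in body["data"]["result"]:
        # Instant vectors carry one [timestamp, "value"] pair per series;
        # the value is a string and may be "NaN".
        _timestamp, value = series["value"]
        result[series["metric"].get(label, "")] = float(value)
    return result


# Illustrative response body; the service name is made up.
sample = (
    '{"status":"success","data":{"resultType":"vector","result":['
    '{"metric":{"service_name":"checkout"},"value":[1700000000,"12.5"]}]}}'
)
print(vector_to_dict(sample))  # {'checkout': 12.5}
```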
Calculate the 95th percentile HTTP request latency by service:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
--data-urlencode 'query=histogram_quantile(0.95, sum(rate(http_server_duration_seconds_bucket[5m])) by (le, service_name))'
Calculate the 99th percentile HTTP request latency by service:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
--data-urlencode 'query=histogram_quantile(0.99, sum(rate(http_server_duration_seconds_bucket[5m])) by (le, service_name))'
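Because the p95 and p99 queries differ only in the quantile, and the metric name depends on the active variant, the query string can be generated rather than copied. The sketch below is one way to do that; `quantile_query` and its parameters are hypothetical names, not part of the skill.

```python
def quantile_query(metric, quantile, window="5m", group_by=("service_name",)):
    """Build a histogram_quantile query for a histogram's base metric name."""
    # The le label must be kept in the aggregation for histogram_quantile to work.
    labels = ", ".join(("le",) + tuple(group_by))
    return (
        f"histogram_quantile({quantile}, "
        f"sum(rate({metric}_bucket[{window}])) by ({labels}))"
    )


print(quantile_query("http_server_duration_seconds", 0.99))
```

The returned string can be passed to curl as `--data-urlencode "query=..."`. The same helper covers the database and GenAI histograms by changing `metric` and `group_by`.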
Calculate the ratio of 5xx error responses to total requests by service:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
--data-urlencode 'query=sum(rate(http_server_duration_seconds_count{http_response_status_code=~"5.."}[5m])) by (service_name) / sum(rate(http_server_duration_seconds_count[5m])) by (service_name)'
Note on status code labels: The label name varies by OTel SDK version. Older semconv uses `http_status_code`; newer stable semconv uses `http_response_status_code`. Use the Metric Discovery section to check which label is present, or try the same query with the older label: `sum(rate(http_server_duration_seconds_count{http_status_code=~"5.."}[5m])) by (service_name)`
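Since both the metric name and the status code label can vary, a small builder keeps the error-ratio query consistent across stacks. This is a sketch under the assumption that the `_count` series carries the status label; `error_ratio_query` is a hypothetical helper.

```python
def error_ratio_query(metric, status_label="http_response_status_code", window="5m"):
    """Build the 5xx-to-total request ratio query, grouped by service_name."""
    # Doubled braces emit the literal { } of the PromQL label matcher.
    errors = (
        f'sum(rate({metric}_count{{{status_label}=~"5.."}}[{window}])) '
        f"by (service_name)"
    )
    total = f"sum(rate({metric}_count[{window}])) by (service_name)"
    return f"{errors} / {total}"


# Older semconv label on the seconds-based metric:
print(error_ratio_query("http_server_duration_seconds", "http_status_code"))
```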
Query the current number of active (in-flight) HTTP requests by service, as reported by `http_server_active_requests`:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
--data-urlencode 'query=sum(http_server_active_requests) by (service_name)'
Calculate the 95th percentile database operation latency by service:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
--data-urlencode 'query=histogram_quantile(0.95, sum(rate(db_client_operation_duration_seconds_bucket[5m])) by (le, service_name))'
Query GenAI token usage histograms grouped by operation name and request model:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
--data-urlencode 'query=sum(rate(gen_ai_client_token_usage_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model)'
Token usage p95 by operation and model:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
--data-urlencode 'query=histogram_quantile(0.95, sum(rate(gen_ai_client_token_usage_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model))'
Query GenAI operation duration histograms grouped by operation and model:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
--data-urlencode 'query=sum(rate(gen_ai_client_operation_duration_seconds_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model)'
Operation duration p95 by operation and model:
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
--data-urlencode 'query=histogram_quantile(0.95, sum(rate(gen_ai_client_operation_duration_seconds_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model))'
| Metric | Type | Labels |
|---|---|---|
| `http_server_duration_milliseconds` | histogram | service_name, http_response_status_code |
| `http_server_duration_seconds` | histogram | service_name, http_response_status_code |
| `http_server_request_duration_seconds` | histogram | service_name, http_response_status_code |
| `http_server_active_requests` | gauge | service_name |
| `db_client_operation_duration_seconds` | histogram | service_name |
| `gen_ai_client_token_usage` | histogram | gen_ai.operation.name, gen_ai.request.model |
| `gen_ai_client_operation_duration_seconds` | histogram | gen_ai.operation.name, gen_ai.request.model |
Note on Prometheus label names: Prometheus replaces dots in label names with underscores. The OTel attribute `gen_ai.operation.name` becomes the Prometheus label `gen_ai_operation_name` in PromQL queries. The table above shows the original OTel attribute names for reference.
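The renaming can be approximated in a few lines. This is a simplified sketch that replaces every character outside `[a-zA-Z0-9_]` with an underscore; real exporters may apply additional rules, and `otel_attr_to_prom_label` is a hypothetical helper name.

```python
import re


def otel_attr_to_prom_label(attribute):
    """Replace every character outside [a-zA-Z0-9_] with an underscore."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", attribute)


print(otel_attr_to_prom_label("gen_ai.operation.name"))  # gen_ai_operation_name
```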
PPL can also query metrics stored in OpenSearch when metrics are ingested via Data Prepper, as an alternative to PromQL. This is useful for OpenSearch-native workflows where you want to query metrics alongside traces and logs using a single query language. When Data Prepper is configured to ingest metrics into OpenSearch, you can use PPL source= queries against the metrics index just as you would for traces and logs.
To query metrics on Amazon Managed Service for Prometheus (AMP), replace the local endpoint and add AWS SigV4 authentication:
curl -s --aws-sigv4 "aws:amz:REGION:aps" \
--user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
'https://aps-workspaces.REGION.amazonaws.com/workspaces/WORKSPACE_ID/api/v1/query' \
--data-urlencode 'query=sum(rate(http_server_duration_seconds_count[5m])) by (service_name)'
Endpoint: `https://aps-workspaces.REGION.amazonaws.com/workspaces/WORKSPACE_ID/api/v1/query`
Authentication: `--aws-sigv4 "aws:amz:REGION:aps"` with `--user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY"`