Queries Prometheus using PromQL for HTTP request rates, latency percentiles, error rates, active connections, and GenAI token usage via curl on localhost:9090.
This skill provides PromQL query templates for querying metrics from Prometheus. All queries use the Prometheus HTTP API at `http://localhost:9090/api/v1/query`. No authentication is needed for local Prometheus.
Prometheus runs on port 9090 using HTTP (not HTTPS).
| Variable | Default | Description |
|---|---|---|
| `PROMETHEUS_ENDPOINT` | `http://localhost:9090` | Prometheus base URL |
Different OTel SDK versions and languages emit HTTP metrics under different names. Before querying, discover which metric names are active in your stack:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/label/__name__/values" | python3 -c "
import json, sys
for m in json.load(sys.stdin).get('data', []):
    if any(k in m for k in ['http_server', 'gen_ai', 'db_client']):
        print(m)"
```
Common HTTP metric name variants:
| Metric Name | Unit | Emitted By |
|---|---|---|
| `http_server_duration_milliseconds` | milliseconds | Python OTel SDK (older semconv) |
| `http_server_duration_seconds` | seconds | .NET, Java OTel SDKs |
| `http_server_request_duration_seconds` | seconds | Stable HTTP semconv (newer SDKs) |
Important: Replace the metric name in the queries below with whichever variant is active in your stack. The query patterns (`rate`, `histogram_quantile`, etc.) are identical; only the metric name changes. For histogram bucket queries, replace `_seconds_bucket` with `_milliseconds_bucket` as appropriate, and adjust latency thresholds accordingly (e.g., `le="0.25"` for seconds vs `le="250"` for milliseconds).
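The name-and-threshold substitution can be mechanized. A minimal sketch, assuming the unit mapping in the variant table above; the helper name and structure are illustrative, not part of any SDK:

```python
# Hypothetical helper: build a bucket-rate query for whichever histogram
# variant is active, scaling the le threshold to match the metric's unit.
VARIANTS = {
    "http_server_duration_milliseconds": 1000.0,  # le values are in ms
    "http_server_duration_seconds": 1.0,          # le values are in s
    "http_server_request_duration_seconds": 1.0,
}

def bucket_query(metric: str, threshold_seconds: float, window: str = "5m") -> str:
    """Return a rate query over the bucket at threshold_seconds."""
    le = threshold_seconds * VARIANTS[metric]
    # Format whole numbers without a decimal point (le="250", not le="250.0")
    le_str = str(int(le)) if le == int(le) else str(le)
    return (f'sum(rate({metric}_bucket{{le="{le_str}"}}[{window}])) '
            f"by (service_name)")

print(bucket_query("http_server_duration_seconds", 0.25))
print(bucket_query("http_server_duration_milliseconds", 0.25))
```

The same threshold (250 ms) yields `le="0.25"` for the seconds variant and `le="250"` for the milliseconds variant.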
Calculate the per-second HTTP request rate over a 5-minute window, grouped by service:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=sum(rate(http_server_duration_seconds_count[5m])) by (service_name)'
```
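A successful instant query returns a JSON vector. A minimal parsing sketch; the sample payload below is illustrative, with made-up service names and values:

```python
import json

# Shape of a successful /api/v1/query response (sample values are made up).
sample = json.loads("""
{"status":"success","data":{"resultType":"vector","result":[
  {"metric":{"service_name":"checkout"},"value":[1700000000.0,"12.4"]},
  {"metric":{"service_name":"payments"},"value":[1700000000.0,"3.1"]}
]}}
""")

def rates_by_service(body: dict) -> dict:
    """Map service_name -> numeric value from an instant vector result."""
    if body.get("status") != "success":
        raise RuntimeError(f"query failed: {body}")
    return {
        r["metric"].get("service_name", "<unknown>"): float(r["value"][1])
        for r in body["data"]["result"]
    }

print(rates_by_service(sample))  # {'checkout': 12.4, 'payments': 3.1}
```

Note that each sample value arrives as a `[timestamp, "string"]` pair, so the numeric part must be parsed from a string.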
Calculate the 95th percentile HTTP request latency by service:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=histogram_quantile(0.95, sum(rate(http_server_duration_seconds_bucket[5m])) by (le, service_name))'
```
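`histogram_quantile` estimates the percentile by finding the cumulative bucket containing the target rank and interpolating linearly inside it. A simplified sketch of that calculation, with made-up bucket counts:

```python
def quantile_from_buckets(q, buckets):
    """Estimate a quantile from cumulative (le, count) histogram buckets,
    interpolating linearly inside the target bucket, in the spirit of
    PromQL's histogram_quantile (edge cases simplified)."""
    buckets = sorted(buckets, key=lambda b: b[0])
    total = buckets[-1][1]
    rank = q * total
    prev_le, prev_count = 0.0, 0.0
    for le, count in buckets:
        if count >= rank:
            if le == float("inf"):
                return prev_le  # quantile falls in the +Inf bucket
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count
    return prev_le

# Cumulative buckets: 50 requests took <= 0.1s, 90 <= 0.25s, all 100 <= 0.5s.
p95 = quantile_from_buckets(
    0.95, [(0.1, 50), (0.25, 90), (0.5, 100), (float("inf"), 100)]
)
print(p95)  # approximately 0.375
```

Because of the interpolation, the result is an estimate whose precision depends on bucket boundaries, not an exact observed latency.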
Calculate the 99th percentile HTTP request latency by service:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=histogram_quantile(0.99, sum(rate(http_server_duration_seconds_bucket[5m])) by (le, service_name))'
```
Calculate the ratio of 5xx error responses to total requests by service:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=sum(rate(http_server_duration_seconds_count{http_response_status_code=~"5.."}[5m])) by (service_name) / sum(rate(http_server_duration_seconds_count[5m])) by (service_name)'
```
Note on status code labels: The label name varies by OTel SDK version. Older semconv uses `http_status_code`; newer stable semconv uses `http_response_status_code`. Use the Metric Discovery section to check which label is present, or query both: `sum(rate(http_server_duration_seconds_count{http_status_code=~"5.."}[5m])) by (service_name)`
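One way to automate the label check is to inspect the label names on an active series (e.g., from `GET /api/v1/series?match[]=http_server_duration_seconds_count`) and build the query from whichever candidate is present. A sketch; the sample series below is illustrative:

```python
# Sample series labels, as returned in the "data" array of /api/v1/series
# (values here are made up).
sample_series = {
    "__name__": "http_server_duration_seconds_count",
    "service_name": "checkout",
    "http_response_status_code": "200",
}

def status_label(series: dict) -> str:
    """Return whichever OTel status-code label this stack emits."""
    for candidate in ("http_response_status_code", "http_status_code"):
        if candidate in series:
            return candidate
    raise KeyError("no known status-code label on series")

label = status_label(sample_series)
query = (f'sum(rate(http_server_duration_seconds_count{{{label}=~"5.."}}[5m])) '
         "by (service_name)")
print(query)
```

This keeps the query template fixed and swaps in only the label name, mirroring the metric-name substitution described earlier.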
Query the current number of active HTTP connections by service:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=sum(http_server_active_requests) by (service_name)'
```
Calculate the 95th percentile database operation latency by service:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=histogram_quantile(0.95, sum(rate(db_client_operation_duration_seconds_bucket[5m])) by (le, service_name))'
```
Query GenAI token usage histograms grouped by operation name and request model:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=sum(rate(gen_ai_client_token_usage_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model)'
```
Token usage p95 by operation and model:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=histogram_quantile(0.95, sum(rate(gen_ai_client_token_usage_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model))'
```
Query GenAI operation duration histograms grouped by operation and model:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=sum(rate(gen_ai_client_operation_duration_seconds_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model)'
```
Operation duration p95 by operation and model:
```shell
curl -s "$PROMETHEUS_ENDPOINT/api/v1/query" \
  --data-urlencode 'query=histogram_quantile(0.95, sum(rate(gen_ai_client_operation_duration_seconds_bucket[5m])) by (le, gen_ai_operation_name, gen_ai_request_model))'
```
| Metric | Type | Labels |
|---|---|---|
| `http_server_duration_milliseconds` | histogram | `service_name`, `http_response_status_code` |
| `http_server_duration_seconds` | histogram | `service_name`, `http_response_status_code` |
| `http_server_request_duration_seconds` | histogram | `service_name`, `http_response_status_code` |
| `http_server_active_requests` | gauge | `service_name` |
| `db_client_operation_duration_seconds` | histogram | `service_name` |
| `gen_ai_client_token_usage` | histogram | `gen_ai.operation.name`, `gen_ai.request.model` |
| `gen_ai_client_operation_duration_seconds` | histogram | `gen_ai.operation.name`, `gen_ai.request.model` |
Note on Prometheus label names: Prometheus replaces dots in label names with underscores. The OTel attribute `gen_ai.operation.name` becomes the Prometheus label `gen_ai_operation_name` in PromQL queries. The table above shows the original OTel attribute names for reference.
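The dot-to-underscore rule applies to any OTel attribute. A one-line sketch of the sanitization (replacing every character outside `[a-zA-Z0-9_]` with an underscore, which covers the dotted names in the table above):

```python
import re

def prom_label(otel_attr: str) -> str:
    """Map an OTel attribute name to a Prometheus-safe label name:
    characters outside [a-zA-Z0-9_] become underscores."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", otel_attr)

print(prom_label("gen_ai.operation.name"))  # gen_ai_operation_name
print(prom_label("gen_ai.request.model"))   # gen_ai_request_model
```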
As an alternative to PromQL, PPL can also query metrics stored in OpenSearch when they are ingested via Data Prepper. This is useful for OpenSearch-native workflows where you want to query metrics alongside traces and logs using a single query language. When Data Prepper is configured to ingest metrics into OpenSearch, use PPL `source=` queries against the metrics index just as you would for traces and logs.
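For example, assuming Data Prepper writes metric documents to an index matching `metrics-otel-v1-*` (the actual index name comes from your Data Prepper sink configuration, and the field names below are illustrative of how metric documents may be mapped), a PPL query might look like:

```
source=metrics-otel-v1-* | where name = 'http_server_duration_seconds' | stats avg(value) by serviceName
```

Check your index mapping first; the metric name field, value field, and service attribute field all depend on the Data Prepper pipeline configuration.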
To query metrics on Amazon Managed Service for Prometheus (AMP), replace the local endpoint and add AWS SigV4 authentication:
```shell
curl -s --aws-sigv4 "aws:amz:REGION:aps" \
  --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  'https://aps-workspaces.REGION.amazonaws.com/workspaces/WORKSPACE_ID/api/v1/query' \
  --data-urlencode 'query=sum(rate(http_server_duration_seconds_count[5m])) by (service_name)'
```