References PPL syntax, commands (source, where, fields), functions, API endpoints, and bash/curl examples for OpenSearch observability queries on OTel traces and logs.
This is a comprehensive reference for the Piped Processing Language (PPL) used by OpenSearch. PPL queries follow a pipe-delimited syntax starting with `source=<index>` and chaining commands with `|`. This reference covers all commands, functions, API endpoints, and usage patterns needed to construct observability queries against trace and log indices.
Grammar sourced from the opensearch-project/sql repository's docs/user/ppl/ directory:
https://github.com/opensearch-project/sql
| Variable | Default | Description |
|---|---|---|
| OPENSEARCH_ENDPOINT | https://localhost:9200 | OpenSearch base URL |
| OPENSEARCH_USER | admin | OpenSearch username |
| OPENSEARCH_PASSWORD | My_password_123!@# | OpenSearch password |
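These can be exported once per shell session so the curl examples below work verbatim. The values here are the illustrative defaults from the table; substitute your own cluster's credentials:

```shell
# Illustrative defaults from the table above; replace with your cluster's values.
export OPENSEARCH_ENDPOINT="https://localhost:9200"
export OPENSEARCH_USER="admin"
export OPENSEARCH_PASSWORD="My_password_123!@#"
```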
Field names containing dots must be enclosed in backticks to avoid parsing errors:
`attributes.gen_ai.operation.name`
`attributes.gen_ai.usage.input_tokens`
`status.code`
`events.attributes.exception.type`
`@timestamp`
This is critical for OTel attribute fields which use dotted naming conventions.
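When scripting, keep the query in a single-quoted shell variable: single quotes stop the shell from treating the PPL backticks as command substitution. A minimal sketch, assuming the OTel span fields listed above:

```shell
# Single quotes keep the PPL backticks away from shell command substitution.
q='source=otel-v1-apm-span-* | where `status.code` = 2 | fields `attributes.gen_ai.operation.name`, `status.code` | head 5'
printf '%s\n' "$q"
```

The variable can then be embedded in the request body of the curl pattern shown below (variable expansion does not re-evaluate the backticks).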
Execute a PPL query:
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() by serviceName"}'
Request body: {"query": "<ppl_query>"}
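A small wrapper makes repeated queries less error-prone. This is a sketch: the `ppl` function name is ours, and it assumes `jq` is installed and the environment variables above are set. The response body has the shape `{"schema": [...], "datarows": [...], "total": N, "size": N}`, so `.datarows` holds the result rows:

```shell
# Sketch of a reusable wrapper (the name "ppl" is ours; assumes jq is installed).
ppl() {
  # jq -n --arg escapes quotes and backslashes in the query for the JSON body.
  body=$(jq -n --arg q "$1" '{query: $q}')
  printf '%s\n' "$body"
  # Uncomment to execute against the cluster and print only the result rows:
  # curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
  #   -X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
  #   -H 'Content-Type: application/json' -d "$body" | jq '.datarows'
}
ppl 'source=otel-v1-apm-span-* | stats count() by serviceName'
```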
Retrieve the query execution plan (useful for debugging and profiling):
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl/_explain" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | stats count() by serviceName"}'
Start a query by specifying the data source index pattern.
Syntax: search source=<index-pattern>; the search keyword is optional, so source=<index-pattern> alone is equivalent.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | head 10"}'
Filter results based on a condition.
Syntax: where <condition>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | head 10"}'
Supports: =, !=, <, >, <=, >=, AND, OR, NOT, LIKE, IN, BETWEEN, IS NULL, IS NOT NULL.
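These operators compose within a single where clause. An illustrative sketch (the service names are hypothetical; field names follow the OTel span schema used above):

```shell
# Compound filter combining comparison, logical, range, and set operators.
q=$(cat <<'EOF'
source=otel-v1-apm-span-* | where `status.code` = 2 AND durationInNanos BETWEEN 1000000 AND 1000000000 AND serviceName IN ('frontend', 'checkout') | head 10
EOF
)
printf '%s\n' "$q"
```

Send it through the same curl pattern as the other examples.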
Select specific fields to return.
Syntax: fields [+|-] <field-list>
Use + to include or - to exclude fields.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | fields traceId, spanId, serviceName, durationInNanos | head 10"}'
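The `-` form removes fields instead of selecting them, which is handy when a mapping has a few bulky fields. A sketch (the excluded field names assume the OTel span mapping):

```shell
# Exclude bulky nested fields rather than enumerating every field to keep.
q='source=otel-v1-apm-span-* | fields - events, links | head 10'
printf '%s\n' "$q"
```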
Aggregate data using statistical functions.
Syntax: stats <aggregation>... [by <field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as span_count, avg(durationInNanos) as avg_duration by serviceName"}'
Supports: count(), sum(), avg(), max(), min(), var_samp(), var_pop(), stddev_samp(), stddev_pop(), distinct_count(), percentile(), earliest(), latest(), list(), values(), first(), last().
Order results by one or more fields.
Syntax: sort [+|-] <field> [, ...]
Use + for ascending (default), - for descending.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | sort - durationInNanos | head 10"}'
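Sort keys can be combined, each with its own direction. A sketch using the span fields from the examples above:

```shell
# Ascending service name, then longest spans first within each service.
q='source=otel-v1-apm-span-* | sort + serviceName, - durationInNanos | fields serviceName, durationInNanos | head 20'
printf '%s\n' "$q"
```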
Limit the number of results returned.
Syntax: head [N] (default N=10)
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | head 5"}'
Compute new fields from expressions.
Syntax: eval <new-field> = <expression> [, ...]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = durationInNanos / 1000000 | fields traceId, serviceName, duration_ms | sort - duration_ms | head 10"}'
Remove duplicate results based on field values.
Syntax: dedup [N] <field-list> [keepempty=<bool>] [consecutive=<bool>]
Caveat: `dedup` may throw a ClassCastException on fields with mixed types (e.g., a field that contains both strings and numbers across documents). Ensure the dedup field has a consistent type.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | dedup serviceName | fields serviceName"}'
Rename one or more fields.
Syntax: rename <old-field> AS <new-field> [, ...]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | rename serviceName as service, durationInNanos as duration | fields traceId, service, duration | head 10"}'
Find the most frequent values for a field.
Syntax: top [N] <field> [by <group-field>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | top 5 serviceName"}'
Find the least frequent values for a field.
Syntax: rare <field> [by <group-field>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | rare `attributes.gen_ai.operation.name`"}'
Display results in tabular format (alias for fields in some contexts).
Syntax: table <field-list>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | table traceId, spanId, serviceName, name | head 10"}'
Aggregate data into time buckets for time-series visualization.
Syntax: timechart span=<interval> <aggregation>... [by <field>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | timechart span=5m count() as span_count by serviceName"}'
Rate functions for timechart: per_second(), per_minute(), per_hour(), per_day() — compute rate of an aggregation per time unit.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | timechart span=1h per_minute(count()) as spans_per_min by serviceName"}'
General charting command for aggregation over arbitrary fields.
Syntax: chart <aggregation>... by <field> [span(<field>, <interval>)]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | chart avg(durationInNanos) by serviceName"}'
Bucket numeric or date values into intervals.
Syntax: eval <new-field> = bin(<field>, <interval>) or used within stats ... by span(<field>, <interval>)
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() by span(durationInNanos, 1000000000)"}'
Calculate moving averages over sorted data.
Syntax: trendline [sort <field>] sma(<period>, <field>) [as <alias>]
SMA = Simple Moving Average.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | trendline sort startTime sma(10, durationInNanos) as avg_duration | fields startTime, durationInNanos, avg_duration | head 50"}'
Compute running (cumulative) statistics over ordered results.
Syntax: streamstats <aggregation>... [by <field>]
Caveat: `streamstats` processes all matching rows in memory. On large indices, this will fail with "insufficient resources" errors. Always add `| head N` before `streamstats` to limit data volume.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | sort startTime | streamstats count() as running_count, sum(`attributes.gen_ai.usage.input_tokens`) as cumulative_tokens | fields startTime, running_count, cumulative_tokens | head 50"}'
Add aggregation results as new fields to each row without collapsing rows (unlike stats).
Syntax: eventstats <aggregation>... [by <field>]
Caveat: `eventstats` processes all matching rows in memory. On large indices, this will fail with "insufficient resources" errors. Always add `| head N` before `eventstats` to limit data volume.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eventstats avg(durationInNanos) as avg_svc_duration by serviceName | eval deviation = durationInNanos - avg_svc_duration | fields traceId, serviceName, durationInNanos, avg_svc_duration, deviation | sort - deviation | head 20"}'
Extract fields from text using a regular expression with named groups.
Syntax: parse <field> '<regex-with-named-groups>'
Caveat: `parse` may silently drop extracted fields on some OpenSearch versions. If extracted fields are missing from results, use `grok` or `rex` as more reliable alternatives.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | parse body '\''(?P<level>\\w+): (?P<msg>.+)'\'' | fields level, msg | head 10"}'
Extract fields using Grok patterns (predefined regex patterns).
Syntax: grok <field> '<grok-pattern>'
Caveat: `grok` processes all matching rows in memory. On large indices, this will fail with "insufficient resources" errors. Always add `| head N` before `grok` to limit data volume.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | grok body '\''%{LOGLEVEL:level} %{GREEDYDATA:message}'\'' | fields level, message | head 10"}'
Extract fields using named capture groups (similar to parse but with Splunk-compatible syntax).
Syntax: rex field=<field> '<regex-with-named-groups>'
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | rex field=body '\''(?<statuscode>\\d{3})'\'' | fields statuscode, body | head 10"}'
Filter results using a regular expression match on a field.
Syntax: <field> = regex '<pattern>' (used within where). The example below uses LIKE wildcard matching as a broadly supported alternative.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | where body like '\''%error%'\'' | fields traceId, body, severityText | head 10"}'
Auto-discover log patterns by clustering similar log messages.
Syntax: patterns <field> [pattern='<regex>'] [new_field=<name>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | patterns body | fields body, patterns_field | head 20"}'
Extract values from structured data (JSON, XML) using path expressions.
Syntax: spath input=<field> [path=<path>] [output=<field>]
Note: Verify the target field exists in your index before using `spath`. Run `describe <index-pattern>` first to confirm the field name. The example below uses a representative field; adjust to match your actual schema.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where isnotnull(`attributes.gen_ai.tool.name`) | spath input=`attributes.gen_ai.tool.name` | head 10"}'
Join results from two indices.
Syntax: join left=<alias> right=<alias> ON <condition> <right-source> or join <right-source> on <field>
Types: inner, left, right, cross.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | join left=s right=l ON s.traceId = l.traceId logs-otel-v1-* | fields s.spanId, s.name, l.severityText, l.body | head 10"}'
Enrich results by looking up values from another index.
Syntax: lookup <lookup-index> <match-field> [AS <alias>] [OUTPUT <field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | lookup otel-v2-apm-service-map serviceName AS `sourceNode` | fields serviceName, `targetNode`, durationInNanos | head 10"}'
Note: The service map index (`otel-v2-apm-service-map`) uses nested fields `sourceNode` and `targetNode`, not `serviceName`. Match accordingly when joining or looking up against this index.
Perform graph traversal lookups for hierarchical or connected data.
Syntax: graphlookup <index> connectFromField=<field> connectToField=<field> [maxDepth=<N>] [as <alias>]
Caveat: `graphlookup` has limited support in OpenSearch 3.x PPL and may not work as expected. Test carefully against your OpenSearch version before relying on this command in production queries.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v2-apm-service-map | graphlookup otel-v2-apm-service-map connectFromField=`destination.domain` connectToField=serviceName maxDepth=3 as dependencies | head 10"}'
Use a nested query as a data source or filter.
Syntax: where <field> IN [ source=<index> | ... | fields <field> ]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where traceId IN [ source=otel-v1-apm-span-* | where `status.code` = 2 | fields traceId ] | fields traceId, spanId, serviceName, name | head 20"}'
Append results from another query to the current result set.
Syntax: append [ source=<index> | ... ]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | append [ source=logs-otel-v1-* | stats count() as cnt by `resource.attributes.service.name` ] | head 20"}'
Append columns from another query to the current result set.
Syntax: appendcol [ <commands> ]
Caveat: `source=` is not valid inside `appendcol [ ]` subqueries. The subquery inside `appendcol` operates on the current result set, not a new index. Use `append` followed by reshaping if you need to bring in data from another index.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as span_count | appendcol [ stats avg(durationInNanos) as avg_duration ]"}'
Append the results of a sub-pipeline to the current results.
Syntax: appendpipe [ <commands> ]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | appendpipe [ stats sum(cnt) as total ]"}'
Replace null values with a specified value.
Syntax: fillnull [with <value>] [<field-list>]
Caveat: Backtick-quoted field names are not supported in the `fillnull` field list. Use `eval` to rename dotted fields to simple names before applying `fillnull`, or apply `fillnull` without a field list to fill all null fields.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval in_tokens = `attributes.gen_ai.usage.input_tokens`, out_tokens = `attributes.gen_ai.usage.output_tokens` | fillnull with 0 in_tokens, out_tokens | fields traceId, in_tokens, out_tokens | head 10"}'
Flatten nested fields into top-level fields.
Syntax: flatten <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | flatten events | head 10"}'
Expand multi-value or array fields into separate rows.
Syntax: expand <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | expand events | fields traceId, spanId, events | head 20"}'
Pivot rows into columns.
Syntax: transpose [<N>] [include_null=<bool>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | transpose"}'
Convert field types (e.g., string to number).
Syntax: convert <function>(<field>) [as <alias>]
Functions: auto(), num(), ip(), ctime(), dur2sec(), mktime(), mstime(), rmcomma(), rmunit(), memk(), none().
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_str = CAST(durationInNanos AS STRING) | convert num(duration_str) as duration_num | fields traceId, duration_num | head 10"}'
Replace values in a field using the replace() string function inside eval.
Syntax: eval <field> = replace(<field>, '<old>', '<new>')
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | eval severityText = replace(severityText, '\''ERROR'\'', '\''ERR'\'') | fields severityText, body | head 10"}'
Reverse the order of results.
Syntax: reverse
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | sort startTime | head 20 | reverse"}'
Expand a multi-value field into separate rows (one row per value).
Syntax: mvexpand <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | mvexpand events | fields traceId, spanId, events | head 20"}'
Combine multiple rows with the same key into a single row with a multi-value field.
Syntax: mvcombine <field>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | fields traceId, serviceName | mvcombine serviceName | head 10"}'
Convert a multi-value field to a single-value field (takes the first value or joins with a delimiter).
Syntax: nomv <field>
Caveat: `nomv` only works on string arrays, not nested object arrays. If the field contains nested objects (e.g., `events` with sub-fields), `nomv` will fail or produce unexpected results. Use `flatten` or `expand` for nested object arrays instead.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | nomv events | fields traceId, events | head 10"}'
Add a summary row at the bottom with column totals.
Syntax: addcoltotals [<field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as cnt by serviceName | addcoltotals"}'
Add a new field to each row containing the sum of specified numeric fields.
Syntax: addtotals [row=<bool>] [col=<bool>] [<field-list>]
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats sum(`attributes.gen_ai.usage.input_tokens`) as input_tok, sum(`attributes.gen_ai.usage.output_tokens`) as output_tok by serviceName | addtotals"}'
Anomaly detection — identify anomalous values in numeric fields using built-in ML algorithms.
Syntax: ad [time_field=<field>] [number_of_trees=<N>] [shingle_size=<N>] [time_zone=<tz>]
Note: The `ad` command does not take a positional field argument. It auto-detects the input field(s) from the preceding `stats` or `eval` output in the pipeline.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where durationInNanos > 0 | ad time_field=startTime number_of_trees=100 time_zone=\"UTC\" | head 50"}'
Cluster data points using the k-means algorithm.
Syntax: kmeans [centroids=<N>] [iterations=<N>] [distance_type=<type>]
Note: The `kmeans` command does not take positional field arguments. It operates on all numeric fields from the preceding pipeline output. Use `fields` or `eval` before `kmeans` to control which numeric fields are used for clustering.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where durationInNanos > 0 | fields traceId, serviceName, durationInNanos | kmeans centroids=3 | fields traceId, serviceName, durationInNanos, ClusterID | head 30"}'
General ML command for running machine learning algorithms on query results.
Syntax: ml action=<algorithm> [parameters...]
Supported algorithms include: kmeans, ad (anomaly detection).
Note: `ml action=rcf` is not a valid action in OpenSearch 3.x PPL. Random Cut Forest anomaly detection is accessed via the `ad` command directly (see the `ad` section above), not through `ml action=rcf`.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where durationInNanos > 0 | ml action=kmeans centroids=3 | head 50"}'
Inspect the index mapping and field types for an index.
Syntax: describe <index-pattern>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "describe otel-v1-apm-span-*"}'
Show the query execution plan (used via the _explain API endpoint rather than as an inline command).
Syntax: Use the /_plugins/_ppl/_explain endpoint with the query body.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl/_explain" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `status.code` = 2 | stats count() by serviceName"}'
List all configured data sources available for PPL queries.
Syntax: show datasources
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "show datasources"}'
Execute multiple PPL queries in a single request. Each query is independent.
Syntax: Use the /_plugins/_ppl endpoint with multiple queries separated by newlines (NDJSON format), or execute sequentially.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as total_spans"}'
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | stats count() as total_logs"}'
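The two calls above can be scripted as a loop. A sketch, assuming `jq` and the environment variables are in place:

```shell
# Run a list of PPL queries back to back; bodies are built with jq for safe escaping.
for q in \
  'source=otel-v1-apm-span-* | stats count() as total_spans' \
  'source=logs-otel-v1-* | stats count() as total_logs'
do
  body=$(jq -n --arg q "$q" '{query: $q}')
  printf '%s\n' "$body"
  # Uncomment to execute each query against the cluster:
  # curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
  #   -X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
  #   -H 'Content-Type: application/json' -d "$body"
done
```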
Format the display of a field's values without changing the underlying data.
Syntax: fieldformat <field> = <format-expression>
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = durationInNanos / 1000000 | fieldformat duration_ms = CONCAT(CAST(duration_ms AS STRING), '\'' ms'\'') | fields traceId, serviceName, duration_ms | head 10"}'
The span() function buckets numeric or datetime values into intervals. Used with stats, timechart, and chart.
Syntax: span(<field>, <interval>)
| Unit | Description | Example |
|---|---|---|
| ms | Milliseconds | span(startTime, 500ms) |
| s | Seconds | span(startTime, 30s) |
| m | Minutes | span(startTime, 5m) |
| h | Hours | span(startTime, 1h) |
| d | Days | span(startTime, 1d) |
| w | Weeks | span(startTime, 1w) |
| M | Months | span(startTime, 1M) |
| q | Quarters | span(startTime, 1q) |
| y | Years | span(startTime, 1y) |
For numeric fields, the interval is a plain number:
stats count() by span(durationInNanos, 1000000000)
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as span_count, avg(durationInNanos) as avg_duration by span(startTime, 1h)"}'
Rate functions normalize aggregation values to a per-time-unit rate within timechart:
| Function | Description |
|---|---|
| per_second(<agg>) | Aggregation value per second |
| per_minute(<agg>) | Aggregation value per minute |
| per_hour(<agg>) | Aggregation value per hour |
| per_day(<agg>) | Aggregation value per day |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | timechart span=5m per_second(count()) as requests_per_sec"}'
Used with stats, eventstats, streamstats, timechart, and chart commands.
| Function | Syntax | Description |
|---|---|---|
| COUNT | count() | Count of events |
| SUM | sum(field) | Sum of numeric values |
| AVG | avg(field) | Arithmetic mean |
| MAX | max(field) | Maximum value |
| MIN | min(field) | Minimum value |
| VAR_SAMP | var_samp(field) | Sample variance |
| VAR_POP | var_pop(field) | Population variance |
| STDDEV_SAMP | stddev_samp(field) | Sample standard deviation |
| STDDEV_POP | stddev_pop(field) | Population standard deviation |
| DISTINCT_COUNT | distinct_count(field) | Count of distinct values |
| PERCENTILE | percentile(field, pct) | Value at the given percentile |
| EARLIEST | earliest(field) | Earliest (first chronological) value |
| LATEST | latest(field) | Latest (most recent) value |
| LIST | list(field) | All values as a list |
| VALUES | values(field) | Distinct values as a list |
| FIRST | first(field) | First value encountered |
| LAST | last(field) | Last value encountered |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | stats count() as total, avg(durationInNanos) as avg_ns, percentile(durationInNanos, 95) as p95_ns, distinct_count(serviceName) as services"}'
Functions for working with multi-value fields and arrays.
| Function | Syntax | Description |
|---|---|---|
| ARRAY | array(val1, val2, ...) | Create an array from values |
| SPLIT | split(field, delimiter) | Split a string into an array |
| MVJOIN | mvjoin(field, delimiter) | Join multi-value field into a string |
| MVCOUNT | mvcount(field) | Count of values in a multi-value field |
| MVINDEX | mvindex(field, index) | Get value at index from multi-value field |
| MVFIRST | mvfirst(field) | First value of a multi-value field |
| MVLAST | mvlast(field) | Last value of a multi-value field |
| MVAPPEND | mvappend(field1, field2) | Append two multi-value fields |
| MVDEDUP | mvdedup(field) | Remove duplicates from multi-value field |
| MVSORT | mvsort(field) | Sort values in a multi-value field |
| MVZIP | mvzip(field1, field2, delim) | Zip two multi-value fields together |
| MVRANGE | mvrange(start, end, step) | Generate a range of numeric values |
| MVFILTER | mvfilter(expression) | Filter values in a multi-value field |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval tokens = array(`attributes.gen_ai.usage.input_tokens`, `attributes.gen_ai.usage.output_tokens`) | fields traceId, tokens | head 10"}'
Functions for conditional logic and null handling.
| Function | Syntax | Description |
|---|---|---|
| ISNULL | isnull(field) | Returns true if field is null |
| ISNOTNULL | isnotnull(field) | Returns true if field is not null |
| IF | if(cond, true_val, false_val) | Conditional expression |
| IFNULL | ifnull(field, default) | Return default if field is null |
| NULLIF | nullif(val1, val2) | Return null if val1 equals val2 |
| CASE | case(cond1, val1, cond2, val2, ..., else_val) | Multi-branch conditional |
| COALESCE | coalesce(val1, val2, ...) | First non-null value |
| LIKE | field LIKE 'pattern' | Wildcard pattern match (% and _) |
| IN | field IN (val1, val2, ...) | Check membership in a set |
| BETWEEN | field BETWEEN val1 AND val2 | Range check (inclusive) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval status_label = case(`status.code` = 0, '\''UNSET'\'', `status.code` = 1, '\''OK'\'', `status.code` = 2, '\''ERROR'\'') | stats count() by status_label"}'
Functions for type casting and conversion.
| Function | Syntax | Description |
|---|---|---|
| CAST | cast(field AS type) | Cast to a specified type (STRING, INT, LONG, FLOAT, DOUBLE, BOOLEAN, DATE, TIMESTAMP) |
| TOSTRING | tostring(field) | Convert to string |
| TONUMBER | tonumber(field) | Convert to number |
| TOINT | toint(field) | Convert to integer |
| TOLONG | tolong(field) | Convert to long |
| TOFLOAT | tofloat(field) | Convert to float |
| TODOUBLE | todouble(field) | Convert to double |
| TOBOOLEAN | toboolean(field) | Convert to boolean |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = CAST(durationInNanos AS DOUBLE) / 1000000.0 | fields traceId, serviceName, duration_ms | sort - duration_ms | head 10"}'
Functions for computing hash digests.
| Function | Syntax | Description |
|---|---|---|
| MD5 | md5(field) | MD5 hash of the value |
| SHA1 | sha1(field) | SHA-1 hash of the value |
| SHA2 | sha2(field, numBits) | SHA-2 hash (numBits: 224, 256, 384, 512) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval trace_hash = md5(traceId) | fields traceId, trace_hash | head 5"}'
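PPL's `md5()` returns a hex digest of the input string. A local Python equivalent, assuming a hypothetical trace id, is useful for spot-checking a value returned by the query above:

```python
import hashlib

trace_id = "0123456789abcdef"  # hypothetical trace id

# Hex digest of the UTF-8 bytes, mirroring md5(traceId) in the query:
trace_hash = hashlib.md5(trace_id.encode("utf-8")).hexdigest()
print(trace_hash)  # 32 lowercase hex characters
```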
Functions for date and time manipulation.
| Function | Syntax | Description |
|---|---|---|
| NOW | now() | Current timestamp |
| CURDATE | curdate() | Current date |
| CURTIME | curtime() | Current time |
| DATE_FORMAT | date_format(date, fmt) | Format a date (%Y-%m-%d %H:%i:%s) |
| DATE_ADD | date_add(date, INTERVAL n unit) | Add interval to date |
| DATE_SUB | date_sub(date, INTERVAL n unit) | Subtract interval from date |
| DATEDIFF | datediff(date1, date2) | Difference in days between two dates |
| DAY | day(date) | Day of month (1–31) |
| MONTH | month(date) | Month (1–12) |
| YEAR | year(date) | Year |
| HOUR | hour(time) | Hour (0–23) |
| MINUTE | minute(time) | Minute (0–59) |
| SECOND | second(time) | Second (0–59) |
| DAYOFWEEK | dayofweek(date) | Day of week (1=Sun, 7=Sat) |
| DAYOFYEAR | dayofyear(date) | Day of year (1–366) |
| WEEK | week(date) | Week number of the year |
| UNIX_TIMESTAMP | unix_timestamp(date) | Convert to Unix epoch seconds |
| FROM_UNIXTIME | from_unixtime(epoch) | Convert Unix epoch to timestamp |
| TIMESTAMPADD | timestampadd(unit, n, ts) | Add interval to timestamp |
| TIMESTAMPDIFF | timestampdiff(unit, ts1, ts2) | Difference between timestamps in given unit |
| PERIOD_ADD | period_add(period, n) | Add months to a period (YYMM/YYYYMM) |
| PERIOD_DIFF | period_diff(p1, p2) | Difference in months between periods |
| MAKETIME | maketime(h, m, s) | Create a time value |
| MAKEDATE | makedate(year, dayofyear) | Create a date from year and day-of-year |
| ADDDATE | adddate(date, INTERVAL n unit) | Alias for DATE_ADD |
| SUBDATE | subdate(date, INTERVAL n unit) | Alias for DATE_SUB |
| SYSDATE | sysdate() | Current date and time (evaluated at execution) |
| UTC_DATE | utc_date() | Current UTC date |
| UTC_TIME | utc_time() | Current UTC time |
| UTC_TIMESTAMP | utc_timestamp() | Current UTC timestamp |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where startTime > DATE_SUB(NOW(), INTERVAL 1 HOUR) | stats count() as recent_spans by serviceName"}'
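As a local sanity check of the epoch conversions and INTERVAL arithmetic used above, here is a Python sketch (the epoch value is arbitrary):

```python
from datetime import datetime, timedelta, timezone

# from_unixtime() / unix_timestamp() round-trip on a fixed epoch value:
epoch_seconds = 1_700_000_000
ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
round_trip = int(ts.timestamp())

# DATE_SUB(NOW(), INTERVAL 1 HOUR) analogue for a time-window filter:
one_hour_ago = datetime.now(timezone.utc) - timedelta(hours=1)

print(ts.isoformat(), round_trip == epoch_seconds)
```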
Operators for arithmetic, comparison, and logical expressions used in eval, where, and other commands.
| Operator | Description | Example |
|---|---|---|
| + | Addition | eval total = input_tokens + output_tokens |
| - | Subtraction | eval gap = endTime - startTime |
| * | Multiplication | eval cost = tokens * price_per_token |
| / | Division | eval duration_ms = durationInNanos / 1000000 |
| Operator | Description |
|---|---|
| = | Equal to |
| != or <> | Not equal to |
| < | Less than |
| > | Greater than |
| <= | Less than or equal to |
| >= | Greater than or equal to |
| Operator | Description |
|---|---|
| AND | Logical AND |
| OR | Logical OR |
| NOT | Logical NOT |
| XOR | Logical exclusive OR |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = durationInNanos / 1000000, total_tokens = `attributes.gen_ai.usage.input_tokens` + `attributes.gen_ai.usage.output_tokens` | where duration_ms > 1000 AND total_tokens > 0 | fields traceId, serviceName, duration_ms, total_tokens | head 10"}'
Functions for IP address operations.
| Function | Syntax | Description |
|---|---|---|
| CIDRMATCH | cidrmatch(ip_field, 'cidr') | Check if IP is within a CIDR range |
| GEOIP | geoip(ip_field) | Geo-locate an IP address (returns country, region, city, lat/lon) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where isnotnull(`attributes.net.peer.ip`) | where cidrmatch(`attributes.net.peer.ip`, '\''10.0.0.0/8'\'') | fields traceId, `attributes.net.peer.ip`, serviceName | head 10"}'
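`cidrmatch()` tests whether an IP falls inside a CIDR block. The same check can be reproduced locally with Python's `ipaddress` module, handy for validating a range before writing the query:

```python
import ipaddress

def cidr_match(ip, cidr):
    """Local analogue of PPL cidrmatch(ip_field, 'cidr')."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)

print(cidr_match("10.1.2.3", "10.0.0.0/8"))     # → True
print(cidr_match("192.168.1.1", "10.0.0.0/8"))  # → False
```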
Functions for working with JSON data.
| Function | Syntax | Description |
|---|---|---|
| JSON_EXTRACT | json_extract(field, path) | Extract value at JSON path |
| JSON_KEYS | json_keys(field) | Get all keys from a JSON object |
| JSON_VALID | json_valid(field) | Check if value is valid JSON |
| JSON_ARRAY | json_array(val1, val2, ...) | Create a JSON array |
| JSON_OBJECT | json_object(key1, val1, ...) | Create a JSON object |
| JSON_ARRAY_LENGTH | json_array_length(field) | Length of a JSON array |
| JSON_EXTRACT_PATH_TEXT | json_extract_path_text(field, path) | Extract value as text from JSON path |
| TO_JSON_STRING | to_json_string(field) | Convert value to JSON string |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where json_valid(`attributes.gen_ai.tool.call.arguments`) | eval tool_args = json_extract(`attributes.gen_ai.tool.call.arguments`, '\''$'\'') | fields traceId, `attributes.gen_ai.tool.name`, tool_args | head 10"}'
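The `json_valid()` then `json_extract()` pattern above amounts to parse-then-navigate. A local Python sketch of that pattern (the tool-call arguments string is invented):

```python
import json

def json_valid(s):
    """PPL json_valid() analogue: can the string be parsed as JSON?"""
    try:
        json.loads(s)
        return True
    except (ValueError, TypeError):
        return False

# Hypothetical gen_ai tool-call arguments payload:
args = '{"city": "Portland", "units": "metric"}'

assert json_valid(args)
city = json.loads(args)["city"]  # json_extract(args, '$.city') analogue
print(city)  # → Portland
```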
Functions for mathematical operations.
| Function | Syntax | Description |
|---|---|---|
| ABS | abs(val) | Absolute value |
| CEIL | ceil(val) | Round up to nearest integer |
| FLOOR | floor(val) | Round down to nearest integer |
| ROUND | round(val [, decimals]) | Round to N decimal places |
| SQRT | sqrt(val) | Square root |
| POW | pow(base, exp) | Exponentiation |
| MOD | mod(a, b) | Modulo (remainder) |
| LOG | log(val) | Natural logarithm |
| LOG2 | log2(val) | Base-2 logarithm |
| LOG10 | log10(val) | Base-10 logarithm |
| LN | ln(val) | Natural logarithm (alias for LOG) |
| EXP | exp(val) | e raised to the power of val |
| SIGN | sign(val) | Sign of value (-1, 0, 1) |
| TRUNCATE | truncate(val, decimals) | Truncate to N decimal places |
| PI | pi() | Value of π |
| E | e() | Value of Euler's number |
| RAND | rand([seed]) | Random float between 0 and 1 |
| ACOS | acos(val) | Arc cosine |
| ASIN | asin(val) | Arc sine |
| ATAN | atan(val) | Arc tangent |
| ATAN2 | atan2(y, x) | Two-argument arc tangent |
| COS | cos(val) | Cosine |
| SIN | sin(val) | Sine |
| TAN | tan(val) | Tangent |
| COT | cot(val) | Cotangent |
| DEGREES | degrees(radians) | Convert radians to degrees |
| RADIANS | radians(degrees) | Convert degrees to radians |
| CONV | conv(val, from_base, to_base) | Convert between number bases |
| CRC32 | crc32(val) | CRC-32 checksum |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval duration_ms = round(durationInNanos / 1000000.0, 2) | where duration_ms > 0 | fields traceId, serviceName, duration_ms | sort - duration_ms | head 10"}'
Full-text search functions for relevance-based querying.
| Function | Syntax | Description |
|---|---|---|
| MATCH | match(field, query) | Full-text match on a single field |
| MATCH_PHRASE | match_phrase(field, phrase) | Exact phrase match |
| MATCH_BOOL_PREFIX | match_bool_prefix(field, query) | Boolean prefix match |
| MATCH_PHRASE_PREFIX | match_phrase_prefix(field, prefix) | Phrase prefix match |
| MULTI_MATCH | multi_match([field1, field2], query) | Match across multiple fields |
| QUERY_STRING | query_string([field1, field2], query) | Lucene query string syntax |
| SIMPLE_QUERY_STRING | simple_query_string([field1, field2], query) | Simplified query string |
| HIGHLIGHT | highlight(field) | Return highlighted matching fragments |
| SCORE | score(relevance_func) | Return relevance score |
| SCOREQUERY | scorequery(relevance_func) | Filter by relevance score |
| MATCH_QUERY | match_query(field, query) | Alias for MATCH |
| WILDCARD_QUERY | wildcard_query(field, pattern) | Wildcard pattern match (* and ?) |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | where match(body, '\''timeout error'\'') | fields traceId, severityText, body | head 10"}'
Functions for computing statistical correlations and covariances.
| Function | Syntax | Description |
|---|---|---|
| COVAR_POP | covar_pop(field1, field2) | Population covariance |
| COVAR_SAMP | covar_samp(field1, field2) | Sample covariance |
Note: `corr()` is not a recognized stats aggregation function in OpenSearch 3.x PPL. To approximate Pearson correlation, use `eval` with a manual calculation, or compute the covariance and standard deviations separately. The `covar_samp` and `covar_pop` functions are supported.
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | where `attributes.gen_ai.usage.input_tokens` > 0 | stats covar_samp(`attributes.gen_ai.usage.input_tokens`, durationInNanos) as token_duration_covar"}'
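As the note above suggests, Pearson correlation can be recovered from the sample covariance and the two sample standard deviations: r = covar_samp(x, y) / (stddev_samp(x) · stddev_samp(y)). A local Python sketch of that identity on made-up token/duration samples:

```python
from statistics import mean, stdev

xs = [10, 20, 30, 40]       # e.g. input token counts (invented)
ys = [1.0, 2.1, 2.9, 4.2]   # e.g. durations (invented)

# Sample covariance, matching covar_samp(): divide by n - 1.
n = len(xs)
covar_samp = sum((x - mean(xs)) * (y - mean(ys)) for x, y in zip(xs, ys)) / (n - 1)

# Pearson r from covariance and sample standard deviations:
corr = covar_samp / (stdev(xs) * stdev(ys))
print(round(corr, 3))  # close to 1 for these nearly linear samples
```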
Functions for string manipulation.
| Function | Syntax | Description |
|---|---|---|
| CONCAT | concat(str1, str2, ...) | Concatenate strings |
| LENGTH | length(str) | String length in bytes |
| LOWER | lower(str) | Convert to lowercase |
| UPPER | upper(str) | Convert to uppercase |
| TRIM | trim(str) | Remove leading/trailing whitespace |
| LTRIM | ltrim(str) | Remove leading whitespace |
| RTRIM | rtrim(str) | Remove trailing whitespace |
| SUBSTRING | substring(str, start [, len]) | Extract substring |
| LEFT | left(str, len) | Leftmost N characters |
| RIGHT | right(str, len) | Rightmost N characters |
| REPLACE | replace(str, from, to) | Replace occurrences |
| REVERSE | reverse(str) | Reverse a string |
| LOCATE | locate(substr, str [, pos]) | Position of substring |
| POSITION | position(substr IN str) | Position of substring |
| ASCII | ascii(str) | ASCII code of first character |
| CHAR_LENGTH | char_length(str) | Character count |
| CHARACTER_LENGTH | character_length(str) | Alias for CHAR_LENGTH |
| OCTET_LENGTH | octet_length(str) | Byte count |
| BIT_LENGTH | bit_length(str) | Bit count |
| LPAD | lpad(str, len, pad) | Left-pad to length |
| RPAD | rpad(str, len, pad) | Right-pad to length |
| SPACE | space(n) | String of N spaces |
| REPEAT | repeat(str, n) | Repeat string N times |
| STRCMP | strcmp(str1, str2) | Compare strings (-1, 0, 1) |
| SUBSTR | substr(str, start [, len]) | Alias for SUBSTRING |
| MID | mid(str, start, len) | Alias for SUBSTRING |
| FIELD | field(str, val1, val2, ...) | Index of str in value list |
| FIND_IN_SET | find_in_set(str, strlist) | Position in comma-separated list |
| FORMAT | format(val, decimals) | Format number with commas and decimals |
| INSERT | insert(str, pos, len, newstr) | Insert string at position |
| INSTR | instr(str, substr) | Position of first occurrence |
| REGEXP | regexp(str, pattern) | Regex match (returns 1 or 0) |
| REGEXP_EXTRACT | regexp_extract(str, pattern [, group]) | Extract regex capture group |
| REGEXP_REPLACE | regexp_replace(str, pattern, replacement) | Replace regex matches |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=logs-otel-v1-* | eval body_lower = lower(body) | where body_lower like '\''%exception%'\'' | eval short_body = left(body, 200) | fields traceId, severityText, short_body | head 10"}'
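`regexp_extract()` pulls a capture group out of a string. A local Python analogue, on an invented log body, is useful for testing a pattern before embedding it in a query:

```python
import re

# Hypothetical log body:
body = "GET /api/v1/users failed with TimeoutException after 30s"

# regexp_extract(body, '(\\w+Exception)', 1) analogue — group 1 of the first match:
match = re.search(r"(\w+Exception)", body)
print(match.group(1) if match else None)  # → TimeoutException
```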
Function for inspecting the runtime type of a value.
| Function | Syntax | Description |
|---|---|---|
| TYPEOF | typeof(field) | Returns the data type of a field value |
curl -sk -u "$OPENSEARCH_USER:$OPENSEARCH_PASSWORD" \
-X POST "$OPENSEARCH_ENDPOINT/_plugins/_ppl" \
-H 'Content-Type: application/json' \
-d '{"query": "source=otel-v1-apm-span-* | eval type_of_duration = typeof(durationInNanos) | fields traceId, durationInNanos, type_of_duration | head 5"}'
This PPL reference is sourced from the opensearch-project/sql repository's docs/user/ppl/ directory.
Repository: https://github.com/opensearch-project/sql
The PPL grammar is maintained as part of the OpenSearch SQL plugin. For the latest syntax additions and changes, consult the repository documentation directly.