From truefoundry-gateway
Queries TrueFoundry spans API to monitor AI Gateway traffic, costs, latency, errors, and token usage from request traces. Useful for recent requests, cost breakdowns, error rates, model usage, and latency analysis.
npx claudepluginhub truefoundry/tfy-gateway-skills --plugin truefoundry-gateway
> Routing note: For ambiguous user intents, use the shared clarification templates in [references/intent-clarification.md](references/intent-clarification.md).
Query AI Gateway request traces, costs, latency, errors, and token usage via the spans query API.
Investigate gateway traffic: recent requests, cost breakdowns, error rates, model usage, per-user activity, MCP tool calls, or latency analysis.
Other skills cover adjacent tasks:
- tracing skill (this skill is for querying existing gateway traces, not adding instrumentation)
- ai-gateway skill
- logs skill
- status skill

Run the status skill first to confirm TFY_BASE_URL and TFY_API_KEY are set and valid.
When using the direct API, set TFY_API_SH to the full path of this skill's scripts/tfy-api.sh. See references/tfy-api-setup.md for paths per agent.
Every query requires one of these two parameters. Ask the user which one to use:
| Parameter | Description |
|---|---|
| tracingProjectFqn | Fully qualified name of the tracing project, e.g. tenant:tracing-project:name |
| dataRoutingDestination | Data routing destination name, e.g. default |
If the user does not know which to use, suggest "dataRoutingDestination": "default" as a starting point.
Endpoint: POST /api/svc/v1/spans/query
# Set the path to tfy-api.sh for your agent (example for Claude Code):
TFY_API_SH=~/.claude/skills/truefoundry-ai-monitoring/scripts/tfy-api.sh
# Basic query: recent spans in the last 24 hours
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"endTime": "2026-03-27T00:00:00.000Z",
"dataRoutingDestination": "default",
"limit": 50,
"sortDirection": "desc"
}'
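The timestamps in the example above are fixed dates. A minimal sketch of how a rolling "last 24 hours" request body could be built instead is shown here; the helper names `iso_millis` and `last_hours_body` are hypothetical and not part of the skill, and the body fields are the ones used in the basic query.

```python
import json
from datetime import datetime, timedelta, timezone


def iso_millis(ts):
    """Format an aware datetime as UTC ISO 8601 with milliseconds and a Z suffix."""
    ts = ts.astimezone(timezone.utc)
    return ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts.microsecond // 1000:03d}Z"


def last_hours_body(hours, now, destination="default", limit=50):
    """Build a spans query body covering the `hours` leading up to `now`."""
    return {
        "startTime": iso_millis(now - timedelta(hours=hours)),
        "endTime": iso_millis(now),
        "dataRoutingDestination": destination,
        "limit": limit,
        "sortDirection": "desc",
    }


# A fixed "now" keeps the example reproducible; in practice pass
# datetime.now(timezone.utc).
now = datetime(2026, 3, 27, 0, 0, 0, tzinfo=timezone.utc)
body = last_hours_body(24, now)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the third argument to the script in place of the hand-written body.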
Recent requests, with endTime omitted so the range runs up to now:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"limit": 20,
"sortDirection": "desc"
}'
Filter for LLM spans and extract cost attributes:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"filters": [
{"spanAttributeKey": "tfy.span_type", "operator": "eq", "value": "LLM"}
],
"limit": 200,
"sortDirection": "desc"
}'
Cost fields in spanAttributes:
- gen_ai.usage.cost or tfy.request_cost -- cost of the request
- gen_ai.usage.input_tokens -- input token count
- gen_ai.usage.output_tokens -- output token count

Filter for spans with an ERROR status:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"filters": [
{"spanFieldName": "statusCode", "operator": "eq", "value": "ERROR"}
],
"limit": 50,
"sortDirection": "desc"
}'
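The filtered query above returns only the failing spans. To get an error rate, one option is to fetch spans without the statusCode filter and compute the fraction client-side. The sketch below assumes the spans have already been parsed from the response's data array; `error_rate` is a hypothetical helper name.

```python
def error_rate(spans):
    """Return the fraction of spans whose statusCode is ERROR (0.0 if empty)."""
    if not spans:
        return 0.0
    errors = sum(1 for span in spans if span.get("statusCode") == "ERROR")
    return errors / len(spans)


# Illustrative sample, not real gateway data.
sample = [
    {"statusCode": "OK"},
    {"statusCode": "OK"},
    {"statusCode": "ERROR"},
    {"statusCode": "OK"},
]
rate = error_rate(sample)
print(f"{rate:.1%}")  # 25.0%
```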
Query all LLM spans and extract model info from span attributes to see which models are being used:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"filters": [
{"spanAttributeKey": "tfy.span_type", "operator": "eq", "value": "LLM"}
],
"limit": 200,
"sortDirection": "desc"
}'
Parse spanAttributes in the response for model name fields.
Filter by subject slug (e.g. email) to see one user's activity:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"createdBySubjectSlugs": ["user@example.com"],
"limit": 50,
"sortDirection": "desc"
}'
You can also filter by subject type:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"createdBySubjectTypes": ["virtualaccount"],
"limit": 50,
"sortDirection": "desc"
}'
Filter for MCP spans:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"filters": [
{"spanAttributeKey": "tfy.span_type", "operator": "eq", "value": "MCP"}
],
"limit": 50,
"sortDirection": "desc"
}'
For MCP Gateway spans, use "value": "MCPGateway" instead.
Filter by application name:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"applicationNames": ["tfy-llm-gateway"],
"limit": 50,
"sortDirection": "desc"
}'
Filter by span name:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"filters": [
{"spanFieldName": "spanName", "operator": "contains", "value": "completions"}
],
"limit": 50,
"sortDirection": "desc"
}'
Filter by gateway request metadata:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"filters": [
{"gatewayRequestMetadataKey": "tfy_gateway_region", "operator": "eq", "value": "US"}
],
"limit": 50,
"sortDirection": "desc"
}'
| Field | Type | Required | Description |
|---|---|---|---|
| startTime | string (ISO 8601) | Yes | Start of time range |
| endTime | string (ISO 8601) | No | End of time range (defaults to now) |
| tracingProjectFqn | string | One of this or dataRoutingDestination | Tracing project FQN |
| dataRoutingDestination | string | One of this or tracingProjectFqn | Data routing destination |
| traceIds | string[] | No | Filter by trace IDs |
| spanIds | string[] | No | Filter by span IDs |
| parentSpanIds | string[] | No | Filter by parent span IDs |
| createdBySubjectTypes | string[] | No | Filter by subject type (user, virtualaccount) |
| createdBySubjectSlugs | string[] | No | Filter by subject slug (e.g. email) |
| applicationNames | string[] | No | Filter by application name |
| limit | integer | No | Max results (default 200) |
| sortDirection | string | No | asc or desc |
| pageToken | string | No | Pagination token from previous response |
| filters | array | No | Array of filter objects (see Filter Types) |
| includeFeedbacks | boolean | No | Include feedback data |
Filter Types: each entry in filters takes one of three forms.
Span field filter:
{"spanFieldName": "<field>", "operator": "<op>", "value": "<val>"}
Fields: spanName, serviceName, spanKind, statusCode, etc.
Span attribute filter:
{"spanAttributeKey": "<key>", "operator": "<op>", "value": "<val>"}
Any key from the spanAttributes dict (e.g. tfy.span_type, gen_ai.usage.cost).
Gateway request metadata filter:
{"gatewayRequestMetadataKey": "<key>", "operator": "<op>", "value": "<val>"}
Custom metadata keys set via X-TFY-LOGGING-CONFIG headers.
Operators: eq, neq, contains, not_contains, starts_with, ends_with
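Because the three filter shapes differ only in the name of the key field, a small helper can build them and catch typos before a request is sent. This is a sketch under that assumption; `make_filter`, `KEY_FIELDS`, and `OPERATORS` are hypothetical names, and this document does not state how the API combines multiple filters in one array.

```python
# Key fields and operators as listed in this document.
KEY_FIELDS = {"spanFieldName", "spanAttributeKey", "gatewayRequestMetadataKey"}
OPERATORS = {"eq", "neq", "contains", "not_contains", "starts_with", "ends_with"}


def make_filter(key_field, key, operator, value):
    """Build one filter object, rejecting unknown key fields or operators."""
    if key_field not in KEY_FIELDS:
        raise ValueError(f"unknown filter key field: {key_field}")
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator}")
    return {key_field: key, "operator": operator, "value": value}


filters = [
    make_filter("spanAttributeKey", "tfy.span_type", "eq", "LLM"),
    make_filter("spanFieldName", "statusCode", "eq", "ERROR"),
]
print(filters)
```

The resulting list can be placed under the filters field of a query body.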
Response format:
{
  "data": [
    {
      "spanId": "...",
      "traceId": "...",
      "parentSpanId": "...",
      "serviceName": "tfy-llm-gateway",
      "spanName": "POST https://api.openai.com/v1/chat/completions",
      "spanKind": "Client",
      "scopeName": "...",
      "scopeVersion": "...",
      "timestamp": "2026-03-26T14:30:00.000Z",
      "durationNs": 1234567890,
      "statusCode": "OK",
      "statusMessage": "",
      "spanAttributes": {
        "gen_ai.usage.input_tokens": 150,
        "gen_ai.usage.output_tokens": 80,
        "gen_ai.usage.cost": 0.0023,
        "tfy.request_cost": 0.0023,
        "tfy.span_type": "LLM"
      },
      "events": [],
      "createdBySubject": {
        "subjectId": "...",
        "subjectSlug": "user@example.com",
        "subjectType": "user",
        "tenantName": "my-tenant"
      },
      "feedbacks": []
    }
  ],
  "pagination": {
    "nextPageToken": "..."
  }
}
When the response includes pagination.nextPageToken, pass it as pageToken in the next request to fetch the next page:
$TFY_API_SH POST '/api/svc/v1/spans/query' '{
"startTime": "2026-03-26T00:00:00.000Z",
"dataRoutingDestination": "default",
"limit": 200,
"pageToken": "TOKEN_FROM_PREVIOUS_RESPONSE"
}'
Continue until nextPageToken is null or absent.
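The pagination loop described above can be sketched as a generator. The `query` argument is a hypothetical callable that sends one request body and returns the parsed JSON response; it could, for example, shell out to scripts/tfy-api.sh, assuming the script prints the response JSON on stdout. The `max_pages` cap is an added safety guard, not something the API requires.

```python
def iter_spans(query, body, max_pages=100):
    """Yield spans page by page, following pagination.nextPageToken."""
    request = {k: v for k, v in body.items() if k != "pageToken"}
    for _ in range(max_pages):
        response = query(request)
        yield from response.get("data", [])
        token = (response.get("pagination") or {}).get("nextPageToken")
        if not token:  # null or absent: no more pages
            break
        request = {**body, "pageToken": token}


# Stand-in for the real API: two canned pages keyed by page token.
pages = {
    None: {"data": [{"spanId": "a"}, {"spanId": "b"}],
           "pagination": {"nextPageToken": "t1"}},
    "t1": {"data": [{"spanId": "c"}], "pagination": {}},
}


def fake_query(request):
    return pages[request.get("pageToken")]


base_body = {
    "startTime": "2026-03-26T00:00:00.000Z",
    "dataRoutingDestination": "default",
    "limit": 200,
}
span_ids = [span["spanId"] for span in iter_spans(fake_query, base_body)]
print(span_ids)  # ['a', 'b', 'c']
```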
Format results as tables for readability:
Recent Gateway Requests (last 24h):
| Time | Model | Status | Tokens (in/out) | Cost | Latency | User |
|---------------------|----------------|--------|-----------------|----------|-----------|-------------------|
| 2026-03-26 14:30:00 | openai/gpt-4o | OK | 150 / 80 | $0.0023 | 1.23s | user@example.com |
| 2026-03-26 14:29:55 | anthropic/... | OK | 200 / 120 | $0.0045 | 2.10s | bot@svc |
| 2026-03-26 14:29:30 | openai/gpt-4o | ERROR | 100 / 0 | $0.0000 | 0.45s | user@example.com |
For cost summaries, aggregate across spans:
Cost Summary (last 24h):
| Model | Requests | Total Cost | Avg Cost/Req | Total Tokens |
|--------------------|----------|------------|--------------|--------------|
| openai/gpt-4o | 142 | $3.21 | $0.023 | 45,200 |
| anthropic/claude | 58 | $1.87 | $0.032 | 22,100 |
| Total | 200 | $5.08 | $0.025 | 67,300 |
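A summary like the one above can be produced by grouping spans by model and summing the cost and token attributes. The sketch below makes several assumptions: the attribute key holding the model name is not given in this document, so `MODEL_KEY` is a placeholder (`gen_ai.request.model` is the OpenTelemetry GenAI convention and may not match what the gateway emits); total tokens is taken to be input plus output; and the sample spans are illustrative, not real data.

```python
from collections import defaultdict

# Assumption: replace with the model attribute key found in your spanAttributes.
MODEL_KEY = "gen_ai.request.model"


def span_cost(attrs):
    """Read the request cost, preferring gen_ai.usage.cost over tfy.request_cost."""
    cost = attrs.get("gen_ai.usage.cost")
    if cost is None:
        cost = attrs.get("tfy.request_cost", 0.0)
    return float(cost)


def cost_by_model(spans, model_key=MODEL_KEY):
    """Aggregate request count, total cost, and total tokens per model."""
    totals = defaultdict(lambda: {"requests": 0, "cost": 0.0, "tokens": 0})
    for span in spans:
        attrs = span.get("spanAttributes", {})
        row = totals[attrs.get(model_key, "unknown")]
        row["requests"] += 1
        row["cost"] += span_cost(attrs)
        row["tokens"] += (attrs.get("gen_ai.usage.input_tokens", 0)
                          + attrs.get("gen_ai.usage.output_tokens", 0))
    return dict(totals)


spans = [
    {"spanAttributes": {MODEL_KEY: "openai/gpt-4o", "gen_ai.usage.cost": 0.0023,
                        "gen_ai.usage.input_tokens": 150,
                        "gen_ai.usage.output_tokens": 80}},
    {"spanAttributes": {MODEL_KEY: "openai/gpt-4o", "tfy.request_cost": 0.0045,
                        "gen_ai.usage.input_tokens": 200,
                        "gen_ai.usage.output_tokens": 120}},
]
summary = cost_by_model(spans)
for model, row in summary.items():
    avg = row["cost"] / row["requests"]
    print(f"| {model} | {row['requests']} | ${row['cost']:.4f} | ${avg:.4f} | {row['tokens']:,} |")
```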
Convert durationNs (nanoseconds) to human-readable format: divide by 1,000,000,000 for seconds.
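As a small worked example of that conversion, the hypothetical helper below formats durationNs as seconds with two decimals, matching the Latency column in the sample table.

```python
NS_PER_SECOND = 1_000_000_000


def format_duration(duration_ns):
    """Convert a nanosecond duration to a string in seconds, two decimals."""
    return f"{duration_ns / NS_PER_SECOND:.2f}s"


print(format_duration(1234567890))  # 1.23s
print(format_duration(450_000_000))  # 0.45s
```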
<success_criteria>
- The user has confirmed dataRoutingDestination or tracingProjectFqn before querying
</success_criteria>
Related skills:
- status skill to verify credentials before querying
- ai-gateway skill to configure models, routing, rate limits
- tracing skill to add tracing to your own applications (different from monitoring existing gateway traces)
- logs skill for application-level logs (not gateway request traces)
- access-tokens skill to create/manage PAT or VAT used for gateway auth

Missing required parameter. Ensure you provide either:
- "tracingProjectFqn": "tenant:tracing-project:name"
- "dataRoutingDestination": "default"
And a valid "startTime" in ISO 8601 format.
Authentication failed. Run the status skill to verify your TFY_API_KEY is valid.
Empty results. Check:
- Time range is correct (startTime/endTime)
- The dataRoutingDestination or tracingProjectFqn exists
- Filters are not too restrictive (try removing filters first)
- Gateway has actually received requests in this time period
If a pageToken returns an error, restart the query from the beginning with a fresh request (no pageToken).