From arize-toolkit
Retrieves and debugs trace and span data from Arize ML observability platform using arize_toolkit CLI. Lists recent traces, fetches by ID, shows spans, analyzes latency/tokens/cost, exports data.
npx claudepluginhub duncankmckinnon/arize_toolkit --plugin arize-toolkit
This skill uses the workspace's default tool permissions.
Retrieve trace and span data from Arize using the `arize_toolkit` CLI.
Every arize_toolkit trace command MUST use --json.
--json is a global flag that goes BEFORE the subcommand: arize_toolkit --json traces ... (NOT arize_toolkit traces --json ...). Without it, output renders as Rich tables that wrap poorly, are hard to parse, and waste tokens.
Correct: arize_toolkit --json traces list --model-name my-agent
Wrong: arize_toolkit traces list --json --model-name my-agent (--json in wrong position)
Wrong: arize_toolkit traces --json list --model-name my-agent (--json in wrong position)
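When the CLI is driven from a script, one way to avoid misplacing the flag is to build the argument list in a single place. The following is a minimal sketch, not part of arize_toolkit: the helper names `build_command` and `run_json` are hypothetical, and it assumes the `arize_toolkit` executable is on PATH.

```python
import json
import subprocess


def build_command(subcommand_args, profile=None):
    """Return an argv list with the global --json flag before the subcommand.

    Hypothetical helper: only the flag ordering comes from the rules above.
    """
    cmd = ["arize_toolkit", "--json"]
    if profile:
        # --profile is also passed before the subcommand.
        cmd += ["--profile", profile]
    return cmd + list(subcommand_args)


def run_json(subcommand_args, profile=None):
    """Run the CLI and parse stdout as JSON (requires arize_toolkit installed)."""
    result = subprocess.run(
        build_command(subcommand_args, profile),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

For example, `build_command(["traces", "list", "--model-name", "my-agent"])` yields the same argument order as the "Correct" line above.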
When using traces get, use --all or --columns based on the user's column detail choice (see Step 3). Truncate input.value and output.value with jq [:120] in list views; show full values only when inspecting individual spans.
1. Check Setup → 2. List Traces → 3. Choose Column Detail → 4. Get Trace Detail → 5. Summarize
Verify the CLI is installed:
arize_toolkit --version
If not installed:
pip install "arize_toolkit[cli]"
Verify configuration:
arize_toolkit config list
If no profile exists, ask the user for their API key, organization name, and space name, then create the profile:
arize_toolkit config init --api-key "API_KEY" --org "ORG_NAME" --space "SPACE_NAME"
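Always specify --start-time to narrow the query window. The default window is 7 days, which can be slow and hit rate limits. Use a short window (e.g., 1 hour) unless the user needs a wider range. Generate the ISO timestamp dynamically. The examples below use the BSD/macOS date syntax (-v); with GNU date on Linux, use date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ instead: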
# Last 1 hour (recommended default)
arize_toolkit --json traces list --model-name my-agent --count 5 \
--start-time "$(date -u -v-1H +%Y-%m-%dT%H:%M:%SZ)" \
| jq '.[] | {name, traceId, statusCode, latencyMs, input: .["attributes.input.value"][:120]}'
# Last 24 hours
arize_toolkit --json traces list --model-name my-agent --count 5 \
--start-time "$(date -u -v-24H +%Y-%m-%dT%H:%M:%SZ)"
# Specific time window
arize_toolkit --json traces list --model-name my-agent --count 5 --start-time 2025-01-01T00:00:00Z
# Sort ascending (oldest first)
arize_toolkit --json traces list --model-name my-agent --count 5 --sort asc
# More results when needed
arize_toolkit --json traces list --model-name my-agent --count 20
# Export to CSV
arize_toolkit traces list --model-name my-agent --csv traces.csv
# Use model ID instead of name
arize_toolkit --json traces list --model-id "TW9kZWw6..."
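If the `date` flags differ across machines, the timestamp can also be computed in Python. This is a small sketch under that assumption; `start_time_iso` is a hypothetical helper name, and the output format matches the `%Y-%m-%dT%H:%M:%SZ` pattern used in the commands above.

```python
from datetime import datetime, timedelta, timezone


def start_time_iso(hours_back, now=None):
    """Return a UTC timestamp `hours_back` hours before `now`.

    `now` defaults to the current time; it is a parameter only so the
    function can be tested with a fixed clock.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return (now - timedelta(hours=hours_back)).strftime("%Y-%m-%dT%H:%M:%SZ")
```

The returned string can be passed directly as the value of --start-time.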
Present results as a table of traces with: trace ID, root span name, status, latency, start time.
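A rough sketch of rendering that table from the parsed --json output follows. The keys `traceId`, `name`, `statusCode`, and `latencyMs` appear in the jq filters in this document; `startTime` is an assumed key name and may differ in the real output. `format_trace_table` is a hypothetical helper.

```python
def format_trace_table(traces):
    """Return a plain-text table (as a list of lines) for a list of trace dicts."""
    # "startTime" is an assumption; the other keys are used by the jq filters.
    columns = ["traceId", "name", "statusCode", "latencyMs", "startTime"]
    rows = [
        ["" if trace.get(c) is None else str(trace.get(c)) for c in columns]
        for trace in traces
    ]
    # Column width = widest cell in that column, header included.
    widths = [max([len(c)] + [len(r[i]) for r in rows]) for i, c in enumerate(columns)]

    def fmt(cells):
        return " | ".join(cell.ljust(w) for cell, w in zip(cells, widths)).rstrip()

    lines = [fmt(columns), "-+-".join("-" * w for w in widths)]
    lines += [fmt(r) for r in rows]
    return lines
```

Missing keys render as empty cells, so the sketch still works if a field is absent from the output.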
Before fetching span data, ask the user using AskUserQuestion:
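- Recommended columns: a small default set with lower token usage.
- All columns: use --all. Note: this pulls 30+ fields per span, including many empty values, which significantly increases context window usage in longer sessions.
- Specific columns: pass the user's chosen columns with --columns.

Remember their choice and use it for subsequent trace queries in the session.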
To discover what columns exist for a model:
arize_toolkit --json traces columns --model-name my-agent
Once the user picks a trace ID, get all spans using their chosen column detail level.
Recommended columns (lower token usage):
arize_toolkit --json traces get TRACE_ID --model-name my-agent \
--columns "attributes.input.value,attributes.output.value,attributes.llm.model_name,attributes.llm.token_count.prompt,attributes.llm.token_count.completion,attributes.tool.name"
All columns (higher token usage):
arize_toolkit --json traces get TRACE_ID --model-name my-agent --all
Specific columns (user-specified):
arize_toolkit --json traces get TRACE_ID --model-name my-agent \
--columns "attributes.input.value,attributes.output.value,attributes.tool.name"
Export to CSV (does not consume context tokens):
arize_toolkit traces get TRACE_ID --model-name my-agent --all --csv trace.csv
Present trace detail as:
- Span tree by parentId (root has parentId: "")

| Command | Option | Description |
|---|---|---|
| All | --model-name | Model name (either this or --model-id required) |
| All | --model-id | Model ID, base64-encoded (either this or --model-name required) |
| All | --start-time | Start of time window, ISO format (default: 7 days ago) |
| All | --end-time | End of time window, ISO format (default: now) |
| list | --count | Number of traces per page (default: 20) |
| list | --sort | Sort direction: desc or asc (default: desc) |
| list | --csv PATH | Export to CSV file |
| get | TRACE_ID | Trace ID to look up (positional argument) |
| get | --columns | Comma-separated column names to include |
| get | --all | Include all available columns (auto-discovered) |
| get | --count | Number of spans per page (default: 20) |
| get | --csv PATH | Export to CSV file |
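The parentId convention noted under "Present trace detail as" (the root span has an empty parentId) is enough to rebuild a span hierarchy from the `traces get` output. The sketch below assumes each span also carries an identifier under a key named `spanId`; that key name is an assumption, so it is exposed as a parameter. `render_span_tree` is a hypothetical helper.

```python
from collections import defaultdict


def render_span_tree(spans, id_key="spanId", parent_key="parentId"):
    """Return indented lines, one per span, nested by parent/child links.

    `id_key` is an assumed field name; adjust it to match the real output.
    """
    children = defaultdict(list)
    for span in spans:
        # A missing or null parent is treated like the root marker "".
        children[span.get(parent_key) or ""].append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            name = span.get("name", "unknown")
            lines.append(f"{'  ' * depth}{name} ({span.get('latencyMs')} ms)")
            span_id = span.get(id_key)
            if span_id:  # skip empty ids so roots are not revisited
                walk(span_id, depth + 1)

    walk("", 0)
    return lines
```

Spans whose parent is not present in the result (for example, when a page is truncated by --count) are not printed by this sketch.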
# List recent traces
arize_toolkit --json traces list --model-name my-agent --count 5 | jq '.[] | {name, traceId, statusCode, latencyMs, input: .["attributes.input.value"][:120]}'
# Get spans with recommended columns
arize_toolkit --json traces get TRACE_ID --model-name my-agent \
--columns "attributes.input.value,attributes.output.value,attributes.llm.model_name,attributes.llm.token_count.prompt,attributes.llm.token_count.completion,attributes.tool.name"
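With the token columns requested above, token usage for a trace can be totalled from the parsed JSON. This is a minimal sketch that assumes the output is a list of span objects keyed by the flat column names used in this document; `summarize_tokens` is a hypothetical helper.

```python
def summarize_tokens(spans):
    """Sum prompt and completion token counts across a list of span dicts."""
    prompt_key = "attributes.llm.token_count.prompt"
    completion_key = "attributes.llm.token_count.completion"

    def as_int(value):
        # Non-LLM spans usually have no token counts; treat those as 0.
        try:
            return int(float(value))
        except (TypeError, ValueError, OverflowError):
            return 0

    prompt = sum(as_int(s.get(prompt_key)) for s in spans)
    completion = sum(as_int(s.get(completion_key)) for s in spans)
    return {"prompt": prompt, "completion": completion, "total": prompt + completion}
```

Values that are missing, empty, or non-numeric count as zero rather than raising an error.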
Always prefer --json over Rich table output for trace inspection — it avoids terminal wrapping issues and is easier to filter. Use arize_toolkit --json (global flag, before the subcommand).
Compact span summary — name, kind, latency, truncated input/output:
arize_toolkit --json traces get TRACE_ID --model-name my-agent | jq '.[] | {name, spanKind, statusCode, latencyMs, input: .["attributes.input.value"][:80], output: .["attributes.output.value"][:80]}'
All attributes, formatted per-span — uses --json --all and pipes through Python to produce clean readable output with empty fields filtered out:
arize_toolkit --json traces get TRACE_ID --model-name my-agent --all | python3 -c "
import sys, json
# Read the JSON span list from stdin (stderr is not merged in, so warnings cannot corrupt the JSON)
data = json.load(sys.stdin)
for idx, span in enumerate(data):
    print(f'=== Span {idx+1}: {span.get(\"name\", \"unknown\")} ===')
    for k, v in span.items():
        if k == 'name':
            continue
        val = str(v).strip()
        # Skip empty fields
        if not val or val == 'None':
            continue
        # Truncate long values to keep the output compact
        if len(val) > 300:
            val = val[:300] + '...'
        print(f'  {k}: {val}')
    print()
"
Single span by name — get all non-empty attributes for a specific span (uses a jq file to avoid zsh != escaping issues):
cat > /tmp/span.jq << 'JQEOF'
first(.[] | select(.name == "SPAN_NAME")) | with_entries(select(.value != null and .value != ""))
JQEOF
arize_toolkit --json traces get TRACE_ID --model-name my-agent --all | jq -f /tmp/span.jq
List traces as compact summary:
arize_toolkit --json traces list --model-name my-agent | jq '.[] | {name, traceId, statusCode, latencyMs, input: .["attributes.input.value"][:80]}'
Export traces and spans to CSV:
arize_toolkit traces list --model-name my-agent --count 100 --csv traces.csv
arize_toolkit traces get TRACE_ID --model-name my-agent --all --csv spans.csv
Filter traces with errors:
arize_toolkit --json traces list --model-name my-agent | jq '[.[] | select(.statusCode == "ERROR")]'
Use a different profile:
arize_toolkit --profile staging traces list --model-name my-agent
- Always use --json: it is a GLOBAL flag that MUST go before the subcommand: arize_toolkit --json traces ... (not arize_toolkit traces --json)
- Always set --start-time: the default is 7 days, which is slow and burns rate limits. Use --start-time "$(date -u -v-1H +%Y-%m-%dT%H:%M:%SZ)" for the last hour as a default, and widen only if needed
- Start with --count 5: paginate up if the user needs more
- Truncate with [:120] on value fields to keep context small
- Avoid inline != filters: zsh escapes ! in inline jq, causing errors. Write filters to a temp file and use jq -f:
cat > /tmp/filter.jq << 'JQEOF'
.[] | with_entries(select(.value != null and .value != ""))
JQEOF
arize_toolkit --json traces get TRACE_ID --model-name my-agent --all | jq -f /tmp/filter.jq
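The same empty-field filter can be applied in Python after parsing the JSON, which avoids shell quoting altogether. A minimal sketch, with `drop_empty` as a hypothetical helper that mirrors the jq condition above (value is neither null nor an empty string):

```python
def drop_empty(span):
    """Return a copy of a span dict without null or empty-string values."""
    return {k: v for k, v in span.items() if v is not None and v != ""}
```

Falsy but meaningful values such as 0 or False are kept, as they are by the jq filter.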
- Always set --start-time to avoid slow queries and rate limits
- Use --help on any command for full usage: arize_toolkit traces list --help

| Issue | Solution |
|---|---|
| command not found | Install with pip install "arize_toolkit[cli]" |
| Authentication error | Check API key: arize_toolkit config show |
| No traces returned | Check model name and time window; widen --start-time if needed |
| Rate limit exceeded | Narrow the time window with --start-time; avoid default 7-day range |
| Missing columns | Run traces columns to discover available attributes |
| Wrong space/org | Use --space / --org flags or switch profile |

- Keep --count at 10-20 and paginate
- environmentName is always "tracing" for trace/span data (handled automatically by the CLI)