Debugs and inspects LLM/AI agent traces using PostHog MCP tools. Fetches traces by ID, analyzes spans/generations/tool calls, verifies context/subagents, and checks token usage/costs.
npx claudepluginhub anthropics/claude-plugins-official --plugin posthog

This skill uses the workspace's default tool permissions.
PostHog captures LLM/AI agent activity as traces. Each trace is a tree of events representing a single AI interaction — from the top-level agent invocation down to individual LLM API calls.
| Tool | Purpose |
|---|---|
| `posthog:query-llm-traces-list` | Search and list traces (compact — no large content) |
| `posthog:query-llm-trace` | Get a single trace by ID with full event tree |
| `posthog:execute-sql` | Ad-hoc SQL for complex trace analysis |
See the event reference for the full schema.
$ai_trace (top-level container)
└── $ai_span (logical groupings, e.g. "RAG retrieval", "tool execution")
├── $ai_generation (individual LLM API call)
└── $ai_embedding (embedding creation)
Events are linked via $ai_parent_id → parent's $ai_span_id or $ai_trace_id.
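The parent-linking rule above can be sketched as a small tree builder. This is a hedged example, not part of the skill's scripts: it assumes each event is a dict with a `properties` key exposing `$ai_trace_id`, `$ai_span_id`, and `$ai_parent_id` as shown in the schema, and the helper names are hypothetical.

```python
from collections import defaultdict

def build_tree(events):
    """Group events by parent: $ai_parent_id points at the parent's
    $ai_span_id, or falls back to $ai_trace_id for top-level events."""
    children = defaultdict(list)
    for ev in events:
        props = ev["properties"]
        parent = props.get("$ai_parent_id") or props.get("$ai_trace_id")
        children[parent].append(ev)
    return children

def print_tree(children, node_id, depth=0):
    """Walk the grouped events depth-first, indenting by nesting level."""
    for ev in children.get(node_id, []):
        name = ev["properties"].get("$ai_span_name", ev["event"])
        print("  " * depth + f"{ev['event']}: {name}")
        print_tree(children, ev["properties"].get("$ai_span_id"), depth + 1)

# Minimal example: one span with a nested generation under trace "t1"
events = [
    {"event": "$ai_span", "properties": {
        "$ai_trace_id": "t1", "$ai_span_id": "s1",
        "$ai_span_name": "RAG retrieval"}},
    {"event": "$ai_generation", "properties": {
        "$ai_trace_id": "t1", "$ai_parent_id": "s1",
        "$ai_span_id": "g1"}},
]
tree = build_tree(events)
print_tree(tree, "t1")
```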
posthog:query-llm-trace
{
"traceId": "<trace_id>",
"dateRange": {"date_from": "-7d"}
}
The result contains the full event tree with all properties. The response may be large — when it exceeds the inline limit, Claude Code auto-persists it to a file.
From the result you get:
From the result you get:

- Event types (`$ai_span`, `$ai_generation`, etc.)
- Span names (`$ai_span_name`) — these are the tool/step names
- Parent links (`$ai_parent_id`)
- `_posthogUrl` — always include this in your response so the user can click through to the UI

When the result is persisted to a file (large traces with full `$ai_input`/`$ai_output_choices`), use the parsing scripts to explore it.
Start with the summary to get the full picture, then drill into specifics:
# 1. Overview: metadata, tool calls, final output, errors
python3 scripts/print_summary.py /path/to/persisted-file.json
# 2. Timeline: chronological event list with truncated I/O
python3 scripts/print_timeline.py /path/to/persisted-file.json
# 3. Drill into a specific span's full input/output
SPAN="tool_name" python3 scripts/extract_span.py /path/to/persisted-file.json
# 4. Full conversation with thinking blocks and tool calls
python3 scripts/extract_conversation.py /path/to/persisted-file.json
# 5. Search for a keyword across all properties
SEARCH="keyword" python3 scripts/search_traces.py /path/to/persisted-file.json
All scripts support MAX_LEN=N env var to control truncation (0 = unlimited).
To debug a tool call:
- Find the `$ai_span` for the tool call (look at `$ai_span_name`)
- `$ai_input_state` — what arguments were passed to the tool?
- `$ai_output_state` — what did the tool return?
- `$ai_is_error` — did the tool call fail?

To understand why the LLM made a decision:
- Find the `$ai_generation` event where the LLM made the decision
- `$ai_input` — this is the full message history the LLM saw

To verify retrieved context:
- Find the `$ai_span` events for retrieval/search steps
- Check `$ai_output_state` — what content was retrieved and fed to the LLM?

To inspect subagents:
- Subagents appear as child spans (linked via `$ai_parent_id`)
- Check their `$ai_output_state` and `$ai_is_error`
- Nested `$ai_generation` events are the subagent's LLM calls

To find where the model said X:
- Use `search_traces.py` to find where the text appears: `SEARCH="the text" python3 scripts/search_traces.py FILE`
- Inspect the `$ai_input` of that generation to see what the LLM was told before it said X

The trace tools return `_posthogUrl` — always surface this to the user.
You can also construct links manually:
https://app.posthog.com/llm-observability/traces/<trace_id>?timestamp=<url_encoded_timestamp>&event=<optional_event_id>

Alternatively, use the `_posthogUrl` from `query-llm-traces-list`. The `timestamp` query param is required — use the `createdAt` of the earliest event in the trace, URL-encoded (e.g. `timestamp=2026-04-01T19%3A39%3A20Z`).
When presenting findings, always include the relevant PostHog URL so the user can verify.
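Building such a link by hand can be sketched as follows — a minimal example using only the stdlib, assuming `created_at` is an ISO-8601 string as returned in `createdAt` (the `trace_url` helper itself is hypothetical, not part of the skill):

```python
from urllib.parse import quote

def trace_url(trace_id, created_at, event_id=None):
    """Build a PostHog trace deep link. The timestamp param must be the
    URL-encoded createdAt of the earliest event in the trace."""
    url = (f"https://app.posthog.com/llm-observability/traces/{trace_id}"
           f"?timestamp={quote(created_at, safe='')}")
    if event_id:
        url += f"&event={event_id}"
    return url

print(trace_url("trace_123", "2026-04-01T19:39:20Z"))
# https://app.posthog.com/llm-observability/traces/trace_123?timestamp=2026-04-01T19%3A39%3A20Z
```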
Use posthog:query-llm-traces-list to search and filter traces.
CRITICAL: Never assume event names, property names, or property values from training data.
Every project instruments different custom properties. Always call posthog:read-data-schema first
to discover what properties and values actually exist in the project's data before constructing filters.
Before filtering traces, discover what's available:
1. Call `posthog:read-data-schema` with `kind: "events"` and look for `$ai_*` events
2. Call `posthog:read-data-schema` with `kind: "event_properties"` and `event_name: "$ai_generation"` (or another AI event) to see what properties are captured
3. Call `posthog:read-data-schema` with `kind: "event_property_values"`, `event_name: "$ai_generation"`, and `property_name: "$ai_model"` to see real model names in use

Only then construct the `query-llm-traces-list` call with property filters.
This is especially important for custom properties like project_id, conversation_id, user_tier, etc. — these vary per project and cannot be guessed.
Standard `$ai_*` properties do not need schema confirmation, but confirm any other property, such as a person's `email`.
posthog:query-llm-traces-list
{
"dateRange": {"date_from": "-1h"},
"filterTestAccounts": true,
"limit": 20,
"properties": [
{"type": "event", "key": "$ai_model", "value": "gpt-4o", "operator": "exact"}
]
}
Multiple filters are AND-ed together:
posthog:query-llm-traces-list
{
"dateRange": {"date_from": "-1h"},
"filterTestAccounts": true,
"properties": [
{"type": "event", "key": "$ai_provider", "value": "anthropic", "operator": "exact"},
{"type": "event", "key": "$ai_is_error", "value": ["true"], "operator": "exact"}
]
}
You can also filter by person properties (discover them via read-data-schema with kind: "entity_properties" and entity: "person"):
posthog:query-llm-traces-list
{
"dateRange": {"date_from": "-1h"},
"filterTestAccounts": true,
"properties": [
{"type": "person", "key": "email", "value": "@company.com", "operator": "icontains"}
]
}
Customers often store their own IDs as event or person properties.
Use posthog:read-data-schema to discover what custom properties exist, then filter:
1. Call `posthog:read-data-schema` with `kind: "event_properties"` and `event_name: "$ai_trace"` to find custom properties
2. Then filter:

posthog:query-llm-traces-list
{
"dateRange": {"date_from": "-7d"},
"properties": [
{"type": "event", "key": "project_id", "value": "proj_abc123", "operator": "exact"}
]
}
Use SQL when you need something query-llm-traces-list can't express — typically full-text search across message content or custom aggregations.
SELECT
properties.$ai_trace_id AS trace_id,
properties.$ai_model AS model,
timestamp
FROM events
WHERE
event = '$ai_generation'
AND timestamp >= now() - INTERVAL 1 HOUR
AND properties.$ai_input ILIKE '%search term%'
ORDER BY timestamp DESC
LIMIT 20
For more complex SQL patterns, see the TraceQuery and HogQL references.

Trace tool results are JSON. When too large to read inline, Claude Code persists them to a file.
[{ "type": "text", "text": "{\"results\": [...], \"_posthogUrl\": \"...\"}" }]
results (array for list, object for single trace)
├── id, traceName, createdAt, totalLatency, totalCost
├── inputState, outputState (trace-level state)
└── events[]
├── event ($ai_span | $ai_generation | $ai_embedding | $ai_metric | $ai_feedback)
├── id, createdAt
└── properties
├── $ai_span_name, $ai_latency, $ai_is_error
├── $ai_input_state, $ai_output_state (span tool I/O)
├── $ai_input, $ai_output_choices (generation messages)
├── $ai_model, $ai_provider
└── $ai_input_tokens, $ai_output_tokens, $ai_total_cost_usd
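The token and cost properties at the bottom of this structure can be aggregated per model with a short helper. This is a hedged sketch, assuming a single trace object shaped as above (an `events` list of dicts with `event` and `properties` keys); the `usage_by_model` function is illustrative, not part of the skill's scripts.

```python
from collections import defaultdict

def usage_by_model(trace):
    """Sum input/output tokens and cost per model across a trace's
    $ai_generation events, using the property names shown above."""
    totals = defaultdict(lambda: {"in": 0, "out": 0, "cost": 0.0})
    for ev in trace["events"]:
        if ev["event"] != "$ai_generation":
            continue  # spans/embeddings carry no generation token counts
        p = ev["properties"]
        t = totals[p.get("$ai_model", "unknown")]
        t["in"] += p.get("$ai_input_tokens", 0)
        t["out"] += p.get("$ai_output_tokens", 0)
        t["cost"] += p.get("$ai_total_cost_usd", 0.0)
    return dict(totals)

# Minimal example: one generation plus a span that is skipped
trace = {"events": [
    {"event": "$ai_generation", "properties": {
        "$ai_model": "gpt-4o", "$ai_input_tokens": 1200,
        "$ai_output_tokens": 300, "$ai_total_cost_usd": 0.012}},
    {"event": "$ai_span", "properties": {"$ai_span_name": "tool"}},
]}
print(usage_by_model(trace))
```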
| Script | Purpose | Usage |
|---|---|---|
| `print_summary.py` | Trace metadata, tool calls, errors, and final LLM output | `python3 scripts/print_summary.py FILE` |
| `print_timeline.py` | Chronological event timeline with I/O summaries | `python3 scripts/print_timeline.py FILE` |
| `extract_span.py` | Full input/output of a specific span by name | `SPAN="name" python3 scripts/extract_span.py FILE` |
| `extract_conversation.py` | LLM messages with thinking blocks and tool calls | `python3 scripts/extract_conversation.py FILE` |
| `search_traces.py` | Find a keyword across all event properties | `SEARCH="keyword" python3 scripts/search_traces.py FILE` |
| `show_structure.py` | Show JSON keys and types without values | `cat blob.json \| python3 scripts/show_structure.py` |
- Always pass a `dateRange` — queries without a time range are slow. Use narrow windows (`-30m`, `-1h`) for broad listing queries; wider windows (`-7d`, `-30d`) are fine for narrow queries filtered by trace ID or specific property values
- Include the `_posthogUrl` in your response so the user can click through
- `$ai_input_state` / `$ai_output_state` on spans contain tool call inputs and outputs
- `$ai_input` / `$ai_output_choices` on generations contain the full LLM conversation — can be megabytes; when the result is persisted to a file, use the parsing scripts
- Set `filterTestAccounts: true` to exclude internal/test traffic when searching
- `$ai_trace` events are NOT in the events array — their data is surfaced via trace-level `inputState`, `outputState`, and `traceName`