# langsmith-cli
Inspects and manages LangSmith traces, runs, datasets, and prompts using langsmith-cli to debug AI chains, analyze token costs, and follow cache-first workflows.
```shell
npx claudepluginhub gigaverse-app/langsmith-cli --plugin langsmith-cli
```

This skill uses the workspace's default tool permissions.
Use this tool to debug AI chains, inspect past runs, manage datasets, and analyze token costs in LangSmith.
```shell
uv tool install langsmith-cli
```

```
/plugin marketplace add gigaverse-app/langsmith-cli
```

See the Installation Guide if install fails or for alternative methods.
## Always pass `--json` first

ALWAYS pass `--json` as the FIRST argument. Without it you get Rich terminal tables: unparseable, useless to agents.

```shell
# ✅ CORRECT
langsmith-cli --json runs list --project my-project --limit 5

# ❌ WRONG - Rich table output, cannot be parsed
langsmith-cli runs list --project my-project --limit 5
```
## Use `--output` for data extraction, never shell redirection

```shell
# ✅ CORRECT - atomic write, errors visible, non-zero exit on failure
langsmith-cli --json runs list --project my-project --output runs.jsonl
python3 -c "import json; runs = [json.loads(l) for l in open('runs.jsonl')]"

# ❌ WRONG - errors go to stderr silently, you get an empty/corrupt file
langsmith-cli --json runs list --project my-project > runs.json

# ❌ WRONG - heredoc overrides pipe stdin, python3 reads empty stdin
langsmith-cli --json runs get <id> --fields outputs | python3 << 'EOF'
import sys, json; data = json.load(sys.stdin)  # stdin is EMPTY
EOF
```

Use `python3 -c "..."` (no heredoc) if you must pipe inline.
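Once `--output` has written the file, each line is one JSON object (JSONL). A minimal parsing sketch; the `id` and `status` fields in the sample data are illustrative, not a guaranteed run schema:

```python
import json

def load_runs(path):
    """Parse a JSONL file: one JSON object per non-empty line."""
    with open(path) as f:
        return [json.loads(line) for line in f if line.strip()]

# Tiny sample file standing in for the output of --output runs.jsonl
sample = ['{"id": "r1", "status": "success"}',
          '{"id": "r2", "status": "error"}']
with open("runs.jsonl", "w") as f:
    f.write("\n".join(sample) + "\n")

runs = load_runs("runs.jsonl")
failed = [r["id"] for r in runs if r["status"] == "error"]
print(failed)  # ['r2']
```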
## Cache-first workflow

Step 1: `langsmith-cli runs cache list`

Step 2: Is the project listed with recent data?

- YES → Use `runs cache grep` directly. Zero API calls. STOP.
- NO → Tell user: "Project X is not in cache. Downloading in background."
  Run `langsmith-cli --json runs cache download ...` in background,
  poll TaskOutput(block=false) for progress, use cache grep when done.
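The decision above can be sketched as a small function. The cache-entry fields `project` and `last_updated`, and the freshness threshold, are hypothetical stand-ins for whatever `runs cache list` actually emits:

```python
from datetime import datetime, timedelta, timezone

def cache_action(cache_entries, project, max_age_days=7):
    """Return 'grep' if the project has fresh cached data, else 'download'."""
    now = datetime.now(timezone.utc)
    for entry in cache_entries:
        if entry["project"] != project:
            continue
        age = now - datetime.fromisoformat(entry["last_updated"])
        if age <= timedelta(days=max_age_days):
            return "grep"      # use runs cache grep, zero API calls
    return "download"          # kick off runs cache download in background

entries = [{"project": "dev/my-project",
            "last_updated": datetime.now(timezone.utc).isoformat()}]
print(cache_action(entries, "dev/my-project"))  # grep
print(cache_action(entries, "dev/other"))       # download
```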
Red flags → STOP if you're about to:

- Hit the API (`runs list`, `--fetch N`) when the project is already cached
- Run `runs cache download` without first checking `runs cache list`
- Run `runs cache download` without `--json` → Rich output is swallowed when captured to a file, leaving you with zero progress visibility
- Use `--fetch N` after a cache download → `--fetch` always hits the API, never the cache

Background download + progress tracking:
```shell
# ✅ CORRECT - --json emits {"event":"progress","project":"...","new_runs":N} to stderr per batch
langsmith-cli --json runs cache download --project "dev/my-project" --last 30d
# Run in background, poll TaskOutput(block=false), relay new_runs count to user
# Final stdout: {"event":"download_complete","total_new_runs":N}
```
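A sketch of consuming those events: each line is parsed as JSON and the `new_runs` counts are accumulated. The event shapes come from the comments above; any other fields are assumptions:

```python
import json

def summarize_progress(lines):
    """Tally new_runs from progress events; return (total, completed_flag)."""
    total, complete = 0, False
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue                      # ignore non-JSON noise
        if event.get("event") == "progress":
            total += event.get("new_runs", 0)
        elif event.get("event") == "download_complete":
            complete = True
    return total, complete

stderr_lines = [
    '{"event":"progress","project":"dev/my-project","new_runs":120}',
    '{"event":"progress","project":"dev/my-project","new_runs":80}',
]
stdout_line = '{"event":"download_complete","total_new_runs":200}'
total, done = summarize_progress(stderr_lines + [stdout_line])
print(total, done)  # 200 True
```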
## Use `--fields` to reduce token usage

```shell
langsmith-cli --json runs list --fields id,name,status,start_time
langsmith-cli --json runs get <id> --fields inputs,outputs,error
```
| Task | Command |
|---|---|
| List recent runs | langsmith-cli --json runs list --project <name> --limit 10 --fields id,name,status |
| Get a single run | langsmith-cli --json runs get <id> --fields inputs,outputs,error |
| Get run + child outputs | langsmith-cli --json runs get <id> --follow-children --fields id,name,inputs,outputs |
| Get latest run | langsmith-cli --json runs get-latest --project <name> --fields inputs,outputs |
| Get latest error | langsmith-cli --json runs get-latest --project <name> --failed --fields id,name,error |
| Search run content | langsmith-cli --json runs list --grep "pattern" --grep-in outputs --limit 20 |
| Search cached runs | langsmith-cli runs cache grep "pattern" -E --grep-in outputs --project <name> |
| Download cache | langsmith-cli --json runs cache download --project <name> --last 7d |
| List cache | langsmith-cli runs cache list |
| Discover cache schema | langsmith-cli --json runs cache schema --project <name> --include outputs |
| Analyze token costs | langsmith-cli --json runs usage --from-cache --breakdown model --active-only |
| List projects | langsmith-cli --json projects list --name-pattern "dev/*" --fields name |
| Count runs | langsmith-cli --json runs list --project <name> --count |
| Run stats | langsmith-cli --json runs stats --project <name> |
| List datasets | langsmith-cli --json datasets list --fields id,name |
| List prompts | langsmith-cli --json prompts list --fields repo_handle,description |
| List feedback for a run | langsmith-cli --json feedback list --run-id <run-id> |
| Create feedback | langsmith-cli --json feedback create <run-id> --key correctness --score 0.9 |
| List annotation queues | langsmith-cli --json annotation-queues list |
| Get annotation queue | langsmith-cli --json annotation-queues get <queue-id> |
| View experiment results | langsmith-cli --json experiments results <experiment-name> |
| Open run in browser | Construct URL manually → see LangSmith URLs section below |
When your task matches one of the sections below, you MUST load that reference file before proceeding. Don't load them speculatively for unrelated tasks.
Load the runs reference when using run query commands or their filter flags:

- Commands: `runs list`, `runs get`, `runs get-latest`, `runs search`, `runs sample`, `runs analyze`, `runs tags`, `runs fields`, `runs export`
- Filter flags: `--trace-filter`, `--tree-filter`, `--sort-by`, `--roots`, `--run-type`, `--tag`, `--model`, `--min-latency`, `--max-latency`
- `--metadata key=value` (supports wildcards `key=val*` and regex `key=/pattern/`)
- `--query` (server-side, fast, first 250 chars) vs `--grep` (client-side, all content, regex)
- Filter operators (`eq`, `gt`, `has`, `and`, `search`, `metadata_key`/`metadata_value`)

Load the usage reference for cost analysis: `runs usage`, `runs pricing`, `runs cache download` + `--from-cache` (flags: `--from-cache`, `--group-by`, `--breakdown`, `--apply-pricing`).

Load the feedback reference when using `feedback` commands: `feedback list [--run-id <id>] [--key <key>] [--limit N]`, `feedback get <id>`, `feedback create <run-id> --key <key> [--score N] [--comment <str>]`, `feedback delete <id> [--confirm]`.

Load the annotation-queues reference when using `annotation-queues` commands: `annotation-queues list`, `annotation-queues get <id>`, `annotation-queues create <name> [--description <str>]`, `annotation-queues update <id> [--name <str>] [--description <str>]`, `annotation-queues delete <id> [--confirm]`.

Load the experiments reference when using `experiments results <experiment-name>`.

Load the filter reference when writing a `--filter` expression and you want the operator reference, or `metadata_key`/`metadata_value` filter syntax.

Load the entity-extraction recipe when scanning cached runs for extracted fields (e.g. `extracted_entities`) - there's a complete recipe covering cache download, Python JSONL scanning, deduplication of sub-runs, and `llm_recognition` filtering.

## LangSmith URLs

`runs open` generates broken URLs. Build trace URLs manually using the project's `id` and `tenant_id`:
```shell
# Step 1: Get org ID (tenant_id) and project ID
langsmith-cli --json projects get "dev/my-project" --fields id,tenant_id

# Step 2: Build the URL
# https://smith.langchain.com/o/{tenant_id}/projects/p/{project_id}?peek={run_id}&peeked_trace={trace_id}
```

Example:

```python
org_id = "b658ea18-0431-42c0-8d03-337d43fed8cf"   # tenant_id from projects get
proj_id = "730acc6c-ec97-4f08-915e-7d3f7f775300"  # id from projects get
url = f"https://smith.langchain.com/o/{org_id}/projects/p/{proj_id}?peek={run_id}&peeked_trace={trace_id}"
```
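As a self-contained helper built from the same template (the IDs below are placeholders, not real LangSmith identifiers):

```python
def trace_url(tenant_id, project_id, run_id, trace_id):
    """Build a LangSmith deep link that opens run_id in the side panel."""
    return (f"https://smith.langchain.com/o/{tenant_id}"
            f"/projects/p/{project_id}?peek={run_id}&peeked_trace={trace_id}")

url = trace_url("org-123", "proj-456", "run-789", "trace-789")
print(url)
```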
- `peek` = the specific run ID to open in the side panel
- `peeked_trace` = the trace (root run) ID it belongs to (`run.id` and `run.trace_id`)

```shell
# Multi-project matching
--project-name-pattern "prd/*"       # wildcard
--project-name-regex "^(prd|stg)"    # regex

# Time windows (combinable)
--since 2026-01-15 --before 2026-01-29
--last 7d
--since 2026-01-15 --last 14d        # forward window

# Content search
--query "text"                       # server-side, fast, first ~250 chars only
--grep "pattern" --grep-regex --grep-in inputs,outputs   # client-side, all content

# Metadata filter (server-side, supports wildcards and regex)
--metadata channel_id=Gigaverse_Daily_Standup*
--metadata channel_id=/^Gigaverse/

# Reduce output size
--fields id,name,status,start_time
--roots                              # root traces only (cleaner)
--limit 10 --fetch 500               # fetch 500 from API, return top 10 matches
```
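The wildcard vs regex distinction above can be illustrated in Python. This only mirrors the presumed matching semantics (shell-style wildcards via `fnmatch`, `re.search`-style regex), not the CLI's actual implementation:

```python
import fnmatch
import re

projects = ["prd/checkout", "stg/checkout", "dev/checkout"]

# --project-name-pattern "prd/*"  -> shell-style wildcard
wildcard = [p for p in projects if fnmatch.fnmatch(p, "prd/*")]

# --project-name-regex "^(prd|stg)" -> regular expression
regex = [p for p in projects if re.search(r"^(prd|stg)", p)]

print(wildcard)  # ['prd/checkout']
print(regex)     # ['prd/checkout', 'stg/checkout']
```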