From openrouter
Retrieves metadata, costs, latency, token usage, provider routing, and stored content for OpenRouter generation IDs. Use for inspecting or debugging specific LLM requests.
How this skill is triggered — by the user, by Claude, or both
Slash command
/openrouter:openrouter-generationsThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Retrieve detailed metadata and stored content for individual OpenRouter generations. Use this skill when you need to inspect a specific request — its cost, latency, token usage, provider routing, or the actual prompt/completion text.
Retrieve detailed metadata and stored content for individual OpenRouter generations. Use this skill when you need to inspect a specific request — its cost, latency, token usage, provider routing, or the actual prompt/completion text.
--api-key <key> or set the OPENROUTER_API_KEY environment variablegen-1234567890 or gen-aBcDeFgHiJkLmNoPqRsT.cd <skill-path>/scripts && npm install
| Endpoint | Method | Purpose |
|---|---|---|
/api/v1/generation | GET | Request metadata and usage (tokens, cost, latency, model, provider) |
/api/v1/generation/content | GET | Stored prompt and completion text |
Both take a single query parameter: id (the generation ID).
Full API reference: openrouter.ai/docs/api/api-reference/generations/get-generation
Retrieves everything about a generation except the actual prompt/completion text:
cd <skill-path>/scripts && npx tsx get-generation.ts gen-1234567890
npx tsx get-generation.ts --id gen-1234567890 --json
What you get back:
model, provider_name, router, service_tiertokens_prompt, tokens_completion, native_tokens_reasoning, native_tokens_cachedtotal_cost, usage, upstream_inference_cost, cache_discountlatency, generation_time, moderation_latencyfinish_reason, streamed, cancelled, is_byokcreated_at, app_id, external_user, session_id, request_idprovider_responses array showing fallback attempts with per-provider latency and statusRetrieves the stored prompt and completion:
cd <skill-path>/scripts && npx tsx get-generation-content.ts gen-1234567890
npx tsx get-generation-content.ts --id gen-1234567890 --json
What you get back:
prompt (raw text) and/or messages (array of {role, content})completion (the model's response) and reasoning (chain-of-thought, if applicable)Note: Content is only available if the generation was not made with Zero Data Retention (ZDR) enabled. If ZDR was on, this endpoint returns empty/null content.
curl -G https://openrouter.ai/api/v1/generation \
-H "Authorization: Bearer $OPENROUTER_API_KEY" \
-d id=gen-1234567890
curl -G https://openrouter.ai/api/v1/generation/content \
-H "Authorization: Bearer $OPENROUTER_API_KEY" \
-d id=gen-1234567890
/api/v1/generation){
"data": {
"id": "gen-3bhGkxlo4XFrqiabUM7NDtwDzWwG",
"api_type": "completions",
"model": "openai/gpt-4o",
"provider_name": "OpenAI",
"created_at": "2024-07-15T23:33:19.433273+00:00",
"tokens_prompt": 10,
"tokens_completion": 25,
"native_tokens_reasoning": 5,
"native_tokens_cached": 3,
"total_cost": 0.0015,
"usage": 0.0015,
"upstream_inference_cost": 0.0012,
"latency": 1250,
"generation_time": 1200,
"finish_reason": "stop",
"streamed": true,
"is_byok": false,
"cancelled": false,
"router": "openrouter/auto",
"service_tier": "priority",
"provider_responses": [
{
"provider_name": "OpenAI",
"model_permaslug": "openai/gpt-4o",
"status": 200,
"latency": 1200,
"is_byok": false
}
]
}
}
/api/v1/generation/content){
"data": {
"input": {
"prompt": "What is the meaning of life?",
"messages": [
{
"content": "What is the meaning of life?",
"role": "user"
}
]
},
"output": {
"completion": "The meaning of life is a philosophical question...",
"reasoning": null
}
}
}
# Check what happened — look at finish_reason, provider_responses, and cancelled
cd <skill-path>/scripts && npx tsx get-generation.ts gen-abc123 --json
Look for:
finish_reason = "length" means the model hit max tokensfinish_reason = "content_filter" means content was filteredcancelled = true means the request was cancelled by the clientprovider_responses with multiple entries means fallbacks occurredcd <skill-path>/scripts && npx tsx get-generation.ts gen-abc123
Check total_cost (what you were charged) vs upstream_inference_cost (what the provider charged OpenRouter).
cd <skill-path>/scripts && npx tsx get-generation-content.ts gen-abc123
Useful for debugging unexpected outputs — verify the actual prompt sent and completion received.
If you have a request_id or session_id from one generation, you can find related generations via the analytics query endpoint (see openrouter-analytics skill).
| Status | Meaning |
|---|---|
| 401 | Invalid or missing API key |
| 403 | You don't have access to this generation (belongs to another user) |
| 404 | Generation ID not found |
| 429 | Rate limited — wait and retry |
| 500 | Server error — retry |
| 502 | Upstream failure — retry |
| Field | Type | Description |
|---|---|---|
id | string | Generation ID (gen-...) |
model | string | Model permaslug (e.g., openai/gpt-4o) |
provider_name | string|null | Provider that served the request |
api_type | string | One of: completions, embeddings, rerank, tts, stt, video |
tokens_prompt | int|null | Prompt token count |
tokens_completion | int|null | Completion token count |
native_tokens_reasoning | int|null | Reasoning/thinking tokens |
native_tokens_cached | int|null | Cached input tokens |
total_cost | number | Total cost in USD |
usage | number | Usage amount in USD |
upstream_inference_cost | number|null | Provider's cost in USD |
cache_discount | number|null | Discount from caching |
latency | number|null | Total latency in ms |
generation_time | number|null | Model generation time in ms |
moderation_latency | number|null | Moderation check time in ms |
finish_reason | string|null | Why generation stopped (stop, length, content_filter, etc.) |
native_finish_reason | string|null | Raw finish reason from provider |
streamed | bool|null | Whether response was streamed |
is_byok | bool | Whether user's own provider key was used |
cancelled | bool|null | Whether request was cancelled |
app_id | int|null | OAuth app ID |
external_user | string|null | External user identifier (X-External-User header) |
session_id | string|null | Session grouping ID |
request_id | string|null | Request grouping ID (all gens from one API call) |
router | string|null | Router used (e.g., openrouter/auto) |
service_tier | string|null | Provider service tier |
web_search_engine | string|null | Search engine used (e.g., exa, firecrawl) |
num_search_results | int|null | Number of search results included |
provider_responses | array|null | Provider attempt chain with per-provider latency/status |
| Field | Type | Description |
|---|---|---|
data.input.prompt | string|null | Raw prompt text |
data.input.messages | array|null | Messages array ([{role, content}]) |
data.output.completion | string|null | Model's completion text |
data.output.reasoning | string|null | Chain-of-thought reasoning |
npx claudepluginhub openrouterteam/skills --plugin openrouterImplements Python wrapper for audited OpenRouter API completions with generation metadata logging, cost/token tracking, prompt hashing, and compliance-ready entries.
Answers natural-language questions about OpenRouter usage data: spend, request volume, model breakdown, latency, token usage, and cost optimization.