Skill

openrouter-pricing-basics

Queries OpenRouter model pricing via API with curl/jq, calculates request costs using Python, lists cost tiers, and explains credit purchases for budgeting.

Bash

Python

ai-ml

npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin openrouter-pack

Tool Access

This skill is limited to using the following tools:

Read, Write, Edit, Bash, Grep

Preview

OpenRouter charges per token with separate rates for prompt (input) and completion (output) tokens. Prices are listed per token in the models API (multiply by 1M for per-million rates). Credits are prepaid with a 5.5% processing fee ($0.80 minimum). Free models are available for testing and low-volume use.

Supporting Assets

references/cost-comparison-tool.md
references/cost-optimization.md
references/credit-system.md
references/errors.md
references/examples.md
references/model-pricing-tiers.md
references/monitoring-costs.md

SKILL.md


Stats

Parent Repo Stars: 1854

Parent Repo Forks: 248

Last Commit: Apr 3, 2026


OpenRouter Pricing Basics

Overview

How Pricing Works

Buy credits at openrouter.ai/credits (5.5% fee, $0.80 minimum)
Each request deducts (prompt_tokens * prompt_rate) + (completion_tokens * completion_rate)
Check balance via GET /api/v1/auth/key or the dashboard
Auto-topup is available to prevent service interruption
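
The fee and deduction rules above can be worked through numerically. The following is a minimal sketch, not an official calculator: it assumes the 5.5% fee is charged on top of the credit amount and that $0.80 acts as a floor on the fee. `purchase_fee` and `request_deduction` are hypothetical helper names, not part of any OpenRouter SDK.

```python
from decimal import Decimal

FEE_RATE = Decimal("0.055")    # 5.5% processing fee, as stated above
FEE_MINIMUM = Decimal("0.80")  # $0.80 minimum, assumed to be a floor on the fee

def purchase_fee(credit_amount: Decimal) -> Decimal:
    """Fee for buying `credit_amount` USD of credits (assumption: fee is added on top)."""
    return max(credit_amount * FEE_RATE, FEE_MINIMUM)

def request_deduction(prompt_tokens: int, completion_tokens: int,
                      prompt_rate: Decimal, completion_rate: Decimal) -> Decimal:
    """Credits deducted for one request; rates are in USD per token."""
    return prompt_tokens * prompt_rate + completion_tokens * completion_rate

# A $10 purchase hits the minimum fee; a $100 purchase pays the percentage.
print(purchase_fee(Decimal("10")))
print(purchase_fee(Decimal("100")))
```

Decimal arithmetic is used here so that small per-token rates do not pick up binary floating-point noise.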

Query Model Pricing

# Fetch the full models list, then extract pricing for a single model
curl -s https://openrouter.ai/api/v1/models | jq '.data[] | select(.id == "anthropic/claude-3.5-sonnet") | {
  id: .id,
  prompt_per_M: ((.pricing.prompt | tonumber) * 1000000),
  completion_per_M: ((.pricing.completion | tonumber) * 1000000),
  context: .context_length
}'
# Example output, compacted to one line (values are approximate and rates change over time):
# → { "id": "anthropic/claude-3.5-sonnet", "prompt_per_M": 3, "completion_per_M": 15, "context": 200000 }
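
The same per-million conversion can be sketched in Python. This example assumes the pricing values are decimal strings in USD per token, as the `tonumber` calls in the jq filter suggest. `SAMPLE_MODEL` is made-up data shaped like one entry of the models list, not a live response, and `per_million` is a hypothetical helper.

```python
from decimal import Decimal

# Illustrative entry only; real entries come from GET /api/v1/models
SAMPLE_MODEL = {
    "id": "anthropic/claude-3.5-sonnet",
    "pricing": {"prompt": "0.000003", "completion": "0.000015"},
    "context_length": 200000,
}

def per_million(model: dict) -> dict:
    """Convert per-token price strings into per-million-token Decimal rates."""
    pricing = model["pricing"]
    return {
        "id": model["id"],
        "prompt_per_M": Decimal(pricing["prompt"]) * 1_000_000,
        "completion_per_M": Decimal(pricing["completion"]) * 1_000_000,
        "context": model["context_length"],
    }

print(per_million(SAMPLE_MODEL))
```

Parsing the strings with `Decimal` rather than `float` keeps the converted rates exact.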

Cost Tiers (Representative)

Tier	Example Model	Prompt/1M	Completion/1M	Use Case
Free	`google/gemma-2-9b-it:free`	$0.00	$0.00	Testing, prototyping
Budget	`meta-llama/llama-3.1-8b-instruct`	$0.06	$0.06	Simple Q&A, classification
Mid	`openai/gpt-4o-mini`	$0.15	$0.60	General purpose
Standard	`anthropic/claude-3.5-sonnet`	$3.00	$15.00	Complex reasoning, code
Premium	`openai/o1`	$15.00	$60.00	Deep reasoning
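
To see how far apart the tiers are for a real workload, the table's rates can be applied to a fixed traffic profile. This is a rough sketch: the rates are copied from the representative table above and may be out of date, and `workload_cost` is a hypothetical helper.

```python
from decimal import Decimal

# (prompt USD per 1M tokens, completion USD per 1M tokens), from the table above
TIER_RATES = {
    "google/gemma-2-9b-it:free": (Decimal("0.00"), Decimal("0.00")),
    "meta-llama/llama-3.1-8b-instruct": (Decimal("0.06"), Decimal("0.06")),
    "openai/gpt-4o-mini": (Decimal("0.15"), Decimal("0.60")),
    "anthropic/claude-3.5-sonnet": (Decimal("3.00"), Decimal("15.00")),
    "openai/o1": (Decimal("15.00"), Decimal("60.00")),
}
MILLION = Decimal(1_000_000)

def workload_cost(model_id: str, prompt_tokens: int,
                  completion_tokens: int, request_count: int) -> Decimal:
    """Total USD cost of `request_count` identical requests at the table's rates."""
    prompt_rate, completion_rate = TIER_RATES[model_id]
    per_request = (prompt_tokens * prompt_rate
                   + completion_tokens * completion_rate) / MILLION
    return per_request * request_count

# 1,000 requests of 1,000 prompt + 500 completion tokens each
for model_id in TIER_RATES:
    print(f"{model_id}: ${workload_cost(model_id, 1000, 500, 1000):.2f}")
```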

Calculate Request Cost

import requests

MODELS_URL = "https://openrouter.ai/api/v1/models"

def estimate_cost(model_id: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of a single request from the published per-token rates."""
    resp = requests.get(MODELS_URL, timeout=30)
    resp.raise_for_status()  # Surface HTTP errors instead of parsing an error body
    models = resp.json()["data"]
    model = next((m for m in models if m["id"] == model_id), None)
    if model is None:
        raise ValueError(f"Model {model_id} not found")

    # Rates are returned as strings in USD per token
    prompt_rate = float(model["pricing"]["prompt"])
    completion_rate = float(model["pricing"]["completion"])
    return (prompt_tokens * prompt_rate) + (completion_tokens * completion_rate)

# Example: Claude 3.5 Sonnet, 1000 prompt + 500 completion tokens
cost = estimate_cost("anthropic/claude-3.5-sonnet", 1000, 500)
print(f"Estimated cost: ${cost:.6f}")  # ~$0.0105 at $3 / $15 per 1M tokens

Track Actual Cost Per Request

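import os

import requests
from openai import OpenAI

# OpenRouter is OpenAI-compatible, so the OpenAI SDK works with a custom base URL
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# Method 1: From response usage (estimate)
response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=100,
)
# Multiply response.usage.prompt_tokens and response.usage.completion_tokens by the model's rates

# Method 2: Query generation endpoint (exact cost from OpenRouter)
# The stats may not be available immediately after the response; retry if the lookup fails
gen_resp = requests.get(
    "https://openrouter.ai/api/v1/generation",
    params={"id": response.id},
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    timeout=30,
)
gen_resp.raise_for_status()
gen = gen_resp.json()
print(f"Exact cost: ${gen['data']['total_cost']}")
print(f"Tokens: {gen['data']['tokens_prompt']} prompt + {gen['data']['tokens_completion']} completion")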

Check Credit Balance

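# remaining is reported as null when the key has no credit limit,
# rather than as a misleading negative number
curl -s https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" | jq '{
    credits_used: .data.usage,
    credit_limit: .data.limit,
    remaining: (if .data.limit == null then null else (.data.limit - .data.usage) end),
    is_free_tier: .data.is_free_tier
  }'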

Save Money with Variants

# Assumes `client` is an OpenAI SDK client configured with base_url="https://openrouter.ai/api/v1"
# :floor variant picks the cheapest provider for a model
response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet:floor",  # Cheapest provider
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=100,
)

# :free variant uses free providers (where available)
response = client.chat.completions.create(
    model="google/gemma-2-9b-it:free",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=100,
)

Special Pricing

Item	Pricing
Reasoning tokens	Charged as output tokens at completion rate
Image inputs	Per-image charge listed in `pricing.image`
Per-request fee	Some models charge a flat fee per request (`pricing.request`)
BYOK	First 1M requests/month free; then 5% of normal provider cost
Free model limits	50 req/day (free users), 1000 req/day (with $10+ credits)
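
The extra charges in the table can be folded into a single estimate. The sketch below rests on several assumptions: reasoning tokens are passed in separately (in practice they may already be included in the completion token count), `pricing.image` and `pricing.request` are treated as zero when absent, and the sample pricing values are invented for illustration. `estimate_full_cost` is a hypothetical helper.

```python
from decimal import Decimal

def estimate_full_cost(pricing: dict, prompt_tokens: int, completion_tokens: int,
                       reasoning_tokens: int = 0, images: int = 0) -> Decimal:
    """Estimate USD cost including reasoning tokens, image inputs and a per-request fee."""
    prompt_rate = Decimal(pricing["prompt"])
    completion_rate = Decimal(pricing["completion"])
    image_rate = Decimal(pricing.get("image", "0"))    # per-image charge, if listed
    request_fee = Decimal(pricing.get("request", "0"))  # flat per-request fee, if listed

    # Reasoning tokens are billed as output tokens at the completion rate
    output_tokens = completion_tokens + reasoning_tokens
    return (prompt_tokens * prompt_rate
            + output_tokens * completion_rate
            + images * image_rate
            + request_fee)

# Invented pricing block for illustration only
sample_pricing = {"prompt": "0.000003", "completion": "0.000015",
                  "image": "0.0048", "request": "0"}
print(estimate_full_cost(sample_pricing, 1000, 500, reasoning_tokens=200, images=2))
```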

Error Handling

HTTP	Cause	Fix
402	Insufficient credits	Top up at openrouter.ai/credits or use `:free` model
402	Key credit limit reached	Increase key limit or use a different key
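
One way to apply the "use `:free` model" fix from the table is to retry once on a 402. This is a sketch under stated assumptions: `send` is a hypothetical callable that posts a chat payload and returns `(status_code, body)`, so the logic stays independent of any particular HTTP client.

```python
from typing import Callable, Tuple

# Hypothetical transport: takes a request payload, returns (HTTP status, parsed body)
Sender = Callable[[dict], Tuple[int, dict]]

def send_with_free_fallback(send: Sender, payload: dict,
                            free_model: str) -> Tuple[int, dict]:
    """Send a request; on HTTP 402, retry once against a free model."""
    status, body = send(payload)
    if status == 402:
        # Insufficient credits or key limit reached: swap in the free model
        status, body = send({**payload, "model": free_model})
    return status, body
```

A free fallback only makes sense for traffic that tolerates a weaker model and the free-tier request limits. For anything else, topping up or raising the key limit is the safer fix.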

Enterprise Considerations

Set per-key credit limits via the dashboard or provisioning API to isolate blast radius
Query /api/v1/generation?id= after each request for exact cost auditing
Use :floor variant to automatically pick the cheapest provider
Route simple tasks to budget models and complex tasks to premium models (see openrouter-model-routing)
Set max_tokens on every request to cap completion cost
Enable auto-topup to prevent service interruptions in production
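
The advice to set max_tokens on every request can be tied to a per-request budget. As a minimal sketch, assuming the prompt token count is known in advance and rates are in USD per token, the largest safe max_tokens could be derived as follows. `max_tokens_for_budget` is a hypothetical helper.

```python
from decimal import Decimal
from typing import Optional

def max_tokens_for_budget(budget_usd: Decimal, prompt_tokens: int,
                          prompt_rate: Decimal, completion_rate: Decimal) -> Optional[int]:
    """Largest max_tokens that keeps worst-case request cost within budget_usd.

    Returns None when completions are free (the budget imposes no cap) and 0
    when the prompt alone already uses up the budget.
    """
    if completion_rate == 0:
        return None
    remaining = budget_usd - prompt_tokens * prompt_rate
    if remaining <= 0:
        return 0
    return int(remaining / completion_rate)  # int() truncates, so the cap rounds down

# One-cent budget, 1,000 prompt tokens at $3 / $15 per 1M tokens
print(max_tokens_for_budget(Decimal("0.01"), 1000,
                            Decimal("0.000003"), Decimal("0.000015")))
```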

References

Examples | Errors
Pricing | Credits | Models API