From liter-llm
Universal LLM client for 143 providers, 14 language bindings, OpenAI-compatible proxy, and MCP server. Use when integrating LLMs via liter-llm.
How this skill is triggered — by the user, by Claude, or both
Slash command
/liter-llm:liter-llmThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Liter-LLM is a universal LLM API client with a Rust core and native bindings for
Liter-LLM is a universal LLM API client with a Rust core and native bindings for Python, TypeScript/Node.js, Go, Java, C#, Ruby, PHP, Elixir, WebAssembly, and C (FFI). One unified interface reaches 143 providers (OpenAI, Anthropic, Google Gemini, Groq, Mistral, Cohere, AWS Bedrock, Azure, and many more).
provider/model prefix selects the backend
(openai/gpt-4o, anthropic/claude-sonnet-4-20250514,
google/gemini-2.0-flash). Set model_hint to skip the prefix.hard / soft), fallback chains, circuit-breaker
cooldown, and background health checks.search
and ocr methods routed by the same prefix convention.liter-llm api exposes an OpenAI-compatible gateway with
22 REST endpoints, virtual API keys, budgets, and SSE streaming.liter-llm mcp exposes 22 tools mirroring the proxy
endpoints, for MCP-compatible clients (Claude Code, Claude Desktop).| Use the … | When you want to … | Entry point |
|---|---|---|
| MCP server | Let the agent call LLM APIs directly as tools, with no glue code | liter-llm mcp --transport stdio (this plugin auto-registers it) |
| SDK / binding | Write application code that calls LLMs in a specific language | pip install liter-llm, cargo add liter-llm, etc. |
| Proxy | Give many apps/teams a shared OpenAI-compatible endpoint with keys, budgets, and rate limits | liter-llm api --config liter-llm-proxy.toml |
Rule of thumb: reach for the MCP server inside an agent session; the SDK when building software; the proxy when centralizing access for multiple consumers.
# Homebrew (macOS / Linux)
brew install xberg-io/tap/liter-llm
# or run it without a persistent install (the CLI proxy package self-installs the binary)
npx @kreuzberg/liter-llm-cli --help
uvx --from liter-llm-cli liter-llm --help
# or download a prebuilt binary from the latest GitHub release:
# https://github.com/xberg-io/liter-llm/releases/latest
# or build from source
cargo install --git https://github.com/xberg-io/liter-llm liter-llm-cli
# or Docker (35MB image)
docker pull ghcr.io/xberg-io/liter-llm
| Language | Install |
|---|---|
| Python | pip install liter-llm |
| Node.js | pnpm add @kreuzberg/liter-llm-node |
| Rust | cargo add liter-llm |
| Go | go get github.com/xberg-io/liter-llm/packages/go |
| Ruby | gem install liter_llm |
| PHP | composer require xberg-io/liter-llm |
| C# | dotnet add package LiterLlm |
| WASM | pnpm add @kreuzberg/liter-llm-wasm |
import asyncio
import os
from liter_llm import LlmClient
async def main() -> None:
client = LlmClient(api_key=os.environ["OPENAI_API_KEY"])
response = await client.chat(
model="openai/gpt-4o",
messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
asyncio.run(main())
The prefix before / selects the provider:
await client.chat(model="openai/gpt-4o", messages=[...])
await client.chat(model="anthropic/claude-sonnet-4-20250514", messages=[...])
await client.chat(model="google/gemini-2.0-flash", messages=[...])
await client.chat(model="groq/llama3-70b", messages=[...])
API keys come from environment variables: OPENAI_API_KEY, ANTHROPIC_API_KEY,
GEMINI_API_KEY, GROQ_API_KEY, MISTRAL_API_KEY, CO_API_KEY, and AWS
credentials for Bedrock.
Pass options to the constructor, or create a liter-llm.toml (SDK) /
liter-llm-proxy.toml (proxy) — both auto-discover from the cwd upward.
# liter-llm.toml
api_key = "${OPENAI_API_KEY}"
model_hint = "openai"
timeout_secs = 120
max_retries = 5
[cache]
max_entries = 512
ttl_seconds = 600
[budget]
global_limit = 50.0
enforcement = "hard"
[rate_limit]
rpm = 60
tpm = 100000
| Option | Default | Description |
|---|---|---|
api_key | required | Provider key, wrapped in SecretString (never logged). |
base_url | from registry | Override the provider base URL. |
model_hint | none | Pre-resolve a provider, skipping the prefix lookup. |
timeout | 60s | Request timeout. |
max_retries | 3 | Retries on 429/5xx with exponential backoff. |
cache | none | max_entries, ttl_seconds. |
budget | none | global_limit, model_limits, enforcement. |
rate_limit | none | rpm, tpm. |
cost_tracking | false | Per-request cost tracking. |
tracing | false | OpenTelemetry spans. |
liter-llm api --config liter-llm-proxy.toml
22 OpenAI-compatible endpoints, model routing by prefix, virtual API keys,
per-key RPM/TPM limits, cost tracking, budget enforcement, response caching, SSE
streaming, and an OpenAPI 3.1 spec at /openapi.json.
liter-llm mcp --transport stdio # for Claude Code / Claude Desktop
liter-llm mcp --transport http --port 3001
This plugin auto-registers the stdio server via scripts/mcp-launch.sh. The 22
tools mirror the proxy endpoints: chat, embed, generate_image, speech,
transcribe, moderate, rerank, search, ocr, list_models; file ops
(create_file, list_files, retrieve_file, delete_file, file_content);
batch ops (create_batch, list_batches, retrieve_batch, cancel_batch);
and Responses API (create_response, retrieve_response, cancel_response).
"provider/model" unless model_hint
is set, or routing fails.await inside an async context.chunk.choices[0].delta.content before use.hard rejects over-budget requests; soft logs and
allows.npx claudepluginhub xberg-io/plugins --plugin liter-llmCreates bite-sized, testable implementation plans from specs or requirements, with file structure and task decomposition. Activates before coding multi-step tasks.