From llm-router
Route tasks to the cheapest capable model automatically using llm-router MCP tools.
`npx claudepluginhub ypollak2/llm-router --plugin llm-router`

This skill uses the workspace's default tool permissions.
Before answering research, code, writing, or analysis tasks, call the appropriate llm-router tool instead of answering directly. The router picks the cheapest model that can handle the task, trying providers in free-first order (Ollama → Codex → paid APIs). The table below maps task types to tools; an illustrative sketch of the same mapping follows it.
| Task type | Tool to call | Why |
|---|---|---|
| Simple factual question | llm_query | Gemini Flash / Groq — 50× cheaper than o3 |
| Research / current events | llm_research | Perplexity (web-grounded, not stale) |
| Writing / summaries / brainstorm | llm_generate | Gemini Flash / Haiku |
| Deep analysis / debugging | llm_analyze | GPT-4o / Gemini Pro |
| Code generation / refactoring | llm_code | Ollama → Codex built-in → o3 |
| Don't know which type | llm_auto | Auto-classifies + routes, tracks savings |
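To make the mapping concrete, here is a minimal sketch that mirrors the table above. It is illustrative only, not the router's actual implementation: the task-type labels are assumptions, and the real classification and free-first fallback happen inside the llm-router server.

```python
# Illustrative only: mirrors the routing table above. The real classification
# and free-first fallback (Ollama -> Codex -> paid APIs) happen inside the
# llm-router server, not in caller code.
TOOL_FOR_TASK = {
    "factual": "llm_query",
    "research": "llm_research",
    "writing": "llm_generate",
    "analysis": "llm_analyze",
    "code": "llm_code",
}

def pick_tool(task_type: str) -> str:
    # Unknown task types fall back to llm_auto, which classifies and routes.
    return TOOL_FOR_TASK.get(task_type, "llm_auto")
```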
llm_query(prompt="What is the capital of France?")
llm_code(prompt="Refactor this function to use async/await", complexity="moderate")
llm_research(prompt="What changed in Python 3.13?")
llm_auto(prompt="<the full user request>") # safest default
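The examples above are written as direct tool calls made by the assistant. If you want to exercise the same tools from your own script, a minimal sketch using the official MCP Python SDK follows. The server launch command is a placeholder, since the real one depends on how llm-router is installed.

```python
# A minimal sketch, assuming the llm-router server is reachable over stdio.
# The launch command below is hypothetical -- substitute the one from your
# llm-router installation.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server = StdioServerParameters(
    command="npx",
    args=["llm-router"],  # hypothetical launch command; check the plugin docs
)

async def main() -> None:
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Same call as the first example above, made programmatically.
            result = await session.call_tool(
                "llm_query",
                arguments={"prompt": "What is the capital of France?"},
            )
            print(result.content)

asyncio.run(main())
```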
Routing simple tasks to Gemini Flash instead of o3 cuts costs by roughly 50–100×.
llm_auto shows cumulative savings every 5 calls automatically.
Run llm_savings anytime to see your totals.