Portable multi-model orchestration: delegate to Ollama cloud, NVIDIA NIM, NVIDIA Security, and Codex from Claude Code.
npx claudepluginhub ranjankumarpatel/claude-code-multi-model --plugin multi-model
Hand off to Codex for review, rescue, or adversarial verification
Auto-route a task — Opus picks models, dispatches in parallel, Codex verifies
List all available delegation models across providers
Security audit / PII / guardrail task via NVIDIA Security NIM
Delegate a prompt to a NVIDIA NIM frontier model
Admin access level
Server config contains admin-level keywords
Requires secrets
Needs API keys or credentials to function
Portable Claude Code plugin for automatic multi-model orchestration. Opus plans + synthesizes. Sonnet/Haiku/Ollama cloud/NVIDIA NIM/NVIDIA Security/Codex execute in parallel. Codex verifies before merge. No user prompting for model choice — Opus auto-routes from task signal.
Drop into any project and start delegating across providers immediately.
One plugin (multi-model) bundling 3 MCP servers + 6 slash commands + 1 auto-trigger skill.
No .mcp.json needed: the plugin manifest loads the MCP servers.

| Requirement | Notes |
|---|---|
| Claude Code | Version with plugin + marketplace support |
| Node.js ≥ 18 | On PATH |
| @modelcontextprotocol/sdk, zod (global npm) | npm i -g @modelcontextprotocol/sdk zod |
| MCP_GLOBAL_MODULES env | Points at your global node_modules. Windows: C:\Users\<you>\AppData\Roaming\npm\node_modules. macOS/Linux: output of npm root -g. |
| NVIDIA_API_KEY (optional) | For NVIDIA NIM + Security. Get one at build.nvidia.com. |
| OLLAMA_HOST (optional) | Default http://localhost:11434. Ollama cloud models require an Ollama install + cloud-enabled account. |
| Codex plugin (optional) | For /codex:review, /codex:rescue, /codex:adversarial-review. Install from openai/codex-plugin-cc. Requires the Codex CLI on PATH. |
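The Node.js and MCP_GLOBAL_MODULES requirements above can be checked before installing. The following is a minimal sketch, not part of the plugin: the function names `node_version_ok` and `preflight` are hypothetical, and it assumes `node --version` prints a string of the form `v18.17.0`.

```shell
# Hypothetical preflight helpers; the names are illustrative, not shipped with the plugin.

# Succeeds when a Node.js version string such as "v18.17.0" has a major version >= 18.
node_version_ok() {
  local version="${1#v}"        # drop the leading "v"
  local major="${version%%.*}"  # keep only the major component
  case "$major" in
    ''|*[!0-9]*) return 1 ;;    # empty or not a plain number
  esac
  [ "$major" -ge 18 ]
}

# Prints one line per problem found; prints nothing when the checks pass.
preflight() {
  if ! command -v node >/dev/null 2>&1; then
    echo "node not found on PATH"
  elif ! node_version_ok "$(node --version)"; then
    echo "Node.js >= 18 required, found $(node --version)"
  fi
  if [ -z "${MCP_GLOBAL_MODULES:-}" ]; then
    echo "MCP_GLOBAL_MODULES is not set"
  fi
}
```

Running `preflight` in the shell you launch Claude Code from should print nothing if both requirements are met. It does not check the optional NVIDIA, Ollama, or Codex entries.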
Two commands, any project, any machine:
claude plugin marketplace add ranjankumarpatel/claude-code-multi-model
claude plugin install multi-model@claude-code-multi-model
Restart Claude Code → plugin auto-loads with its 3 MCP servers. Verify:
claude mcp list # expect plugin:multi-model:{ollama,nvidia-nim,nvidia-security}
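If you want to script the verification, a sketch along these lines may help. It assumes only that each server name from the comment above appears verbatim somewhere in the `claude mcp list` output; the exact output format is not specified here, and `missing_servers` is a hypothetical helper name.

```shell
# Reads `claude mcp list` output on stdin and prints every expected server that is absent.
# Assumption: a loaded server shows up as the literal text "plugin:multi-model:<name>".
missing_servers() {
  local output name
  output="$(cat)"
  for name in ollama nvidia-nim nvidia-security; do
    case "$output" in
      *"plugin:multi-model:$name"*) ;;   # found, nothing to report
      *) echo "$name" ;;                 # not found, report it
    esac
  done
}
```

Usage: `claude mcp list | missing_servers`. Empty output means all three servers were found.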
Updates: claude plugin update multi-model@claude-code-multi-model.
For hacking on the plugin itself:
git clone https://github.com/ranjankumarpatel/claude-code-multi-model.git
claude plugin marketplace add /absolute/path/to/claude-code-multi-model
claude plugin install multi-model@claude-code-multi-model
Set once per machine (shell profile):
# Required for MCP servers to find the SDK
export MCP_GLOBAL_MODULES="$(npm root -g)"
# Optional — NVIDIA NIM + Security
export NVIDIA_API_KEY="nvapi-..."
# Optional — override Ollama host
export OLLAMA_HOST="http://localhost:11434"
Windows PowerShell:
setx MCP_GLOBAL_MODULES "C:\Users\$env:USERNAME\AppData\Roaming\npm\node_modules"
setx NVIDIA_API_KEY "nvapi-..."
Install MCP deps globally:
npm i -g @modelcontextprotocol/sdk zod
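To confirm that MCP_GLOBAL_MODULES really points at the directory where the two packages were installed, a small check like the one below can be used. This is a sketch under the assumption that globally installed packages live in subdirectories named after the package; `missing_mcp_deps` is a hypothetical name.

```shell
# Prints each required package that is missing under the given node_modules directory.
# Falls back to $MCP_GLOBAL_MODULES when no directory argument is passed.
missing_mcp_deps() {
  local root="${1:-${MCP_GLOBAL_MODULES:-}}"
  local pkg
  for pkg in @modelcontextprotocol/sdk zod; do
    [ -d "$root/$pkg" ] || echo "$pkg"
  done
}
```

Calling `missing_mcp_deps` with no arguments after setting the variable should print nothing. If the variable is unset or points elsewhere, both package names are listed.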
Codex is optional but recommended — it's the verification gate + rescue executor in the auto-routing pattern.
Make sure codex runs on your terminal.
claude plugin install codex@claude-code-multi-model
Run /codex:review or /codex:rescue inside Claude Code.
If Codex is not installed, multi-model still works; auto-routing simply skips the Codex verification step.
Opus never edits files or runs shell commands directly. It parses your request, decomposes it into subtasks, and dispatches each one to the best executor using this rubric:
| Task signal | Auto-route to |
|---|---|
| Bulk read / grep / rename / format | Haiku |
| Multi-file refactor, debugging, tests | Sonnet |
| Deep chain-of-thought reasoning | kimi-k2-thinking:cloud or deepseek-r1 |
| Coding second opinion / alt-frontier | gemma4:31b-cloud or nemotron-ultra |
| Long-context / agentic / vision | kimi-k2.5:cloud |
| Multilingual / non-English code | mistral-large |
| Large general-purpose | llama405b |
| Security audit / CVE / OWASP / PII / injection | NVIDIA Security |
| Stuck / failing tests / pre-merge verify | Codex |
| ≥2 independent subtasks | Parallel in one message |
You just state the goal. Opus reports the route in one line (e.g. Routing: refactor → Sonnet; rename → Haiku; audit → NVIDIA Security) and runs.
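To make the rubric concrete, here is a toy illustration of a few of its rows as a keyword lookup. It is not how the plugin works: the routing is done by Opus from the task signal, not by pattern matching. The function name `route_task` and the keyword patterns are assumptions chosen for the example.

```shell
# Illustrative only: maps a task description to an executor from the rubric.
# More specific signals are tested first, so "failing tests" reaches Codex
# before the generic "test" pattern can send it to Sonnet.
route_task() {
  case "$1" in
    *audit*|*CVE*|*OWASP*|*PII*|*injection*) echo "NVIDIA Security" ;;
    *stuck*|*"failing test"*|*verify*)       echo "Codex" ;;
    *refactor*|*debug*|*test*)               echo "Sonnet" ;;
    *grep*|*rename*|*format*|*read*)         echo "Haiku" ;;
    *)                                       echo "unrouted" ;;
  esac
}
```

For example, `route_task "rename variables in utils"` prints `Haiku`, and `route_task "failing tests in CI"` prints `Codex`. The ordering of the branches is the design point: overlapping signals need an explicit priority.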