ComfyUI-Agent-Kit
Local-first ComfyUI for every AI coding agent (Claude Code, Codex, Gemini CLI, Qwen Code).
Your GPU, your models, no cloud, no account.
By AI VFX NEWS.
Make Claude Code, Codex, Gemini CLI, or Qwen Code drive ComfyUI at full power on your own machine - generate
images, video, and audio, build and run workflows, pick the model variant that fits your hardware, and show the
graph live in your own ComfyUI canvas. No hosted service, no per-generation billing: one installer wires the same
stack into every agent you run, then you hand the whole setup to someone else with one command.

This is the portable, machine-independent, multi-agent version of a working ComfyUI setup. It is built as one
shared core (the knowledge + the MCP driver) plus a thin adapter per agent. Clone it, run the installer, and each of your
agents gets the same stack, wired to your hardware. GLM (z.ai) run through Claude Code is covered by the
claude adapter. See docs/AGENTS.md for how each agent connects.
Local-first, for experts and everyday users alike. It scales from one-command image / video generation to a
professional VFX color pipeline: v2 ships ComfyUI-OCIO (eight
Nuke-style OpenColorIO nodes - Read a sequence, grade in ACES, write ProRes, fully color-managed) and the field
guide to building your own custom nodes.
Local-first by design. Prefer the cloud? The official Comfy Cloud MCP runs your workflows on Comfy's
GPUs, no local setup. This kit is the local-first counterpart: everything runs on hardware you control, with no
account and no per-generation cost; the model picker sizes each job to your VRAM; and it serves four agents, not
one. Use whichever fits the job.
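As a rough illustration of what "sizes each job to your VRAM" can mean, here is a minimal Python sketch of a variant picker. It is not the kit's actual code: the `Variant` type, the `pick_variant` function, the headroom parameter, and every size figure below are hypothetical, chosen only to show the idea of preferring the largest variant that fits and refusing when nothing does.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Variant:
    """One downloadable build of a model (hypothetical structure)."""
    name: str        # illustrative label such as "fp16" or "fp8"
    vram_gb: float   # assumed VRAM needed to run this variant
    disk_gb: float   # assumed download size on disk


def pick_variant(
    variants: Sequence[Variant],
    free_vram_gb: float,
    free_disk_gb: float,
    headroom_gb: float = 1.0,
) -> Optional[Variant]:
    """Return the most demanding variant that fits, or None if none fits.

    A variant fits when its VRAM need plus a safety headroom is within the
    free VRAM and its download size is within the free disk space.
    """
    fitting = [
        v for v in variants
        if v.vram_gb + headroom_gb <= free_vram_gb and v.disk_gb <= free_disk_gb
    ]
    # Prefer the variant that uses the most VRAM among those that fit.
    return max(fitting, key=lambda v: v.vram_gb, default=None)


# Made-up numbers for illustration only; real models differ.
VARIANTS = [
    Variant("fp16", vram_gb=24.0, disk_gb=23.0),
    Variant("fp8", vram_gb=12.0, disk_gb=12.0),
    Variant("q4-quant", vram_gb=7.0, disk_gb=6.5),
]

if __name__ == "__main__":
    choice = pick_variant(VARIANTS, free_vram_gb=16.0, free_disk_gb=100.0)
    print(choice.name if choice else "nothing fits, refusing the download")
```

Returning `None` instead of falling back to the smallest variant mirrors the "refuse before downloading" behavior: the caller decides what to do when nothing fits, and no bandwidth is spent first.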
What it can do
- Drive ComfyUI from four agents (Claude Code, Codex, Gemini CLI, Qwen Code) off one shared core. GLM via
Claude Code is covered too. (docs/AGENTS.md)
- ~90-tool MCP driver. The agent operates ComfyUI directly: generate, build / edit / validate graphs, queue,
download models, manage VRAM, read logs, diagnose.
- Per-model "mega-brain": 68 prompt recipes distilled from official sources (image, video, audio, 3D);
the agent auto-pulls the right recipe when you name a model, so it prompts each one in its own dialect.
- Knows where each model runs: a full index of all 149 library models (recipe /
utility / template-only), local vs API.
- Hardware-aware model selection: detects your VRAM, RAM, and free disk, then recommends the variant that
fits (fp8 / offload / multi-GPU / quant) and refuses a download that won't fit, before wasting the bandwidth.
- 18 enhancement and utility tools: upscale / restore (Real-ESRGAN, SUPIR, SeedVR2), frame interpolation
(FILM, RIFE), segmentation / depth / pose (SAM3, BiRefNet, Depth Anything), plus restoration chains.
- 545-template library (plus 94 official Subgraph Blueprints, reusable subgraph bricks) as the source of truth.
It can also fetch any shared workflow by hash and run a model shootout (run one prompt through many models at
small size, pick the winner, then scale up).
- Assembles new workflows from parts: decomposes a task into stages, mixes templates and blueprint subgraphs,
and wires the nodes correctly (output-to-input by type, with converters where needed), validated against
/object_info before running. Not a preset runner.
- Expert color + custom nodes (new in v2): ships ComfyUI-OCIO, eight Nuke-style OpenColorIO nodes (Read / Write
a still, sequence or video, grade in ACES, write ProRes / EXR, fully color-managed), and the agent knows each
node's I/O plus the field guide to building a custom node pack.
(docs/NODE_LIBRARY/ocio.md, docs/BUILDING_NODES.md)
- Starts ComfyUI for you: when the server is down, the agent launches it headless in the background and
generates (no need to open the app first); to peek, open http://127.0.0.1:8188 in a browser. For an
unattended pipeline the start policy is configurable per project (env vars or a .comfyui-agent.json), so it
never blocks on a prompt.
- GUI bridge + persistence: the agent writes graphs into your ComfyUI canvas, and SAVES every workflow it