Drive a single AI-generated video project end-to-end: creative brief, model selection, character sheets, script (script-writer agent), storyboard, generation pipelines (text-to-image, image-to-video, voice, lip-sync), GPU-aware concatenation/normalisation/aspect conversion, ComfyUI import-export, budget estimation, and final export. State-aware lifecycle plugin designed to operate inside a per-project workspace. Ships with fal.ai + Replicate MCP servers preconfigured against a persistent ~/.config/ai-video-producer/.env.
`npx claudepluginhub danielrosehill/claude-code-plugins --plugin ai-video-producer`
Propose an ffmpeg/edit plan to concatenate clips/edited/ into a rough cut in output/drafts/.
Add a recurring character or subject with appearance and voice notes. Saves to characters/.
Produce a script draft from the brief. Saves to scripts/drafts/.
Finalise the locked draft — encode to spec, embed subtitles, write to output/final/.
Scaffold the AI video production workspace (folder skeleton + project CLAUDE.md) into the current directory.
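The command defines the actual skeleton; assembled from the paths referenced throughout this listing, the layout looks roughly like this (a sketch, not the authoritative tree):

```
brief/                  # creative-brief.md, tools-and-models.md
characters/             # recurring character/subject sheets
scripts/drafts/         # script iterations
scripts/final/          # script.md (locked)
scripts/storyboards/    # per-shot breakdowns
scripts/tts/            # TTS-ready chunks, one folder per model
timeline/               # shot-list.md
pipelines/              # one folder per pipeline (SPEC.md + stages)
runners/                # SDK runner toolchain (npm + pip deps)
clips/selected/         # promoted takes
clips/normalised/       # conformed copies for concatenation
clips/edited/           # trimmed/edited clips
budgets/                # estimate-<YYYY-MM-DD>.md
output/drafts/          # rough cuts
output/final/           # deliverables
CLAUDE.md               # project instructions
```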
Record a generated artefact's prompt, model, and parameters alongside the file. Usage: /log-take <artefact-path>
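The command owns the real record format; as a sketch, a JSON sidecar written next to the artefact could capture the essentials (field names here are illustrative, not the plugin's actual schema):

```python
import json
from pathlib import Path

def log_take(artefact_path: str, prompt: str, model: str, params: dict) -> Path:
    """Write a sidecar recording how the artefact was generated, e.g. shot-03.mp4.take.json."""
    artefact = Path(artefact_path)
    sidecar = artefact.with_name(artefact.name + ".take.json")
    sidecar.write_text(json.dumps(
        {"artefact": artefact.name, "prompt": prompt, "model": model, "parameters": params},
        indent=2,
    ))
    return sidecar
```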
Capture the creative brief, target spec, and model selection for this video project. Run this first.
Promote a raw take into clips/selected/ with rationale. Usage: /promote-take <raw-path>
Execute a predefined pipeline from pipelines/ for a given shot. Usage: /run-pipeline <pipeline-name> <shot-number>
One-time setup for fal.ai, Replicate, WaveSpeed, and MiniMax credentials, plus runner toolchain bootstrap. Writes ~/.config/ai-video-producer/.env, sources it from the user's shell profile, and installs the npm + pip dependencies needed by the SDK runners under runners/.
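The setup command owns the exact variable names; a representative ~/.config/ai-video-producer/.env might look like the following. FAL_KEY and REPLICATE_API_TOKEN are the standard names read by the fal and Replicate SDKs; the WaveSpeed and MiniMax names are assumptions:

```
FAL_KEY=...
REPLICATE_API_TOKEN=...
WAVESPEED_API_KEY=...
MINIMAX_API_KEY=...
```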
Break the locked script into a numbered shot list. Writes scripts/storyboards/ and timeline/shot-list.md.
Use this agent to upscale and/or convert aspect ratio for video clips — e.g. taking a 720p 16:9 generation and producing a 1080p 9:16 vertical for Reels/Shorts/TikTok, or upscaling SD-era footage for inclusion alongside HD generations. Picks between resize-only, letterbox, blur-fill, smart-crop, and AI upscaling depending on the source-vs-target relationship and the user's quality bar. GPU-aware: uses NVENC/AMF/QSV when available, falls back to CPU. For AI upscaling defers to the existing `upscale-and-interpolate` skill or external tools (Topaz, Real-ESRGAN) and does not reinvent them. <example> user: "Convert all my 16:9 clips to 9:16 for Shorts, upscale to 1080×1920" assistant: "Launching the aspect-converter agent." </example> <example> user: "This 720p clip needs to sit in a 4K timeline — upscale it" assistant: "I'll use the aspect-converter agent." </example>
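For illustration, a blur-fill 16:9 to 9:16 conversion of the kind this agent might emit can be a single ffmpeg filter graph; paths and dimensions below are hypothetical, and the agent picks the strategy and encoder itself:

```python
import subprocess

src = "clips/selected/shot-03.mp4"          # hypothetical input
dst = "clips/edited/shot-03-9x16.mp4"       # hypothetical output

# Blur-fill: a blurred, cropped copy fills the 1080x1920 frame; the original sits centred on top.
vf = (
    "split[bg0][fg0];"
    "[bg0]scale=1080:1920:force_original_aspect_ratio=increase,crop=1080:1920,boxblur=20:20[bg];"
    "[fg0]scale=1080:1920:force_original_aspect_ratio=decrease:force_divisible_by=2[fg];"
    "[bg][fg]overlay=(W-w)/2:(H-h)/2"
)
subprocess.run(["ffmpeg", "-y", "-i", src, "-vf", vf, "-c:a", "copy", dst], check=True)
```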
Use this agent when the user has produced video elements in parts (clips, segments, scenes, intro/outro, B-roll inserts) and is ready to join and render them into a single deliverable. Detects the host's GPU (NVIDIA / AMD / Intel / CPU-only) and picks the right ffmpeg encoder (NVENC / AMF / QSV / libx264 / libx265). Handles concat-demuxer for same-codec same-resolution sources and re-encode-and-concat for mixed sources. Respects the workspace's `clips/selected/` ordering and writes to `output/`. <example> user: "I'm done picking takes — concat them and render the rough cut" assistant: "Launching the concatenator agent to detect GPU and assemble." </example> <example> user: "Stitch the intro, the three scene clips, and the outro into one MP4" assistant: "I'll use the concatenator agent — it'll pick the right encoder for your GPU." </example>
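A minimal sketch of the encoder pick and the concat step, assuming the workspace paths above. The hardware probe here is simplified (`-encoders` reports what the ffmpeg build supports, not whether a GPU is actually present), and the agent does a fuller check before choosing:

```python
import subprocess
from pathlib import Path

encoders = subprocess.run(["ffmpeg", "-hide_banner", "-encoders"],
                          capture_output=True, text=True).stdout
encoder = next((e for e in ("h264_nvenc", "h264_amf", "h264_qsv") if e in encoders), "libx264")

clips = sorted(Path("clips/selected").glob("*.mp4"))     # filename order of promoted takes
concat_list = Path("output/concat-list.txt")
concat_list.write_text("".join(f"file '{c.resolve()}'\n" for c in clips))

# Mixed sources: re-encode while concatenating. Same-codec, same-resolution sources
# could instead use "-c copy" for a lossless concat-demuxer join.
subprocess.run(["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", str(concat_list),
                "-c:v", encoder, "-c:a", "aac", "output/drafts/rough-cut.mp4"], check=True)
```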
Use this agent to normalise a set of video clips so they can be safely concatenated or compared — equalises loudness (EBU R128 / -16 LUFS by default), aligns sample rate / channel layout, conforms framerate and pixel format, and optionally applies basic colour-level normalisation (full↔limited range, BT.601↔BT.709). Run before `concatenator` when sources are mixed (different generation models, screen recordings, ElevenLabs voice tracks, etc.). Writes normalised copies to `clips/normalised/` — never overwrites originals. <example> user: "These clips were generated with different models and the audio levels are all over the place — fix them" assistant: "Launching the normalizer agent." </example>
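As a sketch of one normalisation pass (single-pass loudnorm; the agent may prefer a two-pass measure-then-apply run, and the conform targets below are common defaults rather than the agent's fixed choices):

```python
import subprocess
from pathlib import Path

src = Path("clips/selected/shot-02.mp4")               # hypothetical input
dst = Path("clips/normalised") / src.name              # never overwrites the original
dst.parent.mkdir(parents=True, exist_ok=True)

subprocess.run([
    "ffmpeg", "-y", "-i", str(src),
    "-af", "loudnorm=I=-16:TP=-1.5:LRA=11",            # EBU R128, -16 LUFS target
    "-ar", "48000", "-ac", "2",                        # align sample rate / channel layout
    "-r", "25", "-pix_fmt", "yuv420p",                 # conform framerate and pixel format
    "-c:v", "libx264", "-c:a", "aac",
    str(dst),
], check=True)
```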
Use this agent to turn an approved pipeline SPEC.md (produced by pipeline-scaffolder) into an executable pipeline — runnable scripts/configs that `/run-pipeline` can invoke. Generates per-stage runner files, parameter templates, an entry-point script, and a pipeline README. Wires up calls to fal/replicate (or other providers) using the project's MCP servers and respects the workspace's path conventions. Use after the user approves a SPEC.md. <example> user: "Build out the pipeline I just scaffolded — talking-head v1" assistant: "Launching the pipeline-builder agent to generate the runners from SPEC.md." </example>
Use this agent to set up a private GitHub repository where the user versions reusable AI video pipelines across projects. Creates the repo via `gh`, scaffolds an opinionated structure (one folder per pipeline, shared model/cost reference, CHANGELOG, README), seeds it from any pipelines already present in the current AI-Video-Producer workspace, and wires the workspace to consume pipelines from there. Use the first time the user wants pipelines to live outside a single video project, or when migrating an existing pipeline collection into version control. <example> user: "Set me up a private repo for my pipelines" assistant: "I'll launch the pipeline-repo-setup agent to create and seed it." </example>
Use this agent to scaffold a new generation pipeline definition inside an AI-Video-Producer workspace. Takes a high-level pipeline description (e.g. "text → image (Flux) → image-to-video (Kling) → upscale (Topaz)") and produces a structured pipeline spec under `pipelines/<name>/` with a SPEC.md, stage definitions, expected inputs/outputs, model parameters, and integration points. Pure planning — does not execute or write runnable code. Hand off to pipeline-builder when the spec is approved. <example> user: "Scaffold a pipeline for talking-head shots: ElevenLabs voice → Hedra lip-sync → upscale" assistant: "I'll launch the pipeline-scaffolder agent to draft the spec." </example>
Use this agent to draft, revise, or polish a script for an AI video project. Operates inside an AI-Video-Producer workspace — reads `brief/creative-brief.md` and `brief/tools-and-models.md`, writes drafts to `scripts/drafts/`, iterates with the user until a draft is ready to promote to `scripts/final/script.md`. Picks the right format (narration / dialogue / silent visual) based on the brief, matches target runtime (~150 wpm for VO), and respects character seeds from `characters/`. Use when the user wants a draft written, an existing draft tightened, scenes reordered, runtime trimmed, or VO rewritten for a different voice/tone. <example> user: "Draft me a 90-second narration script from the brief" assistant: "Launching the script-writer agent to draft from brief/creative-brief.md." </example> <example> user: "Cut draft-02 down to 60 seconds and make it punchier" assistant: "I'll use the script-writer agent to produce a tightened revision." </example>
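At the ~150 wpm narration pace mentioned above, the word budget for a target runtime is simple arithmetic:

```python
def vo_word_budget(runtime_seconds: float, wpm: int = 150) -> int:
    """Approximate words of narration that fit the target runtime."""
    return round(runtime_seconds / 60 * wpm)

vo_word_budget(90)   # ~225 words for a 90-second narration
vo_word_budget(60)   # ~150 words after cutting down to 60 seconds
```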
Estimate API costs for a planned pipeline or for an entire script before any generation runs. Reads pipeline SPEC.md(s), the storyboard, and `brief/tools-and-models.md`, then projects cost per shot and total — with a low / typical / high range that accounts for take iteration. Writes a budget report to `budgets/estimate-<YYYY-MM-DD>.md`. Use before `/run-pipeline` on any non-trivial project.
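A minimal sketch of the per-shot arithmetic, using illustrative prices and a takes multiplier for iteration; the real figures come from the pipeline SPEC.md and brief/tools-and-models.md:

```python
# Illustrative figures only, not real provider pricing.
stage_cost_per_output = {"text-to-image": 0.04, "image-to-video": 0.35, "upscale": 0.10}
shots = 12
takes = {"low": 1, "typical": 2, "high": 4}   # expected generations per shot before one is promoted

per_shot = sum(stage_cost_per_output.values())
estimate = {band: round(per_shot * shots * n, 2) for band, n in takes.items()}
# e.g. {'low': 5.88, 'typical': 11.76, 'high': 23.52}
```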
Export an AI-Video-Producer pipeline to ComfyUI workflow JSON — produce a workflow the user can load in ComfyUI to reproduce (or approximate) a hosted-API pipeline locally. Best-effort: maps fal/replicate model calls to equivalent ComfyUI checkpoints/LoRAs/samplers and flags stages that have no native ComfyUI equivalent. Use when the user wants to take a pipeline that's been working via APIs and bring it onto local hardware (cost reduction, offline work, fine-grained control).
Import a ComfyUI workflow JSON (the API-format export from ComfyUI's "Save (API Format)" option, or the UI-format graph JSON) and turn it into an AI-Video-Producer pipeline SPEC plus stage runners. Maps Comfy nodes to pipeline stages, surfaces the model checkpoints and LoRAs needed, and flags nodes that have no clean equivalent in a hosted-API pipeline. Use when the user has a ComfyUI workflow that works locally and wants to either run it inside a project workspace or document it as a versioned pipeline.
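As a sketch of the mapping step: the API-format export is a flat JSON object keyed by node id, each node carrying a class_type and its inputs, so the required checkpoints and LoRAs can be pulled out directly (the workflow path is hypothetical):

```python
import json

with open("workflow_api.json") as f:           # ComfyUI "Save (API Format)" export
    workflow = json.load(f)

stages, models = [], []
for node_id, node in workflow.items():
    cls = node.get("class_type", "")
    stages.append((node_id, cls))
    if cls == "CheckpointLoaderSimple":
        models.append(node["inputs"]["ckpt_name"])
    elif cls == "LoraLoader":
        models.append(node["inputs"]["lora_name"])

print("nodes:", stages)
print("checkpoints/LoRAs needed:", models)
```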
Research what fal.ai and Replicate (and optionally other providers — Runway, Kling, Pika, Hedra, ElevenLabs, OpenAI) currently offer for a given workload — text-to-image, text-to-video, image-to-video, lip-sync, voice, upscaling, interpolation — and recommend the best-fit model for the project's brief. Compares quality reputation, price, max duration/resolution, aspect ratio support, and known failure modes. Updates `brief/tools-and-models.md` with the recommendation and rationale on user approval.
Intelligent, preference-driven model recommendation across fal.ai, Replicate, WaveSpeedAI, and MiniMax (Hailuo). Asks the user a short set of preference questions (workload, priority — quality vs speed vs cost, max budget per output, resolution/aspect/duration, NSFW tolerance, must-have features like lip-sync or audio), queries each available provider, and returns a ranked shortlist (3–5 options) with approximate per-output costs, quality/speed notes, and a recommendation. Differs from `model-researcher`: live-API backed, cost-first, and conversational about preferences.
Reformat an existing script for a specific text-to-speech model (ElevenLabs, OpenAI TTS, Google, Azure, Hume, Chatterbox, etc.). Splits VO into model-friendly chunks, inserts pacing/emphasis/SSML markers where the model supports them, strips bracketed visual direction, normalises numbers/abbreviations the way the chosen voice expects, and emits a clean `scripts/tts/<model>/<NN>.txt` (or `.ssml`) file per beat. Use when the user has a locked script and is about to generate VO.
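A minimal sketch of the chunk-and-clean pass, assuming a locked script whose beats are separated by blank lines and whose visual direction sits in square brackets (both assumptions; the skill adapts pacing markers and normalisation to the chosen TTS model):

```python
import re
from pathlib import Path

script = Path("scripts/final/script.md").read_text()
out_dir = Path("scripts/tts/elevenlabs")               # one folder per target TTS model
out_dir.mkdir(parents=True, exist_ok=True)

beats = [b.strip() for b in script.split("\n\n") if b.strip()]
for i, beat in enumerate(beats, start=1):
    vo = re.sub(r"\[[^\]]*\]", "", beat)               # strip bracketed visual direction
    vo = re.sub(r"\s+", " ", vo).strip()
    (out_dir / f"{i:02d}.txt").write_text(vo + "\n")
```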
Two-stage pipeline that generates a still image from a text prompt (Flux/SDXL/Imagen) then animates it into a video clip (Kling/Runway/Hailuo/Wan). Use when the user wants tight visual control over the opening frame before motion.
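A hedged sketch of the two-stage call pattern using the fal client; the endpoint ids are examples of fal's naming and should be checked against the models chosen in the brief before use:

```python
import fal_client  # expects FAL_KEY in the environment (see the setup command above)

# Stage 1: text -> still image (example endpoint id).
image = fal_client.subscribe(
    "fal-ai/flux/dev",
    arguments={"prompt": "wide establishing shot of a rain-soaked neon street, 16:9"},
)
image_url = image["images"][0]["url"]

# Stage 2: image -> video, animating the approved opening frame (example endpoint id;
# output shape varies by model, so verify before wiring into a runner).
video = fal_client.subscribe(
    "fal-ai/kling-video/v1.6/standard/image-to-video",
    arguments={"image_url": image_url, "prompt": "slow dolly forward, light rain"},
)
print(video["video"]["url"])
```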
Single-step text-to-video generation (Sora, Veo, Kling text mode, Hailuo). Use for quick exploratory shots or when motion is more important than precise framing.
Post-processing chain — upscale a clip's resolution and/or interpolate frames for smoother motion. Run after a clip is selected but before final assembly.
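For the interpolation half, a plain-ffmpeg fallback (before reaching for RIFE or Topaz-class tools) might use the minterpolate filter; the paths and target framerate are illustrative:

```python
import subprocess

subprocess.run([
    "ffmpeg", "-y", "-i", "clips/selected/shot-05.mp4",
    "-vf", "minterpolate=fps=60:mi_mode=mci:mc_mode=aobmc:vsbmc=1",  # motion-compensated interpolation to 60 fps
    "-c:a", "copy",
    "clips/edited/shot-05-60fps.mp4",
], check=True)
```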
Reference for the AI video production workspace — lifecycle phases, folder layout, naming conventions, and working principles. Loaded by the install-video-workspace command and consulted by other commands when they need to know where things live.
Generate a voice take (ElevenLabs or similar) and lip-sync it to an existing video clip of a character speaking. Use for talking-head shots where the visual is already generated.
External network access: connects to servers outside your machine.
Requires secrets: needs API keys or credentials to function.
Uses power tools: uses Bash, Write, or Edit tools.