Orchestrates multi-clip AI video projects using anchor-first chaining for visual consistency across shots. Covers concept planning, style anchors, and generation phases.
npx claudepluginhub galbaz1/video-research-mcp
Slash command: /gr:video-production
Orchestrates the full lifecycle of cinematic AI video: anchor image through multi-shot assembly. The core principle is **anchor-first chaining** — one perfect hero still locks visual identity (lighting, palette, texture) across every clip. Without this, each generation invents its own world.
Every production follows five phases regardless of provider or chain pattern.
Phase 1: Define the visual brief before touching any tool.
Phase 2: Generate one hero image per visual world with mcp-image. This image is the single source of truth for all subsequent clips.
mcp-image parameters:
quality: "quality"
imageSize: "4K"
purpose: "cinematic video style anchor"
maintainCharacterConsistency: true (for multi-image sets)
Iterate with inputImagePath until lighting, composition, and subject are exactly right. The anchor is immutable once approved — everything flows from it.
Anchor requirements:
assets/style-anchors/ with descriptor strings in descriptors.md
Anchor sandwich (for sequential scenes):
ANCHOR-START --> [motion] --> HERO --> [motion] --> ANCHOR-END
     |                                                  |
= ANCHOR-END of previous scene          = ANCHOR-START of next scene
This ensures visual continuity: each scene's endpoint feeds the next scene's entry point.
Phase 3: Animate the anchor into video clips. The prompt describes motion and environment only, not appearance (the anchor image handles that). Never paraphrase character descriptions across shots.
Prompt structure: see the prompt template in the video-generation skill.
Never leave lighting vague. Map scenes to concrete physical lights:
| Environment | Physical Light |
|---|---|
| Server room | cool fluorescent strips, blue accent from screens |
| Industrial floor | high-bay sodium vapor, warm amber + clerestory daylight |
| Office | recessed LED panels 4000K, natural window light from one side |
| Outdoor day | overcast sky, no hard shadows, gentle ambient fill |
| Night/moody | single desk lamp, warm pool, deep shadows beyond |
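Multi-take protocol: Generate 3-4 variants for important shots and evaluate each on two axes, rationality and technical quality.
Pick the variant with the fewest rationality failures. Among ties, pick the one with the best technical quality. If all variants fail, revise the prompt; never fix blockers in post.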
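The selection rule can be sketched as a two-key sort. This is only an illustration: the `name,rationality_failures,technical_score` input format, the `pick_take` helper, and the numeric technical score are assumptions, not something the skill defines.

```shell
# Hypothetical input: one line per variant, "name,rationality_failures,technical_score".
# A higher technical_score is assumed to mean better technical quality.
pick_take() {
  # Fewest rationality failures first; among ties, highest technical score first.
  sort -t, -k2,2n -k3,3nr | head -n 1 | cut -d, -f1
}

printf '%s\n' \
  'variant_1.mp4,2,9' \
  'variant_2.mp4,0,6' \
  'variant_3.mp4,0,8' | pick_take
# prints variant_3.mp4
```

Note that variant_1 loses despite the best technical score: rationality failures are always compared first.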
Phase 4: Extract frames and inspect them visually instead of watching at playback speed. This turns a subjective "looks off" into a precise "frame 47 has a lighting discontinuity."
Standard extraction:
for i in 1 2 3; do
  mkdir -p "/tmp/qa/variant-${i}"
  ffmpeg -i "variant_${i}.mp4" -vf "fps=10" "/tmp/qa/variant-${i}/frame_%04d.png"
done
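To carry a flagged frame back into a prompt note or a trim point, the frame number can be converted to an approximate timestamp. The sketch below assumes the `fps=10` extraction rate used above and ffmpeg's default image numbering, which starts at 1. `frame_to_seconds` is a hypothetical helper name.

```shell
# frame_to_seconds FRAME [FPS]: approximate timestamp of an extracted frame.
# Assumes frame_0001.png is the first frame (t = 0) and a constant extraction rate.
frame_to_seconds() {
  local frame=$1 fps=${2:-10}
  awk -v n="$frame" -v fps="$fps" 'BEGIN { printf "%.2f\n", (n - 1) / fps }'
}

frame_to_seconds 47   # prints 4.60
```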
Quick scan — contact sheet:
ffmpeg -i input.mp4 -vf "fps=1,scale=320:-1,tile=6x5" -frames:v 1 -q:v 3 contact_sheet.jpg
Visual inspection checklist:
Decision tree:
| Result | Action |
|---|---|
| All variants fail | Improve prompt, regenerate (max 3 rounds) |
| One variant good | Lock it, proceed to assembly |
| Multiple good | Pick best, lock it |
| Close but flawed | Note specific frame issues, improve prompt |
After 3 failed prompt rounds, simplify the scene: reduce motion, shorten duration, or split into sub-scenes.
Artifact severity:
| Artifact | Severity | Action |
|---|---|---|
| Temporal flicker | Blocker | Regenerate |
| Boundary inconsistency | Blocker | Regenerate |
| Rubber-sheet deformation | Blocker | Regenerate |
| Physics violations | Blocker | Regenerate |
| Color temperature drift | Major | Fix in color grade |
| Too-clean texture | Minor | Fix with grain overlay |
Gate: 0 blockers to pass. Any blocker means regenerate or revise the prompt. Fix majors in post where possible.
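As a rough sketch, the gate can be written as a lookup over the severity table plus a single pass over the artifacts noted for a clip. The function names and lowercase artifact labels are hypothetical. Treating an unlisted artifact as non-blocking is an assumption made here for brevity.

```shell
# Map an artifact name (lowercase, as in the severity table) to its severity.
artifact_severity() {
  case "$1" in
    "temporal flicker"|"boundary inconsistency"|"rubber-sheet deformation"|"physics violations")
      echo blocker ;;
    "color temperature drift") echo major ;;
    "too-clean texture") echo minor ;;
    *) echo unknown ;;  # assumption: unlisted artifacts do not block
  esac
}

# qa_gate ARTIFACT...: prints "regenerate" if any artifact is a blocker, else "pass".
qa_gate() {
  local artifact
  for artifact in "$@"; do
    if [ "$(artifact_severity "$artifact")" = "blocker" ]; then
      echo "regenerate"
      return 0
    fi
  done
  echo "pass"
}

qa_gate "color temperature drift" "too-clean texture"   # prints pass
qa_gate "temporal flicker" "too-clean texture"          # prints regenerate
```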
Scene detection before assembly:
ffmpeg -i input.mp4 -vf "scdet=threshold=40" -f null - 2>&1 | grep scdet
AI generators sometimes insert hard cuts within a single clip. Detect before designing transitions.
Phase 5: Once all scenes pass QA individually:
ffprobe after every operation, final printscreen QA on transition points
xfade offset formula:
offset = sum_of_previous_clip_durations - cumulative_overlap
Every xfade must have a paired acrossfade with identical duration; omitting it causes audio discontinuities.
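A worked example of the offset formula, as a sketch under two assumptions: every transition uses the same fade duration, and cumulative_overlap counts every fade up to and including the current one. Each chained xfade measures its offset against the already shortened output of the previous one. `xfade_offsets` is a hypothetical helper.

```shell
# xfade_offsets FADE DUR1 DUR2 ...: prints one offset (seconds) per transition.
# offset_n = (sum of the first n clip durations) - n * FADE
xfade_offsets() {
  local fade=$1
  shift
  printf '%s\n' "$@" | awk -v fade="$fade" '
    { durations[NR] = $1 }
    END {
      running = 0
      for (n = 1; n < NR; n++) {
        running += durations[n]
        printf "%.3f\n", running - n * fade
      }
    }'
}

xfade_offsets 1 8 8 8
# prints:
# 7.000
# 14.000
```

For three 8 s clips with 1 s fades, the first value would go into something like `xfade=transition=fade:duration=1:offset=7`, paired with `acrossfade=d=1` on the audio side.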
Four patterns for different production scenarios. Choose based on the decision tree, then see references/workflow-patterns.md for detailed walkthroughs.
Pattern 1: Hero still branches into multiple clips sharing visual DNA. Best for: central character or environment across distinct scenes.
Hero Still --> animate_image   --> Clip 1
           --> style-ref video --> Clip 2
           --> style-ref video --> Clip 3
                               --> FFmpeg montage
Pattern 2: Each clip's last frame becomes the next clip's first frame. Best for: continuous motion through a space.
Hero --> animate --> Clip 1 --> extract last frame --> animate --> Clip 2 --> ...
Extract bridge frame:
ffmpeg -sseof -0.1 -i clip_N.mp4 -frames:v 1 -q:v 2 last_frame_N.jpg
Gotcha: motion drift accumulates after 3-4 links. Trim each clip to its first 3-4 strong seconds before extracting.
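The trim-then-extract step for one chain link might look like the sketch below. It only prints the two ffmpeg commands so they can be reviewed first. The `trimmed_N.mp4` name, the 4 s default, and the `libx264 -crf 18` re-encode settings are assumptions, not values from this skill.

```shell
# bridge_commands N [KEEP_SECONDS]: print the commands for chain link N.
bridge_commands() {
  local n=$1 keep=${2:-4}
  # Keep only the first strong seconds; re-encoding is assumed here so the cut lands on the exact frame.
  echo "ffmpeg -y -i clip_${n}.mp4 -t ${keep} -c:v libx264 -crf 18 trimmed_${n}.mp4"
  # Take the bridge frame from the end of the trimmed clip, not the original.
  echo "ffmpeg -y -sseof -0.1 -i trimmed_${n}.mp4 -frames:v 1 -q:v 2 last_frame_${n}.jpg"
}

bridge_commands 1
# To execute instead of print: bridge_commands 1 | sh
```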
Pattern 3: One anchor, multiple independent clips with different treatments. Best for: same scene in different moods, lighting, or time-of-day.
Pattern 4: Single clip extended sequentially beyond the generation limit. Best for: long takes, slow reveals, atmospheric holds.
Gotcha: quality degrades after 2 extensions (~24s). Plan the most important content in the first 8s.
Decision tree:
Multiple distinct scenes, same visual identity --> Pattern 1
Continuous motion through a space --> Pattern 2
Same scene, different moods/treatments --> Pattern 3
Single long unbroken shot --> Pattern 4
Unsure --> Pattern 1 (most versatile)
hqdn3d -> scale -> unsharp -> eq before assembly
See ffmpeg-production for the canonical post-processing chain order (temporal denoise → upscale → sharpen → color grade → grain → encode). The sequence is load-bearing: grain added before denoising is destroyed, and interpolation after grain causes tearing.
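A minimal sketch of the hqdn3d -> scale -> unsharp -> eq order as a single `-vf` string. All parameter values here are placeholder assumptions (hqdn3d uses its documented defaults); the canonical settings live in ffmpeg-production. `prep_filter` is a hypothetical helper.

```shell
# prep_filter [WIDTH] [HEIGHT]: print a filter chain in the order
# temporal denoise -> upscale -> sharpen -> color grade.
prep_filter() {
  local width=${1:-3840} height=${2:-2160}
  echo "hqdn3d=4:3:6:4.5,scale=${width}:${height}:flags=lanczos,unsharp=5:5:0.8,eq=contrast=1.05:saturation=1.05"
}

prep_filter
# Example use (not run here):
#   ffmpeg -i scene.mp4 -vf "$(prep_filter)" -c:a copy scene_prepped.mp4
```

Building the chain as one string keeps the order explicit and easy to reuse across scenes.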
| Need | Skill |
|---|---|
| Hero image prompt optimization | image-generation |
| Provider tools and settings | video-generation |
| FFmpeg encoding and filters | ffmpeg-production |
| Voice-over and audio mixing | tts-production |
For detailed per-pattern walkthroughs, FFmpeg commands, QA inspection protocol, multi-take selection, and post-processing recipes: references/workflow-patterns.md