Generate videos using Google Veo models via the nano-banana CLI. Use this skill when the user asks to create, generate, animate, or produce videos with AI. Supports text-to-video, image-to-video animation, dialogue with lip-sync, and scene extensions. Trigger on requests like "create a video", "animate this image", "make a video clip", "generate footage", "produce a short film", "add motion to this".
Install the plugin:

/plugin marketplace add The-Focus-AI/nano-banana-cli
/plugin install nano-banana@focus-marketplace

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Bundled reference files:

- examples/cinematic-shots.md
- examples/dialogue-and-audio.md
- examples/image-to-video.md
- examples/json-prompting.md
- examples/scene-extensions.md
- prompting-guide.md
The GEMINI_API_KEY environment variable must be set. Run the CLI with npx:

npx @the-focus-ai/nano-banana

# Generate a video from text
nano-banana --video "A sunset over mountains, slow dolly-in, cinematic lighting"
# Animate an existing image
nano-banana --video "The character slowly turns and smiles" --file portrait.png
# Cost-optimized development mode
nano-banana --video "Quick test scene" --video-fast --no-audio --resolution 720p
# Specify output path
nano-banana --video "A cat playing" --output cat-video.mp4
# Full control over settings
nano-banana --video "Dramatic reveal scene" \
--duration 8 --aspect 16:9 --resolution 1080p --seed 42
Before generating, clarify these video-specific aspects: duration (4, 6, or 8 seconds), aspect ratio (16:9 or 9:16), resolution, whether audio is needed, and whether to start from text or an existing image.
Structure prompts with these elements:
[Camera Movement] + [Subject] + [Action] + [Environment] + [Audio/Style]
Example - Weak prompt:
"a person walking"
Example - Strong prompt:
"Slow dolly-in shot. A woman in her 30s, shoulder-length wavy black hair,
green jacket, walks confidently through a sunlit park. Golden hour lighting,
warm color grading. Ambient sounds: birds chirping, distant traffic.
Cinematic, aspirational mood. No subtitles, no text overlay."
See prompting-guide.md for comprehensive guidance.

Key principles: be specific about the subject and setting, use one camera movement per shot, describe lighting and mood, spell out the audio you want, and end with "No subtitles, no text overlay."
Video generation is significantly more expensive than images:
| Model | Cost per Second | 8-Second Video |
|---|---|---|
| veo-3.1-generate-preview | $0.50-0.75 | $4-6 |
| veo-3.1-fast-generate-preview | $0.10-0.15 | $0.80-1.20 |
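The cost table reduces to duration times per-second rate. As a quick sanity check before generating, the sketch below estimates spend using the upper-bound rates from the table (illustrative arithmetic only, not official billing logic):

```shell
#!/bin/sh
# Estimate video cost: duration (seconds) x per-second rate.
# Rates are the table's upper bounds; actual billing may differ.
estimate_cost() {
  duration="$1"   # 4, 6, or 8
  model="$2"      # "premium" or "fast"
  case "$model" in
    premium) rate_cents=75 ;;  # $0.75/s for veo-3.1-generate-preview
    fast)    rate_cents=15 ;;  # $0.15/s for veo-3.1-fast-generate-preview
    *) echo "unknown model: $model" >&2; return 1 ;;
  esac
  total_cents=$((duration * rate_cents))
  printf '$%d.%02d\n' $((total_cents / 100)) $((total_cents % 100))
}

estimate_cost 8 premium   # → $6.00
estimate_cost 8 fast      # → $1.20
```

This makes the trade-off concrete: at 8 seconds, one premium render costs about five fast renders.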
Development workflow:
1. --video-fast --no-audio (cheapest)
2. --video-fast (add audio when needed)
3. Full quality for the final render:

nano-banana --video "your detailed prompt here"
Generation takes 2-4 minutes. Progress is shown in the terminal.
If the result isn't right, tighten the prompt (more specific subject, a single camera movement, explicit lighting and audio cues) and regenerate; a fixed --seed makes comparisons between prompt revisions more controlled.
Text-to-video:

nano-banana --video "<prompt>"

Image-to-video:

nano-banana --video "<motion description>" --file <input-image>

The motion description should describe how the image should animate (movement, expressions, camera), not restate what is already visible in the frame.
| Option | Description | Default |
|---|---|---|
| --video | Enable video mode | (required) |
| --video-model <name> | Veo model to use | veo-3.1-generate-preview |
| --video-fast | Use fast/cheap model | (premium model) |
| --duration <sec> | 4, 6, or 8 seconds | 8 |
| --aspect <ratio> | 16:9 or 9:16 | 16:9 |
| --resolution <res> | 720p or 1080p | 1080p |
| --audio | Generate audio | (enabled) |
| --no-audio | Disable audio | - |
| --seed <number> | Reproducibility seed | (random) |
| --output <file> | Output path | output/video-<timestamp>.mp4 |
| --file <image> | Input image to animate | - |
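For repeated development runs it can help to wrap the cost-optimized flags once. `nb_dev` below is a hypothetical helper, not part of nano-banana; it echoes the assembled command so you can inspect it first (drop the `echo` to actually run it):

```shell
#!/bin/sh
# Hypothetical dev-mode wrapper: assembles the cheapest invocation from the
# flags in the options table. Prints the command instead of running it.
nb_dev() {
  prompt="$1"
  echo nano-banana --video "\"$prompt\"" --video-fast --no-audio --resolution 720p
}

nb_dev "Quick test scene"
```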
Use these terms for precise camera control:
| Movement | Description | Example Prompt |
|---|---|---|
| Static | No movement | "Static shot on tripod. A coffee cup steaming..." |
| Pan | Horizontal rotation | "Slow pan left across the city skyline..." |
| Tilt | Vertical rotation | "Tilt down from face to hands..." |
| Dolly In | Camera moves closer | "Slow dolly-in from medium to close-up..." |
| Dolly Out | Camera moves away | "Dolly-out revealing the vast landscape..." |
| Tracking | Parallel to subject | "Tracking shot following character walking..." |
| Crane | Sweeping vertical | "Crane shot ascending from ground level..." |
| Handheld | Realistic shake | "Handheld camera, documentary style..." |
Important: Use ONE primary movement per shot. Don't combine multiple movements.
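One cheap way to choose a movement is to render the same subject once per candidate movement in fast, silent 720p with a fixed seed. The sketch below is a dry run: `preview_movements` is a hypothetical helper, the coffee-cup subject is illustrative, and the `echo` only prints each command (remove it to execute):

```shell
#!/bin/sh
# Print one fast, silent 720p test command per camera movement, same seed
# for a more controlled comparison. Remove `echo` to actually generate.
preview_movements() {
  for move in "Static shot on tripod" "Slow pan left" "Slow dolly-in"; do
    echo nano-banana --video "\"$move. A coffee cup steaming on a wooden table.\"" \
      --video-fast --no-audio --resolution 720p --seed 42
  done
}

preview_movements
```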
For spoken dialogue, use the colon format:
Character description says: "Exact dialogue here."
Example:
"A friendly young woman, excited and cheerful, says: 'Welcome to our store!'
Standing in bright retail environment. Natural lip-sync. No subtitles."
Guidelines: keep each line short enough to fit the clip duration, describe the speaker's tone and emotion, request natural lip-sync, and add "No subtitles" to prevent burned-in captions.
Structure audio in layers:
Example:
"Sound effects: Door closing at 2-second mark, footsteps on wood.
Ambient sounds: Quiet office hum, distant typing.
Background music: Soft jazz, low volume, ducks under dialogue."
When creating multiple related videos:
- Use --seed for more reproducible results
- Use --video-fast for faster generation

# Development (cheapest): ~$0.80 per video
nano-banana --video "test prompt" --video-fast --no-audio --resolution 720p
# Testing with audio: ~$1.20 per video
nano-banana --video "test prompt" --video-fast
# Production quality: ~$6 per video
nano-banana --video "final prompt" --resolution 1080p
See the examples/ directory for complete prompt examples: cinematic-shots.md, dialogue-and-audio.md, image-to-video.md, json-prompting.md, and scene-extensions.md.
Ensure GEMINI_API_KEY is set:
export GEMINI_API_KEY="your-api-key-here"
Or create a .env file in your project:
GEMINI_API_KEY=your-api-key-here
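Because a missing key otherwise only surfaces once the CLI is already running, a small guard can fail fast. This is a sketch, not part of nano-banana:

```shell
#!/bin/sh
# Guard sketch: fail fast with a clear message if GEMINI_API_KEY is unset
# or empty, instead of letting the CLI error out mid-generation.
require_api_key() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "error: GEMINI_API_KEY is not set" >&2
    return 1
  fi
  return 0
}

# Usage: require_api_key && nano-banana --video "A sunset over mountains"
```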