From nano-banana
Generates AI videos using Google Veo models via nano-banana CLI for text-to-video, image animation, lip-sync dialogue, and scene extensions.
npx claudepluginhub the-focus-ai/claude-marketplace --plugin nano-banana
This skill uses the workspace's default tool permissions.
Generate videos using Google Veo 3.1 models via the `nano-banana` CLI.
GEMINI_API_KEY environment variable must be set
npx @the-focus-ai/nano-banana
# Generate a video from text
nano-banana --video "A sunset over mountains, slow dolly-in, cinematic lighting"
# Animate an existing image
nano-banana --video "The character slowly turns and smiles" --file portrait.png
# Cost-optimized development mode
nano-banana --video "Quick test scene" --video-fast --no-audio --resolution 720p
# Specify output path
nano-banana --video "A cat playing" --output cat-video.mp4
# Full control over settings
nano-banana --video "Dramatic reveal scene" \
--duration 8 --aspect 16:9 --resolution 1080p --seed 42
Before generating, clarify these video-specific aspects:
Structure prompts with these elements:
[Camera Movement] + [Subject] + [Action] + [Environment] + [Audio/Style]
Example - Weak prompt:
"a person walking"
Example - Strong prompt:
"Slow dolly-in shot. A woman in her 30s, shoulder-length wavy black hair,
green jacket, walks confidently through a sunlit park. Golden hour lighting,
warm color grading. Ambient sounds: birds chirping, distant traffic.
Cinematic, aspirational mood. No subtitles, no text overlay."
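The five-part structure above can be assembled mechanically. The following is a minimal sketch, assuming a helper named `build_prompt` (hypothetical, not part of `nano-banana`) that joins the elements in the order shown:

```shell
# build_prompt is a hypothetical helper, not part of the nano-banana CLI.
# Arguments: $1 camera movement, $2 subject, $3 action,
#            $4 environment, $5 audio/style notes.
build_prompt() {
  printf '%s. %s %s %s. %s\n' "$1" "$2" "$3" "$4" "$5"
}

prompt=$(build_prompt \
  "Slow dolly-in shot" \
  "A woman in a green jacket" \
  "walks confidently" \
  "through a sunlit park" \
  "Golden hour lighting. Ambient sounds: birds chirping. No subtitles, no text overlay.")

# Print the assembled prompt; pass it to the CLI with: nano-banana --video "$prompt"
echo "$prompt"
```

Keeping each element in its own argument makes it easy to vary one part (for example, the camera movement) while holding the rest constant.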
See prompting-guide.md for comprehensive guidance.
Key principles:
Video generation is significantly more expensive than image generation:
| Model | Cost per Second | 8-Second Video |
|---|---|---|
| veo-3.1-generate-001 | $0.40 | $3.20 |
| veo-3.1-fast-generate-001 | $0.15 | $1.20 |
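The per-video figures are the per-second rate multiplied by the duration. As a rough sketch, a hypothetical `estimate_cost` helper (not part of `nano-banana`) could reproduce them, assuming the rates in the table above are still current:

```shell
# estimate_cost is a hypothetical helper; the rates are copied from the
# table above and may not match current pricing.
# Arguments: $1 model name, $2 duration in seconds.
estimate_cost() {
  case "$1" in
    veo-3.1-generate-001)      rate=0.40 ;;
    veo-3.1-fast-generate-001) rate=0.15 ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
  # awk handles the floating-point multiplication that plain sh lacks.
  LC_ALL=C awk -v r="$rate" -v d="$2" 'BEGIN { printf "%.2f\n", r * d }'
}

estimate_cost veo-3.1-generate-001 8       # prints 3.20
estimate_cost veo-3.1-fast-generate-001 4  # prints 0.60
```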
Development workflow:
--video-fast --no-audio (cheapest)
--video-fast (add audio when needed)
nano-banana --video "your detailed prompt here"
Generation takes 2-4 minutes. Progress is shown in the terminal.
If the result isn't right:
nano-banana --video "<prompt>"
nano-banana --video "<motion description>" --file <input-image>
The motion description should describe how the image should animate:
| Option | Description | Default |
|---|---|---|
| --video | Enable video mode | (required) |
| --video-model <name> | Veo model to use | veo-3.1-generate-001 |
| --video-fast | Use fast/cheap model | (premium model) |
| --duration <sec> | 4, 6, or 8 seconds | 8 |
| --aspect <ratio> | 16:9 or 9:16 | 16:9 |
| --resolution <res> | 720p, 1080p, or 4K | 1080p |
| --audio | Generate audio | (enabled) |
| --no-audio | Disable audio | - |
| --seed <number> | Reproducibility seed | (random) |
| --output <file> | Output path | output/video-.mp4 |
| --file <image> | Input image to animate | - |
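Because each generation is billed, it can help to check option values before invoking the CLI. The sketch below assumes the accepted values are exactly those listed in the table; `validate_video_options` is a hypothetical wrapper-side check, not something `nano-banana` provides:

```shell
# validate_video_options is a hypothetical pre-flight check.
# Accepted values are taken from the options table above.
# Arguments: $1 duration, $2 aspect ratio, $3 resolution.
validate_video_options() {
  case "$1" in
    4|6|8) ;;
    *) echo "invalid duration: $1 (expected 4, 6, or 8)" >&2; return 1 ;;
  esac
  case "$2" in
    16:9|9:16) ;;
    *) echo "invalid aspect: $2 (expected 16:9 or 9:16)" >&2; return 1 ;;
  esac
  case "$3" in
    720p|1080p|4K) ;;
    *) echo "invalid resolution: $3 (expected 720p, 1080p, or 4K)" >&2; return 1 ;;
  esac
  return 0
}

# Usage: only call the CLI when the values pass.
# validate_video_options 8 16:9 1080p &&
#   nano-banana --video "Dramatic reveal scene" --duration 8 --aspect 16:9 --resolution 1080p
```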
Use these terms for precise camera control:
| Movement | Description | Example Prompt |
|---|---|---|
| Static | No movement | "Static shot on tripod. A coffee cup steaming..." |
| Pan | Horizontal rotation | "Slow pan left across the city skyline..." |
| Tilt | Vertical rotation | "Tilt down from face to hands..." |
| Dolly In | Camera moves closer | "Slow dolly-in from medium to close-up..." |
| Dolly Out | Camera moves away | "Dolly-out revealing the vast landscape..." |
| Tracking | Parallel to subject | "Tracking shot following character walking..." |
| Crane | Sweeping vertical | "Crane shot ascending from ground level..." |
| Handheld | Realistic shake | "Handheld camera, documentary style..." |
Important: Use ONE primary movement per shot. Don't combine multiple movements.
For spoken dialogue, use the colon format:
Character description says: "Exact dialogue here."
Example:
"A friendly young woman, excited and cheerful, says: 'Welcome to our store!'
Standing in a bright retail environment. Natural lip-sync. No subtitles."
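The colon format can be templated so the spoken line is always quoted the same way. This is only a sketch: `dialogue_prompt` is a hypothetical helper, and the trailing "Natural lip-sync. No subtitles." suffix is borrowed from the example above rather than required by the CLI:

```shell
# dialogue_prompt is a hypothetical helper that applies the colon format.
# Arguments: $1 character description, $2 exact spoken line, $3 setting sentence.
dialogue_prompt() {
  printf '%s says: "%s" %s Natural lip-sync. No subtitles.\n' "$1" "$2" "$3"
}

dialogue_prompt \
  "A friendly young woman, excited and cheerful," \
  "Welcome to our store!" \
  "Standing in a bright retail environment."
```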
Guidelines:
Structure audio in layers:
Example:
"Sound effects: Door closing at 2-second mark, footsteps on wood.
Ambient sounds: Quiet office hum, distant typing.
Background music: Soft jazz, low volume, ducks under dialogue."
When creating multiple related videos:
--seed for more reproducible results
--video-fast for faster generation
# Development (cheapest): ~$1.20 per video
nano-banana --video "test prompt" --video-fast --no-audio --resolution 720p
# Testing with audio: ~$1.20 per video
nano-banana --video "test prompt" --video-fast
# Production quality: ~$3.20 per video
nano-banana --video "final prompt" --resolution 1080p
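The three tiers above differ only in their flags, so a small lookup can keep them in one place. A minimal sketch, assuming the stage names `dev`, `test`, and `prod` (labels invented for this illustration) and a hypothetical `stage_flags` helper:

```shell
# stage_flags is a hypothetical helper; the stage names are illustrative.
# The flag sets mirror the three commands shown above.
stage_flags() {
  case "$1" in
    dev)  echo "--video-fast --no-audio --resolution 720p" ;;
    test) echo "--video-fast" ;;
    prod) echo "--resolution 1080p" ;;
    *) echo "unknown stage: $1" >&2; return 1 ;;
  esac
}

# Usage (the unquoted substitution is intentional so the flags split into words):
# nano-banana --video "test prompt" $(stage_flags dev)
```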
See the examples/ directory for complete prompt examples:
Ensure GEMINI_API_KEY is set:
export GEMINI_API_KEY="your-api-key-here"
Or create a .env file in your project:
GEMINI_API_KEY=your-api-key-here
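A quick pre-flight check can confirm that one of the two options above is in place before a paid generation is attempted. This sketch assumes the `.env` file uses the plain `GEMINI_API_KEY=value` form shown above; `have_api_key` is a hypothetical helper and does not reflect how `nano-banana` itself loads the key:

```shell
# have_api_key is a hypothetical pre-flight check.
# Argument: $1 optional path to a .env file (defaults to ./.env).
# Succeeds if the variable is exported or the file defines a non-empty value.
have_api_key() {
  env_file=${1:-.env}
  [ -n "${GEMINI_API_KEY:-}" ] && return 0
  [ -f "$env_file" ] && grep -q '^GEMINI_API_KEY=.' "$env_file"
}

# Usage:
# have_api_key || { echo "GEMINI_API_KEY is not configured" >&2; exit 1; }
```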