Generate videos using Google Veo models via the nano-banana CLI. Use this skill when the user asks to create, generate, animate, or produce videos with AI.
Generate videos using Google Veo 3.1 models via the nano-banana CLI.
GEMINI_API_KEY environment variable must be set
npx @the-focus-ai/nano-banana
# Generate a video from text
nano-banana --video "A sunset over mountains, slow dolly-in, cinematic lighting"
# Animate an existing image
nano-banana --video "The character slowly turns and smiles" --file portrait.png
# Cost-optimized development mode
nano-banana --video "Quick test scene" --video-fast --no-audio --resolution 720p
# Specify output path
nano-banana --video "A cat playing" --output cat-video.mp4
# Full control over settings
nano-banana --video "Dramatic reveal scene" \
--duration 8 --aspect 16:9 --resolution 1080p --seed 42
Before generating, clarify these video-specific aspects:
Structure prompts with these elements:
[Camera Movement] + [Subject] + [Action] + [Environment] + [Audio/Style]
Example - Weak prompt:
"a person walking"
Example - Strong prompt:
"Slow dolly-in shot. A woman in her 30s, shoulder-length wavy black hair,
green jacket, walks confidently through a sunlit park. Golden hour lighting,
warm color grading. Ambient sounds: birds chirping, distant traffic.
Cinematic, aspirational mood. No subtitles, no text overlay."
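The five-part structure above can be assembled mechanically. The following is a minimal sketch; `build_prompt` is a hypothetical helper (not part of nano-banana), and the sentence punctuation it uses is only one reasonable choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: joins the five prompt elements in the order
# [Camera Movement] + [Subject] + [Action] + [Environment] + [Audio/Style].
build_prompt() {
  local camera="$1" subject="$2" action="$3" environment="$4" style="$5"
  printf '%s. %s %s %s. %s\n' "$camera" "$subject" "$action" "$environment" "$style"
}

prompt="$(build_prompt \
  "Slow dolly-in shot" \
  "A woman in a green jacket" \
  "walks confidently" \
  "through a sunlit park" \
  "Golden hour lighting. Ambient sounds: birds chirping. No subtitles, no text overlay.")"

# Print the command instead of running it, so nothing is billed by accident.
printf 'nano-banana --video "%s"\n' "$prompt"
```

Keeping each element in its own argument makes it easy to change one part (for example the camera movement) while the rest of the prompt stays fixed.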
See prompting-guide.md for comprehensive guidance.
Key principles:
Video generation is significantly more expensive than images:
| Model | Cost per Second | 8-Second Video |
|---|---|---|
| veo-3.1-generate-001 | $0.40 | $3.20 |
| veo-3.1-fast-generate-001 | $0.15 | $1.20 |
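The per-video figures are the per-second rate multiplied by the clip duration. Here is a minimal sketch of that arithmetic, assuming the rates in the table above still apply; `estimate_cost` is a hypothetical helper, not a nano-banana command.

```shell
#!/usr/bin/env bash
# Hypothetical helper. Per-second prices are in cents, copied from the
# pricing table; integer math avoids floating-point rounding.
estimate_cost() {
  local seconds="$1" model="${2:-veo-3.1-generate-001}" rate
  case "$model" in
    veo-3.1-generate-001)      rate=40 ;;
    veo-3.1-fast-generate-001) rate=15 ;;
    *) echo "unknown model: $model" >&2; return 1 ;;
  esac
  local cents=$(( seconds * rate ))
  printf '$%d.%02d\n' $(( cents / 100 )) $(( cents % 100 ))
}

estimate_cost 8                             # prints $3.20
estimate_cost 8 veo-3.1-fast-generate-001   # prints $1.20
estimate_cost 4 veo-3.1-fast-generate-001   # prints $0.60
```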
Development workflow:
- --video-fast --no-audio (cheapest)
- --video-fast (add audio when needed)

nano-banana --video "your detailed prompt here"
Generation takes 2-4 minutes. Progress is shown in the terminal.
If the result isn't right:
nano-banana --video "<prompt>"
nano-banana --video "<motion description>" --file <input-image>
The motion description should describe how the image should animate:
| Option | Description | Default |
|---|---|---|
| --video | Enable video mode | (required) |
| --video-model <name> | Veo model to use | veo-3.1-generate-001 |
| --video-fast | Use fast/cheap model | (premium model) |
| --duration <sec> | 4, 6, or 8 seconds | 8 |
| --aspect <ratio> | 16:9 or 9:16 | 16:9 |
| --resolution <res> | 720p, 1080p, or 4K | 1080p |
| --audio | Generate audio | (enabled) |
| --no-audio | Disable audio | - |
| --seed <number> | Reproducibility seed | (random) |
| --output <file> | Output path | output/video- |
| --file <image> | Input image to animate | - |
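Because every generation is billed, it can help to reject invalid option values before calling the CLI. The sketch below checks only the value ranges listed in the options table. `validate_video_opts` is a hypothetical helper, and how nano-banana itself reacts to out-of-range values is not covered here.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: returns non-zero when a value falls
# outside the ranges listed for --duration, --aspect, and --resolution.
validate_video_opts() {
  local duration="$1" aspect="$2" resolution="$3"
  case "$duration" in
    4|6|8) ;;
    *) echo "duration must be 4, 6, or 8" >&2; return 1 ;;
  esac
  case "$aspect" in
    16:9|9:16) ;;
    *) echo "aspect must be 16:9 or 9:16" >&2; return 1 ;;
  esac
  case "$resolution" in
    720p|1080p|4K) ;;
    *) echo "resolution must be 720p, 1080p, or 4K" >&2; return 1 ;;
  esac
}

# Only print the command once the values pass the check.
if validate_video_opts 8 16:9 1080p; then
  echo 'nano-banana --video "<prompt>" --duration 8 --aspect 16:9 --resolution 1080p'
fi
```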
Use these terms for precise camera control:
| Movement | Description | Example Prompt |
|---|---|---|
| Static | No movement | "Static shot on tripod. A coffee cup steaming..." |
| Pan | Horizontal rotation | "Slow pan left across the city skyline..." |
| Tilt | Vertical rotation | "Tilt down from face to hands..." |
| Dolly In | Camera moves closer | "Slow dolly-in from medium to close-up..." |
| Dolly Out | Camera moves away | "Dolly-out revealing the vast landscape..." |
| Tracking | Parallel to subject | "Tracking shot following character walking..." |
| Crane | Sweeping vertical | "Crane shot ascending from ground level..." |
| Handheld | Realistic shake | "Handheld camera, documentary style..." |
Important: Use ONE primary movement per shot. Don't combine multiple movements.
For spoken dialogue, use the colon format:
Character description says: "Exact dialogue here."
Example:
"A friendly young woman, excited and cheerful, says: 'Welcome to our store!'
Standing in bright retail environment. Natural lip-sync. No subtitles."
Guidelines:
Structure audio in layers:
Example:
"Sound effects: Door closing at 2-second mark, footsteps on wood.
Ambient sounds: Quiet office hum, distant typing.
Background music: Soft jazz, low volume, ducks under dialogue."
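The layered format in the example above can be generated from separate inputs, which keeps each layer easy to edit. This is a minimal sketch. `audio_layers` is a hypothetical helper, and the three labels are taken from the example rather than from any nano-banana requirement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emits one sentence per audio layer and skips
# any layer passed as an empty string.
audio_layers() {
  local sfx="$1" ambient="$2" music="$3" out=""
  if [ -n "$sfx" ]; then out="${out}Sound effects: $sfx. "; fi
  if [ -n "$ambient" ]; then out="${out}Ambient sounds: $ambient. "; fi
  if [ -n "$music" ]; then out="${out}Background music: $music. "; fi
  # Strip the single trailing space left by the last layer.
  printf '%s\n' "${out% }"
}

audio_layers "footsteps on wood" "quiet office hum" "soft jazz, low volume"
```

The resulting text is meant to be appended to the visual part of the prompt passed to `--video`.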
When creating multiple related videos:
- --seed for more reproducible results
- --video-fast for faster generation

# Development (cheapest): ~$1.20 per video
nano-banana --video "test prompt" --video-fast --no-audio --resolution 720p
# Testing with audio: ~$1.20 per video
nano-banana --video "test prompt" --video-fast
# Production quality: ~$3.20 per video
nano-banana --video "final prompt" --resolution 1080p
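The three tiers above differ only in their flags, so a small wrapper can select them by stage name. This is a sketch under that assumption. `stage_flags` and the stage names `dev`, `test`, and `prod` are hypothetical and not part of nano-banana.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: maps a workflow stage to the flag sets used in
# the development, testing, and production commands.
stage_flags() {
  case "$1" in
    dev)  echo "--video-fast --no-audio --resolution 720p" ;;
    test) echo "--video-fast" ;;
    prod) echo "--resolution 1080p" ;;
    *)    echo "usage: stage_flags dev|test|prod" >&2; return 1 ;;
  esac
}

# Prints the command for review rather than executing it.
echo "nano-banana --video \"test prompt\" $(stage_flags dev)"
```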
See the examples/ directory for complete prompt examples:
Ensure GEMINI_API_KEY is set:
export GEMINI_API_KEY="your-api-key-here"
Or create a .env file in your project:
GEMINI_API_KEY=your-api-key-here
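A script that wraps the CLI can check for the key up front instead of failing partway through. The guard below is a minimal sketch. `require_api_key` is a hypothetical helper, and it inspects only the current shell environment, so a key supplied solely through a .env file would not be detected by it.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeeds only when GEMINI_API_KEY is set and
# non-empty in the current shell environment.
require_api_key() {
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "GEMINI_API_KEY is not set in this shell" >&2
    return 1
  fi
}

if require_api_key 2>/dev/null; then
  echo "key found in environment"
else
  echo "key missing in this shell"
fi
```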