Vidu — AI Video Generation - Vidu Q3 & Vidu 2.0
---
name: vidu
description: "Generate AI videos using Vidu, featuring text-to-video, image-to-video, reference-to-video, and start-end-to-video with up to 1080p resolution, anime style support, audio/BGM generation, and movement amplitude control. Supports Vidu Q3-Pro (latest) and Vidu 2.0 across 6 model variants. Available via Atlas Cloud API at up to 15% off standard pricing. Use this skill whenever the user wants to generate AI videos, create video clips, animate images, produce short films, make video content, or mentions Vidu, Shengshu AI, or video generation. Also trigger when users ask to create product demos, marketing videos, social media reels, animated scenes, cinematic clips, anime videos, start-end frame interpolation, character-consistent videos, or any video content using AI."
source: "https://github.com/AtlasCloudAI/nano-banana-2-skill"
homepage: "https://github.com/AtlasCloudAI/nano-banana-2-skill"
env_vars:
  ATLASCLOUD_API_KEY:
    description: "Atlas Cloud API key for accessing Vidu video generation models"
    required: true
---
Vidu — AI Video Generation by Shengshu AI
Generate AI videos using Vidu Q3-Pro and Vidu 2.0 — featuring text-to-video, image-to-video, reference-based generation, and start-end frame interpolation with up to 1080p resolution, anime style support, and synchronized audio generation.
Vidu Q3-Pro is the latest flagship model with cinematic motion quality, smooth animation, optional audio/BGM generation, and multiple style modes (general/anime). Vidu 2.0 adds specialized capabilities including reference-to-video (character-consistent generation) and start-end-to-video (keyframe interpolation).
Data usage note: This skill sends text prompts, image URLs, and video data to the Atlas Cloud API (`api.atlascloud.ai`) for video generation. No data is stored locally beyond the downloaded output files. API usage incurs charges per second based on the model selected.
Key Capabilities
- Text-to-Video — Generate video clips from text descriptions with audio (Q3-Pro)
- Image-to-Video — Animate still images into dynamic video (Q3-Pro / 2.0)
- Reference-to-Video — Generate videos with character/prop consistency from 1-3 reference images (2.0 / Q1)
- Start-End-to-Video — Interpolate between two keyframes to create smooth transitions (2.0)
- Audio & BGM — Optional synchronized sound effects and background music (Q3-Pro)
- Anime Style — Native anime style support (Q3-Pro)
- Movement Control — Adjustable movement amplitude: auto, small, medium, large
- Up to 1080p — Resolutions: 540p, 720p, 1080p (Q3-Pro)
- Multiple Aspect Ratios — 16:9, 9:16, 1:1, 4:3, 3:4 (Q3-Pro)
Setup
- Sign up at https://www.atlascloud.ai
- Console → API Keys → Create new key
- Set env:
export ATLASCLOUD_API_KEY="your-key"
The API key is tied to your Atlas Cloud account and its pay-as-you-go balance. All usage is billed to this account. Atlas Cloud does not currently support scoped keys — the key grants access to all models available on your account.
Pricing
Vidu Q3-Pro (per second, by resolution)
All prices are per second of video generated. Atlas Cloud pricing varies by resolution.
| Resolution | fal.ai | Atlas Cloud | Savings |
|---|---|---|---|
| 540p | $0.07/s | $0.06/s | 14% off |
| 720p | $0.154/s | $0.15/s | 3% off |
| 1080p | $0.154/s | $0.16/s | - |
Applies to both vidu/q3-pro/text-to-video and vidu/q3-pro/image-to-video.
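As a rough illustration of the per-second billing above, the sketch below estimates what a Q3-Pro clip might cost at the Atlas Cloud rates from the table. The function name `vidu_q3_cost` is hypothetical and not part of any Atlas Cloud tooling, and the rates are simply copied from the table, so treat the result as an estimate rather than a quote.

```shell
# Hypothetical helper: estimate Q3-Pro cost from the Atlas Cloud rates in the table above.
# Usage: vidu_q3_cost <resolution> <duration_seconds>
vidu_q3_cost() {
  local resolution="$1" duration="$2" rate
  case "$resolution" in
    540p)  rate=0.06 ;;
    720p)  rate=0.15 ;;
    1080p) rate=0.16 ;;
    *) echo "unknown resolution: $resolution" >&2; return 1 ;;
  esac
  # awk does the floating-point multiplication that shell arithmetic cannot.
  awk -v r="$rate" -v d="$duration" 'BEGIN { printf "%.2f\n", r * d }'
}

# Example: a 5-second clip at 1080p
vidu_q3_cost 1080p 5
```

The same call with `540p` gives a cheaper draft estimate, which matches the suggestion elsewhere in this document to use 540p for drafts.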
Vidu 2.0 / Q1 (per video)
| Model | Atlas Cloud | Type |
|---|---|---|
| `vidu/image-to-video-2.0` | $0.075/video | Image-to-Video |
| `vidu/start-end-to-video-2.0` | $0.075/video | Start-End Interpolation |
| `vidu/reference-to-video-2.0` | $0.2/video | Reference-to-Video (character consistent) |
| `vidu/reference-to-video-q1` | $0.4/video | Reference-to-Video Q1 (highest quality) |
fal.ai pricing sourced from fal.ai/models/fal-ai/vidu/q3-pro/text-to-video.
Parameters
Vidu Q3-Pro — Text-to-Video
| Parameter | Type | Required | Default | Options |
|---|---|---|---|---|
| `prompt` | string | Yes | - | Video description (max 1500 chars) |
| `style` | string | No | general | general, anime |
| `resolution` | string | No | 720p | 540p, 720p, 1080p |
| `duration` | number | No | 5 | Duration in seconds |
| `aspect_ratio` | string | No | 4:3 | 16:9, 9:16, 4:3, 3:4, 1:1 |
| `movement_amplitude` | string | No | auto | auto, small, medium, large |
| `generate_audio` | boolean | No | true | Generate synchronized audio |
| `bgm` | boolean | No | true | Generate background music |
| `seed` | integer | No | random | For reproducible results (-1 for random) |
Vidu Q3-Pro — Image-to-Video
Same as Q3-Pro text-to-video (without style and aspect_ratio), plus:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `image` | string | Yes | URL of the source image |
Vidu 2.0 — Image-to-Video
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `prompt` | string | Yes | - | Video description (max 1500 chars) |
| `image` | string | Yes | - | Start frame image URL (PNG/JPEG/WebP, max 50MB, ratio < 4:1) |
| `duration` | integer | No | 4 | 4 or 8 seconds |
| `movement_amplitude` | string | No | auto | auto, small, medium, large |
| `seed` | integer | No | 0 | For reproducible results |
Vidu 2.0 — Start-End-to-Video
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `prompt` | string | Yes | - | Video description (max 1500 chars) |
| `images` | array | Yes | - | Exactly 2 images: [start_frame, end_frame] (similar pixel density, ratio 0.8-1.25) |
| `duration` | integer | No | 4 | 4 or 8 seconds |
| `movement_amplitude` | string | No | auto | auto, small, medium, large |
| `seed` | integer | No | 0 | For reproducible results |
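The limits listed for the Vidu 2.0 image-to-video and start-end tables can be checked locally before a request is submitted, which avoids paying for a call that would be rejected. The following is a minimal sketch under that assumption. `validate_vidu2_params` is a hypothetical helper, and it only covers the prompt length, duration, and movement amplitude constraints documented above (it does not inspect image files or URLs).

```shell
# Hypothetical pre-flight check for Vidu 2.0 requests, based only on the limits listed above.
# Usage: validate_vidu2_params <prompt> <duration> <movement_amplitude>
validate_vidu2_params() {
  local prompt="$1" duration="$2" amplitude="$3"
  if [ -z "$prompt" ]; then
    echo "prompt is required" >&2; return 1
  fi
  # ${#prompt} counts characters in the current locale (bytes in some shells).
  if [ "${#prompt}" -gt 1500 ]; then
    echo "prompt exceeds 1500 characters" >&2; return 1
  fi
  case "$duration" in
    4|8) ;;
    *) echo "duration must be 4 or 8" >&2; return 1 ;;
  esac
  case "$amplitude" in
    auto|small|medium|large) ;;
    *) echo "movement_amplitude must be auto, small, medium, or large" >&2; return 1 ;;
  esac
  echo "ok"
}

# Example: passes all three checks and prints "ok"
validate_vidu2_params "A paper boat drifts down a stream" 4 auto
```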
Vidu 2.0 / Q1 — Reference-to-Video
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| `prompt` | string | Yes | - | Video description (max 1500 chars) |
| `images` | array | Yes | - | 1-3 reference images (PNG/JPEG/WebP, min 128×128, max 50MB) |
| `aspect_ratio` | string | No | 16:9 | 16:9, 9:16, 1:1 |
| `movement_amplitude` | string | No | auto | auto, small, medium, large |
| `seed` | integer | No | 0 | For reproducible results |
Workflow: Submit → Poll → Download
Text-to-Video Example (Q3-Pro)
# Step 1: Submit
curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vidu/q3-pro/text-to-video",
    "prompt": "A samurai walks through a bamboo forest at dawn, mist rising from the ground, cinematic lighting",
    "style": "general",
    "resolution": "1080p",
    "duration": 5,
    "aspect_ratio": "16:9",
    "movement_amplitude": "medium",
    "generate_audio": true,
    "bgm": true
  }'
# Returns: { "code": 200, "data": { "id": "prediction-id" } }

# Step 2: Poll (every 5 seconds until completed)
# Set PREDICTION_ID to the data.id value returned by Step 1
curl -s "https://api.atlascloud.ai/api/v1/model/result/$PREDICTION_ID" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY"
# Returns: { "code": 200, "data": { "status": "completed", "outputs": ["https://...video-url..."] } }

# Step 3: Download
# Set VIDEO_URL to the first entry of data.outputs returned by Step 2
curl -o output.mp4 "$VIDEO_URL"
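The three steps above pass values from one response to the next: `data.id` from the submit call feeds the poll URL, and the first entry of `data.outputs` feeds the download. A minimal sketch of that hand-off follows. The helper names `extract_prediction_id` and `extract_first_output` are hypothetical, and the `sed` patterns assume the response arrives on a single line shaped like the sample responses shown above.

```shell
# Hypothetical helpers for chaining Submit -> Poll -> Download.
# Both read a JSON response on stdin and assume it is a single line
# shaped like the sample responses in the example above.

# Print data.id from a submit response.
extract_prediction_id() {
  sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Print the first URL in data.outputs from a completed result response.
extract_first_output() {
  sed -n 's/.*"outputs"[[:space:]]*:[[:space:]]*\[[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example using the sample submit response from Step 1:
printf '%s\n' '{ "code": 200, "data": { "id": "prediction-id" } }' | extract_prediction_id
```

In practice the output of the Step 1 `curl` would be piped into `extract_prediction_id` and stored in `PREDICTION_ID`. If `jq` is installed, `jq -r '.data.id'` and `jq -r '.data.outputs[0]'` are a more robust alternative, since they parse the JSON instead of pattern-matching it.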
Image-to-Video Example (Q3-Pro)
curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vidu/q3-pro/image-to-video",
    "image": "https://example.com/landscape.jpg",
    "prompt": "The camera slowly zooms in as clouds drift across the sky and leaves rustle in the wind",
    "resolution": "720p",
    "duration": 5,
    "movement_amplitude": "small",
    "generate_audio": true
  }'
Anime Style Example (Q3-Pro)
curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vidu/q3-pro/text-to-video",
    "prompt": "An anime girl with flowing hair runs through a cherry blossom garden, petals swirling around her",
    "style": "anime",
    "resolution": "1080p",
    "duration": 5,
    "aspect_ratio": "16:9"
  }'
Start-End Interpolation Example (2.0)
curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vidu/start-end-to-video-2.0",
    "images": ["https://example.com/start-frame.jpg", "https://example.com/end-frame.jpg"],
    "prompt": "Iron Man transforms into a sports car with smooth morphing animation",
    "duration": 4,
    "movement_amplitude": "large"
  }'
Reference-to-Video Example (2.0)
curl -s -X POST "https://api.atlascloud.ai/api/v1/model/generateVideo" \
  -H "Authorization: Bearer $ATLASCLOUD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vidu/reference-to-video-2.0",
    "images": ["https://example.com/character-ref1.jpg", "https://example.com/character-ref2.jpg"],
    "prompt": "The girl walks from the painting to the room and puts the coffee cup on the table",
    "aspect_ratio": "16:9",
    "movement_amplitude": "auto"
  }'
Polling Logic
- `processing` / `starting` / `running` → wait 5s, retry (typically takes ~60-120s)
- `completed` / `succeeded` → done, get URL from `data.outputs[]`
- `failed` → error, read `data.error`
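The polling rules above can be sketched as a small loop. This is an illustration rather than an official client. The function names, the action labels (`retry`, `finished`, `failed`), and the `POLL_INTERVAL` and `MAX_ATTEMPTS` variables are all assumptions made for this example, and the status is extracted with `sed` on the assumption that the response is a single line containing one `"status"` string field.

```shell
# Map a status value to the action listed above (the action labels are this sketch's own).
poll_action() {
  case "$1" in
    processing|starting|running) echo "retry" ;;
    completed|succeeded)         echo "finished" ;;
    failed)                      echo "failed" ;;
    *)                           echo "unknown" ;;
  esac
}

# Fetch the raw JSON result for a prediction ID (endpoint as used in the examples above).
fetch_result() {
  curl -s "https://api.atlascloud.ai/api/v1/model/result/$1" \
    -H "Authorization: Bearer $ATLASCLOUD_API_KEY"
}

# Naive extraction of the "status" field without a JSON parser.
extract_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Poll until finished or failed. POLL_INTERVAL (seconds) and MAX_ATTEMPTS are
# hypothetical knobs; unknown statuses are treated like "retry".
# Prints the final response body on success; returns 1 on failure, 2 on timeout.
poll_prediction() {
  local id="$1" attempt=0 body status
  while [ "$attempt" -lt "${MAX_ATTEMPTS:-60}" ]; do
    body="$(fetch_result "$id")"
    status="$(printf '%s\n' "$body" | extract_status)"
    case "$(poll_action "$status")" in
      finished) printf '%s\n' "$body"; return 0 ;;
      failed)   printf '%s\n' "$body" >&2; return 1 ;;
    esac
    sleep "${POLL_INTERVAL:-5}"
    attempt=$((attempt + 1))
  done
  echo "timed out waiting for prediction $id" >&2
  return 2
}
```

Calling `poll_prediction "$PREDICTION_ID"` with the defaults waits up to roughly five minutes (60 attempts at 5 seconds), which leaves headroom over the typical 60-120s generation time mentioned above.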
Atlas Cloud MCP Tools (if available)
If the Atlas Cloud MCP server is configured, use built-in tools:
atlas_generate_video(model="vidu/q3-pro/text-to-video", params={...})
atlas_get_prediction(prediction_id="...")
Implementation Guide
- Determine task type:
  - Text-to-video: user describes a scene/action in text → Q3-Pro T2V
  - Image-to-video: user provides an image to animate → Q3-Pro I2V or 2.0 I2V
  - Start-end interpolation: user provides two keyframes → 2.0 Start-End
  - Character-consistent video: user provides reference images → 2.0 / Q1 Reference
  - Anime content: user wants anime style → Q3-Pro with `style="anime"`
- Choose model:
  - Q3-Pro (recommended): Latest generation, best quality, audio/BGM, up to 1080p, anime support
  - 2.0 Image-to-Video: Budget option at $0.075/video for simple animations
  - 2.0 Start-End: Unique keyframe interpolation capability
  - 2.0 Reference: Character-consistent generation from reference images ($0.2/video)
  - Q1 Reference: Highest quality reference-to-video ($0.4/video)
- Extract parameters:
  - Prompt: describe scene, action, camera movement
  - Style: general (realistic) or anime
  - Resolution: 540p for drafts, 720p default, 1080p for final output
  - Duration: Q3-Pro supports flexible durations; 2.0 supports 4 or 8 seconds
  - Movement amplitude: small for subtle motion, large for dynamic action
  - Audio: enabled by default on Q3-Pro, set `generate_audio=false` to disable
- Execute: POST to generateVideo API → poll result → download MP4
- Present result: show file path, offer to play
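The first two steps of the guide (determine the task type, then choose a model) amount to a lookup from task to model ID. A minimal sketch of that lookup is below. `choose_vidu_model` and its task and tier labels (`budget`, `best`) are hypothetical names invented for this example; only the model IDs come from the pricing tables in this document.

```shell
# Hypothetical mapping from task type to model ID, following the guide above.
# Usage: choose_vidu_model <task> [tier]
#   task: text-to-video | image-to-video | start-end | reference
#   tier: "budget" picks 2.0 image-to-video; "best" picks Q1 reference
choose_vidu_model() {
  local task="$1" tier="${2:-default}"
  case "$task" in
    text-to-video)
      echo "vidu/q3-pro/text-to-video" ;;
    image-to-video)
      if [ "$tier" = "budget" ]; then
        echo "vidu/image-to-video-2.0"
      else
        echo "vidu/q3-pro/image-to-video"
      fi ;;
    start-end)
      echo "vidu/start-end-to-video-2.0" ;;
    reference)
      if [ "$tier" = "best" ]; then
        echo "vidu/reference-to-video-q1"
      else
        echo "vidu/reference-to-video-2.0"
      fi ;;
    *)
      echo "unknown task: $task" >&2; return 1 ;;
  esac
}

# Example: anime content is a text-to-video task on Q3-Pro (with style set to anime in the request)
choose_vidu_model text-to-video
```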
Prompt Tips
- Scene + Action: "A chef flips a pancake in a busy kitchen, steam rising from the pan"
- Camera direction: "Camera slowly pans left...", "Close-up tracking shot of...", "Aerial view..."
- Anime style: Use `style: "anime"` + anime-specific prompts: "An anime warrior charges forward, energy aura glowing..."
- Movement amplitude: Use `small` for talking heads/subtle scenes, `large` for action/sports
- Start-End: Provide visually similar frames for smooth interpolation; dramatic differences work for morphing effects
- Reference: Provide clear, well-lit character references from multiple angles for best consistency