name: monet-ai-skill
description: Monet AI - Comprehensive AI content generation API for AI agents. Video generation (Sora, Veo, Doubao Seedance, Wan, Hailuo, Kling), image generation (GPT-4o, Nano Banana, Seedream, Flux, Imagen, Ideogram), and music generation (MiniMax Music). Build intelligent workflows with multi-model AI generation capabilities.
metadata:
  openclaw:
    requires:
      env:
        - MONET_API_KEY # Required: API key from monet.vision
      bins: []

Monet AI Skill

Comprehensive AI content generation API designed for AI agents. Monet AI provides unified access to state-of-the-art AI generation models for video (Sora, Veo, Doubao Seedance, Wan, Hailuo, Kling), image (GPT-4o, Nano Banana, Seedream, Flux, Imagen, Ideogram), and music (MiniMax Music) generation. Build intelligent workflows that combine multiple AI capabilities for automated content creation pipelines.

When to Use

Use this skill for:

- Video Generation: Create AI-generated videos from text prompts using state-of-the-art models
  - Sora: OpenAI's video generation model for high-quality, realistic videos
  - Veo: Google's video generation model
  - Doubao Seedance: ByteDance's AI video model with audio-visual sync
  - Wan: Alibaba's video generation model with excellent localization support
  - Hailuo: Fast video generation with good quality-speed balance
  - Kling: Kuaishou's video generation model
- Image Generation: Generate images from text descriptions with various artistic styles
  - GPT-4o: OpenAI's multimodal model for image generation
  - Nano Banana: Google's image model with ultra-high character consistency
  - Seedream: ByteDance's intelligent visual reasoning model
  - Wan: Alibaba's visual model for high-quality and expressive image generation
  - Flux: High-quality photorealistic and artistic image generation
  - Imagen: Google's text-to-image model
  - Ideogram: Specialized in text rendering and precise composition
- Music Generation: Create original music and audio from text descriptions
  - MiniMax Music: AI music generation with support for custom lyrics and text-to-music conversion
- AI Agent Integration: Build intelligent workflows that combine multiple AI generation capabilities for automated content creation pipelines

Getting API Key

1. Visit https://monet.vision and register an account.
2. After logging in, go to https://monet.vision/skills/keys and create an API key.
3. Provide the API key through the MONET_API_KEY environment variable or directly in code.

If you don't have an API Key, ask your owner to apply at monet.vision.

Quick Start

Create a Video Generation Task

curl -X POST https://monet.vision/api/v1/tasks/async \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MONET_API_KEY" \
  -d '{
    "type": "video",
    "input": {
      "model": "sora-2",
      "prompt": "A cat running in the park",
      "duration": 10,
      "aspect_ratio": "16:9"
    },
    "idempotency_key": "unique-key-123"
  }'

⚠️ Important: idempotency_key is required. Use a unique value (e.g., UUID) to prevent duplicate task creation if the request is retried.

Response:

{
  "id": "task_abc123",
  "status": "pending",
  "type": "video",
  "created_at": "2026-02-27T10:00:00Z"
}
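
As a minimal sketch of the idempotency note above, the helper below generates the key once and sends the same request body on every retry, so a retried request carries the same idempotency_key. The names buildTaskRequest and createTaskWithRetry, the retry count, and the injectable fetchImpl parameter are illustrative assumptions, not part of the Monet API. How the server answers a repeated key is not documented here.

```javascript
const { randomUUID } = require("node:crypto");

// Build the request body once; the idempotency key is generated a single time.
function buildTaskRequest(type, input) {
  return { type, input, idempotency_key: randomUUID() };
}

// Send the same body (and therefore the same idempotency_key) on every attempt.
// fetchImpl defaults to the global fetch (Node 18+); maxAttempts is an assumed value.
async function createTaskWithRetry(body, apiKey, { fetchImpl = fetch, maxAttempts = 3 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    let response;
    try {
      response = await fetchImpl("https://monet.vision/api/v1/tasks/async", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify(body),
      });
    } catch (error) {
      lastError = error; // network failure: retry with the same key
      continue;
    }
    if (response.ok) return response.json();
    lastError = new Error(`HTTP ${response.status}`);
    if (response.status < 500) break; // client errors will not succeed on retry
  }
  throw lastError;
}
```

Generating the key inside the retry loop would defeat its purpose, which is why the body is built before the first attempt.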

Get Task Status and Result

Task processing is asynchronous. You need to poll the task status until it becomes success or failed. Recommended polling interval: 5 seconds.

curl https://monet.vision/api/v1/tasks/task_abc123 \
  -H "Authorization: Bearer $MONET_API_KEY"

Response when completed:

{
  "id": "task_abc123",
  "status": "success",
  "type": "video",
  "outputs": [
    {
      "model": "sora-2",
      "status": "success",
      "progress": 100,
      "url": "https://files.monet.vision/..."
    }
  ],
  "created_at": "2026-02-27T10:00:00Z",
  "updated_at": "2026-02-27T10:01:30Z"
}

Example: Poll until completion

const TASK_ID = "task_abc123";
const MONET_API_KEY = process.env.MONET_API_KEY;

// Poll the task every 5 seconds until it reaches a terminal status.
async function pollTask() {
  while (true) {
    const response = await fetch(
      `https://monet.vision/api/v1/tasks/${TASK_ID}`,
      {
        headers: {
          Authorization: `Bearer ${MONET_API_KEY}`,
        },
      },
    );

    // Stop on HTTP errors instead of treating an error body as a task.
    if (!response.ok) {
      throw new Error(`Request failed with HTTP ${response.status}`);
    }

    const data = await response.json();
    const status = data.status;

    if (status === "success") {
      console.log("Task completed successfully!");
      console.log(JSON.stringify(data, null, 2));
      break;
    } else if (status === "failed") {
      console.log("Task failed!");
      console.log(JSON.stringify(data, null, 2));
      break;
    } else {
      console.log(`Task status: ${status}, waiting...`);
      await new Promise((resolve) => setTimeout(resolve, 5000)); // Wait 5 seconds
    }
  }
}

// Report request errors and exit with a non-zero code.
pollTask().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});
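
The loop above keeps polling until the task reaches a terminal status. A variant that also gives up after a fixed waiting time might look like the sketch below. The function name waitForTask, its options, and the 10-minute default timeout are assumptions chosen for illustration; only the endpoint, the success and failed statuses, and the 5-second interval come from this document.

```javascript
// Poll a task until it succeeds, fails, or the (assumed) timeout elapses.
async function waitForTask(taskId, apiKey, options = {}) {
  const { fetchImpl = fetch, intervalMs = 5000, timeoutMs = 10 * 60 * 1000 } = options;
  const deadline = Date.now() + timeoutMs;

  while (true) {
    const response = await fetchImpl(`https://monet.vision/api/v1/tasks/${taskId}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!response.ok) throw new Error(`HTTP ${response.status}`);

    const task = await response.json();
    if (task.status === "success") return task;
    if (task.status === "failed") throw new Error(`Task ${taskId} failed`);

    // Give up if the next poll would land after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Task ${taskId} did not finish within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

On success the resolved task object has the shape shown in the completed response above, so the result URL would be read from task.outputs[0].url.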

Supported Models

Video Generation

Sora (OpenAI)

sora-2 - Sora 2

OpenAI's latest video generation model

🎯 Use Cases: Video projects requiring OpenAI's latest technology
⏱️ Duration: 10-15 seconds
🎵 Features: Audio generation support, reference image support

{
  model: "sora-2",
  prompt: string,                // Required
  images?: string[],             // Optional: Reference images
  duration?: 10 | 15,           // Optional, default: 10
  aspect_ratio?: "16:9" | "9:16"
}

sora-2-pro - Sora 2 Pro

Perfect quality for cinematic scenes

🎯 Use Cases: Professional film, advertising, and high-end production
⏱️ Duration: 15-25 seconds
🎵 Features: Audio generation support, reference image support

{
  model: "sora-2-pro",
  prompt: string,
  images?: string[],
  duration?: 15 | 25,           // Optional, default: 15
  aspect_ratio?: "16:9" | "9:16"
}

Veo (Google)

veo-3-1-fast - Google Veo 3.1 Fast

Ultra-fast video generation

🎯 Use Cases: Video projects requiring fast generation
⏱️ Duration: 8 seconds
📺 Resolution: 1080p with audio generation support

{
  model: "veo-3-1-fast",
  prompt: string,
  images?: string[],             // Reference images
  aspect_ratio?: "16:9" | "9:16"
}

veo-3-1 - Google Veo 3.1

Advanced AI video with sound

🎯 Use Cases: Professional-grade video production
⏱️ Duration: 8 seconds
📺 Resolution: 1080p with audio generation support

{
  model: "veo-3-1",
  prompt: string,
  images?: string[],
  aspect_ratio?: "16:9" | "9:16"
}

veo-3-fast - Google Veo 3 Fast

30% faster than standard Veo 3

🎯 Use Cases: Video projects requiring rapid iteration
⏱️ Duration: 8 seconds
📺 Resolution: 1080p, supports negative prompts

{
  model: "veo-3-fast",
  prompt: string,
  images?: string[],
  negative_prompt?: string       // Specify unwanted content
}

veo-3 - Google Veo 3

High-quality video generation

🎯 Use Cases: Standard high-quality video production
⏱️ Duration: 8 seconds
📺 Resolution: 1080p, supports negative prompts

{
  model: "veo-3",
  prompt: string,
  images?: string[],
  negative_prompt?: string
}

Wan

wan-2-6 - Wan 2.6

Multi-shot and automatic audio

🎯 Use Cases: Video production requiring multi-shot switching
⏱️ Duration: 5-15 seconds
📺 Resolution: 720p-1080p with audio generation support

{
  model: "wan-2-6",
  prompt: string,
  images?: string[],
  duration?: 5 | 10 | 15,
  resolution?: "720p" | "1080p",
  aspect_ratio?: "16:9" | "9:16" | "4:3" | "3:4" | "1:1",
  shot_type?: "single" | "multi"  // Single/multi-shot switching
}

wan-2-5 - Wan 2.5

Supports automatic audio generation

🎯 Use Cases: Quickly generating videos with audio
⏱️ Duration: 5-10 seconds
📺 Resolution: 480p-1080p with audio support

{
  model: "wan-2-5",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  resolution?: "480p" | "720p" | "1080p",
  aspect_ratio?: "16:9" | "9:16" | "4:3" | "3:4" | "1:1"
}

wan-2-2-flash - Wan 2.2 Flash

Instruction understanding, controllable camera movement

🎯 Use Cases: Scenarios requiring precise camera movement control
⏱️ Duration: 5-10 seconds
📺 Resolution: 480p-1080p

{
  model: "wan-2-2-flash",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  resolution?: "480p" | "720p" | "1080p",
  negative_prompt?: string
}

wan-2-2 - Wan 2.2

Excellent image details, strong motion stability

🎯 Use Cases: Video production requiring high stability
⏱️ Duration: 5-10 seconds
📺 Resolution: 480p-1080p

{
  model: "wan-2-2",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  resolution?: "480p" | "1080p",
  aspect_ratio?: "16:9" | "9:16" | "4:3" | "3:4" | "1:1",
  negative_prompt?: string
}

Kling

kling-2-6 - Kling 2.6

Cinematic videos and audio

🎯 Use Cases: Cinematic video production
⏱️ Duration: 5-10 seconds
✨ Features: Strong visual realism, audio generation support

{
  model: "kling-2-6",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  aspect_ratio?: "1:1" | "16:9" | "9:16",
  generate_audio?: boolean
}

kling-2-5 - Kling 2.5 Turbo

Smooth motion, stronger consistency

🎯 Use Cases: Video production requiring high consistency
⏱️ Duration: 5-10 seconds
✨ Features: Supports negative prompts

{
  model: "kling-2-5",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  aspect_ratio?: "1:1" | "16:9" | "9:16",
  negative_prompt?: string
}

kling-v2-1-master - Kling 2.1 Master

Strong visual realism with enhanced features

🎯 Use Cases: Professional-grade high-quality video production
⏱️ Duration: 5-10 seconds
✨ Features: Strength adjustment support, negative prompts

{
  model: "kling-v2-1-master",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  aspect_ratio?: "1:1" | "16:9" | "9:16",
  strength?: number,            // 0-1: Control generation effect
  negative_prompt?: string
}

kling-v2-1 - Kling 2.1

Strong visual realism

🎯 Use Cases: High-realism video production
⏱️ Duration: 5-10 seconds
✨ Features: Strength adjustment, negative prompts

{
  model: "kling-v2-1",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  aspect_ratio?: "1:1" | "16:9" | "9:16",
  strength?: number,            // 0-1
  negative_prompt?: string
}

kling-v2 - Kling 2.0

Excellent aesthetics

🎯 Use Cases: Artistic creation and aesthetically-oriented videos
⏱️ Duration: 5-10 seconds
✨ Features: Strength adjustment, negative prompts

{
  model: "kling-v2",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  aspect_ratio?: "1:1" | "16:9" | "9:16",
  strength?: number,            // 0-1
  negative_prompt?: string
}

Hailuo

hailuo-2-3 - Hailuo 2.3

Excellent body movements and physics performance

🎯 Use Cases: Videos requiring realistic physics effects
⏱️ Duration: 6-10 seconds
📺 Resolution: 768p-1080p, extreme physics simulations

{
  model: "hailuo-2-3",
  prompt: string,
  images?: string[],
  duration?: 6 | 10,
  resolution?: "768p" | "1080p"
}

hailuo-2-3-fast - Hailuo 2.3 Fast

Fast generation speed

🎯 Use Cases: Projects requiring rapid iteration
⏱️ Duration: 6-10 seconds
📺 Resolution: 768p-1080p

{
  model: "hailuo-2-3-fast",
  prompt: string,
  images?: string[],
  duration?: 6 | 10,
  resolution?: "768p" | "1080p"
}

hailuo-02 - Hailuo 02

Extreme physics simulations

🎯 Use Cases: Scenarios requiring accurate physics simulation
⏱️ Duration: 6-10 seconds
📺 Resolution: 768p-1080p

{
  model: "hailuo-02",
  prompt: string,
  images?: string[],
  duration?: 6 | 10,
  resolution?: "768p" | "1080p"
}

hailuo-01-live2d - Hailuo 01 Live2D

Hailuo Live2D model

🎯 Use Cases: 2D character animation production
✨ Features: Suitable for 2D character animation

{
  model: "hailuo-01-live2d",
  prompt: string,
  images?: string[]
}

hailuo-01 - Hailuo 01

Highest video quality

🎯 Use Cases: Video production requiring ultimate quality
✨ Features: Suitable for high-quality needs

{
  model: "hailuo-01",
  prompt: string,
  images?: string[]
}

Doubao Seedance

doubao-seedance-1-5-pro - Seedance 1.5 Pro

Pro-grade audio-visual sync

🎯 Use Cases: Professional production requiring audio-visual sync
⏱️ Duration: 4-12 seconds
📺 Resolution: 480p-720p with audio generation support

{
  model: "doubao-seedance-1-5-pro",
  prompt: string,
  images?: string[],
  duration?: number,
  resolution?: "480p" | "720p",
  aspect_ratio?: "1:1" | "4:3" | "16:9" | "3:4" | "9:16" | "21:9",
  generate_audio?: boolean
}

doubao-seedance-1-0-pro-fast - Seedance 1.0 Pro Fast

Premium quality & unbeatable efficiency

🎯 Use Cases: Scenarios requiring fast high-quality output
⏱️ Duration: 2-12 seconds
📺 Resolution: 720p-1080p, ByteDance's next-gen AI video model

{
  model: "doubao-seedance-1-0-pro-fast",
  prompt: string,
  images?: string[],
  duration?: number,
  resolution?: "720p" | "1080p",
  aspect_ratio?: "1:1" | "4:3" | "16:9" | "3:4" | "9:16" | "21:9"
}

doubao-seedance-1-0-pro - Seedance 1.0 Pro

Stable motion performance

🎯 Use Cases: Video production requiring stable motion
⏱️ Duration: 5-10 seconds
📺 Resolution: 480p-1080p

{
  model: "doubao-seedance-1-0-pro",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  resolution?: "480p" | "1080p",
  aspect_ratio?: "1:1" | "4:3" | "16:9" | "3:4" | "9:16"
}

doubao-seedance-1-0-lite - Seedance 1.0 Lite

Precise semantic understanding

🎯 Use Cases: Scenarios requiring precise semantic understanding
⏱️ Duration: 5-10 seconds
📺 Resolution: 480p-1080p

{
  model: "doubao-seedance-1-0-lite",
  prompt: string,
  images?: string[],
  duration?: 5 | 10,
  resolution?: "480p" | "720p" | "1080p"
}
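
The duration constraints in the video schemas above can be checked on the client before a task is submitted. The sketch below covers only a handful of the listed models; the table name DURATION_RULES and the function checkDuration are hypothetical, and treating the Seedance ranges as inclusive bounds is an assumption based on the "Duration" lines rather than documented API behavior.

```javascript
// Allowed durations in seconds, copied from the schemas in this section (subset).
const DURATION_RULES = {
  "sora-2": { oneOf: [10, 15] },
  "sora-2-pro": { oneOf: [15, 25] },
  "wan-2-6": { oneOf: [5, 10, 15] },
  "kling-2-6": { oneOf: [5, 10] },
  "hailuo-2-3": { oneOf: [6, 10] },
  "doubao-seedance-1-5-pro": { min: 4, max: 12 }, // assumed inclusive range
  "doubao-seedance-1-0-pro-fast": { min: 2, max: 12 }, // assumed inclusive range
};

// Return null when the duration is acceptable (or not checked), otherwise a message.
function checkDuration(model, duration) {
  const rule = DURATION_RULES[model];
  if (rule === undefined || duration === undefined) return null;
  if (rule.oneOf) {
    return rule.oneOf.includes(duration)
      ? null
      : `${model} accepts duration ${rule.oneOf.join(" or ")}, got ${duration}`;
  }
  const inRange = duration >= rule.min && duration <= rule.max;
  return inRange ? null : `${model} accepts duration ${rule.min}-${rule.max}, got ${duration}`;
}

console.log(checkDuration("sora-2", 5)); // sora-2 accepts duration 10 or 15, got 5
console.log(checkDuration("wan-2-6", 15)); // null
```

Models missing from the table are passed through unchecked, so the server remains the final authority on what is accepted.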

Special Features

kling-motion-control - Kling Motion Control

Precision motion control via video references

🎯 Use Cases: Scenarios requiring motion replication from reference videos
⏱️ Duration: 3-30 seconds
📺 Resolution: 720p/1080p with audio generation support
💰 Pricing: 720p: 8 credits/s, 1080p: 15 credits/s

{
  model: "kling-motion-control",
  prompt: string,                // Required: Detailed motion description
  images: string[],              // Required: min 1 reference image
  videos: string[],              // Required: min 1 reference video
  resolution?: "720p" | "1080p"
}

runway-act-two - Runway Act Two

Runway's next-generation motion capture model

🎯 Use Cases: Capturing motion from videos and applying to new characters
⏱️ Duration: 3-30 seconds
✨ Features: Motion transfer support
💰 Pricing: 10 credits/second

{
  model: "runway-act-two",
  images: string[],              // Required: min 1 target character image
  videos: string[],              // Required: min 1 motion reference video
  aspect_ratio?: "1:1" | "4:3" | "16:9" | "3:4" | "9:16" | "21:9"
}

wan-animate-mix - Wan Animate Mix (Standard)

Perfect for character replacement scenarios

🎯 Use Cases: Video character replacement
⏱️ Duration: 3-30 seconds
✨ Features: Replace characters in videos with specified image characters
💰 Pricing: 10 credits/second

{
  model: "wan-animate-mix",
  videos: string[],              // Required: Original videos
  images: string[]               // Required: Target character images
}

wan-animate-mix-pro - Wan Animate Mix Pro (Professional)

High animation fluidity with better results

🎯 Use Cases: Professional-grade video character replacement
⏱️ Duration: 3-30 seconds
✨ Features: Higher quality character replacement effects
💰 Pricing: 20 credits/second

{
  model: "wan-animate-mix-pro",
  videos: string[],              // Required
  images: string[]               // Required
}

wan-animate-move - Wan Animate Move (Standard)

Replicate dance and challenging body movements

🎯 Use Cases: Motion capture and transfer
⏱️ Duration: 3-30 seconds
✨ Features: Apply motion from reference videos to target images
💰 Pricing: 10 credits/second

{
  model: "wan-animate-move",
  videos: string[],              // Required: Motion reference videos
  images: string[]               // Required: Target character images
}

wan-animate-move-pro - Wan Animate Move Pro (Professional)

High animation fluidity with better results

🎯 Use Cases: Professional-grade motion capture and transfer
⏱️ Duration: 3-30 seconds
✨ Features: Higher quality motion transfer effects
💰 Pricing: 20 credits/second

{
  model: "wan-animate-move-pro",
  videos: string[],              // Required
  images: string[]               // Required
}
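
The per-second pricing listed for the models in this section lends itself to a quick cost estimate. The sketch below multiplies the listed rate by the clip length; the names CREDITS_PER_SECOND and estimateCredits are hypothetical, and linear billing without rounding or minimum charges is an assumption, since this document states only the rates.

```javascript
// Credits per second, copied from the pricing lines in this section.
const CREDITS_PER_SECOND = {
  "kling-motion-control": { "720p": 8, "1080p": 15 },
  "runway-act-two": 10,
  "wan-animate-mix": 10,
  "wan-animate-mix-pro": 20,
  "wan-animate-move": 10,
  "wan-animate-move-pro": 20,
};

// Estimate credits as rate x seconds (assumed linear billing, no rounding).
function estimateCredits(model, seconds, resolution) {
  const rate = CREDITS_PER_SECOND[model];
  if (rate === undefined) throw new Error(`No pricing listed for ${model}`);
  if (typeof rate === "number") return rate * seconds;
  const perSecond = rate[resolution];
  if (perSecond === undefined) {
    throw new Error(`No ${resolution} pricing listed for ${model}`);
  }
  return perSecond * seconds;
}

console.log(estimateCredits("kling-motion-control", 10, "1080p")); // 150
console.log(estimateCredits("wan-animate-move-pro", 30)); // 600
```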

Image Generation

GPT (OpenAI)

gpt-4o - GPT 4o

Accurate, realistic output

🎯 Use Cases: High-quality, photorealistic image generation
✨ Features: Supports multiple reference images, multiple aspect ratios, customizable style

{
  model: "gpt-4o",
  prompt: string,
  images?: string[],             // Reference images for style guidance
  aspect_ratio?: "1:1" | "4:3" | "3:2" | "16:9" | "3:4" | "2:3" | "9:16",
  style?: string                 // Custom style description
}

gpt-image-1-5 - GPT Image 1.5

True-color precision rendering

🎯 Use Cases: Professional image generation requiring color accuracy
✨ Features: Supports up to 10 reference images, adjustable quality

{
  model: "gpt-image-1-5",
  prompt: string,
  images?: string[],             // max 10 reference images
  aspect_ratio?: "1:1" | "3:2" | "2:3",
  quality?: "auto" | "low" | "medium" | "high"
}

Nano Banana (Google)

nano-banana-1 - Google Nano Banana

Ultra-high character consistency

🎯 Use Cases: Image series requiring consistent character appearance
✨ Features: Supports up to 5 reference images, multiple aspect ratio options

{
  model: "nano-banana-1",
  prompt: string,
  images?: string[],             // max 5 reference images
  aspect_ratio?: "1:1" | "2:3" | "3:2" | "4:3" | "3:4" | "16:9" | "9:16"
}

nano-banana-1-pro - Nano Banana Pro

Google's flagship generation model

🎯 Use Cases: Professional-grade high-quality image generation
✨ Features: Supports 1K-4K resolution, up to 14 reference images, ultra-wide 21:9

{
  model: "nano-banana-1-pro",
  prompt: string,
  images?: string[],             // max 14 reference images
  aspect_ratio?: "1:1" | "2:3" | "3:2" | "4:3" | "3:4" | "4:5" | "5:4" | "16:9" | "9:16" | "21:9",
  resolution?: "1K" | "2K" | "4K"
}

nano-banana-2 - Nano Banana 2

Google Gemini's latest model

🎯 Use Cases: Latest technology for high-quality image generation
✨ Features: Supports 1K-4K resolution, up to 14 reference images, ultra-wide 8:1 ratio

{
  model: "nano-banana-2",
  prompt: string,
  images?: string[],             // max 14 reference images
  aspect_ratio?: "1:1" | "2:3" | "3:2" | "4:3" | "3:4" | "4:5" | "5:4" | "16:9" | "9:16" | "21:9" | "4:1" | "1:4" | "8:1" | "1:8",
  resolution?: "1K" | "2K" | "4K"
}

Wan

wan-i-2-6 - Wan 2.6

High-quality and expressive

🎯 Use Cases: Creative image generation requiring high expressiveness
✨ Features: Supports up to 4 reference images, ultra-wide 21:9

{
  model: "wan-i-2-6",
  prompt: string,
  images?: string[],             // max 4 reference images
  aspect_ratio?: "1:1" | "4:3" | "3:2" | "16:9" | "3:4" | "2:3" | "9:16" | "21:9"
}

wan-2-5 - Wan 2.5

Fast, creative image generation

🎯 Use Cases: Quick creation and iteration
✨ Features: Supports up to 2 reference images, ultra-wide 21:9

{
  model: "wan-2-5",
  prompt: string,
  images?: string[],             // max 2 reference images
  aspect_ratio?: "1:1" | "4:3" | "3:2" | "16:9" | "3:4" | "2:3" | "9:16" | "21:9"
}

Seedream (ByteDance)

seedream-5-0 - Seedream 5.0 Lite

Intelligent visual reasoning

🎯 Use Cases: Complex scenarios requiring intelligent understanding and reasoning
✨ Features: 2K-3K resolution, up to 14 reference images, ultra-wide 21:9

{
  model: "seedream-5-0",
  prompt: string,
  images?: string[],             // max 14 reference images
  aspect_ratio?: "1:1" | "2:3" | "3:2" | "3:4" | "4:3" | "4:5" | "5:4" | "9:16" | "16:9" | "21:9",
  resolution?: "2K" | "3K"
}

seedream-4-5 - Seedream 4.5

ByteDance's 4K image model

🎯 Use Cases: High-resolution professional image generation
✨ Features: 2K-4K resolution, up to 14 reference images, ultra-wide 21:9

{
  model: "seedream-4-5",
  prompt: string,
  images?: string[],             // max 14 reference images
  aspect_ratio?: "1:1" | "2:3" | "3:2" | "3:4" | "4:3" | "4:5" | "5:4" | "9:16" | "16:9" | "21:9",
  resolution?: "2K" | "4K"
}

seedream-4-0 - Seedream 4.0

Supports images with cohesive styles

🎯 Use Cases: Image series requiring consistent style
✨ Features: Supports up to 10 reference images

{
  model: "seedream-4-0",
  prompt: string,
  images?: string[],             // max 10 reference images
  aspect_ratio?: "1:1" | "4:3" | "3:2" | "16:9" | "3:4" | "2:3" | "9:16"
}

Flux (Black Forest Labs)

flux-2-dev - Flux.2 Dev

Photorealistic output

🎯 Use Cases: Image generation requiring high photorealism
✨ Features: Model by Black Forest Labs, multiple aspect ratio options

{
  model: "flux-2-dev",
  prompt: string,
  aspect_ratio?: "1:1" | "4:3" | "3:2" | "16:9" | "3:4" | "2:3" | "9:16"
}

flux-kontext-pro - Flux Kontext Pro

Perfect for editing, compositing

🎯 Use Cases: Professional image editing and compositing work
✨ Features: Supports reference images, customizable style

{
  model: "flux-kontext-pro",
  prompt: string,
  images?: string[],
  aspect_ratio?: "1:1" | "4:3" | "3:2" | "16:9" | "3:4" | "2:3" | "9:16",
  style?: string
}

flux-kontext-max - Flux Kontext Max

Excellent for prompt accuracy

🎯 Use Cases: Scenarios requiring precise control of generation results
✨ Features: Supports reference images, customizable style

{
  model: "flux-kontext-max",
  prompt: string,
  images?: string[],
  aspect_ratio?: "1:1" | "4:3" | "3:2" | "16:9" | "3:4" | "2:3" | "9:16",
  style?: string
}

flux-1-schnell - Flux Schnell

Suitable for simple basic scenes

🎯 Use Cases: Quick prototyping and simple scenarios
✨ Features: Fast generation speed

{
  model: "flux-1-schnell",
  prompt: string
}

Imagen (Google)

imagen-3-0 - Imagen 3.0

Fast, high-quality results

🎯 Use Cases: Fast high-quality image generation
✨ Features: Google's advanced image model, customizable style

{
  model: "imagen-3-0",
  prompt: string,
  aspect_ratio?: "1:1" | "3:4" | "4:3" | "9:16" | "16:9",
  style?: string
}

imagen-4-0 - Imagen 4.0

Google's latest generation model

🎯 Use Cases: High-quality images requiring latest technology
✨ Features: Higher quality and precision, customizable style

{
  model: "imagen-4-0",
  prompt: string,
  aspect_ratio?: "1:1" | "3:4" | "4:3" | "9:16" | "16:9",
  style?: string
}

Ideogram

ideogram-v2 - Ideogram V2

Highly recommended for text editing

🎯 Use Cases: Scenarios requiring text in images
✨ Features: Excellent text rendering performance

{
  model: "ideogram-v2",
  prompt: string,
  aspect_ratio?: "1:1" | "4:3" | "3:2" | "16:9" | "3:4" | "2:3" | "9:16",
  style?: string
}

ideogram-v3 - Ideogram V3

Outstanding design capabilities

🎯 Use Cases: First choice for designers and creative professionals
✨ Features: Better text rendering and typography

{
  model: "ideogram-v3",
  prompt: string,
  aspect_ratio?: "1:1" | "4:3" | "3:2" | "16:9" | "3:4" | "2:3" | "9:16",
  style?: string
}

Stability AI

stability-1-0 - Stability 1.0

Perfect for generating detailed images

🎯 Use Cases: Image generation requiring fine control and high detail
✨ Features: Supports negative prompts, customizable style

{
  model: "stability-1-0",
  prompt: string,
  aspect_ratio?: "1:1" | "4:3" | "3:2" | "16:9" | "3:4" | "2:3" | "9:16",
  style?: string,
  negative_prompt?: string       // Specify unwanted content
}
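
Several image models above cap the number of reference images. A client-side check could be sketched as follows; the table name MAX_REFERENCE_IMAGES and the function checkReferenceImages are hypothetical, the limits are copied from the schema comments in this section, and how the API reacts when a limit is exceeded is not documented here.

```javascript
// Maximum reference images per model, from the schema comments in this section.
const MAX_REFERENCE_IMAGES = {
  "gpt-image-1-5": 10,
  "nano-banana-1": 5,
  "nano-banana-1-pro": 14,
  "nano-banana-2": 14,
  "wan-i-2-6": 4,
  "wan-2-5": 2,
  "seedream-5-0": 14,
  "seedream-4-5": 14,
  "seedream-4-0": 10,
};

// Return null when the image count is within the listed limit, otherwise a message.
function checkReferenceImages(model, images = []) {
  const limit = MAX_REFERENCE_IMAGES[model];
  if (limit === undefined) return null; // no limit listed for this model
  return images.length <= limit
    ? null
    : `${model} accepts at most ${limit} reference images, got ${images.length}`;
}

const tooMany = ["a.png", "b.png", "c.png", "d.png", "e.png"];
console.log(checkReferenceImages("wan-i-2-6", tooMany));
// wan-i-2-6 accepts at most 4 reference images, got 5
```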

Music Generation

minimax-music - MiniMax Music

AI music generation from text with custom lyrics support

🎯 Provider: MiniMax
✨ Features: Text-to-music conversion, supports custom lyrics
🎵 Use Cases: Music creation from text descriptions or lyrics

{
  model: "minimax-music",
  prompt: string,                // Required: Music generation description (max 300 characters)
  lyrics?: string                // Optional: Custom lyrics (max 3000 characters)
}
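
The length limits on prompt and lyrics can be enforced before the request is sent. In the sketch below, buildMusicRequest is a hypothetical helper, and two things are assumptions rather than documented facts: that music tasks use "type": "music" (this document only shows "video"), and that the character limits can be approximated with JavaScript string length.

```javascript
// Build a task body for minimax-music, enforcing the documented length limits.
// ASSUMPTION: the task type for music is "music"; this document shows only "video".
function buildMusicRequest(prompt, lyrics, idempotencyKey) {
  if (typeof prompt !== "string" || prompt.length === 0) {
    throw new Error("prompt is required");
  }
  if (prompt.length > 300) {
    throw new Error(`prompt is ${prompt.length} characters, max 300`);
  }
  if (lyrics !== undefined && lyrics.length > 3000) {
    throw new Error(`lyrics are ${lyrics.length} characters, max 3000`);
  }

  const input = { model: "minimax-music", prompt };
  if (lyrics !== undefined) input.lyrics = lyrics; // omit the field when unused
  return { type: "music", input, idempotency_key: idempotencyKey };
}

const body = buildMusicRequest("Upbeat synth-pop with a driving bass line", undefined, "unique-key-456");
console.log(JSON.stringify(body, null, 2));
```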

API Reference

Create Task (Async)

POST /api/v1/tasks/async - Create an async task. Returns immediately with task ID.

Request:

curl -X POST https://monet.vision/api/v1/tasks/async \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MONET_API_KEY" \
  -d '{
    "type": "video",
    "input": {
      "model": "sora-2",
      "prompt": "A cat running"
    },
    "idempotency_key": "unique-key-123"
  }'

⚠️ Important: idempotency_key is required. Use a unique value (e.g., UUID) to prevent duplicate task creation if the request is retried.

Response:

{
  "id": "task_abc123",
  "status": "pending",
  "type": "video",
  "created_at": "2026-02-27T10:00:00Z"
}

Create Task (Streaming)

POST /api/v1/tasks/sync - Create a task with SSE streaming. Waits for completion and streams progress.

Request:

curl -X POST https://monet.vision/api/v1/tasks/sync \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MONET_API_KEY" \
  -N \
  -d '{
    "type": "video",
    "input": {
      "model": "sora-2",
      "prompt": "A cat running"
    },
    "idempotency_key": "unique-key-123"
  }'
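
This document does not describe the events that the streaming endpoint emits, so the sketch below handles only the generic Server-Sent Events framing: events separated by a blank line, with optional event: and data: fields. The function name parseSseBuffer is hypothetical, and the payload inside data is returned as a raw string because its format is not specified here.

```javascript
// Split buffered SSE text into complete events plus a trailing partial remainder.
function parseSseBuffer(buffer) {
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n");
  const remainder = blocks.pop(); // the last piece may be an incomplete event

  const events = [];
  for (const block of blocks) {
    let eventName = "message"; // SSE default when no event: field is present
    const dataLines = [];
    for (const line of block.split("\n")) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      if (line.startsWith("event:")) {
        eventName = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    if (dataLines.length > 0) {
      events.push({ event: eventName, data: dataLines.join("\n") });
    }
  }
  return { events, remainder };
}
```

When reading the response body chunk by chunk, the returned remainder would be prepended to the next chunk before parsing again, so that an event split across two chunks is not lost.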

Get Task

GET /api/v1/tasks/{taskId} - Get task status and result.

Request:

curl https://monet.vision/api/v1/tasks/task_abc123 \
  -H "Authorization: Bearer $MONET_API_KEY"

Response:

{
  "id": "task_abc123",
  "status": "success",
  "type": "video",
  "outputs": [
    {
      "model": "sora-2",
      "status": "success",
      "progress": 100,
      "url": "https://files.monet.vision/..."
    }
  ],
  "created_at": "2026-02-27T10:00:00Z",
  "updated_at": "2026-02-27T10:01:30Z"
}

List Tasks

GET /api/v1/tasks/list - List tasks with pagination.

Request:

curl "https://monet.vision/api/v1/tasks/list?page=1&pageSize=20" \
  -H "Authorization: Bearer $MONET_API_KEY"

Response:

{
  "tasks": [
    {
      "id": "task_abc123",
      "status": "success",
      "type": "video",
      "outputs": [
        {
          "model": "sora-2",
          "status": "success",
          "progress": 100,
          "url": "https://files.monet.vision/..."
        }
      ],
      "created_at": "2026-02-27T10:00:00Z",
      "updated_at": "2026-02-27T10:01:30Z"
    }
  ],
  "page": 1,
  "pageSize": 20,
  "total": 100
}
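
Walking through every page of the list endpoint could be sketched as below. The helpers pageCount and listAllTasks are hypothetical, and the code assumes that total is the number of tasks across all pages, which the sample response suggests but this document does not state.

```javascript
// Number of pages needed to cover `total` items.
function pageCount(total, pageSize) {
  return Math.ceil(total / pageSize);
}

// Fetch each page in order and collect the tasks.
// ASSUMPTION: `total` counts tasks across all pages, not pages.
async function listAllTasks(apiKey, { fetchImpl = fetch, pageSize = 20 } = {}) {
  const tasks = [];
  let page = 1;
  let pages = 1;
  do {
    const url = `https://monet.vision/api/v1/tasks/list?page=${page}&pageSize=${pageSize}`;
    const response = await fetchImpl(url, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!response.ok) throw new Error(`HTTP ${response.status}`);

    const body = await response.json();
    tasks.push(...body.tasks);
    pages = pageCount(body.total, pageSize);
    page++;
  } while (page <= pages);
  return tasks;
}
```

With the sample response above (total 100, pageSize 20), pageCount returns 5, so five requests would be made.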

Upload File

POST /api/v1/files - Upload a file to get an online access URL.

📁 File Storage: Uploaded files are stored for 24 hours and will be automatically deleted after expiration.

Request:

curl -X POST https://monet.vision/api/v1/files \
  -H "Authorization: Bearer $MONET_API_KEY" \
  -F "file=@/path/to/your/file.mp4" \
  -v

Use Cases:

Upload reference images for video/image generation tasks
Upload video files for video processing
Upload audio files for music tasks
Get temporary online URLs for file sharing

Response:

{
  "id": "file_xyz789",
  "url": "...",
  "filename": "file.mp4",
  "size": 1048576,
  "content_type": "video/mp4",
  "created_at": "2026-02-27T10:00:00Z"
}
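
The same upload can be done from JavaScript with multipart form data. The sketch below mirrors the curl request above, using the file field name from the -F flag; the function name uploadFile and its parameter order are hypothetical, and it relies on the global fetch, FormData, and Blob available in Node 18 and later.

```javascript
// Upload in-memory bytes as multipart/form-data under the "file" field.
async function uploadFile(bytes, filename, contentType, apiKey, fetchImpl = fetch) {
  const form = new FormData();
  form.append("file", new Blob([bytes], { type: contentType }), filename);

  const response = await fetchImpl("https://monet.vision/api/v1/files", {
    method: "POST",
    // No Content-Type header here: fetch adds the multipart boundary itself.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return response.json();
}
```

The url field of the response is presumably what goes into the images or videos arrays of a task input. Because uploads are deleted after 24 hours, a task that references one should be created within that window.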

Configuration

Environment Variables

export MONET_API_KEY="monet_xxx"

Authentication

All API requests require authentication via the Authorization header:

Authorization: Bearer monet_xxx
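
In JavaScript, the header can be built from the environment variable in one place so that a missing key fails early. This is a minimal sketch; the helper name authHeaders is hypothetical.

```javascript
// Read the API key from the environment and build the Authorization header.
function authHeaders(env = process.env) {
  const apiKey = env.MONET_API_KEY;
  if (!apiKey) {
    throw new Error("MONET_API_KEY is not set");
  }
  return { Authorization: `Bearer ${apiKey}` };
}

console.log(authHeaders({ MONET_API_KEY: "monet_xxx" }));
// { Authorization: 'Bearer monet_xxx' }
```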
