Generate images, videos, and audio via fal.ai API (FLUX, SDXL, Whisper, etc.)
Local Spanish TTS using Microsoft VibeVoice. Generate natural voice audio from text, optimized for WhatsApp voice messages.
Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).
Generate detailed, production-ready cinematic video prompts following Seedance 2.0’s strict Subject-Action-Camera-Style-Audio-Constraints format for AI video...
Playful virtual girlfriend voice companion. Use when the user wants short, flirty, friendly text replies returned as Bulbul v3 audio across chat channels (Discord/Telegram/WhatsApp). Generate a brief
case.dev — a legal AI platform with encrypted document vaults, OCR, audio transcription, and legal search. This skill installs the casedev CLI and provides s...
Generate AI music with optimized prompts, style control, and production-ready audio output.
AI-powered game asset generation guide covering 2D sprites, tilemaps, UI elements, audio, music, and 3D models. Use when generating game assets with AI tools...
Transcribe audio to text using Venice AI's Whisper-based speech recognition. Supports WAV, MP3, FLAC, M4A, AAC formats with optional timestamps.
Create Korean AI podcast packages from QuickView trend notes. Use for dual-host script writing (Callie × Nick), Gemini multi-speaker TTS audio generation, subtitle timing/render fixes, thumbnail+MP4
Analyze videos from TikTok, YouTube, Instagram, Twitter, and others by URL, transcribing audio locally and answering questions about the content.
Text-to-speech conversion using Zhipu AI (BigModel) GLM-TTS model. Use when you need to convert text to audio files with various voice options. Supports Chin...
Search and directly download free images, audio, music, sound effects, videos, and 3D models from WebSim's large, auth-free digital asset library.
Convert text to speech using Volcengine TTS with preset or cloned voices and send audio messages to Feishu chats or groups.
Translate and dub videos from one language to another, replacing the original audio with TTS while keeping the video intact.
Seedance 2.0 AI video generation via EvoLink API. Text-to-video, image-to-video with auto audio (voice, SFX, BGM). Works with OpenClaw, Claude Code, Cursor....
RoomSound gives your agent the skill to play audio to your speakers. Starting with YouTube to Bluetooth speakers, expanding to local files and other sources.
Use ConvertAgent for file format conversions through the local CLI. Trigger for any request to convert files (documents, images, audio, video, spreadsheets,...
Convert PDF, DOCX, XLSX, PPTX, images, audio, and 25+ file formats to clean Markdown using the Markdown Anything API.
Text-to-speech using Kokoro local TTS. Use when the user wants to convert text to audio, read aloud, or generate speech.
Local text-to-speech using Qwen3-TTS-12Hz-1.7B-CustomVoice. Use when generating audio from text, creating voice messages, or when TTS is requested. Supports 10 languages including Italian, 9 premium s
Generate music from text prompts using ElevenLabs Eleven Music API. Use when creating songs, soundtracks, jingles, lullabies, or any audio music from descriptions. Supports vocals with AI-generated ly
Generate AI videos using ByteDance's Seedance 1.5 Pro — a native audio-visual joint generation model with cinematic camera control, multi-language lip-sync,...