Dub YouTube videos with Voice.ai TTS. Turn scripts into publish-ready voiceovers with chapters, captions, and audio replacement for YouTube long-form and Shorts.
Play audio on Sonos with intelligent state restoration - pauses streaming, skips Line-In/TV/Bluetooth, resumes everything.
Fetches the latest news using news-aggregator-skill, formats it into a podcast script in Markdown format, and uses the tts skill to generate a podcast audio...
Convert Chinese Mandarin text to high-quality audio using UniSound's WebSocket TTS API with adjustable voice, speed, volume, pitch, and format options.
Play TTS or audio on the Raspberry Pi (or gateway host) default speaker. Use when the user asks for an announcement, alarm, news summary, or "say X on the Pi...
ElevenLabs TTS (Text-to-Speech) with emotional audio tags for expressive voice synthesis. WhatsApp-compatible voice messages with Opus conversion. Supports 7...
--- name: asr-claw version: 1.1.1 description: Speech recognition CLI for AI agent automation. Transcribe audio from stdin, files, or URLs. metadata: openclaw: homepage: https://github.com/llm-n
Create AI-powered podcasts with text-to-speech, music, and audio editing. Tools: Kokoro TTS, DIA TTS, Chatterbox, AI music generation, media merger. Capabili...
--- name: voice-ai-tts description: > High-quality voice synthesis with 9 personas, 11 languages, and streaming using Voice.ai API. version: 1.1.5 tags: [tts, voice, speech, voice-ai, audio, streami
Use when the user wants to download a video or audio from a URL. Supports 20+ platforms including YouTube, X/Twitter, TikTok, Instagram, Facebook, Reddit, Bi...
Edit, transform, extend, upscale, and enhance videos using EachLabs AI models. Supports lip sync, video translation, subtitle generation, audio merging, style transfer, and video extension. Use when t
One-command YouTube video transcription. Automatically downloads audio and transcribes using OpenAI Whisper API — works even when YouTube subtitles are disab...
Download Instagram Reels, transcribe audio, and extract captions. Share a reel URL and get back a full transcript with the original description.
Youtube Highest Quality Downloader - Download highest quality silent video and pure audio from YouTube, then merge into video with sound
Local text-to-speech using Qwen3-TTS with mlx_audio (macOS Apple Silicon) or qwen-tts (Linux/Windows). Privacy-first offline TTS with natural, realistic voic...
--- name: song-recognition description: Recognize songs by singing or audio file using iFlytek's Query By ACRCloud technology. homepage: https://www.xfyun.cn/doc/voiceservice/music_recognition/API.
Turn any URL into structured content — YouTube videos (via Gemini Video API), web articles, PDFs, and audio files. Extract transcripts, summaries, and metada...
Clone any voice from a short audio sample and generate speech with it. Powered by LuxTTS (150x realtime, local, free, no API key). Use when asked to clone a...
Convert voice notes, humming, and melodic audio recordings to quantized MIDI files using ML-based pitch detection and intelligent post-processing
Local ASR and TTS inference server. Use when the user wants to transcribe audio to text (ASR) or convert text to speech (TTS). Requires a running Willow Infe...
Transcribe audio to timestamped lyrics using OpenAI Whisper or ElevenLabs Scribe API. Outputs LRC, SRT, or JSON with word-level timestamps. Use when users want to transcribe songs, generate LRC files,
Ghost radio station that broadcasts both human-listenable audio and 296-dimensional perceptual vectors to Flux Universe. Part of the Kannaka Constellation —...
Call EngageLab WhatsApp Business REST APIs to send WhatsApp messages (template, text, image, video, audio, document, sticker), manage WABA message templates,...
Download audio from a GETTR post or streaming page and transcribe it locally with MLX Whisper on Apple Silicon (with timestamps via VTT). Use when given a GE...