Local text-to-speech using Piper voices via sherpa-onnx. 100% offline, no API keys required. Use when user asks for a voice reply, audio response, spoken answer, or wants to hear something read aloud.
Download audio from a GETTR post or streaming page and transcribe it locally with MLX Whisper on Apple Silicon (with timestamps via VTT). Use when given a GE...
Ghost radio station that broadcasts both human-listenable audio and 296-dimensional perceptual vectors to Flux Universe. Part of the Kannaka Constellation —...
Call EngageLab WhatsApp Business REST APIs to send WhatsApp messages (template, text, image, video, audio, document, sticker), manage WABA message templates,...
Complete Venice AI API toolkit - image generation, video, audio, embeddings, transcription, characters, models, and admin functions. Privacy-focused inferenc...
Text-to-speech generation via Qwen3-TTS over SSH. Preset voices, voice cloning, voice design. Use when the user wants to generate speech audio, clone voices,...
Generate digital human talking avatar videos from images and audio using DreamAvatar 3.0 Fast API. Powered by Dreamface - AI tools for everyone. Visit https:...
Transcribe and organize voice memos with automatic categorization and information extraction. Use when users have voice notes, audio memos, or spoken notes t...
Download videos from YouTube, Bilibili, Twitter, and thousands of other sites using yt-dlp. Use when the user provides a video URL and wants to download it, extract audio (MP3), download subtitles, or
Transcribe audio files to text using local speech recognition. Triggers on: "转录", "transcribe", "语音转文字", "ASR", "识别音频", "把这段音频转成文字".
Create funny voice memes with various styles, effects, and templates. Use when users want to make humorous audio content, voice memes, or entertaining sound...
A local-first conversion router and format strategist. Identifies the safest local path for document, image, audio, video, archive, and data transformations....
Text-to-speech, sound effects, music generation, voice management, and quota checks via the ElevenLabs API. Use when generating audio with ElevenLabs or mana...
Generate videos using Flyworks (a.k.a HiFly) Digital Humans. Create talking photo videos from images, use public avatars with TTS, or clone voices for custom audio.
Transcribe audio and video files using OpenAI Whisper API. Use when user wants to transcribe audio/video files, extract speech from media, or get text from r...
Search and directly download free images, audio, music, sound effects, videos, and 3D models from WebSim's large, auth-free digital asset library.
Manage podcasts on Transistor.fm via their API. Use when creating, publishing, updating, or deleting podcast episodes, uploading audio files, listing shows/e...
Fetch iMessage/Messages.app attachments (voice memos and images) and process them — transcribe audio via Silicon Flow ASR (SenseVoiceSmall), and analyze imag...
Accept text or voice input, transcribe if needed, generate natural OpenAI TTS speech, and send audio output to Feishu chat or web player.
YouTube video search, download, subtitles & audio extraction. 40 Stars! Full yt-dlp wrapper. Each call charges 0.001 USDT via SkillPay.
Command-line interface to manage Google NotebookLM notebooks, sources, and generate audio, quizzes, reports, presentations, and visual study materials progra...
Read any web page aloud with natural AI voices. Extract article text from any URL and convert it to audio (MP3). Use when the user wants to: listen to a webp...
Turn any idea into a finished podcast in one command. AudioMind handles ElevenLabs voice narration (29+ voices), AI background music, and server-side audio m...
Text-to-speech conversion using Python edge-tts for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and sub...