Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
Transcribe or translate audio files to text using a public Hugging Face Whisper Space over Gradio. Use when the user sends voice notes, audio attachments, me...
Generate speech audio from text using HeyGen's Starfish TTS model. Use when: (1) Generating standalone speech audio files from text, (2) Converting text to s...
Free All-in-One AI Audio Generator Platform. Generate an AI podcast discussion or broadcast audio using the WeryAI Podcast API. Create professional podcasts...
Transcribe audio files to text using Telnyx Speech-to-Text API. Use when you need to convert audio recordings, voice messages, or spoken content to text.
--- name: openai-whisper-api description: Transcribe audio via OpenAI Audio Transcriptions API (Whisper). homepage: https://platform.openai.com/docs/guides/speech-to-text metadata: {"clawdbot":{"emoji
--- name: feishu-voice-assistant description: Sends voice messages (audio) to Feishu chats using Duby TTS. tags: [feishu, voice, tts, audio] --- # Feishu Voice Assistant Generate speech from text us
Transcribe audio via Deepgram Nova-3 API (5.26% WER, 40x faster than Whisper, built-in speaker diarization). Use when user asks to transcribe audio, podcasts...
Generate Chinese TTS audio and send as Feishu voice message. Use when user asks for voice/audio/语音/播报/朗读 in Chinese, or when sending audio messages via Feishu.
Process video and audio using FFmpeg CLI for transcoding, cutting, merging, audio extraction, thumbnails, GIFs, speed, filters, subtitles, and watermarks.
Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creation f
Use when the user asks for text-to-speech narration or voiceover, accessibility reads, audio prompts, or batch speech generation via the OpenAI Audio API; ru...
YouTube video summarizer with speaker detection, formatted documents, and audio output. Works out of the box with macOS built-in TTS. Optional recommended tools (pandoc, ffmpeg, mlx-audio) enhance qua
Ton namespace for Netsnek e.U. audio and media processing tools. Handles audio transcription, format conversion, waveform analysis, and podcast production wo...
Simple text-to-speech skill using MiniMax Voice API. Converts text to audio with customizable voice selection. Use for generating speech audio from text.
Transform any text into emotionally expressive audio with ambient soundscapes using ElevenLabs v3 audio tags and Sound Effects API
Transcribe audio via the self-hosted Whisper ASR instance running on Kubernetes. Use this skill whenever the user wants to transcribe audio files, convert sp...
Generate an AI podcast discussion or broadcast audio using the WeryAI Podcast Generation API. Use when the user asks to generate a podcast or audio discussio...
TTS (text-to-speech) via IMA Open API with seed-tts-2.0. Voice synthesis, speech from text, dubbing, audio content creation. Output: audio URL (mp3/wav). Flo...
Transcribe audio via OpenAI Audio Transcriptions API (Whisper).
Generate and publish a dual-host daily podcast. Fetches news, generates a conversational script between two hosts, synthesizes audio via Fish Audio or Edge T...
--- name: musa-torch-coding description: Transcribe audio via OpenAI Audio Transcriptions API (Whisper). homepage: https://platform.openai.com/docs/guides/speech-to-text metadata: { "openc
Generate videos via LTX-2.3 API (ltx.video). Supports text-to-video, image-to-video, audio-to-video (lip-sync from audio + image), extend, and retake. Use wh...