Combined agent that synthesizes speech via Volcengine TTS, uploads the audio to TOS, and returns a presigned temporary URL. Use when users need a shareable a...
Build backend AI with Vercel AI SDK v6 stable. Covers Output API (replaces generateObject/streamObject), speech synthesis, transcription, embeddings, MCP tools with security guidance. Includes v4→v5
Generate high-quality English speech offline on CPU using 8 built-in voices or custom voice cloning with Kyutai's Pocket TTS model.
Send TTS audio as a proper playable audio message (not file attachment) to Feishu chats. Use when asked to send voice messages, TTS audio, speech announcemen...
Free local speech-to-text for Telegram and WhatsApp using MLX Whisper on Apple Silicon. Private, no API costs.
ClawVox - ElevenLabs voice studio for OpenClaw. Generate speech, transcribe audio, clone voices, create sound effects, and more.
Generate AI-powered podcast-style audio narratives using Azure OpenAI's GPT Realtime Mini model via WebSocket. Use when building text-to-speech features, audio narrative generation, podcast creation f
Transcribe audio and video files using OpenAI Whisper API. Use when user wants to transcribe audio/video files, extract speech from media, or get text from r...
Transcribe audio via Groq Automatic Speech Recognition (ASR) Models (Whisper).
Free, unlimited text-to-speech using Microsoft Edge neural voices via Python edge-tts. Use when generating long-form audio, podcasts, voice notes, spoken bri...
Provides high-accuracy speech-to-text conversion supporting 22 Chinese dialects and 30 languages with automatic language detection, running on CPU.
Telegram bot that transcribes voice messages using Whisper and replies in Chinese with Microsoft Edge text-to-speech.
Transcribes local voice messages to text using Faster Whisper models for fast, privacy-focused speech recognition on audio files.
Voice-first OpenClaw skill powered by MOSS APIs. Use when a user wants spoken replies in a preferred timbre, either from an existing voice_id or from a refer...
Use when OpenClaw needs to call SpeakNotes API routes directly using an API key and generate transcripts/summaries from YouTube URLs, media files, or documen...
Convert text to speech using Volcengine TTS with preset or cloned voices and send audio messages to Feishu chats or groups.
Use when the user mentions Otter, Otter.ai, or wants to find, search, download, export, or manage meeting notes, transcripts, recordings, or audio from calls...
All-in-One AI creation: images (SeeDream 4.5, Midjourney, Nano Banana 2), videos (Wan 2.6, Kling, Veo 3.1, Sora, Pixverse, Hailuo, SeeDance, Vidu), music (Su...
--- name: moa-debate description: Run an Oxford Union–style multi-agent debate on any motion using Mixture of Agents architecture --- # Oxford Union Multi-Agent Debate When the user wants to **deb
Generate branded AI avatar lip-sync video shorts for TikTok, Reels, and YouTube Shorts. Create 15-second talking-head videos with custom avatars, auto-genera...
Fetch, classify, and summarize papers from multiple sources (arXiv, etc.) with AI-powered multi-language summaries and email delivery.
Explain anything — turn ideas into podcasts, explainer videos, or voice narration. Use when the user wants to "make a podcast", "create an explainer video",...