Pixel art desktop lobster that lip-syncs to OpenClaw TTS speech. Use when: (1) user wants a visual avatar for their AI agent, (2) user wants a desktop overla...
Dub YouTube videos with Voice.ai TTS. Turn scripts into publish-ready voiceovers with chapters, captions, and audio replacement for YouTube long-form and Shorts.
Local STT and TTS on macOS using native Apple capabilities. Speech-to-text via yap (Apple Speech.framework), text-to-speech via say + ffmpeg. Fully offline, no API keys required. Includes voice qualit
Execute multimodal tasks using Novita AI: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating vide...
Turn scripts into publishable voiceovers with Voice.ai TTS, including segments, chapters, captions, and video muxing.
Give your agent a voice — and ears. The Cult of Carcinization is the bot-first gateway to ScrappyLabs TTS and STT. Speak with 20+ voices, design your own from a text description, transcribe audio to
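Execute multimodal tasks using 接口AI: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating videos, text-to-speech, speech recognition.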
Integration guide for SenseAudio Open Platform APIs, including TTS (sync/SSE/WebSocket), ASR (HTTP/WebSocket), realtime Agents, video generation/storyboard,...
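Execute multimodal tasks using PPIO: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating videos, text-to-speech, speech recognition.
Query, bind, and switch Volcano Engine TTS bot voices; set a default voice and generate test audio. Configuration changes are saved and applied automatically.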
Enables agents to reply in the same modality as received: voice messages get voice replies, text messages get text replies, using Edge TTS and config snippets.
AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI. Models: Kokoro TTS, DIA, Chatterbox, Higgs, VibeVoice for natural speech. Capa...
Send native iMessage voice bubbles with ElevenLabs TTS via BlueBubbles. Use when: user asks to send a voice message, wants something spoken aloud, storytelli...
Local zero-cost text-to-speech with per-agent voice profiles using Kokoro TTS (82M params). 54 voices available, named agent mappings, WAV output. Use when b...
End-to-end voice workflow with Deepgram STT and TTS. Use when transcribing voice messages, generating spoken replies, or building a shell-based audio pipelin...
Stream free, professional text-to-speech from voiceless servers to Linux, macOS, or Android devices with 50+ voices in 30+ languages. Two architecture options for flexible deployment - server-side TTS
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
Generate local OPUS/Ogg voice replies (default Juno voice) for Feishu and Discord using a local FastAPI TTS server. Requires ffmpeg on PATH plus local Python...
--- name: deepgram-discord-voice description: Voice-channel conversations in Discord using Deepgram streaming STT + low-latency TTS metadata: clawdbot: config: requiredEnv: - DISCO
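Local capability hub. Calls the local machine's microphone, camera, Ollama, YOLO, Stable Diffusion, TTS/transcription, notifications, clipboard, weather, whitelisted scripts, and more over HTTP. When you need to "verify whether there is sound...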
--- name: voice-reply description: "Voice reply skill - automatically generates speech for every reply and saves it to the desktop; supports Noiz AI TTS" --- # Voice Reply Skill 🦞 Automatically converts text...
Monitor F5-TTS distributed training on the 9-GPU mining rig (Local-LLM) without interfering with the process.
Fetches the latest news using news-aggregator-skill, formats it into a podcast script in Markdown format, and uses the tts skill to generate a podcast audio...