Execute multimodal tasks using Novita AI: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating vide...
Turn scripts into publishable voiceovers with Voice.ai TTS, including segments, chapters, captions, and video muxing.
Give your agent a voice — and ears. The Cult of Carcinization is the bot-first gateway to ScrappyLabs TTS and STT. Speak with 20+ voices, design your own from a text description, transcribe audio to
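Execute multimodal tasks using PPIO: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating videos, text-to-speech, speech recognition.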
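Execute multimodal tasks using 接口AI: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating videos, text-to-speech, speech recognition.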
Integration guide for SenseAudio Open Platform APIs, including TTS (sync/SSE/WebSocket), ASR (HTTP/WebSocket), realtime Agents, video generation/storyboard,...
Enables agents to reply in the same modality as received: voice messages get voice replies, text messages get text replies, using Edge TTS and config snippets.
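Supports querying, binding, and switching Volcano Engine TTS bot voices, setting a default voice, and generating test audio; configuration changes are saved and take effect automatically.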
AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI. Models: Kokoro TTS, DIA, Chatterbox, Higgs, VibeVoice for natural speech. Capa...
Send native iMessage voice bubbles with ElevenLabs TTS via BlueBubbles. Use when: user asks to send a voice message, wants something spoken aloud, storytelli...
Local zero-cost text-to-speech with per-agent voice profiles using Kokoro TTS (82M params). 54 voices available, named agent mappings, WAV output. Use when b...
End-to-end voice workflow with Deepgram STT and TTS. Use when transcribing voice messages, generating spoken replies, or building a shell-based audio pipelin...
Monitor F5-TTS distributed training on the 9-GPU mining rig (Local-LLM) without interfering with the process.
Stream free, professional text-to-speech from voiceless servers to Linux, macOS, or Android devices with 50+ voices in 30+ languages. Two architecture options for flexible deployment - server-side TTS
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
Generate local OPUS/Ogg voice replies (default Juno voice) for Feishu and Discord using a local FastAPI TTS server. Requires ffmpeg on PATH plus local Python...
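Local capability hub. Calls the local machine's microphone, camera, Ollama, YOLO, Stable Diffusion, TTS/transcription, notifications, clipboard, weather, whitelisted scripts, and more over HTTP. When you need to "verify whether there is sound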
--- name: deepgram-discord-voice description: Voice-channel conversations in Discord using Deepgram streaming STT + low-latency TTS metadata: clawdbot: config: requiredEnv: - DISCO
--- name: voice-reply description: "Voice reply skill - automatically generates speech for every reply and saves it to the desktop; supports Noiz AI TTS" --- # Voice Reply Skill 🦞 Automatically takes the text reply
Fetches the latest news using news-aggregator-skill, formats it as a podcast script in Markdown, and uses the tts skill to generate a podcast audio...
OpenClaw plugin that bridges to the Clawfinger voice gateway. Provides tools for live call takeover, TTS injection, outbound dialing, hangup, context/knowled...
--- name: avatar description: Interactive AI avatar with Simli video rendering and ElevenLabs TTS emoji: "\U0001F9D1\u200D\U0001F4BB" homepage: https://github.com/Johannes-Berggren/openclaw-avatar met