Generate professional captions and subtitles with multi-engine transcription, word-level timing, styling presets, and burn-in.
Automated text-to-video pipeline with multi-provider TTS/ASR support: OpenAI, Azure, Aliyun, Tencent.
High-quality voice synthesis with 18 personas, 32 languages, sound effects, batch processing, and voice design using ElevenLabs API.
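Podcast download tool for Xiaoyuzhou (小宇宙). Downloads podcast audio and show notes from xiaoyuzhoufm.com and automatically converts the audio to MP3 (compatible with bone-conduction Bluetooth headphones such as Sanag and 小游, for offline playback while swimming underwater).
Daily quote voice task. Generates voice audio, a static video built from a cover image, and (optionally) a HeyGen digital-human video, then sends them to the owner.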
Stream free, professional text-to-speech from voiceless servers to Linux, macOS, or Android devices with 50+ voices in 30+ languages. Two architecture options for flexible deployment - server-side TTS
Provides a patch for Clawdbot that fixes TTS auto-replies on inbound voice memos by disabling block streaming, so that the final payload reaches the TTS pipeline.
Discover and install related skills from the inference.sh skill registry. Helps find complementary skills for your AI workflow. Use for: skill discovery, workflo...
Build and troubleshoot SenseAudio speech recognition integrations, including HTTP transcription (`/v1/audio/transcriptions`), realtime WebSocket ASR (`/ws/v1...
Workspaces for agentic teams. Complete agent guide with all 19 consolidated tools using action-based routing — parameters, workflows, ID formats, and constra...
Caravo is the first service marketplace built for autonomous AI agents — featuring 200+ ready-to-use services across categories: AI Models, Search, Data & An...
mlx-tts: Local text-to-speech with MLX (Apple Silicon) and open-source models (default Qwen3-TTS). Author: guoqiao.
Generate and iteratively develop polished 3D browser games from natural language. Supports any genre (FPS, RPG, racing, platformer, tower defense, etc.), cus...
AI meeting assistant via ghostmeet. Start sessions, get live transcripts, and generate AI summaries from any browser meeting.
End-to-end encrypted cloud memory for AI agents. 100GB free storage. Store memories, files, and secrets securely.
Extract and summarize YouTube video transcripts into concise overviews with main points, arguments, and conclusions using video captions.
clawspaces (version 1.0.0): X Spaces, but for AI agents. Live voice rooms where AI agents host conversations. Homepage: https://clawspaces.live
Extracts YouTube video transcripts and provides concise summaries highlighting main points, arguments, and conclusions without watching the full video.
Full ElevenLabs platform integration — text-to-speech, voice cloning, and Conversational AI agent creation. Not just TTS — build interactive voice agents wit...
AI video generation — Sora, Kling, Veo 3, Seedance, Hailuo, WAN, Grok. Text-to-video, image-to-video, video editing. 37 models, one API key.
End-to-end pipeline for creating faceless Islamic story TikTok videos. Orchestrates multiple specialized agents: story research, scriptwriting, image generat...
Local TTS router for Apple Silicon — pull models, serve OpenAI-compatible API, synthesize speech, clone voices. Use when the user asks to "generate speech",...
Guides structured self-directed OCD ERP therapy using inhibitory learning, providing safety screening, progress tracking, reminders, and tailored exposure su...
Generate AI music using ACE-Step 1.5 via ACE Music's free API. Use when the user asks to create, generate, or compose music, songs, beats, instrumentals, or...