OpenAI API integration — chat completions, embeddings, image generation, audio transcription, file management, fine-tuning, and assistants via the OpenAI RES...
Build, operate, and troubleshoot Autonoannounce local speaker text-to-speech using the queued pipeline (enqueue to worker to ElevenLabs to playback backend)....
Text-to-speech conversion using the Zhipu AI (BigModel) GLM-TTS model. Use when you need to convert text to audio files with various voice options. Supports Chin...
Generate AI music and songs with Diffrythm and Tencent Song Generation via the inference.sh CLI. Models: Diffrythm (fast song generation), Tencent Song Generation (...
Text-to-speech conversion using the node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) U
Search, explore, and run fal.ai generative AI models (image generation, video, audio, 3D). Use when user wants to generate images, videos, or other media with AI models.
Unified AI execution engine. Single API key (WODEAPP_API_KEY) routes to 343+ models across text, image, video, TTS, and structured JSON — with automatic cost...
Explain anything — turn ideas into podcasts, explainer videos, or voice narration. Use when the user wants to "make a podcast", "create an explainer video",...
Text-to-speech using the macOS built-in `say` command. Use for voice notifications, audio alerts, reading text aloud, or announcing messages through Mac speakers. Supports multiple languages including Chi
Local text-to-speech using Qwen3-TTS-12Hz-1.7B-CustomVoice. Use when generating audio from text, creating voice messages, or when TTS is requested. Supports 10 languages including Italian, 9 premium s
Send high-quality text-to-speech voice messages on WhatsApp in 40+ languages with automatic delivery.
Generate speech from text using Kyutai Pocket TTS - lightweight, CPU-friendly, streaming TTS with voice cloning. English only. ~6x real-time on M4 MacBook Air.
Captures learnings, errors, and corrections to enable continuous improvement. Also includes 50+ models for image generation, video generation, text-to-speech, spe...
Local text-to-speech using Alibaba's CosyVoice3 on macOS Apple Silicon. Supports Chinese, English, Japanese, Korean, and 18+ Chinese dialects. Provides zero-...
mlx-audio-server: Local 24x7 OpenAI-compatible API server for STT/TTS, powered by MLX on your Mac.
Send requests to the dr.eamer.dev LLM API for chat completions, vision analysis, image generation, text-to-speech, and video generation across 12 model provi...
Google Workspace CLI for Gmail, Calendar, Drive, Contacts, Sheets, and Docs. Also includes 50+ models for image generation, video generation, text-to-speech, spee...
Automatically update Clawdbot and all installed skills once daily via cron. Also includes 50+ models for image generation, video generation, text-to-speech, speec...
Control Slack from Clawdbot, including reacting to messages and pinning items. Also includes 50+ models for image generation, video generation, text-to-speech, spe...
local-llama-tts: Local text-to-speech using llama-tts (llama.cpp) and the OuteTTS-1.0-0.6B model.
Search the web for information, find current content, and look up news articles. Also includes 50+ models for image generation, video generation, text-to-speech,...
Local offline text-to-speech via Piper TTS. Self-contained setup, automatic language detection, per-call voice selection. Extensible to any language. Writes...