Read X (Twitter) Articles aloud using macOS text-to-speech. Accepts an X Article URL and reads the content out loud. Automatically detects Chinese vs English...
Control Slack from Clawdbot including reacting to messages and pinning items. And also 50+ models for image generation, video generation, text-to-speech, spe...
Install and use whisper.cpp (local, free/offline speech-to-text) with OpenClaw. Supports downloading different ggml model sizes (tiny/base/small/medium/large...
Automatically update Clawdbot and all installed skills once daily via cron.
Local speech-to-text with MLX Whisper (Apple Silicon optimized, no API key).
Text-to-speech via Inworld.ai API. Use when generating voice audio from text, creating spoken responses, or converting text to MP3/audio files. Supports multiple voices, speaking rates, and streaming
Transcribe audio files with ElevenLabs Speech-to-Text (Scribe v2) from the local CLI. Supports diarization, events, JSON output, webhooks, and advanced STT o...
Local speech-to-text using OpenAI Whisper. Use when the user needs to: (1) transcribe audio files to text, (2) convert voice messages to written content, (3)...
Multilingual Text-to-Speech (TTS) with intelligent Pinyin-to-Hanzi conversion. Use when the user asks to generate audio for text that contains a mix of Vietn...
ElevenLabs text-to-speech with a macOS-style `say` UX.
Stream free, professional text-to-speech from voiceless servers to Linux, macOS, or Android devices with 50+ voices in 30+ languages. Two architecture options for flexible deployment - server-side TTS
The cheapest AI media API on the market. Generate images (Flux), music (AceStep), speech with voice cloning, transcribe video/audio, OCR, video generation, b...
Google Workspace CLI for Gmail, Calendar, Drive, Contacts, Sheets, and Docs.
Text-to-speech conversion using `uvx edge-tts` for generating audio from text. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rath
Captures learnings, errors, and corrections to enable continuous improvement.
Text-to-speech conversion using Zhipu AI (BigModel) GLM-TTS model. Use when you need to convert text to audio files with various voice options. Supports Chin...
Configure TTS in OpenClaw. Adapt speech output to user preferences.
Consume the shared Whisper speech-to-text API over Tailnet at http://100.92.116.99:8765 using OpenAI-compatible audio transcription endpoint (/v1/audio/trans...
Local speech-to-text with Parakeet MLX (ASR) for Apple Silicon (no API key).
Work with Obsidian vaults (plain Markdown notes) and automate via obsidian-cli.
Send voice messages across chat channels (Telegram, Discord, Feishu/Lark, Signal, WhatsApp, and others) using edge-tts for text-to-speech and ffmpeg for audi...
ElevenLabs advanced TTS for converting text to speech, listing voices, and managing credits.
--- name: speech-recognition-local description: Local speech-to-text. Uses faster-whisper to run Whisper models locally, with no API fees. --- # Local Speech Recognition ## Triggers - User sends