AI media generation via deAPI. Transcribe YouTube/audio/video, generate images from text, text-to-speech, OCR, remove backgrounds, upscale images, create vid...
Transcribe audio files to text using Telnyx Speech-to-Text API. Use when you need to convert audio recordings, voice messages, or spoken content to text.
Accept text or voice input, transcribe if needed, generate natural OpenAI TTS speech, and send audio output to Feishu chat or web player.
Local speech-to-text using OpenAI Whisper. Use when the user needs to: (1) transcribe audio files to text, (2) convert voice messages to written content, (3)...
PCClaw provides 16 native Windows AI skills for system control, automation, files, notifications, OCR, speech, LLM inference, and task management with minima...
--- name: vectorclaw-mcp description: "MCP tools for Anki Vector: speech, motion, camera, sensors, and automation workflows." openclaw: emoji: "🤖" requires: bins: ["python3"] env: ["VEC
Use when low-latency realtime speech recognition is needed with Alibaba Cloud Model Studio Qwen ASR Realtime models, including streaming microphone input, li...
Text-to-speech conversion using `uvx edge-tts` for generating audio from text. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rath
Send TTS audio as a proper playable audio message (not file attachment) to Feishu chats. Use when asked to send voice messages, TTS audio, speech announcemen...
Converts text to natural speech using ElevenLabs for clinical and healthcare use cases. Use when generating patient instructions, discharge summaries, medica...
Use this skill whenever the user wants speech to sound more human, companion-like, or emotionally expressive. Triggers include: any mention of 'say like', 't...
Combined agent that synthesizes speech via Volcengine TTS, uploads the audio to TOS, and returns a presigned temporary URL. Use when users need a shareable a...
Install and use whisper.cpp (local, free/offline speech-to-text) with OpenClaw. Supports downloading different ggml model sizes (tiny/base/small/medium/large...
Offline speech-to-text (ASR) using whisper.cpp (whisper-cli) + ffmpeg. Supports batch transcription, timestamps, SRT/TXT/JSON outputs, and model download. Cr...
The most complete voice and phone calling skill for OpenClaw. Handles inbound and outbound phone calls over Twilio with OpenAI Realtime speech. Inbound outbo...
Offline speech-to-text conversion using Vosk local model; input audio file path, output transcript text.
Convert text to speech using Volcengine TTS with preset or cloned voices and send audio messages to Feishu chats or groups.
Chat with any real person or fictional character in their own voice by automatically finding their speech online, extracting a clean reference sample, and ge...
Full local AI inference stack on Apple Silicon Macs via MLX. Includes: LLM chat (Qwen3-14B, Gemma3-12B), speech-to-text ASR (Qwen3-ASR, Whisper), text embedd...
Build and debug Groq API chat and speech workflows with low-latency routing, structured outputs, and production-safe patterns.
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) U
Transcribe audio files to text using OpenAI Whisper. Supports speech-to-text with auto language detection, multiple output formats (txt, srt, vtt, json), batch processing, and model selection (tiny to
--- name: openai-whisper description: Local speech-to-text with the Whisper CLI (no API key). homepage: https://openai.com/research/whisper metadata: {"clawdbot":{"emoji":"🎙️","requires":{"bins":
Generate high-quality English speech offline on CPU using 8 built-in voices or custom voice cloning with Kyutai's Pocket TTS model.