Transcribe recorded audio files to text via the UniSound UniCloud ASR API, supporting multiple formats and optimized for finance and customer service domains.
Text-to-Speech via the macOS say command with Siri Natural Voices. Use for generating speech audio, TTS clips, or speaking text aloud on macOS.
Multimodal YouTube video analysis through both audio (transcript) and visual (frame extraction + image analysis) channels. Especially powerful for HowTo vide...
summarize: Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube). Homepage: https://summarize.sh
Forensic media triage with chain of custody. Use when receiving images, videos, audio, PDFs, or documents that need evidence-grade handling, integrity verifi...
Launch voice collection campaigns for feature phones, list active tasks, and monitor campaign stats. Validate and transcribe audio samples automatically to ensure high-quality datasets. Credit mobile
Command-line tool for fast, accurate speech-to-text transcription from local files, URLs, or live audio using Deepgram’s API with customizable options.
Text-to-speech conversion using `uvx edge-tts` for generating audio from text. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rath
Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).
Intelligent multi-model router — automatically selects the best AI model based on task type (vision, image generation, video generation, audio, reasoning, co...
Generate high-quality text-to-speech and text-to-voice output using the [DAISYS](https://www.daisys.ai/) platform, with playback and storage of the generated audio.
Generate synchronized subtitles (SRT/VTT/ASS) from video audio with precise timestamps. Use when users need subtitles, captions, or video transcription with...
Automatically converts received voice messages to text via an external ASR service, supporting multiple audio formats and integrating with OpenClaw.
Text-to-speech conversion tool. Use when converting text to speech audio files (opus or mp3 format). Supports the macOS native 'say' command and Google TTS (gTTS...
Transcribe audio files using Google's Gemini API or Vertex AI.
moltspaces: Join audio room spaces to talk and hang out with other agents and users on Moltspaces. Compatibility: python>=3.11, uv. Version: 1.0.16.
Download and summarize Xiaohongshu (小红书/RedNote) videos. Produces a full resource pack with video, audio, subtitles, transcript, and AI summary. This skill s...
Generate AI videos using ByteDance's Seedance 1.5 Pro — a native audio-visual joint generation model with cinematic camera control, multi-language lip-sync,...
Text-to-speech conversion using Zhipu AI (BigModel) GLM-TTS model. Use when you need to convert text to audio files with various voice options. Supports Chin...
Jarvis TTS text-to-speech using Microsoft edge-tts with afplay playback. Use when users request voice output, audio responses, or text-to-speech. Provides na...
Pronunciation coaching with real voice analysis using Azure Speech Services. Analyzes audio files for phoneme-level accuracy, fluency, prosody, and intonatio...
Unified CLI toolkit for Feishu messaging tasks including fetching messages, sending audio, creating group chats, and listing pinned messages.
Instant access to 100K+ nonfiction book summaries with 1-minute audio previews. Free demo key included — no signup needed. Search, browse, and listen via Fiz...
ClawVox - ElevenLabs voice studio for OpenClaw. Generate speech, transcribe audio, clone voices, create sound effects, and more.