Send images, videos, audio, or documents via WhatsApp by downloading, copying to workspace, sending, and cleaning up temporary files.
Analyze Twitter Spaces and voice conversations to extract market intelligence, crypto alpha, sentiment analysis, and speaker-attributed insights. Transforms spoken audio into structured reports, full
Evaluate hi-fi and audio gear options, build system recommendations, guide installation and tuning, and analyze used-market pricing/resale value. Use when us...
fal.ai API integration with managed API key authentication. Run AI models for image generation, video generation, audio processing, and more. Use this skill...
Text-to-speech conversion using `uvx edge-tts` for generating audio from text. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rath
--- name: any-whisper-api description: Transcribe audio via API Whisper with any compatible local servers. homepage: https://platform.openai.com/docs/guides/speech-to-text metadata: {"clawdbot":{"emoj
Consume the shared Whisper speech-to-text API over Tailnet at http://100.92.116.99:8765 using OpenAI-compatible audio transcription endpoint (/v1/audio/trans...
Install and use the speechall CLI tool for speech-to-text transcription. Use when the user wants to: (1) transcribe audio or video files to text, (2) install speechall on macOS or Linux, (3) list avai
Download videos from YouTube, Bilibili, Twitter, and thousands of other sites using yt-dlp. Use when the user provides a video URL and wants to download it, extract audio (MP3), download subtitles, or
Manage flashcards, generate AI-based cards, create audio podcasts, and track study progress using EchoDecks API integration.
Video to text converter. Downloads videos from Bilibili using bilibili-api, from other sites using yt-dlp, then transcribes audio using faster-whisper. Use w...
Dub YouTube videos with Voice.ai TTS. Turn scripts into publish-ready voiceovers with chapters, captions, and audio replacement for YouTube long-form and Shorts.
Manage podcasts on Transistor.fm via their API. Use when creating, publishing, updating, or deleting podcast episodes, uploading audio files, listing shows/e...
Generate AI videos using ByteDance's Seedance 1.5 Pro — a native audio-visual joint generation model with cinematic camera control, multi-language lip-sync,...
Text-to-Speech via Zvukogram API with SSML support. Use when you need to generate speech from text, create podcasts, voice notifications, or work with audio....
macOS CLI for transcribing audio and video files using local Whisper models or Whisnap Cloud.
Generate retro robotic speech audio using SAM (Software Automatic Mouth), the classic C64 text-to-speech synthesizer. Use for /sam command to generate voice messages. Supports /sam on/off toggle mode
ClawVox - ElevenLabs voice studio for OpenClaw. Generate speech, transcribe audio, clone voices, create sound effects, and more.
Local speech-to-text using Vosk. Lightweight, fast, fully offline. Perfect for transcribing Telegram voice messages, audio files, or any speech-to-text task without cloud APIs.
--- name: safety-guard description: Safety Guard URLs or files with the safety-guard CLI (web, PDFs, images, audio, YouTube). homepage: https://safety-guard.sh metadata: {"clawdbot":{"emoji":"🧾","r
Recognize songs by singing or audio file using iFlytek's Query By ACRCloud technology.
Forensic media triage with chain of custody. Use when receiving images, videos, audio, PDFs, or documents that need evidence-grade handling, integrity verifi...
Generate vertical short videos (9:16) from a Markdown script. Parses script sections, generates TTS audio, renders subtitle cards, and composites into MP4 wi...
Intelligent multi-model router — automatically selects the best AI model based on task type (vision, image generation, video generation, audio, reasoning, co...