Offline speech-to-text conversion using a local Vosk model. Input: audio file path; output: transcript text.
Use when the user mentions Otter, Otter.ai, or wants to find, search, download, export, or manage meeting notes, transcripts, recordings, or audio from calls...
FastAPI personalization webhook that adds persistent caller memory and dynamic context injection to ElevenLabs Conversational AI agents on Twilio. No audio proxying, file-based persistence, OpenClaw c
--- name: feishu-voice-reply description: Feishu voice message auto-reply skill - generates speech with Edge TTS and sends it via the Feishu API metadata: tags: feishu, voice, tts, audio, edge-tts version:
Send images, videos, audio, or documents via WhatsApp by downloading, copying to workspace, sending, and cleaning up temporary files.
Evaluate hi-fi and audio gear options, build system recommendations, guide installation and tuning, and analyze used-market pricing/resale value. Use when us...
--- name: any-whisper-api description: Transcribe audio via the Whisper API using any compatible local server. homepage: https://platform.openai.com/docs/guides/speech-to-text metadata: {"clawdbot":{"emoj
Analyze Twitter Spaces and voice conversations to extract market intelligence, crypto alpha, sentiment analysis, and speaker-attributed insights. Transforms spoken audio into structured reports, full
Free, unlimited text-to-speech using Microsoft Edge neural voices via Python edge-tts. Use when generating long-form audio, podcasts, voice notes, spoken bri...
Text-to-speech conversion using `uvx edge-tts` for generating audio from text. Use when: (1) User requests audio/voice output with the "tts" trigger or keyword. (2) Content needs to be spoken rath
fal.ai API integration with managed API key authentication. Run AI models for image generation, video generation, audio processing, and more. Use this skill...
Download videos from YouTube, Bilibili, Twitter, and thousands of other sites using yt-dlp. Use when the user provides a video URL and wants to download it, extract audio (MP3), download subtitles, or
Consume the shared Whisper speech-to-text API over Tailnet at http://100.92.116.99:8765 using OpenAI-compatible audio transcription endpoint (/v1/audio/trans...
Install and use the speechall CLI tool for speech-to-text transcription. Use when the user wants to: (1) transcribe audio or video files to text, (2) install speechall on macOS or Linux, (3) list avai
Unified multi-modal content parser for images, PDF, DOCX, and audio, with automatic OCR/transcription; outputs structured text for LLM processing.
MarkItDown is a Python utility from Microsoft for converting various files (PDF, Word, Excel, PPTX, Images, Audio) to Markdown. Useful for extracting structu...
Manage flashcards, generate AI-based cards, create audio podcasts, and track study progress using EchoDecks API integration.
Generate detailed, production-ready cinematic video prompts following Seedance 2.0’s strict Subject-Action-Camera-Style-Audio-Constraints format for AI video...
--- name: safety-guard description: Safety Guard URLs or files with the safety-guard CLI (web, PDFs, images, audio, YouTube). homepage: https://safety-guard.sh metadata: {"clawdbot":{"emoji":"🧾","r
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) U
Text-to-speech conversion using Zhipu AI (BigModel) GLM-TTS model. Use when you need to convert text to audio files with various voice options. Supports Chin...
AI-powered game asset generation guide covering 2D sprites, tilemaps, UI elements, audio, music, and 3D models. Use when generating game assets with AI tools...
case.dev — a legal AI platform with encrypted document vaults, OCR, audio transcription, and legal search. This skill installs the casedev CLI and provides s...
--- name: summarize description: Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube). homepage: https://summarize.sh metadata: {"clawdbot":{"emoji":"🧾","requires":{"b