Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) U
Manage podcasts on Transistor.fm via their API. Use when creating, publishing, updating, or deleting podcast episodes, uploading audio files, listing shows/e...
Generate Russian male voice audio using ComfyUI with Qwen3 TTS node and save as MP3 for voice messages.
Jarvis TTS text-to-speech using Microsoft edge-tts with afplay playback. Use when users request voice output, audio responses, or text-to-speech. Provides na...
Repurpose long-form video or audio into short-form clip plans with timestamps, hooks, captions, and packaging notes. Use when a user asks to turn a long video, podcast, or stream into Shorts, Reels, T
Clip and download specific time ranges or full YouTube videos in various qualities, including audio-only MP3 extraction, using precise timestamps.
Transcribe audio files using ElevenLabs Speech-to-Text (Scribe v2).
Download and summarize Xiaohongshu (小红书/RedNote) videos. Produces a full resource pack with video, audio, subtitles, transcript, and AI summary. This skill s...
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatically transcribes voice notes sent via T...
Create and grow podcasts by planning episodes, producing audio or video, generating clips, and building audience across formats.
Generate retro robotic speech audio using SAM (Software Automatic Mouth), the classic C64 text-to-speech synthesizer. Use for /sam command to generate voice messages. Supports /sam on/off toggle mode
Use the Phaya SaaS backend to generate images, videos, audio, music, and run LLM chat completions via simple REST API calls. Use when the user wants to gener...
Generate music from text prompts using ElevenLabs Eleven Music API. Use when creating songs, soundtracks, jingles, lullabies, or any audio music from descriptions. Supports vocals with AI-generated ly
--- name: summarize description: Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube). homepage: https://summarize.sh metadata: {"clawdbot":{"emoji":"🧾","requires":{"b
Enables voice synthesis, voice cloning, voice design, and audio post-processing using MiniMax Voice API and FFmpeg. Use when converting text to speech, creat...
Automatically converts received voice messages to text via an external ASR service, supporting multiple audio formats and integrating with OpenClaw.
Generate spectrograms and feature-panel visualizations from audio with the songsee CLI.
Generate AI-optimized Alt Text, file names, captions, and Schema markup for images, videos, and audio assets. Improves AI discoverability on Google Lens, Cha...
OpenClaw agent skill for converting documents to Markdown. Documentation and utilities for Microsoft's MarkItDown library. Supports PDF, Word, PowerPoint, Excel, images (OCR), audio (transcription), H
Translate and dub videos from one language to another, replacing the original audio with TTS while keeping the video intact.
Search and directly download free images, audio, music, sound effects, videos, and 3D models from WebSim's large, auth-free digital asset library.
Convert documents and files to Markdown using markitdown. Use when converting PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), HTML, CSV, JSON, XML, images (with EXIF/OCR), audio (with tran
Unified speech-to-text skill. Use when the user asks to transcribe audio or video, generate subtitles, identify speakers, translate speech, search transcript...