Isolated agent runtime for code execution, live preview URLs, browser automation, 50+ tools (ffmpeg, sqlite, pandoc, imagemagick), LLM inference, and persistent memory, all available via CLI or HTTP with no SDK required.
--- name: telegram-voice-group description: Send voice messages to a specified Telegram group metadata: {"openclaw":{"emoji":"🔊","os":["linux"],"requires":{"bins":["ffmpeg","edge-tts"]}}} --- # Telegram
Creates TikTok image carousels with text overlays using Pexels API & FFmpeg, then uploads via PostBridge API. Use when the user wants to: create TikTok slide...
Local STT and TTS on macOS using native Apple capabilities. Speech-to-text via yap (Apple Speech.framework), text-to-speech via say + ffmpeg. Fully offline, no API keys required. Includes voice qualit
Clips a YouTube video locally using yt-dlp and ffmpeg. Supports auto-highlight detection, translation, and CapCut-style karaoke subtitle burning. Triggers wh...
CLI audio mastering without a reference track using ffmpeg; accepts audio or video inputs and outputs mastered WAV/MP3 or remuxed MP4.
Download videos, extract transcripts, capture frames. Analyze YouTube, tutorials, DD videos with yt-dlp + Whisper + ffmpeg.
Send voice messages across chat channels (Telegram, Discord, Feishu/Lark, Signal, WhatsApp, and others) using edge-tts for text-to-speech and ffmpeg for audi...
Offline speech-to-text (ASR) using whisper.cpp (whisper-cli) + ffmpeg. Supports batch transcription, timestamps, SRT/TXT/JSON outputs, and model download. Cr...
Summarize YouTube videos with NO subtitles by doing local ASR (yt-dlp + faster-whisper) and extracting a few screenshot frames via ffmpeg. Use when the user...
--- name: vibeclip description: Generate AI music videos from melody audio + photo + prompt. Local Ollama (llama3.2:1b/phi3:mini) for scene desc, FFmpeg for morph/zoompan + waveform sync. Node/Express
AI-powered video intelligence: download, analyze, clip, and create GIFs from any URL. Supports YouTube, TikTok, Instagram, and X. Uses ffmpeg and yt-dlp.
Capture photos, record video clips, list cameras, and live stream on Linux. Uses V4L2 and ffmpeg. Supports USB webcams and RTSP/IP cameras.
Generate local OPUS/Ogg voice replies (default Juno voice) for Feishu and Discord using a local FastAPI TTS server. Requires ffmpeg on PATH plus local Python...
Complete A/B video pipeline — storyboard, Veo 3 batch generation, browser preview with feedback loop, and ffmpeg assembly into final videos. Use when creatin...
Extract frames or short clips from videos using ffmpeg.
Turn incoming text (email/newsletter) into a short TTS podcast with chunking + ffmpeg concat.
Use when performing video/audio processing tasks including transcoding, filtering, streaming, metadata manipulation, or complex filtergraph operations with FFmpeg.
YouTube video summarizer with speaker detection, formatted documents, and audio output. Works out of the box with macOS built-in TTS. Optional recommended tools (pandoc, ffmpeg, mlx-audio) enhance qua
--- name: local-stt description: Local STT with selectable backends - Parakeet (best accuracy) or Whisper (fastest, multilingual). metadata: {"openclaw":{"emoji":"🎙️","requires":{"bins":["ffmpeg"
--- name: douyin-download description: Tool for downloading watermark-free Douyin videos and extracting their copy (script text) metadata: openclaw: emoji: 🎵 requires: bins: [ffmpeg] env: [SILI_FLOW_API_KEY] --- # d