Windows voice companion for OpenClaw. Custom wake word via Porcupine, local STT via faster-whisper, streamed responses over the gateway WebSocket, and ElevenLabs TTS with natural chime/thinking sounds
Download videos and get transcripts, summaries, or metadata from YouTube, TikTok, Instagram, and X (Twitter). Use when the user shares a video URL and wants...
Automates video rough editing by detecting silence, scoring segments, removing duplicates, and generating a best-segment clip and detailed report.
OpenClaw语音技能套装 - 完整的离线语音交互解决方案,支持6种中文方言(普通话、粤语、吴语、客家话、闽南话、四川话)
Unified speech-to-text skill. Use when the user asks to transcribe audio or video, generate subtitles, identify speakers, translate speech, search transcript...
Sync, transcribe, and intelligently organize voice memos, audio/video files, and URLs. 同步、转录、智能整理语音备忘录、音视频文件和视频链接。
Video summarization for Bilibili, Xiaohongshu, Douyin, and YouTube. Extract insights from video content through transcription and summarization.
Generate professional captions and subtitles with multi-engine transcription, word-level timing, styling presets, and burn-in.
Convert Bilibili (B站) videos into a searchable text knowledge base. Supports single videos and batch processing of entire UP主 channels. Uses local whisper.cp...
All-in-one YouTube content generator - create regular videos, Shorts from scratch, and Shorts from long videos. Combines best of youtube-factory and AI-Youtu...
Memorist Agent — helps you capture your parents' and family members' life stories through adaptive interviews via WhatsApp, WeChat, or direct conversation. O...
Play social deduction and game theory games against other AI agents. Register, queue, and play autonomously via HTTP API.
Document intelligence: categorize, autofill forms, analyze contracts, scan receipts/invoices, analyze bank statements, parse resumes/CVs, scan IDs/passports...
Extract recipes from Instagram reels. Use when a user sends an Instagram reel link and wants to get the recipe from the caption. Parses ingredients, instructions, and macros into a clean format.
ElevenLabs TTS (Text-to-Speech) with emotional audio tags for expressive voice synthesis. WhatsApp-compatible voice messages with Opus conversion. Supports 7...
Use when the user wants to transcribe, caption, or get the text content of a video or audio file — e.g. "transcribe this video", "get the transcript", "what...
Transcribe audio files (WAV/MP3/M4A/FLAC) to timestamped text using SenseVoice-Small + FSMN-VAD. Supports single-file and batch mode with VAD-anchored per-se...
Fully offline, CUDA-accelerated local voice assistant pipeline for NVIDIA Jetson. Wake word (openWakeWord) → real-time VAD → whisper.cpp GPU STT → LLM → Pipe...
视频一站式工作流技能包。整合视频剪辑、转写、烧录、拼接全流程,支持分步执行和用户确认。 包含:(1) auto-editor - 视频剪辑去除静音片段;(2) Faster
One-step full-stack installer for OpenClaw WebChat voice input with local speech-to-text. Orchestrates three focused skills in order: local STT backend (fast...
Analyze music/audio files locally without external APIs. Extract tempo, pocket/groove feel, pulse stability, swing proxy, section/repetition structure, key c...
End-to-end AI UGC video pipeline. Product info → GPT-4o-mini script → ElevenLabs voiceover → Aurora talking head (fal-ai/creatify/aurora) → Kling 2.6 Pro pro...
Analyze YouTube notifications for investment and trading insights. Use when user wants investment advice from YouTube, analyzing stock crypto or financial co...
⚠️ DEPRECATED — This skill has been split into two separate skills for better modularity: **webchat-https-proxy** (HTTPS/WSS reverse proxy) and **webchat-voi...