Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe...
Access and search Brazilian Chamber of Deputies data including deputies, bills, votes, committees, agendas, expenses, and legislative statuses via public API...
--- name: senseaudio-voice version: 2.1.0 description: SenseAudio Voice - 语音合成 (TTS) + 语音识别 (ASR),支持语言自动切换 metadata: {"openclaw":{"emoji":"🎤"}} tags: [tts, asr, vo
Turn any idea into a finished podcast in one command. AudioMind handles ElevenLabs voice narration (29+ voices), AI background music, and server-side audio m...
Rev.ai integration. Manage data, records, and automate workflows. Use when the user wants to interact with Rev.ai data.
Generate high-quality videos from text, images, or other videos using the Kling 3.0 Omni model. Covers text-to-video, image-to-video, video editing, video re...
Generate talking head videos from a portrait image and audio using WaveSpeed AI's InfiniteTalk model. Produces lip-synced video up to 10 minutes long at 480p...
Transcribe audio files to text via Step ASR streaming API (HTTP SSE). Supports Chinese and English, multiple audio formats (PCM, WAV, MP3, OGG/OPUS), real-ti...
Build a personal quotes system for saving, discovering, and automatically surfacing meaningful words.
Use VLM Run (vlmrun) to generate transcriptions from YouTube videos. Download a video with yt-dlp, then run vlmrun to transcribe with optional timestamps. VLMRUN_API_KEY must be in .env; follow vlmrun
Agent Skills (SKILL.md) builder, auditor, and improver for cross-platform LLM agents. Use for "skeall", "build a skill", "create skill", "improve skill", "au...
Edit, transform, extend, upscale, and enhance videos using EachLabs AI models. Supports lip sync, video translation, subtitle generation, audio merging, style transfer, and video extension. Use when t
All-in-one YouTube content generator - create regular videos, Shorts from scratch, and Shorts from long videos. Combines best of youtube-factory and AI-Youtu...
Connect to 100+ APIs (Google Workspace, Microsoft 365, Notion, Slack, Airtable, HubSpot, etc.) with managed OAuth. Use this skill when users want to interact...
MarkItDown is a Python utility from Microsoft for converting various files (PDF, Word, Excel, PPTX, Images, Audio) to Markdown. Useful for extracting structu...
Academic paper summarization with dynamic SOP selection based on paper topic classification. Supports method, dataset, multimodal, and other paper types with...
Academic paper quality filtering agent with rigorous scoring system and comprehensive audit trail. Filters papers based on relevance and quality criteria for...
Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for What
Discord project collaboration infrastructure for OpenClaw agents. Manage Forum Channels, threads, participant permissions, and mention mode. Supports 3-tier...
Give your agent a social identity on ImagineAnything.com — the social network for AI agents. Post, follow, like, comment, DM other agents, trade on the marketplace, and build reputation.
Handles voice-to-voice conversations on WhatsApp. Automatically transcribes incoming audio and responds with local TTS audio. Use when the user wants to "talk" instead of type.
Create and switch between AI assistant personalities. Use /personality to list and activate saved personalities. Use /create-personality to design new personas with auto-filled SOUL and IDENTITY. Pers
Summarize YouTube videos with NO subtitles by doing local ASR (yt-dlp + faster-whisper) and extracting a few screenshot frames via ffmpeg. Use when the user...
Search, install, and run multi-skill automations from clawflows.com. Combine multiple skills into powerful workflows with logic, conditions, and data flow between steps.