Generate and stitch short videos via Google Veo 3.x using the Gemini API (google-genai). Use when you need to create video clips from prompts (ads, UGC-style...
Create language learning audio with SenseAudio TTS, including pronunciation drills, bilingual lessons, slowed speech practice, and dialogue exercises. Use wh...
通用音乐下载管理器。支持从YouTube/Bilibili搜索下载音乐,自动转MP3,按分类存入本地音乐库
Automatically extracts audio from video, transcribes it using qwen3-asr-flash, and generates segmented text summaries saved alongside the original file.
Control a Vector robot via Wirepod’s local HTTP API on the same network. Use when you need to move Vector, tilt head/lift, speak text, capture camera frames, or run patrol/explore routines from the
Test-driven behavioral verification for AI agents. Catches silent degradation when agent loads memory but doesn't apply learned behaviors. Use when building agent with persistent memory, testing after
Extract and analyze content from video ads using Gemini Vision AI. Supports frame extraction, OCR text detection, audio transcription, and AI-powered scene analysis. Use when analyzing video creative
Send and receive voice messages on Feishu (Lark) using ElevenLabs TTS and STT. Activate when user asks to send a voice message on Feishu, or when receiving a...
AI-native workflow analyzer for Loom recordings. Breaks down recorded business processes into structured, automatable workflows. Use when: - Analyzing Loom videos to understand workflows - Extracting
MOSS-TTS 语音合成与音色克隆工具。生成适合各渠道的音频文件。 触发场景: - 用户要求生成语音、TTS - 用户提到"用我的声音"、"克隆声音"、"MOSS语音" -
将文本通过 MOSS-TTS 转换为语音,并发送到飞书群/个人。支持语音消息格式(带波形条)。
Use AudioPod AI's API for audio processing tasks including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to
Analyze music/audio files locally without external APIs. Extract tempo, pocket/groove feel, pulse stability, swing proxy, section/repetition structure, key c...
深度拆解抖音视频,自动生成包含数据、结构、视觉、文案的完整分析报告。
Local text-to-speech using Qwen3-TTS with mlx_audio (macOS Apple Silicon) or qwen-tts (Linux/Windows). Privacy-first offline TTS with natural, realistic voic...
Fact-check news articles, social media posts, images, and videos. Use when verifying claims, detecting deepfakes or AI-generated content, identifying out-of-...
视频批量处理技能包 - 一键处理100个视频,自动剪辑、加字幕、配乐、调风格。适合自媒体从业者、短视频创作者。
语音回复技能 - 使用讯飞 TTS 生成语音并发送到飞书。当需要用语音回复用户消息时使用。触发词:用语音、语音回复、切换语音模式、语音模式。
将录音/语音转写为结构化演讲纪要。适用于:会议讲话、内部分享、演讲录音的转写整理。 触发条件:用户发送音频文件并要求整理/转写/纪要,或要
Real-time AI video chat that routes through your OpenClaw agent. Uses Groq Whisper (cloud STT), edge-tts (cloud TTS via Microsoft), and OpenClaw chatCompletions API for conversation. Your agent sees y
A robust CLI wrapper for yt-dlp to download videos, playlists, and audio from YouTube and thousands of other sites. Supports format selection, quality control, metadata embedding, and cookie authentic
Generate and stitch short videos via Google Veo 3.x using the Gemini API (google-genai). Use when you need to create video clips from prompts (ads, UGC-style clips, product demos) and want a reproduci
The organic growth playbook behind 300K+ app downloads. Your AI becomes a growth coach trained on the exact system that drove 500M+ views and $30K+ revenue.
Unified multi-modal content parser for images, PDF, DOCX, audio, auto OCR/transcription, output structured text for LLM processing