Generate and translate video subtitles using WhisperX and LLM translation. Use when processing video files to create .srt subtitle files. Supports multilingu...
AI voice call agent — make outbound calls, generate browser call links, accept inbound calls, and retrieve full transcripts + summaries when calls end. Suppo...
Clips a YouTube video locally using yt-dlp and ffmpeg. Supports auto-highlight detection, translation, and CapCut-style karaoke subtitle burning. Triggers wh...
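Convert audio/video to text using the iFLYTEK API. Supports transcribing local audio files and downloading YouTube videos for transcription. Suited to meeting notes, video subtitles, voice memos, and similar scenarios. When the user needs...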
Medical record structuring and standardization tool. Converts doctor's oral or handwritten medical records into standardized electronic medical records (EMR)...
Enter Plaza One, a 3D voxel social world. Move around the plaza, chat with humans and other AI agents, observe surroundings, perform emotes, and interact wit...
Fetch, transcribe, and analyze content from URLs, files, or transcripts across multiple platforms, providing personalized, multi-dimensional insights.
Convert text to speech using MiniMax Speech 2.6 Turbo via WaveSpeed AI. Features ultra-human voice cloning, sub-250ms latency, 40+ languages, emotion control...
A thousand years of spring compressed into fifteen breaths. An immersive journey on drifts.bot — 5 steps, LOW intensity, 15-30 min. Browse, start, and travel...
Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user asks to create a poster, piece of art...
Unified multimodal content parser for images, PDF, DOCX, and audio, with automatic OCR/transcription. Outputs structured text for LLM processing.
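Video transcription workflow supporting Bilibili and YouTube videos. Automatically detects whether a video has subtitles: if it does, fetches them; if not, downloads the audio and transcribes it with Whisper. Triggers: (1) the user asks to summarize a video...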
HTTPS/WSS reverse proxy for OpenClaw WebChat Control UI. Serves the Control UI over HTTPS with TLS cert management, proxies WebSocket connections to the gate...
Azure OpenAI Service integration. Manage Models, Deployments, Prompts, Completions. Use when the user wants to interact with Azure OpenAI Service data.
Connect with 17 specialized AI mentors for expert guidance on growth, fundraising, sales, product, engineering, operations, finance, legal, hiring, leadershi...
--- name: voice-agent display-name: AI Voice Agent Backend version: 1.1.0 description: Local Voice Input/Output for Agents using the AI Voice Agent API. author: trevisanricardo homepage: https://githu
Local ASR and TTS inference server. Use when the user wants to transcribe audio to text (ASR) or convert text to speech (TTS). Requires a running Willow Infe...
Fact-check news articles, social media posts, images, and videos. Use when verifying claims, detecting deepfakes or AI-generated content, identifying out-of-...
Live as a character in Agent World - a multi-agent social simulation where AI agents move, talk, form relationships, and remember experiences in a shared per...
Decentralized compute and data marketplace for AI agents, with dynamic spot pricing.
Install and operate local NVIDIA Parakeet ASR for OpenClaw with an OpenAI-compatible transcription API on Ubuntu/Linux and macOS (Intel/Apple Silicon). Use w...
OpenAI API integration — chat completions, embeddings, image generation, audio transcription, file management, fine-tuning, and assistants via the OpenAI RES...
Automate WhatsApp at scale — mine leads from groups with AI, broadcast to channel followers, bulk message with ban-safe delays, schedule campaigns, auto-repl...