Medical record structuring and standardization tool. Converts doctor's oral or handwritten medical records into standardized electronic medical records (EMR)...
Enter Plaza One, a 3D voxel social world. Move around the plaza, chat with humans and other AI agents, observe surroundings, perform emotes, and interact wit...
AI voice call agent — make outbound calls, generate browser call links, accept inbound calls, and retrieve full transcripts + summaries when calls end. Suppo...
Telnyx integration. Manage Accounts, PhoneNumbers, Medias, Conferences. Use when the user wants to interact with Telnyx data.
--- name: qwen3-tts-instruct version: 1.0.0 description: Alibaba Cloud Bailian Qwen TTS with voice/mood presets metadata: {"openclaw":{"emoji":"🔊"},"requires":{"env":["DASHSCOPE_API_KEY"],"bins
Create music with MiniMax music models (music-2.5+, music-2.5). Use when generating songs, instrumental tracks, or chanting from lyrics and style prompts via...
Clips a YouTube video locally using yt-dlp and ffmpeg. Supports auto-highlight detection, translation, and CapCut-style karaoke subtitle burning. Triggers wh...
使用科大讯飞 API 将音频/视频转换为文字。支持本地音频文件转录、YouTube 视频下载并转文字。适用于会议记录、视频字幕、语音笔记等场景。当用户需
AI-native workflow analyzer for Loom recordings. Breaks down recorded business processes into structured, automatable workflows. Use when: - Analyzing Loom videos to understand workflows - Extracting
Generate and translate video subtitles using WhisperX and LLM translation. Use when processing video files to create .srt subtitle files. Supports multilingu...
Process, enhance, and convert audio files with noise removal, normalization, format conversion, transcription, and podcast workflows.
Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user asks to create a poster, piece of art...
--- name: voice-agent display-name: AI Voice Agent Backend version: 1.1.0 description: Local Voice Input/Output for Agents using the AI Voice Agent API. author: trevisanricardo homepage: https://githu
Install and operate local NVIDIA Parakeet ASR for OpenClaw with an OpenAI-compatible transcription API on Ubuntu/Linux and macOS (Intel/Apple Silicon). Use w...
视频转写工作流,支持B站和YouTube视频。自动判断有字幕/无字幕,有字幕则获取字幕,无字幕则下载音频+whisper转写。触发场景:(1) 用户要求总结视频
Text-to-speech, sound effects, music generation, voice management, and quota checks via the ElevenLabs API. Use when generating audio with ElevenLabs or mana...
Local ASR and TTS inference server. Use when the user wants to transcribe audio to text (ASR) or convert text to speech (TTS). Requires a running Willow Infe...
OpenAI API integration — chat completions, embeddings, image generation, audio transcription, file management, fine-tuning, and assistants via the OpenAI RES...
Fact-check news articles, social media posts, images, and videos. Use when verifying claims, detecting deepfakes or AI-generated content, identifying out-of-...
Unified multi-modal content parser for images, PDF, DOCX, audio, auto OCR/transcription, output structured text for LLM processing
端到端播客制作流水线 - 从选题到发布的完整自动化。支持录制前调研、大纲生成、节目笔记、社交媒体宣发。含国内平台适配(小宇宙/喜马拉雅/B站/
Convert text to speech using MiniMax Speech 2.6 Turbo via WaveSpeed AI. Features ultra-human voice cloning, sub-250ms latency, 40+ languages, emotion control...
Connect with 17 specialized AI mentors for expert guidance on growth, fundraising, sales, product, engineering, operations, finance, legal, hiring, leadershi...