News video maker skill. Use search tools to get news, generate speech, and create a video with golden subtitles. Use when creating news briefing videos.
OpenAI API integration — chat completions, embeddings, image generation, audio transcription, file management, fine-tuning, and assistants via the OpenAI RES...
Voice conversation interface for OpenClaw using wake word detection, streaming LLM responses, and text-to-speech. Use when a user wants to talk to their Open...
Speak responses aloud on macOS using the built-in `say` command when user input indicates Voice Wake/voice recognition (for example, messages starting with "User talked via voice recognition on <devic
Complete Venice AI API toolkit - image generation, video, audio, embeddings, transcription, characters, models, and admin functions. Privacy-focused inferenc...
Clone any voice and generate speech using Coqui XTTS v2. SUPER SIMPLE - provide a voice sample (6-30 sec WAV) and text, get cloned voice audio. Supports 14+ languages. Use when the user wants to (1) C
Give OpenClaw a body — a tiny fluid glass ball desktop pet with voice cloning, 15+ eye expressions, desktop lyrics overlay, and 7 mood colors. Electron-based, pure CSS/JS animation.
Real-time WhatsApp voice message processing. Transcribe voice notes to text via Whisper, detect intent, execute handlers, and send responses. Use when building conversational voice interfaces for What
End-to-end pipeline for creating faceless Islamic story TikTok videos. Orchestrates multiple specialized agents: story research, scriptwriting, image generat...
Fetch, classify, and summarize papers from multiple sources (arXiv, etc.) with AI-powered multi-language summaries and email delivery.
Remove signs of AI-generated writing from text to make it sound more natural and human-written. Also offers 50+ models for image generation, video generation, t...
Go live on retake.tv — the livestreaming platform built for AI agents. Register once, stream via RTMP, interact with viewers in real time, and build an audie...
Build, operate, and troubleshoot Autonoannounce local speaker text-to-speech using the queued pipeline (enqueue to worker to ElevenLabs to playback backend)....
Unified gateway skill for async execute/poll, portal user closure, and telemetry feedback workflows.
Transcribe audio files via Doubao Seed-ASR 2.0 (豆包录音文件识别模型2.0, recorded audio → text) API from ByteDance/Volcengine. Best-in-class Chinese speech recognition...
Sync and query CalDAV calendars (iCloud, Google, Fastmail, Nextcloud) using vdirsyncer and khal. Also offers 50+ models for image generation, video generation,...
podcast-intel: Podcast intelligence engine. Transcribes, segments, summarizes, and scores podcast episodes from RSS feeds. Generates "worth your time" recommendations wit
Use this skill whenever the user wants speech to sound more human, companion-like, or emotionally expressive. Triggers include: any mention of 'say like', 't...
Swiss-knife for AI agents. 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, emai...
Use when creating cloned voices with Alibaba Cloud Model Studio CosyVoice customization models, especially cosyvoice-v3.5-plus or cosyvoice-v3.5-flash, from...
Helps with development on HarmonyOS NEXT using the Baidu Maps HarmonyOS SDK. Supports standalone packages (@bdmap/base, @bdmap/map, @bdmap/search, @bdmap/util) and combined packages (@bdmap/map_walkride_search, @bdmap/na
agent-media: AI UGC video production from the terminal using the `agent-media` CLI. Homepage: https://github.com/gitroomhq/agent-media
Audio transcription and text-to-speech generation using OpenRouter API. Use when the user needs to transcribe audio files to text or generate speech/audio fr...