Fetch, classify, and summarize papers from multiple sources (arXiv, etc.) with AI-powered multi-language summaries and email delivery.
Integrate Shengwang (Agora) products: ConvoAI voice agents, RTC audio/video, RTM messaging, Cloud Recording, and token generation. Use when the user mentions...
Control JoyIn AI robots (W-1 Walle / M-1 Mini) — movement, follow, photo, video, live stream, TTS, agent config, and device status via OpenAPI.
Your eyes, hands, and ears on Android. See the screen (screenshot + indexed UI tree), interact (tap, swipe, scroll, type, clear-field), navigate via deep lin...
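Rewrite spoken text already recognized by an input method or voice input system into natural written language suitable for sending over instant messaging, and translate it when the user explicitly specifies a target language. Use when voice input ends...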
Convert narration audio plus slide decks into a narrated video. Use when the user has an audio-only `mp4/m4a/mp3/wav` and a `ppt/pptx/pdf` deck, and needs sl...
Transcribe and organize voice memos with automatic categorization and information extraction. Use when users have voice notes, audio memos, or spoken notes t...
Complete Venice AI API toolkit - image generation, video, audio, embeddings, transcription, characters, models, and admin functions. Privacy-focused inferenc...
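Run a single AI inference algorithm on the RDK X5's 10 TOPS BPU: YOLO object detection, image classification, semantic segmentation, face recognition, gesture recognition, human keypoint detection, open-vocabulary detection (DOSOD/YOLO-World)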
Create and launch Toingg voice-calling campaigns by POSTing user-supplied JSON to the toingg/make_campaign API. Use when Codex needs to turn campaign briefs...
--- name: douyin-download description: Tool for downloading Douyin videos without watermarks and extracting their copy text metadata: openclaw: emoji: 🎵 requires: bins: [ffmpeg] env: [SILI_FLOW_API_KEY] --- # d
Unified gateway skill for async execute/poll, portal user closure, and telemetry feedback workflows.
Deconstruct Douyin videos in depth and automatically generate a complete analysis report covering data, structure, visuals, and copy.
Kazakh text converter between Cyrillic and Arabic scripts. Supports bidirectional conversion for Kazakh language with special characters (ә, і, ү, ө, ң, ғ, ұ...
Generate synchronized subtitles (SRT/VTT/ASS) from video audio with precise timestamps. Use when users need subtitles, captions, or video transcription with...
Analyze audio quality, detect noise types, and provide improvement recommendations. Use when users need to check audio quality, validate recordings, or ident...
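Execute multimodal tasks using PPIO: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating videos, text-to-speech, speech recognition.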
Convert Bilibili (B站) videos into a searchable text knowledge base. Supports single videos and batch processing of entire uploader (UP主) channels. Uses local whisper.cp...
Execute multimodal tasks using Novita AI: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating vide...
SiliconFlow multimodal service supporting image generation (FLUX/Qwen), video generation (Wan), TTS speech synthesis, and ASR speech recognition. Paid for with vouchers.
Deploy MiGPT on a Xiaomi smart speaker to replace the built-in AI with a custom LLM-powered voice assistant. Use when: (1) setting up mi-gpt on a Xiaomi/Redm...
AI task hub for image analysis, background removal, speech-to-text, text-to-speech, markdown conversion, and async execute/poll/presentation orchestration. U...
Build ORBCAFE advanced analytics interactions using CPivotTable/usePivotTable and voice navigation using CAINavProvider/useVoiceInput. Use when requests invo...