--- name: doubao-launch description: Launch Doubao desktop application and configure real-time translation window. tools: - launch_doubao --- # Doubao Launch ## Usage ```bash python scripts/d
Generate Pinterest-optimized vertical videos using JSON2Video API. Supports AI-generated or URL-based images, AI-generated or provided voiceovers, optional subtitles, and zoom effects. Use when creati
Automated short drama video publisher. Downloads drama content from MoboBoost, uses AI to identify highlight moments, clips 15-second vertical videos with te...
Generate videos using TensorsLab's AI video generation models. Supports text-to-video and image-to-video generation with automatic prompt enhancement, progre...
Sync, transcribe, and intelligently organize voice memos, audio/video files, and URLs.
Premier AI video generation with models: Wan 2.6, Kling O1, Kling 2.6, Google Veo 3.1, Sora 2 Pro, Pixverse V5.5, Hailuo 2.0, Hailuo 2.3, SeeDance 1.5 Pro, V...
Generate precise, timecoded Seedance 2.0 prompts integrating multimodal inputs with asset mapping for controlled 4-15s video creation and editing.
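Call an HTTP service via curl to automatically create Jianying/CapCut drafts, arrange assets, subtitles, and effects, and start rendering. Use when the user wants AI video generation or batch editing automation.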
Bridge Twilio phone calls to Google Gemini Live API for real-time AI voice conversations. No STT/TTS middleware required. Includes VAD and echo suppression.
[Aibrary] Generate a book summary podcast script in a single-narrator storytelling style. Use when the user wants to turn a book into a podcast, create an au...
Create an AI clone video (talking head) from a single reference photo, a text script, and a cloned voice. Automates the pipeline of image generation (Gemini)...
SiliconFlow multimodal service supporting image generation (FLUX/Qwen), video generation (Wan), TTS speech synthesis, and ASR speech recognition. Paid with vouchers.
AI video production workflow using Remotion. Use when creating videos, short films, commercials, or motion graphics. Triggers on requests to make promotional...
Build and debug Groq API chat and speech workflows with low-latency routing, structured outputs, and production-safe patterns.
Unified skill for resolving, downloading, and delivering media (audio/video) to chat platforms. Integrates yt-dlp for resolution and handles Spotify metadata sync.
# System Prompt: Elite Cinematic & Forensic Analysis AI **Role:** You are an elite visual analysis AI capable of acting simultaneously as a **Director**, **Master Cinematographer**, **Production Desi
Access ElevenLabs APIs for text-to-speech, speech-to-speech, realtime speech-to-text, voice/model management, and dialogue workflows with direct HTTP calls.
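Build and troubleshoot SenseAudio meeting assistants, covering real-time meeting transcription, speaker diarization, real-time translation, meeting minutes generation, action item extraction, and transcript export.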
AI-powered Apple TV remote that uses vision to autonomously navigate apps, play content, control playback, and manage settings.
Comprehensive MOSS Transcribe Diarize workflow for high-confidence multi-speaker ASR. Use when users need (1) timestamped transcription, (2) speaker-labeled...
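Use the TogetheROS.Bot (tros.b) Humble framework on RDK X5: launch the 43 preinstalled ROS2 algorithm packages, manage ROS2 topics/nodes/services, build integrated camera + AI + output (Web/voice/HDMI) pipelines, ...
Download videos and analyze their content with AI. Supports Bilibili, Douyin, YouTube, and other platforms; extracts speech content and analyzes video structure.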
Use VLM Run (vlmrun) to generate transcriptions from YouTube videos. Download a video with yt-dlp, then run vlmrun to transcribe with optional timestamps. VLMRUN_API_KEY must be in .env; follow vlmrun