Convert Mandarin Chinese text to high-quality audio using UniSound's WebSocket TTS API with adjustable voice, speed, volume, pitch, and format options.
Create animated talking-circle videos (Telegram-style round video messages) from avatar frame images and audio. Supports audio-to-video and text-to-video via...
Automated text-to-video pipeline with multi-provider TTS/ASR support (OpenAI, Azure, Aliyun, Tencent).
Play social deduction and game theory games against other AI agents. Register, queue, and play autonomously via HTTP API.
End-to-end voice workflow with Deepgram STT and TTS. Use when transcribing voice messages, generating spoken replies, or building a shell-based audio pipelin...
Create AI avatar and talking head videos with OmniHuman, Fabric, PixVerse via inference.sh CLI. Models: OmniHuman 1.5, OmniHuman 1.0, Fabric 1.0, PixVerse Li...
Discover and install related skills from inference.sh skill registry. Helps find complementary skills for your AI workflow. Use for: skill discovery, workflo...
Generate comic and manga panels, strips, and pages using each::sense AI. Create superhero comics, manga pages, webtoons, action sequences, and convert photos...
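Call the MiniMax speech synthesis API for high-quality Chinese text-to-speech with multiple voices, with both streaming and non-streaming audio output.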
Complete Venice AI API toolkit - image generation, video, audio, embeddings, transcription, characters, models, and admin functions. Privacy-focused inferenc...
Translate and dub existing videos into multiple languages using HeyGen. Use when: (1) Translating a video into another language, (2) Dubbing video content wi...
Access and interact with AI group interview simulations: browse jobs, create/join rooms, speak, advance interviews, upload resumes, and view history and eval...
Route Alibaba Cloud Model Studio requests to the right local skill (Qwen Image, Qwen Image Edit, Wan Video, Wan R2V, Qwen TTS, Qwen ASR and advanced TTS vari...
Voice cloning and TTS using MiniMax API. User must provide a voice name when cloning; after success, voice_name->voice_id is written back to this skill doc f...
Multi-speaker dialogue audio creation with Dia TTS. Covers speaker tags, emotion control, pacing, conversation flow, and post-production. Use for: podcasts,...
Interactive AI avatar with Simli video rendering and ElevenLabs TTS. Homepage: https://github.com/Johannes-Berggren/openclaw-avatar
Generate AI music and songs with Diffrythm, Tencent Song Generation via inference.sh CLI. Models: Diffrythm (fast song generation), Tencent Song Generation (...
Interact with GitHub using the gh CLI for issues, PRs, CI runs, and advanced queries. Also includes 50+ models for image generation, video generation, text-to-spe...
Clawdbot documentation expert with decision tree navigation, search, and doc fetching. Also includes 50+ models for image generation, video generation, text-to-sp...
Ultimate AI agent memory system with WAL protocol, vector search, git-notes, and cloud backup. Also includes 50+ models for image generation, video generation, te...
Use the mcporter CLI to list, configure, auth, and call MCP servers and tools directly. Also includes 50+ models for image generation, video generation, text-to-s...
Call EngageLab App Push REST APIs to send push notifications and in-app messages to Android, iOS, and HarmonyOS devices; manage tags and aliases; create sche...
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube). Also includes 50+ models for image generation, video generation, text-to-speec...
Fetch and read transcripts from YouTube videos for summarization and content extraction. Also includes 50+ models for image generation, video generation, text-to-...