Gemini CLI for one-shot Q and A, summaries, and generation. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music...
Text-to-Speech via macOS say command with Siri Natural Voices. Use for generating speech audio, TTS clips, or speaking text aloud on macOS.
Swiss-knife for AI agents. 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, emai...
Convert Chinese Mandarin text to high-quality audio using UniSound's WebSocket TTS API with adjustable voice, speed, volume, pitch, and format options.
Cross-platform mouse/keyboard automation skill. Supports mouse control (move/click/drag/scroll), keyboard control (key press/hotkeys/type text), screen opera...
AI image generation with SeeDream 4.5, Midjourney, Nano Banana 2, Nano Banana Pro. Text-to-image, image-to-image with intelligent model selection and knowled...
Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT...
Enables voice synthesis, voice cloning, voice design, and audio post-processing using MiniMax Voice API and FFmpeg. Use when converting text to speech, creat...
Automate TikTok slideshow marketing for any app or product. Researches competitors, generates AI images, adds text overlays, posts via Postiz, tracks analyti...
Creates TikTok image carousels (slideshows with text overlays on photos) via the ViralBaby API. Use when the user wants to: create TikTok slideshows or carou...
Text-to-speech conversion using OpenAI's TTS API for generating high-quality, natural-sounding audio. Supports 6 voices (alloy, echo, fable, onyx, nova, shimmer), speed control (0.25x-4.0x), HD qualit
AI-powered translation services using TranslateFlow API - translation, translate text, language conversion, multilingual translation, language translation, d...
Generate audiobooks from novels and long-form text with chapter management and character voices. Use when users mention audiobooks, narrating books, or conve...
Build WCAG 2.1 AA compliant websites with semantic HTML, proper ARIA, focus management, and screen reader support. Includes color contrast (4.5:1 text), keyboard navigation, form labels, and live regi
Top-tier AI music generation with models: Suno sonic v4, Suno sonic v5, DouBao BGM (GenBGM), DouBao Song (GenSong). One-stop text-to-music with custom mode,...
Format text for Discord using markdown syntax. Use when composing Discord messages, bot responses, embed descriptions, forum posts, webhook payloads, or any...
Run a local script to work with PDF files, DOCX documents, OCR, and text-to-speech. Use the read tool to load this SKILL.md, then exec the uv run command ins...
Feishu/Lark Bot API wrapper for full-featured Feishu bot interactions. Use when users need to send messages (text, image, audio, file, post, interactive, sha...
Python library offering file handling, text extraction, data conversion, utilities, memory storage, and validation tools for AI agent workflows.
Unified AI execution engine. Single API key (WODEAPP_API_KEY) routes to 343+ models across text, image, video, TTS, and structured JSON — with automatic cost...
Local text-to-speech using Qwen3-TTS with mlx_audio (macOS Apple Silicon) or qwen-tts (Linux/Windows). Privacy-first offline TTS with natural, realistic voic...
Automated Chrome browser using nodriver for AI agent web tasks. Full CLI control with LLM-optimized commands — text-based interaction, markdown output, sessi...
Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms. When Claude needs to fill in a PDF form or programmatically p