Transcribe audio files to text using local speech recognition. Triggers on: "转录", "transcribe", "语音转文字", "ASR", "识别音频", "把这段音频转成文字".
ElevenLabs TTS (Text-to-Speech) with emotional audio tags for expressive voice synthesis. WhatsApp-compatible voice messages with Opus conversion. Supports 7...
Convert documents and files (PDF, Word, PowerPoint, Excel) to Markdown using markitdown. Also covers 50+ models for image generation, video generation, text-to-...
Async AI image generation (text-to-image and image-to-image). Submit a job to get a task_id, then poll status to get an OSS download URL.
Generate and edit images using ByteDance's Seedream V4.5 model via WaveSpeed AI. Supports text-to-image generation and multi-image editing with custom resolu...
Convert text to speech using Microsoft Edge's TTS engine with customizable voices, direct playback, and automatic temporary file cleanup.
Clone any voice and generate speech using Coqui XTTS v2. Provide a voice sample (6-30 sec WAV) and text to get cloned-voice audio. Supports 14+ languages. Use when the user wants to (1) C
Parse PDF documents with MinerU MCP to extract text, tables, and formulas. Supports multiple backends including MLX-accelerated inference on Apple Silicon.
Create animated talking-circle videos (Telegram-style round video messages) from avatar frame images and audio. Supports audio-to-video and text-to-video via...
Control Slack from Clawdbot, including reacting to messages and pinning items.
Local search and indexing CLI (BM25 + vectors + rerank) with MCP mode.
Posts content to WeChat Official Account (微信公众号) via API or Chrome CDP. Supports article posting (文章) with HTML, markdown, or plain text input, and image-tex...
An MCP server that responds with plain text content for AI tools, built with Node.js/TypeScript and the Model Context Protocol SDK. It uses the Streamable HTTP transport with session management over E
Use the xapi CLI to access real-time external data: Twitter/X profiles, tweets, and timelines, crypto token prices and metadata, web search, news, and AI text p...
Hybrid document intelligence pipeline ingesting PDFs, images, and spreadsheets with OCR, visual and text search, and field fix capture for fast retrieval.
Extract Chinese and English text from images and scanned PDFs, including documents like invoices and contracts, using PaddleOCR in Python.
Read and summarize WeChat Official Account articles (微信公众号文章) by URL. Bypasses WeChat's anti-bot detection to extract full article text, title, author, date,...
Generates article cover images with 5 dimensions (type, palette, rendering, text, mood) combining 10 color palettes and 7 rendering styles. Supports cinemati...
Zhipu AI Web Page Reader Tool: fetches and parses web page content into structured Markdown or text via cURL. Use when you need to fetch and read the content...
Control the user's real Safari browser on macOS using AppleScript and screencapture. Read pages, click elements, type text, take screenshots, navigate tabs —...
Automate web form interactions including login, file upload, text input, and form submission using Playwright. Use when user needs to automate website intera...
Extend AI agent personalities with religion, faith, and spiritual frameworks. Define principles, sacred texts, moral frameworks, traditions, and more.
Remove AI writing patterns based on Wikipedia's "Signs of AI writing" research. Provides 24 pattern-detection and rewriting rules for making AI-generated text sound n...
Create product demo videos with voiceover, text overlays, and real browser interactions. Fully automated, zero cost. Uses Puppeteer (headless Chrome), edge-t...