--- name: elevenlabs-pro description: ElevenLabs advanced TTS for converting text to speech, listing voices, and managing credits license: MIT metadata: version: 1.0.0 author: Jack2 tags: tts, a
Transcribe Telegram voice messages and audio notes into text using the OpenAI Whisper API. Use when (1) a user sends a voice message or audio note via Telegr...
Generate complete Xiaohongshu (Little Red Book) posts with up to 10 pages (3:4 vertical format). Auto-parses text content into cover + content pages. Support...
Remove unwanted objects, people, text, and imperfections from photos using each::sense AI. Clean up images with intelligent inpainting that seamlessly fills...
Hash generator and file integrity verifier. Generate MD5, SHA1, SHA256, SHA512 hashes for text and files, verify file integrity against expected hashes, comp...
Give your text-based OpenClaw agent the ability to see and describe images
transform any input into a deliberately degraded, semantically corrupted, non-explanatory experimental text stream. use when the user invokes "巴别渊" or "劣化",...
Generate speech from text using Kyutai Pocket TTS - lightweight, CPU-friendly, streaming TTS with voice cloning. English only. ~6x real-time on M4 MacBook Air.
Generate AI-optimized Alt Text, file names, captions, and Schema markup for images, videos, and audio assets. Improves AI discoverability on Google Lens, Cha...
Give your agent the ability to speak to you real-time. Talk to your Claude! Ultra-fast TTS, text-to-speech, voice synthesis, audio output with ~90ms latency....
Query and create field observations and AI-processed captures. Photos, voice notes, and text notes from the field.
AI translation via TranslateFlow API — multi-language content translation, localization, tone adaptation, batch translation. Use when user needs text transla...
Text-to-speech generation on Volcengine (ByteDance) speech services. Use when users need narration, multi-language speech output, voice selection, or TTS tro...
Extract plain-text transcripts from YouTube videos using a local Python script. Use when the user wants to fetch, extract, or get a transcript from a YouTube...
Implement quantum-resistant encryption using the CIFER SDK (cifer-sdk npm package). Covers SDK initialization, wallet setup, secret creation, text encryption/decryption, and file encryption/decryption
Detect and solve simple image captchas during browser automation. Use when flows encounter 4-6 character text, distorted alphanumeric, numeric, rotated, or a...
Transcribe audio using a deployed Cloudflare Worker Whisper endpoint. Use when converting voice/audio files (wav, mp3, m4a, ogg, webm) to text through the cu...
Generate publication-quality academic diagrams, methodology figures, architecture illustrations, and statistical plots from text descriptions using the Paper...
Transcribe audio files to text via Step ASR streaming API (HTTP SSE). Supports Chinese and English, multiple audio formats (PCM, WAV, MP3, OGG/OPUS), real-ti...
Prompt-injection and data-exfiltration screening for untrusted text. Use before summarizing web/email/social content, before replying, and especially before writing anything to memory. Provides a safe
Read any web page aloud with natural AI voices. Extract article text from any URL and convert it to audio (MP3). Use when the user wants to: listen to a webp...
Run a headless Chromium browser via Podman to fetch text or HTML from JavaScript-rendered web pages using Playwright in a container.
Use rn-logs to read React Native Metro logs via CDP without MCP overhead. Default output is plain text and safe for non-interactive agent runs.
Creates AI-generated videos from text scripts, URLs, or PPT/PDF documents using Visla. Use when the user asks to generate a video, turn a webpage into a vide...