Local text-to-speech (TTS) and speech-to-text (STT) using FluidAudio on Apple Silicon. Sub-second voice synthesis and transcription running entirely on-device via the Apple Neural Engine. Use when set
Generate images from text prompts using ZenMux API (Vertex AI protocol with Gemini models). Use when: (1) User wants to generate/create images from text desc...
Execute multimodal tasks using Novita AI: text-to-image, image-to-image, text-to-video, image-to-video, TTS, STT. Use for: generating images, generating vide...
Text-to-speech using macOS built-in `say` command. Use for voice notifications, audio alerts, reading text aloud, or announcing messages through Mac speakers. Supports multiple languages including Chi
Web search and content extraction via Brave Search API. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...
Text-to-speech conversion tool. Use when converting text to speech audio files (opus or mp3 format). Supports macOS native 'say' command and Google TTS (gTTS...
Control Sonos speakers (discover, status, play, volume, group). And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, m...
Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comp...
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) U
Generate images and videos using xAI Grok Imagine Extended. Text-to-image, image editing, text-to-video, image-to-video. Use when: user asks to generate, cre...
On-device speech-to-text (Whisper) + text-to-speech (Qwen3-TTS) CLI. Runs on the Apple Neural Engine (ANE), Apple's low power, dedicated ML inference chip. M...
Fill PDF forms programmatically with text values and checkboxes. Use when you need to populate fillable PDF forms (government forms, applications, surveys, etc.) with data. Supports setting text field
Use the Meshy.ai REST API to generate assets: (1) text-to-2d (Meshy Text to Image) and (2) image-to-3d, then download outputs locally. Use when the user wants Meshy generations, needs polling async ta
Guide for creating effective skills for Clawdbot agents. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, c...
xAI Grok Imagine API integration for image generation, text-to-video, image-to-video, and editing via natural language. Use when you need to generate images or videos from text prompts, edit existing
Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Combines Wikipedia's "Sig...
Text-to-Speech via Zvukogram API with SSML support. Use when you need to generate speech from text, create podcasts, voice notifications, or work with audio....
Generates images and videos using MuleRouter or MuleRun multimodal APIs. Text-to-Image, Image-to-Image, Text-to-Video, Image-to-Video, video editing (VACE, k...
Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comprehensive "Signs of AI writing" guide. Dete
Text-to-speech conversion using Zhipu AI (BigModel) GLM-TTS model. Use when you need to convert text to audio files with various voice options. Supports Chin...
Extract frames or short clips from videos using ffmpeg. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...
Text-to-Speech via macOS say command with Siri Natural Voices. Use for generating speech audio, TTS clips, or speaking text aloud on macOS.
Gemini CLI for one-shot Q and A, summaries, and generation. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music...
FREE voice recognition using Groq's complimentary Whisper API. Transcribe audio messages to text in 50+ languages at no cost. Perfect for voice-to-text autom...