AI task hub for image analysis, background removal, speech-to-text, text-to-speech, markdown conversion, and async execute/poll/presentation orchestration. U...
Use AudioPod AI's API for audio processing tasks including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to
Local STT and TTS on macOS using native Apple capabilities. Speech-to-text via yap (Apple Speech.framework), text-to-speech via say + ffmpeg. Fully offline, no API keys required. Includes voice qualit
Simple text-to-speech skill using MiniMax Voice API. Converts text to audio with customizable voice selection. Use for generating speech audio from text.
Generate speech audio from text using Telnyx Text-to-Speech API. Use when you need to convert text to spoken audio, create voice messages, or generate audio content.
Local TTS router for Apple Silicon — pull models, serve OpenAI-compatible API, synthesize speech, clone voices. Use when the user asks to "generate speech",...
--- name: ressemble displayName: Ressemble - Adriano version: 1.0.0 description: Text-to-Speech and Speech-to-Text integration using Resemble AI HTTP API. author: Adriano Vargas tags: [tts, stt, audio
Transcribe audio files via Groq's OpenAI-compatible speech-to-text API. Use when the user sends voice messages or audio files and you need fast cloud speech-...
Swiss-knife for AI agents. 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, emai...
High-performance audio library for Apple Silicon with text-to-speech (TTS) and speech-to-text (STT).
Extract frames or short clips from videos using ffmpeg. Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...
Use the ClawdHub CLI to search, install, update, and publish agent skills. Also includes 50+ models for image generation, video generation, text-to-speech, speech...
Notion API for creating and managing pages, databases, and blocks. Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-text...
Local speech-to-text using Vosk. Lightweight, fast, fully offline. Perfect for transcribing Telegram voice messages, audio files, or any speech-to-text task without cloud APIs.
Advanced desktop automation with mouse, keyboard, and screen control. Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-t...
Local text-to-speech (TTS) and speech-to-text (STT) using FluidAudio on Apple Silicon. Sub-second voice synthesis and transcription running entirely on-device via the Apple Neural Engine. Use when set
Gemini CLI for one-shot Q&A, summaries, and generation. Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-text, music...
Control Sonos speakers (discover, status, play, volume, group). Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-text, m...
--- name: qwen-audio description: "High-performance audio library with text-to-speech (TTS) and speech-to-text (STT)." version: "0.0.4" --- # Qwen-Audio ## Overview Qwen-Audio is a high-performance
Get current weather and forecasts (no API key required). Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, c...
Summarize per-model usage for Codex or Claude, including cost tracking. Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-...
Local search and indexing CLI (BM25 + vectors + rerank) with MCP mode. Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-...
Edit PDFs with natural-language instructions using the nano-pdf CLI. Also includes 50+ models for image generation, video generation, text-to-speech, speech-to-te...