AI-powered speech tools: pronunciation assessment with phoneme-level feedback, speech-to-text with language detection, and text-to-speech with multiple voices.
Complete voice interaction server supporting speech-to-text, text-to-speech, and real-time voice conversations through local microphone, OpenAI-compatible APIs, and LiveKit integration.
Convert text into natural-sounding speech for fast audio creation. Orchestrate multi-speaker dialogues and merge segments into a single track. Produce ready-to-share audio for podcasts, videos, and de
MCP server that uses the open-weight Kokoro TTS models to convert text to speech. Can convert text to MP3 on a local drive or auto-upload to an S3 bucket.
Bitcoin-powered AI tools via Lightning Network micropayments (L402). Image, text, video, music, speech, 3D model generation, file conversion, and SMS — no signup or API keys required.
Generate high-quality text-to-speech and text-to-voice outputs using the [DAISYS](https://www.daisys.ai/) platform, with the ability to play and store the generated audio.
MCP server plugin for Claude Code that converts text to speech using OpenAI's TTS API. Features 6 voices, worker pool architecture, mutex-protected playback, and cross-platform support.
A Model Context Protocol (MCP) server that converts various file formats to Markdown using the MarkItDown utility.
Generate high-quality images and videos using FAL AI models with seamless automatic downloads to your local machine. Access generated content via public URLs, data URLs, or local file paths for maximu