Swiss-knife for AI agents. 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, web search, document parsing, emai...
Get current weather and forecasts (no API key required). And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, c...
Guide for creating effective skills for Clawdbot agents. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, c...
Gemini CLI for one-shot Q and A, summaries, and generation. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music...
Web search and content extraction via Brave Search API. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...
Extract frames or short clips from videos using ffmpeg. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, ch...
Search and analyze your own session logs using jq. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, music, chat, w...
Use AudioPod AI's API for audio processing tasks including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to
Text-to-Speech and Speech-to-Text using ElevenLabs AI. Use when the user wants to convert text to speech, transcribe voice messages, or work with voice in multiple languages. Supports high-quality AI
Local ASR and TTS inference server. Use when the user wants to transcribe audio to text (ASR) or convert text to speech (TTS). Requires a running Willow Infe...
Text-to-Speech and Speech-to-Text integration using Resemble AI HTTP API.
Local text-to-speech (TTS) and speech-to-text (STT) using FluidAudio on Apple Silicon. Sub-second voice synthesis and transcription running entirely on-device via the Apple Neural Engine. Use when set
Local text-to-speech using Piper for voice message delivery. Use when the user asks for voice responses, audio messages, TTS, text-to-speech, voice notes, or...
Complete voice interaction server supporting speech-to-text, text-to-speech, and real-time voice conversations through local microphone, OpenAI-compatible APIs, and LiveKit integration
Enables local voice chat by embedding Hotbutter relay server and PWA, providing speech-to-text and text-to-speech via a secure, self-hosted connection.
On-device speech-to-text (Whisper) + text-to-speech (Qwen3-TTS) CLI. Runs on the Apple Neural Engine (ANE), Apple's low power, dedicated ML inference chip. M...
Jarvis TTS text-to-speech using Microsoft edge-tts with afplay playback. Use when users request voice output, audio responses, or text-to-speech. Provides na...
Use Sarvam AI for Indian language Text-to-Speech (TTS), Speech-to-Text (STT), Translation, and Chat.
Control Sonos speakers (discover, status, play, volume, group). And also 50+ models for image generation, video generation, text-to-speech, speech-to-text, m...
Notion API for creating and managing pages, databases, and blocks. And also 50+ models for image generation, video generation, text-to-speech, speech-to-text...