--- name: openrouter-transcribe description: Transcribe audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc). homepage: https://openrouter.ai/docs metadata: {"clawdbot":{"
Transcribe audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc.).
--- name: qwen-audio description: "High-performance audio library with text-to-speech (TTS) and speech-to-text (STT)." version: "0.0.4" --- # Qwen-Audio ## Overview Qwen-Audio is a high-performance
--- name: audio-play description: Play audio files using Windows media player. Non-blocking execution. tools: - play_audio --- # Audio Play ## Usage ```bash python scripts/audio_play.py <audi
--- name: youtube-audio-download description: Download YouTube video audio and convert to MP3. Supports age-restricted videos with cookies. tools: - download_audio --- # Youtube Audio Download
Quickly upload audio to the AIOZ Stream API. Create audio objects with default or custom encoding configurations, upload the file, complete the upload, then return the audio link to the user.
Analyze audio quality, detect noise types, and provide improvement recommendations. Use when users need to check audio quality, validate recordings, or ident...
Generate audiobooks, podcasts, or educational audio content on demand. The user provides an idea or topic, Claude AI writes a script, and ElevenLabs converts it to high-quality audio. Supports multiple fo
Manage macOS audio output and Bluetooth devices via the macos-audio CLI. Use when scanning paired devices, connecting or disconnecting Bluetooth, switching a...
Speaker separation, voice comparison, and audio processing tools. Use when working with multi-speaker audio, voice cloning, or speaker verification tasks inc...
Audio transcription and text-to-speech generation using OpenRouter API. Use when the user needs to transcribe audio files to text or generate speech/audio fr...
Convert narration audio plus slide decks into a narrated video. Use when the user has an audio-only `mp4/m4a/mp3/wav` and a `ppt/pptx/pdf` deck, and needs sl...
AI audio generation powered by CellCog. Text-to-speech, voice synthesis, voiceovers, podcast audio, narration, music generation, background music, sound design. Professional audio creation with AI.
Send TTS audio as a proper playable audio message (not file attachment) to Feishu chats. Use when asked to send voice messages, TTS audio, speech announcemen...
CLI audio mastering without a reference track using ffmpeg; accepts audio or video inputs and outputs mastered WAV/MP3 or remuxed MP4.
Text-to-speech, speech-to-text, voice conversion, and audio processing using EachLabs AI models. Supports ElevenLabs TTS, Whisper transcription with diarization, and RVC voice conversion. Use when the
Transcribe recorded audio files to text via UniCloud ASR API, supporting multiple formats and domains like finance and customer service; requires configured...
Perform audio editing tasks including trimming, volume adjustment, format conversion, and extracting audio from video files using natural language commands.
High-performance audio library for Apple Silicon with text-to-speech (TTS) and speech-to-text (STT).
Read, analyze, convert, trim, merge, adjust volume, and transcribe audio files in multiple formats including MP3, WAV, FLAC, AAC, OGG, and more.
Generate images, audio, and video using MiniMax MCP and send them to Telegram. Use when the user wants to create media with MiniMax and deliver it via Telegram.
Automates uploading multiple sources (files, URLs, YouTube, Drive, text) to a NotebookLM notebook, generating a deep dive audio overview in a preferred langu...