Text-to-speech conversion using Python edge-tts for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and sub...
Command-line interface to manage Google NotebookLM notebooks, sources, and generate audio, quizzes, reports, presentations, and visual study materials progra...
Read any web page aloud with natural AI voices. Extract article text from any URL and convert it to audio (MP3). Use when the user wants to: listen to a webp...
Text-to-speech conversion using OpenAI's TTS API for generating high-quality, natural-sounding audio. Supports 6 voices (alloy, echo, fable, onyx, nova, shimmer), speed control (0.25x-4.0x), HD qualit
End-to-end voice workflow with Deepgram STT and TTS. Use when transcribing voice messages, generating spoken replies, or building a shell-based audio pipelin...
Create chapters, highlights, and show notes from podcast audio or transcripts. Use when a user wants chapter markers, highlight clips, or show-note drafts without publishing or distribution actions.
Convert articles, blog posts, or any text into professional podcast scripts and TTS audio. Use when a user wants to: (1) Transform written content into conve...
Install and use the speechall CLI tool for speech-to-text transcription. Use when the user wants to: (1) transcribe audio or video files to text, (2) install speechall on macOS or Linux, (3) list avai
Local text-to-speech using Piper for voice message delivery. Use when the user asks for voice responses, audio messages, TTS, text-to-speech, voice notes, or...
Veo, Veo 3.1 Fast - Google AI video generation models for AI agents. 1080p HD output, reference image support, intelligent audio generation.
Download videos, images, and audio without watermarks from 999+ platforms (TikTok, YouTube, Instagram, Twitter, Bilibili, Sora2, etc.) using the MeowLoad API...
Download videos, audio, subtitles, and covers from Bilibili using bilibili-api. Use when working with Bilibili content for downloading videos in various qual...
Set up mlx-whisper as the local audio transcription engine for OpenClaw on Apple Silicon Macs (M1/M2/M3/M4). Automatically transcribes voice notes sent via T...
Transcribe audio to text using ElevenLabs Scribe. Supports batch transcription, realtime streaming from URLs, microphone input, and local files.
Download videos, audio, subtitles, and clean paragraph-style transcripts from YouTube and any other yt-dlp supported site. Use when asked to “download this video”, “save this clip”, “rip aud
Local Spanish TTS using Microsoft VibeVoice. Generate natural voice audio from text, optimized for WhatsApp voice messages.
--- name: ressemble displayName: Ressemble - Adriano version: 1.0.0 description: Text-to-Speech and Speech-to-Text integration using Resemble AI HTTP API. author: Adriano Vargas tags: [tts, stt, audio
Auto-play TTS voice files with wake word detection. Only plays audio when user message contains wake words like "语音", "念出来", "voice", etc. Perfect for Webcha...
Use when the user wants to transcribe, caption, or get the text content of a video or audio file — e.g. "transcribe this video", "get the transcript", "what...
Render TikTok-style animated pill captions onto short-form videos using MoviePy + PIL. Takes a base MP4, a captions JSON, and optional background audio — out...
A robust CLI wrapper for yt-dlp to download videos, playlists, and audio from YouTube and thousands of other sites. Supports format selection, quality control, metadata embedding, and cookie authentic
Orchestrate book-to-content workflows to generate video, audio, cover images, and a manifest for episode or campaign packages.
Transcribe audio files (WAV/MP3/M4A/FLAC) to timestamped text using SenseVoice-Small + FSMN-VAD. Supports single-file and batch mode with VAD-anchored per-se...