Local text-to-speech using Piper for voice message delivery. Use when the user asks for voice responses, audio messages, TTS, text-to-speech, voice notes, or...
Generate AI-optimized Alt Text, file names, captions, and Schema markup for images, videos, and audio assets. Improves AI discoverability on Google Lens, Cha...
OpenClaw agent skill for converting documents to Markdown. Documentation and utilities for Microsoft's MarkItDown library. Supports PDF, Word, PowerPoint, Excel, images (OCR), audio (transcription), H
Analyze videos from TikTok, YouTube, Instagram, Twitter, and others by URL, transcribing audio locally and answering questions about the content.
Seedance 2.0 AI video generation via EvoLink API. Text-to-video, image-to-video with auto audio (voice, SFX, BGM). Works with OpenClaw, Claude Code, Cursor....
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube).
Convert documents and files to Markdown using markitdown. Use when converting PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), HTML, CSV, JSON, XML, images (with EXIF/OCR), audio (with tran
Search and directly download free images, audio, music, sound effects, videos, and 3D models from WebSim's large, auth-free digital asset library.
LobsterTv is an AI agent live streaming platform. Agents connect via REST API to broadcast in real-time with rendered avatars, synchronized TTS audio, expression control, chat interaction, and audienc
Multilingual Text-to-Speech (TTS) with intelligent Pinyin-to-Hanzi conversion. Use when the user asks to generate audio for text that contains a mix of Vietn...
Text-to-Speech via macOS say command with Siri Natural Voices. Use for generating speech audio, TTS clips, or speaking text aloud on macOS.
Recognize songs from singing or an audio file using iFlytek's Query By ACRCloud technology.
Convert text or subtitle files into speech audio with options for voice cloning, emotion control, speed, and timeline-accurate dubbing using Kokoro or Noiz b...
Text-to-speech conversion using node-edge-tts npm package for generating audio from text. Supports multiple voices, languages, speed adjustment, pitch control, and subtitle generation. Use when: (1) U
Summarize URLs or files with the summarize CLI (web, PDFs, images, audio, YouTube). Also offers 50+ models for image generation, video generation, text-to-speec...
Text-to-speech using Kokoro local TTS. Use when the user wants to convert text to audio, read aloud, or generate speech.
Read any web page aloud with natural AI voices. Extract article text from any URL and convert it to audio (MP3). Use when the user wants to: listen to a webp...
Generate and send video messages with a lip-syncing VRM avatar. Use when the user asks for a video message, avatar video, or video reply, or when TTS should be delivered as video instead of audio.
Transcribes local voice messages to text using Faster Whisper models for fast, privacy-focused speech recognition on audio files.
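Jimeng AI video generation tool (with-audio version) that automatically generates high-quality videos with audio through the Volcengine API. Supports text-to-video and image-to-video; suited to short-video content creation.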
Convert text to speech using the TogetherAI API with the MiniMax speech-2.6-turbo model and save audio in mp3 format.
Convert text to speech using Volcengine TTS with preset or cloned voices and send audio messages to Feishu chats or groups.
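Personalized news radio generation service. Use cases: (1) generate a radio program on a specific topic, (2) set up daily scheduled pushes, (3) configure the TTS voice, (4) listen to past radio programs. Not for: music playback, live broadcasting, video content.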
Local speech-to-text using faster-whisper. 4-6x faster than OpenAI Whisper with identical accuracy; GPU acceleration enables ~20x realtime transcription. SRT...