Automation skill for TG Voice Whisper Transcriber.
Auto-transcribe voice messages locally using faster-whisper with selectable Whisper models, no API key required.
Transcribe audio files via OpenRouter using audio-capable models (Gemini, GPT-4o-audio, etc.).
Automatically transcribes Telegram voice messages using Groq Whisper API and replies with text generated by an LLM.
Transcribe audio to text using Venice AI's Whisper-based speech recognition. Supports WAV, MP3, FLAC, M4A, AAC formats with optional timestamps.
Transcribe audio files using Qwen ASR (千问STT). Use when the user sends voice messages and wants them converted to text.
Unified speech-to-text skill. Use when the user asks to transcribe audio or video, generate subtitles, identify speakers, translate speech, search transcript...
Transcribe audio via Groq Automatic Speech Recognition (ASR) Models (Whisper).
Record experimental procedures and observations via voice commands during lab work. Real-time transcription for structured experiment documentation.
Use when the user needs local speech-to-text transcription for audio files, especially Chinese or mixed Chinese-English audio, without relying on cloud trans...
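Speech-to-text using the lightweight Fun-ASR-Nano-2512 model. Provides fast, accurate Chinese speech recognition with results printed to the console in real time, optimized for CPU/GPU environments. Use cases: (1) ...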
Offline speech-to-text (ASR) using whisper.cpp (whisper-cli) + ffmpeg. Supports batch transcription, timestamps, SRT/TXT/JSON outputs, and model download. Cr...
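Voice-note-to-text tool v1.1 | New: real-time subtitles, multilingual translation, voice markers, audio clipping, SRT export. Supports real-time transcription and meeting-minutes generation.
Transcribe local video and audio offline and for free using OpenAI Whisper. Supports multiple formats and languages, and generates timestamped subtitles and AI content summaries.
Voice-note-to-text tool Pro | Supports multilingual speech recognition, real-time transcription, and meeting-minutes generation.
Video transcription workflow supporting Bilibili and YouTube videos. Automatically detects whether a video has subtitles: if it does, fetches them; if not, downloads the audio and transcribes it with whisper. Trigger scenarios: (1) the user asks for a video summary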
Extract subtitles/transcripts from YouTube videos. Triggers: "youtube transcript", "extract subtitles", "video captions", "视频字幕", "字幕提取", "YouTube转文字", "提取字幕".
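Speech-to-text using the lightweight Fun-ASR-Nano-2512 model. Provides fast, accurate Chinese speech recognition with results printed to the console in real time, optimized for CPU/GPU environments. Use cases: (1) transcribing Chinese audio files to text, (2) needing a lightweight, low-memory ASR, (3) processing audio that contains domain-specific hotwords (medical, insurance, etc.), (4) needing high...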
Local speech-to-text with Parakeet MLX (ASR) for Apple Silicon (no API key).
Fast, affordable automatic speech-to-text transcription supporting 100 languages, speaker diarization, word timestamps, and customizable output formats.
Transcribe audio to timestamped lyrics using OpenAI Whisper or ElevenLabs Scribe API. Outputs LRC, SRT, or JSON with word-level timestamps. Use when users want to transcribe songs, generate LRC files,
Transcribe audio files to text using local speech recognition. Triggers on: "转录", "transcribe", "语音转文字", "ASR", "识别音频", "把这段音频转成文字".