使用 Fun-ASR-Nano-2512 轻量级模型进行语音转文字。 提供快速准确的中文语音识别,识别结果实时输出到控制台,针对 CPU/GPU 环境优化。 使用场景:(1) 将中文音频文件转写为文字,(2) 需要轻量级低内存占用的 ASR, (3) 处理包含领域特定热词的音频(医疗、保险等), (4) 需要高准...
Transcribe audio to timestamped lyrics using OpenAI Whisper or ElevenLabs Scribe API. Outputs LRC, SRT, or JSON with word-level timestamps. Use when users want to transcribe songs, generate LRC files,
Transcribe audio files to text using local speech recognition. Triggers on: "转录", "transcribe", "语音转文字", "ASR", "识别音频", "把这段音频转成文字".
Transcribe audio via the self-hosted Whisper ASR instance running on Kubernetes. Use this skill whenever the user wants to transcribe audio files, convert sp...
Transcribe YouTube videos and local audio/video files with speaker diarization. Use when user asks to transcribe a YouTube URL, podcast, video, or audio file. Outputs clean speaker-labeled transcripts
Transcribe audio and video files using OpenAI Whisper API. Use when user wants to transcribe audio/video files, extract speech from media, or get text from r...
--- name: asr-claw version: 1.1.1 description: Speech recognition CLI for AI agent automation. Transcribe audio from stdin, files, or URLs. metadata: openclaw: homepage: https://github.com/llm-n
Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...
Use VLM Run (vlmrun) to generate transcriptions from YouTube videos. Download a video with yt-dlp, then run vlmrun to transcribe with optional timestamps. VLMRUN_API_KEY must be in .env; follow vlmrun
Fetch iMessage/Messages.app attachments (voice memos and images) and process them — transcribe audio via Silicon Flow ASR (SenseVoiceSmall), and analyze imag...
Local ASR and TTS inference server. Use when the user wants to transcribe audio to text (ASR) or convert text to speech (TTS). Requires a running Willow Infe...
Transcribe and organize voice memos with automatic categorization and information extraction. Use when users have voice notes, audio memos, or spoken notes t...
Transcribe YouTube videos to text by extracting captions and subtitles directly from the video URL using yt-dlp without audio processing.
Install and use the speechall CLI tool for speech-to-text transcription. Use when the user wants to: (1) transcribe audio or video files to text, (2) install speechall on macOS or Linux, (3) list avai
Transcribe audio to text with Whisper models via inference.sh CLI. Models: Fast Whisper Large V3, Whisper V3 Large. Capabilities: transcription, translation,...
The cheapest AI media API on the market. Generate images (Flux), music (AceStep), speech with voice cloning, transcribe video/audio, OCR, video generation, b...
Audio transcription and text-to-speech generation using OpenRouter API. Use when the user needs to transcribe audio files to text or generate speech/audio fr...
High-performance local speech-to-text transcription using Faster Whisper with NVIDIA GPU acceleration. Transcribe audio files locally without sending data to...
Local speech-to-text using OpenAI Whisper. Use when the user needs to: (1) transcribe audio files to text, (2) convert voice messages to written content, (3)...
Automatic Speech Recognition (ASR) using Zhipu AI (BigModel) GLM-ASR model. Use when you need to transcribe audio files to text. Supports Chinese audio trans...
Transcribes local voice messages to text using Faster Whisper models for fast, privacy-focused speech recognition on audio files.
Download, transcribe, and analyze videos from YouTube, X/Twitter, and TikTok with local Whisper processing. Perfect for extracting TL;DRs, timestamps, and ac...
Transcribe meetings with SenseAudio ASR speaker diarization, timestamps, and meeting-note extraction workflows. Use when users need meeting transcription, me...