Search

201 results for "speech-to-text"

🧪 Skill

Decentralized Agent Cloud

Free

Decentralized compute and data marketplace for AI agents with spot pricing

❤️ 0 ⬇️ 37

🧪 Skill

Voice messaging setup

Free

--- name: voice-stt-tts description: Full voice message setup (STT + TTS) for OpenClaw using faster-whisper and Edge TTS homepage: https://docs.openclaw.ai/nodes/audio metadata: { "openclaw":

❤️ 0 ⬇️ 216

🧪 Skill

Webchat Voice Gui

Free

Voice input and microphone button for OpenClaw WebChat Control UI. Adds a mic button to chat, records audio via browser MediaRecorder, transcribes locally vi...

❤️ 0 ⬇️ 128

🧪 Skill

YouTube Transcribe

Free

Transcribe YouTube videos with smart fallback: extracts captions first (fast, free), falls back to local Whisper transcription when no captions available. Au...

❤️ 0 ⬇️ 84

🧪 Skill

Phone Call Agent

Free

AI voice call agent — make outbound calls, generate browser call links, accept inbound calls, and retrieve full transcripts + summaries when calls end. Suppo...

❤️ 0 ⬇️ 68

🧪 Skill

RingBot

Free

Make outbound AI phone calls. Use when asked to call a business, make a phone call, order food by phone, schedule appointments, or any task requiring voice calls. Triggers on "call", "phone", "dial",

❤️ 4 ⬇️ 2.1k

🧪 Skill

Douyin Video Transcribe

Free

Extract audio from Douyin (抖音/TikTok China) videos and transcribe to text using Whisper. Trigger when user sends a Douyin link (v.douyin.com or www.douyin.co...

❤️ 0 ⬇️ 29

🧪 Skill

Phone Voice Agent

Free

Run a real-time AI phone agent using Twilio, Deepgram, and ElevenLabs. Handles incoming calls, transcribes audio, generates responses via LLM, and speaks back via streaming TTS. Use when user wants to

❤️ 6 ⬇️ 2.3k

🧪 Skill

GenAI CLI

Free

Unified interface for all providers and all modalities: use one nous-genai CLI/SDK flow to run text/image/audio/video/embedding across OpenAI, Gemini, Claude...

❤️ 0 ⬇️ 79

🧪 Skill

Voice Assistant

Free

Real-time voice assistant for OpenClaw. Streams mic audio through configurable STT (Deepgram or ElevenLabs) into your OpenClaw agent, then speaks the response via configurable TTS (Deepgram Aura or El

❤️ 4 ⬇️ 1.3k

🧪 Skill

Cult Of Carcinization

Free

Give your agent a voice — and ears. The Cult of Carcinization is the bot-first gateway to ScrappyLabs TTS and STT. Speak with 20+ voices, design your own from a text description, transcribe audio to

❤️ 3 ⬇️ 1.6k

🧪 Skill

AssemblyAI advanced speech transcription

Free

Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...

❤️ 3 ⬇️ 2.5k

🧪 Skill

Content Factory

Free

All-in-one YouTube content generator - create regular videos, Shorts from scratch, and Shorts from long videos. Combines best of youtube-factory and AI-Youtu...

❤️ 0 ⬇️ 214

🧪 Skill

QCut Toolkit

Free

Unified QCut media toolkit — organize project files, process media with FFmpeg, generate AI content, control the QCut editor with native CLI commands, genera...

❤️ 0 ⬇️ 110

🧪 Skill

SenseAudio-ASR

Free

Build and troubleshoot SenseAudio speech recognition integrations, including HTTP transcription (`/v1/audio/transcriptions`), realtime WebSocket ASR (`/ws/v1...

❤️ 0 ⬇️ 16

🧪 Skill

Trugen AI

Free

Build, configure, and deploy conversational video agents using the Trugen AI platform API. Use this skill when the user wants to create AI video avatars, man...

❤️ 0 ⬇️ 128

🧪 Skill

🎤 Transcribe audio files using Qwen ASR (千问STT)

Free

Transcribe audio files using Qwen ASR (千问STT). Use when the user sends voice messages and wants them converted to text.

❤️ 1 ⬇️ 174

🧪 Skill

Youtube Transcription Generator

Free

Use VLM Run (vlmrun) to generate transcriptions from YouTube videos. Download a video with yt-dlp, then run vlmrun to transcribe with optional timestamps. VLMRUN_API_KEY must be in .env; follow vlmrun

❤️ 0 ⬇️ 550

🧪 Skill

Skeall Skill Builder

Free

Agent Skills (SKILL.md) builder, auditor, and improver for cross-platform LLM agents. Use for "skeall", "build a skill", "create skill", "improve skill", "au...

❤️ 0 ⬇️ 404

🧪 Skill

Step Asr

Free

Transcribe audio files to text via Step ASR streaming API (HTTP SSE). Supports Chinese and English, multiple audio formats (PCM, WAV, MP3, OGG/OPUS), real-ti...

❤️ 1 ⬇️ 155

🧪 Skill

Alicloud Ai Audio Asr Realtime

Free

Use when low-latency realtime speech recognition is needed with Alibaba Cloud Model Studio Qwen ASR Realtime models, including streaming microphone input, li...

❤️ 0 ⬇️ 42

🧪 Skill

Local STT (Nvidia Parakeet + Whisper Support)

Free

--- name: local-stt description: Local STT with selectable backends - Parakeet (best accuracy) or Whisper (fastest, multilingual). metadata: {"openclaw":{"emoji":"🎙️","requires":{"bins":["ffmpeg"

❤️ 1 ⬇️ 2.2k

🧪 Skill

YouTube ASR Summarize (Local)

Free

Summarize YouTube videos with NO subtitles by doing local ASR (yt-dlp + faster-whisper) and extracting a few screenshot frames via ffmpeg. Use when the user...

❤️ 0 ⬇️ 109

🧪 Skill

tl;dw - YouTube Video Summarizer

Free

Extracts YouTube video transcripts and provides concise summaries highlighting main points, arguments, and conclusions without watching the full video.

❤️ 0 ⬇️ 1.3k