Decentralized compute and data marketplace for AI agents with dynamic spot pricing
voice-stt-tts: Full voice message setup (STT + TTS) for OpenClaw using faster-whisper and Edge TTS. Homepage: https://docs.openclaw.ai/nodes/audio | Metadata: { "openclaw":
Voice input and microphone button for OpenClaw WebChat Control UI. Adds a mic button to chat, records audio via browser MediaRecorder, transcribes locally vi...
Transcribe YouTube videos with smart fallback: extracts captions first (fast, free), falls back to local Whisper transcription when no captions available. Au...
AI voice call agent — make outbound calls, generate browser call links, accept inbound calls, and retrieve full transcripts + summaries when calls end. Suppo...
Make outbound AI phone calls. Use when asked to call a business, make a phone call, order food by phone, schedule appointments, or any task requiring voice calls. Triggers on "call", "phone", "dial",
Extract audio from Douyin (抖音/TikTok China) videos and transcribe to text using Whisper. Trigger when user sends a Douyin link (v.douyin.com or www.douyin.co...
Run a real-time AI phone agent using Twilio, Deepgram, and ElevenLabs. Handles incoming calls, transcribes audio, generates responses via LLM, and speaks back via streaming TTS. Use when user wants to
Unified interface for all providers and all modalities: use one nous-genai CLI/SDK flow to run text/image/audio/video/embedding across OpenAI, Gemini, Claude...
Real-time voice assistant for OpenClaw. Streams mic audio through configurable STT (Deepgram or ElevenLabs) into your OpenClaw agent, then speaks the response via configurable TTS (Deepgram Aura or El
Give your agent a voice — and ears. The Cult of Carcinization is the bot-first gateway to ScrappyLabs TTS and STT. Speak with 20+ voices, design your own from a text description, transcribe audio to
Transcribe, diarise, translate, post-process, and structure audio/video with AssemblyAI. Use this skill when the user wants AssemblyAI specifically, needs hi...
All-in-one YouTube content generator - create regular videos, Shorts from scratch, and Shorts from long videos. Combines best of youtube-factory and AI-Youtu...
Unified QCut media toolkit — organize project files, process media with FFmpeg, generate AI content, control the QCut editor with native CLI commands, genera...
Build and troubleshoot SenseAudio speech recognition integrations, including HTTP transcription (`/v1/audio/transcriptions`), realtime WebSocket ASR (`/ws/v1...
Build, configure, and deploy conversational video agents using the Trugen AI platform API. Use this skill when the user wants to create AI video avatars, man...
Transcribe audio files using Qwen ASR (千问STT). Use when the user sends voice messages and wants them converted to text.
Use VLM Run (vlmrun) to generate transcriptions from YouTube videos. Download a video with yt-dlp, then run vlmrun to transcribe with optional timestamps. VLMRUN_API_KEY must be in .env; follow vlmrun
Agent Skills (SKILL.md) builder, auditor, and improver for cross-platform LLM agents. Use for "skeall", "build a skill", "create skill", "improve skill", "au...
Transcribe audio files to text via Step ASR streaming API (HTTP SSE). Supports Chinese and English, multiple audio formats (PCM, WAV, MP3, OGG/OPUS), real-ti...
Use when low-latency realtime speech recognition is needed with Alibaba Cloud Model Studio Qwen ASR Realtime models, including streaming microphone input, li...
local-stt: Local STT with selectable backends - Parakeet (best accuracy) or Whisper (fastest, multilingual). Metadata: {"openclaw":{"emoji":"🎙️","requires":{"bins":["ffmpeg"
Summarize YouTube videos with NO subtitles by doing local ASR (yt-dlp + faster-whisper) and extracting a few screenshot frames via ffmpeg. Use when the user...
Extracts YouTube video transcripts and provides concise summaries highlighting main points, arguments, and conclusions without watching the full video.
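
Several of the YouTube entries above describe the same pattern: try the existing captions first, and fall back to local speech recognition only when none are available. A minimal sketch of that control flow follows. The function name `transcribe_with_fallback` and both callables are hypothetical placeholders, not the API of any skill listed here; a real caption fetcher or a local Whisper wrapper would be plugged in by the caller.

```python
from typing import Callable, Optional, Tuple


def transcribe_with_fallback(
    video_url: str,
    fetch_captions: Callable[[str], Optional[str]],
    transcribe_locally: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (source, text), preferring captions over local ASR.

    Both callables are assumptions supplied by the caller:
    fetch_captions returns the caption text or None, and
    transcribe_locally runs a (slower) local speech-to-text pass.
    """
    try:
        captions = fetch_captions(video_url)
    except Exception:
        # Treat a failed caption lookup the same as "no captions".
        captions = None

    if captions and captions.strip():
        return "captions", captions

    # No usable captions: fall back to local transcription.
    return "local_asr", transcribe_locally(video_url)
```

Passing the two steps in as callables keeps the sketch independent of any particular downloader or model, so the fallback logic can be exercised with simple stubs.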