Generate speech from text using Kyutai Pocket TTS: lightweight, CPU-friendly, streaming TTS with voice cloning. English only. Runs at about 6x real-time on an M4 MacBook Air.
Transcribe or translate audio files to text using a public Hugging Face Whisper Space over Gradio. Use when the user sends voice notes, audio attachments, me...
Organizes family knowledge into Inbox (raw capture by scene), Cognition (distilled insights by 5 dimensions), and Guidebook (validated reusable methods) in O...
Lattice integration. Manage Persons, Organizations, Roles, Activities, Notes, Files. Use when the user wants to interact with Lattice data.
Lightning Network payments via Archon DIDs: create wallets, send and receive sats, verify payments, and send Lightning Address zaps.
Analyze audio quality, detect noise types, and provide improvement recommendations. Use when users need to check audio quality, validate recordings, or ident...
Transcribe audio files to text using local speech recognition. Triggers on: "转录", "transcribe", "语音转文字", "ASR", "识别音频", "把这段音频转成文字".
Speaker separation, voice comparison, and audio processing tools. Use when working with multi-speaker audio, voice cloning, or speaker verification tasks inc...
Integrate Google NotebookLM capabilities into your workflow via the unofficial notebooklm-py library. Use when you need to: create/manage notebooks, import s...
Create explainer videos with narration and AI-generated visuals. Triggers on: "解说视频", "explainer video", "explain this as a video", "tutorial video", "introd...
Free All-in-One AI Image Generator Platform. Access FLUX, Midjourney alternatives, Wan AI, and Qwen Image in one place. Generate photorealistic 8K images nat...
Manage Readwise highlights, books, daily review, and Reader documents (save-for-later / read-it-later). Use when the user wants to save articles or URLs to Reader, browse their reading list, search sa
Free All-in-One AI Video Generator Platform. Access Kling AI, Google Veo, Sora 2, and Runway in one place. Generate cinematic Text-to-Video, Image-to-Video,...
Automated social media manager: plan, write, schedule, and analyze content across X/Twitter, LinkedIn, Instagram, TikTok, Facebook, and Pinterest. Integrate...
Proactive Chinese language tutor that delivers curated, real-world Mandarin learning content on a schedule. Use when: (1) User wants to learn or improve Chin...
Use the Gemini API (Nano Banana image generation, Veo video, Gemini TTS speech and audio understanding) to deliver end-to-end multimodal media workflows and code templates for "generation + understand
Manage linkding bookmarks - save URLs, search, tag, organize, and retrieve your personal bookmark collection. Use when the user wants to save links, search bookmarks, manage tags, or organize their re
Manage AI agent personas (Souls) for OpenClaw. Use when the user wants to install, switch, list, or restore AI personalities/personas. Triggers on requests l...
Enables voice synthesis, voice cloning, voice design, and audio post-processing using MiniMax Voice API and FFmpeg. Use when converting text to speech, creat...
Download music from YouTube/YouTube Music and stream to Chromecast via Home Assistant. Complete CLI toolset with web server integration, configuration wizard, and playback controls.
GorillaStack integration. Manage Organizations. Use when the user wants to interact with GorillaStack data.
Text-to-speech conversion using GLM-TTS service via the `uvx zai-tts` command for generating audio from text. Use when (1) User requests audio/voice output w...
Use this skill whenever the user wants to interact with the Faces AI platform — including logging in or registering, creating or managing face personas, runn...
Peer-to-peer task payroll marketplace on Base L2. Clients create USDC-funded gigs, distribute tasks to gigworkers via email/webhook mailboxes, review proofs...