Generate images with DrawThings (Stable Diffusion) via API. Use when creating images from text prompts, running image generation workflows, or batch generating images. DrawThings runs locally on Mac w
Generate videos using OpenAI's Sora API. Use when the user asks to generate, create, or make videos from text prompts or reference images. Supports image-to-video generation with automatic resizing.
Generate images with Alibaba Cloud Model Studio Z-Image Turbo (z-image-turbo) via DashScope multimodal-generation API. Use when creating text-to-image output...
Generate and edit images using Cloudflare Workers AI via the `imageflare` CLI. Use when: user asks to generate an image from a text prompt, edit/transform an...
--- name: mlx-stt description: Speech-to-text with MLX (Apple Silicon) and open-source models (default GLM-ASR-Nano-2512), run locally. version: 1.0.7 author: guoqiao metadata: {"openclaw":{"always":true,"e
Semantic knowledge base for ingesting, searching, and retrieving saved texts, URLs, and files using embeddings and SQLite.
Turn a user-shared web link into two Feishu docs: (1) full original text archive with minimal loss and clear source metadata, and (2) structured analysis sum...
Format any content into AI-readable structured formats that maximize citation probability. Converts unstructured text into GEO-optimized layouts using header...
Generate images from text prompts using FLUX via Together.ai. Returns image URL. Prompts are auto-enhanced for best results.
Read-only file browsing and reading in the OpenClaw workspace (/home/alfred/.openclaw/workspace). Use for listing directories or reading text files (up to 10...
AI-powered meeting notes generator - automatic transcription, summary, action items extraction, and task assignment. Turns meeting recordings or text into pr...
--- name: generate-qrcode description: Generate QR codes from URLs or text using a pre-built Python script with qrcode library author: yuanyanan version: 1.0.0 metadata: openclaw: emoji: "📱"
Read WeChat official account articles. Use the built-in browser tool to open the page and extract body text. Always append ?scene=1 to the URL.
A universal 4x4 grid sticker generator. Uses strict visual guidelines (no text, transparent background) and supports loading theme templates from resources.
Text-to-speech, sound effects, music generation, voice management, and quota checks via the ElevenLabs API. Use when generating audio with ElevenLabs or mana...
--- name: mlx-whisper description: Local speech-to-text with MLX Whisper (Apple Silicon optimized, no API key). homepage: https://github.com/ml-explore/mlx-examples/tree/main/whisper --- # MLX Whispe
Send voice messages across chat channels (Telegram, Discord, Feishu/Lark, Signal, WhatsApp, and others) using edge-tts for text-to-speech and ffmpeg for audi...
Perform KVcore CRM actions via MCP/CLI, including managing contacts, tags, notes, calls, emails, texts, campaigns, and raw API access with optional Twilio ca...
Generate high-resolution PNG images from detailed text prompts using the NVIDIA Stable Diffusion XL model with customizable style, lighting, and resolution.
Build and debug SenseAudio text-to-speech integrations on `/v1/t2a_v2` and `/ws/v1/t2a_v2`, including sync HTTP, SSE stream, WebSocket event sequencing, hex...
Transforms debugging sessions into a text-based dungeon crawl. Your bug is the final boss. Stack frames are dungeon rooms. Variables are loot. Log messages a...
Estimates token count and API cost for a given text using Claude's tokenizer approximation, with chunking advice for context limits.
Manage Facebook Pages via Meta Graph API. Post content (text, photos, links), list posts, manage comments (list/reply/hide/delete). Use when user wants to pu...
--- name: MoltShell Vision Engine description: Give your text-based OpenClaw agent the ability to see and describe images --- # 👁️ MoltShell Vision Engine Standard OpenClaw agents are **blind**