Extract text from images using Tesseract.js OCR. Supports Chinese (simplified/traditional) and English.
Agentic Vision via Gemini's native Code Execution sandbox. Use for spatial grounding, visual math, and UI auditing.
--- name: paddleocr-text-recognition description: Extracts text (with locations) from images and PDF documents using PaddleOCR. metadata: openclaw: requires: env: - PADDLEOCR_OCR_A
Install, authenticate, and use Claude Code CLI as a native coding tool for any OpenClaw agent system.
Launch and manage on-demand cloud development environments with preserved storage. Expose services securely, sync code safely, and get ready-to-run SSH and logging workflows. Capture browser logs and
E-commerce operations workflow for "high-repeat small goods" stores (cosmetics, phone cases, accessories, small jewelry, daily FMCG). Trigger whenever the us...
Automate TikTok slideshow marketing for any app or product. Researches competitors, generates AI images, adds text overlays, posts via Postiz, tracks analyti...
Gives any AI agent a persistent identity in SiliVille (硅基小镇) — a multiplayer AI-native metaverse. Farm, steal crops, post to the town feed, build social grap...
Control Android devices via ADB with support for UI layout analysis (uiautomator) and visual feedback (screencap). Use when you need to interact with Android apps, perform UI automation, take screensh
Use the official MinerU (mineru.net) parsing API to convert a URL (HTML pages like WeChat articles, or direct PDF/Office/image links) into clean Markdown + s...
Use for browser-based study, quiz, and practice tasks. Best for Yuketang, Xuexitong, Pintia, and similar learning or question pages where you need a persiste...
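Automate browser control via Selenium, with support for opening web pages, element operations, tab management, screenshots, JS execution, and proxy settings.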
Personal CRM and relationship intelligence. Extracts contacts from conversations, tracks commitments, detects cooling relationships, delivers morning briefs,...
--- name: Browser Config description: Configure and manage OpenClaw-CN browser modes (openclaw/chrome) and resolve browser connection issues metadata: {"clawdbot":{"requires":{"bins":["openclaw-cn"]}}} --- #
Industry hotspot and competitor monitoring across 5 dimensions. Use when user (in Chinese) asks to monitor an industry (监测...行业) and provides competitor URLs...
Fetch iMessage/Messages.app attachments (voice memos and images) and process them — transcribe audio via Silicon Flow ASR (SenseVoiceSmall), and analyze imag...
Operate an already-open Hinge session in the browser or on iPhone to review profiles, triage the queue, analyze matches, draft respectful openers or replies,...
The peer-to-peer freelance marketplace where AI agents and humans hire each other. Register, browse jobs, apply, message, and get paid.
UI/UX design intelligence and implementation guidance for building polished interfaces. Use when the user asks for UI design, UX flows, information architect...
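Automatically converts Markdown content into an image for sending. When Markdown content needs to be returned to the user, automatically calls md2img to generate an image in place of plain-text Markdown, avoiding broken layout. Trigger scenarios: all cases that need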
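A complete tool for searching and reading WeChat Official Account articles, with support for keyword search and full-text extraction. **Whenever the user mentions any of the following scenarios, this skill must be used:** (1) searching Official Account articles, by key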
Perform video/audio cutting, format conversion, compression, frame/audio extraction, watermarking, and subtitle addition using FFmpeg.
Complex document parsing with PaddleOCR. Intelligently converts complex PDFs and document images into Markdown and JSON files that preserve the original stru...
Guide skill for controlling native Windows apps (UIA) and web browsers (Playwright) via the handsfree-windows CLI. Use when you need to automate or test desk...