用 MinerU API 解析 PDF/Word/PPT/图片为 Markdown,支持公式、表格、OCR。适用于论文解析、文档提取。
Hybrid document intelligence pipeline ingesting PDFs, images, and spreadsheets with OCR, visual and text search, and field fix capture for fast retrieval.
Real-time OCR and data extraction API by Veryfi (https://veryfi.com). Extract structured data from receipts, invoices, bank statements, W-9s, purchase orders...
Authenticate AI agents with the DeepRead OCR API using OAuth device flow. The agent displays a code, the user approves it in their browser, and the agent rec...
发送微信消息给指定联系人。支持两种模式:(1) 有消息内容:直接发送指定消息;(2) 无消息内容:OCR 截图识别聊天窗口内容并自动回复。当用户需要自动发送微信消息、自动回复微信聊天时触发此技能。
Business card scanner + Google Contacts manager. Auto-detects business card images, extracts contact info via OCR (imageModel), confirms with user, saves to...
Extract structured data from construction PDFs. Convert specifications, BOMs, schedules, and reports from PDF to Excel/CSV/JSON. Use OCR for scanned documents and pdfplumber for native PDFs.
Tag and annotate images using Apple Vision framework (macOS only). Detects faces, bodies, hands, text (OCR), barcodes, objects, scene labels, and saliency re...
Extract text from PDF files using PyMuPDF. Parse tables, forms, and complex layouts. Supports OCR for scanned documents.
Provides web content fetching, caching, document OCR, real-time messaging, group chats, file transfers, and webhook integrations via Prismer Cloud APIs.
Prismer enables agents to fetch, compress, and parse web content, perform OCR, and communicate via messaging with real-time sync using CLI or SDK.
Use the VLM Run CLI (`vlmrun`) to interact with Orion visual AI agent. Process images, videos, and documents with natural language. Triggers: image understanding/generation, object detection, OCR, vid
Convert PDFs, DOCX, PPTX, and images to Markdown using zerox with GPT-4o vision, including OCR for scanned documents.
Masumi Network skill for warranty vault verification. Handles OCR receipt scanning, Cardano blockchain proof-of-purchase logging, immutable decision logging, agent collaboration discovery, and smart w
Self-hosted PDF operations and conversions with metered usage output.
Windows 平台微信和 QQ 自动发消息工具。支持搜索联系人、发送消息、截图OCR分析、智能回复建议(需用户确认后发送)。
Universal (non-OpenClaw) Nutrient DWS document-processing skill for Agent Skills-compatible products. Best for Claude Code, Codex CLI, Gemini CLI, Cursor, Wi...
MUST use for ANY PDF or image format conversion task — converting PDF and images (JPG/JPEG/PNG/BMP/TIFF/TIF/WEBP/JPEG2000) to 10 formats (Word, Excel, PPT, H...
Use this skill whenever the user wants to do anything with PDF files. Triggers include: reading or extracting text/tables from PDFs, combining or merging mul...
Single source of truth for all paths, naming conventions, and data formats across the OpenClaw Greek Accounting system. Reference document.
简历画像。触发场景:用户要求生成候选人画像;用户想了解候选人的多维度标签和能力评估。