Authenticate AI agents with the DeepRead OCR API using OAuth device flow. The agent displays a code, the user approves it in their browser, and the agent rec...
Prismer enables agents to fetch, compress, and parse web content, perform OCR, and communicate via messaging with real-time sync using CLI or SDK.
Extract text from PDF files using PyMuPDF. Parse tables, forms, and complex layouts. Supports OCR for scanned documents.
Provide document parsing capabilities by integrating with the MinerU API. Enable single and batch file parsing with support for multiple formats, OCR, and formula and table recognition. Monitor
Extract structured data from construction PDFs. Convert specifications, BOMs, schedules, and reports from PDF to Excel/CSV/JSON. Use OCR for scanned documents and pdfplumber for native PDFs.
Parse UI screenshots into structured element JSON (type, OCR text, bbox) and operate desktop UI from parsed elements. Use when a user asks to detect/locate U...
The cheapest AI media API on the market. Generate images (Flux), music (AceStep), speech with voice cloning, transcribe video/audio, OCR, video generation, b...
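VAT invoice recognition skill: automatically recognizes invoices in PDF (single- or multi-page) or common image formats (PNG, JPG, etc.), calls the Baidu Cloud VAT invoice OCR API to extract key information, and outputs a structured Excel report. Applicable scenarios: when a user uploads invoice files and asks to recognize, extract, or convert their information; when invoices need to be batch-processed into an Excel summary sheet; when invoices need to be inspected, content...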
Extract PDF content to Markdown using MinerU API. Supports formulas, tables, OCR. Provides both local file and online URL parsing methods.
Masumi Network skill for warranty vault verification. Handles OCR receipt scanning, Cardano blockchain proof-of-purchase logging, immutable decision logging, agent collaboration discovery, and smart w
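Recognizes K12 arithmetic expressions in images (addition, subtraction, multiplication, division, column arithmetic, fractions, equations, etc.) and returns structured text results. Supports handwritten and printed text, and can reject images that do not contain arithmetic. Trigger: invoked when the user asks to recognize arithmetic expressions, math problems, or calculation exercises in an image, or uploads an image of a math problem. Keywords: arithmetic recognition, math problems, OCR, column arithmetic, ArithmeticOCR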
Real-time OCR and data extraction API by Veryfi (https://veryfi.com). Extract structured data from receipts, invoices, bank statements, W-9s, purchase orders...
OpenClaw agent skill for converting documents to Markdown. Documentation and utilities for Microsoft's MarkItDown library. Supports PDF, Word, PowerPoint, Excel, images (OCR), audio (transcription), H
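Send WeChat messages to a specified contact. Supports two modes: (1) with message content: sends the specified message directly; (2) without message content: takes a screenshot, reads the chat window content via OCR, and replies automatically. Triggered when the user needs to send WeChat messages automatically or auto-reply to WeChat chats.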
MCP server for the Nutrient DWS Processor API. Convert, merge, redact, sign, OCR, watermark, and extract data from PDFs and Office documents via natural language. Works with Claude Desktop, LangGraph,
Hybrid document intelligence pipeline ingesting PDFs, images, and spreadsheets with OCR, visual and text search, and field fix capture for fast retrieval.
Business card scanner + Google Contacts manager. Auto-detects business card images, extracts contact info via OCR (imageModel), confirms with user, saves to...
Parse PDF, Word, PPT, and image files into Markdown using the MinerU API, with support for formulas, tables, and OCR. Suited to paper parsing and document extraction.
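A comprehensive toolkit that provides Chinese text processing, translation, OCR, speech recognition, and related functions for OpenClaw. Supports Chinese word segmentation, pinyin conversion, Chinese-English translation, keyword extraction, text analysis, and more.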
Provides web content fetching, caching, document OCR, real-time messaging, group chats, file transfers, and webhook integrations via Prismer Cloud APIs.
Extract structured data from PDFs, images, and Word files with layout analysis, table recognition, OCR, seal detection, and directory extraction.
Self-hosted PDF operations and conversions with metered usage output.
Universal (non-OpenClaw) Nutrient DWS document-processing skill for Agent Skills-compatible products. Best for Claude Code, Codex CLI, Gemini CLI, Cursor, Wi...