Detect and solve simple image captchas during browser automation. Use when flows encounter 4-6 character text, distorted alphanumeric, numeric, rotated, or a...
Edit PDF files visually using natural language with the nano-pdf CLI tool, powered by Google's Gemini 3 Pro Image (Nano Banana). Use this skill whenever the...
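Teacher's homework grading assistant for automatically grading math homework, tallying wrong answers, and generating Excel summary sheets and PDF reports. Use when a teacher needs to: (1) upload the correct answers for the AI to recognize (2) batch-upload photos of student homework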
Scan receipt or invoice photos sent via chat, extract expense data using OpenAI Vision, validate and deduplicate, then log to a Google Spreadsheet. Responds...
Access reMarkable tablet documents, notebooks, PDFs, and EPUBs. Use when the user wants to read, search, browse, or extract text from their reMarkable tablet...
Full Windows desktop control. Mouse, keyboard, screenshots - interact with any Windows application like a human.
Soulprint decentralized identity verification for AI agents. v0.6.4 — blockchain-first architecture (no libp2p): state lives on Base Sepolia, 4 validator nod...
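Automatically crawls non-performing loan transfer announcements and results from 银登网, supports multi-model extraction of key financial data, and exports structured analysis reports.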
Local-first multimedia research library for hardware projects. Capture code, CAD, PDFs, images. Search with material-type weighting. Project isolation with cross-references. Async extraction. Backup +
Parse, extract, and analyze documents using the LlamaParse API (LlamaCloud). Use when the user asks to parse PDFs, images, spreadsheets, or other documents i...
Control macOS GUI apps visually — take screenshots, click, scroll, type. Use when the user asks to interact with any Mac desktop application's graphical interface.
Use the official MinerU (mineru.net) parsing API to convert a URL (HTML pages like WeChat articles, or direct PDF/Office/image links) into clean Markdown + s...
Transform document photos into clean scanned-looking pages with automatic edge detection, cropping, and perspective correction. Use when (1) the user wants a...
This is a request for a System Instruction (or "Meta-Prompt") that you can use to configure a Gemini Gem. This prompt is designed to force the model into a hyper-analytical mode where it prioritizes c
TITLE: Job Posting Snapshot & Preservation Engine. VERSION: 1.5. Author: Scott M. LAST UPDATED: 2026-03. CHANGELOG
Mask and redact sensitive information (PII) in screenshots and images — phone numbers, emails, IDs, API keys, crypto wallets, credit cards, passwords, and mo...
Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multipl...
Create, edit, and manipulate DOCX files using SuperDoc - a modern document editor with custom rendering pipeline. Use when you need to programmatically work...
Collect and organize a personal knowledge base from URLs (web/X/WeChat) and screenshots. Use when the user says they want to save a URL, ingest a link, archive content to KB, tag/classify notes, stor
End-to-end KYC (Know Your Customer) identity verification for onboarding real users. Use when someone needs to perform KYC, onboard users with identity verif...
name: pdfagent; description: Self-hosted PDF operations and conversions with metered usage output; version: 0.1.0. PDF Agent summary: use `pdfagent` to perform PDF operations (merge, split,
Render structured table data as high-quality PNG images using Headless Chrome. Use when: need to visualize tabular data for chat interfaces, reports, or soci...
Generate a deterministic, template-preserving 16-section SDS/MSDS package from 1 DOCX template, 1 prompt/rule file, and 1-3 source SDS/MSDS files, with DOCX/...