Use when user wants to track expenses, scan receipts, upload card payment screenshots, categorize spending, record transactions, check spending summaries, vi...
Create OpenClaw skills from best practice videos or image sequences. Use when creating skill from video, generating skill from screenshots, converting tutori...
Desktop app that captures screen activity via event-driven screenshots, stores AI-generated summaries and OCR text locally in SQLite, and exposes your activity history to AI assistants via MCP with se
Vision-driven browser automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all visible...
Web scraping MCP server for AI agents. 6 tools: clean content extraction, structured scraping with CSS selectors, full-page screenshots via Playwright, link extraction, metadata extraction (OG/Twitter
Browser automation for AI agents via inference.sh. Navigate web pages, interact with elements using @e refs, take screenshots, record video. Capabilities: we...
MCP server for Godot 4.x with runtime control via injected UDP bridge: input simulation, screenshots, UI discovery, and live GDScript execution while the game is running.
Extract and analyze images from files, links, and embedded images to understand text, objects, and visual content. Turn screenshots, photos, diagrams, and documents into searchable insights. Streamlin
Give your agent eyes — capture screenshots, voice, and annotations from any screen, monitor, or device via MCP.
MCP server that captures webpage screenshots, with viewport or full-page options and base64 PNG output.
Vision-driven browser automation using Midscene Bridge mode. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with...
Verify suspicious news, announcements, screenshots, and viral claims using a high-trust source pool (official channels + Chinese mainstream media + internati...
Summarize YouTube videos with NO subtitles by doing local ASR (yt-dlp + faster-whisper) and extracting a few screenshot frames via ffmpeg. Use when the user...
Browser automation server that enables interaction with web pages, taking screenshots, and executing JavaScript in a real browser environment
Automate Chrome pages with clicks, form fills, navigation, and in-page scripting. Inspect console and network activity, take screenshots or text snapshots, and manage multiple pages. Analyze performan
Mobile browser and native app automation via ATL (iOS Simulator). Navigate, click, screenshot, and automate web and native app tasks on iPhone/iPad simulators.
A Model Context Protocol server providing browser automation capabilities using Playwright. It allows LLMs to interact with web pages, take screenshots, and execute JavaScript in a real browser enviro
Instagram for AI agents. Build your following, grow your influence. Share screenshots, get likes & comments, engage with @mentions. Be a creator, not just a coder.
Control Android cloud phones via ADB broadcast commands - tap, swipe, type, screenshot, read UI elements. Requires DuoPlus CloudPhone service running on the...
Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, navigate web pages, extract data from websites, take screenshots, fill forms, clic
MCP server for [Pyxel](https://github.com/kitao/pyxel) retro game engine, enabling AI to run, capture screenshots, inspect sprites, and analyze audio of Pyxel games.