Use when the user wants to track expenses, scan receipts, upload card payment screenshots, categorize spending, record transactions, check spending summaries, vi...
Control macOS via CLI using MacPilot for automating UI actions, managing windows, handling file dialogs, capturing screenshots, and system tasks.
Vision-driven browser automation using Midscene Bridge mode. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with...
Automate browser tasks using the BrowserMCP MCP server and Chrome extension. Use for navigating websites, filling forms, clicking elements, taking screenshot...
Review products on Reveal as an AI agent reviewer. Browse available review tasks, navigate target websites using agent-browser, take screenshots, record obse...
When the user sends a screenshot via Telegram, parse it using Gemini (fast, default) with automatic Claude fallback when confidence is low. Saves results to...
Mobile browser and native app automation via ATL (iOS Simulator). Navigate, click, screenshot, and automate web and native app tasks on iPhone/iPad simulators.
Summarize YouTube videos with NO subtitles by doing local ASR (yt-dlp + faster-whisper) and extracting a few screenshot frames via ffmpeg. Use when the user...
Access websites with advanced bot protection to fetch HTML, screenshots, PDFs, or multiple pages in parallel using isolated browser contexts.
Use when a user sends you an image, meme, screenshot, or asks you to explain a joke or meme. Also used during cron meme ingestion from Telegram channels. Dec...
Run Node.js scripts using Playwright for full browser automation, including scraping, screenshots, form handling, and dynamic content interaction.
Browser automation via Playwright MCP server. Navigate websites, click elements, fill forms, extract data, take screenshots, and perform full browser automation workflows.
Vision-driven Android device automation using Midscene. Operates entirely from screenshots — no DOM or accessibility labels required. Can interact with all v...
Give your agent eyes — capture screenshots, voice, and annotations from any screen, monitor, or device via MCP.
Generate and read QR codes. Use when the user wants to create a QR code from text/URL, or decode/read a QR code from an image file. Supports PNG/JPG output and can read QR codes from screenshots or im
Monitor and recap official X (Twitter) updates using actionbook-rs screenshots. Use when the user asks to track/recap X posts (especially official accounts l...
Verify suspicious news, announcements, screenshots, and viral claims using a high-trust source pool (official channels + Chinese mainstream media + internati...
macOS CLI tool that records microphone audio, screen video or screenshots, and camera video or photos from the terminal, with device listing and output control.
A Model Context Protocol server providing browser automation capabilities using Playwright. It allows LLMs to interact with web pages, take screenshots, and execute JavaScript in a real browser enviro
Instagram for AI agents. Build your following, grow your influence. Share screenshots, get likes & comments, engage with @mentions. Be a creator, not just a coder.
Control Android cloud phones via ADB broadcast commands: tap, swipe, type, screenshot, and read UI elements. Requires DuoPlus CloudPhone service running on the...
Take website screenshots, capture full pages, generate PDFs. Handles desktop, mobile, dark mode, stealth mode, cookie banner blocking, and batch URLs via the...
Convert Markdown text to beautiful Xiaohongshu (XHS) style card images with 5 themes, deterministic browser screenshot rules, auto-pagination, smart title ex...