Browser automation via Playwright MCP. Navigate websites, click elements, fill forms, take screenshots, extract data, and debug real browser workflows. Use w...
Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with w...
Control the user's real Safari browser on macOS using AppleScript and screencapture. Read pages, click elements, type text, take screenshots, navigate tabs —...
Control a real iPhone through macOS iPhone Mirroring — screenshot, tap, swipe, type, launch apps, record video, OCR, and run multi-step scenarios. Works with...
Automate browser tasks using the BrowserMCP MCP server and Chrome extension. Use for navigating websites, filling forms, clicking elements, taking screenshot...
Control and automate the Linux desktop GUI on X11. Use this skill to take screenshots, find and click UI elements, type text, send keyboard shortcuts, scroll...
Control Safari on macOS with AppleScript, safaridriver, screenshots, tab navigation, and real-browser read, click, and type workflows.
Start a Selenium‑controlled Chrome browser, open a URL, take a screenshot, and report progress. Supports headless mode and optional proxy.
When the user sends a screenshot via Telegram, parse it using Gemini (fast, default) with automatic Claude fallback when confidence is low. Saves results to...
Browse the web, read page content, click buttons, fill forms, take screenshots, and get accessibility snapshots using the webcli headless browser. Use when t...
Design UI screens in Paper — a professional design tool running locally on macOS. Create artboards, write HTML into designs, take screenshots, and iterate vi...
Professional Windows-only visual automation toolkit with 11 modules for screenshot, OCR, template matching, clicks, input, environment setup, and looping tasks.
Complete Feishu (Lark) integration toolkit for AI agents. Read/write documents, fetch chat history, send files & screenshots, manage permissions, and create...
Full Windows desktop control. Mouse, keyboard, screenshots - interact with any Windows application like a human.
macOS CLI tool to record microphone audio, screen video or screenshot, and camera video or photo from the terminal with device listing and output control.
Browser automation for AI agents via inference.sh. Navigate web pages, interact with elements using @e refs, take screenshots, record video. Capabilities: web scraping, form filling, clicking, typing,
Remote Windows desktop control from WSL/Linux via screenshot + mouse/keyboard simulation. Use when: user asks to control their PC, click something, open an a...
Complete browser automation with Playwright. Auto-detects dev servers, writes clean test scripts to /tmp. Test pages, fill forms, take screenshots, check res...
Give your agent eyes on the web — screenshot any URL as an image file. Supports device emulation (iPhone, iPad, Pixel, MacBook), dark mode, full-page scroll,...
Build a reusable UI inspiration library that both archives and retrieves design references. Use when the user wants to save screenshots, collect UI inspirati...
Full desktop computer use for headless Linux servers. Xvfb + XFCE virtual desktop with xdotool automation. 17 actions (click, type, scroll, screenshot, drag,...