Parse UI screenshots into structured element JSON (type, OCR text, bbox) and operate desktop UI from parsed elements. Use when a user asks to detect/locate U...
Full desktop computer use for headless Linux servers. Xvfb + XFCE virtual desktop with xdotool automation. 17 actions (click, type, scroll, screenshot, drag,...
Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with w...
Control real Android phones through the Mobilerun API. Supports tapping, swiping, typing, taking screenshots, reading the UI accessibility tree, and managing...
Analyzes Bilibili academic/educational videos to extract knowledge points and generate clean-style study notes with screenshots. Use this skill when users pr...
Headless browser automation CLI for AI agents. Use when interacting with websites — navigating pages, filling forms, clicking buttons, taking screenshots, ex...
Clawsy is a native macOS menu bar app that gives your OpenClaw agent real-world reach — screenshots, clipboard sync, Quick Send, camera, file access via Find...
Automate Windows desktop by simulating mouse/keyboard input, managing clipboard, capturing screenshots, running commands, and launching apps.
Control Chrome browser with AI using MCP protocol. Use when users want to automate browser tasks, take screenshots, fill forms, click elements, navigate page...
Browser automation using Playwright API directly. Navigate websites, interact with elements, extract data, take screenshots, generate PDFs, record videos, and automate complex workflows. More reliable
Control Android cloud phones via ADB broadcast commands - tap, swipe, type, screenshot, read UI elements. Requires DuoPlus CloudPhone service running on the...
Take website screenshots, capture full pages, generate PDFs. Handles desktop, mobile, dark mode, stealth mode, cookie banner blocking, and batch URLs via the...
--- name: claw-mouse description: Control a Linux X11 desktop by taking screenshots and moving/clicking/typing via xdotool + scrot. homepage: https://github.com/rylena/claw-mouse metadata: openclaw:
Collect invoices/receipts from Gmail and send a summary email with attachments. Automatically downloads PDF attachments or takes screenshots of emails withou...
Activate when the user needs to interact with any website — browser automation, web scraping, screenshots, form filling, UI testing, monitoring, or building AI agents. Provides pre-verified page act
Perform advanced web crawling and content extraction with multi-page crawling, search result parsing, pattern filtering, and screenshot capture using the Wry...
A visual, human-like web browser for OpenClaw agents.Supports reading,screenshots, and visible mode.
Web scraping and crawling with Firecrawl API. Fetch webpage content as markdown, take screenshots, extract structured data, search the web, and crawl documentation sites. Use when the user needs to sc
Discover WebSim projects, creators, and trending content with powerful search and filters. Inspect project details, screenshots, comments, and engagement stats to guide research and curation. Track us
Convert URL to PNG suitable for mobile reading.
Render DNA codes to Pencil .pen frames. Does ONE thing well. Input: DNA code + component type (hero, card, form, etc.) Output: .pen frame ID + screenshot Use when: design-exploration or other orches
Use when the agent needs to drive a browser through the Microsoft Playwright CLI (`playwright-cli`) for navigation, form interactions, screenshots, recordings, data extraction, session management, or