pyautogui
Cross-platform mouse/keyboard automation skill. Supports mouse control (move/click/drag/scroll), keyboard control (key press/hotkeys/type text), screen opera...
Description
name: openclaw-pyautogui description: Cross-platform mouse/keyboard automation skill. Supports mouse control (move/click/drag/scroll), keyboard control (key press/hotkeys/type text), screen operations (screenshots/mouse position/screen size), image utilities (metadata/crop), screen overlay markers, drawing markers on images, image locating (template matching + OCR), and file cleanup to free disk space. Activate when the user needs UI automation, screenshots, coordinate verification, image analysis/annotation, on-screen element locating, or cleanup. description_zh: 跨平台键鼠自动化控制技能,支持鼠标控制(移动、点击、拖拽、滚动)、键盘控制(按键、组合键、输入文本)、屏幕操作(截图、获取鼠标位置)、图片处理(获取图片参数、裁剪图片)、屏幕绘图(在屏幕上绘制标记)、图片绘制(在图片上绘制标记)、图像查找(以图找图、以文找图OCR)、文件清理(释放磁盘空间)。当用户需要进行自动化键鼠操作、屏幕截图、获取鼠标位置、模拟键盘输入、获取图片信息、裁剪图片、绘制标记、图像识别定位或清理文件时激活此技能。
PyAutoGUI Automation Skill
Cross-platform mouse/keyboard automation for Windows, Linux, and macOS.
Features
- Mouse control: move, click, drag, scroll
- Keyboard control: key press, hotkeys, type text
- Screen operations: screenshot, mouse position, screen size
- Image utilities: image metadata (size/format/file size), crop images
- Screen overlay: draw temporary markers to validate coordinates
- Draw on images: draw persistent markers into an image and save
- Image locating: template matching and OCR-based text locating
- Cleanup: remove generated screenshots/marked files to free disk space
Activation
Activate when the user asks to do things like:
- "Click a position on the screen"
- "Move the mouse to (x, y)"
- "Type text / press keys"
- "Take a screenshot"
- "Run repetitive UI automation"
- "Get the current mouse position"
- "Get image size / image info"
- "Crop an image"
- "Draw a marker on the screen"
- "Draw a marker on an image"
- "Locate an element by template"
- "Locate text on the screen (OCR)"
- "Clean up screenshots / temporary files"
Usage
Install dependencies
# Mouse/keyboard automation
pip3 install pyautogui
# Image utilities
pip3 install Pillow
Screen info
# Screen size
python3 scripts/keyboard_mouse.py screen_size
# Mouse position
python3 scripts/keyboard_mouse.py mouse_position
Mouse actions
# Move mouse to (x, y)
python3 scripts/keyboard_mouse.py mouse_move 500 300
python3 scripts/keyboard_mouse.py mouse_move 500 300 --duration 1.0
# Mouse click (left/right/middle)
python3 scripts/keyboard_mouse.py mouse_click left
python3 scripts/keyboard_mouse.py mouse_click right
python3 scripts/keyboard_mouse.py mouse_click middle --clicks 2
# Click at a specific location
python3 scripts/keyboard_mouse.py mouse_click_at 500 300 left
python3 scripts/keyboard_mouse.py mouse_click_at 500 300 right --clicks 2
# Double click
python3 scripts/keyboard_mouse.py mouse_double_click 500 300
# Drag
python3 scripts/keyboard_mouse.py mouse_drag 500 300 800 600
python3 scripts/keyboard_mouse.py mouse_drag 500 300 800 600 --duration 2.0
# Scroll (positive = up, negative = down)
python3 scripts/keyboard_mouse.py mouse_scroll 5
python3 scripts/keyboard_mouse.py mouse_scroll -3
Keyboard actions
# Single key
python3 scripts/keyboard_mouse.py key_press enter
python3 scripts/keyboard_mouse.py key_press escape
python3 scripts/keyboard_mouse.py key_press tab
python3 scripts/keyboard_mouse.py key_press space
# Hotkeys
python3 scripts/keyboard_mouse.py key_hotkey ctrl c
python3 scripts/keyboard_mouse.py key_hotkey ctrl v
python3 scripts/keyboard_mouse.py key_hotkey win r
python3 scripts/keyboard_mouse.py key_hotkey alt tab
python3 scripts/keyboard_mouse.py key_hotkey ctrl alt t
# Type text
python3 scripts/keyboard_mouse.py type_text "Hello World"
python3 scripts/keyboard_mouse.py type_text "你好世界" --interval 0.05
Screenshot
# Save a screenshot (primary screen)
python3 scripts/keyboard_mouse.py screenshot /tmp/screenshot.png
# Windows example
python scripts/keyboard_mouse.py screenshot "E:\\temp\\screenshot.png"
Screenshot notes:
- Supported formats: PNG (recommended), JPG, BMP, etc.
- Scope: primary monitor (in multi-monitor setups)
Region Screenshot
# Screenshot specific region (x1, y1, x2, y2)
python3 scripts/keyboard_mouse.py screenshot_region region.png 100 100 500 500
# Windows example - capture QQ chat window area
python scripts/keyboard_mouse.py screenshot_region qq_window.png 2800 300 3800 1200
Parameters:
x1, y1: Top-left corner coordinatesx2, y2: Bottom-right corner coordinates- Order doesn't matter (automatically calculated)
Copy & Paste
# Copy text to clipboard
python3 scripts/keyboard_mouse.py copy "Text to copy"
# Paste from clipboard (Ctrl+V)
python3 scripts/keyboard_mouse.py paste
# Copy and paste in one command (fastest way to input text)
python3 scripts/keyboard_mouse.py copy_paste "Text to input directly"
Use cases:
copy_pasteis faster thantype_textfor long text- Use
copy_pastewhen you want to skip typing animation - Use
type_textwhen you need to simulate realistic typing
Common key names
- Letters:
abc... - Numbers:
012... - Function keys:
f1f2...f12 - Modifiers:
ctrlaltshiftwin - Others:
enteresctabspacebackspacedeleteupdownleftright
Safety
⚠️ Important:
- Make sure the target window is focused before executing actions
- Be careful with system hotkeys to avoid unintended actions
- Add delays when needed to give yourself time to interrupt
- Moving the mouse to the top-left corner (0, 0) triggers PyAutoGUI failsafe
Cross-platform notes
- Windows: Full support; admin permission may be needed in some environments
- Linux: Requires X11; Wayland may not work
- macOS: Grant Accessibility permission to Terminal/Python in System Settings
Example scenarios
Open Calculator (Windows)
python3 scripts/keyboard_mouse.py key_hotkey win r
python3 scripts/keyboard_mouse.py type_text "calc"
python3 scripts/keyboard_mouse.py key_press enter
Auto-fill a form
python3 scripts/keyboard_mouse.py mouse_click_at 500 300 left
python3 scripts/keyboard_mouse.py type_text "example@email.com"
python3 scripts/keyboard_mouse.py key_press tab
python3 scripts/keyboard_mouse.py type_text "password123"
Batch clicking
python3 scripts/keyboard_mouse.py mouse_click_at 100 100 left
python3 scripts/keyboard_mouse.py mouse_click_at 200 200 left
python3 scripts/keyboard_mouse.py mouse_click_at 300 300 left
Included scripts
scripts/keyboard_mouse.py- Mouse/keyboard controlscripts/image_utils.py- Image utilitiesscripts/draw_overlay.py- Screen overlay markersscripts/draw_on_image.py- Draw markers on imagesscripts/image_finder.py- Image locating (template + OCR)scripts/cleanup.py- Cleanup tool
Image utilities
Image info
python3 scripts/image_utils.py info screenshot.png
python3 scripts/image_utils.py size photo.jpg
Crop image
python3 scripts/image_utils.py crop screenshot.png 100 100 500 500
python3 scripts/image_utils.py crop screenshot.png 100 100 500 500 -o output.png
Output example
$ python3 scripts/image_utils.py info screenshot.png
{
"path": "screenshot.png",
"filename": "screenshot.png",
"size": {
"width": 3840,
"height": 2160
},
"format": "PNG",
"mode": "RGB",
"file_size_bytes": 2097152,
"file_size_kb": 2048.0
}
Image fields
| Field | Meaning | Example |
|---|---|---|
width |
Image width (px) | 1920, 3840 |
height |
Image height (px) | 1080, 2160 |
format |
Image format | PNG, JPEG, GIF, BMP, WEBP |
mode |
Color mode | RGB, RGBA, L |
file_size_bytes |
File size (bytes) | 2097152 |
file_size_kb |
File size (KB) | 2048.0 |
Coordinate system
Screen coordinates:
- Origin (0, 0) is the top-left corner
- X increases to the right
- Y increases downward
Crop coordinates:
x1, y1: top-left corner of cropx2, y2: bottom-right corner of crop- Cropped size = (x2 - x1) × (y2 - y1)
Example:
python3 scripts/image_utils.py crop screenshot.png 1520 880 1920 1080
Typical workflows
Analyze positions in a screenshot
python3 scripts/image_utils.py size screenshot.png
python3 scripts/image_utils.py crop screenshot.png 3440 1960 3840 2160 -o bottom_right.png
Batch image sizing
for img in *.png; do
echo -n "$img: "
python3 scripts/image_utils.py size "$img"
done
Capture a region of the screen
python3 scripts/keyboard_mouse.py screenshot full.png
python3 scripts/image_utils.py crop full.png 500 300 1000 800 -o region.png
Screen overlay markers
Draw temporary markers on the screen for coordinate verification. Useful for:
- Calibrating coordinates
- Confirming the real position of a button/element
- Debugging automation scripts
Draw a marker
python3 scripts/draw_overlay.py marker cross 500 300
python3 scripts/draw_overlay.py marker target 800 600 --duration 10
python3 scripts/draw_overlay.py marker circle 500 300 --color blue --text "Send button"
python3 scripts/draw_overlay.py marker arrow 1000 800 --direction down --color yellow
python3 scripts/draw_overlay.py marker square 600 400 --color green --size 40
Draw a rectangular area
python3 scripts/draw_overlay.py area 3028 276 3832 2098 --label "Window" --duration 8
python3 scripts/draw_overlay.py area 3744 2062 3832 2098 --label "Send button" --color red
Marker types
| Type | Description | Use case |
|---|---|---|
cross |
Crosshair | Precise single-point targeting |
circle |
Circle | Mark buttons/circular elements |
square |
Square | Mark rectangular elements |
arrow |
Arrow | Indicate direction / draw attention |
target |
Target | Strongest visual cue (circle + crosshair) |
Colors
red, green, blue, yellow, cyan, magenta, white, orange
Coordinate calibration example
python3 scripts/keyboard_mouse.py screenshot screen.png
python3 scripts/draw_overlay.py marker target 3788 2080 --text "Send button" --duration 10
python3 scripts/draw_overlay.py marker target 3790 2090 --text "Send button (adjusted)" --duration 10
python3 scripts/keyboard_mouse.py mouse_click_at 3790 2090 left
Draw markers on images
Draw persistent markers into image files. Useful for:
- Annotating recognized positions on a screenshot
- Producing reference images
- Batch marking candidates for comparison
- Keeping calibration records
Draw a marker
python3 scripts/draw_on_image.py screenshot.png marker cross 500 300
python3 scripts/draw_on_image.py screenshot.png marker target 800 600 -o marked.png
python3 scripts/draw_on_image.py screenshot.png marker circle 500 300 --color red --text "Send button"
python3 scripts/draw_on_image.py screenshot.png marker arrow 1000 800 --direction down --color yellow
python3 scripts/draw_on_image.py screenshot.png marker point 600 400 --color green --size 10
Draw a rectangular area
python3 scripts/draw_on_image.py screenshot.png area 3028 276 3832 2098 --label "Window"
python3 scripts/draw_on_image.py screenshot.png area 3744 2062 3832 2098 -o button_marked.png --label "Send button"
Batch marking workflow
python3 scripts/keyboard_mouse.py screenshot screen.png
python3 scripts/draw_on_image.py screen.png marker target 3788 2080 --text "Send button" -o step1.png
python3 scripts/draw_on_image.py step1.png marker target 3790 2090 --text "Adjusted" -o step2.png
python3 scripts/draw_on_image.py step2.png marker circle 3000 1500 --text "Avatar area" -o final.png
Screen overlay vs drawing on image
| Item | Screen overlay (draw_overlay.py) | Draw on image (draw_on_image.py) |
|---|---|---|
| Display | Real-time on screen | Inside the image file |
| Duration | Temporary | Persistent |
| Interaction | Auto-close (time) | No interaction |
| Best for | Real-time coordinate validation | Generating annotated references |
| Output | Not saved | Saved to file |
Recommended coordinate calibration (cost-saving)
python3 scripts/keyboard_mouse.py screenshot screen.png
python3 scripts/image_utils.py size screen.png
python3 scripts/draw_on_image.py screen.png marker target 3788 2080 --text "Candidate 1" -o marked1.png
python3 scripts/draw_on_image.py screen.png marker target 3790 2090 --text "Candidate 2" -o marked2.png
python3 scripts/draw_on_image.py screen.png marker target 3785 2085 --text "Candidate 3" -o marked3.png
python3 scripts/draw_overlay.py marker target 3790 2090 --duration 3
python3 scripts/keyboard_mouse.py mouse_click_at 3790 2090 left
Image locating
Built on OpenCV template matching and RapidOCR. Supports locating UI elements by image and by text.
Install dependencies
pip install opencv-python numpy rapidocr_onnxruntime
Note: RapidOCR models are ~15MB and are downloaded automatically on first use.
Template matching (find by image)
python3 scripts/image_finder.py image button.png
python3 scripts/image_finder.py image button.png --all
python3 scripts/image_finder.py image button.png --threshold 0.95
python3 scripts/image_finder.py image button.png --mark
python3 scripts/image_finder.py image button.png --click
Output example:
✅ Match found: position (3788, 2080), similarity: 98.50%
OCR text locating (find by text)
python3 scripts/image_finder.py text "Send"
python3 scripts/image_finder.py text "OK" --click
python3 scripts/image_finder.py text "Send" --mark-on-image checked.png
python3 scripts/image_finder.py text-all
python3 scripts/image_finder.py text "Login" --confidence 0.9
Output example:
✅ Found 2 candidates containing 'Send':
[1] Text: 'Send', position: (3788, 2080), confidence: 95%
[2] Text: 'Send to all', position: (2100, 1500), confidence: 88%
Recommended automation workflows
Template matching (most accurate):
python3 scripts/image_finder.py image qq_send_button.png --threshold 0.9
python3 scripts/draw_on_image.py marker screen.png target 3788 2080 --text "Candidate 1" -o check1.png
python3 scripts/keyboard_mouse.py mouse_click_at 3788 2080 left
OCR text locating (when no template is available):
python3 scripts/image_finder.py text "Send"
python3 scripts/keyboard_mouse.py mouse_click_at 3548 1462 left
Important principle:
- OCR returns accurate screen coordinates; do not modify the returned coordinates
- If there are multiple candidates, mark them on an image to visually choose the correct one
- Once you choose the right candidate, click using the original coordinates
Template matching vs OCR
| Item | Template matching | OCR text locating |
|---|---|---|
| Accuracy | ⭐⭐⭐⭐⭐ pixel-level | ⭐⭐⭐⭐ depends on font/background |
| Speed | ⭐⭐⭐⭐⭐ milliseconds | ⭐⭐⭐ requires inference |
| Dependencies | OpenCV | RapidOCR |
| Best for | Icons/buttons/fixed UI | Text buttons/labels/inputs |
Why this is better than guessing coordinates
- High precision and repeatability (pixel-level)
- Local compute with no API cost
- Fast response
- Easy to debug via marked outputs
Cleanup
Analyze disk usage
python3 scripts/cleanup.py analyze .
Clean files
python3 scripts/cleanup.py clean . --days 7
python3 scripts/cleanup.py clean . --days 7 --execute
python3 scripts/cleanup.py clean . --size 1024 --execute
python3 scripts/cleanup.py clean . --execute
Auto cleanup
python3 scripts/cleanup.py auto . --max-files 50 --max-size 100
python3 scripts/cleanup.py auto . --max-files 20 --max-size 50
End-to-end example
python3 scripts/keyboard_mouse.py screenshot screen.png
python3 scripts/draw_on_image.py marker screen.png target 500 300 --text "Button" -o marked.png
python3 scripts/cleanup.py analyze .
python3 scripts/cleanup.py clean . --days 1 --execute
python3 scripts/cleanup.py auto . --max-files 10 --max-size 50
Command quick reference
Mouse/keyboard (keyboard_mouse.py)
| Command | Description | Example |
|---|---|---|
screen_size |
Get screen size | keyboard_mouse.py screen_size |
mouse_position |
Get mouse position | keyboard_mouse.py mouse_position |
mouse_move x y |
Move mouse | keyboard_mouse.py mouse_move 500 300 |
mouse_click button |
Click mouse | keyboard_mouse.py mouse_click left |
mouse_click_at x y button |
Click at coordinates | keyboard_mouse.py mouse_click_at 500 300 left |
mouse_double_click x y |
Double click | keyboard_mouse.py mouse_double_click 500 300 |
mouse_drag x1 y1 x2 y2 |
Drag | keyboard_mouse.py mouse_drag 500 300 800 600 |
mouse_scroll amount |
Scroll | keyboard_mouse.py mouse_scroll 5 |
key_press key |
Press key | keyboard_mouse.py key_press enter |
key_hotkey key1 key2 |
Hotkey | keyboard_mouse.py key_hotkey ctrl c |
type_text text |
Type text | keyboard_mouse.py type_text "Hello" |
screenshot path |
Screenshot | keyboard_mouse.py screenshot img.png |
Image utilities (image_utils.py)
| Command | Description | Example |
|---|---|---|
info path |
Full image info | image_utils.py info photo.png |
size path |
Image size only | image_utils.py size photo.jpg |
crop x1 y1 x2 y2 |
Crop image | image_utils.py crop img.png 100 100 500 500 |
Screen overlay (draw_overlay.py)
| Command | Description | Example |
|---|---|---|
marker type x y |
Draw marker | draw_overlay.py marker target 500 300 |
area x1 y1 x2 y2 |
Draw rectangle | draw_overlay.py area 100 100 500 400 |
Draw on image (draw_on_image.py)
| Command | Description | Example |
|---|---|---|
marker type x y |
Draw marker on image | draw_on_image.py img.png marker target 500 300 |
area x1 y1 x2 y2 |
Draw rectangle on image | draw_on_image.py img.png area 100 100 500 400 |
Image finder (image_finder.py)
| Command | Description | Example |
|---|---|---|
image template |
Find by template | image_finder.py image button.png |
text str |
Find by text (OCR) | image_finder.py text "Send" |
text-all |
Recognize all text | image_finder.py text-all |
Cleanup (cleanup.py)
| Command | Description | Example |
|---|---|---|
analyze dir |
Analyze disk usage | cleanup.py analyze . |
clean dir |
Clean files | cleanup.py clean . --days 7 --execute |
auto dir |
Auto cleanup | cleanup.py auto . --max-files 50 |
Reviews (0)
No reviews yet. Be the first to review!
Comments (0)
No comments yet. Be the first to share your thoughts!