desktop-automation-100per100-local
Automate desktop tasks locally with mouse, keyboard, window control, OCR, and image recognition using Python on Windows/macOS/Linux.
Description
Desktop Automation Skill
Desktop automation: mouse, keyboard, window control, OCR, and image recognition, all running locally.
🎯 About
OpenClaw skill to automate desktop interactions on Windows/macOS/Linux using Python. Uses PyAutoGUI for basic actions, OpenCV for image recognition, and pytesseract for OCR.
Perfect for:
- Automating legacy apps without an API
- Filling repetitive forms
- Launching GUI workflows from commands
- Recording and replaying macros
🔐 Security & Privacy
⚠️ Keyboard Recording Warning
The macro recorder captures ALL keyboard events, including:
- Passwords
- Credit card numbers
- Personal identification numbers
- Private messages
Never record macros while entering sensitive credentials. Only use macro recording for non-sensitive workflows. Recorded macro files store raw keystrokes — store them securely.
Data Handling
- No network access: All actions execute locally. No data is sent to external servers.
- No credential storage: The skill does not store or transmit passwords unless you explicitly record them in a macro.
- Local execution: Dependencies are standard Python packages (PyAutoGUI, OpenCV, etc.) — no external API calls.
Best Practices
- Verify coordinates before clicking on sensitive windows.
- Test macros in dry-run or slow speed first.
- Use `activate_window` with precise titles to avoid misclicks.
- Restrict file permissions on recorded macro JSON files (they may contain sensitive input).
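As an illustration of the last point, the sketch below tightens the permissions of a recorded macro file so that only its owner can read or write it. The helper name `restrict_to_owner` is hypothetical and not part of the skill. On Windows, `os.chmod` only toggles the read-only flag, so NTFS permissions would have to be set another way (for example with `icacls`).

```python
import os
import stat
from pathlib import Path


def restrict_to_owner(path):
    """Make a file readable and writable by its owner only (mode 0o600).

    Hypothetical helper, not part of the skill. Returns the resulting
    permission bits so the caller can verify them.
    """
    p = Path(path)
    os.chmod(p, stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(p.stat().st_mode)
```

A call such as `restrict_to_owner("recorded_macro/macro_2026-03-14_22-00-00.json")` could be run right after each recording.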
📦 Installation
Prerequisites
- Python 3.10+
- pip
Install dependencies
pip install -r requirements.txt
Enable the skill
The folder desktop-automation-100per100-local must be in C:\Users\Noham iA\.openclaw\workspace\skills\.
Restart OpenClaw:
openclaw gateway restart
🚀 Available Actions
Basic
| Action | Parameters | Description |
|---|---|---|
| `click` | `x`, `y`, `button?` ("left"/"right"/"middle") | Click at coordinates |
| `type` | `text`, `interval?` (float) | Type with optional interval |
| `screenshot` | `path?` (default: `~/Desktop/screenshot.png`) | Capture screen |
| `get_active_window` | (none) | Returns `{title, x, y, width, height}` of active window |
| `list_windows` | (none) | List all windows (`[{title, x, y, width, height, is_active}]`) |
| `activate_window` | `title_substring` | Activate first matching window |
| `move_mouse` | `x`, `y` | Move cursor |
| `press_key` | `key` (e.g. 'enter', 'tab', 'escape', 'space') | Press a single key |
| `scroll` | `amount` (positive=up, negative=down) | Scroll |
| `copy_to_clipboard` | `text` | Copy to clipboard (requires pyperclip) |
| `paste_from_clipboard` | (none) | Paste (Ctrl+V) |
| `drag` | `start_x`, `start_y`, `end_x`, `end_y`, `duration?`, `button?` | Drag-and-drop |
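The skill's own dispatch code is not shown here. As a rough sketch, several of the basic actions above could map onto PyAutoGUI calls as follows. `run_action` is a hypothetical name, the 0.5 s default drag duration is an assumption, and the `gui` argument is expected to be the `pyautogui` module (or any object with the same functions, which makes the mapping easy to try without moving the real mouse).

```python
import os


def run_action(gui, action, params):
    """Dispatch one basic action to a PyAutoGUI-like object `gui`.

    Hypothetical sketch: parameter names follow the table above, the
    defaults for `button` and `duration` are assumptions.
    """
    if action == "click":
        gui.click(x=params["x"], y=params["y"], button=params.get("button", "left"))
    elif action == "type":
        gui.write(params["text"], interval=params.get("interval", 0.0))
    elif action == "move_mouse":
        gui.moveTo(params["x"], params["y"])
    elif action == "press_key":
        gui.press(params["key"])
    elif action == "scroll":
        gui.scroll(params["amount"])  # positive scrolls up, negative down
    elif action == "screenshot":
        # Expand "~" so the documented default path works on every OS.
        path = os.path.expanduser(params.get("path", "~/Desktop/screenshot.png"))
        gui.screenshot(path)
        return path
    elif action == "drag":
        gui.moveTo(params["start_x"], params["start_y"])
        gui.dragTo(
            params["end_x"],
            params["end_y"],
            duration=params.get("duration", 0.5),
            button=params.get("button", "left"),
        )
    else:
        raise ValueError(f"unsupported action: {action}")
    return None
```

With PyAutoGUI installed, `import pyautogui; run_action(pyautogui, "click", {"x": 100, "y": 200})` would perform a real click.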
Advanced
| Action | Parameters | Description |
|---|---|---|
| `find_image` | `template_path`, `confidence?` (0.0-1.0) | Find image on screen (OpenCV). Returns `{x, y, confidence}` or `not_found` |
| `wait_for_image` | `template_path`, `timeout?`, `interval?`, `confidence?` | Wait for an image to appear |
| `find_text_on_screen` | `text`, `lang?` ('fra' default) | OCR: search text on screen (requires Tesseract) |
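For orientation, here is a minimal sketch of how `find_image` and `wait_for_image` could be built on OpenCV template matching. It is not the skill's actual implementation. The default values, returning the centre of the match (rather than its top-left corner), returning `None` when nothing is found, and the `finder` parameter are all assumptions made for illustration.

```python
import time


def find_image(template_path, confidence=0.9):
    """Locate a template on screen; return {x, y, confidence} or None."""
    # Imported lazily so this module still loads on machines without a display.
    import cv2
    import numpy as np
    import pyautogui

    template = cv2.imread(template_path, cv2.IMREAD_COLOR)
    if template is None:
        raise FileNotFoundError(template_path)
    # PyAutoGUI returns a PIL image; OpenCV expects a BGR array.
    shot = pyautogui.screenshot().convert("RGB")
    screen = cv2.cvtColor(np.array(shot), cv2.COLOR_RGB2BGR)
    scores = cv2.matchTemplate(screen, template, cv2.TM_CCOEFF_NORMED)
    _, best, _, (left, top) = cv2.minMaxLoc(scores)
    if best < confidence:
        return None
    height, width = template.shape[:2]
    return {"x": left + width // 2, "y": top + height // 2, "confidence": float(best)}


def wait_for_image(template_path, timeout=10.0, interval=0.5,
                   confidence=0.9, finder=find_image):
    """Poll `finder` until it returns a match or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        match = finder(template_path, confidence)
        if match is not None:
            return match
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

Template matching of this kind is sensitive to scale, so a template captured at a different display scaling will typically score below the threshold.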
📝 Usage Examples
1. Click and type
sessions_spawn({ task: 'click {"x":100,"y":200}', label: 'desktop-automation-100per100-local' });
sessions_spawn({ task: 'type {"text":"Hello World"}', label: 'desktop-automation-100per100-local' });
2. Screenshot and list windows
sessions_spawn({ task: 'list_windows', label: 'desktop-automation-100per100-local' });
sessions_spawn({ task: 'screenshot {"path":"~/Desktop/my_screen.png"}', label: 'desktop-automation-100per100-local' });
3. Activate window
sessions_spawn({ task: 'activate_window {"title_substring":"Notepad"}', label: 'desktop-automation-100per100-local' });
4. Image search (with OpenCV)
sessions_spawn({
task: 'find_image {"template_path":"C:/path/button.png","confidence":0.9}',
label: 'desktop-automation-100per100-local'
});
5. Full macro (record/play)
// Start GUI recording
sessions_spawn({ task: 'record_macro', label: 'desktop-automation-100per100-local' });
// Stop → file generated in recorded_macro/
// Replay
sessions_spawn({
task: 'play_macro {"macro_path":"C:/.../macro_2026-03-14_22-00-00.json"}',
label: 'desktop-automation-100per100-local'
});
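Each `task` string in the examples above has the shape `action {json-parameters}`, with the JSON part omitted for actions that take no parameters. How the skill parses this internally is not documented here; on the Python side it could be split as in the sketch below, where `parse_task` is a hypothetical helper.

```python
import json


def parse_task(task):
    """Split 'action {json}' into (action, params); params default to {}."""
    action, _, raw = task.strip().partition(" ")
    if not action:
        raise ValueError("empty task")
    params = json.loads(raw) if raw.strip() else {}
    if not isinstance(params, dict):
        raise ValueError("parameters must be a JSON object")
    return action, params
```

For example, `parse_task('click {"x":100,"y":200}')` yields the action name `click` and the parameter dictionary `{"x": 100, "y": 200}`.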
⚙️ Macro Recording
The skill includes a Tkinter GUI to record macros:
- Launch: `python scripts/record_macro.py`
- Records: mouse, clicks, scrolling, keyboard, window changes
- Saves JSON in `recorded_macro/` (timestamped name)
- Playback: `python scripts/play_macro.py <file.json> [speed]`
CLI example:
python scripts/record_macro.py
python scripts/play_macro.py recorded_macro/macro_2026-03-14_22-00-00.json 1.0
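The macro file format and the exact meaning of the `speed` argument are not specified here. The sketch below assumes a purely hypothetical schema (a JSON list of events, each with a `time` field in seconds since the start of the recording) and assumes that `speed` is a multiplier, so `2.0` replays twice as fast. It only shows how the pauses between events could be scaled; `handler` stands in for whatever performs each event.

```python
import json
import time


def scaled_delays(events, speed=1.0):
    """Return the pause before each event, with recorded gaps divided by speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    delays, previous = [], None
    for event in events:
        t = event["time"]  # hypothetical field: seconds since recording start
        delays.append(0.0 if previous is None else max(0.0, (t - previous) / speed))
        previous = t
    return delays


def play(path, handler, speed=1.0):
    """Load a macro file and pass each event to `handler` at the scaled pace."""
    with open(path, encoding="utf-8") as fh:
        events = json.load(fh)
    for delay, event in zip(scaled_delays(events, speed), events):
        time.sleep(delay)
        handler(event)
```

Replaying at a low speed such as `0.5` first leaves time to abort if the macro does something unexpected.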
🔧 Configuration
OCR (pytesseract)
- Install Tesseract on your system: https://github.com/tesseract-ocr/tesseract
- On Windows: add to PATH or set `pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'`
- Language packs: `fra`, `eng`, etc.
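Once Tesseract is configured, an OCR search like `find_text_on_screen` could look roughly like the sketch below. This is an assumption about the approach, not the skill's code: it uses `pytesseract.image_to_data`, which reports one entry per recognised word, so it matches single words only. The helper `find_word` and the returned fields are hypothetical.

```python
def find_word(ocr_data, text):
    """Search image_to_data output (dict form) for a word; return its box centre."""
    needle = text.casefold()
    for i, word in enumerate(ocr_data["text"]):
        if needle in word.casefold():
            return {
                "x": ocr_data["left"][i] + ocr_data["width"][i] // 2,
                "y": ocr_data["top"][i] + ocr_data["height"][i] // 2,
                "text": word,
            }
    return None


def find_text_on_screen(text, lang="fra"):
    """Take a screenshot, run Tesseract on it, and look for `text`."""
    # Imported lazily: both packages need a desktop session and Tesseract.
    import pyautogui
    import pytesseract

    data = pytesseract.image_to_data(
        pyautogui.screenshot(), lang=lang, output_type=pytesseract.Output.DICT
    )
    return find_word(data, text)
```

The `lang` value must match an installed language pack, otherwise Tesseract reports an error instead of returning results.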
Pyperclip (copy/paste)
- On Linux: `xclip` or `xsel` required
- On Windows/macOS: usually works out of the box
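A quick way to check the Linux requirement before relying on the clipboard actions is to look for either tool on `PATH`. The function below is a hypothetical pre-flight check, not part of the skill; the `"builtin"` return value is just a label chosen for this sketch.

```python
import shutil
import sys


def clipboard_backend():
    """Return the first Linux clipboard helper found on PATH, or None."""
    if not sys.platform.startswith("linux"):
        return "builtin"  # Windows/macOS: no extra tool is normally needed
    for tool in ("xclip", "xsel"):
        if shutil.which(tool):
            return tool
    return None
```

If it returns `None` on Linux, installing `xclip` or `xsel` through the distribution's package manager should resolve the problem.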
⚠️ Safety & Best Practices
- Full control: the skill controls mouse and keyboard — use carefully
- No network access: isolated, no external data
- Coordinates: verify before clicking (use `screenshot` for reference)
- Windows: `activate_window` assumes a unique title, so be precise
- Macros: always test in dry-run or slow speed first
- Permissions: on Windows, may require admin rights for some apps
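How the skill implements dry-run is not described here. For your own scripts, one possible approach is to pass the GUI backend in as an argument and substitute a recorder that logs calls instead of performing them. `DryRunGui` and `fill_form` below are hypothetical names used only for this sketch.

```python
class DryRunGui:
    """Stand-in for pyautogui that records calls instead of performing them."""

    def __init__(self):
        self.log = []

    def __getattr__(self, name):
        # Reached only for attributes that do not exist on the instance,
        # which here means any PyAutoGUI-style function name.
        def record(*args, **kwargs):
            self.log.append((name, args, kwargs))
        return record


def fill_form(gui):
    """Example workflow written against the backend passed in as `gui`."""
    gui.click(x=100, y=200)
    gui.write("Hello World", interval=0.05)
    gui.press("enter")
```

Calling `fill_form(DryRunGui())` and inspecting `.log` shows what would happen. For a slow real run, pass the `pyautogui` module itself after setting `pyautogui.PAUSE = 1.0`, which adds a one-second pause after every PyAutoGUI call.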
🐛 Troubleshooting
| Problem | Solution |
|---|---|
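| `pyautogui.FailSafeException` | Raised when the cursor reaches the screen corner (0,0), which triggers PyAutoGUI's fail-safe. Keep the cursor away from the corner, or set `pyautogui.FAILSAFE = False` to disable the fail-safe (not recommended) |
| OCR finds nothing | Check the Tesseract installation and language pack. `pytesseract.get_tesseract_version()` confirms that the binary is found |
| `find_image` fails | Ensure the template matches exactly (same scale/color). Adjust confidence (0.7-0.95) |
| `activate_window` not found | Use `list_windows` to see exact titles |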
| ImportError (missing module) | pip install -r requirements.txt (check Python 3.10+) |
| Tkinter GUI won't open | On Linux: apt-get install python3-tk. On Windows: usually present |
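Several rows above come down to a missing package or binary. The sketch below gathers those checks in one place. The module list is an assumption based on the libraries named in this README (the contents of `requirements.txt` are not shown), and `diagnose` is a hypothetical helper.

```python
import importlib.util
import shutil
import sys

# Assumed from the libraries named in this README, not from requirements.txt.
REQUIRED_MODULES = ("pyautogui", "cv2", "pytesseract", "pyperclip", "tkinter")


def diagnose():
    """Return a dict of check name -> bool for the common failure causes."""
    report = {"python>=3.10": sys.version_info >= (3, 10)}
    for name in REQUIRED_MODULES:
        # find_spec locates the module without importing it.
        report[f"module:{name}"] = importlib.util.find_spec(name) is not None
    report["tesseract on PATH"] = shutil.which("tesseract") is not None
    return report


if __name__ == "__main__":
    for check, ok in diagnose().items():
        print(f"{'OK  ' if ok else 'FAIL'} {check}")
```

A failed `tesseract on PATH` check is not necessarily fatal on Windows if `pytesseract.pytesseract.tesseract_cmd` has been set explicitly.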
🤝 Contributing
This skill is open-source under the MIT license. Contributions are welcome!
How to contribute
- Fork the repository
- Create a branch (`git checkout -b feature/improvement`)
- Commit (`git commit -am 'Add X'`)
- Push (`git push origin feature/improvement`)
- Open a Pull Request
Guidelines
- Follow existing code style (PEP 8, 4 spaces)
- Add docstrings in English
- Test locally
- Update README.md if needed
See CONTRIBUTING.md for more details.
📄 License
MIT License — see LICENSE file.
Copyright (c) 2026 — Jordane Guemara & contributors
🙏 Authors
Original creator: Jordane Guemara — https://github.com/JordaneParis
See AUTHORS.md for full contributors list.
🔗 Links
Version: 2.0.0 — ultra-robust, thread-safe, with Tkinter logging