
desktop-automation-100per100-local

Automate desktop tasks locally with mouse, keyboard, window control, OCR, and image recognition using Python on Windows/macOS/Linux.

v2.0.0


Desktop Automation Skill

License: MIT OpenClaw

Desktop automation: mouse, keyboard, window control, OCR, and image recognition, all local.


🎯 About

OpenClaw skill to automate desktop interactions on Windows/macOS/Linux using Python. Uses PyAutoGUI for basic actions, OpenCV for image recognition, and pytesseract for OCR.

Perfect for:

  • Automating legacy apps without an API
  • Filling repetitive forms
  • Launching GUI workflows from commands
  • Recording and replaying macros

🔐 Security & Privacy

⚠️ Keyboard Recording Warning

The macro recorder captures ALL keyboard events, including:

  • Passwords
  • Credit card numbers
  • Personal identification numbers
  • Private messages

Never record macros while entering sensitive credentials. Only use macro recording for non-sensitive workflows. Recorded macro files store raw keystrokes — store them securely.

Data Handling

  • No network access: All actions execute locally. No data is sent to external servers.
  • No credential storage: The skill does not store or transmit passwords unless you explicitly record them in a macro.
  • Local execution: Dependencies are standard Python packages (PyAutoGUI, OpenCV, etc.) — no external API calls.

Best Practices

  • Verify coordinates before clicking on sensitive windows.
  • Test macros in dry-run mode or at slow speed first.
  • Use activate_window with precise titles to avoid misclicks.
  • Restrict file permissions on recorded macro JSON files (they may contain sensitive input).
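
As an illustration of the last point, a macro file can be restricted to its owner on macOS/Linux. This is a minimal sketch: the helper name lock_down is hypothetical and not part of the skill, and on Windows os.chmod only toggles the read-only flag, so NTFS permissions would have to be set another way.

```python
import os
import stat
from pathlib import Path


def lock_down(path: str) -> int:
    """Restrict a file to owner read/write (0o600) and return the resulting mode bits.

    Hypothetical helper for illustration; the skill itself may not do this.
    """
    p = Path(path)
    os.chmod(p, stat.S_IRUSR | stat.S_IWUSR)  # owner may read and write, nobody else
    return stat.S_IMODE(p.stat().st_mode)
```

Calling lock_down on a freshly recorded file in recorded_macro/ would leave it readable only by the account that recorded it.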

📦 Installation

Prerequisites

  • Python 3.10+
  • pip

Install dependencies

pip install -r requirements.txt

Enable the skill

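The folder desktop-automation-100per100-local must be in the OpenClaw workspace skills directory, for example C:\Users\Noham iA\.openclaw\workspace\skills\ for a Windows user named Noham iA (substitute your own user name). Then restart OpenClaw: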

openclaw gateway restart

🚀 Available Actions

Basic

Action | Parameters | Description
--- | --- | ---
click | x, y, button? ("left"/"right"/"middle") | Click at coordinates
type | text, interval? (float) | Type with optional interval
screenshot | path? (default: ~/Desktop/screenshot.png) | Capture screen
get_active_window | (none) | Returns {title, x, y, width, height} of active window
list_windows | (none) | List all windows ([{title, x, y, width, height, is_active}])
activate_window | title_substring | Activate first matching window
move_mouse | x, y | Move cursor
press_key | key (e.g. 'enter', 'tab', 'escape', 'space') | Press a single key
scroll | amount (positive=up, negative=down) | Scroll
copy_to_clipboard | text | Copy to clipboard (requires pyperclip)
paste_from_clipboard | (none) | Paste (Ctrl+V)
drag | start_x, start_y, end_x, end_y, duration?, button? | Drag-and-drop

Advanced

Action | Parameters | Description
--- | --- | ---
find_image | template_path, confidence? (0.0-1.0) | Find image on screen (OpenCV). Returns {x, y, confidence} or not_found
wait_for_image | template_path, timeout?, interval?, confidence? | Wait for an image to appear
find_text_on_screen | text, lang? ('fra' default) | OCR: search text on screen (requires Tesseract)
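
The timeout and interval parameters of wait_for_image suggest a polling loop. The sketch below shows that pattern in isolation; it is not the skill's actual implementation. The function name wait_for, the finder callback, and the default values are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional


def wait_for(finder: Callable[[], Optional[dict]],
             timeout: float = 10.0,
             interval: float = 0.5) -> Optional[dict]:
    """Call finder() repeatedly until it returns a match or `timeout` seconds pass.

    `finder` stands in for a single find_image attempt (hypothetical): it should
    return a dict such as {"x": ..., "y": ..., "confidence": ...} or None.
    """
    deadline = time.monotonic() + timeout
    while True:
        match = finder()
        if match is not None:
            return match
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None  # timed out without a match
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

A shorter interval reacts faster but runs the image search more often, which can be costly on large screens.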

📝 Usage Examples

1. Click and type

sessions_spawn({ task: 'click {"x":100,"y":200}', label: 'desktop-automation-100per100-local' });
sessions_spawn({ task: 'type {"text":"Hello World"}', label: 'desktop-automation-100per100-local' });

2. Screenshot and list windows

sessions_spawn({ task: 'list_windows', label: 'desktop-automation-100per100-local' });
sessions_spawn({ task: 'screenshot {"path":"~/Desktop/my_screen.png"}', label: 'desktop-automation-100per100-local' });

3. Activate window

sessions_spawn({ task: 'activate_window {"title_substring":"Notepad"}', label: 'desktop-automation-100per100-local' });
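
Since activate_window picks the first match, it can help to check how many windows a substring would hit before using it. The sketch below filters data in the shape documented for list_windows; the helper name matching_windows and the case-insensitive comparison are assumptions, and the sample window data is made up.

```python
from typing import Dict, List


def matching_windows(windows: List[Dict], title_substring: str) -> List[Dict]:
    """Return every window whose title contains title_substring.

    The comparison is case-insensitive here; whether the skill itself
    ignores case is not documented, so treat that as an assumption.
    """
    needle = title_substring.lower()
    return [w for w in windows if needle in w.get("title", "").lower()]


# Made-up sample in the shape returned by list_windows.
windows = [
    {"title": "Untitled - Notepad", "x": 0, "y": 0, "width": 800, "height": 600, "is_active": True},
    {"title": "notes.txt - Notepad", "x": 50, "y": 50, "width": 800, "height": 600, "is_active": False},
    {"title": "Calculator", "x": 900, "y": 100, "width": 320, "height": 480, "is_active": False},
]

hits = matching_windows(windows, "Notepad")
if len(hits) > 1:
    # Two Notepad windows match, so a longer substring would be safer.
    print("Ambiguous:", [w["title"] for w in hits])
```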

4. Image search (with OpenCV)

sessions_spawn({
  task: 'find_image {"template_path":"C:/path/button.png","confidence":0.9}',
  label: 'desktop-automation-100per100-local'
});

5. Full macro (record/play)

// Start GUI recording
sessions_spawn({ task: 'record_macro', label: 'desktop-automation-100per100-local' });
// Stop → file generated in recorded_macro/
// Replay
sessions_spawn({
  task: 'play_macro {"macro_path":"C:/.../macro_2026-03-14_22-00-00.json"}',
  label: 'desktop-automation-100per100-local'
});
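
The task strings in these examples share one shape: an action name, optionally followed by a JSON object of parameters. The sketch below shows one way to split such a string; parse_task is a hypothetical name, and how the skill actually parses tasks is not shown in this document.

```python
import json
from typing import Any, Dict, Tuple


def parse_task(task: str) -> Tuple[str, Dict[str, Any]]:
    """Split 'action {json}' into (action, params); params default to {}."""
    # Everything before the first space is the action; the rest is JSON, if any.
    action, _, raw = task.strip().partition(" ")
    params = json.loads(raw) if raw.strip() else {}
    if not isinstance(params, dict):
        raise ValueError("parameters must be a JSON object")
    return action, params


print(parse_task('click {"x":100,"y":200}'))
print(parse_task("list_windows"))
```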

⚙️ Macro Recording

The skill includes a Tkinter GUI to record macros:

  • Launch: python scripts/record_macro.py
  • Records: mouse, clicks, scrolling, keyboard, window changes
  • Saves JSON in recorded_macro/ (timestamped name)
  • Playback: python scripts/play_macro.py <file.json> [speed]

CLI example:

python scripts/record_macro.py
python scripts/play_macro.py recorded_macro/macro_2026-03-14_22-00-00.json 1.0
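
The meaning of the speed argument is not spelled out here. A plausible reading is that it scales the pauses between recorded events, so 2.0 plays twice as fast. The sketch below illustrates that reading only; the event format (a "t" timestamp in seconds) and the function name scaled_delays are hypothetical, since the macro JSON schema is not documented in this section.

```python
from typing import Dict, List


def scaled_delays(events: List[Dict], speed: float = 1.0) -> List[float]:
    """Return the pause before each event, with recorded gaps divided by `speed`.

    Assumes each event carries a "t" timestamp in seconds (hypothetical schema).
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    delays = []
    previous = events[0]["t"] if events else 0.0
    for event in events:
        delays.append((event["t"] - previous) / speed)
        previous = event["t"]
    return delays


recorded = [{"t": 0.0}, {"t": 0.5}, {"t": 2.0}]
print(scaled_delays(recorded, speed=0.5))  # slower playback: gaps are doubled
```

Under this reading, a speed below 1.0 is the slow mode recommended for a first test run.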

🔧 Configuration

OCR (pytesseract)

  • Install Tesseract on your system: https://github.com/tesseract-ocr/tesseract
  • On Windows: add to PATH or set pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
  • Language packs: fra, eng, etc.
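
Before setting tesseract_cmd, it can be useful to check where the executable actually is. The sketch below uses only the standard library; find_tesseract is a hypothetical helper, and the fallback is the Windows default path quoted in the notes above.

```python
import os
import shutil
from typing import Optional

# Default Windows install location quoted in the configuration notes.
WINDOWS_DEFAULT = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def find_tesseract(fallback: str = WINDOWS_DEFAULT) -> Optional[str]:
    """Return the Tesseract executable found on PATH, else `fallback` if it exists."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    return fallback if os.path.isfile(fallback) else None


# With pytesseract installed, the result could then be applied like this:
#   path = find_tesseract()
#   if path:
#       pytesseract.pytesseract.tesseract_cmd = path
```

If the function returns None, Tesseract is most likely not installed or sits in a non-standard directory.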

Pyperclip (copy/paste)

  • On Linux: xclip or xsel required
  • On Windows/macOS: usually works out of the box

⚠️ Safety & Best Practices

  • Full control: the skill controls mouse and keyboard — use carefully
  • No network access: everything runs locally and no data leaves the machine
  • Coordinates: verify before clicking (use screenshot for reference)
  • Windows: activate_window assumes a unique title — be precise
  • Macros: always test in dry-run mode or at slow speed first
  • Permissions: on Windows, may require admin rights for some apps
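
The coordinate check can be partly automated with a bounds test before each click. This is a minimal sketch: is_safe_target and its margin parameter are hypothetical, and the margin also keeps targets away from the screen corners, where PyAutoGUI's fail-safe triggers.

```python
def is_safe_target(x: int, y: int, screen_w: int, screen_h: int, margin: int = 5) -> bool:
    """True if (x, y) is on screen and at least `margin` pixels from every edge."""
    return margin <= x < screen_w - margin and margin <= y < screen_h - margin


# With PyAutoGUI installed, the screen size could come from pyautogui.size():
#   screen_w, screen_h = pyautogui.size()
print(is_safe_target(100, 200, 1920, 1080))
print(is_safe_target(0, 0, 1920, 1080))
```

A bounds test only catches off-screen or edge coordinates; it cannot tell whether the right window is under the cursor, so a reference screenshot is still worth taking.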

🐛 Troubleshooting

Problem | Solution
--- | ---
pyautogui.FailSafeException | Raised when the cursor is moved into a corner of the screen, such as (0,0); this is PyAutoGUI's emergency stop. Keep the cursor and click targets away from the corners, or set pyautogui.FAILSAFE = False to disable the fail-safe (this removes the emergency stop)
OCR finds nothing | Check the Tesseract installation and language pack. Verify with pytesseract.get_tesseract_version()
find_image fails | Ensure the template matches exactly (same scale/color). Adjust confidence (0.7-0.95)
activate_window not found | Use list_windows to see exact titles
ImportError (missing module) | pip install -r requirements.txt (check Python 3.10+)
Tkinter GUI won't open | On Linux: apt-get install python3-tk. On Windows: usually present

🤝 Contributing

This skill is open-source under the MIT license. Contributions are welcome!

How to contribute

  1. Fork the repository
  2. Create a branch (git checkout -b feature/improvement)
  3. Commit (git commit -am 'Add X')
  4. Push (git push origin feature/improvement)
  5. Open a Pull Request

Guidelines

  • Follow existing code style (PEP 8, 4 spaces)
  • Add docstrings in English
  • Test locally
  • Update README.md if needed

See CONTRIBUTING.md for more details.


📄 License

MIT License — see LICENSE file.

Copyright (c) 2026 — Jordane Guemara & contributors


🙏 Authors

Original creator: Jordane Guemara — https://github.com/JordaneParis

See AUTHORS.md for full contributors list.


Version: 2.0.0 — ultra-robust, thread-safe, with Tkinter logging
