🧪 Skills

virtual-remote-desktop

KasmVNC-based virtual desktop for headless Linux with AI-first automation and human handoff. Use when most steps are automated but a user must manually inter...

v1.0.3
❤️ 2
⬇️ 957
👁 1
Share

Description


name: virtual-remote-desktop description: KasmVNC-based virtual desktop for headless Linux with AI-first automation and human handoff. Use when most steps are automated but a user must manually intervene for captcha/risk-control/login approval, then return to automation. Includes requirement-driven setup for mobile/desktop takeover and browser mobile/desktop rendering.

Virtual Remote Desktop (KasmVNC edition)

What this skill is for

Use this when the workflow is:

  1. AI runs browser automation most of the time
  2. Captcha / risk-control / MFA appears
  3. User takes over remotely for 1-3 minutes
  4. AI continues automatically

This version replaces x11vnc+noVNC with KasmVNC and keeps computer-use style action scripts for AI control.

Core commands

0) Install KasmVNC (one-time)

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/install_kasmvnc.sh

This installer also prepares required runtime tools:

  • fluxbox (lightweight desktop)
  • xdotool + scrot (computer-use actions)
  • xauth

1) Requirement-driven start (recommended)

Before starting, always confirm these user requirements:

  1. takeover device: 手机 or 电脑
  2. website render mode: 手机网页 or 桌面网页
  3. access mode: 本地隧道 (127.0.0.1) or 临时公网 (0.0.0.0)
  4. network quality: 弱网 / 普通 / 良好

Use guided script (interactive Q&A):

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd_guided.sh

Preview config without starting:

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd_guided.sh --dry-run

1.1) Direct start (manual env)

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh

Important env vars:

  • AUTO_LAUNCH_URL (optional): open target page automatically
  • KASM_BIND (default 127.0.0.1, safer)
  • AUTO_STOP_IDLE_SECS (default 900)
  • BROWSER_MOBILE_MODE=1 (launch browser with mobile emulation)
  • BROWSER_DEVICE=iphone14pro|pixel7|ipad

Example:

AUTO_LAUNCH_URL="https://example.com/login" \
AUTO_STOP_IDLE_SECS=1200 \
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh

Mobile-friendly VNC stream example (better phone takeover UX):

MOBILE_MODE=1 MOBILE_PRESET=phone \
AUTO_STOP_IDLE_SECS=900 \
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh

Mobile stream options:

  • MOBILE_MODE=1 enables mobile defaults
  • MOBILE_PRESET=phone|tablet sets default resolution (960x540 / 1280x720)
  • KASM_MAX_FPS can be lowered further (e.g. 18) on weak networks

Browser mobile emulation (website renders as mobile page):

AUTO_LAUNCH_URL="https://example.com" \
BROWSER_MOBILE_MODE=1 BROWSER_DEVICE=iphone14pro \
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/start_vrd.sh

Notes:

  • Browser mobile emulation changes UA + viewport + touch behavior.
  • It is different from MOBILE_MODE (which optimizes VNC stream size).

2) Check status / health

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/status_vrd.sh
bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/health_vrd.sh

3) Stop desktop

bash /home/ubuntu/.openclaw/workspace/skills/virtual-remote-desktop/scripts/stop_vrd.sh

AI automation actions (computer-use style)

All actions run on the active virtual display from pids.env.

# screenshot (base64)
bash scripts/action_screenshot.sh

# click / type / key / scroll
bash scripts/action_click.sh 500 420 left
bash scripts/action_type.sh "hello"
bash scripts/action_key.sh "ctrl+l"
bash scripts/action_scroll.sh down 4

# helpers
bash scripts/action_mouse_move.sh 800 300
bash scripts/action_cursor_position.sh
bash scripts/action_wait.sh 2

Recommended loop:

  1. action_screenshot.sh
  2. analyze
  3. action_click/type/key/...
  4. screenshot verify
  5. repeat

Best handoff pattern (AI ↔ User)

When captcha/risk-control appears:

  1. AI sends user the KasmVNC URL + username/password
  2. User manually solves challenge
  3. User replies “done”
  4. AI runs screenshot + validation step
  5. AI resumes automation

This avoids full manual operation while keeping recovery fast.


UX presets

Preset A: safe local tunnel (recommended)

  • KASM_BIND=127.0.0.1
  • user connects via SSH tunnel
  • best for security

Preset B: temporary public takeover

  • KASM_BIND=0.0.0.0
  • short AUTO_STOP_IDLE_SECS (e.g. 300)
  • use only for urgent remote intervention

Notes

  • KasmVNC uses HTTPS + per-user auth (username/password).
  • This skill stores runtime files in ~/.openclaw/vrd-data by default.
  • Browser profile persistence is controlled by CHROME_PROFILE_DIR.
  • If captcha frequency is high, keep one long-lived profile to reduce repeated challenges.

Reviews (0)

Sign in to write a review.

No reviews yet. Be the first to review!

Comments (0)

Sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Compatible Platforms

Pricing

Free

Related Configs