---
name: remote-claws
description: "Full remote desktop control of a machine via Remote Claws MCP. Use when asked to: take a screenshot of the remote desktop; click, type, or drag with the mouse/keyboard on the remote machine; run commands or scripts; automate a Chromium browser on the remote machine; read or write files on the remote machine."
homepage: https://github.com/wentbackward/remote-claws
---

# Remote Claws: Remote Desktop Control

Controls a remote machine over MCP/SSE. All 39 tools are provided by the remote-claws MCP server registered in openclaw.json.

## When to Use This Skill

Use Remote Claws tools whenever you need to interact with the remote desktop machine: taking screenshots, clicking buttons, typing text, running commands, automating a browser, or transferring files. If the user asks you to do something "on the remote machine" or "on Windows," these are your tools.

## Strategy

1. **Screenshot first.** Before clicking or typing, take a `desktop_screenshot` to see what's on screen. Use the coordinates from the screenshot to target actions.
2. **Prefer browser tools for web tasks.** `browser_*` tools use CSS selectors and are resolution-independent. Only use `desktop_*` tools for web tasks if the browser tools can't reach something (e.g. browser dialogs, file pickers).
3. **Prefer element names over coordinates.** `desktop_click_element` and `desktop_get_element_text` target UI controls by name, which is more reliable than coordinate clicking (coordinates break when windows move).
4. **Exec is async.** `exec_run` starts a command and returns immediately. Use `exec_get_output` with `wait=true` if you need to block until it finishes.
5. **Re-screenshot after actions.** Windows may move, dialogs may appear. Take a fresh screenshot to verify the result before proceeding.
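The async exec pattern from the Strategy list can be sketched as follows. This is a minimal illustration, not the server's documented client API: `call_tool` is a hypothetical callable standing in for whatever MCP client sends a tool name and arguments to the remote-claws server, and the `command` argument name and `process_id` result key are assumed payload shapes.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that sends a tool name plus arguments
# to the remote-claws MCP server and returns the decoded result as a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_and_wait(call_tool: ToolCaller, command: str) -> Dict[str, Any]:
    """Start a command with exec_run, then block on exec_get_output."""
    # exec_run returns immediately. The "command" argument name and the
    # "process_id" result key are assumptions about the payload shape.
    started = call_tool("exec_run", {"command": command})
    process_id = started["process_id"]
    # wait=true makes exec_get_output block until the process finishes.
    return call_tool(
        "exec_get_output", {"process_id": process_id, "wait": True}
    )
```

Keeping the transport as a parameter means the same helper works with any MCP client wrapper and can be exercised against a stub.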
## Tool Groups

### Desktop (mouse, keyboard, screenshots)

- `desktop_screenshot`: capture full screen or region [x, y, width, height]
- `desktop_mouse_click`: left/right/middle click at x, y
- `desktop_mouse_move`: move cursor to x, y
- `desktop_mouse_drag`: drag from start to end coordinates
- `desktop_type_text`: type ASCII text at current focus (ASCII only)
- `desktop_press_key`: press key or combo: "enter", "ctrl+c", "alt+f4"
- `desktop_scroll`: scroll at x,y; direction "up" or "down"
- `desktop_find_window`: find windows by title or class_name substring
- `desktop_focus_window`: bring window to foreground by title
- `desktop_list_elements`: list UI controls (buttons, fields) inside a window
- `desktop_click_element`: click a named UI element (more reliable than coords)
- `desktop_get_element_text`: read the value of a named UI element

### Browser (Chromium via Playwright, CSS selectors)

- `browser_navigate`: go to a URL
- `browser_click`: click element by CSS selector
- `browser_fill`: set input value (handles Unicode, triggers change events)
- `browser_type`: type keystroke-by-keystroke (appends, does not clear)
- `browser_press_key`: key press e.g. "Enter", "Control+a"
- `browser_get_text`: extract visible text from element (default: body)
- `browser_get_html`: get HTML markup of element
- `browser_eval_js`: run JavaScript in page context
- `browser_screenshot`: screenshot page or element
- `browser_wait_for`: wait for element state: visible/hidden/attached/detached
- `browser_select_option`: select a dropdown option by value or label
- `browser_go_back` / `browser_go_forward`
- `browser_tabs_list` / `browser_tab_new` / `browser_tab_close`

### Exec (run commands, async)

- `exec_run`: start command; returns process_id immediately
- `exec_get_output`: read stdout/stderr; set wait=true to block
- `exec_send_input`: send a line to stdin of a running process
- `exec_kill`: terminate a process
- `exec_list`: list all tracked processes

### Files (base64 encoded)

- `file_write`: write base64 content to a path
- `file_read`: read file as base64 (use offset/limit for large files)
- `file_list`: list directory; supports glob patterns, recursive
- `file_delete`: delete file or empty directory
- `file_move`: move or rename file/directory
- `file_info`: get size, created, modified timestamps

## Authentication & Security

The remote-claws MCP server requires a bearer token, configured in `openclaw.json` when registering the server. The server will reject unauthenticated connections with 401.

The server also supports IP allowlisting (`allowed_ips`), host header validation (`allowed_hosts`), and per-tool permission policies (`permissions.json`) to restrict which tools are available. See the [setup guide](https://github.com/wentbackward/remote-claws/blob/master/remote-claws-openclaw-setup-guide.md) and [README](https://github.com/wentbackward/remote-claws#security) for configuration details.

## Important Notes

- Screenshots are JPEG, max 1280x960. Coordinates are absolute pixels.
- `desktop_type_text` is ASCII only. For Unicode, use `browser_fill` or clipboard: `exec_run powershell Set-Clipboard`, then `desktop_press_key ctrl+v`.
- File content is base64 encoded. Decode after reading.
- The browser launches on first use and stays open across calls. Sessions persist (cookies, local storage).
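The ASCII-only limit and the clipboard workaround can be turned into a small planning helper. The sketch below only builds the sequence of tool calls; it does not send them. The argument names (`text`, `command`, `key`) are assumptions for illustration, since the notes above name the tools but not their parameter schema.

```python
from typing import Any, Dict, List, Tuple

# A planned tool call: (tool name, arguments). Argument names are assumed.
Step = Tuple[str, Dict[str, Any]]


def set_clipboard_command(text: str) -> str:
    """Build the PowerShell command used by the clipboard workaround."""
    # Inside a PowerShell single-quoted string, a literal single quote
    # is written by doubling it.
    escaped = text.replace("'", "''")
    return f"powershell Set-Clipboard -Value '{escaped}'"


def plan_text_entry(text: str) -> List[Step]:
    """Choose desktop_type_text for ASCII, clipboard paste otherwise."""
    if text.isascii():
        return [("desktop_type_text", {"text": text})]
    return [
        ("exec_run", {"command": set_clipboard_command(text)}),
        ("desktop_press_key", {"key": "ctrl+v"}),
    ]
```

Because `exec_run` is async, a caller following this plan would likely want to confirm the clipboard command has finished (for example via `exec_get_output` with `wait=true`) before sending the paste key.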
This revision extracts the decision logic from Strategy into explicit if-else decision points, documents the external MCP server connection and auth requirements as inputs, formalizes the original procedure into 6 numbered steps with clear inputs and outputs, adds edge cases for Unicode, large files, timeouts, and permission errors, and spells out success criteria in the output contract and outcome signal sections.
Controls a remote machine over MCP/SSE. All 39 tools are provided by the remote-claws MCP server registered in openclaw.json.
Use Remote Claws when you need to interact with a desktop machine remotely: take screenshots, click buttons, type text, run commands, automate a browser, or transfer files. If the user asks you to do something "on the remote machine" or "on Windows," these are your tools. Deploy this skill when direct desktop control beats API calls or manual steps, especially for UI automation, visual verification, or complex multi-step workflows that require seeing the screen state between actions.
**MCP Server Registration**

- `openclaw.json` with bearer token authentication
- `REMOTE_CLAWS_TOKEN` (bearer token; without a valid token the server rejects the connection with 401)
- `REMOTE_CLAWS_HOST` (server endpoint, e.g. http://localhost:8000)

**Security Configuration (optional but recommended)**

- `allowed_ips`: IP allowlist, if configured on the server side
- `allowed_hosts`: host header validation list
- `permissions.json`: per-tool restrictions (some tools may be disabled by policy)

**Context from User**

**Network & Connection**
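As a rough illustration of the registration inputs, the sketch below reads `REMOTE_CLAWS_HOST` and `REMOTE_CLAWS_TOKEN` from the environment and fails early when either is missing. Sending the token as a standard `Authorization: Bearer` header is an assumption here; the source only says the server requires a bearer token, and the helper name `load_connection` is hypothetical.

```python
import os
from typing import Dict, Mapping, Tuple


def load_connection(
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, Dict[str, str]]:
    """Return (host, headers) for the remote-claws server, or raise."""
    host = env.get("REMOTE_CLAWS_HOST", "").rstrip("/")
    token = env.get("REMOTE_CLAWS_TOKEN", "")
    if not host:
        raise ValueError("REMOTE_CLAWS_HOST is not set")
    if not token:
        # Without a token the server would answer 401, so fail before connecting.
        raise ValueError("REMOTE_CLAWS_TOKEN is not set")
    # Assumed header form: the conventional bearer-token Authorization header.
    return host, {"Authorization": f"Bearer {token}"}
```

Checking both values up front turns a later 401 into an immediate, local error message.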
1. **Take a screenshot.** Call `desktop_screenshot` with no region params to capture the full screen. This is your ground truth. Examine the returned JPEG to identify window positions, buttons, text fields, and current state.
2. **Identify the target.** From the screenshot, locate the UI element or window where the action needs to happen. Note its approximate coordinates, visible label, or window title.
3. **Choose your tool category:**
   - Web tasks: `browser_*` tools with CSS selectors. They are resolution-independent and more reliable.
   - Native desktop UI: `desktop_*` tools. Prefer `desktop_click_element` with element names over `desktop_mouse_click` with coordinates.
   - Commands and scripts: `exec_*` tools.
   - File transfer: `file_*` tools.
4. **Execute the action.** Call the appropriate tool with coordinates, selectors, or element names from step 2. For text input, use `desktop_type_text` (ASCII only) or `browser_fill` (Unicode OK). For commands, use `exec_run` (async), then `exec_get_output` with `wait=true` if you need to block until the command finishes.
5. **Re-screenshot and verify.** After each action, call `desktop_screenshot` again. Compare the new screenshot to the previous one. Confirm the action took effect (button state changed, text appeared, dialog closed, process started). If the result is unexpected, troubleshoot and retry.
6. **Repeat.** Loop through steps 2-5 until the task is complete. Take a final screenshot as evidence.
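The screenshot, act, verify loop in the steps above can be sketched as a small driver. Everything about the transport is assumed: `call_tool` is a hypothetical callable that forwards a tool name and arguments to the server, and `verify` is a caller-supplied check that compares the screenshots taken before and after an action.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical transport and action shapes, used for illustration only.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]
Action = Tuple[str, Dict[str, Any]]
Verifier = Callable[[str, Dict[str, Any], Dict[str, Any]], bool]


def run_with_verification(
    call_tool: ToolCaller,
    actions: List[Action],
    verify: Verifier,
    max_retries: int = 1,
) -> List[Dict[str, Any]]:
    """Run each action, re-screenshot after it, and retry if verify fails."""
    # Step 1: the initial full-screen capture is the ground truth.
    screenshots = [call_tool("desktop_screenshot", {})]
    for name, arguments in actions:
        for _attempt in range(max_retries + 1):
            before = screenshots[-1]
            call_tool(name, arguments)  # step 4: execute the action
            after = call_tool("desktop_screenshot", {})  # step 5: re-screenshot
            screenshots.append(after)
            if verify(name, before, after):
                break
        else:
            # Every attempt was rejected by the verifier.
            raise RuntimeError(
                f"{name} did not take effect after {max_retries + 1} attempts"
            )
    return screenshots
```

The last entry of the returned list serves as the final screenshot that step 6 asks for.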
- If the user asks to interact with a web page and a browser is already open: use `browser_*` tools (CSS selectors, Unicode support, resolution-independent). Skip to step 4 with browser tools.
- If the user asks to interact with a web page but no browser is running: call `browser_navigate` to start one and load the URL, then use `browser_*` tools.
- If the task requires typing Unicode or non-ASCII characters: use `browser_fill` (if in a browser context) or the clipboard workaround: `exec_run "powershell Set-Clipboard -Value '<text>'"` followed by `desktop_press_key ctrl+v`. Do not use `desktop_type_text` for non-ASCII.
- If a command needs to run in the background (e.g. file downloads, long-running processes): use `exec_run` (returns immediately with process_id). Do not wait. Call `exec_get_output` with `wait=true` only if you need to block until completion.
- If `exec_get_output` with `wait=true` times out or hangs: the remote process may be stuck. Call `exec_kill` with the process_id to terminate it, then take a screenshot to assess the state.
- If a file is very large (> 10 MB): use `file_read` with offset and limit params to stream in chunks rather than loading the entire file into memory. Decode base64 per chunk on your end.
- If the server returns 401 Unauthorized: the bearer token is missing, expired, or invalid. Check the `REMOTE_CLAWS_TOKEN` env var and the `openclaw.json` configuration. Request a fresh token from the operator.
- If the server returns 403 Forbidden: `permissions.json` is restricting the tool because the operator has disabled it. Fall back to manual steps or request permission elevation.
- If a screenshot or action times out (network latency > 30 seconds): the remote machine or network is degraded. Wait a moment and retry, or tell the user that the remote session is unstable.
- If `desktop_click_element` fails because the element name is ambiguous or not found: call `desktop_list_elements` to enumerate the named elements in the target window, or fall back to `desktop_mouse_click` with coordinates from the screenshot.
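The large-file case above can be sketched as a chunked reader. This is an illustration under stated assumptions, not the server's documented contract: `call_tool` is a hypothetical transport callable, `offset` and `limit` are assumed to count raw file bytes, and the base64 payload is assumed to arrive under a `content` key.

```python
import base64
from typing import Any, Callable, Dict

# Hypothetical transport: sends a tool name plus arguments, returns a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def read_file_in_chunks(
    call_tool: ToolCaller, path: str, chunk_size: int = 1024 * 1024
) -> bytes:
    """Read a remote file via repeated file_read calls, decoding each chunk."""
    data = bytearray()
    offset = 0
    while True:
        result = call_tool(
            "file_read", {"path": path, "offset": offset, "limit": chunk_size}
        )
        # Assumed result shape: base64 text under the "content" key.
        chunk = base64.b64decode(result["content"])
        if not chunk:
            break  # nothing left to read
        data.extend(chunk)
        offset += len(chunk)
        if len(chunk) < chunk_size:
            break  # a short chunk marks the end of the file
    return bytes(data)
```

For files too large to hold in memory, the same loop could write each decoded chunk to local disk instead of accumulating it in a `bytearray`.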
For screenshots:
For text reads (browser_get_text, desktop_get_element_text):
For HTML reads (browser_get_html):
For command execution (exec_get_output):
wait=true, includes exit code (0 = success, non-zero = error).
For file reads (file_read):
For file writes (file_write):
For element lists (desktop_list_elements, browser_wait_for):
For window finds (desktop_find_window):