From kwin-mcp
Automates Linux KDE Plasma Wayland desktops via kwin-mcp: launch apps, click, type, screenshot. For GUI automation, E2E testing, kiosk control, live sessions.
npx claudepluginhub isac322/kwin-mcp --plugin kwin-mcp
Slash command: /kwin-mcp:kwin-desktop-automation
Drive Linux KDE Plasma 6 Wayland desktops through the `kwin-mcp` MCP server. The MCP server provides 30 capabilities; this skill provides the operational discipline to use them efficiently, in the right order, and without falling into platform-specific traps.
Activate this skill whenever the user wants to:
If no kwin-mcp tools are available, this skill does not apply.
Every other tool requires a session. Two mutually exclusive modes:
Virtual — session_start
Opens an isolated dbus-run-session + kwin_wayland --virtual compositor. Nothing reaches the host display. Use when the user says "test", "headless", "CI", "isolated", or names a specific app to launch fresh.
Useful arguments:
- app_command="...": launch the target app inside the session.
- enable_clipboard=true: required for clipboard_get / clipboard_set and the Unicode-via-clipboard fallback. Off by default because wl-copy can hang on a freshly minted bus.
- keep_screenshots=true: preserves PNGs in /tmp/kwin-mcp-screenshots-* after session_stop (delete the directory yourself when done).
- isolate_home=true: temp HOME with isolated XDG dirs; keeps host configuration untouched.

Live — session_connect
Attaches to an already-running KWin: the user's real desktop, or a KWin running inside a container / kiosk / embedded device. Use when the user says "my", "current", "this window", "what I'm looking at", "container", "kiosk", or "live". Defaults to $DBUS_SESSION_BUS_ADDRESS and $WAYLAND_DISPLAY; clipboard is always enabled. session_stop only disconnects — it never kills the live KWin or its apps.
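Because session_connect relies on $DBUS_SESSION_BUS_ADDRESS and $WAYLAND_DISPLAY, a quick environment check before attaching can explain a "no Wayland display" or "no session bus" failure in advance. The following is a minimal sketch, not part of kwin-mcp: the helper name `live_session_problems` is hypothetical, and it only inspects environment variables and the socket path.

```python
import os
from pathlib import Path


def live_session_problems(env=None):
    """List likely reasons session_connect would fail in this environment."""
    env = os.environ if env is None else env
    problems = []

    if not env.get("DBUS_SESSION_BUS_ADDRESS"):
        problems.append("no session bus: DBUS_SESSION_BUS_ADDRESS is unset")

    display = env.get("WAYLAND_DISPLAY")
    if not display:
        problems.append("no Wayland display: WAYLAND_DISPLAY is unset")
    else:
        # A relative WAYLAND_DISPLAY names a socket under XDG_RUNTIME_DIR;
        # pathlib keeps an absolute value unchanged when joining.
        socket_path = Path(env.get("XDG_RUNTIME_DIR") or "") / display
        if not socket_path.exists():
            problems.append(f"no Wayland display: {socket_path} not found")

    return problems
```

An empty list means the basic prerequisites look present; it does not guarantee that the attach will succeed.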
If the kwin-mcp server was launched with --default-live-session, the descriptions of session_start and session_connect swap roles; in that mode session_connect is the default.
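The keyword cues for the two modes can be sketched as a small chooser. This is an illustration only: `choose_session_tool` is a hypothetical helper, the whole-word matching and the fall-back to a default when cues are absent or conflicting are assumptions, and it does not try to detect "names a specific app to launch fresh".

```python
import re

# Cue words taken from the two mode descriptions above (lower-cased).
VIRTUAL_HINTS = ("test", "headless", "ci", "isolated")
LIVE_HINTS = ("my", "current", "this window", "what i'm looking at",
              "container", "kiosk", "live")


def _mentions(text, hints):
    # Whole-word match so "my" does not fire on "mystery".
    return any(re.search(rf"\b{re.escape(h)}\b", text) for h in hints)


def choose_session_tool(request, default="session_start"):
    """Return the session tool suggested by the wording of the request."""
    text = request.lower()
    wants_virtual = _mentions(text, VIRTUAL_HINTS)
    wants_live = _mentions(text, LIVE_HINTS)
    if wants_virtual and not wants_live:
        return "session_start"
    if wants_live and not wants_virtual:
        return "session_connect"
    # No cue, or conflicting cues: use the configured default.
    return default
```

When the server runs with --default-live-session, passing default="session_connect" mirrors the swapped roles.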
End every turn that opened a session with session_stop, even if a step errored. Virtual sessions otherwise leak kwin processes; live sessions just disconnect.
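One way to make the closing session_stop hard to forget is to wrap the session in a context manager. This is a sketch under assumptions: `call_tool(name, **arguments)` is a hypothetical stand-in for however the MCP client invokes a kwin-mcp tool, and `kwin_session` is an illustrative name.

```python
from contextlib import contextmanager


@contextmanager
def kwin_session(call_tool, live=False, **session_args):
    """Open a session, then always stop it, even if the body raises."""
    call_tool("session_connect" if live else "session_start", **session_args)
    try:
        yield
    finally:
        # Runs on success and on error, so virtual sessions do not leak kwin.
        call_tool("session_stop")
```

If opening the session itself fails, nothing was started, so session_stop is not called.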
Each interaction is three steps: observe, act, verify. Cheap observation before action prevents acting on an unfocused window or stale UI.
Observation tools, cheapest first:
- list_windows: window titles + active/focused markers. Free.
- accessibility_tree: full AT-SPI2 widget tree. Always pass app_name= and/or role= (e.g. "button", "check box") and/or max_depth= to keep it small. Don't fetch the whole tree just to find one button.
- find_ui_elements: query by name/role/states. Use this when you know what you are looking for. query="" + states=["focused"] answers "what currently has focus?".
- wait_for_element: same matching as find_ui_elements but polls until the element appears (or timeout_ms elapses). Use after launching an app or after any click that triggers async UI.
- screenshot: last resort for visual inspection or when AT-SPI2 fails to expose an element (see Pitfalls).

Pick the cheapest tool that answers the question. Do not start with screenshot if find_ui_elements("Save") would suffice.
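The rule about never fetching an unbounded accessibility tree can be enforced where the arguments are built. A minimal sketch, assuming the argument names app_name, role and max_depth mentioned above; the helper `accessibility_tree_args` itself is hypothetical.

```python
def accessibility_tree_args(app_name=None, role=None, max_depth=None):
    """Build accessibility_tree arguments, refusing an unfiltered fetch."""
    args = {"app_name": app_name, "role": role, "max_depth": max_depth}
    # Drop filters that were not supplied.
    args = {key: value for key, value in args.items() if value is not None}
    if not args:
        raise ValueError("pass app_name, role, or max_depth to bound the tree")
    return args
```

For example, accessibility_tree_args(app_name="kate", role="button") yields a bounded query, while calling it with no filters raises before any tool call is made.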
Action tools:
- find_ui_elements / accessibility_tree directly with mouse_click.
- keyboard_type is ASCII / US-QWERTY only. It maps characters to evdev keycodes; non-ASCII silently breaks.
- keyboard_type_unicode for Korean / CJK / emoji / any non-ASCII. Internally uses wtype first, falls back to wl-copy + Ctrl+V. Requires wtype or wl-clipboard installed.

Branch typing by string content; never assume the input is ASCII.
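Branching by string content can be as small as one check. In this sketch the helper name `typing_tool_for` is hypothetical; the two tool names come from the text above.

```python
def typing_tool_for(text):
    """Pick the typing tool based on whether the text is pure ASCII."""
    # str.isascii() is True only when every character is in the ASCII range.
    return "keyboard_type" if text.isascii() else "keyboard_type_unicode"
```

A single accented letter or emoji anywhere in the string is enough to route the whole string through keyboard_type_unicode.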
Verify after every meaningful action. Typical pattern:
1. find_ui_elements(query="OK", states=["enabled"]): locate.
2. mouse_click(x, y): act.
3. wait_for_element(query="Settings saved", timeout_ms=3000): confirm.

For animation-heavy or transient UI, pass screenshot_after_ms=[0, 100, 300] to a single action call instead of three round-trips; kwin-mcp captures frames server-side via the fast ScreenShot2 D-Bus interface (~30–70 ms per frame).
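The locate, act, confirm pattern above could be wrapped as a single helper. This is a sketch with loud assumptions: `call_tool(name, **arguments)` is a hypothetical client function, and the shape of the find_ui_elements result (a list of dicts carrying "x" and "y" click coordinates) is assumed for illustration, not taken from the kwin-mcp documentation.

```python
def click_and_confirm(call_tool, target, confirmation, timeout_ms=3000):
    """Locate an enabled element, click it, then wait for a confirmation."""
    # Observe: cheap query instead of a screenshot.
    matches = call_tool("find_ui_elements", query=target, states=["enabled"])
    if not matches:
        raise LookupError(f"no enabled element matches {target!r}")

    # Act: click the first match (assumed to expose "x" and "y").
    element = matches[0]
    call_tool("mouse_click", x=element["x"], y=element["y"])

    # Verify: poll until the expected element appears or the timeout elapses.
    return call_tool("wait_for_element", query=confirmation, timeout_ms=timeout_ms)
```

Raising when nothing matches keeps the helper from clicking stale coordinates; the caller can then fall back to a screenshot.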
These are properties of the Wayland / AT-SPI2 / EIS stack, not bugs. Know them or get burned.
- keyboard_type is US QWERTY only. Non-ASCII text must go through keyboard_type_unicode. Always check the input.
- enable_clipboard=true to session_start AND ensure wl-clipboard is installed. Live sessions always have clipboard.
- screenshot to disambiguate.
- screenshot, locate the menu item visually, click by coordinates.
- dbus_call to invoke KWin scripting or a keyboard shortcut instead of trying to hover the edge.
- session_connect fails with "no Wayland display" or "no session bus". The user must mount $XDG_RUNTIME_DIR/wayland-* and propagate DBUS_SESSION_BUS_ADDRESS into the container.
- session_stop once the task is complete, even if a step errored.
- If keep_screenshots=true was used, /tmp/kwin-mcp-screenshots-* survives session_stop. Delete it explicitly when no longer needed.
- If isolate_home=true + keep_home=true were both used, the temp HOME under /tmp/ also survives; delete it manually.

"Screenshot my desktop" (live):
1. session_connect()
2. screenshot() → report the file path.
3. session_stop().

"Click the Save button in kate" (virtual):
1. session_start(app_command="kate")
2. wait_for_element(query="Save", app_name="kate", timeout_ms=5000)
3. mouse_click(x, y) using coords from step 2.
4. wait_for_element(query="Save File", timeout_ms=3000) to confirm the dialog appeared.
5. session_stop().

"Type 안녕하세요 into the active text field":
keyboard_type_unicode(text="안녕하세요"). Never use keyboard_type; it would silently drop the characters.

"Find what currently has focus":
find_ui_elements(query="", states=["focused"]) — empty query is allowed when filtering by state.
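For the cleanup noted earlier (keep_screenshots=true leaves /tmp/kwin-mcp-screenshots-* behind after session_stop), a small script can remove the leftovers. This is a sketch only: `remove_kept_screenshots` is a hypothetical helper, and it assumes the kept PNGs are no longer needed.

```python
import glob
import os
import shutil


def remove_kept_screenshots(tmp_root="/tmp"):
    """Delete leftover kwin-mcp screenshot directories under tmp_root."""
    removed = []
    pattern = os.path.join(tmp_root, "kwin-mcp-screenshots-*")
    for path in glob.glob(pattern):
        # Only remove real directories; leave symlinks and files alone.
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
            removed.append(path)
    return sorted(removed)
```

The function returns the paths it removed, so the caller can report what was cleaned up.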