Help us improve
Share bugs, ideas, or general feedback.
From automate
Executes Appium mobile test suites on Kobiton device cloud. Uploads apps, selects devices, parses capabilities, runs scripts, and surfaces session artifacts.
npx claudepluginhub kobiton/automate --plugin automateHow this skill is triggered — by the user, by Claude, or both
Slash command
/automate:run-automation-suiteThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Execute Appium-based mobile test automation suites on Kobiton's device cloud. Given a local Appium test script, this skill identifies the target app, selects an available device, parses and reconciles capabilities, runs the script in the background, and surfaces the resulting session URL plus artifacts (video, logs, screenshots, reports).
Performs interactive testing on Kobiton devices using natural language. Translates user intents into CLI commands for WebDriver actions, device operations, file management, and test execution.
Automates Android/iOS devices via MobAI HTTP API: screenshots, taps, typing, swipes, app launches, UI tree access using native-runner and web-runner sub-agents.
Guides creation of AppiumLibrary Robot Framework tests for iOS/Android native, hybrid apps, and mobile browsers. Useful for Appium mobile automation.
Share bugs, ideas, or general feedback.
Execute Appium-based mobile test automation suites on Kobiton's device cloud. Given a local Appium test script, this skill identifies the target app, selects an available device, parses and reconciles capabilities, runs the script in the background, and surfaces the resulting session URL plus artifacts (video, logs, screenshots, reports).
Use this skill when the user asks to run mobile tests, validate an APK or IPA across real devices, or trigger a Kobiton-hosted automation run from a local script directory.
Before invoking this skill, ensure:
api.kobiton.com/mcp; check .mcp.json for the configured endpoint)..js, .ts, .py, .java, .kt, .cs, .rb) referencing desired capabilities for the target platform..apk / .ipa / .zip build artifact for upload, or a kobiton-store:vXXXXX reference for an existing upload.This skill calls tools served by the Kobiton MCP server at api.kobiton.com/mcp. Two authentication configurations ship with the plugin; one of them must be active before any MCP tool will respond:
| Config file | Auth mechanism | When to use |
|---|---|---|
.mcp.json (default) | OAuth 2.1 browser flow | Interactive AI-CLI session for an end user |
.mcp.apikey-example.json | Basic auth header — Authorization: Basic base64(username:apikey) from the KOBITON_AUTH env var | CI / headless / scripted invocations; copy this file over .mcp.json |
If the skill is invoked and no MCP connection is established, abort step 1 and surface a clear error: the user needs to authenticate the Kobiton MCP server in their AI CLI before any device or session call can succeed. The exact invocation depends on the host (e.g. /mcp in Claude Code, /mcp auth kobiton in GitHub Copilot CLI and Gemini CLI, automatic browser flow on first tool call in Codex CLI, /mcp list then Login in Cursor CLI — see the plugin README for current per-CLI commands). Do NOT attempt to recover by retrying — the auth context is fixed at session start.
IMPORTANT: Always ask the user this question, even if they already provided an app file path. Do NOT skip ahead or start uploading automatically.
Ask the user:
"Would you like to:"
- Upload a new app build
- Use an existing app from Kobiton Store or a provided URL
Wait for their response before proceeding. Do not call any upload or app-related tools until the user responds.
If uploading a new app: Look for .apk, .ipa, .zip files in the project context, or ask the user for the file path. Upload via uploadAppToStore (permanent, visible in app repository). This is a four-step process: call the tool to get a pre-signed URL, upload the file via PUT, confirm the upload via confirmAppUpload, then poll getAppParsingStatus with the returned versionId until the state is terminal - OK means ready to use; a FAILURE_* state means parsing failed. confirmAppUpload may return appId: null for a brand-new upload - getAppParsingStatus resolves the real appId.
If reusing an existing app: Check appium:app field of capabilities in the test script. Call listApps with that app version as keywork to check uploaded or not. Let the user pick the version to use (e.g., kobiton-store:v72107) if needed
Ask the user which device or platform to target.
Call listDevices with the relevant platform filter to show available options.
If the user has a specific device in mind, confirm its availability with getDeviceStatus.
Reserve the device with reserveDevice if needed.
Ask the user for the path to their local Appium test script.
Detect the language and runtime from the file extension. See references/capabilities.md for the full extension -> runtime -> common commands lookup and the manifest-based runner selection guidance.
Read the script file and extract the key capability fields used by Appium and Kobiton (platformName, udid, app, automation/browser names, vendor kobiton:* extensions, etc.).
Identify how the UDID is passed into the script (CLI argument, environment variable, or hardcoded) so it can be overridden with the selected device.
Appium runtime: Check if the script contains 'kobiton:runtime': 'appium' or equivalent. If it does NOT, do not inject it - the default Kobiton runtime will be used. Only if the user explicitly asks to use the Appium runtime should you suggest adding 'kobiton:runtime': 'appium' to the script's capabilities.
Validate capabilities: After parsing the script, run the render script to generate the correct capabilities for the selected device and app:
node skills/run-automation-suite/scripts/render-capabilities.js \
--platformName <platform> \
--udid <udid> \
--deviceName "<deviceName>" \
--platformVersion <version> \
--automationName <automationName> \
--app <app> \
--testingType app
For web testing, replace --app <app> with --browserName <browser> --testingType web.
Compare the JSON output against the parsed script capabilities using the reconciliation rules: must-match fields are autocorrected to the rendered values, suggested defaults require user confirmation before changing, and user-controlled capabilities are left untouched.
The rendered output also includes kobiton:aiToolName: "<host>" so Kobiton can attribute sessions started by this skill to the calling AI workspace in adoption analytics. Resolution order:
--aiToolName <name> CLI flag (always wins; "" opts out entirely)KOBITON_AI_TOOL_NAME env var (also accepts "" to opt out)CLAUDECODE -> ClaudeCOPILOT_CLI -> CopilotGEMINI_CLI -> GeminiCODEX_THREAD_ID (or CODEX_CLI) -> Codex — Codex CLI sets the thread ID, not a generic CODEX_CLI flag; the latter is accepted for manual override onlykobiton:aiToolName capability is emitted.This capability is treated as must-match during reconciliation (see references/capabilities.md): if the rendered output includes kobiton:aiToolName, always overwrite any existing value in the user's script with it. A stale value from a prior session run under a different CLI would mis-attribute adoption analytics. If the rendered output omits the capability (no runtime marker matched), leave the user's value untouched.
The injection is non-interactive. Edit the script silently using your Edit tool — mention the one-line change inline in your reply for transparency (e.g., "Added kobiton:aiToolName: 'Gemini' to your capabilities for adoption analytics."), but do NOT ask the user to confirm before editing. The value is deterministic (it matches the runtime env from auto-detect), there is nothing to negotiate. If the user objects, they can revert the edit themselves.
Required: verify the injection landed before Step 4. The kobiton:aiToolName capability must be present in the script's source code (e.g., the capabilities object/dict/map). If your reconciliation pass didn't write it to the script, Kobiton will never see it — there is no sidecar config that injects it at runtime. Confirm with a literal-string grep against the user's script:
grep -F 'kobiton:aiToolName' <path-to-user-script>
Edit tool to add the capability to the script's capabilities block now (use the language-appropriate syntax for the script — JS/TS object literal, Python dict, Java Map.of(...) / DesiredCapabilities, .NET AppiumOptions.AddAdditionalCapability(...), Ruby hash), then re-run the grep. Do NOT proceed to Step 4 until the grep succeeds.--aiToolName "" or KOBITON_AI_TOOL_NAME=""). Skip injection, proceed.Present a summary to the user before running:
Language: Node.js
Script: /path/to/test.js
Platform: Android
Device: Pixel 4 (9B211FFAZ0017F)
App: kobiton-store:v72107
Session Name: Verify Appium session
Command: node /path/to/test.js 9B211FFAZ0017F
Wait for user confirmation, then execute the command in the background using your shell execution tool.
Ask the user:
"Would you like me to open the running session in the browser?"
Wait for their response. If they decline, skip to Step 6.
If they agree, wait 2 seconds after the script was launched in Step 4 (to allow the session to initialize on Kobiton), then open the session in the user's browser.
Determine the portal URL: Read .mcp.json to get the MCP server URL, then derive the portal base URL by replacing the api host with the portal equivalent (drop any trailing /mcp):
| MCP Server | Portal Base URL |
|---|---|
https://api.kobiton.com/mcp | https://portal.kobiton.com |
https://api-*.kobiton.com/mcp | https://portal-*.kobiton.com (same * suffix) |
For example, an api-*.kobiton.com host maps to its matching portal-*.kobiton.com host. If the mapping doesn't resolve, fall back to https://portal.kobiton.com.
Build the launch URL. Default to the device-only view — it shows just the device screen, no surrounding Kobiton UI, ideal for watching an automation run, sharing, or embedding:
<portal-base-url>/devices/launch?id=<deviceId>&view=device-only
Where <deviceId> is the ID of the selected device from Step 2 (returned by listDevices, getDeviceStatus, or reserveDevice).
The device-only view is also interactive for redirection: when the user taps or swipes on the device canvas, those gestures BOTH (a) reach the device in real time (just like a normal click-to-device tap) AND (b) are captured and made available to you via getUserInputEvents (see Step 6) so you can see what the user just did and adapt your plan.
Fall back to the default view (without &view=device-only) only when the user explicitly asks to interact with the device — e.g. "let me drive it manually", "open the full session view", "I want to tap on the screen", or similar interaction-implying language. The default view shows the full Kobiton UI around the device (sidebars, controls, action panels):
<portal-base-url>/devices/launch?id=<deviceId>
Launch via the chromeless helper (default — gated on saved browser preference). For the device-only URL branch above, check auto memory for a saved browser preference:
When the chromeless launcher is invoked, it opens Chrome in --app mode (no tab strip, no URL bar, no bookmarks bar) and resizes the window to a device-shaped frame. Pick width and height based on the reserved device's form factor (the device name from Step 2 — listDevices / getDeviceStatus / reserveDevice):
| Device class | Detect by name (case-insensitive) | Portrait width × height | Landscape width × height |
|---|---|---|---|
| Tablet | iPad, Galaxy Tab, Pixel Tablet, Surface, MatePad, or any model with "Tab" or "Pad" in the name | 780 × 920 | 920 × 780 |
| Fold (unfolded) | Fold, Z Fold, Pixel Fold | 880 × 920 | 920 × 880 |
| Phone (default — Galaxy, Pixel, iPhone, OnePlus, Xiaomi, …) | none of the above patterns match | 540 × 920 | 920 × 540 |
All three presets share the same 920 px height — only the width grows by device class — so the chromeless window's vertical footprint stays consistent.
If getSession / the rendered capabilities report orientation=LANDSCAPE, swap width and height (use the Landscape column above). Default is portrait.
Pick the right command for the host OS:
| OS | Command |
|---|---|
| macOS | bash skills/run-automation-suite/scripts/chromeless-launcher.sh --url "<url>" --width <W> --height <H> |
| Windows | pwsh skills/run-automation-suite/scripts/chromeless-launcher-windows.ps1 -Url "<url>" -Width <W> -Height <H> |
| Linux | bash skills/run-automation-suite/scripts/chromeless-launcher.sh --url "<url>" --width <W> --height <H> (launch-only — no auto-resize on Linux) |
<W> and <H> are the dimensions from the table above — substitute literally.
On macOS, the very first run triggers a system prompt: "X wants to control Google Chrome.app" — click OK. The grant lives under System Settings → Privacy & Security → Automation (NOT Accessibility) and persists per host process. Tell the user this once if you can see it's their first invocation.
Launcher exit codes drive the fallback:
0 — Chrome was launched (resize may have logged a warning, but the session is fine). Continue to Step 6.2 — Chrome / Chromium is not installed on the host. Fall through to the default-browser fallback table below.64 — usage error (missing --url or unknown flag) — surface to the user; the launcher is buggy.Manual-interaction fallback URL. When the URL branch above was the manual-interaction form (the user explicitly asked to drive the device, so the URL has no ?view=device-only), skip the chromeless helper entirely and go straight to the default-browser fallback below — the user wants the full Kobiton UI around the device, so a chromeless --app window would defeat the point.
Default-browser fallback (used when (a) the saved browser preference is Safari/Firefox/Default, OR (b) chromeless exited 2 because Chrome is absent, OR (c) the URL branch is the manual-interaction form):
Check auto memory for a saved browser preference. If none exists, ask the user which browser to use:
"Which browser should I open the session in?"
- Google Chrome
- Safari
- Firefox
- Default browser
Save their choice to auto memory so they are not asked again in future sessions.
| Choice | Command |
|---|---|
| Google Chrome | open -na "Google Chrome" --args --new-window <url> |
| Safari | open -a "Safari" <url> |
| Firefox | open -a "Firefox" <url> |
| Default browser | open <url> |
On Linux, use xdg-open <url> (browser selection is not supported — always opens the default).
While the background script is running, call listSessions with deviceId=<deviceId> (from Step 2) and state='START' to find the session that just triggered. Use the most recent session (first result) as the match.
Call getSession with the matched session ID to get detailed results.
Call getSessionArtifacts with the session ID to retrieve:
Watch for user redirection (when the device-only view is open). While the background script runs, poll getUserInputEvents with the matched sessionId between your scripted commands — no faster than once per second. Track the timestamp of the newest event you have seen and pass it as sinceTimestamp on the next call so you only get new gestures:
sinceTimestamp to drain everything buffered. Each response is capped (default 50 events, max 200) — if you see exactly that many events in a single response there may be more behind them; re-poll immediately with sinceTimestamp set to the newest event's timestamp to page forward.(0.42, 0.78) near the bottom-right means "the user tapped there; they may want me to focus on whatever is at that position (looks like Settings)." A swipe (xNorm/yNorm → xNorm2/yNorm2) means they want you to scroll or navigate in that direction.sinceTimestamp to the largest timestamp you received, so the next poll returns only newer gestures.Present a summary to the user:
On successful completion, the skill returns:
https://portal.kobiton.com/devices/launch?id=<deviceId>, opened automatically in the user's default browser as the script starts.getSession).getSessionArtifacts).On failure, the skill surfaces error output from the test runner, the session URL if the session reached Kobiton (useful for portal-side debugging), and suggested next steps drawn from the categories in ## Error Handling.
listDevices returns empty: suggest broadening filters (remove platform/group constraints) or trying again later when devices free up.getAppParsingStatus returns a FAILURE_* state): surface the state to the user and stop - do not reserve devices or start a session with that build.getSession with a reasonable timeout. If still running, offer to call terminateSession and retry.reserveDevice fails (device already taken): call listDevices again to find another available device.wd, appium), incorrect UDID, or network issues. Suggest fixes."Run
./tests/checkout.json a Pixel 7 - upload the latest APK from./build/app.apkfirst."
The skill detects the .apk build, uploads it via uploadAppToStore, queries listDevices filtered to Pixel 7, reserves the device with reserveDevice, parses the script's capabilities, confirms the launch summary with the user, runs node ./tests/checkout.js <udid> in the background, opens the live session URL in the user's browser, and returns the session ID plus artifacts when the run completes.
"Test this app @TestApp.ipa by this script @automation.js on Kobiton iOS iPhone 15 Pro"
The skill resolves the two @-referenced files from the chat context, uploads TestApp.ipa via uploadAppToStore, queries listDevices filtered to iOS iPhone 15 Pro, reserves the matching device, parses automation.js for capabilities, reconciles them against the rendered defaults for the selected device, confirms the launch summary with the user, runs node automation.js <udid> in the background, opens the live session URL in the user's browser, and surfaces the session ID plus artifacts when the run completes.
kobiton:* and supported appium:* capabilities the skill's render-capabilities step compares against.kobiton/automate plugin source - issue tracker, contribution guide, and the tool YAML schemas this skill orchestrates.