From mobai
Automates Android/iOS devices via MobAI HTTP API: screenshots, taps, typing, swipes, app launches, UI tree access using native-runner and web-runner sub-agents.
Slash command: /mobai:mobai
When automating mobile devices, ALWAYS use this order:
ALWAYS try the DSL sub-agents first. The raw HTTP API for tap/type/swipe/screenshot/ui-tree is a LAST RESORT.
Screenshots: When using the API, screenshots are automatically saved to /tmp/mobai/screenshots/ and the path is returned. Use the Read tool to view them.
This skill enables you to control Android and iOS devices through the MobAI HTTP API running locally.
For complex automation tasks, use a hierarchical approach with specialized sub-agents:
| Scenario | Approach |
|---|---|
| Simple query (list devices, take screenshot) | Direct API call |
| Native app automation (Settings, Instagram) | Spawn native-runner sub-agent |
| Browser chrome (URL bar, tabs, nav buttons) | Spawn native-runner sub-agent |
| Web page DOM content (CSS selectors, JS, DOM) | Spawn web-runner sub-agent (try native-runner first) |
| Complex multi-step task | Break into subgoals, spawn appropriate sub-agent for each |
native-runner (/native-runner)
Use for native mobile apps, that is, apps that use platform UI components.
Uses DSL batch execution with element predicates for robust automation.
How to spawn:
Use the native-runner skill to accomplish: [subgoal description]
Device ID: [deviceId]
web-runner (/web-runner)
Use web-runner when you need to interact with DOM content inside a web page or WebView.
IMPORTANT: iOS Simulators NOT supported - Web context requires a physical iOS device. Use native-runner for simulators.
Try native-runner first - it works for most web page interactions via accessibility tree.
DO NOT use for browser UI elements (address bar, tabs, back button) - those are native!
Uses DSL batch execution with CSS selectors and JavaScript.
How to spawn:
Use the web-runner skill to accomplish: [subgoal description]
Device ID: [deviceId]
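Both spawn templates share the same two-line shape, so they can be filled in programmatically. The following is a minimal sketch: `build_spawn_prompt` is a hypothetical helper, not part of MobAI, and `DEVICE-123` is a placeholder device ID.

```python
def build_spawn_prompt(runner: str, subgoal: str, device_id: str) -> str:
    """Fill in the two-line spawn template for a sub-agent."""
    if runner not in ("native-runner", "web-runner"):
        raise ValueError(f"unknown runner: {runner}")
    return (
        f"Use the {runner} skill to accomplish: {subgoal}\n"
        f"Device ID: {device_id}"
    )


# Placeholder subgoal and device ID, for illustration only.
prompt = build_spawn_prompt("native-runner", "Open Settings and enable Wi-Fi", "DEVICE-123")
print(prompt)
```

Rejecting unknown runner names early keeps a typo from producing a prompt that refers to a skill that does not exist.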
User request: "Log into Twitter, search for 'AI news', and screenshot the results"
Step 1: List devices to get device ID (direct API call)
Step 2: Launch Twitter app (direct API call)
Step 3: Spawn native-runner: "Tap the search tab and enter 'AI news'"
Step 4: Wait for results (determine if it's native or web)
Step 5: If web results: spawn web-runner: "Scroll to see results"
Step 6: Take screenshot (direct API call)
Use the mcp__mobai-http__http_request tool to make HTTP requests:
{
  "method": "GET",
  "url": "http://127.0.0.1:8686/api/v1/devices"
}
For POST/PUT/PATCH requests with a body:
{
  "method": "POST",
  "url": "http://127.0.0.1:8686/api/v1/devices/{id}/dsl/execute",
  "body": "{\"version\":\"0.2\",\"steps\":[{\"action\":\"observe\",\"context\":\"native\"}]}"
}
Parameters:
method: HTTP method (GET, POST, PUT, PATCH, DELETE)
url: Full URL to request
body: Request body as JSON string (for POST/PUT/PATCH)
headers: Optional additional headers
timeout: Request timeout in milliseconds (default: 600000 = 10 minutes)
Base URL: http://127.0.0.1:8686/api/v1
No authentication is required. The API runs on localhost only.
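The `body` parameter is passed as a JSON string rather than a nested object, so the payload has to be serialized before it goes into the tool arguments. A minimal sketch of that step, assuming the arguments are assembled in Python first; `build_request` is a hypothetical helper and `DEVICE-123` a placeholder ID.

```python
import json
from typing import Optional

BASE_URL = "http://127.0.0.1:8686/api/v1"


def build_request(method: str, path: str, body: Optional[dict] = None) -> dict:
    """Assemble http_request tool arguments; the body becomes a JSON string."""
    request = {"method": method.upper(), "url": f"{BASE_URL}{path}"}
    if body is not None:
        # The tool expects a string here, not a nested JSON object.
        request["body"] = json.dumps(body)
    return request


req = build_request(
    "post",
    "/devices/DEVICE-123/dsl/execute",
    {"version": "0.2", "steps": [{"action": "observe", "context": "native"}]},
)
print(json.dumps(req, indent=2))
```

Letting `json.dumps` produce the string avoids hand-escaping the inner quotes.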
Success responses:
{"success": true, "data": {...}}
Error responses:
{"error": "error message", "code": "optional_code", "details": "optional_details"}
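The two response shapes can be told apart by the presence of the `error` key. A minimal sketch of unwrapping a raw response string, assuming the response text is available as JSON; `MobaiError` and `unwrap` are hypothetical names, not part of the API.

```python
import json


class MobaiError(Exception):
    """Raised when a response matches the documented error shape."""

    def __init__(self, message, code=None, details=None):
        super().__init__(message)
        self.code = code
        self.details = details


def unwrap(raw: str):
    """Return the data field of a success response, or raise MobaiError."""
    payload = json.loads(raw)
    if "error" in payload:
        # code and details are optional in the error shape.
        raise MobaiError(payload["error"], payload.get("code"), payload.get("details"))
    if payload.get("success") is True:
        return payload.get("data")
    raise ValueError("response matches neither documented shape")


print(unwrap('{"success": true, "data": {"id": "DEVICE-123"}}'))
```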
GET /devices # List all connected devices
GET /devices/{id} # Get specific device info
GET /devices/{id}/screenshot # Capture screenshot (saved to file, path returned). Add ?path=~/Downloads&name=foo to save to custom location.
GET /devices/{id}/apps # List installed apps (or use DSL observe with include: ["installed_apps"])
POST /devices/{id}/bridge/start # Start on-device bridge (60s timeout)
POST /devices/{id}/bridge/stop # Stop on-device bridge
POST /devices/{id}/dsl/execute # Execute DSL script with retries
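The optional `path` and `name` query parameters of the screenshot endpoint are easiest to attach with a URL builder. A minimal sketch, assuming the server decodes percent-encoded query values in the usual way; `screenshot_url` is a hypothetical helper and `DEVICE-123` a placeholder ID.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://127.0.0.1:8686/api/v1"


def screenshot_url(device_id, path=None, name=None):
    """Build the screenshot URL, adding path/name query parameters only when given."""
    url = f"{BASE_URL}/devices/{quote(device_id, safe='')}/screenshot"
    query = {key: value for key, value in (("path", path), ("name", name)) if value is not None}
    return f"{url}?{urlencode(query)}" if query else url


print(screenshot_url("DEVICE-123"))
print(screenshot_url("DEVICE-123", path="~/Downloads", name="foo"))
```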
Example DSL script:
{
  "version": "0.2",
  "steps": [
    {"action": "observe", "context": "native", "include": ["ui_tree"]},
    {"action": "tap", "predicate": {"text_contains": "Settings"}},
    {"action": "observe", "context": "native", "include": ["ui_tree"]}
  ],
  "on_fail": {"strategy": "retry", "max_retries": 2}
}
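The observe / tap / observe pattern in the example script can be generated rather than written by hand. A minimal sketch that uses only the fields shown in the example; `tap_with_observe` is a hypothetical helper, not part of the DSL.

```python
import json


def observe_step() -> dict:
    """Return a fresh observe step so steps never share mutable state."""
    return {"action": "observe", "context": "native", "include": ["ui_tree"]}


def tap_with_observe(text: str, max_retries: int = 2) -> dict:
    """Build a script that observes, taps the element containing text, then observes again."""
    return {
        "version": "0.2",
        "steps": [
            observe_step(),
            {"action": "tap", "predicate": {"text_contains": text}},
            observe_step(),
        ],
        "on_fail": {"strategy": "retry", "max_retries": max_retries},
    }


script = tap_with_observe("Settings")
print(json.dumps(script, indent=2))
```

Observing before and after the tap gives the caller a UI tree to compare, which is what makes the step verifiable.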
POST /devices/{id}/scroll-until-visible # Scroll to find element
POST /devices/{id}/collect-list # Collect all list items
POST /devices/{id}/agent/run # {"task": "...", "agentType": "toolagent"}
POST /devices/{id}/kill-app # Force-kill app: {"bundleId": "..."}
DELETE /devices/{id}/apps/{bundleId} # Uninstall app by bundle ID
POST /devices/{id}/location # Set GPS: {"lat": 40.71, "lon": -74.00}
DELETE /devices/{id}/location # Reset to real GPS
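Setting and resetting the GPS position come as a pair, so it can help to build both requests together and always send the reset when a test finishes. A minimal sketch; `location_requests` is a hypothetical helper, `DEVICE-123` is a placeholder ID, and the coordinate range checks are a client-side assumption rather than documented server behaviour.

```python
import json

BASE_URL = "http://127.0.0.1:8686/api/v1"


def location_requests(device_id: str, lat: float, lon: float) -> tuple:
    """Return (set_request, reset_request) for a simulated GPS position."""
    # Range checks are an assumption made here, not part of the documented API.
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    url = f"{BASE_URL}/devices/{device_id}/location"
    set_request = {"method": "POST", "url": url, "body": json.dumps({"lat": lat, "lon": lon})}
    reset_request = {"method": "DELETE", "url": url}
    return set_request, reset_request


set_req, reset_req = location_requests("DEVICE-123", 40.71, -74.00)
print(set_req)
print(reset_req)
```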
Use metrics_start and metrics_stop DSL actions for performance testing:
{
  "version": "0.2",
  "steps": [
    {"action": "metrics_start", "types": ["system_cpu", "system_memory", "fps", "network", "battery"], "capture_logs": true, "label": "test"},
    {"action": "open_app", "bundle_id": "com.example.app"},
    {"action": "delay", "duration_ms": 5000},
    {"action": "metrics_stop", "format": "summary"}
  ]
}
Returns health score, anomalies, and recommendations. Metric types: system_cpu, system_memory, fps, network, battery, process.
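Since the metric types form a fixed list, a script builder can reject misspelled types before anything is sent to the device. A minimal sketch that uses only the fields from the example above; `metrics_script` is a hypothetical helper and the validation is a client-side assumption.

```python
METRIC_TYPES = {"system_cpu", "system_memory", "fps", "network", "battery", "process"}


def metrics_script(bundle_id, types, duration_ms=5000, label="test"):
    """Build a metrics_start / open_app / delay / metrics_stop script."""
    unknown = sorted(set(types) - METRIC_TYPES)
    if unknown:
        raise ValueError(f"unknown metric types: {unknown}")
    return {
        "version": "0.2",
        "steps": [
            {"action": "metrics_start", "types": list(types), "capture_logs": True, "label": label},
            {"action": "open_app", "bundle_id": bundle_id},
            {"action": "delay", "duration_ms": duration_ms},
            {"action": "metrics_stop", "format": "summary"},
        ],
    }


script = metrics_script("com.example.app", ["fps", "process"])
print([step["action"] for step in script["steps"]])
```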
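CRITICAL: Browser apps (Safari, Chrome) have TWO zones: the browser chrome (URL bar, tabs, navigation buttons), which is native UI, and the web page content (DOM).
Use Native Mode (native-runner) when interacting with browser chrome or native app UI. It also handles most web page interactions via the accessibility tree.
Use Web Mode (web-runner) ONLY when you need DOM-level access inside a web page or WebView (CSS selectors, JavaScript).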
NEVER use web-runner for: browser chrome UI (address bar, tabs, navigation buttons) - always use native-runner for those!
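The native/web rules, together with the iOS Simulator restriction on web context, reduce to a short decision function. A minimal sketch; `choose_runner` and the zone labels `"chrome"` and `"dom"` are hypothetical names chosen for illustration.

```python
def choose_runner(zone: str, ios_simulator: bool = False) -> str:
    """Pick a sub-agent for a browser interaction.

    zone is "chrome" for browser UI (address bar, tabs, navigation buttons)
    or "dom" for content inside the web page or WebView.
    """
    if zone == "chrome":
        # Browser UI is native, so web-runner is never the right choice here.
        return "native-runner"
    if zone == "dom":
        # Web context needs a physical iOS device; simulators fall back to native.
        # Even on real devices, native-runner is worth trying first.
        return "native-runner" if ios_simulator else "web-runner"
    raise ValueError(f"unknown zone: {zone}")


print(choose_runner("chrome"))
print(choose_runner("dom"))
print(choose_runner("dom", ios_simulator=True))
```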
Detection Tips:
Check that the bridge is running (bridgeRunning: true) before automation.
See api-reference.md for full endpoint documentation.
npx claudepluginhub mobai-app/mobai-marketplace --plugin mobai