Control the iOS Simulator — tap, type, swipe, screenshot, record video, install and launch apps. Use when interacting with the iOS simulator during React Native development. Triggers on "simulator", "tap", "swipe", "screenshot of simulator", "install app", "launch app", "what's on screen", "record video", "accessibility tree", "tap by label", "tap by id", "list elements", "go back", "scroll to top", "scroll to bottom".
From react-native-foundations
Install: npx claudepluginhub ryanthedev/react-native-foundations.skill
Files: references/troubleshooting.md, scripts/app.sh, scripts/capture.sh, scripts/device.sh, scripts/ui.sh
On load: Read ../../.claude-plugin/plugin.json from this skill's base directory. Display ios-sim v{version} before proceeding.
Control the iOS Simulator through shell scripts wrapping xcrun simctl and AXe.
IMPORTANT: Never load screenshots or accessibility trees in the main context.
Always dispatch a subagent for visual/inspection tasks.
| Skill | What it uses |
|---|---|
| a11y-audit | ui.sh describe-all for accessibility tree capture |
| layout-check | capture.sh view + ui.sh describe-all for screenshot and element positions |
| diagnose | capture.sh view to read error text from the simulator screen |
| deeplink-test | Screenshot capture to verify the screen after firing a deep link |
- A booted iOS simulator (xcrun simctl list devices to check)
- AXe (brew install cameroncooke/axe/axe): required for all ui.sh commands (tap, tap-label, tap-id, type, swipe, describe-all, describe-point, list, back, scroll). capture.sh view works without AXe.
- See ${CLAUDE_SKILL_DIR}/references/troubleshooting.md if anything is missing

All scripts live at ${CLAUDE_SKILL_DIR}/scripts/. Run them with Bash.
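A quick preflight check for the prerequisites listed above might look like the following. This is a minimal sketch, not part of the skill's scripts: `check_tool` is a hypothetical helper that only tests whether a command is on the PATH, and the install hints simply repeat the ones given above.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper; not part of the skill's scripts.

# Print "ok: <name>" if the command is on PATH, otherwise "missing: <name>".
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

check_tool xcrun || echo "xcrun not found: install Xcode first"
check_tool axe || echo "AXe not found: brew install cameroncooke/axe/axe"

# Only query simulators when xcrun is actually available.
if command -v xcrun >/dev/null 2>&1; then
  xcrun simctl list devices | grep -i "booted" || echo "no booted simulator found"
fi
```

If any line reports a missing tool, the troubleshooting reference is the place to look next.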
| Intent | Workflow | Why |
|---|---|---|
| See what's on screen | view | Image stays in subagent |
| Find UI elements/coordinates | inspect | JSON tree stays in subagent |
| Multi-step UI interaction | interact | Entire loop stays in subagent |
| Simple one-shot command | direct | No image/tree involved |
| Intent | Script | Example |
|---|---|---|
| Get booted simulator ID | device.sh booted | device.sh booted |
| Open Simulator app | device.sh open | device.sh open |
| Save screenshot to file | capture.sh screenshot <path> | capture.sh screenshot /tmp/shot.png |
| Start video recording | capture.sh record | capture.sh record |
| Stop video recording | capture.sh stop | capture.sh stop |
| Install app bundle | app.sh install <path> | app.sh install /path/to/App.app |
| Launch app by bundle ID | app.sh launch <id> | app.sh launch com.example.app |
| Tap element by accessibility label | ui.sh tap-label <label> | ui.sh tap-label "Login" |
| Tap element by accessibility ID | ui.sh tap-id <id> | ui.sh tap-id "submit-button" |
| List on-screen elements (Controls/Content) | ui.sh list | ui.sh list |
| Tap the back/navigation button | ui.sh back | ui.sh back |
| Scroll to top or bottom of list | ui.sh scroll top\|bottom | ui.sh scroll top |
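Several of the direct commands in the table can be chained into a single one-shot sequence. The sketch below assumes the scripts exit non-zero on failure. `SCRIPTS`, `DRY_RUN`, `run`, and `install_and_open` are illustrative names, not part of the skill.

```shell
#!/usr/bin/env bash
# Sketch of a one-shot "direct" sequence built from commands in the table.
# SCRIPTS, DRY_RUN, run, and install_and_open are illustrative names only.
SCRIPTS="${CLAUDE_SKILL_DIR:-.}/scripts"

# Run a command, or just print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Install, launch, tap a labelled control, then save a screenshot to a file.
# Stops at the first step that fails (assumes non-zero exit on failure).
install_and_open() {
  local app_path="$1" bundle_id="$2" label="$3" shot="$4"
  run "$SCRIPTS/app.sh" install "$app_path" &&
    run "$SCRIPTS/app.sh" launch "$bundle_id" &&
    run "$SCRIPTS/ui.sh" tap-label "$label" &&
    run "$SCRIPTS/capture.sh" screenshot "$shot"
}
```

For example, `DRY_RUN=1 install_and_open /path/to/App.app com.example.app "Login" /tmp/shot.png` prints the four commands without touching the simulator. The screenshot is only written to disk here, so no image enters the main context.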
When: "What's on the simulator screen?", "How does it look?", "Is there an error?"
Main agent never loads the image. Haiku does the analysis.
Dispatch Agent:
  subagent_type: general-purpose
  model: haiku
  description: "ios-sim: analyze screenshot"
  prompt: |
    1. Run: ${CLAUDE_SKILL_DIR}/scripts/capture.sh view
       This outputs a file path to a compressed JPEG.
    2. Read that file path with the Read tool to see the image.
    3. Analyze and return:
       - Overview: What app/screen is visible (1-2 sentences)
       - Key elements: Buttons, text, inputs, navigation items
       - State: Errors, loading, forms filled, current tab
       - Coordinates: Notable interactive elements with approximate point positions
    4. If the user asked something specific, answer that directly.
    Return text only. Be concise.
    USER QUESTION: [insert user's question here]
When: "What elements are on screen?", "Find the login button", "Where should I tap?"
The accessibility tree JSON can be massive. Parse it in a subagent.
Dispatch Agent:
  subagent_type: general-purpose
  model: haiku
  description: "ios-sim: inspect UI elements"
  prompt: |
    1. Run: ${CLAUDE_SKILL_DIR}/scripts/ui.sh describe-all
       This outputs the full accessibility tree as JSON.
    2. Parse the JSON and return a structured summary:
       - Screen dimensions (from root frame)
       - Interactive elements: buttons, text fields, switches, links
         Format each as: "Label" [type] at (x, y), size WxH
       - Current focus/selection state
       - Navigation structure (tabs, headers, back buttons)
    3. If looking for a specific element, report its exact coordinates.
    Return text only. Be concise.
    LOOKING FOR: [insert what the user needs to find]
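When the inspect summary reports an element as an origin plus a size, the point to pass to ui.sh tap is usually the center of that frame. A minimal sketch of the arithmetic follows. `frame_center` is a hypothetical helper, and it assumes (x, y) is the top-left corner in points. Bash arithmetic is integer-only, so round fractional values before calling it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a frame (origin x, y and size w, h, in points)
# into the center point suitable for `ui.sh tap X Y`.
# Assumes (x, y) is the frame's top-left corner; integer division rounds down.
frame_center() {
  local x="$1" y="$2" w="$3" h="$4"
  echo "$(( x + w / 2 )) $(( y + h / 2 ))"
}
```

For a button reported at (20, 100) with size 200x44, `frame_center 20 100 200 44` prints `120 122`.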
When: "Tap the login button", "Fill in the form", "Navigate to settings"
Combines view + inspect + actions in a subagent loop. The entire interaction stays isolated — main context only gets the final result.
Dispatch Agent:
  subagent_type: general-purpose
  description: "ios-sim: UI interaction"
  prompt: |
    You are automating the iOS Simulator. Scripts are at:
    ${CLAUDE_SKILL_DIR}/scripts/

    Available commands:
    - capture.sh view → compressed screenshot (read the output path to see it)
    - ui.sh describe-all → full accessibility tree JSON
    - ui.sh describe-point X Y → element at coordinates
    - ui.sh tap X Y → tap at point coordinates
    - ui.sh tap X Y --duration S → long press
    - ui.sh type "text" → type ASCII text (max 500 chars)
    - ui.sh swipe X1 Y1 X2 Y2 → swipe gesture
    - ui.sh tap-label "label" → tap element by accessibility label (no coordinate lookup needed)
    - ui.sh tap-id "id" → tap element by accessibility ID (no coordinate lookup needed)
    - ui.sh list → compact table of on-screen elements grouped by Controls/Content
    - ui.sh back → heuristic back-button finder and tap (scores by label/position)
    - ui.sh scroll top|bottom → repeated swipes with stabilization detection (max 10 swipes)

    TASK: [insert what the user wants to do]

    WORKFLOW:
    1. First capture.sh view to see current state
    2. Use ui.sh describe-all if you need exact coordinates
    3. Perform the requested actions
    4. capture.sh view again to verify the result
    5. Return a text summary of what you did and the final state

    RULES:
    - Use POINT coordinates from the accessibility tree, not pixel coordinates
    - After each action, verify the result before proceeding
    - If something fails, try describe-all to re-orient
    - Return text summary only; do not include base64 image data
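The "verify after each action" rule can also be approximated with a text-only check inside the subagent. The sketch below taps by label and then polls ui.sh list until the expected text shows up. `tap_and_verify`, the `ui` wrapper, and `WAIT` are hypothetical names. It also assumes that ui.sh list prints element labels as plain text, which the skill does not document.

```shell
#!/usr/bin/env bash
# Sketch of "act, then verify" using text output only.
# ui() wraps the skill's ui.sh; tap_and_verify and WAIT are hypothetical names.
ui() { "${CLAUDE_SKILL_DIR}/scripts/ui.sh" "$@"; }

# Tap an element by label, then poll `ui.sh list` until the expected text
# appears. Assumes the list output contains element labels as plain text.
tap_and_verify() {
  local label="$1" expected="$2" tries="${3:-3}"
  local i
  ui tap-label "$label" || return 1
  for (( i = 0; i < tries; i++ )); do
    if ui list | grep -qF -- "$expected"; then
      echo "verified: $expected"
      return 0
    fi
    sleep "${WAIT:-1}"   # give the UI time to settle before re-checking
  done
  echo "not found after $tries checks: $expected" >&2
  return 1
}
```

A call such as `tap_and_verify "Login" "Settings"` returns non-zero if the expected text never appears. At that point, falling back to capture.sh view or ui.sh describe-all, as the rules suggest, is the safer way to re-orient.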
- Pass --udid <UUID> to target a specific device.
- ui.sh type only accepts ASCII printable characters (max 500 chars).
- After capture.sh record, run capture.sh stop to finish.
- The interact workflow omits model to use the user's current model (better reasoning for complex multi-step tasks).

| Item | Size | In Main Context? |
|---|---|---|
| Screenshot JPEG | ~100-300 KB | NEVER — haiku subagent only |
| Accessibility tree JSON | ~10-100 KB | NEVER — subagent only |
| Subagent text summary | ~200-800 chars | YES |
| Direct commands (device, app) | ~50-200 chars | YES |