Skill

ios-sim

Control the iOS Simulator — tap, type, swipe, screenshot, record video, install and launch apps. Use when interacting with the iOS simulator during React Native development. Triggers on "simulator", "tap", "swipe", "screenshot of simulator", "install app", "launch app", "what's on screen", "record video", "accessibility tree", "tap by label", "tap by id", "list elements", "go back", "scroll to top", "scroll to bottom".

From react-native-foundations

Install

Run in your terminal

npx claudepluginhub ryanthedev/react-native-foundations.skill

Tool Access

This skill is limited to using the following tools:

BashReadAgent

Supporting Assets

View in Repository

references/troubleshooting.md

scripts/app.sh

scripts/capture.sh

scripts/device.sh

scripts/ui.sh

Skill Content

Similar Skills

skill-lookup

Searches, retrieves, and installs Agent Skills from prompts.chat registry using MCP tools like search_skills and get_skill. Activates for finding skills, browsing catalogs, or extending Claude.

prompts.chat

157.6k

prompt-lookup

Searches prompts.chat for AI prompt templates by keyword or category, retrieves by ID with variable handling, and improves prompts via AI. Use for discovering or enhancing prompts.

prompts.chat

157.6k

Agent Development

6 files

Guides agent creation for Claude Code plugins with file templates, frontmatter specs (name, description, model), triggering examples, system prompts, and best practices.

plugin-dev

83.2k

Stats

Stars4

Forks0

Last CommitMar 11, 2026

Actions

View Source View Plugin View on GitHub View README

Skill: ios-sim

On load: Read ../../.claude-plugin/plugin.json from this skill's base directory. Display ios-sim v{version} before proceeding.

Control the iOS Simulator through shell scripts wrapping xcrun simctl and AXe.

IMPORTANT: Never load screenshots or accessibility trees in the main context.
Always dispatch a subagent for visual/inspection tasks.

Used By

Skill	What it uses
`a11y-audit`	`ui.sh describe-all` for accessibility tree capture
`layout-check`	`capture.sh view` + `ui.sh describe-all` for screenshot and element positions
`diagnose`	`capture.sh view` to read error text from the simulator screen
`deeplink-test`	Screenshot capture to verify the screen after firing a deep link

Prerequisites

macOS with Xcode installed
A booted iOS simulator (xcrun simctl list devices to check)
AXe (brew install cameroncooke/axe/axe) — required for all ui.sh commands (tap, tap-label, tap-id, type, swipe, describe-all, describe-point, list, back, scroll). capture.sh view works without AXe.
See ${CLAUDE_SKILL_DIR}/references/troubleshooting.md if anything is missing

Scripts

All scripts live at ${CLAUDE_SKILL_DIR}/scripts/. Run them with Bash.

Routing Table

Intent	Workflow	Why
See what's on screen	view	Image stays in subagent
Find UI elements/coordinates	inspect	JSON tree stays in subagent
Multi-step UI interaction	interact	Entire loop stays in subagent
Simple one-shot command	direct	No image/tree involved

Direct Commands (safe for main context)

Intent	Script	Example
Get booted simulator ID	`device.sh booted`	`device.sh booted`
Open Simulator app	`device.sh open`	`device.sh open`
Save screenshot to file	`capture.sh screenshot <path>`	`capture.sh screenshot /tmp/shot.png`
Start video recording	`capture.sh record`	`capture.sh record`
Stop video recording	`capture.sh stop`	`capture.sh stop`
Install app bundle	`app.sh install <path>`	`app.sh install /path/to/App.app`
Launch app by bundle ID	`app.sh launch <id>`	`app.sh launch com.example.app`
Tap element by accessibility label	`ui.sh tap-label <label>`	`ui.sh tap-label "Login"`
Tap element by accessibility ID	`ui.sh tap-id <id>`	`ui.sh tap-id "submit-button"`
List on-screen elements (Controls/Content)	`ui.sh list`	`ui.sh list`
Tap the back/navigation button	`ui.sh back`	`ui.sh back`
Scroll to top or bottom of list	`ui.sh scroll top\|bottom`	`ui.sh scroll top`

Workflows

1. View (screenshot analysis)

When: "What's on the simulator screen?", "How does it look?", "Is there an error?"

Main agent never loads the image. Haiku does the analysis.

Dispatch Agent:
  subagent_type: general-purpose
  model: haiku
  description: "ios-sim: analyze screenshot"
  prompt: |
    1. Run: ${CLAUDE_SKILL_DIR}/scripts/capture.sh view
       This outputs a file path to a compressed JPEG.
    2. Read that file path with the Read tool to see the image.
    3. Analyze and return:
       - Overview: What app/screen is visible (1-2 sentences)
       - Key elements: Buttons, text, inputs, navigation items
       - State: Errors, loading, forms filled, current tab
       - Coordinates: Notable interactive elements with approximate point positions
    4. If the user asked something specific, answer that directly.
    Return text only. Be concise.

    USER QUESTION: [insert user's question here]

2. Inspect (accessibility tree)

When: "What elements are on screen?", "Find the login button", "Where should I tap?"

The accessibility tree JSON can be massive. Parse it in a subagent.

Dispatch Agent:
  subagent_type: general-purpose
  model: haiku
  description: "ios-sim: inspect UI elements"
  prompt: |
    1. Run: ${CLAUDE_SKILL_DIR}/scripts/ui.sh describe-all
       This outputs the full accessibility tree as JSON.
    2. Parse the JSON and return a structured summary:
       - Screen dimensions (from root frame)
       - Interactive elements: buttons, text fields, switches, links
         Format each as: "Label" [type] at (x, y) — size WxH
       - Current focus/selection state
       - Navigation structure (tabs, headers, back buttons)
    3. If looking for a specific element, report its exact coordinates.
    Return text only. Be concise.

    LOOKING FOR: [insert what the user needs to find]

3. Interact (multi-step UI automation)

When: "Tap the login button", "Fill in the form", "Navigate to settings"

Combines view + inspect + actions in a subagent loop. The entire interaction stays isolated — main context only gets the final result.

Dispatch Agent:
  subagent_type: general-purpose
  description: "ios-sim: UI interaction"
  prompt: |
    You are automating the iOS Simulator. Scripts are at:
    ${CLAUDE_SKILL_DIR}/scripts/

    Available commands:
    - capture.sh view               → compressed screenshot (read the output path to see it)
    - ui.sh describe-all            → full accessibility tree JSON
    - ui.sh describe-point X Y      → element at coordinates
    - ui.sh tap X Y                 → tap at point coordinates
    - ui.sh tap X Y --duration S    → long press
    - ui.sh type "text"             → type ASCII text (max 500 chars)
    - ui.sh swipe X1 Y1 X2 Y2      → swipe gesture
    - ui.sh tap-label "label"       → tap element by accessibility label (no coordinate lookup needed)
    - ui.sh tap-id "id"             → tap element by accessibility ID (no coordinate lookup needed)
    - ui.sh list                    → compact table of on-screen elements grouped by Controls/Content
    - ui.sh back                    → heuristic back-button finder and tap (scores by label/position)
    - ui.sh scroll top|bottom       → repeated swipes with stabilization detection (max 10 swipes)

    TASK: [insert what the user wants to do]

    WORKFLOW:
    1. First capture.sh view to see current state
    2. Use ui.sh describe-all if you need exact coordinates
    3. Perform the requested actions
    4. capture.sh view again to verify the result
    5. Return a text summary of what you did and the final state

    RULES:
    - Use POINT coordinates from the accessibility tree, not pixel coordinates
    - After each action, verify the result before proceeding
    - If something fails, try describe-all to re-orient
    - Return text summary only — do not include base64 image data

Tips

All scripts auto-detect the booted simulator. Pass --udid <UUID> to target a specific device.
Screenshots are 3x pixel resolution. The accessibility tree reports point coordinates. Always use point coordinates for tap/swipe.
ui.sh type only accepts ASCII printable characters (max 500 chars).
Video recording runs in the background. Use capture.sh stop to finish.
For the interact workflow, omit model to use the user's current model (better reasoning for complex multi-step tasks).

Context Efficiency

Item	Size	In Main Context?
Screenshot JPEG	~100-300 KB	NEVER — haiku subagent only
Accessibility tree JSON	~10-100 KB	NEVER — subagent only
Subagent text summary	~200-800 chars	YES
Direct commands (device, app)	~50-200 chars	YES