From apple-docs
Automate simulator UI interactions - tap, swipe, type, and verify screen content.
Install: `npx claudepluginhub briannadoubt/claude-marketplace --plugin apple-docs`
Important: This skill requires the AppleDocsTool MCP server for UI automation, as it uses macOS Accessibility APIs that can't be accessed via shell commands alone.
For basic interactions, you can use xcrun simctl directly. For visual state inspection and coordinate-based interactions, you'll need the MCP tools.
```shell
# Type text (requires an app with a focused text field)
# Note: this sends keystrokes to the Simulator app
osascript -e 'tell application "Simulator" to activate'
osascript -e 'tell application "System Events" to keystroke "Hello World"'

# Press Return
osascript -e 'tell application "System Events" to key code 36'

# Press Escape
osascript -e 'tell application "System Events" to key code 53'
```
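Raw key codes like 36 and 53 are easy to misremember. A small wrapper can name them; `press_key` below is a hypothetical helper, not part of this skill:

```shell
# Hypothetical helper mapping names to the macOS key codes used above
# (36 = Return, 53 = Escape, 48 = Tab). Assumes the Simulator is frontmost.
press_key() {
  case "$1" in
    return) code=36 ;;
    escape) code=53 ;;
    tab)    code=48 ;;
    *) echo "unknown key: $1" >&2; return 1 ;;
  esac
  osascript -e "tell application \"System Events\" to key code $code"
}
```

For example, `press_key return` submits a form after typing credentials.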
```shell
# Home button (no simctl press command; use the Simulator menu shortcut Cmd+Shift+H)
osascript -e 'tell application "Simulator" to activate'
osascript -e 'tell application "System Events" to keystroke "h" using {command down, shift down}'

# Lock device (Device > Lock, Cmd+L)
# Note: not all hardware buttons are available via simctl
osascript -e 'tell application "System Events" to keystroke "l" using {command down}'
```
```shell
# Navigate via deep link
xcrun simctl openurl booted "myapp://screen/settings"
xcrun simctl openurl booted "https://example.com/login"
```
```shell
# Take a screenshot
xcrun simctl io booted screenshot /tmp/screen.png

# View it (opens Preview)
open /tmp/screen.png
```
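Repeated captures to a fixed path overwrite each other; generating a unique path per capture avoids that. `snapshot_path` is a hypothetical helper, not part of simctl:

```shell
# Hypothetical helper: build a unique, timestamped screenshot path so
# repeated captures don't overwrite /tmp/screen.png.
snapshot_path() {
  echo "/tmp/screen-$(date +%Y%m%d-%H%M%S).png"
}

# Capture only when the Xcode command-line tools are available.
if command -v xcrun >/dev/null 2>&1; then
  xcrun simctl io booted screenshot "$(snapshot_path)" || echo "no booted simulator"
fi
```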
For OCR and coordinate extraction, use the MCP tool simulator_ui_state, which returns a screenshot along with the recognized on-screen text and its coordinates.

The MCP tools provide coordinate-based interaction:

- Get UI state: simulator_ui_state
- Find text: simulator_find_text
- Interact: simulator_interact
1. Call simulator_ui_state to see the screen
2. Find the "Login" button coordinates from the response
3. Call simulator_interact with action="tap" at those coordinates
4. Call simulator_ui_state again to verify the result
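Taps can land before the UI is ready, so the verify step often needs to be repeated. A minimal sketch of a retry wrapper (`retry` is an assumption for illustration, not an MCP tool):

```shell
# Hypothetical retry wrapper: run a command up to N times with a short
# pause, succeeding as soon as the command does.
retry() {
  n="$1"; shift
  i=0
  while [ "$i" -lt "$n" ]; do
    "$@" && return 0
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `retry 5 grep -q "Welcome" /tmp/ui_state.txt` polls for expected text, assuming you save the OCR output from simulator_ui_state to a file between attempts.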
For basic automation without MCP:
```shell
# Click at coordinates (screen coordinates, not simulator coordinates)
osascript -e 'tell application "System Events" to click at {500, 400}'

# More reliable: target the Simulator process via Accessibility
osascript << 'EOF'
tell application "Simulator" to activate
delay 0.5
tell application "System Events"
    tell process "Simulator"
        -- Click in the simulator window
        click at {500, 400}
    end tell
end tell
EOF
```
```shell
# 1. Launch the app fresh
xcrun simctl terminate booted com.example.myapp
xcrun simctl launch booted com.example.myapp

# 2. Wait for launch
sleep 2

# 3. Take a screenshot to see the initial state
xcrun simctl io booted screenshot /tmp/step1.png

# 4. Use MCP simulator_interact to tap the login button
# 5. Use MCP simulator_interact to type credentials

# 6. Screenshot the final state
xcrun simctl io booted screenshot /tmp/step2.png
```
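The fixed `sleep 2` in step 2 is fragile on slow machines; polling until the app's process appears is more robust. `wait_for_app` is a hypothetical helper built on the simulator's `launchctl` job list:

```shell
# Hypothetical helper: poll until the app's bundle id shows up in the
# simulator's launchd job list, up to a timeout in seconds.
wait_for_app() {
  bundle="$1"
  timeout="${2:-10}"
  i=0
  while [ "$i" -lt "$timeout" ]; do
    if xcrun simctl spawn booted launchctl list 2>/dev/null | grep -q "$bundle"; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_for_app com.example.myapp 10 || exit 1` replaces the fixed sleep.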
```shell
# Navigate through screens, capturing each
for screen in home settings profile; do
    xcrun simctl openurl booted "myapp://$screen"
    sleep 1
    xcrun simctl io booted screenshot "/tmp/$screen.png"
done
```
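A quick way to catch a deep link that did not actually navigate is to compare captures byte-for-byte. `same_screen` is a hypothetical helper around `cmp`:

```shell
# Hypothetical helper: succeed when two captures are byte-identical,
# which usually means the screen did not change between them.
same_screen() {
  cmp -s "$1" "$2"
}
```

For example, `same_screen /tmp/home.png /tmp/settings.png && echo "settings deep link had no effect"`. This is a heuristic: animated or time-displaying screens will differ even when navigation failed.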
```shell
# Reset privacy permissions (camera, photos, location, etc.) so prompts reappear
xcrun simctl privacy booted reset all
```

- simulator_ui_state: Get screenshot + OCR text with coordinates
- simulator_find_text: Find specific text and get tap coordinates
- simulator_interact: Tap, swipe, type, press buttons

These tools use macOS Accessibility APIs to interact with the simulator window directly, providing reliable coordinate-based automation.