SwiftAutoGUI

A Swift library for macOS automation — mouse, keyboard, screenshots, image recognition, and AI-powered agents.

This project is inspired by PyAutoGUI.

Demo

An AI agent that autonomously observes the screen and executes actions to achieve a goal:

sagui agent "Open Safari and search for Swift"

Requirements

macOS 26.0+
Swift 6.0+

Installation

Swift Package Manager

SwiftAutoGUI is available through Swift Package Manager.

In Package.swift, add the following:

dependencies: [
    // Dependencies declare other packages that this package depends on.
    .package(url: "https://github.com/NakaokaRei/SwiftAutoGUI", branch: "master")
],
targets: [
    .target(
        name: "MyProject",
        dependencies: [..., "SwiftAutoGUI"]
    )
    ...
]

Example Usage

For more details, see the DocC documentation.

AI Agent (Autonomous Loop)

SwiftAutoGUI includes an Agent that can autonomously observe the screen, reason about what it sees, and execute actions in a loop until a goal is achieved. This follows the ReAct (Observe → Think → Act) pattern using a vision-capable LLM.
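To make the loop concrete, here is a minimal, self-contained sketch of an observe, think, act cycle with an iteration budget. It does not use SwiftAutoGUI: `Observation`, `Decision`, and `runLoop` are hypothetical stand-ins invented for this illustration, and the "screen" is just a string. The library's actual Agent may be structured differently.

```swift
import Foundation

// Hypothetical stand-ins for illustration; these are NOT SwiftAutoGUI types.
struct Observation { let screenText: String }
struct Decision {
    let reasoning: String
    let actions: [String]
    let goalReached: Bool
}

/// Repeats observe, think, act until the goal is reached or the budget runs out.
func runLoop(
    maxIterations: Int,
    observe: () -> Observation,
    think: (Observation, [Decision]) -> Decision,
    act: (String) -> Void
) -> (completed: Bool, iterationsUsed: Int) {
    var history: [Decision] = []
    var iteration = 0
    while iteration < maxIterations {
        iteration += 1
        let observation = observe()                  // Observe
        let decision = think(observation, history)   // Think
        if decision.goalReached { return (true, iteration) }
        for action in decision.actions { act(action) }  // Act
        history.append(decision)
    }
    return (false, iteration)
}

// A fake screen that changes once the "open safari" action is performed.
var screen = "desktop"
var performed: [String] = []

let result = runLoop(
    maxIterations: 5,
    observe: { Observation(screenText: screen) },
    think: { observation, _ in
        if observation.screenText == "safari" {
            return Decision(reasoning: "Safari is open", actions: [], goalReached: true)
        }
        return Decision(reasoning: "Safari is not open yet", actions: ["open safari"], goalReached: false)
    },
    act: { action in
        performed.append(action)
        if action == "open safari" { screen = "safari" }
    }
)
print("Completed: \(result.completed), Steps: \(result.iterationsUsed)")
```

The iteration cap plays the same role as `maxIterations` in the examples that follow: it bounds the loop when the model never reports that the goal is reached.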

Basic Usage

import SwiftAutoGUI

let backend = OpenAIVisionBackend(apiKey: "sk-...", model: "gpt-4o")
let agent = Agent(backend: backend, maxIterations: 15)

let result = try await agent.run(goal: "Open Safari and search for Swift")
print("Completed: \(result.completed), Steps: \(result.iterationsUsed)")

With Step Callback

let result = try await agent.run(goal: "Click the Settings icon") { step in
    print("Reasoning: \(step.reasoning)")
    print("Actions: \(step.actions)")
}

Custom Backend

You can implement the VisionActionGenerating protocol to use any vision-capable LLM:

struct MyBackend: VisionActionGenerating {
    var isAvailable: Bool { true }
    var unavailableReason: String? { nil }
    
    func generateActions(
        goal: String,
        screenshot: Data,
        screenSize: CGSize,
        history: [AgentStep]
    ) async throws -> AgentResponse {
        // Send screenshot to your LLM and parse the response
        ...
    }
}
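The "parse the response" step in the comment above could look something like the following sketch. The JSON schema, `ModelReply`, `ReplyError`, and `parseReply` are all assumptions made for illustration. The real `AgentResponse` type is not shown here and may have a different shape, so the decoded values would still need to be mapped onto it.

```swift
import Foundation

// Hypothetical reply schema, for illustration only. The real AgentResponse
// type and the JSON your model returns may look different.
struct ModelReply: Decodable {
    struct PlannedAction: Decodable {
        let type: String
        let x: Double?
        let y: Double?
        let text: String?
    }
    let reasoning: String
    let actions: [PlannedAction]
    let done: Bool
}

enum ReplyError: Error { case noJSONObject }

/// Decodes the span from the first "{" to the last "}" in the raw model output.
/// Models often wrap JSON in prose or code fences, so the text is trimmed first.
func parseReply(_ raw: String) throws -> ModelReply {
    guard let start = raw.firstIndex(of: "{"),
          let end = raw.lastIndex(of: "}"),
          start < end else {
        throw ReplyError.noJSONObject
    }
    let json = String(raw[start...end])
    return try JSONDecoder().decode(ModelReply.self, from: Data(json.utf8))
}

// Example model output with leading prose around the JSON object.
let raw = """
Here is the plan:
{"reasoning": "The Settings icon is in the Dock.", "actions": [{"type": "click", "x": 640, "y": 880}], "done": false}
"""
let reply = try parseReply(raw)
print(reply.reasoning)
```

Decoding into a strict `Decodable` type makes malformed replies fail loudly, which a backend can surface as a thrown error from `generateActions`.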

CLI

# Run the agent from the command line
sagui agent "Open Safari and search for Swift" --api-key sk-...

# With options
sagui agent "Click the trash icon" --model gpt-4o --max-iterations 15 --delay 2.0

# Using environment variable for the API key
export OPENAI_API_KEY=sk-...
sagui agent "Open Terminal"
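When using the library from Swift rather than the CLI, the same `OPENAI_API_KEY` variable can be read with Foundation instead of hard-coding the key. The sketch below is an assumption-laden illustration: `resolveAPIKey` is a hypothetical helper, and the precedence it applies (explicit value first, then the environment) is a choice made for this example, not documented `sagui` behavior.

```swift
import Foundation

/// Picks an API key: an explicit value first, then the OPENAI_API_KEY variable.
/// This precedence order is an assumption for the sketch, not documented sagui behavior.
func resolveAPIKey(flag: String?, environment: [String: String]) -> String? {
    if let flag, !flag.isEmpty { return flag }
    if let key = environment["OPENAI_API_KEY"], !key.isEmpty { return key }
    return nil
}

// Read the real process environment; no explicit key is supplied here.
let apiKey = resolveAPIKey(flag: nil, environment: ProcessInfo.processInfo.environment)
print(apiKey == nil ? "No API key found" : "API key found")
```

The resolved value could then be passed to `OpenAIVisionBackend(apiKey:model:)` as in the Basic Usage example.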

Action Pattern

SwiftAutoGUI provides an intuitive Action pattern for building and executing automation sequences.

Basic Usage

import SwiftAutoGUI

// Execute single actions
await Action.leftClick.execute()
await Action.write("Hello, World!").execute()
await Action.keyShortcut([.command, .a]).execute()  // Select all

// Build and execute action sequences
let actions: [Action] = [
    .move(to: CGPoint(x: 100, y: 100)),
    .wait(0.5),
    .leftClick,
    .write("Hello, SwiftAutoGUI!"),
    .keyShortcut([.returnKey])
]
await actions.execute()
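The leading-dot syntax in the array above works because each element is a case (or static member) of a single type. As a rough sketch of how such a pattern can be built, here is a stand-in enum with an array extension. `DemoAction`, `describe()`, and `run()` are hypothetical names for this illustration and are not SwiftAutoGUI's implementation; the steps are only described as strings, and no input events are posted.

```swift
import Foundation

// Stand-in type to illustrate the pattern; this is NOT SwiftAutoGUI's Action.
enum DemoAction {
    case move(x: Double, y: Double)
    case wait(Double)
    case leftClick
    case write(String)

    /// Describes the step instead of performing it, so the sketch has no side effects.
    func describe() -> String {
        switch self {
        case .move(let x, let y): return "move to (\(x), \(y))"
        case .wait(let seconds): return "wait \(seconds)s"
        case .leftClick: return "left click"
        case .write(let text): return "write \(text)"
        }
    }
}

extension Array where Element == DemoAction {
    /// Visits the steps in order; a real implementation would post input events here.
    func run() -> [String] {
        map { $0.describe() }
    }
}

let steps: [DemoAction] = [
    .move(x: 100, y: 100),
    .wait(0.5),
    .leftClick,
    .write("Hello")
]
let log = steps.run()
log.forEach { print($0) }
```

Modeling steps as values means a sequence can be built, stored, or inspected before anything is executed.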

Keyboard Actions

// Text input and shortcuts
let typingActions: [Action] = [
    .write("Fast typing"),
    .wait(1.0),
    .write("Slow typing", interval: 0.1),  // 0.1 second between characters
    .keyShortcut([.command, .z])  // Undo
]
await typingActions.execute()

// Common shortcuts as convenience methods
await Action.copy().execute()       // Cmd+C
await Action.paste().execute()      // Cmd+V
await Action.cut().execute()        // Cmd+X
await Action.selectAll().execute()  // Cmd+A
await Action.save().execute()       // Cmd+S
await Action.undo().execute()       // Cmd+Z
await Action.redo().execute()       // Cmd+Shift+Z

// Special keys
await Action.keyDown(.soundUp).execute()
await Action.keyUp(.soundUp).execute()
