`android layout` returns the entire on-screen UI as a structured JSON tree. For most verification — "did the button appear?", "is the input focused?", "did the count increment?", "is the error visible?" — JSON is **strictly better than a screenshot**:
- `--diff` returns only what changed since the last call

Default to this skill. Reach for `verify-android-screen` only when JSON can't answer the question (WebView content, animations, visual fidelity, image content).
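The JSON tree exposes:

- element state (`focused`, `checked`, `selected`)
- available interactions (`clickable`, `scrollable`, etc.)

On WebViews or mid-animation, `layout` may fail or return partial state. For those, use `verify-android-screen`.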
Each element in the layout tree may include:
| Property | Meaning |
|---|---|
| `text` | Literal text the element contains |
| `resourceId` | The Android resource id used to refer to the element |
| `contentDesc` | Accessibility description |
| `class` | Android view class (e.g. `android.widget.Button`) |
| `interactions` | What the user can do: `checkable`, `clickable`, `focusable`, `scrollable`, `long-clickable`, `password` |
| `state` | Current state: `checked`, `focused`, `selected` |
| `bounds` | Bounding rectangle as `[minX,minY][maxX,maxY]` |
| `center` | Center point as `[x,y]` |
| `off-screen` | True if in the hierarchy but not currently visible; may need a scroll |
Example:

```json
{
  "key": -248568265,
  "class": "android.widget.Button",
  "text": "Submit",
  "bounds": "[138,9][167,38]",
  "center": "[152,23]",
  "interactions": ["clickable", "focusable"]
}
```
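Because `bounds` and `center` arrive as strings, a script has to parse them before using the numbers. The sketch below is one way to do that in Python; the helper names `parse_center` and `parse_bounds` are hypothetical, and the string formats are assumed to match the table and example above exactly.

```python
import re
from typing import Tuple


def parse_center(center: str) -> Tuple[int, int]:
    """Turn a center string such as "[152,23]" into (152, 23)."""
    match = re.fullmatch(r"\[(-?\d+),(-?\d+)\]", center.strip())
    if match is None:
        raise ValueError(f"unexpected center format: {center!r}")
    return int(match.group(1)), int(match.group(2))


def parse_bounds(bounds: str) -> Tuple[int, int, int, int]:
    """Turn "[minX,minY][maxX,maxY]" into (min_x, min_y, max_x, max_y)."""
    match = re.fullmatch(
        r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]", bounds.strip()
    )
    if match is None:
        raise ValueError(f"unexpected bounds format: {bounds!r}")
    min_x, min_y, max_x, max_y = (int(group) for group in match.groups())
    return min_x, min_y, max_x, max_y


# Values copied from the example element above.
element = {"bounds": "[138,9][167,38]", "center": "[152,23]"}
print(parse_center(element["center"]))   # (152, 23)
print(parse_bounds(element["bounds"]))   # (138, 9, 167, 38)
```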
```bash
android layout --pretty -o /tmp/layout.json
```
If the file is under ~50 lines, read it inline. Otherwise, delegate to a sub-agent (see below).
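The inline-versus-delegate rule can be checked mechanically before anything is read into context. This is a minimal sketch, assuming the dump was written with `--pretty` so that line count is a fair proxy for size; `should_delegate` and `INLINE_LINE_LIMIT` are hypothetical names, and 50 is the rule of thumb from the text, not a hard limit.

```python
INLINE_LINE_LIMIT = 50  # the "~50 lines" rule of thumb from the text


def should_delegate(dump_path: str, limit: int = INLINE_LINE_LIMIT) -> bool:
    """Return True when the dump is long enough to hand to a sub-agent."""
    with open(dump_path, encoding="utf-8") as handle:
        line_count = sum(1 for _ in handle)
    # Under the limit: read inline. At or over it: delegate.
    return line_count >= limit
```

Counting lines streams the file without parsing it, so the check itself costs no context.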
After the first call, use `--diff` to get only the elements that changed:

```bash
android layout --diff --pretty -o /tmp/layout-diff.json
```
This is the single biggest context saver. A calculator key press should return a one-element diff, not the whole tree.
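In an iteration loop, the pattern is one full dump followed by diffs. The sketch below only builds the argument list for each step, using the flags shown in this document (`--diff`, `--pretty`, `-o`); the function name `layout_command` and the zero-based `step` counter are assumptions for illustration.

```python
from typing import List


def layout_command(step: int, out_path: str) -> List[str]:
    """Build the argv for the Nth layout call in a loop (0-based)."""
    args = ["android", "layout"]
    if step > 0:
        # Every call after the first asks only for what changed.
        args.append("--diff")
    args += ["--pretty", "-o", out_path]
    return args


print(layout_command(0, "/tmp/layout.json"))
print(layout_command(1, "/tmp/layout-diff.json"))
```

The returned list can be passed to `subprocess.run(..., check=True)` on a machine where the `android` CLI is installed.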
When the dump is >50 lines (most real screens), spawn a sub-agent with `model: "sonnet"` and a self-contained prompt that names:

- the specific `resourceId` or `text` to find
- what to return (e.g. "`center` of the element with `text='Submit'`")

Do NOT read the dump in the main thread.
> Read `/tmp/layout.json`. Find an element with `text="Sign in"`. Return its `center` coordinate as `[x,y]`, or "NOT FOUND" if absent. Under 20 words.

> Read `/tmp/layout-diff.json`. Verify the readout element (`resourceId` containing `display`) now shows `text="42"`. Answer YES/NO + one sentence on what it actually shows.

> Read `/tmp/layout.json`. Confirm: (a) an EditText with `state` containing `focused`, (b) a Button with `text="Submit"` and `interactions` containing `clickable`. Under 40 words: did both pass? If not, what's actually there?
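To make the first prompt's contract concrete, here is roughly what the lookup amounts to in Python. This is a sketch under assumptions: the document does not say how child elements are nested, so the walker visits every dict and list in the parsed JSON rather than relying on a particular key. The `children` key in the sample data and the names `iter_elements` and `center_of_text` are hypothetical.

```python
from typing import Any, Dict, Iterator


def iter_elements(node: Any) -> Iterator[Dict[str, Any]]:
    """Yield every dict found anywhere in the parsed JSON, depth-first."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from iter_elements(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_elements(item)


def center_of_text(layout: Any, text: str) -> str:
    """Return the center string of the first element with this exact text."""
    for element in iter_elements(layout):
        if element.get("text") == text and "center" in element:
            return element["center"]
    return "NOT FOUND"


# Hypothetical sample tree; the real nesting key may differ.
SAMPLE = {
    "class": "android.widget.FrameLayout",
    "children": [
        {"class": "android.widget.Button", "text": "Sign in", "center": "[152,23]"},
    ],
}
print(center_of_text(SAMPLE, "Sign in"))  # [152,23]
print(center_of_text(SAMPLE, "Submit"))   # NOT FOUND
```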
Once you have a `center` or `bounds`, drive `adb shell input` directly.

Tap the center of an element:

```bash
adb shell input tap 152 23
```
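Going from a `center` string to the tap command is a small parsing step. The sketch below assumes the `[x,y]` format from the property table; `tap_command` is a hypothetical helper that only builds the argument list and does not run `adb`.

```python
import re
from typing import List


def tap_command(center: str) -> List[str]:
    """Build an `adb shell input tap` argv from a center string like "[152,23]"."""
    match = re.fullmatch(r"\[(\d+),(\d+)\]", center.strip())
    if match is None:
        raise ValueError(f"unexpected center format: {center!r}")
    x, y = match.groups()
    return ["adb", "shell", "input", "tap", x, y]


print(" ".join(tap_command("[152,23]")))  # adb shell input tap 152 23
```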
Swipe to scroll a scrollable element. The fifth argument is the duration in ms; keep it generous (500 ms or more) so the gesture is interpreted as a scroll, not a fling:

```bash
adb shell input swipe 250 400 250 100 500
```
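The swipe coordinates can be derived from the `bounds` of the scrollable element rather than guessed. This is an illustrative sketch: the 20% inset from the top and bottom edges is an arbitrary choice, not something the source prescribes, and `scroll_down_command` is a hypothetical name. The 500 ms floor follows the guidance above.

```python
import re
from typing import List

MIN_DURATION_MS = 500  # slower gestures are read as a scroll, not a fling


def scroll_down_command(bounds: str, duration_ms: int = MIN_DURATION_MS) -> List[str]:
    """Build a vertical swipe inside "[minX,minY][maxX,maxY]" that scrolls down."""
    match = re.fullmatch(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]", bounds.strip())
    if match is None:
        raise ValueError(f"unexpected bounds format: {bounds!r}")
    min_x, min_y, max_x, max_y = (int(group) for group in match.groups())
    x = (min_x + max_x) // 2           # swipe along the horizontal middle
    inset = (max_y - min_y) // 5       # stay 20% away from each edge
    start_y = max_y - inset            # finger starts near the bottom...
    end_y = min_y + inset              # ...and moves up toward the top
    duration = max(duration_ms, MIN_DURATION_MS)
    return ["adb", "shell", "input", "swipe",
            str(x), str(start_y), str(x), str(end_y), str(duration)]


print(" ".join(scroll_down_command("[0,0][500,500]")))
# adb shell input swipe 250 400 250 100 500
```

For a 500x500 container this reproduces the command shown above.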
Type into an input. Always confirm `state` contains `focused` before typing; if it doesn't, tap the element first:

```bash
adb shell input text "hello%sworld"
```

(Use `%s` for spaces in `input text`.)
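The focus check and the `%s` substitution can both be scripted. This is a minimal sketch, assuming `state` is either a list of strings (like `interactions` in the example element) or a single string; the document does not pin the type down. `is_focused` and `input_text_command` are hypothetical helpers, and only spaces are handled: other shell-sensitive characters would need their own escaping.

```python
from typing import Any, Dict, List


def is_focused(element: Dict[str, Any]) -> bool:
    """True when the element's state mentions "focused" (list or string)."""
    state = element.get("state") or []
    return "focused" in state


def input_text_command(text: str) -> List[str]:
    """Build an `adb shell input text` argv, encoding spaces as %s."""
    return ["adb", "shell", "input", "text", text.replace(" ", "%s")]


field = {"class": "android.widget.EditText", "state": ["focused"]}
if is_focused(field):
    print(" ".join(input_text_command("hello world")))
    # adb shell input text hello%sworld
```

If `is_focused` returns False, tap the element, dump the layout again, and re-check before typing.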
- Always confirm `state` contains `focused` before typing; if not, `adb shell input tap` the element first, then re-dump and verify focus.
- If an element has `scrollable` in its `interactions`, try scrolling it when looking for an off-screen element. `off-screen: true` on a target is a strong signal you need to scroll its container.
- If a `layout` call is missing expected information after an action, wait a couple of seconds and call `layout --diff` to see what arrived.

`android layout` can fail on WebViews or mid-animation. When `layout` fails, there are two fallbacks:

- Wait and retry with `--diff` (the animation may finish)
- Use `verify-android-screen` with `--annotate` to find elements visually, then resolve to coordinates

| Mistake | Fix |
|---|---|
| Reaching for a screenshot first | JSON is the default; screenshots are the fallback |
| Reading a 500-line layout dump inline | Always delegate dumps >50 lines to a Sonnet sub-agent |
| Not using `--diff` in iteration loops | The full tree on every step is wasted context; `--diff` gives you only what changed |
| Typing into an unfocused input | Always verify `state` contains `focused` first; tap to focus if not |
| Fast swipes that fling instead of scroll | Use a duration of 500 ms or more on `adb shell input swipe` |
| Vague sub-agent criteria ("does it look right?") | Name the `resourceId`, `text`, or `state` to check, and cap the response length |
| Letting the sub-agent default to Opus | Always pass `model: "sonnet"`; the task is narrow text parsing |