Analyzes Android Perfetto traces with SQL queries via trace_processor to extract metrics like missed frames, slice durations, and thread activity for pinpointing slowdowns.
Install: `npx claudepluginhub himattm/skills --plugin android`

This skill uses the workspace's default tool permissions.
You have a `.perfetto-trace` file (from `android-perfetto-capture`). This skill turns it into specific, citable findings using `trace_processor` SQL. You will:

- install `trace_processor` if not already present
- run narrow SQL queries against the trace
- hand the small result files to sub-agents for verdicts

You will never open the trace in ui.perfetto.dev from within an agent loop — it requires browser interaction. The SQL backend is the agent-friendly path.

Use this skill when you have a trace (from `android-perfetto-capture`) and need numbers: "what's blocking main?" is a slice→thread query; "how many MyComposable recompositions?" is a slice aggregation. Don't use it if you haven't captured a trace yet (`android-perfetto-capture`), if `android-probe-logging` is faster for the question, or if you want interactive visual exploration (ui.perfetto.dev).

`trace_processor` is a self-contained C++ binary. Cached in /tmp for the session, it's a one-time install:
```shell
if [ ! -x /tmp/trace_processor ]; then
  curl -L https://get.perfetto.dev/trace_processor -o /tmp/trace_processor
  chmod +x /tmp/trace_processor
fi
/tmp/trace_processor --version
```
It runs queries against a trace file and exits — no daemon mode required for ad-hoc analysis. (For interactive multi-query sessions you can run with `--http`, but that's overkill for the single-query agent loop.)
The trace is large (10–100 MB binary). The query result should be small (under ~10 KB text). If your query returns 5,000 rows, narrow it — group, aggregate, or filter — until the result fits in a sub-agent's context with room to reason.
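For instance, instead of dumping thousands of raw slice rows, a grouping query collapses them to one row per name (a sketch — `AGENT_TRACE_` is the naming convention from `android-trace-sections`; adapt the filter to your slices):

```sql
-- One row per slice name instead of one row per slice
SELECT
  name,
  COUNT(*) AS n,
  PRINTF('%.2f', SUM(dur) / 1e6) AS total_ms,
  PRINTF('%.2f', MAX(dur) / 1e6) AS max_ms
FROM slice
WHERE name LIKE 'AGENT_TRACE_%'
GROUP BY name
ORDER BY SUM(dur) DESC
LIMIT 20;
```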
```shell
/tmp/trace_processor /tmp/trace.perfetto-trace -q - <<'SQL' > /tmp/trace-<question>.txt
<your query>
SQL
wc -l /tmp/trace-<question>.txt
```

Then dispatch a sub-agent (model: "sonnet") with:

Read /tmp/trace-<question>.txt. Return: <very specific question, with units>. Under 80 words.
When you instrumented your code with Trace.beginSection("AGENT_TRACE_<id>....") (see android-trace-sections), this query lists every emitted slice with its thread and duration:
```sql
SELECT
  s.name,
  s.ts AS ts_ns,
  s.dur / 1e6 AS dur_ms,
  COALESCE(t.name, 'process/async') AS thread_name,
  CASE
    WHEN t.name = 'main' OR t.name LIKE '%.main' THEN 'MAIN'
    WHEN t.is_main_thread = 1 THEN 'MAIN'
    ELSE 'other'
  END AS lane
FROM slice s
LEFT JOIN thread_track tt ON s.track_id = tt.id
LEFT JOIN thread t ON tt.utid = t.utid
WHERE s.name LIKE 'AGENT_TRACE_%'
ORDER BY s.ts;
```
The `LEFT JOIN` is critical — slices wrapped across coroutine `withContext` boundaries land on async tracks, not thread tracks, and an inner join would silently drop them.
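A quick sanity check for your own trace — the same joins, grouped by whether the slice resolved to a thread (a sketch; a NULL thread means the slice lives on an async or process track):

```sql
SELECT
  CASE WHEN t.utid IS NULL THEN 'async/process track' ELSE 'thread track' END AS track_kind,
  COUNT(*) AS slices
FROM slice s
LEFT JOIN thread_track tt ON s.track_id = tt.id
LEFT JOIN thread t ON tt.utid = t.utid
WHERE s.name LIKE 'AGENT_TRACE_%'
GROUP BY track_kind;
```

If every slice lands in the thread-track bucket, the inner-join version would have worked — but the LEFT JOIN costs nothing and never drops data.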
For "did frames meet the 60 Hz / 120 Hz budget?" use the actual frame timeline (requires surfaceflinger_frame_timeline in the capture config — see android-perfetto-capture Strategy B):
```sql
WITH app_frames AS (
  SELECT
    name AS frame_id,
    ts,
    dur / 1e6 AS dur_ms,
    jank_type,
    jank_severity_type,
    on_time_finish
  FROM actual_frame_timeline_slice
  WHERE upid = (
    SELECT upid FROM process WHERE name = 'com.example.app' LIMIT 1
  )
)
SELECT
  COUNT(*) AS total_frames,
  SUM(CASE WHEN on_time_finish = 1 THEN 1 ELSE 0 END) AS on_time,
  SUM(CASE WHEN on_time_finish = 0 THEN 1 ELSE 0 END) AS late,
  SUM(CASE WHEN jank_type != 'None' THEN 1 ELSE 0 END) AS janky,
  PRINTF('%.2f', AVG(dur_ms)) AS avg_dur_ms,
  PRINTF('%.2f', MAX(dur_ms)) AS p100_dur_ms
FROM app_frames;
```
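When the summary shows jank, a breakdown by cause distinguishes app-side delays from SurfaceFlinger-side ones (a sketch — prepend the same app_frames CTE):

```sql
-- Prepend the app_frames CTE from the summary query above
SELECT jank_type, COUNT(*) AS frames
FROM app_frames
WHERE jank_type != 'None'
GROUP BY jank_type
ORDER BY frames DESC;
```

App-attributed jank types point at your code; SurfaceFlinger-attributed ones usually mean system-side contention you can't fix in the app.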
For per-frame detail (only when total frames < ~60, otherwise this is too verbose), reuse the same app_frames CTE:

```sql
-- Prepend the app_frames CTE from the previous query
SELECT
  frame_id,
  PRINTF('%.2f', dur_ms) AS dur_ms,
  jank_type,
  on_time_finish
FROM app_frames
WHERE on_time_finish = 0 OR jank_type != 'None'
ORDER BY ts
LIMIT 50;
```
Bound the query by timestamps once you've identified a slow region from Recipe 2:
```sql
-- Replace 1234567890 / 1244567890 with your slow-frame ts and ts+dur
SELECT
  s.name,
  s.ts AS ts_ns,
  s.dur / 1e6 AS dur_ms
FROM slice s
JOIN thread_track tt ON s.track_id = tt.id
JOIN thread t ON tt.utid = t.utid
JOIN process p ON t.upid = p.upid
WHERE
  p.name = 'com.example.app'
  AND t.is_main_thread = 1
  AND s.ts >= 1234567890
  AND s.ts <= 1244567890
ORDER BY s.dur DESC
LIMIT 30;
```
The result is the top 30 longest slices on the main thread inside that window — that's where the budget went.
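One caveat: nested slices double-count, because a parent's dur includes its children's. If the list is dominated by parent/child pairs, a variant restricted to top-level slices (depth = 0) gives a cleaner breakdown — a sketch under the same placeholder window:

```sql
-- Same window, but only slices at nesting depth 0 (no parent slice)
SELECT
  s.name,
  s.dur / 1e6 AS dur_ms
FROM slice s
JOIN thread_track tt ON s.track_id = tt.id
JOIN thread t ON tt.utid = t.utid
JOIN process p ON t.upid = p.upid
WHERE
  p.name = 'com.example.app'
  AND t.is_main_thread = 1
  AND s.depth = 0
  AND s.ts >= 1234567890
  AND s.ts <= 1244567890
ORDER BY s.dur DESC
LIMIT 30;
```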
```sql
SELECT
  t.name AS thread_name,
  PRINTF('%.2f', SUM(s.dur) / 1e9) AS total_running_s,
  COUNT(*) AS num_slices
FROM sched s
JOIN thread t USING (utid)
JOIN process p USING (upid)
WHERE p.name = 'com.example.app'
GROUP BY t.name
ORDER BY SUM(s.dur) DESC
LIMIT 20;
```
Shows the most CPU-hungry threads in your app over the entire trace. Useful for "which thread is doing surprisingly much work?"
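If the main thread looks slow but isn't burning CPU, check what scheduler state it spent its time in instead — a sketch against the thread_state table (populated when scheduling data was captured):

```sql
SELECT
  st.state,
  PRINTF('%.2f', SUM(st.dur) / 1e6) AS total_ms,
  COUNT(*) AS segments
FROM thread_state st
JOIN thread t USING (utid)
JOIN process p USING (upid)
WHERE p.name = 'com.example.app' AND t.is_main_thread = 1
GROUP BY st.state
ORDER BY SUM(st.dur) DESC;
```

Heavy 'R' (runnable but not running) time points at CPU contention; heavy 'S' (sleeping) inside a supposedly busy window usually means lock waits or IPC.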
```sql
SELECT
  s.name,
  s.ts AS ts_ns,
  s.dur / 1e6 AS dur_ms,
  t.name AS thread_name
FROM slice s
JOIN thread_track tt ON s.track_id = tt.id
JOIN thread t ON tt.utid = t.utid
JOIN process p ON t.upid = p.upid
WHERE
  p.name = 'com.example.app'
  AND s.name IN ('Concurrent mark sweep GC', 'Concurrent copying GC', 'Explicit concurrent copying GC')
ORDER BY s.dur DESC
LIMIT 20;
```
Long GC pauses (over ~10 ms) on the main thread are visible in the trace as scheduled-out segments. This recipe surfaces them by name.
```sql
SELECT
  s.name,
  COUNT(*) AS calls,
  PRINTF('%.2f', SUM(s.dur) / 1e6) AS total_ms,
  PRINTF('%.2f', AVG(s.dur) / 1e6) AS avg_ms,
  PRINTF('%.2f', MAX(s.dur) / 1e6) AS max_ms
FROM slice s
JOIN thread_track tt ON s.track_id = tt.id
JOIN thread t ON tt.utid = t.utid
JOIN process p ON t.upid = p.upid
WHERE
  p.name = 'com.example.app'
  AND s.name LIKE 'binder%'
GROUP BY s.name
ORDER BY SUM(s.dur) DESC
LIMIT 20;
```
If `binder transaction` shows up with a multi-millisecond average inside a tight UI loop, you have an IPC perf issue.
If your app uses Compose's tracing (Compose 1.5+ instruments Recomposer slices automatically when androidx.tracing is on the classpath):
```sql
SELECT
  s.name,
  COUNT(*) AS recompose_count,
  PRINTF('%.2f', AVG(s.dur) / 1e6) AS avg_dur_ms,
  PRINTF('%.2f', SUM(s.dur) / 1e6) AS total_ms
FROM slice s
JOIN thread_track tt ON s.track_id = tt.id
JOIN thread t ON tt.utid = t.utid
WHERE s.name LIKE '%recompose' OR s.name LIKE '%Recompose%'
GROUP BY s.name
-- Sort on the raw sum: total_ms is a PRINTF string and would sort lexically
ORDER BY SUM(s.dur) DESC
LIMIT 20;
```
A composable recomposing 200 times across a 10s capture is a stability bug, even if each recomposition is fast.
When android-perfetto-capture produced N traces (Recipe section "Repetition for noisy phenomena"), use a shell loop to extract one number per trace and compare the distribution:
```shell
echo "trace,p50_ms,p95_ms,max_ms" > /tmp/trace-aggregate.csv
for f in /tmp/trace-*.perfetto-trace; do
  /tmp/trace_processor "$f" -q - <<'SQL' | tail -1 | awk -v f="$f" '{print f","$0}' >> /tmp/trace-aggregate.csv
SELECT
  PRINTF('%.2f', PERCENTILE(dur_ms, 50)) AS p50,
  PRINTF('%.2f', PERCENTILE(dur_ms, 95)) AS p95,
  PRINTF('%.2f', MAX(dur_ms)) AS max_ms
FROM (
  SELECT s.dur / 1e6 AS dur_ms
  FROM slice s
  WHERE s.name = 'YOUR_SLICE_NAME'
);
SQL
done
```
(Note: the PERCENTILE function used above is only available in newer trace_processor builds — fall back to MIN/AVG/MAX if your version doesn't have it.)
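A portable fallback that works in plain SQLite: order the durations and pick the row at the percentile offset (a sketch — YOUR_SLICE_NAME is the same placeholder as above):

```sql
-- Approximate p95 without a percentile function
SELECT dur / 1e6 AS p95_ms
FROM slice
WHERE name = 'YOUR_SLICE_NAME'
ORDER BY dur
LIMIT 1
OFFSET (
  SELECT CAST(COUNT(*) * 0.95 AS INT)
  FROM slice
  WHERE name = 'YOUR_SLICE_NAME'
);
```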
After every query, the verdict step is:

1. `wc -l <result-file>` — confirm the result is small
2. Dispatch a sub-agent to read the result file and answer one specific question

Bad sub-agent prompt:
Read the trace results and tell me what's slow.
Good sub-agent prompt:
Read /tmp/trace-mainthread-window.txt. Return: (a) the single longest slice and its duration in ms, (b) the count of slices longer than 8 ms, (c) the most-frequent slice name. Under 60 words.

(Dispatch with model: "sonnet".)
The sub-agent's verdict tells you whether to commit a fix, capture again, or run a different query.
The trace files and their analysis artifacts:
```shell
rm -f /tmp/trace*.perfetto-trace /tmp/trace-*.txt /tmp/trace-aggregate.csv
# Keep /tmp/trace_processor — it's reusable across sessions
```
If you wrote any custom shell wrappers around the queries, clean those up too. The cleanup footprint is light — no source files were touched; the only state is in /tmp.
| Mistake | Fix |
|---|---|
| Reading a trace file inline | Always go through trace_processor SQL — the binary file is unreadable to LLMs |
| Inner JOIN on thread_track for slices | Async slices have no thread; use LEFT JOIN and coalesce the thread name |
| Query returning 5,000 rows | Aggregate, group, or filter; the sub-agent needs a small result, not raw data |
| Forgot to capture surfaceflinger_frame_timeline | Frame analysis (Recipe 2) returns empty; recapture with the frametimeline data source |
| Comparing wall-clock dur to thread time | dur is wall-clock; thread_dur gives CPU time — use the right one for the question |
| Querying a release-build trace | Trace sections may be stripped; analysis returns empty for app slices; capture against a debug build |
| One trace, one number, declaring victory | Capture multiple times and look at the distribution; one capture is one sample |
| Sub-agent given the trace file directly | Sub-agents can't parse .perfetto-trace; always feed them the SQL result file |
| trace_processor not refreshed across major Perfetto versions | Some queries break across versions; if a query errors, refresh: rm /tmp/trace_processor and re-curl |