From harness-claude
Interprets PostgreSQL EXPLAIN and EXPLAIN ANALYZE outputs to diagnose slow queries, row count misestimations, missing indexes, and I/O bottlenecks. Useful for production debugging and schema validation.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claudeThis skill uses the workspace's default tool permissions.
> How to read query execution plans to identify performance bottlenecks, row count misestimations, and missing indexes.
Analyzes slow queries in PostgreSQL, MySQL, MongoDB using EXPLAIN plans, buffer stats, index usage to detect sequential scans, missing indexes, and recommend optimizations.
Optimizes PostgreSQL and MySQL queries using EXPLAIN ANALYZE to identify sequential scans, inefficient joins, cardinality issues, and rewrite for indexes and low latency.
Analyzes SQL queries for missing indexes, N+1 patterns, suboptimal joins, full table scans. Interprets EXPLAIN plans, detects anti-patterns, recommends indexes, rewrites queries for PostgreSQL, MySQL, SQLite.
Share bugs, ideas, or general feedback.
How to read query execution plans to identify performance bottlenecks, row count misestimations, and missing indexes.
EXPLAIN shows the query plan without executing the query. EXPLAIN ANALYZE executes the query and shows actual timing and row counts alongside estimates.
-- Plan only (no execution):
EXPLAIN SELECT * FROM orders WHERE status = 'active';
-- Plan with actual execution metrics:
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'active';
-- Full diagnostic output (recommended):
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE status = 'active';
Key fields in EXPLAIN output:
cost=startup..total -- estimated cost in arbitrary units (based on seq_page_cost). Startup cost is the work before the first row is returned. Total cost is the work for all rows.rows -- estimated number of rows the node will producewidth -- estimated average row width in bytesAdditional fields with ANALYZE:
actual time=startup..total -- real millisecondsrows -- actual number of rows producedloops -- how many times this node executed (important for nested loops)Reading order: Plans read bottom-up and inside-out. The deepest-indented node executes first. Each parent node consumes the output of its children.
BUFFERS adds I/O detail:
shared hit -- pages found in the buffer cache (fast)shared read -- pages read from disk (slow)A JOIN query with performance issues:
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.status = 'pending' AND c.region = 'us-east';
Nested Loop (cost=1.00..45231.21 rows=10 width=28)
(actual time=0.082..4521.340 rows=48500 loops=1)
Buffers: shared hit=12340 read=89012
-> Index Scan using idx_orders_status on orders o
(cost=0.43..8234.12 rows=10 width=16)
(actual time=0.031..234.120 rows=48500 loops=1)
Index Cond: (status = 'pending')
Buffers: shared hit=5234 read=3421
-> Index Scan using customers_pkey on customers c
(cost=0.29..3.71 rows=1 width=20)
(actual time=0.085..0.086 rows=1 loops=48500)
Index Cond: (id = o.customer_id)
Filter: (region = 'us-east')
Rows Removed by Filter: 0
Buffers: shared hit=7106 read=85591
Planning Time: 0.342 ms
Execution Time: 4523.891 ms
Diagnosis:
loops=48500), each doing an index scan on customersshared read=85591 on customers shows heavy disk I/OFix: Update statistics to correct the misestimation:
ANALYZE orders;
After ANALYZE, the planner sees the correct row count and switches to Hash Join:
Hash Join (cost=1234.56..9876.54 rows=48200 width=28)
(actual time=12.340..89.120 rows=48500 loops=1)
Buffers: shared hit=8923 read=1234
Execution Time: 91.234 ms
Query time dropped from 4.5 seconds to 91 milliseconds.
Using EXPLAIN without ANALYZE. You only see estimated costs and rows -- not reality. Always use ANALYZE for performance diagnosis (but see the next point for destructive queries).
Running EXPLAIN ANALYZE on destructive queries without a transaction.
-- WRONG: actually deletes rows
EXPLAIN ANALYZE DELETE FROM orders WHERE status = 'cancelled';
-- CORRECT: analyze without side effects
BEGIN;
EXPLAIN ANALYZE DELETE FROM orders WHERE status = 'cancelled';
ROLLBACK;
Ignoring the rows discrepancy. When estimated rows differ from actual rows by more than 10x, the planner likely chose a suboptimal strategy. Fix statistics with ANALYZE tablename;.
Focusing only on total cost. A node with low total cost but high loops can dominate execution time. Multiply actual time by loops to get the true cost of a node.
JSON format for programmatic parsing:
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT ...;
auto_explain extension logs slow query plans automatically:
-- In postgresql.conf:
-- shared_preload_libraries = 'auto_explain'
-- auto_explain.log_min_duration = '1s'
EXPLAIN (ANALYZE, BUFFERS, WAL) in PostgreSQL 13+ shows WAL bytes generated -- useful for understanding write overhead.
pg_stat_statements identifies which queries need EXPLAIN. Sort by total_exec_time to find the biggest offenders:
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
JIT compilation indicators. PostgreSQL 11+ may show JIT compilation in EXPLAIN output for complex queries. JIT: Functions: N, Generation Time: Xms, Inlining Time: Xms. JIT helps for CPU-bound queries but adds planning overhead.
Parallel query plans. Look for Workers Planned: N and Workers Launched: N. If launched < planned, the system ran out of parallel workers (max_parallel_workers_per_gather).
CTE Scan nodes. Before PostgreSQL 12, CTEs were always materialized (written to a temp tuplestore). The CTE Scan node reads from this materialized result. In PostgreSQL 12+, CTEs may be inlined -- no CTE Scan node appears.
SubPlan vs InitPlan. SubPlan executes once per outer row (correlated subquery). InitPlan executes once for the entire query (uncorrelated subquery). SubPlan with high outer row counts is a performance red flag.
"Never executed" nodes. In conditional plans (e.g., with CASE or parameterized bitmap scans), some nodes show (never executed). This is normal -- the runtime condition was not met.
MySQL EXPLAIN uses a tabular format with different columns:
| MySQL Column | PostgreSQL Equivalent |
|---|---|
type | Node type (ALL, index, range) |
possible_keys | Candidate indexes |
key | Chosen index |
rows | Estimated rows |
filtered | Selectivity percentage |
Extra | Additional info (Using index) |
MySQL EXPLAIN ANALYZE (8.0.18+) shows actual timing with tree format:
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'active';
MySQL EXPLAIN FORMAT=TREE is the closest equivalent to PostgreSQL's default text output. MySQL lacks a BUFFERS equivalent -- I/O diagnostics require Performance Schema or SHOW STATUS counters.
API endpoint taking 8 seconds. EXPLAIN ANALYZE on the slow query revealed a Nested Loop join with rows=10 estimated but rows=200000 actual. The stale statistics caused the planner to choose Nested Loop instead of Hash Join. Running ANALYZE orders; updated the statistics, the planner switched to Hash Join, and the query dropped from 8 seconds to 50ms. The fix was added to the CI pipeline: ANALYZE runs after every migration.