From harness-claude
Optimizes PostgreSQL queries with covering indexes using INCLUDE clause for index-only scans, eliminating heap fetches in high-frequency dashboard and aggregation queries.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claudeThis skill uses the workspace's default tool permissions.
> Indexes that contain all columns needed by a query, enabling index-only scans that skip heap table access entirely.
Guides B-tree index mechanics, composite column ordering, partial/covering indexes, GIN for JSONB/full-text search, GiST for geometry/ranges, and maintenance. Use for sequential scans or slow multi-column filters in EXPLAIN.
Analyzes PostgreSQL and MySQL index usage to detect missing indexes causing sequential scans, unused indexes, and recommend optimal configurations for query performance.
Provides SQL Server index design and optimization strategies for clustered, nonclustered, columnstore, filtered, and covering indexes plus maintenance and fragmentation.
Share bugs, ideas, or general feedback.
Indexes that contain all columns needed by a query, enabling index-only scans that skip heap table access entirely.
Heap Fetches in EXPLAIN)A covering index contains every column the query references -- in WHERE, SELECT, ORDER BY, and GROUP BY. When the index "covers" the entire query, PostgreSQL performs an Index Only Scan, reading data exclusively from the index and never touching the heap table.
The INCLUDE clause (PostgreSQL 11+) adds non-searchable payload columns to the index:
CREATE INDEX idx_orders_covering
ON orders (status) INCLUDE (total, created_at);
INCLUDE columns are stored in index leaf pages but are not part of the search key. They do not affect sort order, are not used for index lookups, and add less overhead than regular index columns because they are not stored in internal (non-leaf) pages.
Without INCLUDE (pre-PostgreSQL 11 or when sort order matters), you can create a covering index by listing all columns as regular index keys:
CREATE INDEX idx_orders_cover_all ON orders (status, created_at, total);
This works but makes created_at and total part of the sort key, increasing index size across all pages (including internal nodes).
Dashboard query executed 1000 times per minute:
SELECT status, count(*), sum(total)
FROM orders
WHERE created_at > '2024-01-01'
GROUP BY status;
Without covering index (heap fetches required):
CREATE INDEX idx_orders_date ON orders (created_at);
EXPLAIN (ANALYZE, BUFFERS)
SELECT status, count(*), sum(total)
FROM orders WHERE created_at > '2024-01-01' GROUP BY status;
HashAggregate (actual time=892.341..892.356 rows=5 loops=1)
-> Index Scan using idx_orders_date on orders
(actual time=0.031..654.210 rows=3200000 loops=1)
Index Cond: (created_at > '2024-01-01')
Buffers: shared hit=89234 read=45123
Execution Time: 893.102 ms
The index finds matching rows but must fetch status and total from the heap -- 45K buffer reads from disk.
With covering index:
CREATE INDEX idx_orders_cover
ON orders (created_at, status) INCLUDE (total);
EXPLAIN (ANALYZE, BUFFERS)
SELECT status, count(*), sum(total)
FROM orders WHERE created_at > '2024-01-01' GROUP BY status;
HashAggregate (actual time=128.451..128.466 rows=5 loops=1)
-> Index Only Scan using idx_orders_cover on orders
(actual time=0.024..91.230 rows=3200000 loops=1)
Index Cond: (created_at > '2024-01-01')
Heap Fetches: 0
Buffers: shared hit=23456
Execution Time: 128.892 ms
Heap Fetches: 0 -- the index contained all needed columns. Buffer reads dropped from 134K to 23K. Query time dropped from 893ms to 129ms.
Including too many columns. Adding 10 INCLUDE columns creates a bloated index that is nearly as large as the table itself. The I/O savings from skipping the heap are offset by the larger index. Target 1-3 INCLUDE columns for specific queries.
Covering indexes for queries with unstable SELECT lists. If the application frequently adds columns to the query, the covering index must be rebuilt each time. Only use covering indexes for stable, high-frequency queries.
Ignoring the visibility map. PostgreSQL can only perform Index Only Scan on pages marked "all-visible" in the visibility map. If VACUUM has not run recently, you see Heap Fetches: N -- the index has the data but PostgreSQL must verify tuple visibility from the heap.
INCLUDE columns as regular keys. Putting payload columns as regular index keys instead of INCLUDE columns wastes space in internal pages and affects sort order unnecessarily.
Visibility map and VACUUM. The visibility map tracks which heap pages have only visible (committed) tuples. Index Only Scan skips heap access only for all-visible pages. After bulk updates or inserts, run VACUUM to update the visibility map:
VACUUM orders;
Monitor autovacuum to ensure it runs frequently enough:
SELECT relname, last_vacuum, last_autovacuum, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'orders';
High n_dead_tup with a stale last_autovacuum means Index Only Scans are degraded.
INCLUDE with UNIQUE constraints:
CREATE UNIQUE INDEX idx_orders_unique_ref
ON orders (reference_number) INCLUDE (status, total);
This enforces uniqueness on reference_number while also enabling Index Only Scans for queries that select status and total.
Trade-off analysis. The decision to create a covering index involves weighing:
Heap Fetches and buffer reads)pg_relation_size)Monitoring effectiveness:
SELECT indexrelname,
idx_tup_read, -- entries returned by index scans
idx_tup_fetch -- heap tuples fetched (lower = more index-only scans)
FROM pg_stat_user_indexes
WHERE indexrelname = 'idx_orders_cover';
When idx_tup_fetch is much lower than idx_tup_read, the index is serving many Index Only Scans.
Covering indexes with partial indexes. Combine both techniques for highly targeted queries:
CREATE INDEX idx_active_cover
ON orders (created_at) INCLUDE (total, status)
WHERE status = 'active';
MySQL InnoDB handles covering indexes differently due to its clustered index architecture:
INCLUDE syntax -- to create a covering index, all columns must be regular index keysUsing index in the Extra column when a covering index is used-- MySQL covering index (no INCLUDE syntax available):
CREATE INDEX idx_orders_cover ON orders (created_at, status, total);
Key difference: In MySQL, making the primary key narrow (e.g., INT vs UUID) benefits all secondary indexes because the PK is appended to every secondary index entry.
Analytics platform reading 20M rows for daily reports. The report query selected 4 columns from a 25-column table. Without a covering index, each of the 20M matching rows required a random heap read -- the query took 45 seconds. After adding a covering index with the 4 columns (3 as keys, 1 as INCLUDE), the query used Index Only Scan and completed in 3 seconds. The index added 800MB of storage but eliminated 20M random I/O operations per report run, reducing total daily I/O by 90%.
EXPLAIN (ANALYZE, BUFFERS) that the query uses Index Only Scan with Heap Fetches near zero.