Skill

Expression and Specialized Indexes

Creates and optimizes PostgreSQL expression indexes for computed values like lower(email) and GIN/GiST indexes for JSONB, arrays, full-text search, and geometric data to accelerate queries.

PostgreSQL

SQL

database

performance

npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude

Tool Access

This skill uses the workspace's default tool permissions.

Preview

> Indexes on computed expressions and specialized index types (GIN, GiST) for non-scalar data like JSONB, arrays, and full-text search.

Supporting Assets

skill.yaml

SKILL.md

Similar Skills

Index Strategies

Guides B-tree index mechanics, composite column ordering, partial/covering indexes, GIN for JSONB/full-text search, GiST for geometry/ranges, and maintenance. Use for sequential scans or slow multi-column filters in EXPLAIN.

1 file

harness-claude

postgresql-optimization

30.6k

Provides PostgreSQL expertise on advanced features: JSONB/array operations, window functions, full-text search, extensions, and query performance tuning with EXPLAIN ANALYZE.

database-data-management

postgres-optimization

1.4k

Optimizes PostgreSQL with indexes, EXPLAIN ANALYZE query plans, partitioning, JSONB operations, and PgBouncer connection pooling examples.

claude-code-toolkit

Stats

Stars12

Forks6

Last CommitApr 10, 2026

Actions

View Source View Plugin View on GitHub View README

Help us improve

Share bugs, ideas, or general feedback.

Expression and Specialized Indexes

Indexes on computed expressions and specialized index types (GIN, GiST) for non-scalar data like JSONB, arrays, and full-text search.

When to Use

Case-insensitive searches using lower() or upper()
JSONB key lookups and containment queries
Full-text search with tsvector and tsquery
Array containment queries (@>, &&)
Geometric, spatial, or range data types

Instructions

Key Concepts

An expression index indexes the result of a function or expression, not the raw column value:

CREATE INDEX idx_users_lower_email ON users (lower(email));

The query must use the exact same expression for the planner to match it:

-- Uses the index (expression matches):
SELECT * FROM users WHERE lower(email) = 'user@example.com';

-- Does NOT use the index (different function):
SELECT * FROM users WHERE upper(email) = 'USER@EXAMPLE.COM';

GIN (Generalized Inverted Index) is designed for values containing multiple elements:

-- Array containment:
CREATE INDEX idx_posts_tags ON posts USING GIN (tags);
SELECT * FROM posts WHERE tags @> ARRAY['postgresql'];

-- JSONB containment:
CREATE INDEX idx_events_meta ON events USING GIN (metadata);
SELECT * FROM events WHERE metadata @> '{"type": "purchase"}';

GIN indexes build an inverted index -- each distinct element (array value, JSONB key, text lexeme) maps to a list of row locations. This makes containment and membership queries fast.

GiST (Generalized Search Tree) supports geometric, range, and full-text data with operators like overlap (&&), contains (@>), and nearest-neighbor (<->):

CREATE INDEX idx_locations_geo ON locations USING GiST (coordinates);
SELECT * FROM locations WHERE coordinates && circle('(0,0)', 10);

Worked Example

JSONB metadata search on an events table with 20M rows:

CREATE TABLE events (
  id         SERIAL PRIMARY KEY,
  event_type TEXT NOT NULL,
  metadata   JSONB NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

Approach 1: GIN index for ad-hoc JSONB queries:

CREATE INDEX idx_events_meta_gin ON events USING GIN (metadata);

EXPLAIN ANALYZE
SELECT id, event_type
FROM events
WHERE metadata @> '{"type": "purchase", "source": "mobile"}';

Bitmap Heap Scan on events
  (cost=45.21..12345.67 rows=2100 width=12)
  (actual time=1.234..8.901 rows=1987 loops=1)
  Recheck Cond: (metadata @> '{"type": "purchase", "source": "mobile"}')
  ->  Bitmap Index Scan on idx_events_meta_gin
        (actual time=1.102..1.102 rows=1987 loops=1)
Execution Time: 9.432 ms

Approach 2: Expression index for a single frequently queried key:

CREATE INDEX idx_events_type_expr ON events ((metadata->>'type'));

EXPLAIN ANALYZE
SELECT id, event_type
FROM events
WHERE metadata->>'type' = 'purchase';

Index Scan using idx_events_type_expr on events
  (cost=0.43..4521.23 rows=18500 width=12)
  (actual time=0.025..12.341 rows=18200 loops=1)
  Index Cond: ((metadata ->> 'type') = 'purchase')
Execution Time: 13.102 ms

When to use which: GIN for ad-hoc queries across any combination of JSONB keys. Expression index for a single known key queried at high frequency -- smaller index, faster single-key lookup.

Anti-Patterns

Expression mismatch. Index on lower(email) but query uses UPPER(email) -- the planner cannot match different functions. The expression must be identical.
GIN on small scalar columns. GIN is designed for multi-element data types. Using GIN on a simple TEXT column is overkill -- use a B-tree.
Expensive expressions in indexes. The expression is evaluated on every INSERT and UPDATE. A computationally expensive function (e.g., involving network calls or complex parsing) in an expression index severely impacts write performance.
GiST when B-tree suffices. For simple scalar comparisons, B-tree is faster and smaller. GiST is for data types with complex relationships (geometry, ranges, full-text).

PostgreSQL Specifics

Trigram indexes for pattern matching. The pg_trgm extension enables GIN indexes that support LIKE '%pattern%' (leading wildcard) queries:

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_users_name_trgm ON users USING GIN (name gin_trgm_ops);

-- Now this uses the index:
SELECT * FROM users WHERE name LIKE '%alice%';

Full-text search with tsvector and GIN:

CREATE INDEX idx_posts_fts ON posts USING GIN (to_tsvector('english', body));
SELECT * FROM posts WHERE to_tsvector('english', body) @@ to_tsquery('postgresql & indexing');

GIN fastupdate. By default, GIN indexes use a "pending list" to batch insertions, trading faster writes for slightly slower reads. Disable for read-heavy workloads:

CREATE INDEX idx_meta_gin ON events USING GIN (metadata) WITH (fastupdate = off);

Details

Advanced Topics

GIN pending list. The gin_pending_list_limit setting (default 4MB) controls the pending list size. Larger values improve write throughput at the cost of slower first reads after inserts. VACUUM flushes the pending list.

GiST vs SP-GiST. SP-GiST (Space-Partitioned GiST) is optimized for data that can be partitioned into non-overlapping regions (quad-trees, k-d trees). Use SP-GiST for IP addresses (inet), text with prefix matching, and point data with spatial partitioning.

BRIN indexes (Block Range Indexes) for naturally ordered data like time-series. BRIN stores min/max values per block range rather than per row -- extremely small indexes for large, physically ordered tables:

CREATE INDEX idx_events_brin ON events USING BRIN (created_at);

RUM indexes (extension) extend GIN with ordering support for full-text search, enabling ORDER BY ts_rank() without a separate sort step.

Engine Differences

MySQL 8.0 supports functional indexes with similar syntax:

CREATE INDEX idx_lower_email ON users ((LOWER(email)));

MySQL has no GIN equivalent. Full-text search uses FULLTEXT indexes with different syntax:

-- MySQL full-text:
CREATE FULLTEXT INDEX idx_posts_body ON posts (body);
SELECT * FROM posts WHERE MATCH(body) AGAINST('postgresql indexing' IN BOOLEAN MODE);

MySQL FULLTEXT indexes use an inverted index internally but with different operators and ranking algorithms than PostgreSQL's GIN + tsvector approach.

MySQL spatial indexes use R-tree (similar to GiST) via the SPATIAL keyword but support fewer operators and data types than PostgreSQL's GiST.

Real-World Case Studies

Multi-tenant SaaS storing tenant configuration in JSONB. Each tenant had a config JSONB column with nested feature flags, plan details, and custom settings. Before indexing, admin queries like WHERE config @> '{"feature_flags": {"beta": true}}' scanned 2M tenant rows (12 seconds). Adding a GIN index on the config column enabled Bitmap Index Scan, dropping the query to 15ms. The GIN index consumed 180MB -- acceptable for eliminating full table scans on an admin dashboard used hundreds of times daily.

Source

Process

Read the key concepts to understand expression indexes, GIN, and GiST and their appropriate use cases.
Apply expression indexes when queries use function-wrapped columns, GIN for multi-element containment queries, and GiST for geometric/range data.
Verify with EXPLAIN ANALYZE that the planner uses the specialized index and that the query expression matches the index expression exactly.

Harness Integration

Type: knowledge -- this skill is a reference document, not a procedural workflow.
No tools or state -- consumed as context by other skills and agents.
related_skills: db-btree-index, db-partial-index, db-composite-index, db-explain-reading

Success Criteria

Expression indexes match the exact expression used in queries.
GIN is chosen for array, JSONB, and full-text containment queries.
GiST is chosen for geometric, range, and nearest-neighbor queries.
Anti-patterns (expression mismatch, GIN on scalars) are avoided.