From harness-claude
Implements closure table pattern storing ancestor-descendant pairs for O(1) subtree, ancestor, and path queries in SQL hierarchies like permissions, categories, and discussions.
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claudeThis skill uses the workspace's default tool permissions.
> Storing all ancestor-descendant pairs in a separate table for O(1) subtree and ancestor lookups with manageable write costs.
> Encoding hierarchy position with left/right boundary numbers for O(1) subtree and ancestor queries at the cost of expensive writes.
Designs graph schemas, models relationships, optimizes traversals and queries for SurrealDB and general graph databases. Use for knowledge graphs, social networks, recommendations, fraud detection.
Provides PostgreSQL patterns for query optimization, schema design, indexing, security, RLS, UPSERT, pagination, and anti-pattern detection based on Supabase practices. Useful for SQL queries, migrations, slow query troubleshooting, and connection pooling.
Share bugs, ideas, or general feedback.
Storing all ancestor-descendant pairs in a separate table for O(1) subtree and ancestor lookups with manageable write costs.
Two tables: the node table and a closure table storing every ancestor-descendant pair:
CREATE TABLE categories (
id serial PRIMARY KEY,
name varchar NOT NULL
);
CREATE TABLE category_paths (
ancestor_id int NOT NULL REFERENCES categories(id) ON DELETE CASCADE,
descendant_id int NOT NULL REFERENCES categories(id) ON DELETE CASCADE,
depth int NOT NULL DEFAULT 0,
PRIMARY KEY (ancestor_id, descendant_id)
);
CREATE INDEX idx_paths_descendant ON category_paths (descendant_id, depth);
Every node has a self-referencing row (ancestor=self, descendant=self, depth=0). This is critical -- it simplifies descendant-count queries and ensures consistent results.
Key queries -- all O(1) with indexes:
| Query | SQL |
|---|---|
| All descendants of X | WHERE ancestor_id = X AND depth > 0 |
| All ancestors of X | WHERE descendant_id = X AND depth > 0 |
| Direct children of X | WHERE ancestor_id = X AND depth = 1 |
| Is A an ancestor of B? | WHERE ancestor_id = A AND descendant_id = B (existence check) |
| Path length A to B | SELECT depth WHERE ancestor_id = A AND descendant_id = B |
Insert a new node under parent P:
-- Copy all ancestor paths of the parent, pointing to the new node, with depth + 1
INSERT INTO category_paths (ancestor_id, descendant_id, depth)
SELECT cp.ancestor_id, NEW_ID, cp.depth + 1
FROM category_paths cp
WHERE cp.descendant_id = P
UNION ALL
SELECT NEW_ID, NEW_ID, 0; -- self-referencing row
Delete a node and its subtree: Remove all closure rows where descendant_id is in the subtree. With ON DELETE CASCADE on the closure table FK, deleting from categories handles this automatically.
An organizational permission hierarchy:
INSERT INTO categories (id, name) VALUES
(1, 'Company'),
(2, 'Engineering'),
(3, 'Backend'),
(4, 'Frontend'),
(5, 'Design');
-- Closure table rows (all ancestor-descendant pairs):
INSERT INTO category_paths (ancestor_id, descendant_id, depth) VALUES
(1, 1, 0), (2, 2, 0), (3, 3, 0), (4, 4, 0), (5, 5, 0), -- self-refs
(1, 2, 1), (1, 3, 2), (1, 4, 2), (1, 5, 1), -- Company's descendants
(2, 3, 1), (2, 4, 1); -- Engineering's descendants
All descendants of Engineering:
SELECT c.* FROM categories c
JOIN category_paths cp ON c.id = cp.descendant_id
WHERE cp.ancestor_id = 2 AND cp.depth > 0;
-- Returns: Backend, Frontend
Is Company an ancestor of Backend?
SELECT EXISTS(
SELECT 1 FROM category_paths
WHERE ancestor_id = 1 AND descendant_id = 3
);
-- Returns: true
Adding "DevOps" under Engineering:
INSERT INTO categories (id, name) VALUES (6, 'DevOps');
INSERT INTO category_paths (ancestor_id, descendant_id, depth)
SELECT cp.ancestor_id, 6, cp.depth + 1
FROM category_paths cp WHERE cp.descendant_id = 2
UNION ALL SELECT 6, 6, 0;
-- Creates: (6,6,0), (2,6,1), (1,6,2)
(ancestor_id, depth) for descendant queries AND (descendant_id, depth) for ancestor queries.Atomic insert using a single statement:
INSERT INTO category_paths (ancestor_id, descendant_id, depth)
SELECT cp.ancestor_id, $new_id, cp.depth + 1
FROM category_paths cp
WHERE cp.descendant_id = $parent_id
UNION ALL
SELECT $new_id, $new_id, 0;
Partial index for direct-children queries (depth=1 only):
CREATE INDEX idx_paths_direct_children
ON category_paths (ancestor_id) WHERE depth = 1;
Foreign key with cascading delete ensures closure rows are cleaned up automatically:
ALTER TABLE category_paths
ADD CONSTRAINT fk_ancestor FOREIGN KEY (ancestor_id)
REFERENCES categories(id) ON DELETE CASCADE,
ADD CONSTRAINT fk_descendant FOREIGN KEY (descendant_id)
REFERENCES categories(id) ON DELETE CASCADE;
Storage overhead analysis: A balanced tree of N nodes with branching factor B and height H produces O(N * H) closure rows. For a balanced binary tree of 1000 nodes (H ~10), that is about 10K closure rows -- 10x the node count. A degenerate chain (linked list) of N nodes produces O(N^2/2) rows -- worst case.
Comparison with materialized path: The ltree extension in PostgreSQL provides path-based hierarchy queries with GiST indexing. Materialized paths use less storage (one column per row vs O(N log N) closure rows) but lack the referential integrity of closure tables. See db-hierarchical-data for a full comparison.
Subtree move operation: Moving a subtree from parent A to parent B requires: (1) DELETE all paths from ancestors-of-A to descendants-of-subtree-root, (2) INSERT new paths from ancestors-of-B to descendants-of-subtree-root. This is O(subtree_size * depth) operations.
Hybrid approach: Maintain both parent_id (adjacency list) for simple parent lookups and the closure table for complex hierarchy queries. The adjacency list parent_id column adds negligible storage and simplifies direct-parent queries.
MySQL closure tables work identically at the SQL level. The same schema, queries, and insert patterns apply.
Key MySQL differences:
(ancestor_id, depth) instead of a filtered index on depth = 1INSERT ... SELECT works the same way for populating closure rowsON DELETE CASCADE on foreign keys identicallyRBAC permission system with 200K roles in a 15-level hierarchy. The core query "does user X have permission Y?" requires ancestor traversal -- checking if any of the user's role ancestors has the target permission. With adjacency list and recursive CTE: 12ms average. With closure table: 0.3ms average (single indexed join on category_paths). The closure table added 1.2M rows (6x the node count) but the 40x read performance improvement justified the storage. Role additions happen ~50/day; the insert cost (copying ~15 ancestor rows per new role) is negligible.
(ancestor_id, depth) and (descendant_id, depth).