Guides cursor-based pagination with opaque tokens to prevent page drift from inserts/deletes in feeds, logs, timelines, and large datasets. Covers forward/backward traversal.
> Cursor-based pagination replaces numeric offsets with opaque position tokens — each cursor encodes exactly where the client left off, eliminating page drift when rows are inserted or deleted between requests and enabling consistent traversal of live datasets.
Use when designing next_page_token or after cursor parameters on list endpoints.

Opaque Cursor — A cursor is a position token that is intentionally opaque to the client. The server encodes whatever internal position data it needs (a timestamp, a sort key, a composite of multiple columns) and base64-encodes the result so clients cannot parse or construct cursors manually. Opacity allows the server to change the internal encoding without a breaking API change.
Internal: { "id": 8821, "created_at": "2024-03-15T10:22:00Z" }
Encoded: eyJpZCI6ODgyMSwiY3JlYXRlZF9hdCI6IjIwMjQtMDMtMTVUMTA6MjI6MDBaIn0=
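The encode/decode round trip above can be sketched with the standard library (the compact JSON payload is the one shown; helper names are illustrative):

```python
import base64
import json

def encode_cursor(position: dict) -> str:
    """Serialize internal position data and base64-encode it into an opaque token."""
    raw = json.dumps(position, separators=(",", ":")).encode()
    return base64.b64encode(raw).decode()

def decode_cursor(token: str) -> dict:
    """Decode an opaque cursor back into the server's internal position data."""
    return json.loads(base64.b64decode(token))

cursor = encode_cursor({"id": 8821, "created_at": "2024-03-15T10:22:00Z"})
# cursor == "eyJpZCI6ODgyMSwiY3JlYXRlZF9hdCI6IjIwMjQtMDMtMTVUMTA6MjI6MDBaIn0="
```

Because the token is opaque, the server is free to later switch the payload to, say, a signed binary format without any client-visible change.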
Forward Pagination — The client passes an after cursor to retrieve the page that follows a known position. The server decodes the cursor, constructs a WHERE clause that selects rows strictly after the encoded position, and applies LIMIT n. The response includes a next_cursor when more results exist and omits it (or returns null) when the collection is exhausted.
Backward Pagination — The client passes a before cursor to traverse in reverse. The server selects rows strictly before the encoded position with the sort direction reversed, then re-reverses the result set before returning it so the response order matches the canonical collection order. Not all APIs implement backward pagination — document the capability explicitly.
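The reverse-then-re-reverse step can be sketched over an in-memory collection standing in for the SQL query the server would run (schema and canonical order, id DESC, are illustrative):

```python
# Canonical collection order: id DESC (newest first).
rows = [{"id": i} for i in range(100, 0, -1)]

def page_before(before_id: int, limit: int) -> list:
    """Select rows strictly before the cursor position with the sort
    direction reversed (id ASC), then re-reverse the result so the
    response matches the canonical id DESC order."""
    # Rows "before" the cursor in canonical DESC order have a HIGHER id;
    # scan them in the reversed (ASC) direction and take the first `limit`.
    reversed_scan = sorted((r for r in rows if r["id"] > before_id),
                           key=lambda r: r["id"])[:limit]
    return list(reversed(reversed_scan))  # restore canonical DESC order

page_before(90, 3)  # → [{"id": 93}, {"id": 92}, {"id": 91}]
```

Without the final re-reverse, the page would arrive in ascending order and clients would have to know which direction they paginated in — re-reversing keeps the response contract identical for both directions.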
Cursor Stability Guarantees — A cursor must remain valid for a reasonable window (typically 24–72 hours). If the sort key column is mutable (e.g., updated_at), cursor-encoded values may no longer identify the same logical position after an update. Prefer immutable sort keys (e.g., monotonic id or created_at) for cursor stability. Document the expiry window in the API reference.
Page Size Negotiation — Accept a limit (or per_page) parameter with a capped maximum. Return the actual page size used in the response alongside the cursor, even when the client specifies limit. This allows clients to detect when the server reduced their requested page size.
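A minimal clamping sketch (the default and maximum values here are assumptions, not prescribed by the text):

```python
DEFAULT_LIMIT = 25   # illustrative defaults
MAX_LIMIT = 100

def negotiate_limit(requested) -> int:
    """Clamp the client-requested page size to the server-side cap."""
    if requested is None:
        return DEFAULT_LIMIT
    return max(1, min(int(requested), MAX_LIMIT))

# Echo the effective size back in the response so clients can detect
# that a requested limit of 500 was reduced:
# {"data": [...], "limit": negotiate_limit(500), "next_cursor": "..."}
```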
has_more Flag — Rather than computing total count (expensive), return a boolean has_more (or has_next_page / has_previous_page in Relay). To determine has_more, request LIMIT n+1 rows from the database and return only n — if the extra row exists, has_more = true.
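The n+1 trick, sketched with a list standing in for the database fetch:

```python
def paginate(rows: list, limit: int) -> dict:
    """Fetch limit+1 rows; the extra row signals has_more without COUNT(*)."""
    window = rows[:limit + 1]        # stands in for `LIMIT limit+1` in SQL
    has_more = len(window) > limit
    return {"data": window[:limit], "has_more": has_more}
```

The extra row is discarded, never returned — its only job is to prove that at least one more page exists.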
GitHub's REST API delivers cursor-based pagination via Link headers, and its GraphQL API exposes an opaque cursor parameter directly. The REST List Issues endpoint demonstrates the pattern:
Request page 1:
GET /repos/octocat/hello-world/issues?per_page=2&state=open
Authorization: Bearer ghp_...
HTTP/1.1 200 OK
Link: <https://api.github.com/repos/octocat/hello-world/issues?per_page=2&state=open&after=Y3Vyc29yOnYyOpHOAAFBUQ%3D%3D>; rel="next"
Content-Type: application/json
[
{ "id": 100, "number": 42, "title": "Fix build" },
{ "id": 99, "number": 41, "title": "Add tests" }
]
Request the next page using the cursor from the Link header:
GET /repos/octocat/hello-world/issues?per_page=2&state=open&after=Y3Vyc29yOnYyOpHOAAFBUQ%3D%3D
Authorization: Bearer ghp_...
HTTP/1.1 200 OK
Link: <https://api.github.com/repos/octocat/hello-world/issues?per_page=2&state=open&after=Y3Vyc29yOnYyOpHOAAFBUQ%3D%3D>; rel="prev",
<https://api.github.com/repos/octocat/hello-world/issues?per_page=2&state=open&after=Y3Vyc29yOnYyOpHOAAFBUM%3D%3D>; rel="next"
[
{ "id": 98, "number": 40, "title": "Update docs" },
{ "id": 97, "number": 39, "title": "Refactor auth" }
]
The GitHub GraphQL API uses the Relay connection spec with edges, node, pageInfo.endCursor, pageInfo.hasNextPage — a canonical reference implementation of cursor pagination at scale.
Exposing raw database IDs as cursors. Returning "cursor": "8821" leaks internal implementation details, lets clients construct arbitrary cursors that bypass intended access controls, and makes it impossible to change the internal ID scheme. Always base64-encode an opaque payload. Even if the payload is just an integer today, opacity preserves the option to change it.
Using mutable columns as the sole cursor key. If the cursor encodes updated_at and rows are frequently updated, a client mid-pagination may re-see rows it already received or skip rows whose updated_at was bumped between pages. Use an immutable monotonic column (id, created_at) as the primary cursor key. If the client must sort by a mutable field, use a composite cursor of (mutable_field, id) so id breaks ties and the position remains unique and stable.
Returning a total count alongside every cursor page. SELECT COUNT(*) on large tables acquires a table scan or index scan that is often more expensive than the page query itself. Cursor pagination's primary advantage over offset is avoiding this scan. If a UI needs a total count, compute it asynchronously and cache it, or switch to an approximate count (pg_class.reltuples in PostgreSQL). Never block the paginated response on a live COUNT.
Accepting client-constructed cursors without validation. If a client can construct an arbitrary cursor value, it may bypass row-level security, access soft-deleted records, or enumerate internal IDs. Always decode, validate schema, and verify the encoded values fall within the caller's access scope before using cursor data in a query.
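A validation sketch for the decode/schema/scope steps (the error codes and the allowed_ids stand-in for row-level access scope are hypothetical):

```python
import base64
import binascii
import json

class InvalidCursor(Exception):
    """Maps to 400 Bad Request with a machine-readable error code."""

def validate_cursor(token: str, allowed_ids: set) -> dict:
    """Decode, schema-check, and scope-check a client-supplied cursor
    before it is ever used in a query."""
    try:
        payload = json.loads(base64.b64decode(token, validate=True))
    except (binascii.Error, ValueError):
        raise InvalidCursor("cursor_malformed")
    if not isinstance(payload, dict) or set(payload) != {"id", "created_at"}:
        raise InvalidCursor("cursor_invalid_schema")
    if payload["id"] not in allowed_ids:   # caller's access scope (stand-in)
        raise InvalidCursor("cursor_out_of_scope")
    return payload
```

Rejecting at the edge keeps forged cursors from ever reaching the query layer, where they could otherwise widen the WHERE clause beyond the caller's scope.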
The Relay cursor connection specification defines a canonical shape that many REST and GraphQL APIs follow:
{
"data": {
"edges": [
{ "cursor": "Y3Vyc29yMQ==", "node": { "id": "1", "name": "Alice" } },
{ "cursor": "Y3Vyc29yMg==", "node": { "id": "2", "name": "Bob" } }
],
"pageInfo": {
"startCursor": "Y3Vyc29yMQ==",
"endCursor": "Y3Vyc29yMg==",
"hasNextPage": true,
"hasPreviousPage": false
}
}
}
Adopting this shape for REST responses reduces integration friction for teams already using Relay on the frontend and documents a well-understood contract.
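Assembling that shape server-side is mechanical; a sketch (not the official Relay library — the simple base64 cursor and node fields are assumptions):

```python
import base64

def encode_cursor(value: str) -> str:
    # Illustrative: real cursors would encode a richer position payload.
    return base64.b64encode(value.encode()).decode()

def to_connection(nodes: list, has_next: bool, has_prev: bool) -> dict:
    """Wrap a page of nodes in the Relay connection shape."""
    edges = [{"cursor": encode_cursor(n["id"]), "node": n} for n in nodes]
    return {
        "edges": edges,
        "pageInfo": {
            "startCursor": edges[0]["cursor"] if edges else None,
            "endCursor": edges[-1]["cursor"] if edges else None,
            "hasNextPage": has_next,
            "hasPreviousPage": has_prev,
        },
    }
```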
For a table sorted by created_at DESC, id DESC, the forward-pagination query given a decoded cursor {created_at: "2024-03-15T10:22:00Z", id: 8821} is:
SELECT * FROM events
WHERE (created_at, id) < ('2024-03-15T10:22:00Z', 8821)
ORDER BY created_at DESC, id DESC
LIMIT 26 -- request 26 to detect has_next_page; return 25
This row-value comparison is supported by PostgreSQL, MySQL 8+, and SQLite and uses composite indexes efficiently without an OFFSET scan.
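As a quick check, the same row-value comparison runs against SQLite's bundled driver (table and data are illustrative; SQLite supports row values since 3.15):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, created_at TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(i, f"2024-03-15T10:{i:02d}:00Z") for i in range(1, 31)])

cursor_pos = ("2024-03-15T10:22:00Z", 22)  # decoded cursor payload
rows = conn.execute(
    """SELECT id, created_at FROM events
       WHERE (created_at, id) < (?, ?)
       ORDER BY created_at DESC, id DESC
       LIMIT 4""",            # n+1 = 4 to detect has_more for a page of 3
    cursor_pos,
).fetchall()
page, has_more = rows[:3], len(rows) > 3
# page holds ids 21, 20, 19; has_more is True
```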
Slack's Web API uses cursor pagination (next_cursor in response_metadata) for all list endpoints including conversations.list, users.list, and channels.history. Before migrating from offset to cursor pagination, Slack reported that deep-offset queries on the channels.history endpoint (messages at offset 50,000+) caused p99 response times exceeding 8 seconds on large workspaces. After the migration to cursor-based pagination using composite (ts, channel_id) cursors, p99 latency for equivalent page fetches dropped to under 80ms — a 100x improvement — because each page query became a bounded index seek rather than a full table scan.
Checklist:
- Prefer immutable sort keys (id, created_at). If the client sorts by a mutable field, design a composite cursor of (sort_field, id).
- Use row-value comparison (WHERE (col_a, col_b) < (val_a, val_b)) with LIMIT n+1 to detect has_more without a COUNT query.
- Return next_cursor only when more results exist (omit the field or return null when has_more is false). Include has_more as an explicit boolean.
- Determine has_more by fetching n+1 rows, never by COUNT(*).
- Reject expired cursors with 400 Bad Request and a machine-readable error code (e.g., cursor_expired).
- Reject malformed or out-of-scope cursors with 400 Bad Request and a machine-readable error code, not a 500.