From shipshitdev-backend
Builds resilient data ingestion pipelines for paginated REST APIs like Twitter or exchanges, tracking progress with watermarks, avoiding duplicates, handling rate limits, and enabling incremental updates plus historical backfills.
npx claudepluginhub shipshitdev/library

This skill uses the workspace's default tool permissions.
Build data pipelines that never lose progress and never re-fetch existing data.
Track TWO cursors to support both forward and backward fetching:
| Watermark | Purpose | API Parameter |
|---|---|---|
| `newest_id` | Fetch new data since last run | `since_id` |
| `oldest_id` | Backfill older data | `until_id` |
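As a minimal sketch of the two-watermark idea, the pair can be held in one small value that is widened with the IDs seen during a run. The names `Watermarks` and `widen` are hypothetical, and the sketch assumes numeric IDs where a larger ID means a newer record.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Watermarks:
    """Hypothetical container for the two cursors of one data source."""
    newest_id: Optional[int] = None  # highest ID seen so far; sent as since_id
    oldest_id: Optional[int] = None  # lowest ID seen so far; sent as until_id


def widen(marks: Watermarks, seen_ids: Iterable[int]) -> Watermarks:
    """Return watermarks extended to cover the IDs seen in this run."""
    ids = list(seen_ids)
    if not ids:
        return marks  # nothing fetched, so neither cursor moves
    newest = ids + ([marks.newest_id] if marks.newest_id is not None else [])
    oldest = ids + ([marks.oldest_id] if marks.oldest_id is not None else [])
    return Watermarks(newest_id=max(newest), oldest_id=min(oldest))
```

An update run only ever raises `newest_id` and a backfill run only ever lowers `oldest_id`, so the same `widen` call can serve both modes.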
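A single watermark only fetches forward. Two watermarks enable:
- Incremental updates: fetch new data since the last run (`newest_id`)
- Historical backfill: fetch older data (`oldest_id`)

Saving records and updating watermarks are different operations with different timing: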
| What | When to Save | Why |
|---|---|---|
| Data records | After EACH page | Resilience: interrupted on page 47? Keep 46 pages |
| Watermarks | ONCE at end of run | Correctness: only commit progress after full success |
fetch page 1 → save records → fetch page 2 → save records → ... → update watermarks
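The flow above could be sketched roughly as follows. This is an illustration under assumptions, not the skill's reference implementation: `run_once`, the `fetch_page` callable, and the in-memory `store` and `marks` dictionaries are all hypothetical stand-ins for a real API client and database.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, int]
# Hypothetical page fetcher: takes a page token, returns (records, next token or None).
FetchPage = Callable[[Optional[str]], Tuple[List[Record], Optional[str]]]


def run_once(fetch_page: FetchPage, store: Dict[int, Record], marks: Dict[str, int]) -> int:
    """Save records after each page; move watermarks once, after the last page."""
    seen_ids: List[int] = []
    token: Optional[str] = None
    while True:
        records, token = fetch_page(token)  # may raise mid-run
        for rec in records:
            store[rec["id"]] = rec  # upsert keyed by ID, so a retry cannot duplicate
        seen_ids.extend(rec["id"] for rec in records)
        if token is None:
            break
    # Reached only if every page succeeded: commit progress exactly once.
    if seen_ids:
        marks["newest_id"] = max(marks.get("newest_id", seen_ids[0]), *seen_ids)
        marks["oldest_id"] = min(marks.get("oldest_id", seen_ids[0]), *seen_ids)
    return len(seen_ids)
```

If `fetch_page` raises on a later page, the records from earlier pages are already in `store` while `marks` is untouched, so the next run repeats the same range and the upsert absorbs the overlap.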
First run (no watermarks)?
├── YES → Full fetch (no since_id, no until_id)
└── NO → Backfill flag set?
    ├── YES → Backfill mode (until_id = oldest_id)
    └── NO → Update mode (since_id = newest_id)
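The decision tree maps naturally onto a small function that builds the request parameters. The function name `request_params` and the `backfill` flag are hypothetical. The sketch also assumes that a missing watermark of either kind means "first run", which the source does not state explicitly.

```python
from typing import Dict, Optional


def request_params(
    newest_id: Optional[int],
    oldest_id: Optional[int],
    backfill: bool = False,
) -> Dict[str, int]:
    """Pick the fetch mode from the stored watermarks and the backfill flag."""
    if newest_id is None or oldest_id is None:
        return {}  # first run: full fetch, no since_id and no until_id
    if backfill:
        return {"until_id": oldest_id}  # backfill mode: walk backward from the oldest record
    return {"since_id": newest_id}  # update mode: only data newer than the last run
```

For example, `request_params(500, 120)` selects update mode, while `request_params(500, 120, backfill=True)` selects backfill mode.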
This pattern works best with ID-based pagination (numeric IDs that can be compared). For other pagination types:
| Type | Adaptation |
|---|---|
| Cursor/token | Store cursor string instead of ID; can't compare numerically |
| Timestamp | Use last_timestamp column; compare as dates |
| Offset/limit | Store page number; resume from last saved page |
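For the timestamp row, "compare as dates" matters because string comparison can give the wrong answer when offsets differ. A minimal sketch, assuming ISO 8601 strings that all carry an explicit UTC offset (the helper name `advance_timestamp` is hypothetical):

```python
from datetime import datetime
from typing import Optional


def advance_timestamp(last_timestamp: Optional[str], candidate: str) -> str:
    """Keep whichever timestamp is later, comparing parsed datetimes, not strings."""
    if last_timestamp is None:
        return candidate  # no watermark yet, so the candidate becomes the watermark
    if datetime.fromisoformat(candidate) > datetime.fromisoformat(last_timestamp):
        return candidate
    return last_timestamp
```

Here `"2024-01-01T23:00:00-05:00"` is later than `"2024-01-02T00:00:00+00:00"` even though it sorts earlier as a string, so the parsed comparison moves the watermark forward where a string comparison would not.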
See references/patterns.md for schemas and code examples.