Install
Install the plugin:

npx claudepluginhub majesticlabs-dev/majestic-marketplace --plugin majestic-data
Description
Production ETL patterns orchestrator. Routes to core reliability patterns and incremental load strategies.
Tool Access
This skill is limited to using the following tools:
Read, Write, Edit, Grep, Glob, Bash
Skill Content
ETL Patterns
Orchestrator for production-grade Extract-Transform-Load patterns.
Skill Routing
| Need | Skill | Content |
|---|---|---|
| Reliability patterns | etl-core-patterns | Idempotency, checkpointing, error handling, chunking, retry, logging |
| Load strategies | etl-incremental-patterns | Backfill, timestamp-based, CDC, pipeline orchestration |
Pattern Selection Guide
By Reliability Need
| Need | Pattern | Skill |
|---|---|---|
| Repeatable runs | Idempotency | etl-core-patterns |
| Resume after failure | Checkpointing | etl-core-patterns |
| Handle bad records | Error handling + DLQ | etl-core-patterns |
| Memory management | Chunked processing | etl-core-patterns |
| Network resilience | Retry with backoff | etl-core-patterns |
| Observability | Structured logging | etl-core-patterns |
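As one reliability pattern from the table above, retry with exponential backoff can be sketched in a few lines. This is an illustrative stand-in, not the skill's actual helper; the function name and parameters are assumptions.

```python
import random
import time


def retry_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Call fn(), retrying transient failures with exponential backoff.

    Delay doubles each attempt, is capped at max_delay, and full jitter
    (a random fraction of the cap) spreads out retries from many workers.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            time.sleep(random.uniform(0, delay))
```

In a real pipeline you would catch only transient error types (timeouts, connection resets) rather than bare `Exception`.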
By Load Strategy
| Scenario | Pattern | Skill |
|---|---|---|
| Small tables (<100K) | Full refresh | etl-incremental-patterns |
| Large tables | Timestamp incremental | etl-incremental-patterns |
| Real-time sync | CDC events | etl-incremental-patterns |
| Historical migration | Parallel backfill | etl-incremental-patterns |
| Zero-downtime refresh | Swap pattern | etl-incremental-patterns |
| Multi-step pipelines | Pipeline orchestration | etl-incremental-patterns |
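The timestamp-incremental row in the table can be sketched as a filter on a last-processed watermark. This is a hypothetical stand-in that takes the batch and watermark explicitly; in practice the filter would be pushed into the source query (`WHERE updated_at > :last_processed`) so unchanged rows never leave the database.

```python
from datetime import datetime


def incremental_by_timestamp(rows, ts_key, last_processed):
    """Return only records modified after the last-processed watermark.

    rows: list of dicts standing in for the extracted batch.
    ts_key: name of the reliable timestamp column (e.g. 'updated_at').
    last_processed: ISO-8601 string from the previous run's checkpoint.
    """
    cutoff = datetime.fromisoformat(last_processed)
    return [r for r in rows if r[ts_key] > cutoff]
```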
Quick Reference
Idempotency Options
# Small datasets: Delete-then-insert
# Large datasets: UPSERT on conflict
# Change detection: Row hash comparison
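The third option, row-hash change detection, can be sketched as a stable hash over a record's non-key fields; rows whose hash matches the value stored in the target are skipped on re-runs, which makes repeated loads idempotent. The function name and key-field convention here are illustrative assumptions.

```python
import hashlib
import json


def row_hash(record, key_fields=("id",)):
    """Stable SHA-256 hash of a record's non-key fields.

    Sorting the fields and using sort_keys makes the hash independent of
    dict insertion order; default=str covers dates and other non-JSON types.
    """
    payload = {k: v for k, v in sorted(record.items()) if k not in key_fields}
    blob = json.dumps(payload, sort_keys=True, default=str).encode()
    return hashlib.sha256(blob).hexdigest()
```

Two records that differ only in their key fields hash identically, while any change to a data field produces a new hash.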
Load Strategy Decision
Is table < 100K rows?
→ Full refresh
Has reliable timestamp column?
→ Timestamp incremental
Source supports CDC?
→ CDC event processing
Need zero downtime?
→ Swap pattern (temp table → rename)
One-time historical load?
→ Parallel backfill with date ranges
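The decision tree above can be encoded as a small selector function. The priority order and the 100K threshold follow the guide; treating one-time backfill and zero-downtime as overriding concerns is an assumption about how the questions compose.

```python
def choose_load_strategy(row_count, has_timestamp=False, supports_cdc=False,
                         zero_downtime=False, one_time_backfill=False):
    """Pick a load strategy per the decision guide; thresholds illustrative."""
    if one_time_backfill:
        return "parallel_backfill"       # one-time historical load
    if zero_downtime:
        return "swap"                    # temp table then rename
    if row_count < 100_000:
        return "full_refresh"            # small enough to reload wholesale
    if has_timestamp:
        return "timestamp_incremental"   # reliable updated_at column
    if supports_cdc:
        return "cdc"                     # source emits change events
    return "full_refresh"                # fallback when nothing else applies
```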
Common Pipeline Structure
# 1. Setup
checkpoint = Checkpoint('.etl_checkpoint.json')
processor = ETLProcessor()
# 2. Extract (with incremental)
df = incremental_by_timestamp(source_table, 'updated_at')
# 3. Transform (with error handling)
transformed = processor.process_batch(df.to_dict('records'))
# 4. Load (with idempotency)
upsert_records(pd.DataFrame(transformed))
# 5. Checkpoint
checkpoint.set_last_processed('sync', df['updated_at'].max())
# 6. Handle failures
processor.save_failures('failures/')
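The structure above leans on a `Checkpoint` object; a minimal JSON-file version might look like the sketch below. This is a hypothetical implementation, not the skill's actual class, though the `set_last_processed` method name matches the pipeline sketch.

```python
import json
import os
import tempfile


class Checkpoint:
    """Tiny JSON-file checkpoint store keyed by pipeline step name."""

    def __init__(self, path):
        self.path = path
        self._state = {}
        if os.path.exists(path):
            with open(path) as f:
                self._state = json.load(f)

    def get_last_processed(self, key, default=None):
        return self._state.get(key, default)

    def set_last_processed(self, key, value):
        self._state[key] = str(value)
        # Write to a temp file then atomically rename, so a crash
        # mid-write cannot leave a corrupt checkpoint behind.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(self.path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(self._state, f)
        os.replace(tmp, self.path)
```

On the next run, `get_last_processed('sync')` feeds the incremental extract so the pipeline resumes where it left off.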
Related Skills
data-validation - Validate data quality during ETL
data-quality - Monitor data quality metrics
pandas-coder - DataFrame transformations
Stats
Stars: 30
Forks: 6
Last commit: Feb 15, 2026