From example-skills
Designs ETL/ELT data pipelines with proper extraction, transformation, and loading patterns, including orchestration, error handling, and data quality validation.
npx claudepluginhub organvm-iv-taxis/a-i--skills --plugin document-skills
This skill uses the workspace's default tool permissions.
This skill provides guidance for designing robust, scalable data pipelines that move data reliably from sources to destinations.
To begin pipeline design, gather:
Batch Pipelines - For periodic bulk processing:
Streaming Pipelines - For real-time requirements:
Hybrid Approaches - Lambda or Kappa architecture:
ETL (Transform before Load):
ELT (Transform after Load):
Extraction Layer:
Transformation Layer:
Loading Layer:
┌─────────────────────────────────────────────────────────┐
│                   Pipeline Execution                    │
├─────────────────────────────────────────────────────────┤
│  ┌───────────┐    ┌─────────────┐    ┌─────────────┐    │
│  │  Extract  │───▶│  Transform  │───▶│    Load     │    │
│  └─────┬─────┘    └──────┬──────┘    └──────┬──────┘    │
│        │                 │                  │           │
│        ▼                 ▼                  ▼           │
│  ┌───────────┐    ┌─────────────┐    ┌─────────────┐    │
│  │   Retry   │    │ Dead Letter │    │  Rollback   │    │
│  │ w/Backoff │    │    Queue    │    │ Checkpoint  │    │
│  └───────────┘    └─────────────┘    └─────────────┘    │
└─────────────────────────────────────────────────────────┘
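The first two recovery paths in the diagram (retry with backoff on extraction, a dead letter queue on transformation) can be sketched in plain Python. This is a minimal illustration, not a prescribed implementation: `retry_with_backoff`, `transform_with_dlq`, and their parameters are hypothetical names, and the dead letter "queue" here is just an in-memory list standing in for whatever durable store a real pipeline would use.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def retry_with_backoff(fn: Callable[[], T], attempts: int = 3,
                       base_delay: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on any exception with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # base_delay, 2x, 4x, ...
    raise ValueError("attempts must be >= 1")


def transform_with_dlq(records: Iterable[dict],
                       transform: Callable[[dict], dict]):
    """Transform each record; failed records go to a dead letter list
    (hypothetical stand-in for a real queue) instead of failing the batch."""
    good, dead_letters = [], []
    for record in records:
        try:
            good.append(transform(record))
        except Exception as exc:
            dead_letters.append({"record": record, "error": repr(exc)})
    return good, dead_letters
```

Passing `sleep` as a parameter keeps the retry helper testable without real waiting. In practice the `except Exception` clause would usually be narrowed to errors known to be transient.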
Implement checks at each stage:
| Stage | Check Type | Example |
|---|---|---|
| Extract | Completeness | Row count matches source |
| Extract | Freshness | Data timestamp within SLA |
| Transform | Validity | Values in expected ranges |
| Transform | Uniqueness | Primary keys unique |
| Load | Reconciliation | Target matches source totals |
| Load | Integrity | Foreign keys valid |
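Several of the checks in the table reduce to small predicates over a batch of rows. The sketch below assumes rows are plain dictionaries and that the expected row count and SLA are supplied by the caller; all function names (`check_completeness`, `run_checks`, and so on) are illustrative and not taken from any validation framework.

```python
from datetime import datetime, timedelta, timezone


def check_completeness(rows: list, expected_count: int) -> bool:
    """Extract: row count matches the count reported by the source."""
    return len(rows) == expected_count


def check_freshness(latest: datetime, sla: timedelta, now: datetime) -> bool:
    """Extract: newest data timestamp is within the SLA window."""
    return now - latest <= sla


def check_validity(rows: list, column: str, low: float, high: float) -> bool:
    """Transform: every value in the column lies in the expected range."""
    return all(low <= row[column] <= high for row in rows)


def check_uniqueness(rows: list, key: str) -> bool:
    """Transform: primary key values are unique."""
    keys = [row[key] for row in rows]
    return len(keys) == len(set(keys))


def run_checks(checks: dict) -> list:
    """Return the names of failed checks (an empty list means all passed)."""
    return [name for name, passed in checks.items() if not passed]


# Example batch (made-up data for illustration only).
rows = [{"id": 1, "amount": 20.0}, {"id": 2, "amount": 35.5}]
now = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
failed = run_checks({
    "completeness": check_completeness(rows, expected_count=2),
    "freshness": check_freshness(now - timedelta(minutes=30), timedelta(hours=1), now),
    "validity": check_validity(rows, "amount", 0, 10_000),
    "uniqueness": check_uniqueness(rows, "id"),
})
```

A pipeline stage could fail fast when `failed` is non-empty, or record the failures and continue, depending on how strict the stage needs to be.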
Essential metrics to track:
Alert on:
-- Timestamp-based incremental
SELECT * FROM source
WHERE updated_at > {{ last_run_timestamp }}
-- CDC-based (Change Data Capture)
-- Captures inserts, updates, deletes from transaction log
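The timestamp-based variant above can be sketched end to end with an in-memory SQLite database. This is an illustration under stated assumptions: the `source` table layout, the ISO-8601 text timestamps, and the `extract_incremental` helper are all hypothetical, and a real pipeline would persist the watermark between runs rather than hold it in a variable.

```python
import sqlite3


def extract_incremental(conn: sqlite3.Connection, last_run_timestamp: str):
    """Return rows changed since the watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, updated_at FROM source "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_run_timestamp,),  # bound parameter instead of string templating
    ).fetchall()
    # Advance the watermark only if something was extracted.
    new_watermark = rows[-1][1] if rows else last_run_timestamp
    return rows, new_watermark


# Hypothetical source table with ISO-8601 timestamps stored as text,
# which compare correctly as strings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, updated_at TEXT)")
conn.executemany(
    "INSERT INTO source VALUES (?, ?)",
    [
        (1, "2024-01-14T09:00:00"),
        (2, "2024-01-15T08:30:00"),
        (3, "2024-01-15T11:45:00"),
    ],
)

rows, watermark = extract_incremental(conn, "2024-01-15T00:00:00")
rerun_rows, rerun_watermark = extract_incremental(conn, watermark)
```

With a strict `>` comparison, a row committed later with exactly the same timestamp as the watermark would be skipped, which is one reason log-based CDC is often preferred when deletes or exact change ordering matter.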
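-- Delete + Insert pattern
-- Run both statements in one transaction so readers never see an empty partition
-- (transaction syntax varies by engine)
BEGIN;
DELETE FROM target WHERE date_partition = '2024-01-15';
INSERT INTO target SELECT * FROM staging WHERE date_partition = '2024-01-15';
COMMIT;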
-- Merge/Upsert pattern
MERGE INTO target t
USING staging s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT ...
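Both loading patterns above are idempotent: rerunning them for the same input leaves the target unchanged. The sketch below demonstrates that property with SQLite. Two assumptions to note: the `staging` and `target` schemas are invented for the example, and because SQLite has no `MERGE` statement, the upsert is written with `INSERT ... ON CONFLICT DO UPDATE` (SQLite 3.24 or later) as a stand-in for the merge pattern.

```python
import sqlite3


def load_partition(conn: sqlite3.Connection, date_partition: str) -> None:
    """Delete + insert one partition inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "DELETE FROM target WHERE date_partition = ?", (date_partition,)
        )
        conn.execute(
            "INSERT INTO target SELECT * FROM staging WHERE date_partition = ?",
            (date_partition,),
        )


def upsert_from_staging(conn: sqlite3.Connection) -> None:
    """Upsert keyed on id; stands in for MERGE, which SQLite lacks."""
    with conn:
        conn.execute(
            """
            INSERT INTO target (id, date_partition, amount)
            SELECT id, date_partition, amount FROM staging WHERE true
            ON CONFLICT(id) DO UPDATE SET
                date_partition = excluded.date_partition,
                amount = excluded.amount
            """
        )


# Hypothetical schemas and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staging (id INTEGER PRIMARY KEY, date_partition TEXT, amount REAL)"
)
conn.execute(
    "CREATE TABLE target (id INTEGER PRIMARY KEY, date_partition TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO staging VALUES (?, ?, ?)",
    [(1, "2024-01-15", 10.0), (2, "2024-01-15", 25.0)],
)

# Running each load twice simulates a retried pipeline run.
for _ in range(2):
    load_partition(conn, "2024-01-15")
    upsert_from_staging(conn)

row_count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

The `WHERE true` on the `SELECT` is there because SQLite's parser otherwise cannot tell `ON CONFLICT` apart from a join constraint in an `INSERT ... SELECT`.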
- references/orchestration-patterns.md - Airflow, Dagster, Prefect patterns
- references/data-quality-checks.md - Validation frameworks and rules
- references/pipeline-templates.md - Common pipeline architectures