Implement data pipelines with SQLAlchemy 2.0 patterns (TypeDecorator, hybrid properties, events) and warehouse architectures (Kimball star schemas, medallion layers, SCD handling). Covers ETL/ELT design, dim_/fact_/stg_ conventions, and orchestration with dbt, Airflow, or Dagster. Reference for API-to-database flows or dimensional modeling with Python and PostgreSQL.
npx claudepluginhub aeyeops/aeo-skill-marketplace --plugin aeo-python

This skill uses the workspace's default tool permissions.
Comprehensive patterns for building production-grade data pipelines, dimensional models, and API integrations using SQLAlchemy 2.0+ and modern data warehousing practices.
Bundled reference files:
integration/field-mapping.md
integration/incremental-sync.md
integration/schema-resilience.md
integration/uv-package-management.md
pydantic/api-validation.md
pydantic/settings-management.md
sqlalchemy/factory-patterns.md
sqlalchemy/migrations.md
sqlalchemy/query-patterns.md
sqlalchemy/repository-pattern.md
sqlalchemy/type-decorators.md
warehouse/naming-conventions.md
warehouse/scd-patterns.md
when-to-build-vs-buy.md
Apply this skill when building custom Python data pipelines with SQLAlchemy and PostgreSQL.
Use this skill for: ETL/ELT pipeline design, dimensional modeling (star schemas, SCD handling), API-to-database integration flows, and custom transformation logic in Python.
Consider frameworks FIRST for common SaaS integrations: existing connectors (Airbyte, Fivetran, Meltano) are faster to set up and maintained for you, and dbt covers post-load SQL transformations.
Build custom Python when: no connector exists, the integration needs complex transformations or custom logic, or you need full control over the pipeline.
Need to integrate external data source?
│
├─ Check existing connectors (Airbyte, Fivetran, Meltano)
│ │
│ ├─ Connector exists + meets requirements
│ │ └─ [USE FRAMEWORK] - Faster, maintained, documented
│ │
│ └─ No connector OR custom logic needed
│ │
│ ├─ Simple transformation (SQL only)
│ │ └─ [USE dbt] - Post-load SQL transformations
│ │
│ └─ Complex transformation OR custom logic
│ └─ [BUILD CUSTOM PYTHON] - Full control, use patterns in this skill
│
└─ Consider:
• Maintenance cost (custom code = ongoing maintenance)
• Time to market (framework = faster initial setup)
• Customization needs (custom code = unlimited flexibility)
• Team expertise (SQL vs Python)
Decision guide: See when-to-build-vs-buy.md for detailed analysis
SQLAlchemy:
  TypeDecorators (custom column types): See sqlalchemy/type-decorators.md
  Factory methods (from_dict, from_api): See sqlalchemy/factory-patterns.md
  Query patterns: See sqlalchemy/query-patterns.md
  Repository pattern: See sqlalchemy/repository-pattern.md
  Migrations with Alembic: See sqlalchemy/migrations.md
Pydantic:
  API validation: See pydantic/api-validation.md
  Settings management: See pydantic/settings-management.md
Warehouse:
  Naming conventions (dim_, fact_): See warehouse/naming-conventions.md
  Slowly Changing Dimensions: See warehouse/scd-patterns.md
Integration:
  Field mapping strategies: See integration/field-mapping.md
  Incremental sync (high-water mark): See integration/incremental-sync.md
  Schema resilience (JSONB): See integration/schema-resilience.md
  UV package management (local deps): See integration/uv-package-management.md
This skill teaches the three-layer separation for data pipelines:
Source System Layer (libs.external_api)
└─ HTTP client, authentication, queries
└─ Source-specific quirks and normalization
Database Schema Layer (libs.database)
└─ Models (DimCustomer, FactOrders, etc.)
└─ Mappers (API → Model transformations)
└─ Custom types (TypeDecorators)
Application Layer (your applications)
└─ Orchestration (when, what, how to sync)
└─ Business logic (analysis, reporting)
└─ CLI/API interfaces
Why this separation: each layer can be developed, tested, and replaced independently. Source-system quirks stay in the source layer, schema and transformation concerns stay in the database layer, and orchestration and business logic stay in the application layer.
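The three layers above can be sketched in miniature. All names here are illustrative, and the fake API stands in for a real HTTP client:

```python
# Source System Layer: wraps the external API and its quirks
class FakeCustomerAPI:
    def fetch_customers(self) -> list[dict]:
        # A real client would do HTTP + auth here
        return [{"customer_id": "C001", "name": " Test Corp "}]

# Database Schema Layer: mapper normalizes source quirks into model fields
def to_customer_row(raw: dict) -> dict:
    return {"customer_id": raw["customer_id"], "name": raw["name"].strip()}

# Application Layer: orchestration decides when and what to sync
def sync_customers(api, sink: list) -> int:
    rows = [to_customer_row(raw) for raw in api.fetch_customers()]
    sink.extend(rows)
    return len(rows)
```

Because the layers only meet at plain data structures, each one can be swapped or tested in isolation, as the fake API shows.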
Problem: Transform external API response to SQLAlchemy model with type conversions
Solution: Factory classmethod + TypeDecorators
class Customer(Base):
    __tablename__ = "dim_customer"

    customer_id: Mapped[str] = mapped_column(primary_key=True)
    email: Mapped[str] = mapped_column(NormalizedEmail)
    is_active: Mapped[bool] = mapped_column(BooleanStringType)

    @classmethod
    def from_api_response(cls, data: dict, sync_time: datetime) -> "Customer":
        # Transformation logic here
        ...
See sqlalchemy/factory-patterns.md for complete pattern.
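Custom column types like NormalizedEmail are built with SQLAlchemy's TypeDecorator. The real implementation lives in sqlalchemy/type-decorators.md; this is a minimal sketch of what such a type might look like, with its behavior assumed rather than taken from that file:

```python
from sqlalchemy.types import String, TypeDecorator

class NormalizedEmail(TypeDecorator):
    """Lowercases and strips emails on the way into the database."""
    impl = String
    cache_ok = True  # stateless conversion, safe for statement caching

    def process_bind_param(self, value, dialect):
        # Python value -> database value
        return value.strip().lower() if value is not None else None

    def process_result_value(self, value, dialect):
        # Database value -> Python value (already normalized on write)
        return value
```

Setting cache_ok = True avoids SQLAlchemy's cache warning and is correct here because the conversion depends on no per-instance state.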
Problem: SaaS system adds/removes custom fields, breaking rigid schema
Solution: JSONB column with lifecycle metadata
class FlexibleRecordMixin:
    custom_fields: Mapped[dict] = mapped_column(JSONB)
    # Structure: {"field": {"value": ..., "first_seen": ..., "deprecated": bool}}
See integration/schema-resilience.md for complete pattern.
Problem: Re-fetching all data is slow; need to sync only changed records
Solution: High-water mark pattern with sync metadata
max_date = session.scalar(select(func.max(Customer.last_modified)))
api_response = client.get(f"/customers?modified_since={max_date}")
See integration/incremental-sync.md for complete pattern.
Problem: Combine data from multiple external systems into unified schema
Solution: Conformed dimensions + source-specific mappers
class Customer(Base):
    source_system: Mapped[str]  # "api_a", "api_b", "internal"
    source_id: Mapped[str]      # Original system's ID

class APIACustomerMapper:
    @classmethod
    def to_customer(cls, data: dict) -> Customer:
        # API A schema → unified Customer schema
        ...

class APIBCustomerMapper:
    @classmethod
    def to_customer(cls, data: dict) -> Customer:
        # API B schema → unified Customer schema
        ...
Each mapper handles its source's quirks independently.
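To make the mapper idea concrete, here is a sketch with two invented source schemas. The field names ("fullName", "customer_key") are hypothetical, and plain functions stand in for the mapper classes:

```python
def map_api_a(data: dict) -> dict:
    # API A (hypothetical schema): {"id": 7, "fullName": "Acme"}
    return {"source_system": "api_a",
            "source_id": str(data["id"]),      # A uses integer IDs
            "name": data["fullName"]}

def map_api_b(data: dict) -> dict:
    # API B (hypothetical schema): {"customer_key": "K1", "name": "Beta"}
    return {"source_system": "api_b",
            "source_id": data["customer_key"],
            "name": data["name"]}
```

Downstream code sees one shape regardless of source, and (source_system, source_id) stays unique because each mapper stamps its own system name.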
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import Session
from datetime import datetime, UTC
# Base and Customer come from your models module

@pytest.fixture
def db_session():
    engine = create_engine("sqlite:///:memory:")
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        yield session

def test_api_to_model_transformation(db_session):
    api_response = {"customer_id": "C001", "name": "Test Corp", "email": "test@example.com"}
    customer = Customer.from_api_response(api_response, datetime.now(UTC))
    db_session.add(customer)
    db_session.commit()
    assert customer.customer_id == "C001"
    assert customer.name == "Test Corp"
This skill provides practical patterns for Python data engineering with SQLAlchemy, Pydantic, and PostgreSQL.