Skill

great-expectations

Validates data pipelines using Great Expectations with expectation suites, checkpoints, data docs, and Python scripts for monitoring.

Python

data-engineering

npx claudepluginhub majesticlabs-dev/majestic-marketplace --plugin majestic-data

Tool Access

This skill is limited to using the following tools:

Read Write Edit Bash

Supporting Assets

scripts/expectations.py

SKILL.md

Stats

Parent Repo Stars: 33

Parent Repo Forks: 7

Last Commit: Jan 19, 2026

Great Expectations

Audience: Data engineers building validated data pipelines.

Goal: Provide GX patterns for expectation-based validation and monitoring.

Scripts

Execute GX functions from scripts/expectations.py:

from scripts.expectations import (
    get_pandas_context,
    add_dataframe_asset,
    create_basic_suite,
    run_validation
)

Usage Examples

Quick Setup

from scripts.expectations import get_pandas_context, add_dataframe_asset

# df is an existing pandas DataFrame (for example, the users table loaded upstream)
context, datasource = get_pandas_context("my_datasource")
batch_request = add_dataframe_asset(datasource, "users", df)

Create Expectation Suite

from scripts.expectations import create_basic_suite

columns_config = {
    'user_id': {'not_null': True, 'unique': True, 'type': 'int'},
    'age': {'min': 0, 'max': 150},
    'status': {'values': ['active', 'inactive', 'pending']},
    'email': {'regex': r'^[\w\.-]+@[\w\.-]+\.\w+$'}
}

suite = create_basic_suite(context, "user_suite", columns_config)
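
The keys in `columns_config` suggest a one-to-one mapping onto GX expectation types. The source does not show how `create_basic_suite` interprets them, so the following is only a sketch of one plausible translation. `config_to_expectations` is a hypothetical helper, not part of `scripts/expectations.py`; the expectation type names and keyword arguments follow GX's snake_case conventions.

```python
from typing import Any


def config_to_expectations(columns_config: dict[str, dict[str, Any]]) -> list[dict[str, Any]]:
    """Translate a columns_config dict into expectation type/kwargs records.

    Hypothetical helper: the real create_basic_suite may read the keys differently.
    """
    expectations: list[dict[str, Any]] = []

    def add(expectation_type: str, **kwargs: Any) -> None:
        expectations.append({"expectation_type": expectation_type, "kwargs": kwargs})

    for column, rules in columns_config.items():
        if rules.get("not_null"):
            add("expect_column_values_to_not_be_null", column=column)
        if rules.get("unique"):
            add("expect_column_values_to_be_unique", column=column)
        if "type" in rules:
            add("expect_column_values_to_be_of_type", column=column, type_=rules["type"])
        if "min" in rules or "max" in rules:
            # A missing bound stays None, which leaves that side open.
            add(
                "expect_column_values_to_be_between",
                column=column,
                min_value=rules.get("min"),
                max_value=rules.get("max"),
            )
        if "values" in rules:
            add("expect_column_values_to_be_in_set", column=column, value_set=rules["values"])
        if "regex" in rules:
            add("expect_column_values_to_match_regex", column=column, regex=rules["regex"])
    return expectations


# Same config as in the example above.
columns_config = {
    "user_id": {"not_null": True, "unique": True, "type": "int"},
    "age": {"min": 0, "max": 150},
    "status": {"values": ["active", "inactive", "pending"]},
    "email": {"regex": r"^[\w\.-]+@[\w\.-]+\.\w+$"},
}

planned = config_to_expectations(columns_config)
for item in planned:
    print(item["expectation_type"], item["kwargs"])
```

Listing the planned expectations this way makes it easy to review a config before any suite is written to disk.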

Run Validation

from scripts.expectations import run_validation

results = run_validation(
    context,
    checkpoint_name="user_checkpoint",
    batch_request=batch_request,
    suite_name="user_suite"
)

if results['success']:
    print("All expectations passed!")
else:
    for failure in results['failures']:
        print(f"Failed: {failure['expectation']} on {failure['column']}")
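
In a scheduled pipeline the result usually has to gate downstream steps rather than only print. A minimal sketch, assuming the result shape used above (a `success` flag plus a `failures` list whose items carry `expectation` and `column` keys). `failures_by_column` and `exit_code_for` are hypothetical helpers, and the sample result is made up for illustration.

```python
from collections import defaultdict
from typing import Any


def failures_by_column(results: dict[str, Any]) -> dict[str, list[str]]:
    """Group failed expectation names by the column they apply to."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for failure in results.get("failures", []):
        # Table-level expectations may have no column; bucket them separately.
        column = failure.get("column") or "<table>"
        grouped[column].append(failure["expectation"])
    return dict(grouped)


def exit_code_for(results: dict[str, Any]) -> int:
    """Return a process exit code: 0 when validation passed, 1 otherwise."""
    return 0 if results.get("success") else 1


# Illustrative result, shaped like the run_validation output above (made-up values).
sample_results = {
    "success": False,
    "failures": [
        {"expectation": "expect_column_values_to_not_be_null", "column": "user_id"},
        {"expectation": "expect_column_values_to_be_unique", "column": "user_id"},
        {"expectation": "expect_table_row_count_to_be_between", "column": None},
    ],
}

for column, names in failures_by_column(sample_results).items():
    print(f"{column}: {', '.join(names)}")
```

Passing `exit_code_for(results)` to `sys.exit` lets an orchestrator mark the task as failed without parsing log output.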

Common Expectations Reference

| Category | Expectation | Description |
|----------|-------------|-------------|
| Table | `ExpectTableRowCountToBeBetween` | Row count range |
| Existence | `ExpectColumnToExist` | Column must exist |
| Nulls | `ExpectColumnValuesToNotBeNull` | No null values |
| Range | `ExpectColumnValuesToBeBetween` | Value bounds |
| Set | `ExpectColumnValuesToBeInSet` | Allowed values |
| Pattern | `ExpectColumnValuesToMatchRegex` | Regex match |
| Unique | `ExpectColumnValuesToBeUnique` | No duplicates |
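
The class names above correspond to the snake_case expectation type strings that appear in stored suites and validation results. A small sketch of that conversion, useful when matching a failure in a result back to this table. `expectation_type_name` is a hypothetical helper written for illustration.

```python
import re


def expectation_type_name(class_name: str) -> str:
    """Convert a PascalCase expectation class name to its snake_case type string."""
    # Insert an underscore before every capital letter except the first, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


for name in (
    "ExpectTableRowCountToBeBetween",
    "ExpectColumnValuesToNotBeNull",
    "ExpectColumnValuesToMatchRegex",
):
    print(f"{name} -> {expectation_type_name(name)}")
```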

Data Docs

# Build and open HTML reports
context.build_data_docs()
context.open_data_docs()

Directory Structure

great_expectations/
├── great_expectations.yml     # Config
├── expectations/              # Expectation suites (JSON)
├── checkpoints/               # Checkpoint definitions
├── plugins/                   # Custom expectations
└── uncommitted/
    ├── data_docs/            # Generated HTML docs
    └── validations/          # Validation results

When to Use Great Expectations

| Use Case | GX | Alternative |
|----------|----|-------------|
| Pipeline monitoring | ✓ | - |
| Data warehouse validation | ✓ | - |
| Automated data docs | ✓ | - |
| Simple DataFrame checks | - | Pandera |
| Record-level API validation | - | Pydantic |

Dependencies

great_expectations>=0.18
pandas
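
Because `>=0.18` has no upper bound, an environment may resolve to a 1.x release, whose API differs from the `batch_request` and checkpoint style used in the examples above. A minimal sketch for checking what is installed, using only the standard library; `installed_version` and `major_minor` are hypothetical helpers, and the simple parsing here is an assumption that covers ordinary release strings only.

```python
import re
from importlib import metadata
from typing import Optional


def installed_version(package: str = "great_expectations") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version: str) -> tuple[int, int]:
    """Extract the leading (major, minor) numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


version = installed_version()
if version is None:
    print("great_expectations is not installed")
elif major_minor(version) < (0, 18):
    print(f"great_expectations {version} is older than the required 0.18")
elif major_minor(version) >= (1, 0):
    print(f"great_expectations {version} is a 1.x release; the examples may need adapting")
else:
    print(f"great_expectations {version} matches the 0.18-style examples")
```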