Skill

great-expectations

Data validation using Great Expectations. Expectation suites, checkpoints, and data docs for pipeline monitoring.

From majestic-data

Install

Run in your terminal:

$ npx claudepluginhub majesticlabs-dev/majestic-marketplace --plugin majestic-data

Tool Access

This skill is limited to using the following tools:

Read, Write, Edit, Bash

Supporting Assets

scripts/expectations.py

Skill Content

Great Expectations

Audience: Data engineers building validated data pipelines.

Goal: Provide GX patterns for expectation-based validation and monitoring.

Scripts

Import the GX helper functions from scripts/expectations.py:

from scripts.expectations import (
    get_pandas_context,
    add_dataframe_asset,
    create_basic_suite,
    run_validation
)

Usage Examples

Quick Setup

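import pandas as pd
from scripts.expectations import get_pandas_context, add_dataframe_asset

# Placeholder data for illustration; replace with the DataFrame you want to validate.
df = pd.DataFrame({"user_id": [1, 2], "age": [34, 27]})

context, datasource = get_pandas_context("my_datasource")
batch_request = add_dataframe_asset(datasource, "users", df)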

Create Expectation Suite

from scripts.expectations import create_basic_suite

columns_config = {
    'user_id': {'not_null': True, 'unique': True, 'type': 'int'},
    'age': {'min': 0, 'max': 150},
    'status': {'values': ['active', 'inactive', 'pending']},
    'email': {'regex': r'^[\w\.-]+@[\w\.-]+\.\w+$'}
}

suite = create_basic_suite(context, "user_suite", columns_config)
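
The internals of create_basic_suite are not shown on this page. As a rough sketch, assuming each config key maps onto one built-in GX expectation, the translation might look like the code below. The function name config_to_expectations and the output shape are hypothetical; the snake_case strings are the expectation type names that correspond to GX's PascalCase expectation classes, and the keyword arguments (type_, min_value, max_value, value_set, regex) are the ones those expectations accept.

```python
from typing import Any, Dict, List


def config_to_expectations(columns_config: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical helper: expand a columns_config dict into expectation specs."""
    specs: List[Dict[str, Any]] = []

    def add(expectation_type: str, column: str, **kwargs: Any) -> None:
        # Every column-level expectation takes the column name plus its own kwargs.
        specs.append({"type": expectation_type, "kwargs": {"column": column, **kwargs}})

    for column, rules in columns_config.items():
        if rules.get("not_null"):
            add("expect_column_values_to_not_be_null", column)
        if rules.get("unique"):
            add("expect_column_values_to_be_unique", column)
        if "type" in rules:
            add("expect_column_values_to_be_of_type", column, type_=rules["type"])
        if "min" in rules or "max" in rules:
            # A missing bound is passed as None, meaning "unbounded on that side".
            add("expect_column_values_to_be_between", column,
                min_value=rules.get("min"), max_value=rules.get("max"))
        if "values" in rules:
            add("expect_column_values_to_be_in_set", column, value_set=rules["values"])
        if "regex" in rules:
            add("expect_column_values_to_match_regex", column, regex=rules["regex"])
    return specs


# Small demonstration with two of the columns from the config above.
demo = {"age": {"min": 0, "max": 150}, "status": {"values": ["active", "inactive", "pending"]}}
for spec in config_to_expectations(demo):
    print(spec["type"], spec["kwargs"])
```

Keeping the config as plain data and expanding it in one place means new rule keys can be added without touching the call sites that build suites.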

Run Validation

from scripts.expectations import run_validation

results = run_validation(
    context,
    checkpoint_name="user_checkpoint",
    batch_request=batch_request,
    suite_name="user_suite"
)

if results['success']:
    print("All expectations passed!")
else:
    for failure in results['failures']:
        print(f"Failed: {failure['expectation']} on {failure['column']}")
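
In a pipeline, a failed validation usually needs to stop the run and report why. The sketch below assumes the result shape used in the example above (a success flag plus a failures list whose items carry expectation and column keys). The names summarize_failures, require_success, and ValidationFailed are hypothetical and are not part of GX or of scripts/expectations.py.

```python
from collections import defaultdict
from typing import Any, Dict, List


class ValidationFailed(Exception):
    """Hypothetical exception for halting a pipeline on failed expectations."""


def summarize_failures(results: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group the names of failed expectations by column."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for failure in results.get("failures", []):
        # Table-level expectations have no column; file them under a placeholder key.
        column = failure.get("column") or "<table>"
        grouped[column].append(failure["expectation"])
    return dict(grouped)


def require_success(results: Dict[str, Any]) -> None:
    """Raise ValidationFailed with a readable summary unless validation passed."""
    if results.get("success"):
        return
    summary = summarize_failures(results)
    lines = [f"{column}: {', '.join(names)}" for column, names in sorted(summary.items())]
    raise ValidationFailed("Validation failed:\n" + "\n".join(lines))


# Made-up result for illustration only; real values would come from run_validation.
sample = {
    "success": False,
    "failures": [
        {"expectation": "expect_column_values_to_not_be_null", "column": "user_id"},
        {"expectation": "expect_column_values_to_be_unique", "column": "user_id"},
        {"expectation": "expect_table_row_count_to_be_between", "column": None},
    ],
}
print(summarize_failures(sample))
```

Calling require_success(results) right after run_validation lets an orchestrator treat a data quality failure like any other task failure.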

Common Expectations Reference

Category   Expectation                     Description
Table      ExpectTableRowCountToBeBetween  Row count range
Existence  ExpectColumnToExist             Column must exist
Nulls      ExpectColumnValuesToNotBeNull   No null values
Range      ExpectColumnValuesToBeBetween   Value bounds
Set        ExpectColumnValuesToBeInSet     Allowed values
Pattern    ExpectColumnValuesToMatchRegex  Regex match
Unique     ExpectColumnValuesToBeUnique    No duplicates
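
To make the reference above concrete, here is a plain-Python approximation of what the column-level checks assert. It is not GX code: it ignores GX features such as partial-pass thresholds and detailed result objects, and the function names are invented for illustration. Every check except the null check skips None values, which is meant to mirror how GX treats missing values by default.

```python
import re
from typing import Any, Iterable, List, Optional


def _present(values: Iterable[Any]) -> List[Any]:
    """Drop missing values so the other checks only see real data."""
    return [v for v in values if v is not None]


def values_not_null(values: List[Any]) -> bool:
    return all(v is not None for v in values)


def values_between(values: List[Any], min_value: Optional[Any] = None,
                   max_value: Optional[Any] = None) -> bool:
    # Bounds are inclusive; a bound of None leaves that side open.
    return all(
        (min_value is None or v >= min_value) and (max_value is None or v <= max_value)
        for v in _present(values)
    )


def values_in_set(values: List[Any], value_set: Iterable[Any]) -> bool:
    allowed = set(value_set)
    return all(v in allowed for v in _present(values))


def values_match_regex(values: List[Any], regex: str) -> bool:
    pattern = re.compile(regex)
    return all(pattern.search(str(v)) is not None for v in _present(values))


def values_unique(values: List[Any]) -> bool:
    present = _present(values)
    return len(present) == len(set(present))


# Made-up sample columns for illustration.
ages = [34, 27, None, 151]
emails = ["ada@example.com", "not-an-email"]
print("ages not null:", values_not_null(ages))
print("ages in [0, 150]:", values_between(ages, 0, 150))
print("emails match:", values_match_regex(emails, r"^[\w\.-]+@[\w\.-]+\.\w+$"))
```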

Data Docs

# Build and open HTML reports
context.build_data_docs()
context.open_data_docs()

Directory Structure

great_expectations/
├── great_expectations.yml     # Config
├── expectations/              # Expectation suites (JSON)
├── checkpoints/               # Checkpoint definitions
├── plugins/                   # Custom expectations
└── uncommitted/
    ├── data_docs/            # Generated HTML docs
    └── validations/          # Validation results

When to Use Great Expectations

Use Case                     GX   Alternative
Pipeline monitoring          Yes  -
Data warehouse validation    Yes  -
Automated data docs          Yes  -
Simple DataFrame checks      -    Pandera
Record-level API validation  -    Pydantic

Dependencies

great_expectations>=0.18
pandas

Stats

Parent Repo Stars: 30
Parent Repo Forks: 6
Last Commit: Jan 19, 2026