Skill

pandera-validation

Install

Install the plugin:

$ npx claudepluginhub majesticlabs-dev/majestic-marketplace --plugin majestic-data

Description

DataFrame schema validation using pandera: schema definitions, column checks, and decorator-based validation.

Tool Access

This skill is limited to the following tools: Read, Write, Edit, Bash.
Supporting Assets
scripts/schemas.py
Skill Content

Pandera Validation

Audience: Data engineers validating pandas DataFrames.

Goal: Provide pandera patterns for schema validation and type checking.

Scripts

Import the schema helpers from scripts/schemas.py:

from scripts.schemas import (
    create_user_schema,
    create_nullable_schema,
    create_date_range_schema,
    UserSchema,
    validate_with_errors,
    infer_and_export_schema
)

Usage Examples

Basic Schema Validation

from scripts.schemas import create_user_schema

schema = create_user_schema()
validated_df = schema.validate(df)

Collect All Errors

from scripts.schemas import create_user_schema, validate_with_errors

schema = create_user_schema()
validated_df, errors = validate_with_errors(df, schema)

if errors:
    for err in errors:
        print(f"{err['column']}: {err['check']} - {err['failure_case']}")

Class-Based Schema

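import pandas as pd
import pandera as pa

from scripts.schemas import UserSchema

# Validate against the class-based schema
UserSchema.validate(df)

# Use as a function type hint; check_types enforces the annotations at call time
@pa.check_types
def process_users(df: pa.typing.DataFrame[UserSchema]) -> pd.DataFrame:
    return df.query("status == 'active'")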

Infer Schema from DataFrame

from scripts.schemas import infer_and_export_schema

schema_export = infer_and_export_schema(df)
print(schema_export['python_code'])  # Python schema definition
print(schema_export['yaml'])         # YAML schema

Built-in Checks Reference

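| Check Type     | Example                                | Description    |
|----------------|----------------------------------------|----------------|
| Numeric        | Check.gt(0), Check.in_range(0, 100)    | Comparisons    |
| String         | Check.str_matches(r'pattern')          | Regex match    |
| Set membership | Check.isin(['A', 'B'])                 | Allowed values |
| Uniqueness     | unique=True on Column                  | No duplicates  |
| Nullable       | nullable=True on Column                | Allow nulls    |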

Decorator-Based Validation

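import pandas as pd
import pandera as pa

# schema, input_schema, and output_schema are pandera schemas defined elsewhere,
# e.g. schema = create_user_schema()

# Validate the returned DataFrame
@pa.check_output(schema)
def load_data(path: str) -> pd.DataFrame:
    return pd.read_csv(path)

# Validate the argument named "df"
@pa.check_input(schema, "df")
def process_data(df: pd.DataFrame) -> pd.DataFrame:
    return df.assign(processed=True)

# Validate both the "df" argument and the return value
@pa.check_io(df=input_schema, out=output_schema)
def transform_data(df: pd.DataFrame) -> pd.DataFrame:
    return df.transform(...)  # placeholder: supply the actual transformation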

When to Use Pandera

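| Use Case                  | Pandera | Alternative        |
|---------------------------|---------|--------------------|
| DataFrame validation      | Yes     | -                  |
| Type hints for DataFrames | Yes     | -                  |
| ETL pipeline checks       | Yes     | Great Expectations |
| Record-level validation   | -       | Pydantic           |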

Dependencies

pandera>=0.18
pandas
Stats
Stars: 30
Forks: 6
Last Commit: Jan 19, 2026