Master feature stores - Feast, data validation, versioning, online/offline serving
Implement production feature stores using Feast for serving, Great Expectations for validation, and DVC for versioning. Use when building ML pipelines that require consistent online/offline feature access and data quality checks.
/plugin marketplace add pluginagentmarketplace/custom-plugin-mlops
/plugin install custom-plugin-mlops@pluginagentmarketplace-mlops
This skill inherits all available tools. When active, it can use any tool Claude has access to.
assets/config.yaml
assets/schema.json
references/GUIDE.md
references/PATTERNS.md
scripts/validate.py
Learn: Build production feature stores for ML systems.
| Attribute | Value |
|---|---|
| Bonded Agent | 03-data-pipelines |
| Difficulty | Intermediate to Advanced |
| Duration | 35 hours |
| Prerequisites | mlops-basics |
Components:
┌─────────────────────────────────────────────────────────────┐
│ FEATURE STORE ARCHITECTURE │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Offline │ │ Feature │ │ Online │ │
│ │ Store │───▶│ Registry │◀───│ Store │ │
│ │ (Parquet) │ │ (Metadata) │ │ (Redis) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ [Training] [Discovery] [Inference] │
│ │
└─────────────────────────────────────────────────────────────┘
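The flow in the diagram can be sketched in plain Python. This is a toy illustration, not Feast's implementation: `OFFLINE_ROWS`, `materialize`, and `get_online_features` are hypothetical names, a list stands in for the Parquet files, and a dict stands in for Redis.

```python
from datetime import datetime

# Offline store stand-in: the full history of timestamped feature rows.
OFFLINE_ROWS = [
    {"customer_id": 1, "event_timestamp": datetime(2024, 12, 1), "total_purchases": 3.0},
    {"customer_id": 1, "event_timestamp": datetime(2024, 12, 5), "total_purchases": 5.0},
    {"customer_id": 2, "event_timestamp": datetime(2024, 12, 3), "total_purchases": 1.0},
]


def materialize(rows, end_date):
    """Copy the latest row per entity (up to end_date) into an online store dict."""
    online = {}
    for row in sorted(rows, key=lambda r: r["event_timestamp"]):
        if row["event_timestamp"] <= end_date:
            online[row["customer_id"]] = row  # later rows overwrite earlier ones
    return online


def get_online_features(online, customer_id, feature):
    """Key lookup used at inference time; returns None if the key is absent."""
    row = online.get(customer_id)
    return None if row is None else row[feature]


online_store = materialize(OFFLINE_ROWS, end_date=datetime(2024, 12, 4))
```

Training reads the full history from the offline side, while inference only ever sees the single most recent value per key. That is why the online store can be a simple key-value system.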
Exercises:
Feature Definition Example:
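from datetime import timedelta

from feast import Entity, FeatureView, Field, FileSource, ValueType
from feast.types import Float32, Int64

# Offline source for the feature view.
# The path and timestamp column below are placeholders; point them at your own data.
customer_stats_source = FileSource(
    path="data/customer_stats.parquet",
    timestamp_field="event_timestamp",
)

# Entity definition
customer = Entity(
    name="customer_id",
    value_type=ValueType.INT64,
    description="Customer identifier",
)

# Feature view (recent Feast releases expect Field objects in `schema`
# and Entity objects, not name strings, in `entities`)
customer_features = FeatureView(
    name="customer_features",
    entities=[customer],
    ttl=timedelta(days=7),
    schema=[
        Field(name="total_purchases", dtype=Float32),
        Field(name="avg_order_value", dtype=Float32),
        Field(name="days_since_last_order", dtype=Int64),
    ],
    online=True,
    source=customer_stats_source,
)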
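The `ttl=timedelta(days=7)` setting bounds how far back a lookup may reach for a value. Below is a minimal sketch of that point-in-time rule in plain Python, assuming rows shaped like the feature view above. `point_in_time_lookup` is a hypothetical helper, not a Feast API, and Feast's exact boundary handling may differ.

```python
from datetime import datetime, timedelta


def point_in_time_lookup(rows, customer_id, as_of, ttl):
    """Return the newest row for customer_id with as_of - ttl <= timestamp <= as_of."""
    candidates = [
        r for r in rows
        if r["customer_id"] == customer_id
        and as_of - ttl <= r["event_timestamp"] <= as_of
    ]
    if not candidates:
        return None  # no value at all, or every value is older than the TTL
    return max(candidates, key=lambda r: r["event_timestamp"])


history = [
    {"customer_id": 1, "event_timestamp": datetime(2024, 12, 1), "avg_order_value": 20.0},
    {"customer_id": 1, "event_timestamp": datetime(2024, 12, 10), "avg_order_value": 35.0},
]
ttl = timedelta(days=7)

# A training label observed on Dec 5 must not see the Dec 10 value (no leakage).
row_dec5 = point_in_time_lookup(history, 1, datetime(2024, 12, 5), ttl)
# By Dec 9 the Dec 1 value is 8 days old, which exceeds the 7-day TTL.
row_dec9 = point_in_time_lookup(history, 1, datetime(2024, 12, 9), ttl)
```

Here `row_dec5` carries the Dec 1 value and `row_dec9` is `None`, so a short TTL trades coverage for freshness.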
Exercises:
Great Expectations Setup:
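import great_expectations as gx

# Assumes the Great Expectations 1.x API
context = gx.get_context()

# Create validation suite
suite = context.suites.add(gx.ExpectationSuite(name="ml_data_validation"))

# Add expectations
# At least 99% of the target values must be non-null
suite.add_expectation(
    gx.expectations.ExpectColumnValuesToNotBeNull(
        column="target",
        mostly=0.99,
    )
)
# The column mean must fall within [0.0, 100.0]
suite.add_expectation(
    gx.expectations.ExpectColumnMeanToBeBetween(
        column="feature_a",
        min_value=0.0,
        max_value=100.0,
    )
)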
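To make the `mostly` parameter concrete, here is a plain-Python sketch of what the two checks above compute. It is an illustration only and does not call Great Expectations; `not_null_mostly` and `mean_between` are hypothetical helpers.

```python
from statistics import mean


def not_null_mostly(values, mostly=1.0):
    """Pass when at least `mostly` (a fraction) of the values are not None."""
    if not values:
        return True  # assumption: an empty column passes vacuously
    non_null = sum(v is not None for v in values)
    return non_null / len(values) >= mostly


def mean_between(values, min_value, max_value):
    """Pass when the mean of the non-null values lies in [min_value, max_value]."""
    present = [v for v in values if v is not None]
    return bool(present) and min_value <= mean(present) <= max_value


target = [1] * 99 + [None]       # 99 of 100 values are non-null
feature_a = [10.0, 50.0, 90.0]   # mean is 50.0

target_ok = not_null_mostly(target, mostly=0.99)
feature_ok = mean_between(feature_a, min_value=0.0, max_value=100.0)
```

A threshold below 1.0 tolerates a small share of bad rows, which is often more practical for production data than an all-or-nothing check.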
DVC Workflow:
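# Initialize DVC (inside an existing Git repository)
dvc init
# Add data to tracking (creates data/training_data.parquet.dvc)
dvc add data/training_data.parquet
# Commit the pointer file so Git records this data version
git add data/training_data.parquet.dvc data/.gitignore
git commit -m "Track training data with DVC"
# Push to remote storage (requires a configured DVC remote)
dvc push
# Checkout specific version
git checkout v1.0.0
dvc checkout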
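Conceptually, `dvc add` replaces the data file in Git with a small pointer that records a content hash, and `dvc checkout` restores whichever file the current pointer names. The sketch below imitates that idea with the standard library only. `add_to_cache` and `checkout_from_cache` are hypothetical helpers, and DVC's real cache layout and metadata format differ.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def add_to_cache(data_file: Path, cache_dir: Path) -> str:
    """Store a copy of data_file under its MD5 hash and return the hash (the pointer)."""
    digest = hashlib.md5(data_file.read_bytes()).hexdigest()
    cache_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(data_file, cache_dir / digest)
    return digest


def checkout_from_cache(digest: str, cache_dir: Path, data_file: Path) -> None:
    """Restore the workspace file that the given pointer refers to."""
    shutil.copyfile(cache_dir / digest, data_file)


workdir = Path(tempfile.mkdtemp())
data, cache = workdir / "training_data.csv", workdir / "cache"

data.write_text("id,label\n1,0\n")
v1 = add_to_cache(data, cache)        # pointer that would be committed as v1.0.0

data.write_text("id,label\n1,0\n2,1\n")
v2 = add_to_cache(data, cache)        # pointer committed later

checkout_from_cache(v1, cache, data)  # like `git checkout v1.0.0` then `dvc checkout`
restored = data.read_text()
```

Because Git only stores the short pointer, switching data versions is as cheap as switching commits, while the large files live in the cache or remote storage.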
# templates/feature_pipeline.py
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class FeaturePipeline(BaseEstimator, TransformerMixin):
    """Production feature engineering pipeline."""

    def __init__(self, config: dict):
        self.config = config
        self.feature_names = []

    def fit(self, X: pd.DataFrame, y=None):
        """Learn per-column mean and standard deviation from the training data."""
        numeric = X.select_dtypes(include=["number"])
        self.feature_names = list(numeric.columns)
        self.means = numeric.mean()
        # Constant columns have zero std; use 1.0 to avoid division by zero
        self.stds = numeric.std().replace(0, 1.0)
        return self

    def transform(self, X: pd.DataFrame) -> pd.DataFrame:
        """Apply feature transformations."""
        X = X.copy()
        # Numerical normalization, restricted to the columns seen during fit
        for col in self.feature_names:
            X[f"{col}_normalized"] = (X[col] - self.means[col]) / self.stds[col]
        # Temporal features
        for col in self.config.get("datetime_columns", []):
            timestamps = pd.to_datetime(X[col])
            X[f"{col}_hour"] = timestamps.dt.hour
            X[f"{col}_dow"] = timestamps.dt.dayofweek
        return X
| Issue | Cause | Solution |
|---|---|---|
| Slow feature serving | Online store bottleneck | Scale Redis, add caching |
| Training-serving skew | Different transformations | Use unified feature pipeline |
| Stale features | Materialization lag | Increase refresh frequency |
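One way to read the "unified feature pipeline" remedy for training-serving skew: compute each feature in a single function and call it from both the batch (training) path and the request-time (serving) path. The following is a minimal sketch, assuming a hypothetical `order_features` function and made-up field names.

```python
from datetime import datetime


def order_features(order: dict) -> dict:
    """Single source of truth for feature logic, shared by training and serving."""
    placed_at = datetime.fromisoformat(order["placed_at"])
    return {
        "order_hour": placed_at.hour,
        "order_dow": placed_at.weekday(),  # Monday = 0, matching pandas dayofweek
        "is_large_order": order["amount"] >= 100.0,
    }


def build_training_rows(orders: list) -> list:
    """Batch path: apply the shared function to every historical order."""
    return [order_features(o) for o in orders]


def serve(order: dict) -> dict:
    """Online path: apply the same function to one incoming request."""
    return order_features(order)


orders = [
    {"placed_at": "2024-12-02T09:30:00", "amount": 120.0},
    {"placed_at": "2024-12-07T22:15:00", "amount": 15.5},
]
training_rows = build_training_rows(orders)
```

Since both paths delegate to `order_features`, a change to the feature logic cannot reach training without also reaching serving.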
| Version | Date | Changes |
|---|---|---|
| 2.0.0 | 2024-12 | Production-grade with Feast examples |
| 1.0.0 | 2024-11 | Initial release |