Skill

dimensional-modeling

Design and implement Kimball-style star schemas in DuckDB for LLM agent state persistence. Use when user needs to track agent execution, model operational data, design fact/dimension tables, implement SCD Type 2, generate surrogate keys, or build analytical views. Also triggers on "star schema", "dimensional model", "DuckDB schema", "fact table", "dimension table", "SCD Type 2", "surrogate keys", "data warehouse for agents", "design a schema for", "help me track agent state", "store execution data in DuckDB", "I need to persist agent history", or "how do I model".

From dimensional-modeling

Install

Run in your terminal

npx claudepluginhub fblissjr/fb-claude-skills --plugin dimensional-modeling

Tool Access

This skill is limited to using the following tools:

Read

Supporting Assets

View in Repository

references/anti_patterns.md

references/dag_execution.md

references/key_generation.md

references/query_patterns.md

references/schema_patterns.md

Skill Content

Dimensional Modeling for Agent Systems

Design and implement Kimball-style star schemas in DuckDB for tracking agent state, execution, and operational data.

When to Use

Designing a new DuckDB schema for agent/tool state tracking
Adding fact or dimension tables to an existing star schema
Implementing SCD Type 2 for slowly changing dimensions
Choosing between degenerate dimensions and full dimension tables
Designing surrogate key generation strategies
Building analytical views over star schema data
Modeling agent execution as a data pipeline DAG

Core Principle

Abstract the data, not the behavior. Frameworks that abstract interaction patterns (chains, agents, retrievers) break when research moves faster than the abstraction. Dimensional modeling abstracts what happened (facts) in what context (dimensions) -- patterns that are 30+ years old and model-agnostic.

Process

Step 1: Identify the Business Process

What are you tracking? Every schema starts by naming the business process:

Business Process	Example Grain	Example Facts
Agent task execution	One tool call	duration_ms, input_tokens, output_tokens, status
Skill quality tracking	One validation run	error_count, warning_count, is_valid
CDC change detection	One page check	content_hash, change_type, severity
Session cost tracking	One session	total_tokens, cost_usd, tool_call_count
Task decomposition	One routing decision	subtask_count, agent_assigned, success

Step 2: Declare the Grain

The grain is the most atomic level of data captured in the fact table. State it as a sentence:

"One row in fact_tool_call represents a single tool invocation by a single agent in a single session."

Rules:

Too coarse = you lose detail you can't recover
Too fine = you waste storage on noise
When in doubt, go finer -- you can always aggregate up

When choosing between grain levels, use ultrathink to reason through the trade-offs before committing.

Step 3: Identify the Dimensions

Dimensions answer who/what/where/when/why/how about each fact row.

Full dimension table when:

The entity has mutable attributes (name changes, status changes)
You need to track history (SCD Type 2)
Multiple fact tables reference the same entity

Degenerate dimension (carried in fact rows) when:

The natural key IS the only interesting attribute
High cardinality (session_id, transaction_id)
No mutable attributes to track

See references/schema_patterns.md for dimension table templates.

Step 4: Design the Facts

Fact tables are append-only event logs. Every fact table follows these rules:

No primary keys. Grain = composite dimension keys + event timestamp.
No sequences. Deterministic surrogate keys via MD5 hash.
No FK constraints. Join by convention, validate at application layer.
Metadata on every row: inserted_at, record_source, session_id.

See references/schema_patterns.md for fact table templates.

Step 5: Generate Keys

All surrogate keys use MD5 hash of natural key components:

import hashlib

def dimension_key(*natural_keys) -> str:
    """MD5 surrogate from natural key components."""
    parts = [str(k) if k is not None else "-1" for k in natural_keys]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()

def hash_diff(**attributes) -> str:
    """MD5 of non-key attributes for SCD Type 2 change detection."""
    parts = [f"{k}={v}" for k, v in sorted(attributes.items()) if v is not None]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()

See references/key_generation.md for details on key design.

Step 6: Implement SCD Type 2

When a dimension attribute changes:

Set effective_to = NOW() and is_current = FALSE on the old row
Insert new row with updated attributes, effective_from = NOW(), is_current = TRUE
Compute new hash_diff from non-key attributes

This requires no PRIMARY KEY on dimension tables (hash_key appears in multiple rows).

Step 7: Build Views

Views compose dimensions and facts to answer analytical questions. Always filter is_current = TRUE when joining dimensions unless doing point-in-time analysis.

See references/query_patterns.md for view recipes.

Step 8: Add Meta Tables

Every database needs:

meta_schema_version -- tracks schema evolution
meta_load_log -- tracks script execution for operational visibility

Agent Execution as a DAG

The primary use case: model agent execution as a data pipeline DAG.

Goal / Objective     (why -- spans days, sessions)
  Task               (what -- decomposed unit)
    Branch / Attempt (how -- specific approach, may be pruned)
      Session        (where -- execution boundary)
        Agent        (delegation -- routed sub-problem)
          Tool chain (sequence -- ordered operations)
            Tool call(atomic -- read/write/search/execute)

The five invariant operations (decompose, route, prune, synthesize, verify) become fact table grains:

Phase	Fact Table	What It Captures
Decompose	fact_task_decomposition	goal -> tasks (what was broken down and how)
Route	fact_routing_decision	task -> agent/tool (what was assigned where)
Execute	fact_execution_step	atomic tool call with timing, tokens, status
Prune	fact_pruning_event	what was killed/abandoned and why
Synthesize	fact_synthesis_result	merged output with quality signal
Verify	fact_verification	quality checks on final output

See references/dag_execution.md for full schema and capture mechanisms.

Common Anti-Patterns

See references/anti_patterns.md for mistakes to avoid:

Adding PRIMARY KEY to SCD Type 2 dimension tables
Using auto-increment sequences instead of hash keys
Normalizing fact tables (they should be denormalized)
Making dimension tables too wide (split into outriggers)
Forgetting metadata columns (record_source, session_id)

Reference Implementation

Working proof: skill-maintainer/scripts/store.py -- the full Kimball schema (v0.6.0) with 3 dimensions, 6 fact tables, analytical views, and automatic schema migration.

References

schema_patterns.md -- dimension and fact table templates
query_patterns.md -- star schema query cookbook
key_generation.md -- hash keys, natural keys, degenerate dimensions
anti_patterns.md -- common mistakes and how to avoid them
dag_execution.md -- agent execution as data pipeline DAG

Similar Skills

ui-ux-pro-max

UI/UX design intelligence for web and mobile. Includes 50+ styles, 161 color palettes, 57 font pairings, 161 product types, 99 UX guidelines, and 25 chart types across 10 stacks (React, Next.js, Vue, Svelte, SwiftUI, React Native, Flutter, Tailwind, shadcn/ui, and HTML/CSS). Actions: plan, build, create, design, implement, review, fix, improve, optimize, enhance, refactor, and check UI/UX code. Projects: website, landing page, dashboard, admin panel, e-commerce, SaaS, portfolio, blog, and mobile app. Elements: button, modal, navbar, sidebar, card, table, form, and chart. Styles: glassmorphism, claymorphism, minimalism, brutalism, neumorphism, bento grid, dark mode, responsive, skeuomorphism, and flat design. Topics: color systems, accessibility, animation, layout, typography, font pairing, spacing, interaction states, shadow, and gradient. Integrations: shadcn/ui MCP for component search and examples.

53.0k

context7-mcp

This skill should be used when the user asks about libraries, frameworks, API references, or needs code examples. Activates for setup questions, code generation involving libraries, or mentions of specific frameworks like React, Vue, Next.js, Prisma, Supabase, etc.

51.1k

payload

11 files

Use when working with Payload CMS projects (payload.config.ts, collections, fields, hooks, access control, Payload API). Use when debugging validation errors, security issues, relationship queries, transactions, or hook behavior.

41.6k

Stats

Parent Repo Stars0

Parent Repo Forks0

Last CommitMar 31, 2026

Actions

View Source View Plugin View on GitHub View README