Design hub/link/satellite structures for enterprise data warehouses using Data Vault 2.0 methodology.
Designs enterprise data warehouses using Data Vault 2.0 methodology. Use when creating hub, link, and satellite structures for historical tracking and auditability.
/plugin marketplace add melodic-software/claude-code-plugins/plugin install data-architecture@melodic-softwareThis skill is limited to using the following tools:
Use this skill when:
Data Vault is a detail-oriented, historical tracking, and uniquely linked set of normalized tables supporting enterprise data warehouses. Designed for agility, auditability, and scalability.
┌─────────────────────────────────────────────────────────────────┐
│ DATA VAULT │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ HUB │◄───────►│ LINK │◄───────►│ HUB │ │
│ │ Customer │ │ Orders │ │ Product │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ │ │ │
│ ┌────┴─────┐ ┌────┴─────┐ ┌────┴─────┐ │
│ │SATELLITE │ │SATELLITE │ │SATELLITE │ │
│ │ CustInfo │ │ OrderDtl │ │ ProdInfo │ │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Hubs store unique business keys - the core business concepts that won't change.
| Column | Purpose |
|---|---|
| Hub Hash Key | Primary key (hash of business key) |
| Business Key | Natural identifier from source |
| Load Date | First load timestamp |
| Record Source | Origin system identifier |
Note: Data Vault 2.0 uses snake_case with standardized prefixes (
hub_,sat_,link_) as part of its methodology. This convention is acceptable for Data Vault schemas even on SQL Server.
CREATE TABLE hub_customer (
hub_customer_hk BINARY(32) NOT NULL PRIMARY KEY,
customer_bk VARCHAR(50) NOT NULL,
load_dts DATETIME2 NOT NULL,
record_source VARCHAR(100) NOT NULL
);
CREATE UNIQUE INDEX IX_hub_customer_bk ON hub_customer(customer_bk);
# Hub: hub_customer
## Business Key
customer_id - Unique customer identifier from CRM
## Sources
- CRM System (primary)
- E-commerce Platform
- Mobile App
## Hash Key Generation
SHA-256(UPPER(TRIM(customer_id)))
## Notes
- Business key is stable, never changes
- Multiple sources may provide same customer
Links represent relationships between business concepts (hubs).
| Type | Description | Example |
|---|---|---|
| Standard Link | N:M relationship | Customer-Product |
| Hierarchical Link | Self-referencing | Employee-Manager |
| Same-As Link | Duplicate resolution | Customer-Customer |
| Transactional Link | Transaction with measures | Order (with amount) |
| Column | Purpose |
|---|---|
| Link Hash Key | Primary key (hash of hub keys) |
| Hub Hash Key 1..N | Foreign keys to hubs |
| Load Date | First load timestamp |
| Record Source | Origin system |
CREATE TABLE link_order (
link_order_hk BINARY(32) NOT NULL PRIMARY KEY,
hub_customer_hk BINARY(32) NOT NULL,
hub_product_hk BINARY(32) NOT NULL,
hub_store_hk BINARY(32) NOT NULL,
order_bk VARCHAR(50) NOT NULL, -- Degenerate key
load_dts DATETIME2 NOT NULL,
record_source VARCHAR(100) NOT NULL,
CONSTRAINT FK_link_order_customer
FOREIGN KEY (hub_customer_hk)
REFERENCES hub_customer(hub_customer_hk),
CONSTRAINT FK_link_order_product
FOREIGN KEY (hub_product_hk)
REFERENCES hub_product(hub_product_hk)
);
Satellites store descriptive attributes and track history through delta detection.
| Type | Description | Example |
|---|---|---|
| Descriptive | Attributes for a hub | Customer demographics |
| Effectivity | Relationship validity | Order status |
| Multi-Active | Multiple concurrent values | Phone numbers |
| Status Tracking | State changes | Order lifecycle |
| Column | Purpose |
|---|---|
| Hub/Link Hash Key | FK to parent hub/link |
| Load Date | Record timestamp (part of PK) |
| Load End Date | Optional: end of validity |
| Hash Diff | Change detection hash |
| Attributes | Descriptive columns |
| Record Source | Origin system |
CREATE TABLE sat_customer_details (
hub_customer_hk BINARY(32) NOT NULL,
load_dts DATETIME2 NOT NULL,
load_end_dts DATETIME2,
hash_diff BINARY(32) NOT NULL,
-- Attributes
customer_name VARCHAR(100),
email VARCHAR(255),
phone VARCHAR(20),
address_line1 VARCHAR(200),
city VARCHAR(100),
state VARCHAR(50),
postal_code VARCHAR(20),
country VARCHAR(50),
-- Audit
record_source VARCHAR(100) NOT NULL,
PRIMARY KEY (hub_customer_hk, load_dts),
CONSTRAINT FK_sat_customer_hub
FOREIGN KEY (hub_customer_hk)
REFERENCES hub_customer(hub_customer_hk)
);
public static class DataVaultHasher
{
public static byte[] GenerateHashKey(params string[] businessKeys)
{
var concatenated = string.Join("|",
businessKeys.Select(k =>
k?.Trim().ToUpperInvariant() ?? string.Empty));
using var sha256 = SHA256.Create();
return sha256.ComputeHash(Encoding.UTF8.GetBytes(concatenated));
}
public static byte[] GenerateHashDiff(object entity)
{
var json = JsonSerializer.Serialize(entity,
new JsonSerializerOptions { PropertyNamingPolicy = null });
using var sha256 = SHA256.Create();
return sha256.ComputeHash(Encoding.UTF8.GetBytes(json));
}
}
┌─────────────────────────────────────────────────────────────────┐
│ INFORMATION MART │
│ (Dimensional models, aggregates, reports) │
└─────────────────────────────────────────────────────────────────┘
▲
┌─────────────────────────────────────────────────────────────────┐
│ BUSINESS VAULT │
│ (Derived data, business rules, calculated attributes) │
└─────────────────────────────────────────────────────────────────┘
▲
┌─────────────────────────────────────────────────────────────────┐
│ RAW VAULT │
│ (Hubs, Links, Satellites - source-aligned) │
└─────────────────────────────────────────────────────────────────┘
▲
┌─────────────────────────────────────────────────────────────────┐
│ STAGING AREA │
│ (Transient, source-aligned extracts) │
└─────────────────────────────────────────────────────────────────┘
PIT tables optimize queries by pre-joining satellite data at specific points in time.
CREATE TABLE pit_customer (
pit_customer_hk BINARY(32) NOT NULL,
snapshot_dts DATETIME2 NOT NULL,
hub_customer_hk BINARY(32) NOT NULL,
-- Satellite load dates (for joins)
sat_customer_details_load_dts DATETIME2,
sat_customer_loyalty_load_dts DATETIME2,
sat_customer_preferences_load_dts DATETIME2,
PRIMARY KEY (pit_customer_hk),
CONSTRAINT FK_pit_customer_hub
FOREIGN KEY (hub_customer_hk)
REFERENCES hub_customer(hub_customer_hk)
);
Bridge tables resolve many-to-many relationships for easier querying.
CREATE TABLE bridge_customer_product (
bridge_hk BINARY(32) NOT NULL PRIMARY KEY,
hub_customer_hk BINARY(32) NOT NULL,
hub_product_hk BINARY(32) NOT NULL,
snapshot_dts DATETIME2 NOT NULL,
order_count INT,
total_revenue DECIMAL(18,2),
first_order_dts DATETIME2,
last_order_dts DATETIME2
);
erDiagram
HUB_CUSTOMER {
binary hub_customer_hk PK
varchar customer_bk UK
datetime2 load_dts
varchar record_source
}
HUB_PRODUCT {
binary hub_product_hk PK
varchar product_bk UK
datetime2 load_dts
varchar record_source
}
LINK_ORDER {
binary link_order_hk PK
binary hub_customer_hk FK
binary hub_product_hk FK
varchar order_bk
datetime2 load_dts
varchar record_source
}
SAT_CUSTOMER_DETAILS {
binary hub_customer_hk FK
datetime2 load_dts PK
binary hash_diff
varchar customer_name
varchar email
varchar record_source
}
SAT_ORDER_DETAILS {
binary link_order_hk FK
datetime2 load_dts PK
binary hash_diff
decimal quantity
decimal unit_price
varchar record_source
}
HUB_CUSTOMER ||--o{ SAT_CUSTOMER_DETAILS : "has history"
HUB_CUSTOMER ||--o{ LINK_ORDER : "places"
HUB_PRODUCT ||--o{ LINK_ORDER : "ordered in"
LINK_ORDER ||--o{ SAT_ORDER_DETAILS : "has details"
public async Task LoadHub(
IEnumerable<SourceCustomer> customers,
CancellationToken ct)
{
foreach (var batch in customers.Chunk(1000))
{
var hubRecords = batch.Select(c => new HubCustomer
{
HubCustomerHk = DataVaultHasher.GenerateHashKey(c.CustomerId),
CustomerBk = c.CustomerId,
LoadDts = DateTime.UtcNow,
RecordSource = "CRM"
});
await _context.BulkMergeAsync(hubRecords,
options => options.ColumnPrimaryKeyExpression = x => x.HubCustomerHk,
ct);
}
}
public async Task LoadSatellite(
IEnumerable<SourceCustomer> customers,
CancellationToken ct)
{
foreach (var customer in customers)
{
var hubHk = DataVaultHasher.GenerateHashKey(customer.CustomerId);
var hashDiff = DataVaultHasher.GenerateHashDiff(customer);
// Check if changed
var current = await _context.SatCustomerDetails
.Where(s => s.HubCustomerHk == hubHk && s.LoadEndDts == null)
.FirstOrDefaultAsync(ct);
if (current == null || !current.HashDiff.SequenceEqual(hashDiff))
{
// Close current record
if (current != null)
{
current.LoadEndDts = DateTime.UtcNow;
}
// Insert new record
_context.SatCustomerDetails.Add(new SatCustomerDetails
{
HubCustomerHk = hubHk,
LoadDts = DateTime.UtcNow,
HashDiff = hashDiff,
CustomerName = customer.Name,
Email = customer.Email,
RecordSource = "CRM"
});
}
}
await _context.SaveChangesAsync(ct);
}
| Object | Prefix | Example |
|---|---|---|
| Hub | HUB_ | HUB_CUSTOMER |
| Link | LNK_or LINK_ | LINK_ORDER |
| Satellite | SAT_ | SAT_CUSTOMER_DETAILS |
| Bridge | BRG_or BRIDGE_ | BRIDGE_CUSTOMER_PRODUCT |
| Point-in-Time | PIT_ | PIT_CUSTOMER |
| Reference | REF_ | REF_COUNTRY |
Inputs from:
conceptual-modeling skill → Business entitieser-modeling skill → Source relationshipsOutputs to:
dimensional-modeling skill → Information martsmigration-planning skill → ETL/ELT designschema-design skill → Physical implementationThis skill should be used when the user asks to "create a slash command", "add a command", "write a custom command", "define command arguments", "use command frontmatter", "organize commands", "create command with file references", "interactive command", "use AskUserQuestion in command", or needs guidance on slash command structure, YAML frontmatter fields, dynamic arguments, bash execution in commands, user interaction patterns, or command development best practices for Claude Code.
This skill should be used when the user asks to "create an agent", "add an agent", "write a subagent", "agent frontmatter", "when to use description", "agent examples", "agent tools", "agent colors", "autonomous agent", or needs guidance on agent structure, system prompts, triggering conditions, or agent development best practices for Claude Code plugins.
This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-based hooks", "use ${CLAUDE_PLUGIN_ROOT}", "set up event-driven automation", "block dangerous commands", or mentions hook events (PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart, SessionEnd, UserPromptSubmit, PreCompact, Notification). Provides comprehensive guidance for creating and implementing Claude Code plugin hooks with focus on advanced prompt-based hooks API.