Plan master data management strategies including golden record identification, matching, and merging approaches.
Plans master data management strategies for creating golden records. Use when designing entity matching, merging, and survivorship rules to resolve duplicates across source systems.
/plugin marketplace add melodic-software/claude-code-plugins/plugin install data-architecture@melodic-softwareThis skill is limited to using the following tools:
Use this skill when:
Master Data Management (MDM) ensures critical business entities (customers, products, locations) are consistent, accurate, and governed across the enterprise. MDM creates "golden records" - the single source of truth.
| Style | Description | Best For |
|---|---|---|
| Registry | Index to source systems | Quick wins, low investment |
| Consolidation | Read-only golden copy | Reporting, analytics |
| Coexistence | Bidirectional sync | Complex enterprises |
| Centralized | Single authoring source | Greenfield, tight control |
| Transaction Hub | Real-time master | High-volume transactions |
CONSOLIDATION STYLE
┌─────────────────────────────────────────────────────────────────┐
│ MDM Hub │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Golden Records (Read-Only) │ │
│ │ Customer Master │ Product Master │ Location Master │ │
│ └─────────────────────────────────────────────────────────┘ │
│ ▲ │
│ ┌────────────────────┼────────────────────┐ │
│ │ │ │ │
│ ┌────┴────┐ ┌────┴────┐ ┌────┴────┐ │
│ │ Match │ │ Match │ │ Match │ │
│ │ & Merge │ │ & Merge │ │ & Merge │ │
│ └────┬────┘ └────┬────┘ └────┬────┘ │
└─────────┼───────────────────┼───────────────────┼───────────────┘
│ │ │
┌────┴────┐ ┌────┴────┐ ┌────┴────┐
│ CRM │ │ ERP │ │E-Commerce│
└─────────┘ └─────────┘ └──────────┘
| Domain | Examples | Typical Attributes |
|---|---|---|
| Party | Customer, Vendor, Employee | Name, Address, Contact |
| Product | SKU, Service, Asset | Description, Category, Price |
| Location | Site, Store, Warehouse | Address, Coordinates, Type |
| Financial | Account, Cost Center | Code, Name, Hierarchy |
| Reference | Country, Currency, Status | Code, Description |
# Master Data Domain: Customer
## Domain Scope
- B2B customers
- B2C customers
- Prospects
- Leads
## Source Systems
| System | Records | Quality Score | Authority |
|--------|---------|---------------|-----------|
| CRM | 150,000 | 85% | High |
| ERP | 120,000 | 75% | Medium |
| E-Commerce | 200,000 | 60% | Low |
## Key Challenges
- Duplicate customers across systems
- Inconsistent address formats
- Missing email validation
- No unique identifier across systems
## Business Impact
- $X million in duplicate mailings
- Y% order errors from wrong customer
- Z hours/week manual reconciliation
| Approach | Description | Accuracy | Speed |
|---|---|---|---|
| Exact | Identical values | High | Fast |
| Fuzzy | Similar values (Levenshtein, Soundex) | Medium | Medium |
| Probabilistic | Statistical likelihood | Highest | Slow |
| Rule-Based | Business logic | Configurable | Fast |
# Match Rules: Customer Domain
## Rule 1: Exact Email Match
- Field: email
- Type: Exact (case-insensitive)
- Confidence: 95%
- Auto-merge: Yes
## Rule 2: Name + Address
- Fields: last_name, address_line1, postal_code
- Type: Fuzzy (threshold 85%)
- Confidence: 80%
- Auto-merge: No (manual review)
## Rule 3: Phone + Last Name
- Fields: phone, last_name
- Type: Exact phone, Fuzzy name
- Confidence: 75%
- Auto-merge: No
## Rule 4: Company Match
- Fields: company_name, city
- Type: Fuzzy (threshold 80%)
- Confidence: 70%
- Auto-merge: No
public class CustomerMatcher
{
public MatchResult Match(Customer source, Customer target)
{
var scores = new List<MatchScore>();
// Exact email match (high confidence)
if (!string.IsNullOrEmpty(source.Email) &&
source.Email.Equals(target.Email, StringComparison.OrdinalIgnoreCase))
{
scores.Add(new MatchScore("Email", 0.95m));
}
// Fuzzy name match
var nameScore = CalculateLevenshteinSimilarity(
source.FullName, target.FullName);
if (nameScore > 0.85m)
{
scores.Add(new MatchScore("Name", nameScore * 0.7m));
}
// Address match
var addressScore = CalculateAddressSimilarity(
source.Address, target.Address);
if (addressScore > 0.80m)
{
scores.Add(new MatchScore("Address", addressScore * 0.6m));
}
var overallScore = scores.Sum(s => s.Weight) / scores.Count;
return new MatchResult
{
IsMatch = overallScore > 0.75m,
Confidence = overallScore,
Scores = scores,
RequiresReview = overallScore is > 0.6m and < 0.85m
};
}
}
| Strategy | Description | Use Case |
|---|---|---|
| Most Recent | Latest timestamp wins | Frequently updated data |
| Most Trusted | Source priority | Known data quality |
| Most Complete | Non-null preferred | Sparse data |
| Longest | Longest string wins | Descriptions, names |
| Aggregate | Combine values | Lists, tags |
| Custom | Business logic | Complex rules |
public class CustomerSurvivorship
{
public Customer Merge(IEnumerable<Customer> duplicates)
{
var ordered = duplicates
.OrderByDescending(d => GetSourcePriority(d.SourceSystem))
.ThenByDescending(d => d.ModifiedDate)
.ToList();
return new Customer
{
// Most trusted source
CustomerId = ordered.First().CustomerId,
// Most complete (first non-null)
FirstName = ordered.Select(d => d.FirstName)
.FirstOrDefault(v => !string.IsNullOrEmpty(v)),
LastName = ordered.Select(d => d.LastName)
.FirstOrDefault(v => !string.IsNullOrEmpty(v)),
// Most recent
Email = ordered
.OrderByDescending(d => d.EmailVerifiedDate)
.Select(d => d.Email)
.FirstOrDefault(v => !string.IsNullOrEmpty(v)),
// Aggregated
Tags = ordered.SelectMany(d => d.Tags ?? [])
.Distinct()
.ToList(),
// Highest value
LifetimeValue = ordered.Max(d => d.LifetimeValue),
// Custom logic
Segment = DetermineSegment(ordered)
};
}
private int GetSourcePriority(string source) => source switch
{
"CRM" => 100,
"ERP" => 80,
"E-Commerce" => 60,
_ => 40
};
}
| Type | Description | Examples |
|---|---|---|
| Code Lists | Simple lookups | Countries, Status codes |
| Hierarchies | Parent-child | Categories, Org structure |
| Cross-References | System mappings | Product codes across systems |
| Rules | Business logic | Tax rates, Pricing tiers |
-- Reference data tables (PascalCase - SQL Server Convention)
CREATE TABLE RefCountry (
CountryCode CHAR(2) PRIMARY KEY,
CountryCode3 CHAR(3) NOT NULL,
CountryName VARCHAR(100) NOT NULL,
Region VARCHAR(50),
IsActive BIT DEFAULT 1,
EffectiveDate DATE NOT NULL,
ExpirationDate DATE,
CreatedAt DATETIME2 DEFAULT GETUTCDATE(),
UpdatedAt DATETIME2
);
CREATE TABLE RefCrossReference (
CrossRefId INT IDENTITY PRIMARY KEY,
Domain VARCHAR(50) NOT NULL,
SourceSystem VARCHAR(50) NOT NULL,
SourceCode VARCHAR(100) NOT NULL,
TargetSystem VARCHAR(50) NOT NULL,
TargetCode VARCHAR(100) NOT NULL,
IsActive BIT DEFAULT 1,
CONSTRAINT UQ_CrossRef
UNIQUE (Domain, SourceSystem, SourceCode, TargetSystem)
);
# MDM Governance RACI
| Activity | Data Owner | Data Steward | IT | Business |
|----------|------------|--------------|-----|----------|
| Define data standards | A | R | C | C |
| Approve match rules | A | R | C | I |
| Review merge candidates | I | R | C | A |
| Monitor data quality | I | R | A | C |
| Resolve data issues | A | R | C | I |
| Maintain MDM system | I | C | R | I |
R = Responsible, A = Accountable, C = Consulted, I = Informed
flowchart TD
A[New Record Ingested] --> B{Duplicate Found?}
B -->|No| C[Create Golden Record]
B -->|Yes| D{Auto-Merge Threshold?}
D -->|Yes| E[Auto-Merge]
D -->|No| F[Queue for Review]
F --> G[Data Steward Review]
G --> H{Decision}
H -->|Merge| E
H -->|Not a Match| I[Mark as Distinct]
H -->|Needs Investigation| J[Research Queue]
E --> K[Update Golden Record]
C --> K
I --> K
┌─────────────────────────────────────────────────────────────────┐
│ MDM Platform │
├─────────────────────────────────────────────────────────────────┤
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌───────────┐ │
│ │ Data │ │ Match & │ │ Workflow │ │ Data │ │
│ │ Integration│ │ Merge │ │ Engine │ │ Quality │ │
│ └────────────┘ └────────────┘ └────────────┘ └───────────┘ │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌───────────┐ │
│ │ Hierarchy │ │ Reference │ │ Governance │ │ API │ │
│ │ Manager │ │ Data Mgmt │ │ Dashboard │ │ Gateway │ │
│ └────────────┘ └────────────┘ └────────────┘ └───────────┘ │
└─────────────────────────────────────────────────────────────────┘
Inputs from:
conceptual-modeling skill → Entity candidatesdata-governance skill → Stewardship modeldata-quality-planning skill → Quality rulesOutputs to:
schema-design skill → MDM hub schemamigration-planning skill → Initial load strategydata-lineage skill → Cross-reference trackingThis skill should be used when the user asks to "create a slash command", "add a command", "write a custom command", "define command arguments", "use command frontmatter", "organize commands", "create command with file references", "interactive command", "use AskUserQuestion in command", or needs guidance on slash command structure, YAML frontmatter fields, dynamic arguments, bash execution in commands, user interaction patterns, or command development best practices for Claude Code.
This skill should be used when the user asks to "create an agent", "add an agent", "write a subagent", "agent frontmatter", "when to use description", "agent examples", "agent tools", "agent colors", "autonomous agent", or needs guidance on agent structure, system prompts, triggering conditions, or agent development best practices for Claude Code plugins.
This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-based hooks", "use ${CLAUDE_PLUGIN_ROOT}", "set up event-driven automation", "block dangerous commands", or mentions hook events (PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart, SessionEnd, UserPromptSubmit, PreCompact, Notification). Provides comprehensive guidance for creating and implementing Claude Code plugin hooks with focus on advanced prompt-based hooks API.