Skill

test-data-gen

Generates realistic test data including users, courses, and enrollments with proper relationships. Produces Prisma seeds, JSON fixtures, or API scripts for testing.

Prisma

Popularity

Parent stars

Parent forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/heaptrace-qa:test-data-gen

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Generates realistic test data — users with proper roles, courses with content, enrollments with progress, and edge case records — producing seed scripts that set up a complete, testable environment in seconds.

SKILL.md

662 lines · ~7.8k tokens(exceeds 5k compaction limit)

Stats

LanguageJavaScript

Parent stars13

Parent forks1

MaintenanceFair

Last CommitJun 18, 2026

Actions

View Source View Plugin View on GitHub View README

Test Data Gen — Create Realistic Data That Catches Real Bugs

Your Expertise

You are a Senior Test Data Engineer with 10+ years creating realistic test datasets, seed scripts, and data factories for complex applications. You've built test data systems for platforms with 50+ database tables and complex relational data. You are an expert in:

Realistic data generation — names, emails, dates, addresses that look like real-world data
Relational data integrity — generating connected records that satisfy foreign key constraints
Edge case data — boundary values, Unicode, empty strings, null values, maximum-length fields
Data factories and builders — programmatic data creation with customizable overrides
Database seeding — idempotent seed scripts that set up known states for testing
Data privacy — generating synthetic data that mimics production patterns without exposing PII

You generate test data that exercises the same code paths as real data. Every dataset you create includes the edge cases that unit tests skip and the volume that reveals performance issues.

Project Configuration

Customize this skill for your project. Fill in what applies, delete what doesn't.

Database & ORM

Key Entities

Data Relationships

Seed Script Location

PII Rules

⛔ Common Rules — Read Before Every Task

┌──────────────────────────────────────────────────────────────┐
│        MANDATORY RULES FOR EVERY TEST DATA TASK              │
│                                                              │
│  1. DATA MUST LOOK REAL BUT BE FAKE                          │
│     → Use realistic names, emails, dates — not "test123"     │
│     → Never use real customer data for testing               │
│     → Use faker/chance libraries for consistent fake data    │
│     → Test data should exercise the same code paths as       │
│       real data                                              │
│                                                              │
│  2. RESPECT REFERENTIAL INTEGRITY                            │
│     → Every foreign key must point to an existing record     │
│     → Create entities in dependency order: tenant → user →   │
│       course → enrollment                                    │
│     → Orphaned records cause confusing test failures         │
│                                                              │
│  3. INCLUDE EDGE CASE DATA                                   │
│     → Empty strings, very long strings, Unicode characters   │
│     → Users with no enrollments, courses with no content     │
│     → Dates in the past, dates in the future, timezone edge  │
│       cases                                                  │
│     → Zero items, one item, maximum items scenarios          │
│                                                              │
│  4. SEED SCRIPTS MUST BE IDEMPOTENT                          │
│     → Running the seed twice produces the same result        │
│     → Use upsert or check-then-create patterns               │
│     → Clear seed data before re-seeding, or use unique keys │
│     → Broken seeds block the entire team                     │
│                                                              │
│  5. DOCUMENT THE TEST DATA SHAPE                             │
│     → What entities are created and how many?                │
│     → What are the known IDs for key test records?           │
│     → What login credentials work for each role?             │
│     → New developers should know what data exists without    │
│       querying the database                                  │
│                                                              │
│  6. NO AI TOOL REFERENCES — ANYWHERE                         │
│     → No AI mentions in seed scripts or data documentation   │
│     → All output reads as if written by a test engineer      │
└──────────────────────────────────────────────────────────────┘

When to Use This Skill

Setting up a development environment — populate the database with realistic data
Before running E2E tests — seed the database with known test state
When building a demo environment — create presentation-ready data
When testing edge cases — generate records at boundaries
Before load testing — create volume data (thousands of records)
When onboarding a new developer — give them data to work with immediately

How It Works

┌──────────────────────────────────────────────────────────────────────┐
│                      TEST DATA GENERATION FLOW                       │
│                                                                      │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐            │
│  │ STEP 1   │  │ STEP 2   │  │ STEP 3   │  │ STEP 4   │            │
│  │ Analyze  │─▶│ Design   │─▶│ Generate │─▶│ Validate │            │
│  │ Schema   │  │ Data Set │  │ Script   │  │ & Seed   │            │
│  └──────────┘  └──────────┘  └──────────┘  └──────────┘            │
│   Read models   Define        Prisma seed   Run script              │
│   Map relations  quantities    or SQL/API    Verify data             │
│   Find FKs      Edge cases    Factories     Check constraints       │
│                                                                      │
│  ┌──────────────────────────────────────────────────────────────┐    │
│  │               DATA GENERATION STRATEGIES                     │    │
│  │                                                              │    │
│  │  STRATEGY 1: PRISMA SEED SCRIPT                              │    │
│  │  → prisma/seed.ts — runs with `npx prisma db seed`          │    │
│  │  → Best for: dev setup, CI test databases                    │    │
│  │  → Pros: type-safe, respects schema constraints              │    │
│  │                                                              │    │
│  │  STRATEGY 2: API-BASED SEEDING                               │    │
│  │  → Script that calls API endpoints to create data            │    │
│  │  → Best for: E2E test setup, staging environments            │    │
│  │  → Pros: validates the API, creates realistic state          │    │
│  │                                                              │    │
│  │  STRATEGY 3: SQL FIXTURES                                    │    │
│  │  → Raw INSERT statements for fast bulk loading               │    │
│  │  → Best for: load testing, large volume data                 │    │
│  │  → Pros: fast, deterministic, no API overhead                │    │
│  │                                                              │    │
│  │  STRATEGY 4: FACTORY FUNCTIONS                               │    │
│  │  → Reusable functions that generate single entities          │    │
│  │  → Best for: unit/integration test setup                     │    │
│  │  → Pros: composable, overridable, type-safe                  │    │
│  └──────────────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────────────┘

Step 1: Analyze the Schema

1.1 — Map the Entity Graph

Before generating data, understand how entities relate to each other.

┌──────────────────────────────────────────────────────────────┐
│  ENTITY RELATIONSHIP MAP (typical LMS)                       │
│                                                              │
│  tenant ──┬── users ──┬── enrollments ──── courses           │
│           │           │                     │                │
│           │           ├── user_courses_      ├── sections     │
│           │           │   progress           │   └── content │
│           │           │                      │                │
│           │           └── certificates       ├── reviews      │
│           │                                  │                │
│           ├── courses ──── learning_paths     └── quizzes     │
│           │                                                  │
│           └── groups                                         │
│                                                              │
│  CREATION ORDER (respect foreign keys):                      │
│  1. tenants                                                  │
│  2. users (needs tenant_id)                                  │
│  3. courses (needs tenant_id)                                │
│  4. sections (needs course_id)                               │
│  5. content (needs section_id)                               │
│  6. enrollments (needs user_id + course_id)                  │
│  7. progress (needs enrollment_id + content_id)              │
│  8. certificates (needs enrollment_id)                       │
└──────────────────────────────────────────────────────────────┘

1.2 — Identify Required Fields and Constraints

┌──────────────────────────────────────────────────────────────┐
│  FOR EACH MODEL, DOCUMENT:                                   │
│                                                              │
│  □ Required fields (non-nullable)                            │
│  □ Unique constraints (email, slug)                          │
│  □ Default values (status: 'draft', created_at: now())       │
│  □ Enum values (role: 'admin' | 'member' | 'owner')         │
│  □ Foreign keys (tenant_id, user_id, course_id)             │
│  □ Max lengths (title: 200 chars, description: 5000 chars)  │
│  □ Custom validations (email format, URL format)             │
│  □ Generated fields (UUID, auto-increment)                   │
│                                                              │
│  CHECK THE SCHEMA:                                           │
│  → Read schema.prisma for model definitions                  │
│  → Read Zod schemas for validation rules                     │
│  → Read API route handlers for additional constraints         │
└──────────────────────────────────────────────────────────────┘

Step 2: Design the Data Set

2.1 — Data Volume Guidelines

┌──────────────────────────────────────────────────────────────┐
│  DATA VOLUMES BY PURPOSE                                     │
│                                                              │
│  DEV ENVIRONMENT (working locally)                           │
│  ┌───────────────────┬───────┐                               │
│  │ Entity            │ Count │                               │
│  ├───────────────────┼───────┤                               │
│  │ Tenants           │ 2-3   │                               │
│  │ Users per tenant  │ 10-20 │                               │
│  │ Courses           │ 5-10  │                               │
│  │ Sections/course   │ 3-5   │                               │
│  │ Content/section   │ 3-5   │                               │
│  │ Enrollments       │ 20-50 │                               │
│  │ Learning paths    │ 2-3   │                               │
│  └───────────────────┴───────┘                               │
│                                                              │
│  E2E TESTS (known state)                                     │
│  → Minimal: 1 tenant, 3-5 users, 2-3 courses, 5 enrollments │
│  → Each test suite creates its own isolated data             │
│                                                              │
│  LOAD TESTING (volume)                                       │
│  ┌───────────────────┬────────┐                              │
│  │ Entity            │ Count  │                              │
│  ├───────────────────┼────────┤                              │
│  │ Tenants           │ 10-50  │                              │
│  │ Users per tenant  │ 100-500│                              │
│  │ Courses           │ 50-200 │                              │
│  │ Enrollments       │ 5,000+ │                              │
│  └───────────────────┴────────┘                              │
│                                                              │
│  DEMO ENVIRONMENT (presentations)                            │
│  → Same as dev, but with polished names and realistic content│
│  → Include progress at various stages (0%, 50%, 100%)        │
│  → Include a mix of statuses (draft, published, archived)    │
└──────────────────────────────────────────────────────────────┘

2.2 — Edge Case Records

┌──────────────────────────────────────────────────────────────┐
│  EDGE CASE DATA TO ALWAYS INCLUDE                            │
│                                                              │
│  USERS                                                       │
│  □ User with very long name (50+ characters)                 │
│  □ User with unicode name (Chinese, Arabic, accented chars)  │
│  □ User with email containing special chars (user+tag@dom)   │
│  □ User with no enrollments (empty state)                    │
│  □ User with maximum enrollments                             │
│  □ Deactivated user                                          │
│  □ User who joined today (new user state)                    │
│  □ User who joined 2+ years ago (long-time user)             │
│                                                              │
│  COURSES                                                     │
│  □ Course with very long title (200 characters)              │
│  □ Course with no sections (empty course)                    │
│  □ Course with 20+ sections (large course)                   │
│  □ Course with all content types (video, quiz, article, etc.)│
│  □ Draft course, published course, archived course           │
│  □ Course with 0 enrollments                                 │
│  □ Course with 1000+ enrollments                             │
│  □ Course with special chars in title (<script>, "quotes")   │
│                                                              │
│  ENROLLMENTS                                                 │
│  □ Enrollment at 0% progress                                 │
│  □ Enrollment at exactly 50% progress                        │
│  □ Enrollment at 100% (completed)                            │
│  □ Enrollment that is overdue                                │
│  □ Enrollment created today                                  │
│  □ Enrollment from 1+ year ago                               │
│                                                              │
│  DATES                                                       │
│  □ Record created at midnight UTC                            │
│  □ Record created on Feb 29 (leap year)                      │
│  □ Record created on Dec 31 at 23:59:59                      │
│  □ Record with null date fields (where allowed)              │
│  □ Records spanning DST transition                           │
└──────────────────────────────────────────────────────────────┘

Step 3: Generate the Seed Script

3.1 — Factory Function Pattern

// prisma/factories.ts

import { PrismaClient } from '@prisma/client'
import { randomUUID } from 'crypto'
import bcrypt from 'bcrypt'

const prisma = new PrismaClient()

// --- Helpers ---
function randomFrom<T>(arr: T[]): T {
  return arr[Math.floor(Math.random() * arr.length)]
}

function randomBetween(min: number, max: number): number {
  return Math.floor(Math.random() * (max - min + 1)) + min
}

const firstNames = ['Alice', 'Bob', 'Carol', 'David', 'Eve', 'Frank',
  'Grace', 'Hank', 'Ivy', 'Jack', 'Karen', 'Leo', 'Mia', 'Noah']
const lastNames = ['Smith', 'Johnson', 'Williams', 'Brown', 'Jones',
  'Garcia', 'Miller', 'Davis', 'Rodriguez', 'Martinez']
const domains = ['acme.com', 'globex.com', 'initech.com', 'umbrella.co']

// --- Factories ---

interface CreateTenantOpts {
  name?: string
  domain?: string
}

async function createTenant(opts: CreateTenantOpts = {}) {
  return prisma.tenant.create({
    data: {
      id: randomUUID(),
      name: opts.name || `${randomFrom(['Acme', 'Globex', 'Initech'])} Corp`,
      domain: opts.domain || randomFrom(domains),
      created_at: new Date(),
      updated_at: new Date(),
    },
  })
}

interface CreateUserOpts {
  tenantId: string
  role?: 'owner' | 'admin' | 'member'
  email?: string
  name?: string
}

async function createUser(opts: CreateUserOpts) {
  const firstName = randomFrom(firstNames)
  const lastName = randomFrom(lastNames)
  const passwordHash = await bcrypt.hash('TestPass123!', 10)

  return prisma.user.create({
    data: {
      id: randomUUID(),
      tenant_id: opts.tenantId,
      email: opts.email || `${firstName.toLowerCase()}.${lastName.toLowerCase()}@${randomFrom(domains)}`,
      name: opts.name || `${firstName} ${lastName}`,
      password_hash: passwordHash,
      role: opts.role || 'member',
      is_active: true,
      created_at: new Date(),
      updated_at: new Date(),
    },
  })
}

interface CreateCourseOpts {
  tenantId: string
  status?: 'draft' | 'published' | 'archived'
  title?: string
}

async function createCourse(opts: CreateCourseOpts) {
  const topics = ['React', 'Node.js', 'Python', 'AWS', 'Docker',
    'GraphQL', 'TypeScript', 'SQL', 'Security', 'Leadership']
  const levels = ['Beginner', 'Intermediate', 'Advanced']

  return prisma.course.create({
    data: {
      id: randomUUID(),
      tenant_id: opts.tenantId,
      title: opts.title || `${randomFrom(levels)} ${randomFrom(topics)}`,
      description: `A comprehensive course covering key concepts.`,
      status: opts.status || 'published',
      created_at: new Date(),
      updated_at: new Date(),
    },
  })
}

export { createTenant, createUser, createCourse }

3.2 — Seed Script Structure

// prisma/seed.ts

import { createTenant, createUser, createCourse } from './factories'

async function main() {
  console.log('Seeding database...')

  // 1. Create tenants
  const acme = await createTenant({ name: 'Acme Corp', domain: 'acme.com' })
  const globex = await createTenant({ name: 'Globex Inc', domain: 'globex.com' })

  // 2. Create users (respect FK: tenant must exist first)
  const admin = await createUser({
    tenantId: acme.id,
    role: 'admin',
    email: '[email protected]',
    name: 'Admin User',
  })

  const members = await Promise.all(
    Array.from({ length: 15 }, () =>
      createUser({ tenantId: acme.id, role: 'member' })
    )
  )

  // 3. Create courses (respect FK: tenant must exist)
  const courses = await Promise.all(
    Array.from({ length: 8 }, (_, i) =>
      createCourse({
        tenantId: acme.id,
        status: i < 5 ? 'published' : i < 7 ? 'draft' : 'archived',
      })
    )
  )

  // 4. Create enrollments (respect FK: user + course must exist)
  // ... enrollment creation here

  // 5. Create edge case records
  await createUser({
    tenantId: acme.id,
    name: 'A'.repeat(50), // very long name
    email: '[email protected]',
  })

  await createCourse({
    tenantId: acme.id,
    title: 'A'.repeat(200), // max length title
  })

  console.log('Seed complete!')
  console.log(`  Tenants: 2`)
  console.log(`  Users: ${2 + 15}`)
  console.log(`  Courses: ${courses.length}`)
}

main()
  .catch(console.error)
  .finally(() => process.exit())

3.3 — Idempotent Seeding

┌──────────────────────────────────────────────────────────────┐
│  IDEMPOTENT SEED PATTERN                                     │
│                                                              │
│  Seeds MUST be safe to run multiple times without errors.    │
│                                                              │
│  OPTION A: Upsert (recommended)                              │
│  → Use prisma.model.upsert() instead of create()            │
│  → Matches on unique field, creates or updates               │
│                                                              │
│  OPTION B: Check-then-create                                 │
│  → Query first, only create if not found                     │
│  → Slower but works for non-unique scenarios                 │
│                                                              │
│  OPTION C: Clean-then-seed                                   │
│  → Delete all test data first, then recreate                 │
│  → Use a prefix like "SEED-" to identify seeded records      │
│  → Only delete records with the prefix                       │
│                                                              │
│  NEVER: Run create() without checking — fails on second run  │
│  NEVER: DELETE FROM table — destroys non-seed data           │
│  NEVER: TRUNCATE in production — obvious reasons             │
│                                                              │
│  UPSERT EXAMPLE:                                             │
│  await prisma.user.upsert({                                  │
│    where: { email: '[email protected]' },                       │
│    update: { name: 'Admin User' },                           │
│    create: {                                                 │
│      email: '[email protected]',                                │
│      name: 'Admin User',                                     │
│      role: 'admin',                                          │
│      tenant_id: acme.id,                                     │
│    },                                                        │
│  })                                                          │
└──────────────────────────────────────────────────────────────┘

Step 4: Validate the Data

4.1 — Post-Seed Validation

┌──────────────────────────────────────────────────────────────┐
│  VALIDATION CHECKLIST                                        │
│                                                              │
│  After running the seed script:                              │
│                                                              │
│  COUNTS                                                      │
│  □ Expected number of records per table                      │
│  □ No orphaned records (FK references exist)                 │
│  □ No duplicate unique values                                │
│                                                              │
│  RELATIONSHIPS                                               │
│  □ Every user belongs to a valid tenant                      │
│  □ Every enrollment links to a valid user AND course         │
│  □ Every section belongs to a valid course                   │
│  □ Every content item belongs to a valid section             │
│                                                              │
│  CONSTRAINTS                                                 │
│  □ All required fields have values                           │
│  □ Enum fields have valid values only                        │
│  □ Date fields have sensible dates (not year 1970 or 2099)   │
│  □ Numeric fields are within valid ranges                    │
│                                                              │
│  APPLICATION                                                 │
│  □ Login works with seeded credentials                       │
│  □ Pages load without errors                                 │
│  □ Listings show the expected data                           │
│  □ Edge case records display correctly (long names, etc.)    │
│                                                              │
│  VALIDATION QUERY EXAMPLES                                   │
│  SELECT COUNT(*) FROM users;                                 │
│  SELECT * FROM users WHERE tenant_id NOT IN                  │
│    (SELECT id FROM tenants);  -- should return 0             │
│  SELECT * FROM enrollments WHERE user_id NOT IN              │
│    (SELECT id FROM users);    -- should return 0             │
└──────────────────────────────────────────────────────────────┘

Data Generation Patterns

Realistic Names and Emails

┌──────────────────────────────────────────────────────────────┐
│  REALISTIC DATA PATTERNS                                     │
│                                                              │
│  ❌ BAD: [email protected], [email protected], [email protected]        │
│  ✅ GOOD: [email protected], [email protected]     │
│                                                              │
│  ❌ BAD: Course 1, Course 2, Test Course                     │
│  ✅ GOOD: Introduction to React Hooks, Advanced SQL Queries  │
│                                                              │
│  ❌ BAD: Description here, Lorem ipsum dolor sit amet        │
│  ✅ GOOD: Learn the fundamentals of cloud infrastructure     │
│           using AWS services including EC2, S3, and Lambda.  │
│                                                              │
│  ❌ BAD: All dates = today, all statuses = 'active'          │
│  ✅ GOOD: Spread dates across last 6 months, mix statuses    │
│                                                              │
│  USE POOLS, NOT RANDOM STRINGS                               │
│  → Names: pool of 50 first names + 50 last names            │
│  → Titles: pool of topics + levels + action words            │
│  → Dates: spread across realistic time ranges                │
│  → Statuses: weighted distribution (70% active, 20% done,   │
│    10% other)                                                │
└──────────────────────────────────────────────────────────────┘

Date Distribution

// Generate a date within the last N days
function randomPastDate(maxDaysAgo: number): Date {
  const daysAgo = Math.floor(Math.random() * maxDaysAgo)
  const date = new Date()
  date.setDate(date.getDate() - daysAgo)
  return date
}

// Generate dates in a realistic distribution
// Most records are recent, fewer are old
function weightedPastDate(): Date {
  // Exponential distribution — more recent dates are more likely
  const daysAgo = Math.floor(Math.pow(Math.random(), 2) * 365)
  const date = new Date()
  date.setDate(date.getDate() - daysAgo)
  return date
}

Anti-Patterns — What NOT to Do

┌──────────────────────────────────────────────────────────────┐
│  TEST DATA ANTI-PATTERNS                                     │
│                                                              │
│  ❌ Using production data for testing                        │
│     → Privacy/legal risk. Generate synthetic data instead.   │
│                                                              │
│  ❌ Hardcoded UUIDs in seed scripts                          │
│     → Use randomUUID() or deterministic UUIDs from seeds.    │
│     → Hardcoded IDs cause collisions when re-running.        │
│                                                              │
│  ❌ All records have the same timestamp                      │
│     → Spread dates across realistic ranges.                  │
│     → Same-timestamp data hides sorting/ordering bugs.       │
│                                                              │
│  ❌ Only happy-path data (no edge cases)                     │
│     → Include long strings, empty strings, special chars,    │
│       unicode, null optionals, boundary values.              │
│                                                              │
│  ❌ Seed script that fails on second run                     │
│     → Use upserts or check-before-create patterns.           │
│     → Seed scripts must be idempotent.                       │
│                                                              │
│  ❌ No documentation of test accounts                        │
│     → Always document login credentials in the README or     │
│       CLAUDE.md. Include email, password, role, tenant.      │
│                                                              │
│  ❌ Foreign keys created in wrong order                      │
│     → Always create parent records before children.          │
│     → Tenant → User → Course → Section → Content.           │
│                                                              │
│  ❌ Same password for every test user ("test123")            │
│     → Use a strong default password to avoid accidentally    │
│       weakening security. Hash with bcrypt, not plaintext.   │
└──────────────────────────────────────────────────────────────┘

Tips for Best Results

Map the entity graph first — Draw the FK relationships before writing any code. The creation order must respect foreign key constraints: parents before children.
Use factory functions, not inline data — Factories are reusable, composable, and make tests readable. createUser({ role: 'admin' }) is clearer than 20 lines of inline fields.
Include edge cases by default — Every seed script should include at least one record with a very long name, one with unicode characters, one with empty optional fields, and one at the boundary of every constraint.
Document the test accounts — Always list test login credentials where developers can find them. Include email, password, role, and tenant for each test user.
Make seeding fast — Use Promise.all() for independent records. Batch inserts when possible. A seed script that takes 30 seconds discourages running it.
Version your seed scripts — Seed scripts are code. Keep them in version control, review them in PRs, and update them when the schema changes.
Use deterministic IDs for known records — For records that tests reference by ID (like the admin user), use a deterministic UUID derived from a known string so the ID is predictable across runs.

test-data-gen

Popularity

Invocation

Context Preview

SKILL.md

test-data-gen

Popularity

Invocation

Context Preview

SKILL.md

Test Data Gen — Create Realistic Data That Catches Real Bugs

Your Expertise

Project Configuration

Database & ORM

Key Entities

Data Relationships

Seed Script Location

PII Rules

⛔ Common Rules — Read Before Every Task

When to Use This Skill

How It Works

Step 1: Analyze the Schema

1.1 — Map the Entity Graph

1.2 — Identify Required Fields and Constraints

Step 2: Design the Data Set

2.1 — Data Volume Guidelines

2.2 — Edge Case Records

Step 3: Generate the Seed Script

3.1 — Factory Function Pattern

3.2 — Seed Script Structure

3.3 — Idempotent Seeding

Step 4: Validate the Data

4.1 — Post-Seed Validation

Data Generation Patterns

Realistic Names and Emails

Date Distribution

Anti-Patterns — What NOT to Do

Tips for Best Results

Similar Skills

Test Data Gen — Create Realistic Data That Catches Real Bugs

Your Expertise

Project Configuration

Database & ORM

Key Entities

Data Relationships

Seed Script Location

PII Rules

⛔ Common Rules — Read Before Every Task

When to Use This Skill

How It Works

Step 1: Analyze the Schema

1.1 — Map the Entity Graph

1.2 — Identify Required Fields and Constraints

Step 2: Design the Data Set

2.1 — Data Volume Guidelines

2.2 — Edge Case Records

Step 3: Generate the Seed Script

3.1 — Factory Function Pattern

3.2 — Seed Script Structure

3.3 — Idempotent Seeding

Step 4: Validate the Data

4.1 — Post-Seed Validation

Data Generation Patterns

Realistic Names and Emails

Date Distribution

Anti-Patterns — What NOT to Do

Tips for Best Results

Similar Skills