Data Model Analyzer
Reverse-engineer and evaluate database schemas, ORM models, migration histories, and entity relationships for correctness, normalization, and evolution patterns.
Guiding Principle
"Data outlives code. A flawed data model is the most expensive debt to pay."
Procedure
Step 1 — Schema Discovery
- Locate ORM model definitions (Prisma schema, Django models, TypeORM entities, SQLAlchemy models).
- Find raw migration files and SQL schema definitions.
- Identify database types: relational (PostgreSQL, MySQL), document (MongoDB), key-value (Redis), graph (Neo4j).
- Map each entity with its fields, types, constraints, and indexes [FACT].
- Detect multi-database configurations.
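The entity-mapping step can be sketched against a live database. The example below is a minimal illustration, assuming a SQLite database and Python's standard sqlite3 module; the function name discover_schema and the demo tables are hypothetical, and other engines expose the same information through information_schema or their own catalogs.

```python
import sqlite3


def discover_schema(conn: sqlite3.Connection) -> dict:
    """Return {table: {"columns": [...], "indexes": [...]}} for a SQLite database."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            {"name": r[1], "type": r[2], "notnull": bool(r[3]),
             "default": r[4], "pk": bool(r[5])}
            for r in conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        ]
        # PRAGMA index_list rows: (seq, name, unique, origin, partial)
        indexes = [r[1] for r in conn.execute(f'PRAGMA index_list("{table}")').fetchall()]
        schema[table] = {"columns": columns, "indexes": indexes}
    return schema


# Hypothetical demo schema, built in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE INDEX idx_users_email ON users (email);
""")
schema = discover_schema(conn)
```

Because the result is read from the database itself rather than from model files, it can be recorded as fact rather than inference.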
Step 2 — Relationship Mapping
- Identify foreign keys, join tables, and polymorphic associations.
- Map cardinality: one-to-one, one-to-many, many-to-many.
- Detect implicit relationships (string references without FK constraints).
- Build an Entity-Relationship diagram in Mermaid.
- Flag orphan tables with no relationships [INFERENCE].
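As a rough sketch of the diagram and orphan steps, the snippet below renders a list of foreign keys as Mermaid erDiagram text and lists tables that take part in no relationship. The input shape (child, parent pairs), the function names, and the sample tables are assumptions made for illustration; every foreign key is drawn as one-to-many, so one-to-one and many-to-many links would need their own notation.

```python
def to_mermaid(foreign_keys):
    """Render (child, parent) foreign-key pairs as a Mermaid erDiagram."""
    lines = ["erDiagram"]
    for child, parent in foreign_keys:
        # "||--o{" reads: exactly one parent, zero or more children.
        lines.append(f"    {parent} ||--o{{ {child} : has")
    return "\n".join(lines)


def orphan_tables(tables, foreign_keys):
    """Return tables that appear on neither side of any foreign key."""
    related = {name for pair in foreign_keys for name in pair}
    return [t for t in tables if t not in related]


# Hypothetical input: three tables, one foreign key from orders to users.
tables = ["users", "orders", "audit_log"]
foreign_keys = [("orders", "users")]

diagram = to_mermaid(foreign_keys)
orphans = orphan_tables(tables, foreign_keys)
print(diagram)
print("orphans:", orphans)
```

An orphan found this way is only a candidate: the table may still be linked through an implicit string reference, which is why the flag is an inference.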
Step 3 — Migration Health
- Count total migrations and assess velocity (migrations per month).
- Identify reversible vs. irreversible migrations.
- Detect data migrations mixed with schema migrations (risky pattern).
- Check for migration conflicts or gaps in sequence.
- Assess migration test coverage [INFERENCE].
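The gap, conflict, and velocity checks can be approximated from file names and commit dates alone. The sketch below assumes migrations are named with a numeric prefix followed by an underscore (as in Django's 0001_initial.py); the file names and dates are invented sample data, and timestamp-prefixed schemes would need a different parser.

```python
import re
from collections import Counter
from datetime import date


def sequence_numbers(filenames):
    """Extract the leading integer from names such as '0003_add_index.py'."""
    numbers = []
    for name in filenames:
        match = re.match(r"(\d+)_", name)
        if match:
            numbers.append(int(match.group(1)))
    return numbers


def sequence_gaps(numbers):
    """Sequence numbers missing between the lowest and highest seen."""
    if not numbers:
        return []
    present = set(numbers)
    return [n for n in range(min(numbers), max(numbers) + 1) if n not in present]


def sequence_conflicts(numbers):
    """Sequence numbers used by more than one migration file."""
    return sorted(n for n, count in Counter(numbers).items() if count > 1)


def migrations_per_month(dates):
    """Count migrations per 'YYYY-MM' bucket as a simple velocity measure."""
    return dict(Counter(d.strftime("%Y-%m") for d in dates))


# Hypothetical sample data.
files = ["0001_initial.py", "0002_add_email.py",
         "0004_drop_legacy.py", "0004_add_index.py"]
numbers = sequence_numbers(files)
gaps = sequence_gaps(numbers)
conflicts = sequence_conflicts(numbers)
velocity = migrations_per_month([date(2024, 1, 5), date(2024, 1, 20), date(2024, 3, 2)])
```

A duplicated number often points to two branches that each added a migration before merging, so it is worth checking against the version-control history.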
Step 4 — Quality Assessment
- Check normalization level: 1NF, 2NF, 3NF, or intentional denormalization.
- Identify missing indexes on frequently queried columns [INFERENCE].
- Flag nullable columns that should have defaults.
- Assess naming consistency across tables and columns.
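Two of these checks lend themselves to a quick script. The sketch below assumes a SQLite database and treats foreign-key columns as a stand-in for "frequently queried" columns, which is only a heuristic; real query frequency would come from query logs or planner statistics. The function names and demo tables are hypothetical.

```python
import sqlite3


def unindexed_foreign_keys(conn, table):
    """Foreign-key columns that are not the leading column of any index."""
    leading = set()
    for idx in conn.execute(f'PRAGMA index_list("{table}")').fetchall():
        # PRAGMA index_info rows: (seqno, cid, name), ordered by position.
        cols = [r[2] for r in conn.execute(f'PRAGMA index_info("{idx[1]}")').fetchall()]
        if cols:
            leading.add(cols[0])
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
    fk_cols = [r[3] for r in conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()]
    return [c for c in fk_cols if c not in leading]


def nullable_without_default(conn, table):
    """Non-key columns that allow NULL and declare no default value."""
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    # Row layout: (cid, name, type, notnull, dflt_value, pk)
    return [r[1] for r in rows if not r[3] and r[4] is None and not r[5]]


# Hypothetical demo schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        status TEXT,
        note TEXT DEFAULT ''
    );
""")
missing_indexes = unindexed_foreign_keys(conn, "orders")
review_columns = nullable_without_default(conn, "orders")
```

The second list is a review queue, not a verdict: a nullable column without a default is sometimes the correct way to model an unknown value.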
Quality Criteria
- ER diagram matches actual schema, not documentation [FACT]
- Every relationship verified through FK or ORM definition
- Migration history analyzed for patterns, not just counted
- Normalization assessment includes justification for denormalization
Anti-Patterns
- Analyzing only ORM models without checking actual DB schema (they can drift)
- Ignoring migration history patterns (they reveal design indecision)
- Treating all denormalization as bad (some is intentional for performance)
- Missing soft-delete patterns that affect relationship integrity
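To guard against the first anti-pattern, the ORM's view of the schema can be compared with what the database reports. The sketch below assumes both sides have already been reduced to plain {table: {column: type}} dictionaries; the function name schema_drift and the sample data are hypothetical, and a real comparison would also need to map ORM type names onto engine type names.

```python
def schema_drift(orm, db):
    """Compare two {table: {column: type}} mappings and report differences."""
    report = {"missing_in_db": [], "missing_in_orm": [], "type_mismatch": []}
    for table in sorted(set(orm) | set(db)):
        orm_cols = orm.get(table, {})
        db_cols = db.get(table, {})
        for col in sorted(set(orm_cols) | set(db_cols)):
            name = f"{table}.{col}"
            if col not in db_cols:
                report["missing_in_db"].append(name)
            elif col not in orm_cols:
                report["missing_in_orm"].append(name)
            elif orm_cols[col].lower() != db_cols[col].lower():
                # Naive case-insensitive comparison of type names.
                report["type_mismatch"].append(name)
    return report


# Hypothetical sample: the model gained a column, the database kept an old one.
orm_models = {"users": {"id": "integer", "email": "text", "nickname": "text"}}
db_schema = {"users": {"id": "INTEGER", "email": "VARCHAR", "legacy_flag": "INTEGER"}}
drift = schema_drift(orm_models, db_schema)
```

Any non-empty entry in the report means the ER diagram should be drawn from the database side, with the ORM differences noted alongside it.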