Audits PostgreSQL/MySQL data for integrity issues (NULLs, orphans, formats, ranges, duplicates) and generates/enforces CHECK constraints, foreign keys, triggers.
From data-validation-enginenpx claudepluginhub nickloveinvesting/nick-love-plugins --plugin data-validation-engineThis skill is limited to using the following tools:
assets/README.mdreferences/README.mdscripts/README.mdscripts/configure_validation_rules.pyscripts/generate_validation_report.pyGuides Payload CMS config (payload.config.ts), collections, fields, hooks, access control, APIs. Debugs validation errors, security, relationships, queries, transactions, hook behavior.
Designs, audits, and improves analytics tracking systems using Signal Quality Index for reliable, decision-ready data in marketing, product, and growth.
Enforces A/B test setup with gates for hypothesis locking, metrics definition, sample size calculation, assumptions checks, and execution readiness before implementation.
Implement and enforce data integrity rules at the database level using CHECK constraints, triggers, foreign keys, and custom validation functions across PostgreSQL and MySQL.
psql or mysql CLI for executing validation queriesinformation_schema for column specificationsAudit existing data quality by running validation queries before adding constraints. Check for NULL values in columns that should be required: SELECT column_name, COUNT(*) FILTER (WHERE column_name IS NULL) AS null_count, COUNT(*) AS total FROM table_name GROUP BY column_name.
Detect orphaned records (broken referential integrity): SELECT c.id FROM child_table c LEFT JOIN parent_table p ON c.parent_id = p.id WHERE p.id IS NULL. Document all orphaned records for cleanup or archival before adding foreign key constraints.
Validate data format compliance:
SELECT email FROM users WHERE email !~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'SELECT phone FROM contacts WHERE phone !~ '^\+?[1-9]\d{6,14}$'SELECT url FROM links WHERE url !~ '^https?://.+'SELECT * FROM events WHERE start_date > end_dateCheck numeric range violations: SELECT * FROM products WHERE price < 0 OR price > 999999.99 and SELECT * FROM users WHERE age < 0 OR age > 150. Map each column to its valid range based on business rules.
Identify duplicate records that violate intended uniqueness: SELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1. Determine which duplicate to keep (most recent, most complete) and plan deduplication.
Generate CHECK constraints for validated rules:
ALTER TABLE products ADD CONSTRAINT chk_price_positive CHECK (price >= 0)ALTER TABLE users ADD CONSTRAINT chk_email_format CHECK (email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')ALTER TABLE events ADD CONSTRAINT chk_date_order CHECK (start_date <= end_date)ALTER TABLE orders ADD CONSTRAINT chk_status_valid CHECK (status IN ('pending', 'processing', 'shipped', 'delivered', 'cancelled'))Create foreign key constraints with appropriate cascade behavior:
ALTER TABLE orders ADD CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE RESTRICTON DELETE CASCADE for dependent data (order_items when order is deleted)ON DELETE SET NULL for optional relationships (assigned_to when user is deactivated)Implement complex business rule validation using database triggers when CHECK constraints are insufficient:
Apply constraints in a safe two-phase approach:
NOT VALID option (PostgreSQL): ALTER TABLE users ADD CONSTRAINT chk_email CHECK (email ~ '...') NOT VALID then ALTER TABLE users VALIDATE CONSTRAINT chk_email (validates existing data without blocking writes).Generate a data quality report summarizing: total records per table, violation counts by constraint type, cleanup actions taken, constraints applied, and remaining data quality issues requiring manual review.
| Error | Cause | Solution |
|---|---|---|
check constraint violated by existing row | Existing data fails the new constraint | Run the validation query first to find violations; clean up data; use NOT VALID option to add constraint without checking existing data, then validate separately |
cannot add foreign key: referenced row not found | Orphaned child records reference non-existent parent | Clean up orphaned records first with DELETE or UPDATE to valid parent; or insert missing parent records |
column cannot be made NOT NULL: contains NULL values | Existing rows have NULL in the target column | Backfill NULLs with UPDATE table SET column = default_value WHERE column IS NULL before adding NOT NULL |
| Trigger function causes performance regression | Complex validation logic executes on every INSERT/UPDATE | Optimize trigger function; use WHEN clause to limit trigger firing; consider CHECK constraints instead of triggers for simple rules |
| Circular foreign key prevents constraint creation | Tables reference each other, preventing creation order | Use ALTER TABLE ADD CONSTRAINT after both tables exist; or use DEFERRABLE INITIALLY DEFERRED constraints |
Auditing a legacy database with 50,000 invalid email addresses: Validation query reveals 50,000 of 2M user records have invalid email formats (missing @, double dots, spaces). A cleanup script normalizes common issues (trim whitespace, lowercase) and flags 3,000 unfixable records for manual review. After cleanup, a CHECK constraint with regex validation is applied.
Enforcing referential integrity on a database without foreign keys: An application relied on application-level FK enforcement, resulting in 12,000 orphaned order_items, 800 orphaned payments, and 200 orphaned reviews. Cleanup scripts archive orphaned records to backup tables, then foreign key constraints with ON DELETE CASCADE are added. A nightly validation job monitors for new orphans.
Implementing business rules for a financial application: Constraints enforce: account balance cannot be negative (CHECK (balance >= 0)), transfer amount must be positive (CHECK (amount > 0)), transaction date cannot be in the future (CHECK (transaction_date <= CURRENT_DATE)), and a trigger prevents transfers between accounts owned by different customers unless explicitly authorized.