From co-dev
Analyze, document, map, or scan the database schema. Use when the user wants to analyze the database, document the database, generate schema docs, map the database, create DB documentation, or inspect the database structure. Generates a docs/db.md file with complete database schema documentation. Auto-detects language/framework. Supports MySQL, PostgreSQL, SQLite, MongoDB, Elasticsearch, Redis, and BigQuery.
How this skill is triggered — by the user, by Claude, or both
Slash command
/co-dev:analyze-dbThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Analyze the project and generate `docs/db.md` with **complete database schema documentation** ready for use by the `query-db` skill.
Analyze the project and generate docs/db.md with complete database schema documentation ready for use by the query-db skill.
Document EVERY table/collection/index without exception — including join tables, migration trackers, session tables, queue tables, cache tables, framework-internal tables. Developers need full schema docs, not just "important" ones.
Prefer MCP tools when available — they handle connection management. Fall back to CLI on errors.
| Database | MCP Tools | CLI Fallback |
|---|---|---|
| PostgreSQL | mcp__postgres__list_tables, describe_table, list_schemas, query | psql |
| MySQL | mcp__mysql__mysql_query | mysql |
| MongoDB | mcp__mongodb__list-databases, list-collections, collection-schema, find | mongosh |
| Redis | mcp__redis__* | redis-cli |
| SQLite | (no MCP) | sqlite3 |
| BigQuery | mcp__bigquery__* | bq |
| Elasticsearch | (no MCP) | curl |
| Database | Variables |
|---|---|
| MySQL | MYSQL_HOST, MYSQL_PORT, MYSQL_USER, MYSQL_PASS, MYSQL_DB |
| PostgreSQL | PGHOST, PGPORT, PGUSER, PGPASSWORD, PGDATABASE |
| MongoDB | MONGODB_URI |
| Elasticsearch | ES_URL, ES_API_KEY (optional) |
| Redis | REDIS_URL |
| BigQuery | BQ_PROJECT, BQ_DATASETS (comma-separated list, e.g. archive_2023,archive_2024,archive_2025) |
| Database | Connect / Query | List schema |
|---|---|---|
| MySQL | MYSQL_PWD="$MYSQL_PASS" mysql -h "$MYSQL_HOST" -P "$MYSQL_PORT" -u "$MYSQL_USER" "$MYSQL_DB" -e "<SQL>" | SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema = DATABASE() ORDER BY table_rows DESC; (estimates, instant) |
| PostgreSQL | psql -c "<SQL>" | SELECT schemaname, relname, n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC; |
| SQLite | sqlite3 <db> "<SQL>" | SELECT name FROM sqlite_master WHERE type='table'; |
| MongoDB | mongosh "$MONGODB_URI" --eval "<JS>" | db.getCollectionNames().forEach(c => print(c + ': ' + db[c].estimatedDocumentCount())) |
| Elasticsearch | curl -s "$ES_URL/<endpoint>" (add -H "Authorization: ApiKey $ES_API_KEY" if set) | curl -s "$ES_URL/_cat/indices?v&h=index,docs.count,store.size" |
| Redis | redis-cli -u "$REDIS_URL" <CMD> | DBSIZE, SCAN 0 MATCH <pattern> COUNT 100 |
| BigQuery | bq query --use_legacy_sql=false --format=prettyjson --project_id="$BQ_PROJECT" "<SQL>" | bq ls --project_id="$BQ_PROJECT" "$DATASET"; bq show --schema --format=prettyjson --project_id="$BQ_PROJECT" "$DATASET.<table>" |
docs/db.mdIf the file exists, read it but still execute every step. Code and schemas drift. After fresh analysis, merge findings:
Match these signals (run each only as needed):
| Language | Framework | Detection signals |
|---|---|---|
| PHP | Symfony / Doctrine ORM | composer.json has doctrine/orm or doctrine/doctrine-bundle; src/Entity/; config/packages/doctrine.yaml; migrations/ |
| PHP | Laravel / Eloquent | composer.json has laravel/framework; app/Models/; database/migrations/; config/database.php |
| PHP | Doctrine ODM (MongoDB) | composer.json has doctrine/mongodb-odm; src/Document/ |
| Python | Django | manage.py; settings.py with DATABASES; models.py in apps; */migrations/ |
| Python | Flask / FastAPI + SQLAlchemy | requirements.txt/pyproject.toml has sqlalchemy or flask-sqlalchemy; models.py or models/; alembic/ |
| Python | Django + MongoDB | settings.py has djongo or mongoengine |
| Python | PyMongo / Motor | requirements.txt has pymongo or motor |
| Ruby | Rails / ActiveRecord | Gemfile has rails; app/models/; db/migrate/; db/schema.rb or db/structure.sql; config/database.yml |
| Ruby | Mongoid | Gemfile has mongoid; config/mongoid.yml |
| Go | GORM | go.mod has gorm.io/gorm; structs with gorm: tags; models/ or internal/models/ |
| Go | sqlx | go.mod has github.com/jmoiron/sqlx |
| Go | mongo-driver | go.mod has go.mongodb.org/mongo-driver |
| Go | ent | go.mod has entgo.io/ent; ent/schema/ |
| Node / TS | TypeORM | package.json has typeorm; src/entity/ or entities/; ormconfig.json or data-source.ts |
| Node / TS | Prisma | prisma/schema.prisma; package.json has @prisma/client |
| Node / TS | Sequelize | package.json has sequelize; models/; migrations/ |
| Node / TS | Mongoose | package.json has mongoose; new Schema(...) patterns |
| Node / TS | Drizzle | package.json has drizzle-orm; drizzle/ |
| Node / TS | Knex | package.json has knex; knexfile.js/knexfile.ts; migrations/ |
| Java / Kotlin | Spring Boot + JPA/Hibernate | pom.xml/build.gradle has spring-boot-starter-data-jpa; @Entity classes; application.properties/application.yml with spring.datasource; **/entity/ or **/model/ |
| Java / Kotlin | Spring Data MongoDB | spring-boot-starter-data-mongodb; @Document classes |
| .NET / C# | EF Core | *.csproj has Microsoft.EntityFrameworkCore; DbContext classes; Migrations/; appsettings.json with connection strings |
| .NET / C# | MongoDB.Driver | *.csproj has MongoDB.Driver |
| Rust | Diesel | Cargo.toml has diesel; diesel.toml; migrations/; schema.rs |
| Rust | SeaORM | Cargo.toml has sea-orm; entity/ |
| Rust | SQLx | Cargo.toml has sqlx; .sqlx/ or migrations/ |
Identify each DB used by inspecting:
.env/.env.example; SQL driver dependencies.fos_elastica.yaml/elasticsearch.yml.BQ_PROJECT/BQ_DATASETS set; google-cloud-bigquery (Python) or @google-cloud/bigquery (Node) deps.| Framework | Entity location | Migration location | Schema source |
|---|---|---|---|
| Symfony / Doctrine | src/Entity/ | migrations/ | php bin/console doctrine:mapping:info |
| Laravel / Eloquent | app/Models/ | database/migrations/ | php artisan model:show |
| Django | */models.py | */migrations/ | python manage.py inspectdb |
| Rails / ActiveRecord | app/models/ | db/migrate/ | db/schema.rb |
| TypeORM | src/entity/ | migrations/ | entity decorators |
| Prisma | prisma/schema.prisma | (Prisma migrations) | schema.prisma |
| Spring JPA | **/entity/ | Flyway / Liquibase | @Entity classes |
| EF Core | Models/ or Entities/ | Migrations/ | DbContext |
| GORM | models/ | migration files | struct tags |
| Diesel | src/models.rs | migrations/ | schema.rs |
For SQL: extract column types, primary keys, indexes, foreign keys, unique constraints.
For MongoDB ODMs — Doctrine ODM (@ODM\ annotations in src/Document/); Mongoose (new Schema({...}) in models/); MongoEngine (Document subclass in models.py); Mongoid (field :name, type: in app/models/); Spring Data MongoDB (@Document in **/document/). Extract field types, references, embedded documents, indexes.
For Elasticsearch: index mappings, field types/analyzers, nested object structures.
For Redis: key naming patterns in code, data structures used (String/Hash/Set/ZSet/List/HyperLogLog), TTL patterns.
IMPORTANT — code is not exhaustive. ORM entities don't cover join tables, framework tables (sessions, migrations, jobs, cache), or raw-SQL tables. Always reconcile against the live database in Step 7.
Look for:
deleted_at, is_deleted).tenant_id, organization_id).docs/db.md draftmkdir -p docs
Write the initial draft using the per-DB template (see "Document Templates" below).
Test connectivity using the simplest CLI ping per DB:
| Database | Test command |
|---|---|
| MySQL | MYSQL_PWD="$MYSQL_PASS" mysql -h "$MYSQL_HOST" -P "$MYSQL_PORT" -u "$MYSQL_USER" "$MYSQL_DB" -e "SELECT 1" |
| PostgreSQL | psql -c "SELECT 1" |
| MongoDB | mongosh "$MONGODB_URI" --eval "db.runCommand({ping: 1})" |
| Elasticsearch | curl -s "$ES_URL/_cluster/health" |
| Redis | redis-cli -u "$REDIS_URL" PING |
| BigQuery | bq query --use_legacy_sql=false --project_id="$BQ_PROJECT" "SELECT 1" (if it fails, ask the user to run gcloud auth application-default login and gcloud auth application-default set-quota-project $BQ_PROJECT) |
If a test fails, output the missing env var(s) and ask the user to set them. Wait for confirmation.
If the user declines or can't provide credentials, skip Steps 7-8 and proceed to Step 9 using code-based analysis only. The Last verified line in db.md MUST reflect this (see Step 9 timestamp formats).
CRITICAL — enumerate ALL objects first. List every table / collection / index in the live database before anything else. Compare against what you documented from code in Steps 3-4. Add anything missing.
Performance safeguards for large tables:
information_schema.tables.table_rows, pg_stat_user_tables.n_live_tup, estimatedDocumentCount()); never COUNT(*) on large tables.LIMIT ad-hoc sampling queries.Use the schema commands from the "CLI Command Reference" table above, then for each table/collection capture:
SHOW INDEX FROM <table>. PostgreSQL: SELECT indexname, indexdef FROM pg_indexes WHERE tablename = '<table>';. MongoDB: db.<coll>.getIndexes(). Elasticsearch: curl -s "$ES_URL/<index>/_mapping" | jq.SELECT MIN(created_at), MAX(created_at) FROM <table>; (or MongoDB $min/$max aggregation).db.<coll>.findOne(); Redis HGETALL/TTL.for ds in $(echo "$BQ_DATASETS" | tr ',' ' '); do echo "=== $ds ==="; bq ls --project_id="$BQ_PROJECT" "$ds"; done. For each table: bq show --schema --format=prettyjson --project_id="$BQ_PROJECT" "$DATASET.<table>" and bq show --project_id="$BQ_PROJECT" "$DATASET.<table>" (row count, partitioning).Use safe sampling depending on table size:
| DB | Small table (<1M) | Large table (>1M) | Very large |
|---|---|---|---|
| MySQL/PostgreSQL | SELECT status, COUNT(*) FROM TABLE GROUP BY status ORDER BY count DESC; | Add WHERE created_at >= NOW() - INTERVAL 30 DAY (PG: INTERVAL '30 days') | SELECT DISTINCT status FROM TABLE LIMIT 20; |
| MongoDB | db.COLL.aggregate([{$group: {_id: "$status", count: {$sum: 1}}}, {$sort: {count: -1}}]) | Prepend {$sample: {size: 10000}} to the pipeline | (sampled) |
| Elasticsearch | terms aggregation with size: 0 (always safe — uses approximate counts) | same | same |
| BigQuery | SELECT status, COUNT(*) FROM \$BQ_PROJECT.$DATASET.TABLE` GROUP BY status ORDER BY count DESC LIMIT 20;` | Use APPROX_COUNT_DISTINCT(ID) and always include partition filter | --dry_run first to estimate cost |
docs/db.md with verified dataCompleteness check before writing: every table/collection/index returned by Step 7 has a row in the "All Tables / Collections / Indices" section. There must be a 1:1 correspondence — no skipping framework or join tables.
Add Large Table Warnings. For tables >1M rows: list with safeguards. For >10M rows: mark "VERY LARGE — always filter by date/indexed column" and list specific indexed columns.
Common Business Questions — from Step 4's BI/dashboard scan, document recurring analytics questions with the correct tables/joins/filters. Helps query-db users avoid common mistakes.
Add row/document counts to listings, replace enum guesses with actual values + counts, document actual indexes, add date ranges.
"Last verified" line at top of docs/db.md:
> **Last verified**: YYYY-MM-DD — verified against live database> **Last verified**: YYYY-MM-DD — derived from code analysis only (not verified against live database)docs/db.md always starts with H1 # Database Schema Documentation and the "Last verified" line. The body sections depend on the DB type. Below are the required sections per DB. Fill them with discovered content; do not paste placeholder rows.
Required sections, in order:
query-db skill (e.g. MYSQL_PWD="$MYSQL_PASS" mysql -h "$MYSQL_HOST" -P "$MYSQL_PORT" -u "$MYSQL_USER" "$MYSQL_DB" or psql).Table | Purpose | Key Fields for Filtering/Grouping | Rows.Table.Field | Value | Meaning | Count.parent.fk → child.pk arrows.Table.Field | Purpose | Notes (TZ, granularity).Table.Field | Unit | Notes (e.g. cents, divide by 100).deleted_at/is_deleted; remind to add WHERE deleted_at IS NULL.organization_id, tenant_id).Table | Rows | Required Safeguards.# | Anti-Pattern | Why It's Bad | Do Instead. Standard rows: SELECT * without WHERE on large tables; unbounded COUNT(*); unfiltered JOIN between large tables; GROUP BY on non-indexed columns; ignoring denormalized analytics tables.# | Question | Tables Involved | Key Filters (from Step 4 BI scan).deleted_at IS NULL).Required sections:
mongosh "$MONGODB_URI".Collection | Purpose | Key Fields for Filtering/Grouping | Document Count.Collection.Field | Value | Meaning.coll.fkField → otherColl._id.Collection | Embedded Field | Structure.Collection.Field | Purpose.find({}); $lookup between large collections without $match first; large allowDiskUse aggregations without $match.$match first).Required sections:
curl -s "$ES_URL".Index | Purpose | Key Fields | Doc Count.Index.Field | Type | Notes (e.g. scaled_float factor 100, text + keyword).Index.Field | Format (epoch_millis, ISO).Index | Nested Field | Structure.size > 10000; deep from + size pagination (>10000 limit); match_all without size: 0 on large indices.size: 0).Required sections:
redis-cli -u "$REDIS_URL".Pattern | Type | Purpose (e.g. user:{id} Hash, cache:product:{id} String/JSON, stats:pageviews HyperLogLog).Pattern | TTL | Notes.KEYS * in production (use SCAN); FLUSHDB/FLUSHALL without confirmation.HGETALL/ZREVRANGE/PFCOUNT examples in fenced blocks.Required sections:
bq query --use_legacy_sql=false --format=prettyjson --project_id="$BQ_PROJECT".Dataset | Period | Description.Table | Purpose | Key Fields | Rows. Note any datasets with differing schemas.Dataset.Table.Field | Value | Meaning (use *.table.field if uniform across datasets).Dataset.Table | Partition Column | Clustering Columns | Notes — always filter on partition to reduce bytes scanned.UNION ALL across yearly archives.SELECT * on wide tables; UNION ALL across all datasets without date filter; LIMIT to reduce cost (it doesn't); skipping --dry_run for large queries.--dry_run workflow and --maximum_bytes_billed=1000000000 cap. BigQuery pricing ~$5/TB scanned.If multiple DBs are used, the file has one H1 + a "Databases Used" list, then one H2 section per database following the appropriate template above. Example:
# Database Schema Documentation
## Databases Used
1. PostgreSQL (primary data)
2. Redis (caching, sessions)
3. Elasticsearch (search)
## PostgreSQL
[full SQL template sections]
## Redis
[full Redis template sections]
## Elasticsearch
[full Elasticsearch template sections]
query-db).npx claudepluginhub cloud-officer/claude-code-plugin-dev --plugin co-devCreates bite-sized, testable implementation plans from specs or requirements, with file structure and task decomposition. Activates before coding multi-step tasks.