Production-grade database design skill for SQL/NoSQL selection, schema design, indexing, and query optimization
Provides production-grade database design recommendations for SQL/NoSQL selection, schema design, indexing, and query optimization. Use when designing new databases, optimizing slow queries, or planning database migrations.
/plugin marketplace add pluginagentmarketplace/custom-plugin-system-design
/plugin install custom-plugin-system-design@pluginagentmarketplace-system-design

This skill inherits all available tools. When active, it can use any tool Claude has access to.
assets/config.yaml
assets/schema.json
references/GUIDE.md
references/PATTERNS.md
scripts/validate.py

Purpose: Atomic skill for database architecture with comprehensive parameter validation and optimization patterns.
| Attribute | Value |
|---|---|
| Scope | Schema Design, Indexing, Query Optimization |
| Responsibility | Single: Data modeling and storage patterns |
| Invocation | Skill("database-design") |
parameters:
  database_context:
    type: object
    required: true
    properties:
      use_case:
        type: string
        enum: [oltp, olap, htap, time_series, document, graph]
        required: true
      data_characteristics:
        type: object
        required: true
        properties:
          volume:
            type: string
            pattern: "^\\d+[KMGTP]B$"
          velocity:
            type: string
            pattern: "^\\d+[KM]?\\s*(reads?|writes?)/s$"
          variety:
            type: string
            enum: [structured, semi_structured, unstructured]
      access_patterns:
        type: array
        items:
          type: object
          properties:
            query_type: { type: string }
            frequency: { type: string, enum: [high, medium, low] }
            latency_sla: { type: string, pattern: "^\\d+ms$" }
      consistency:
        type: string
        enum: [strong, eventual, read_your_writes]
        default: strong

validation_rules:
  - name: "oltp_latency"
    rule: "use_case == 'oltp' implies latency_sla <= '100ms'"
    error: "OLTP workloads typically require <100ms latency"
  - name: "volume_velocity_match"
    rule: "high_volume implies adequate_velocity"
    error: "Volume and velocity constraints may be incompatible"
output:
  type: object
  properties:
    database_recommendation:
      type: object
      properties:
        primary: { type: string }
        secondary: { type: string }
        rationale: { type: string }
    schema:
      type: object
      properties:
        tables: { type: array }
        relationships: { type: array }
        constraints: { type: array }
    indexes:
      type: array
      items:
        type: object
        properties:
          table: { type: string }
          columns: { type: array }
          type: { type: string }
          rationale: { type: string }
    optimization_tips:
      type: array
      items: { type: string }
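For orientation, a result conforming to this schema might look like the sketch below; all concrete values are hypothetical and shown as a Python dict.

```python
# Hypothetical output for a 100GB OLTP order-processing workload.
recommendation = {
    "database_recommendation": {
        "primary": "PostgreSQL",
        "secondary": "Redis (cache for hot lookups)",
        "rationale": "Relational OLTP workload with strong consistency and sub-100ms SLAs",
    },
    "schema": {
        "tables": ["users", "orders", "order_items"],
        "relationships": ["orders.user_id -> users.id", "order_items.order_id -> orders.id"],
        "constraints": ["orders.status IN ('pending', 'paid', 'shipped')"],
    },
    "indexes": [
        {
            "table": "orders",
            "columns": ["user_id", "status", "created_at"],
            "type": "composite",
            "rationale": "Covers the dominant user-order lookup pattern",
        }
    ],
    "optimization_tips": ["Use a connection pooler (e.g. PgBouncer) in front of PostgreSQL"],
}
```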
OLTP (Transactional):
├── PostgreSQL
│   ├── Best for: Complex queries, ACID, JSON support
│   ├── Scale: Vertical + read replicas
│   └── Limit: ~10TB effective, 10K TPS
├── MySQL/MariaDB
│   ├── Best for: Web applications, simple queries
│   ├── Scale: Read replicas, ProxySQL
│   └── Limit: Similar to PostgreSQL
└── CockroachDB/TiDB
    ├── Best for: Distributed ACID
    ├── Scale: Horizontal auto-sharding
    └── Trade-off: Higher latency

OLAP (Analytical):
├── ClickHouse
│   ├── Best for: Real-time analytics, columnar
│   ├── Performance: 1B+ rows/second
│   └── Use: Log analysis, metrics
├── Snowflake/BigQuery
│   ├── Best for: Data warehouse, serverless
│   ├── Scale: Unlimited, pay-per-query
│   └── Use: BI, reporting
└── DuckDB
    ├── Best for: Embedded analytics
    └── Use: Edge, single-machine analytics

NoSQL:
├── MongoDB
│   ├── Best for: Documents, flexible schema
│   └── Pattern: Embedding, denormalization
├── Cassandra
│   ├── Best for: Write-heavy, wide-column
│   └── Pattern: Partition key design
├── Redis
│   ├── Best for: Cache, sessions, pub/sub
│   └── Limit: Memory-bound
└── Neo4j
    ├── Best for: Graph relationships
    └── Pattern: Cypher queries
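A minimal sketch of encoding this decision guide as a lookup keyed on use_case follows; the table and function names are illustrative, not part of the skill's API, and a real recommendation would also weigh data_characteristics and consistency.

```python
# Illustrative shortlist keyed on use_case, mirroring the trees above.
# htap and time_series entries would be added the same way; omitted here.
CANDIDATES = {
    "oltp": ["PostgreSQL", "MySQL/MariaDB", "CockroachDB/TiDB"],
    "olap": ["ClickHouse", "Snowflake/BigQuery", "DuckDB"],
    "document": ["MongoDB"],
    "graph": ["Neo4j"],
}


def shortlist(use_case: str, needs_horizontal_scale: bool = False) -> list[str]:
    """Return candidate engines; prefer distributed SQL when horizontal scale is required."""
    options = CANDIDATES.get(use_case, [])
    if use_case == "oltp" and needs_horizontal_scale:
        # Single-node PostgreSQL/MySQL top out near the limits listed above;
        # CockroachDB/TiDB trade latency for horizontal auto-sharding.
        return [db for db in options if db == "CockroachDB/TiDB"]
    return options
```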
Normalization (OLTP):
├── 1NF: Atomic values
├── 2NF: No partial dependencies
├── 3NF: No transitive dependencies
└── BCNF: Strict key dependencies

Denormalization (OLAP/NoSQL):
├── Embedding
│   ├── Nest related documents
│   ├── Best for: 1:few relationships
│   └── Avoid: Large arrays, frequent updates
├── Bucketing
│   ├── Group time-series data
│   ├── Reduce document count
│   └── Improve write performance
└── Materialized Views
    ├── Pre-computed aggregates
    ├── Trade: Storage vs query time
    └── Refresh: Incremental or full
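To make the bucketing pattern concrete, here is a sketch that groups sensor readings into one document per sensor per hour, assuming a pymongo-style collection object; the field names are illustrative.

```python
from datetime import datetime


def bucket_key(sensor_id: str, ts: datetime) -> dict:
    """One bucket per sensor per hour: fewer documents, append-friendly writes."""
    return {"sensor_id": sensor_id,
            "bucket_start": ts.replace(minute=0, second=0, microsecond=0)}


def append_reading(collection, sensor_id: str, ts: datetime, value: float) -> None:
    # upsert=True creates the hourly bucket on first write; subsequent readings are
    # pushed into the embedded array instead of becoming one document per measurement.
    collection.update_one(
        bucket_key(sensor_id, ts),
        {"$push": {"readings": {"ts": ts, "value": value}}, "$inc": {"count": 1}},
        upsert=True,
    )
```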
Index Types:
├── B-tree (default)
│   ├── Range queries, sorting
│   ├── Overhead: ~10% storage
│   └── Example: WHERE date BETWEEN x AND y
├── Hash
│   ├── Equality only
│   ├── Faster lookup
│   └── Example: WHERE id = 123
├── GIN (PostgreSQL)
│   ├── Full-text, JSONB, arrays
│   ├── Higher write cost
│   └── Example: WHERE tags @> '{redis}'
├── BRIN (Block Range)
│   ├── Large sorted tables
│   ├── Minimal storage
│   └── Example: Time-series data
└── Composite
    ├── Multi-column index
    ├── Order matters (leftmost prefix)
    └── Example: (user_id, created_at DESC)

Index Selection Rules:
├── Query frequency > 10% of total
├── High selectivity: query matches < 10% of rows (filters out 90%+)
├── Write overhead acceptable
└── Index size < table size
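As a sketch of the composite-ordering rule (equality predicates first, then range/sort columns), the function below is illustrative and not the skill's optimize_index_order implementation.

```python
def order_composite_columns(equality_cols: list[str], range_or_sort_cols: list[str]) -> list[str]:
    """Leftmost-prefix rule: equality predicates lead, range/sort columns follow."""
    return equality_cols + range_or_sort_cols


# WHERE user_id = ? AND status = ?   and   WHERE user_id = ? ORDER BY created_at
columns = order_composite_columns(["user_id", "status"], ["created_at"])
# -> ["user_id", "status", "created_at"], i.e. roughly:
# CREATE INDEX idx_orders_user_status_created ON orders (user_id, status, created_at DESC);
```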
retry_config:
  connection:
    max_attempts: 3
    initial_delay_ms: 100
    max_delay_ms: 5000
    multiplier: 2.0
  query:
    max_attempts: 2
    timeout_ms: 30000
    retry_on:
      - CONNECTION_LOST
      - DEADLOCK
      - LOCK_TIMEOUT
    abort_on:
      - SYNTAX_ERROR
      - CONSTRAINT_VIOLATION
      - PERMISSION_DENIED
  transaction:
    retry_on_deadlock: true
    max_deadlock_retries: 3
    isolation_level: READ_COMMITTED
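A minimal sketch of applying the connection retry policy with exponential backoff; the exception class is a placeholder for whatever transient error the actual driver raises.

```python
import random
import time


class TransientConnectionError(Exception):
    """Placeholder for the driver's CONNECTION_LOST-style errors."""


def connect_with_retry(connect, max_attempts=3, initial_delay_ms=100,
                       max_delay_ms=5000, multiplier=2.0):
    delay_ms = initial_delay_ms
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except TransientConnectionError:
            if attempt == max_attempts:
                raise
            # Jitter avoids a thundering herd of synchronized retries.
            time.sleep(min(delay_ms, max_delay_ms) * random.uniform(0.5, 1.0) / 1000)
            delay_ms *= multiplier
```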
log_schema:
  level: { type: string }
  timestamp: { type: string, format: ISO8601 }
  skill: { type: string, value: "database-design" }
  event:
    type: string
    enum:
      - schema_analyzed
      - index_recommended
      - query_optimized
      - capacity_estimated
      - migration_planned
  context:
    type: object
    properties:
      table: { type: string }
      query: { type: string }
      execution_time_ms: { type: number }
      rows_affected: { type: integer }

example:
  level: INFO
  event: index_recommended
  context:
    table: orders
    columns: [user_id, status, created_at]
    type: composite
    rationale: "Covers 80% of queries"
metrics:
  - name: query_duration_seconds
    type: histogram
    labels: [query_type, table]
    buckets: [0.001, 0.01, 0.1, 1, 10]
  - name: index_usage_ratio
    type: gauge
    labels: [index_name, table]
  - name: table_size_bytes
    type: gauge
    labels: [table, database]
  - name: connection_pool_usage
    type: gauge
    labels: [pool_name]
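These map directly onto Prometheus instruments; a sketch with prometheus_client, reusing the metric and label names from the list above.

```python
from prometheus_client import Gauge, Histogram

QUERY_DURATION = Histogram(
    "query_duration_seconds", "Query latency by type and table",
    ["query_type", "table"], buckets=(0.001, 0.01, 0.1, 1, 10),
)
INDEX_USAGE = Gauge("index_usage_ratio", "Fraction of scans served by this index",
                    ["index_name", "table"])
TABLE_SIZE = Gauge("table_size_bytes", "On-disk table size", ["table", "database"])
POOL_USAGE = Gauge("connection_pool_usage", "Connections currently in use", ["pool_name"])

# Example: record a 12 ms point lookup against orders.
QUERY_DURATION.labels(query_type="point_lookup", table="orders").observe(0.012)
```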
| Issue | Cause | Resolution |
|---|---|---|
| Slow queries | Missing index | Add appropriate index |
| Lock waits | Long transactions | Split or optimize |
| Connection exhaustion | Pool too small | Increase pool, add pooler |
| Replication lag | Heavy writes | Add replicas, optimize queries |
| Disk full | Bloat, no vacuum | VACUUM, archival |
□ EXPLAIN ANALYZE run?
□ Index usage verified?
□ Connection pool monitored?
□ Lock contention checked?
□ Table statistics current?
□ Slow query log enabled?
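The first two checklist items lend themselves to a quick script; here is a sketch against PostgreSQL, where psycopg2 is an assumed driver and the table name is whatever you are investigating.

```python
import psycopg2  # assumed driver; any PostgreSQL DB-API client works similarly


def check_query_and_indexes(dsn: str, query: str, params: tuple, table: str) -> None:
    """Print the EXPLAIN ANALYZE plan for a query and per-index scan counts for one table."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        # Note: ANALYZE actually executes the query, so run it against a safe environment.
        cur.execute("EXPLAIN (ANALYZE, BUFFERS) " + query, params)
        for (line,) in cur.fetchall():
            print(line)  # watch for Seq Scan nodes on large tables
        cur.execute(
            "SELECT indexrelname, idx_scan FROM pg_stat_user_indexes "
            "WHERE relname = %s ORDER BY idx_scan",
            (table,),
        )
        for name, scans in cur.fetchall():
            print(f"{name}: {scans} scans")  # zero-scan indexes are pure write overhead
```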
# test_database_design.py

def test_valid_oltp_schema():
    params = {
        "database_context": {
            "use_case": "oltp",
            "data_characteristics": {
                "volume": "100GB",
                "velocity": "10K writes/s",
                "variety": "structured"
            },
            "access_patterns": [
                {"query_type": "point_lookup", "frequency": "high", "latency_sla": "10ms"}
            ]
        }
    }
    result = validate_parameters(params)
    assert result.valid is True


def test_oltp_latency_validation():
    params = {
        "database_context": {
            "use_case": "oltp",
            "access_patterns": [
                {"latency_sla": "500ms"}  # Too high for OLTP
            ]
        }
    }
    result = validate_parameters(params)
    assert "latency" in result.warnings[0]


def test_index_recommendation():
    query = "SELECT * FROM orders WHERE user_id = ? AND status = ? ORDER BY created_at DESC"
    access_pattern = {"frequency": "high", "latency_sla": "10ms"}
    result = recommend_index(query, access_pattern)
    assert result.columns == ["user_id", "status", "created_at"]
    assert result.type == "composite"


def test_capacity_estimation():
    result = estimate_capacity(
        rows=10_000_000,
        row_size_bytes=500,
        growth_rate_monthly=0.05,
        months=12
    )
    assert result.current_size == "5GB"
    assert result.projected_size == "8.97GB"  # 5GB * 1.05^12


def test_explain_analysis():
    explain_output = """
    Seq Scan on orders (rows=1000000)
      Filter: (user_id = 123)
    """
    result = analyze_explain(explain_output)
    assert result.issues[0].type == "SEQ_SCAN"
    assert result.suggestions[0] == "Add index on orders(user_id)"


def test_composite_index_order():
    queries = [
        "WHERE user_id = ? AND status = ?",
        "WHERE user_id = ? ORDER BY created_at"
    ]
    result = optimize_index_order(queries)
    # user_id should be first (equality), created_at last (sort)
    assert result.recommended_order == ["user_id", "status", "created_at"]
| Version | Date | Changes |
|---|---|---|
| 2.0.0 | 2025-01 | Production-grade rewrite with optimization patterns |
| 1.0.0 | 2024-12 | Initial release |
This skill should be used when the user asks to "create a slash command", "add a command", "write a custom command", "define command arguments", "use command frontmatter", "organize commands", "create command with file references", "interactive command", "use AskUserQuestion in command", or needs guidance on slash command structure, YAML frontmatter fields, dynamic arguments, bash execution in commands, user interaction patterns, or command development best practices for Claude Code.
This skill should be used when the user asks to "create an agent", "add an agent", "write a subagent", "agent frontmatter", "when to use description", "agent examples", "agent tools", "agent colors", "autonomous agent", or needs guidance on agent structure, system prompts, triggering conditions, or agent development best practices for Claude Code plugins.
This skill should be used when the user asks to "create a hook", "add a PreToolUse/PostToolUse/Stop hook", "validate tool use", "implement prompt-based hooks", "use ${CLAUDE_PLUGIN_ROOT}", "set up event-driven automation", "block dangerous commands", or mentions hook events (PreToolUse, PostToolUse, Stop, SubagentStop, SessionStart, SessionEnd, UserPromptSubmit, PreCompact, Notification). Provides comprehensive guidance for creating and implementing Claude Code plugin hooks with focus on advanced prompt-based hooks API.