This document contains the complete problem bank with solutions and walkthroughs for the Database Architecture interviewer skill.
Question: "If I am building a system to ingest 100,000 metrics per second from IoT devices, but only reading them occasionally to generate daily reports, what kind of storage engine should I use?"
Root Cause: This is a write-heavy workload. B-trees update pages in place, so each write can trigger random I/O (disk seeks), which limits sustained write throughput.
Ideal Answer: Use an LSM-tree-based store such as Cassandra, InfluxDB, or RocksDB. These excel at high-throughput write workloads because writes are buffered in memory and flushed to disk sequentially (append-only), avoiding the random I/O overhead of B-trees. Reads may be slower because they can touch several files, which is acceptable here since reads are only occasional.
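The write path described above can be sketched in a few lines. This is a toy illustration only, not how Cassandra, InfluxDB, or RocksDB are implemented: the class name `TinyLSM`, the `memtable_limit` parameter, and the in-memory list standing in for on-disk files are all assumptions made for the example.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a mutable in-memory memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}   # most recent writes
        self.runs = []       # flushed sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes only touch memory; nothing on "disk" is updated in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # A real engine would write this sorted run to disk in one
        # sequential pass; here a list stands in for the file.
        self.runs.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Search runs from newest to oldest so the latest value wins.
        for run in reversed(self.runs):
            keys = [k for k, _ in run]
            i = bisect.bisect_left(keys, key)
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

Note how `get` may have to consult several runs: that is the read cost an LSM design accepts in exchange for cheap writes, and it is why real engines add compaction and Bloom filters.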
Key Concepts:
Question: "We need to shard a users table. How do you decide the shard key?"
Common Mistake: Sharding by creation date creates a hot spot on the newest shard since all new users go to the same node.
Ideal Answer:
Choose a shard key based on your most common access pattern. For a users table, lookups are almost always by user_id, so hash-based sharding on user_id is the best choice: it spreads both data and load evenly across nodes.
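As a rough sketch of hash-based sharding on user_id, the function below maps an ID to a shard index. The name `shard_for` and the choice of SHA-256 are assumptions for illustration; any stable hash works, but Python's built-in `hash()` is avoided because it is randomized per process for strings.

```python
import hashlib


def shard_for(user_id: int, num_shards: int) -> int:
    """Map a user_id to a shard index using a stable hash."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then take the modulus.
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_counts(user_ids, num_shards):
    """Count how many of the given user_ids land on each shard."""
    counts = [0] * num_shards
    for uid in user_ids:
        counts[shard_for(uid, num_shards)] += 1
    return counts
```

Calling `shard_counts(range(10_000), 4)` shows sequential IDs spreading roughly evenly across the four shards, in contrast to date-based sharding, where every new user would land on the same node.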
Key Concepts:
Question: "A user updates their profile picture, the page refreshes, and they see their old picture because the read hit a replica that hasn't caught up yet. How do you fix this?"
Root Cause: Asynchronous replication means replicas may be seconds behind the primary. A read immediately after a write may hit a stale replica.
Ideal Answer: Implement "read-your-own-writes" consistency. When a user updates their profile, set a cookie or cache entry with the timestamp of the write. While that write is recent (within a configured window of X seconds), or whenever the replica's last-applied timestamp is older than the write timestamp, route that specific user's reads to the primary DB. All other users can continue to read from the replica.
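The time-window variant of this routing rule might look like the sketch below. `ReadRouter`, `pin_seconds`, and the in-memory dictionary standing in for the cookie or cache entry are hypothetical names chosen for the example; the replica-timestamp comparison is left out to keep it short.

```python
import time


class ReadRouter:
    """Route a user's reads to the primary for a short window after they write."""

    def __init__(self, pin_seconds=5.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds   # the "X seconds" window (assumed value)
        self.clock = clock               # injectable so the logic can be tested
        self.last_write = {}             # user_id -> time of that user's last write

    def record_write(self, user_id):
        self.last_write[user_id] = self.clock()

    def choose(self, user_id):
        wrote_at = self.last_write.get(user_id)
        if wrote_at is not None and self.clock() - wrote_at < self.pin_seconds:
            return "primary"
        return "replica"
```

Only the user who just wrote is pinned to the primary, so the extra load on the primary stays proportional to the write rate rather than the total read rate.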
Key Concepts:
Question: "Two users simultaneously try to book the last available seat on a flight. Both read that 1 seat is available. Both proceed to book it. How do you prevent overbooking?"
Ideal Answer:
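Use the Serializable isolation level, or optimistic concurrency control with version numbers, so that the second booking attempt fails instead of silently succeeding. Alternatively, use SELECT ... FOR UPDATE to acquire a row-level lock before checking availability, so the second transaction waits and then sees that no seats remain.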
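The version-number approach can be demonstrated with an in-memory SQLite database. This is a minimal sketch under assumptions: the `flights` table, its columns, and the helper names `make_demo_db` and `book_seat` are invented for the example, and SQLite is used only because it ships with Python (it does not support SELECT ... FOR UPDATE, so that variant is not shown).

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with one flight that has one seat left."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flights ("
        "id INTEGER PRIMARY KEY, available INTEGER, version INTEGER)"
    )
    conn.execute("INSERT INTO flights VALUES (1, 1, 0)")
    conn.commit()
    return conn


def book_seat(conn, flight_id, seen_version):
    """Book one seat only if the row is unchanged since the caller read it."""
    cur = conn.execute(
        "UPDATE flights SET available = available - 1, version = version + 1 "
        "WHERE id = ? AND version = ? AND available > 0",
        (flight_id, seen_version),
    )
    conn.commit()
    # rowcount is 0 when another booking bumped the version first.
    return cur.rowcount == 1
```

If both users read version 0 and then each call `book_seat(conn, 1, 0)`, the first call returns `True` and the second returns `False`, because the first update already changed the version. The losing caller can re-read the row and report that the flight is full.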
Key Concepts:
SELECT FOR UPDATE to lock rows during the transaction
Question: "We're building a social network. We need to store user profiles, posts, and friend relationships. We expect to query 'show me all posts from my friends, sorted by time.' Should we use SQL or NoSQL?"
Ideal Answer: This depends on scale. At moderate scale, a relational database (PostgreSQL) handles the JOIN between friends and posts efficiently with proper indexing. At massive scale, you might denormalize: maintain a per-user "feed" table (fan-out on write) in a NoSQL store like Cassandra for fast reads, while keeping the source-of-truth relationships in SQL.
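The fan-out-on-write idea can be illustrated with plain dictionaries standing in for the feed table. `FeedStore` and its methods are hypothetical names for this sketch; a production system would persist the feeds in a store such as Cassandra and keep the friend relationships in the relational source of truth.

```python
from collections import defaultdict


class FeedStore:
    """Toy fan-out on write: each post is copied into every friend's feed."""

    def __init__(self):
        self.friends = defaultdict(set)   # user -> set of friends
        self.feeds = defaultdict(list)    # user -> [(timestamp, author, text)]

    def add_friend(self, a, b):
        self.friends[a].add(b)
        self.friends[b].add(a)

    def publish(self, author, text, timestamp):
        # The write is expensive: one insert per friend.
        for friend in self.friends[author]:
            self.feeds[friend].append((timestamp, author, text))

    def feed(self, user, limit=10):
        # The read is cheap: no join, just the precomputed list, newest first.
        return sorted(self.feeds[user], reverse=True)[:limit]
```

The trade-off is visible in `publish`: write cost grows with the author's friend count, which is why this design is usually reserved for read-heavy feeds at large scale.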
Key Concepts:
"Welcome! Let's dive into databases. I want to start with the fundamentals -- can you walk me through how a B-tree differs from an LSM tree? And given a system that ingests 100,000 IoT metrics per second but only generates daily reports, which would you choose?"
Generate scorecard based on the Evaluation Rubric. Highlight strengths, improvement areas, and recommended resources.