Master MongoDB replication, sharding, clustering, and high availability. Learn replica sets, failover, shard key design, distributed architecture, multi-region deployments, and disaster recovery for planet-scale systems.
Master MongoDB replication and sharding for high-availability, planet-scale systems. Design replica sets with automatic failover, choose optimal shard keys, and build multi-region architectures with disaster recovery.
/plugin marketplace add pluginagentmarketplace/custom-plugin-mongodb
/plugin install mongodb-developer-plugin@pluginagentmarketplace-mongodb
Build highly available, globally distributed MongoDB systems at scale.
This agent specializes in MongoDB's distributed systems capabilities, essential for high-availability and planet-scale deployments. Master replica sets with automatic failover, sharding strategies for terabyte-scale data, shard key design, cluster management, backup and recovery procedures, and multi-region architectures.
You'll learn: Replica sets, primary-secondary models, failover and recovery, read preferences and write concerns, sharding concepts, shard key selection, range vs. hash sharding, zone-based sharding, cluster administration, backup strategies, and disaster recovery planning.
Focus on basic replica set setup and high availability:
Example: Replica set initialization
// Start 3 mongod instances on different ports
// mongod --port 27017 --replSet myReplSet --dbpath /data/node1
// mongod --port 27018 --replSet myReplSet --dbpath /data/node2
// mongod --port 27019 --replSet myReplSet --dbpath /data/node3
// Initialize replica set
rs.initiate({
  _id: "myReplSet",
  members: [
    { _id: 0, host: "localhost:27017", priority: 2 },
    { _id: 1, host: "localhost:27018", priority: 1 },
    { _id: 2, host: "localhost:27019", priority: 0 } // Priority 0: votes and replicates, but is never elected primary
  ]
})
// Check replica set status
rs.status() // View primary/secondary/arbiter state
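Once the set is initialized, applications should connect with the replica set connection string so the driver tracks the topology and fails over to a newly elected primary on its own. A minimal mongosh sketch, assuming the localhost hosts above and an illustrative database name:
// Connect through the replica set URI (hosts and database name are illustrative)
const conn = Mongo("mongodb://localhost:27017,localhost:27018,localhost:27019/?replicaSet=myReplSet")
const appDb = conn.getDB("mydb")
// Allow reads to fall back to a secondary if the primary is briefly unavailable
appDb.getMongo().setReadPref("primaryPreferred")
appDb.users.findOne()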
Master sharding and cluster management:
Example: Sharded cluster setup with shard key
// Connect to mongos (the shard router), then enable sharding on the database
sh.enableSharding("mydb")
// Choose the shard key carefully!
// Bad: a monotonically increasing or low-cardinality key (hotspots, unsplittable chunks)
// Good: a hashed key on a high-cardinality field (distributes writes evenly)
db.users.createIndex({ email: "hashed" })
sh.shardCollection("mydb.users", { email: "hashed" })
// Verify sharding
sh.status() // View shard distribution
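Before any collection can be sharded, the cluster needs shards registered with the mongos. A hedged sketch, assuming each shard is its own replica set with placeholder names and hosts:
// Register shard replica sets with the cluster (names and hosts are placeholders)
sh.addShard("shardRS1/shard1a:27018,shard1b:27018,shard1c:27018")
sh.addShard("shardRS2/shard2a:27018,shard2b:27018,shard2c:27018")
// Confirm the balancer is on and inspect how chunks are spread for a collection
sh.getBalancerState()
db.users.getShardDistribution()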
Build planet-scale, multi-region deployments:
Example: Multi-region replica set with zones
// Add geographic tags to replica set members via reconfiguration
cfg = rs.conf()
cfg.members[0].tags = { "zone": "us-east" }
cfg.members[1].tags = { "zone": "eu-west" }
cfg.members[2].tags = { "zone": "ap-south" }
rs.reconfig(cfg)
# Configure a zone-aware read preference: read from the nearest tagged secondary (PyMongo)
from pymongo.read_preferences import SecondaryPreferred

users = client.mydb.users.with_options(
    read_preference=SecondaryPreferred(tag_sets=[{"zone": "us-east"}])
)
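The same zone-aware routing can be expressed in the connection string, which keeps the preference out of application code. A sketch assuming hypothetical hostnames and the member tags configured above:
// readPreferenceTags in the URI selects tagged secondaries (hosts are illustrative)
const geoConn = Mongo("mongodb://node1:27017,node2:27017/?replicaSet=myReplSet&readPreference=secondaryPreferred&readPreferenceTags=zone:us-east")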
Build a geo-distributed system with regional sharding:
// Shard by region plus hashed userId (even write distribution within each region)
db.users.createIndex({ region: 1, userId: "hashed" })
sh.shardCollection("socialdb.users", { region: 1, userId: "hashed" })
// Assign shards to geographic zones, then pin each region's range to its zone
// (shard names are placeholders)
sh.addShardToZone("shardUS", "us-east")
sh.addShardToZone("shardEU", "eu-west")
sh.addShardToZone("shardAP", "ap-south")
sh.updateZoneKeyRange("socialdb.users", { region: "us", userId: MinKey }, { region: "us", userId: MaxKey }, "us-east")
sh.updateZoneKeyRange("socialdb.users", { region: "eu", userId: MinKey }, { region: "eu", userId: MaxKey }, "eu-west")
sh.updateZoneKeyRange("socialdb.users", { region: "ap", userId: MinKey }, { region: "ap", userId: MaxKey }, "ap-south")
Shard by orderId with cross-region replicas for read scaling and disaster recovery:
// Shard orders by orderId to distribute writes
sh.shardCollection("ecommerce.orders", { orderId: "hashed" })
// Replica set with cross-region members
// Primary in us-east (write operations)
// Secondary in eu-west (read scaling)
// Secondary in ap-south (disaster recovery)
// Configure read preferences
// Production: Write to primary, read from secondaries
// Reporting: Read from geo-local secondary
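In mongosh, that split looks like majority-acknowledged writes for orders and zone-tagged secondary reads for reporting. A sketch assuming members tagged by zone as earlier and illustrative field names:
// Production write path: acknowledged by a majority of members
db.orders.insertOne({ orderId: "o-1001", total: 99.5 }, { writeConcern: { w: "majority" } })
// Reporting path: read from a geo-local tagged secondary
db.getMongo().setReadPref("secondaryPreferred", [{ "zone": "eu-west" }])
db.orders.countDocuments({ status: "shipped" })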
Shard by deviceId with automatic retention:
// Shard time-series data by deviceId
db.sensor_data.createIndex({ deviceId: "hashed", timestamp: 1 })
sh.shardCollection("iotdb.sensor_data", { deviceId: "hashed" })
// TTL index for data retention (30 days)
db.sensor_data.createIndex({ timestamp: 1 }, { expireAfterSeconds: 2592000 })
// 3-node replica set: Primary + 2 Secondaries
// Write concern: "majority" for durability
Mission-critical system with disaster recovery:
// Replica set with voting members
// Primary: Production datacenter
// Secondary: Standby datacenter (can be elected)
// Secondary: Disaster recovery (remote location)
rs.initiate({
  _id: "financeCluster",
  members: [
    { _id: 0, host: "primary-dc:27017", priority: 3 },
    { _id: 1, host: "standby-dc:27017", priority: 2 },
    { _id: 2, host: "disaster-dc:27017", priority: 1, tags: { "location": "dr" } }
  ]
})
// Critical writes must succeed on majority
// Read operations can come from any secondary
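For the finance cluster, those rules translate into an explicit write concern with a timeout and a relaxed read preference for non-critical reads. A sketch with illustrative collection and field names:
// Critical write: fail fast if a majority cannot acknowledge within 10 seconds
db.transactions.insertOne(
  { account: "acct-1", amount: 250, ts: new Date() },
  { writeConcern: { w: "majority", wtimeout: 10000 } }
)
// Non-critical read: any secondary, including the DR member
db.getMongo().setReadPref("secondaryPreferred")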
// ❌ Wrong: Ascending on timestamp (creates sequential hotspots)
sh.shardCollection("db.logs", { timestamp: 1 })
// All new inserts land on the shard holding the max-key chunk (a hotspot)
// ✅ Correct: Hash on high-cardinality field
sh.shardCollection("db.logs", { logId: "hashed" })
// Even distribution across shards
// ❌ Wrong: 2-node replica set (no automatic failover)
// If one fails, no election possible (need majority)
// ✅ Correct: 3-node replica set minimum
// Primary + 2 Secondaries (or 1 Secondary + 1 Arbiter)
# ❌ Wrong: relying on the default write concern (w=1 acknowledges on the primary only)
collection.insert_one(document)
# ✅ Correct: wait for majority acknowledgment
from pymongo import WriteConcern
collection.with_options(
    write_concern=WriteConcern(w="majority")  # replicated to a majority (2 of 3) of members
).insert_one(document)
// ❌ Wrong: Default oplog size might be insufficient
// A secondary that falls outside the oplog window needs a full resync
// ✅ Correct: Size the oplog to cover your longest expected secondary downtime
// Rule of thumb: keep at least 24 hours of operations in the oplog
// Example: at 100 MB/min of oplog churn, 24 hours ≈ 144 GB
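To check the current oplog window and resize it online, a sketch assuming a roughly 50 GB target (replSetResizeOplog takes a size in megabytes):
// How much time the current oplog covers
rs.printReplicationInfo()
// Resize without restarting the member (MongoDB 3.6+, WiredTiger)
db.adminCommand({ replSetResizeOplog: 1, size: 51200 })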
// ❌ Wrong: All nodes in one datacenter
// One failure takes down entire cluster
// ✅ Correct: Distribute across regions with priority
rs.initiate({
  _id: "globalReplSet",
  members: [
    { _id: 0, host: "us-east:27017", priority: 10 }, // Primary region
    { _id: 1, host: "eu-west:27017", priority: 5 },  // Secondary region
    { _id: 2, host: "ap-south:27017", priority: 1 }  // DR region
  ]
})
Q: How many nodes do I need? A: Minimum 3: 1 Primary + 2 Secondaries (or 1 Secondary + 1 Arbiter). Keep an odd number of voting members so elections can always reach a majority.
Q: Can I add a shard to an existing cluster? A: Yes, but data migration takes time. Plan capacity before sharding.
Q: What if my shard key choice was wrong? A: MongoDB 5.0+ allows resharding with a new shard key, but it's slow. Choose carefully.
Q: How do I handle failover? A: It's automatic: when the primary fails, the secondaries elect a new primary. Connect with the replica set connection string so the driver discovers the new primary on its own.
Q: What's the difference between read preference and write concern? A: Write concern = where/how many replicas must acknowledge writes. Read preference = which replica to read from.
Q: Can I have a sharded cluster with only 1 shard? A: Yes, but it defeats the purpose. Use for future scaling or if required by infrastructure.
Q: How often should I backup? A: Every 4-24 hours depending on RPO requirements. Test recovery monthly.
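For a self-managed replica set, a baseline is a periodic mongodump with oplog capture plus regular restore drills; paths and schedule below are illustrative:
// Point-in-time-consistent dump taken through the replica set URI
// mongodump --uri="mongodb://localhost:27017/?replicaSet=myReplSet" --oplog --out=/backups/nightly
// Restore and replay the captured oplog entries
// mongorestore --oplogReplay /backups/nightly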
Ready to scale MongoDB globally! 🌍