Master MongoDB replication, sharding, clustering, and high availability. Learn replica sets, failover, shard key design, distributed architecture, multi-region deployments, and disaster recovery for planet-scale systems.
Master MongoDB replication and sharding for high-availability, planet-scale systems. Design replica sets with automatic failover, choose optimal shard keys, and build multi-region architectures with disaster recovery.
/plugin marketplace add pluginagentmarketplace/custom-plugin-mongodb
/plugin install mongodb-developer-plugin@pluginagentmarketplace-mongodb
Build highly available, globally distributed MongoDB systems at scale.
This agent specializes in MongoDB's distributed systems capabilities, essential for high-availability and planet-scale deployments. Master replica sets with automatic failover, sharding strategies for terabyte-scale data, shard key design, cluster management, backup and recovery procedures, and multi-region architectures.
You'll learn: Replica sets, primary-secondary models, failover and recovery, read preferences and write concerns, sharding concepts, shard key selection, range vs. hash sharding, zone-based sharding, cluster administration, backup strategies, and disaster recovery planning.
Focus on basic replica set setup and high availability:
Example: Replica set initialization
// Start 3 mongod instances on different ports
// mongod --port 27017 --replSet myReplSet --dbpath /data/node1
// mongod --port 27018 --replSet myReplSet --dbpath /data/node2
// mongod --port 27019 --replSet myReplSet --dbpath /data/node3
// Initialize replica set
rs.initiate({
  _id: "myReplSet",
  members: [
    { _id: 0, host: "localhost:27017", priority: 2 },
    { _id: 1, host: "localhost:27018", priority: 1 },
    { _id: 2, host: "localhost:27019", priority: 0 } // Priority 0: votes and replicates, but is never elected primary
  ]
})
// Check replica set status
rs.status() // View primary/secondary/arbiter state
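Once the set is initialized, applications should connect with the replica set connection string so the driver tracks the topology and fails over to a newly elected primary on its own. A minimal mongosh sketch, assuming the localhost hosts above and an illustrative database name:
// Connect through the replica set URI (hosts and database name are illustrative)
const conn = Mongo("mongodb://localhost:27017,localhost:27018,localhost:27019/?replicaSet=myReplSet")
const appDb = conn.getDB("mydb")
// Allow reads to fall back to a secondary if the primary is briefly unavailable
appDb.getMongo().setReadPref("primaryPreferred")
appDb.users.findOne()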
Master sharding and cluster management:
Example: Sharded cluster setup with shard key
// Connect to mongos (the shard router), then enable sharding on the database
sh.enableSharding("mydb")
// Choose the shard key carefully!
// Bad: a monotonically increasing or low-cardinality key (hotspots, unsplittable chunks)
// Good: a hashed key on a high-cardinality field (distributes writes evenly)
db.users.createIndex({ email: "hashed" })
sh.shardCollection("mydb.users", { email: "hashed" })
// Verify sharding
sh.status() // View shard distribution
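Before any collection can be sharded, the cluster needs shards registered with the mongos. A hedged sketch, assuming each shard is its own replica set with placeholder names and hosts:
// Register shard replica sets with the cluster (names and hosts are placeholders)
sh.addShard("shardRS1/shard1a:27018,shard1b:27018,shard1c:27018")
sh.addShard("shardRS2/shard2a:27018,shard2b:27018,shard2c:27018")
// Confirm the balancer is on and inspect how chunks are spread for a collection
sh.getBalancerState()
db.users.getShardDistribution()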
Build planet-scale, multi-region deployments:
Example: Multi-region replica set with zones
// Add geographic tags to replica set members via reconfiguration
cfg = rs.conf()
cfg.members[0].tags = { "zone": "us-east" }
cfg.members[1].tags = { "zone": "eu-west" }
cfg.members[2].tags = { "zone": "ap-south" }
rs.reconfig(cfg)
# Configure a zone-aware read preference: read from the nearest tagged secondary (PyMongo)
from pymongo.read_preferences import SecondaryPreferred

users = client.mydb.users.with_options(
    read_preference=SecondaryPreferred(tag_sets=[{"zone": "us-east"}])
)
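The same zone-aware routing can be expressed in the connection string, which keeps the preference out of application code. A sketch assuming hypothetical hostnames and the member tags configured above:
// readPreferenceTags in the URI selects tagged secondaries (hosts are illustrative)
const geoConn = Mongo("mongodb://node1:27017,node2:27017/?replicaSet=myReplSet&readPreference=secondaryPreferred&readPreferenceTags=zone:us-east")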
Build a geo-distributed system with regional sharding:
// Shard by region plus hashed userId (even write distribution within each region)
db.users.createIndex({ region: 1, userId: "hashed" })
sh.shardCollection("socialdb.users", { region: 1, userId: "hashed" })
// Assign shards to geographic zones, then pin each region's range to its zone
// (shard names are placeholders)
sh.addShardToZone("shardUS", "us-east")
sh.addShardToZone("shardEU", "eu-west")
sh.addShardToZone("shardAP", "ap-south")
sh.updateZoneKeyRange("socialdb.users", { region: "us", userId: MinKey }, { region: "us", userId: MaxKey }, "us-east")
sh.updateZoneKeyRange("socialdb.users", { region: "eu", userId: MinKey }, { region: "eu", userId: MaxKey }, "eu-west")
sh.updateZoneKeyRange("socialdb.users", { region: "ap", userId: MinKey }, { region: "ap", userId: MaxKey }, "ap-south")
Shard by orderId with cross-region replicas for read scaling and disaster recovery:
// Shard orders by orderId to distribute writes
sh.shardCollection("ecommerce.orders", { orderId: "hashed" })
// Replica set with cross-region members
// Primary in us-east (write operations)
// Secondary in eu-west (read scaling)
// Secondary in ap-south (disaster recovery)
// Configure read preferences
// Production: Write to primary, read from secondaries
// Reporting: Read from geo-local secondary
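In mongosh, that split looks like majority-acknowledged writes for orders and zone-tagged secondary reads for reporting. A sketch assuming members tagged by zone as earlier and illustrative field names:
// Production write path: acknowledged by a majority of members
db.orders.insertOne({ orderId: "o-1001", total: 99.5 }, { writeConcern: { w: "majority" } })
// Reporting path: read from a geo-local tagged secondary
db.getMongo().setReadPref("secondaryPreferred", [{ "zone": "eu-west" }])
db.orders.countDocuments({ status: "shipped" })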
Shard by deviceId with automatic retention:
// Shard time-series data by deviceId
db.sensor_data.createIndex({ deviceId: "hashed", timestamp: 1 })
sh.shardCollection("iotdb.sensor_data", { deviceId: "hashed" })
// TTL index for data retention (30 days)
db.sensor_data.createIndex({ timestamp: 1 }, { expireAfterSeconds: 2592000 })
// 3-node replica set: Primary + 2 Secondaries
// Write concern: "majority" for durability
Mission-critical system with disaster recovery:
// Replica set with voting members
// Primary: Production datacenter
// Secondary: Standby datacenter (can be elected)
// Secondary: Disaster recovery (remote location)
rs.initiate({
  _id: "financeCluster",
  members: [
    { _id: 0, host: "primary-dc:27017", priority: 3 },
    { _id: 1, host: "standby-dc:27017", priority: 2 },
    { _id: 2, host: "disaster-dc:27017", priority: 1, tags: { "location": "dr" } }
  ]
})
// Critical writes must succeed on majority
// Read operations can come from any secondary
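For the finance cluster, those rules translate into an explicit write concern with a timeout and a relaxed read preference for non-critical reads. A sketch with illustrative collection and field names:
// Critical write: fail fast if a majority cannot acknowledge within 10 seconds
db.transactions.insertOne(
  { account: "acct-1", amount: 250, ts: new Date() },
  { writeConcern: { w: "majority", wtimeout: 10000 } }
)
// Non-critical read: any secondary, including the DR member
db.getMongo().setReadPref("secondaryPreferred")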
// ❌ Wrong: Ascending on timestamp (creates sequential hotspots)
sh.shardCollection("db.logs", { timestamp: 1 })
// All new inserts land on the shard holding the max-key chunk (a hotspot)
// ✅ Correct: Hash on high-cardinality field
sh.shardCollection("db.logs", { logId: "hashed" })
// Even distribution across shards
// ❌ Wrong: 2-node replica set (no automatic failover)
// If one fails, no election possible (need majority)
// ✅ Correct: 3-node replica set minimum
// Primary + 2 Secondaries (or 1 Secondary + 1 Arbiter)
# ❌ Wrong: relying on the default write concern (w=1 acknowledges on the primary only)
collection.insert_one(document)
# ✅ Correct: wait for majority acknowledgment
from pymongo import WriteConcern
collection.with_options(
    write_concern=WriteConcern(w="majority")  # replicated to a majority (2 of 3) of members
).insert_one(document)
// ❌ Wrong: Default oplog size might be insufficient
// A secondary that falls outside the oplog window needs a full resync
// ✅ Correct: Size the oplog to cover your longest expected secondary downtime
// Rule of thumb: keep at least 24 hours of operations in the oplog
// Example: at 100 MB/min of oplog churn, 24 hours ≈ 144 GB
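To check the current oplog window and resize it online, a sketch assuming a roughly 50 GB target (replSetResizeOplog takes a size in megabytes):
// How much time the current oplog covers
rs.printReplicationInfo()
// Resize without restarting the member (MongoDB 3.6+, WiredTiger)
db.adminCommand({ replSetResizeOplog: 1, size: 51200 })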
// ❌ Wrong: All nodes in one datacenter
// One failure takes down entire cluster
// ✅ Correct: Distribute across regions with priority
rs.initiate({
  _id: "globalReplSet",
  members: [
    { _id: 0, host: "us-east:27017", priority: 10 }, // Primary region
    { _id: 1, host: "eu-west:27017", priority: 5 },  // Secondary region
    { _id: 2, host: "ap-south:27017", priority: 1 }  // DR region
  ]
})
Q: How many nodes do I need? A: Minimum 3: 1 Primary + 2 Secondaries (or 1 Secondary + 1 Arbiter). Keep an odd number of voting members so elections can always reach a majority.
Q: Can I add a shard to an existing cluster? A: Yes, but data migration takes time. Plan capacity before sharding.
Q: What if my shard key choice was wrong? A: MongoDB 5.0+ allows resharding with a new shard key, but it's slow. Choose carefully.
Q: How do I handle failover? A: It's automatic: when the primary fails, the secondaries elect a new primary. Connect with the replica set connection string so the driver discovers the new primary on its own.
Q: What's the difference between read preference and write concern? A: Write concern = where/how many replicas must acknowledge writes. Read preference = which replica to read from.
Q: Can I have a sharded cluster with only 1 shard? A: Yes, but it defeats the purpose. Use for future scaling or if required by infrastructure.
Q: How often should I backup? A: Every 4-24 hours depending on RPO requirements. Test recovery monthly.
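For a self-managed replica set, a baseline is a periodic mongodump with oplog capture plus regular restore drills; paths and schedule below are illustrative:
// Point-in-time-consistent dump taken through the replica set URI
// mongodump --uri="mongodb://localhost:27017/?replicaSet=myReplSet" --oplog --out=/backups/nightly
// Restore and replay the captured oplog entries
// mongorestore --oplogReplay /backups/nightly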
Ready to scale MongoDB globally! 🌍