MongoDB ReplicaSet – Deployment

The following picture shows a typical environment with an application querying a three-node replica set.

We have a client application (the Mongo shell, another client, or any custom application) that connects to the database through a driver. Every supported programming language has its own specialized driver for MongoDB.

During normal operation, a replica set has exactly one PRIMARY node; all the others are SECONDARY. The PRIMARY member is the only one that receives writes. It applies each write to its local collections and documents and records the operation in the oplog. The oplog entries are then sent through the replication channel to all the SECONDARY nodes. Each SECONDARY node asynchronously applies the same modifications to its local data and its own oplog.

The next picture shows how the replica set works internally. Each node is connected to all the others, and a heartbeat mechanism pings every other member at regular intervals. If a member doesn't respond to heartbeats within the configurable timeout (10 seconds by default), the other members mark it as inaccessible.
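If the default doesn't suit your environment, the heartbeat timeout can be tuned in the replica set configuration. A minimal sketch, assuming a mongo shell connected to the PRIMARY of an already initiated replica set (in the 2.6-era configuration the setting is `settings.heartbeatTimeoutSecs`):

```javascript
// Assumption: shell connected to the PRIMARY of a running replica set.
// Raises the heartbeat timeout from the default 10 seconds to 15.
cfg = rs.conf()
cfg.settings = cfg.settings || {}
cfg.settings.heartbeatTimeoutSecs = 15
rs.reconfig(cfg)
```

Note that `rs.reconfig()` may trigger an election itself, so run it during a maintenance window.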

If every node acknowledges the heartbeat, the cluster continues to work. If one of the nodes crashes, the PRIMARY for example (the worst case), an election takes place among the remaining nodes.

When a SECONDARY doesn't receive a response to its heartbeats within the configured timeout, it calls for an election. The surviving nodes vote for a new PRIMARY. The election normally completes quickly, and the election algorithm is designed to choose the best candidate: typically the secondary whose data is most up to date with the failed primary.

Besides the crash of the primary, there are other situations in which a node calls for an election: when a node is added to the replica set, when the replica set is initiated, or during certain maintenance activities. These kinds of elections are outside the scope of this article.

The replica set cannot process write operations until the election completes successfully, but it can continue to serve read queries if those queries are configured to run on secondaries (we'll discuss that later). Once the election completes, the cluster resumes normal operations.
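As a taste of what "reads on secondaries" means, here is a sketch of a session against the replica set we build below; it assumes a secondary listening on port 27018. In the 2.6-era mongo shell, a secondary refuses reads unless you opt in for the connection:

```javascript
// From a shell connected to a SECONDARY (e.g. mongo --port 27018):
rs.slaveOk()      // allow reads on this secondary connection
db.test.find()    // now served locally by the secondary
```

Without `rs.slaveOk()`, the same query would fail with a "not master" error.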

The following picture shows what we’ve described so far.

To work properly, a replica set needs an odd number of voting members. In the event of a network split, an odd number guarantees that exactly one of the two subsets holds a majority of the votes. A new PRIMARY is then elected in the subset with the majority of nodes.

As a consequence, three is the minimum number of nodes for a replica set that assures high availability.
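The majority rule behind this can be sketched in a few lines of JavaScript (the `majority` function is ours for illustration, not a MongoDB API):

```javascript
// A member can be elected PRIMARY only if a strict majority of the
// voting members can see each other.
function majority(totalMembers) {
  return Math.floor(totalMembers / 2) + 1;
}

// 3 members split 2/1: the 2-node side reaches majority(3) = 2 and
// elects a PRIMARY; the isolated node steps down to SECONDARY.
console.log(majority(3));
// 4 members split 2/2: neither side reaches majority(4) = 3, so no
// PRIMARY can be elected -- an even member count buys nothing extra.
console.log(majority(4));
```

This is why going from three members to four does not improve fault tolerance: both configurations survive the loss of exactly one voting member.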

1. Become the Sudo user
[tom@localhost ~]$ sudo su
[sudo] password for tom:

2. Create separate database directories for each Mongod process
[root@localhost tom]# cd /var/lib/mongo
[root@localhost mongo]# mkdir replicaset0
[root@localhost mongo]# mkdir replicaset1
[root@localhost mongo]# mkdir replicaset2

3. Start the first Mongod process on port 27017
[root@localhost mongo]# mongod --port 27017 --dbpath /var/lib/mongo/replicaset0 --logpath /var/lib/mongo/replicaset0/log --replSet myreplicaset --fork --smallfiles --nojournal
about to fork child process, waiting until server is ready for connections.
forked process: 2984
child process started successfully, parent exiting

4. Start the second Mongod process on port 27018
[root@localhost mongo]# mongod --port 27018 --dbpath /var/lib/mongo/replicaset1 --logpath /var/lib/mongo/replicaset1/log --replSet myreplicaset --fork --smallfiles --nojournal
about to fork child process, waiting until server is ready for connections.
forked process: 3036
child process started successfully, parent exiting

5. Start the third Mongod process on port 27019
[root@localhost mongo]# mongod --port 27019 --dbpath /var/lib/mongo/replicaset2 --logpath /var/lib/mongo/replicaset2/log --replSet myreplicaset --fork --smallfiles --nojournal
about to fork child process, waiting until server is ready for connections.
forked process: 3082
child process started successfully, parent exiting

6. Log into the Console of the first Mongod Process on port 27017
[tom@localhost ~]$ mongo --port 27017
MongoDB shell version: 2.6.3
connecting to: 127.0.0.1:27017/test
>

7. Initiate the Replica Set
> rs.initiate()
{
"info2" : "no configuration explicitly specified -- making one",
"me" : "localhost.localdomain:27017",
"info" : "Config now saved locally. Should come online in about a minute.",
"ok" : 1
}

8. Press Enter after a second or so and you should see the prompt change to this:
myreplicaset:PRIMARY>

9. Add members to the Replica set
Syntax: myreplicaset:PRIMARY> rs.add("<hostname>:<port>")
myreplicaset:PRIMARY> rs.add("localhost.localdomain:27018")
{ "ok" : 1 } 

10. Do the same for the Mongod process on port 27019
myreplicaset:PRIMARY> rs.add("localhost.localdomain:27019")
{ "ok" : 1 }
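At this point you can verify the deployment from the same shell. A quick sketch (the exact output shape varies by MongoDB version):

```javascript
// Still connected to the PRIMARY on port 27017:
rs.conf()      // lists the three members we just configured
rs.status()    // shows each member's state (PRIMARY/SECONDARY) and health
```

In `rs.status()` you should see one member with `"stateStr" : "PRIMARY"` and two with `"stateStr" : "SECONDARY"`.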
