Database Replication Encyclopaedia — Single Leader Replication (1/3)

#beginners #learning #database #architecture

Replicated databases are the new reality of data-driven software applications. Gone are the days when a database was confined to a single machine in a data centre. With the rise of distributed systems, database replication has become a topic that every developer should know.

If you are hearing the term ‘Database replication’ for the first time, let me summarise it for you in a single line:

Database replication is the process of maintaining several copies of data across several machines that are interconnected via a network.

Why replicate a database, you may ask? There are several reasons why distributed systems use replication. Some of the important ones are the following:

To keep data geographically closer to the users for reduced latency.
To increase fault tolerance and increase overall availability of the system.
To be able to scale out the number of machines that can serve read queries.

If the replicated data is constant and does not change over time, there is nothing to worry about. Simply replicate your data across nodes and you are done. Almost all of the complexities of database replication arise because of changing data!

To tackle these difficulties, there are three popular approaches to database replication: Single Leader Replication, Multi-Leader Replication, and Leaderless Replication. Almost all database systems use one of these three. In this blog, we will discuss Single Leader Replication.

Single Leader Replication

Each machine that maintains a copy of the data is called a replica.

In single-leader replication, one of the replicas is designated the leader. The leader is responsible for processing write operations received from clients. It is also responsible for propagating those writes to the other replicas, called followers.

While writes are only served by the leader, read requests can be served by both the leader and followers.

On receiving a write request from the client, the leader first updates its own local storage. It then sends the change to the followers, each of which applies it to its own local storage on receipt. In this way, every write accepted by the leader is eventually applied on all replicas, in the same order in which the leader processed it.
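
The write path described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how any particular database implements it: the `Leader` and `Follower` classes and their method names are made up for this example, and a plain method call stands in for the network message.

```python
class Follower:
    """A replica that only applies changes sent by the leader."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        # Apply a change received from the leader to local storage.
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class Leader:
    """The only replica that accepts writes from clients."""

    def __init__(self, followers):
        self.data = {}
        self.followers = followers

    def write(self, key, value):
        self.data[key] = value            # 1. update local storage first
        for follower in self.followers:   # 2. forward the change to every follower
            follower.apply(key, value)

    def read(self, key):
        return self.data.get(key)


followers = [Follower(), Follower()]
leader = Leader(followers)
leader.write("user:1", "Alice")
```

After the write, a read of "user:1" on the leader or on either follower returns the same value, which is what lets reads be spread across replicas.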

Synchronous Vs Asynchronous Replication

Depending on how updates propagate from the leader to the followers, replication is of two types: synchronous and asynchronous.

Under synchronous replication, the leader waits for the followers to acknowledge the update. Only once the update has been written to the leader’s own local storage and to all the synchronous replicas does the leader send a success message in response to the client’s request. This method ensures that the synchronous replicas remain in sync with the leader. Synchronous replication is, however, slow, since all synchronous replicas must commit the change before the operation can be deemed successful. It is also susceptible to downtime: if a single synchronous follower fails or becomes unreachable, the leader cannot complete writes until that follower recovers.

Under asynchronous replication, the leader does not wait for followers to process the write operation. Once the leader updates its own local storage, it sends a success message back to the client. In this setting, the followers may lag behind the leader, so the system may not be consistent immediately but will be eventually consistent. Since the leader does not wait on followers to complete their work, writes are faster than with synchronous replication. The trade-off is that a write acknowledged to the client can be lost if the leader fails before the write reaches any follower.

In practice, enabling synchronous replication usually means that a single follower is replicated synchronously while the other followers are replicated asynchronously. This ensures that there is always at least one follower with up-to-date data that can be made the leader if the current leader fails. This configuration is called semi-synchronous replication.
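
To make the difference concrete, here is a rough Python sketch of a semi-synchronous setup with one synchronous and one asynchronous follower. The class and method names (`enqueue`, `catch_up`, and so on) are assumptions made for this example. A direct call to `apply` stands in for "send the change and wait for the acknowledgement", and a queue stands in for changes that are still in flight.

```python
from collections import deque


class Follower:
    def __init__(self):
        self.data = {}
        self.pending = deque()   # changes received but not yet applied

    def apply(self, key, value):
        self.data[key] = value

    def enqueue(self, key, value):
        # Asynchronous path: the change is queued and applied later.
        self.pending.append((key, value))

    def catch_up(self):
        # Apply every queued change in the order it was received.
        while self.pending:
            self.apply(*self.pending.popleft())


class Leader:
    def __init__(self, sync_followers, async_followers):
        self.data = {}
        self.sync_followers = sync_followers
        self.async_followers = async_followers

    def write(self, key, value):
        self.data[key] = value
        for follower in self.sync_followers:
            follower.apply(key, value)     # blocks until the follower has applied it
        for follower in self.async_followers:
            follower.enqueue(key, value)   # fire and forget
        return "ok"                        # success goes back to the client here


sync_follower, async_follower = Follower(), Follower()
leader = Leader([sync_follower], [async_follower])
status = leader.write("x", 1)
async_before = dict(async_follower.data)   # still empty: the follower is lagging
async_follower.catch_up()
```

Right after the write returns, the synchronous follower already holds the new value while the asynchronous one does not. It only converges once `catch_up` runs, which is the "eventually consistent" behaviour described above.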

Adding new Followers

Adding a new follower replica to the cluster requires some care. Simply copying the leader’s database files might not give a correct result, since the leader is actively accepting write requests and its state is constantly changing; the copy could capture different parts of the database at different points in time and end up inconsistent. Another option is to lock the leader against writes while copying. This is also not acceptable, since it would cause write downtime every time a new follower is added.

Instead, we can take a consistent snapshot of the leader’s database. Most databases provide this option for the purpose of backup. The snapshot is restored on the new follower. Once the snapshot is restored, the follower asks the leader for all the changes that have occurred since the snapshot was taken. To do so, it needs the exact position in the leader’s replication log that the snapshot corresponds to, for example a Log Sequence Number (LSN).

Once the follower has caught up, it can continue working like the other followers.

Handling Node Outages

It is common for nodes to drop out of the cluster. It can be because of network or machine failure, or it could be because of scheduled maintenance. Let us understand how node outages are handled in Single-Leader replication.

When a follower node goes down, the system can continue to work as usual. Read requests that were earlier served by the impacted node will now be served by other replicas. Later, when the follower rejoins, it can request from the leader all the updates it missed while it was down and then continue working normally.

Things become interesting in case of a leader failure. To keep the system in a working state, a new leader has to be elected. There are several algorithms by which the cluster can select the new leader. Once the new leader is elected, clients need to be reconfigured to send all write requests to it, and all the followers must start accepting updates from it. This process is called failover.
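
As a very rough illustration of the selection step, one common heuristic is to promote the live follower that has applied the most changes, since it loses the least data. The sketch below shows only that heuristic. The `Replica` type, its fields and the `elect_leader` function are hypothetical, and a real cluster would additionally need failure detection and agreement among the nodes on the outcome.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    lsn: int            # position of the last change this replica has applied
    alive: bool = True


def elect_leader(replicas):
    """Pick the live replica with the highest LSN as the new leader."""
    candidates = [r for r in replicas if r.alive]
    if not candidates:
        raise RuntimeError("no live replica available to promote")
    return max(candidates, key=lambda r: r.lsn)


replicas = [
    Replica("old-leader", lsn=120, alive=False),   # the failed leader
    Replica("follower-a", lsn=118),
    Replica("follower-b", lsn=115),
]
new_leader = elect_leader(replicas)
```

In this example `follower-a` is chosen because it is the most up to date among the surviving replicas. The two changes it never received from the old leader (LSN 119 and 120) would be lost, unless they had been replicated synchronously.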

Issues arising from Replication Lag

Replication lag between a follower and the leader can lead to issues that would not occur in a single-machine database. These issues require careful consideration, especially when the database does not provide a mechanism to solve them.

Read Your Own Writes

Consider the following scenario: a client sends an update request to the leader. As soon as the client receives a success response from the leader, it makes a read request to the cluster and receives a response from an asynchronous follower that has not yet applied the update. To the client, its update appears to have been lost. We must therefore design applications that provide read-after-write consistency.

There are several possible ways to provide read-after-write consistency in our application. Some of them are as follows:

We can always read from the leader any data that the client may have updated. This obviously depends on being able to identify all the data that a client could update.
Another way is to serve all reads from the leader for a certain time after a client makes an update. We can also monitor the replication lag on the followers and prevent queries on any follower that lags by more than a certain amount of time.
The client may remember the timestamp of its most recent update, which the cluster can then use to respond only from replicas that are up to date at least until that point in time.
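
The second approach, reading from the leader for a while after a write, could look roughly like the sketch below. The `ReadRouter` class and the 60-second window are assumptions chosen for illustration, and the window would need to be at least as long as the replication lag you expect. The clock is injectable so that the example can be run without actually waiting.

```python
import time


class ReadRouter:
    """Route a user's reads to the leader for a while after they write."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock
        self.last_write_at = {}   # user id -> time of that user's last write

    def record_write(self, user_id):
        self.last_write_at[user_id] = self.clock()

    def choose_replica(self, user_id):
        last = self.last_write_at.get(user_id)
        if last is not None and self.clock() - last < self.window_seconds:
            return "leader"     # recent write: a follower may still be lagging
        return "follower"       # no recent write: any follower will do


# A fake clock keeps the example deterministic.
now = [0.0]
router = ReadRouter(window_seconds=60.0, clock=lambda: now[0])
router.record_write("alice")
now[0] = 10.0
first = router.choose_replica("alice")   # 10 s after the write
now[0] = 100.0
later = router.choose_replica("alice")   # 100 s after the write
```

Here the read made 10 seconds after the write goes to the leader, while the one made 100 seconds later can safely go to a follower. A user who has never written is always routed to a follower.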

The same user accessing your cluster from different devices poses new challenges. In such cases, we want to provide cross-device read-after-write consistency.

Monotonic Reads

With multiple replicas having different replication lags, it is possible that a client, after making a request to an up-to-date replica, makes a request to a replica with a larger lag. In this situation, the client would see an older state, and to it time would appear to go backward.

Monotonic reads is a guarantee that this kind of anomaly does not happen. One way of achieving it is to ensure that a client always reads from the same replica. This is not an ideal solution, though, since a replica may go down at any point in time, in which case the client’s reads have to be rerouted to another replica.
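
One simple way to pin a client to a replica is to derive the replica from a hash of the user ID, rather than picking one at random on every request. The sketch below assumes string user IDs and a fixed list of replica names, and the function name `replica_for` is made up for this example. It does not handle the rerouting needed when the chosen replica is down.

```python
import hashlib


def replica_for(user_id, replicas):
    """Map a user to one replica, the same one on every call."""
    # A cryptographic hash is used only because it is stable across
    # processes and restarts, unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replicas)
    return replicas[index]


replicas = ["replica-0", "replica-1", "replica-2"]
first_choice = replica_for("alice", replicas)
second_choice = replica_for("alice", replicas)
```

Both calls return the same replica for "alice", so her successive reads never jump from a fresher replica to a staler one. Note that changing the number of replicas changes the mapping for many users.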

Single leader replication is one of the most widely used replication techniques. For read-heavy workloads it is a good solution, since reads are distributed across replicas. For write-heavy workloads, however, the leader can become a bottleneck, because every write has to go through it.

With this we reach the end of this blog. If you learned something new, then follow me for more such interesting reads!