Sasi Kumar T
Overcoming Weak Consistency Using a Cache

Recently I ran into weak consistency for the first time in my professional career (< 2 yrs). I didn’t know that the DB we were using in our PROD env was weakly consistent when reads are done from a secondary replica. (how silly of me 😓)

Weak/eventual consistency is a guarantee that when an update is made in a distributed database, the update will eventually be reflected in all nodes that store the data, so that eventually every query returns the same response.

What was I doing?

Consider two tables:

  1. Users table, with the columns

     id, name, age
  2. Relationship table, with the columns

     contactId, id

     Here, id is the primary key of the users table (see the sample-row sketch below the list).
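To make the mapping concrete, here is a minimal sketch of what rows in the two tables might look like. The values (ids, names, the email used as contactId) are made up purely for illustration.

// Hypothetical sample rows, for illustration only.

// users table: id is the primary key
const users = [
  { id: 101, name: 'Sasi', age: 24 },
];

// relationship table: maps an externally known contactId
// (e.g. an email) to the id of a row in the users table
const relationships = [
  { contactId: 'sasi@hotmail.com', id: 101 },
];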

Usually a relationship is created whenever some info has to be persisted for a particular user. When no relationship exists yet, a new row is created in the users table, and a new entry mapping the contactId to that new user is saved in the relationship table.

When new details are to be saved, the user's id is fetched from the relationship table using the contactId, and the details are persisted against it.

What was the problem?

Consider this scenario: when someone lands on a webpage, we will have some details like ip, timezone and email which can be collected automatically.

When trying to save the first field, email, we check the relationship table; if there is no data present, a new row is created in the users table and a new relationship entry is also created.

When the next update, ip, is to be done, we check the relationship table, get the id and do the update.

function getRelationship(contactId) {}
function createRelationship(contactId, user) {}
function createUser() {}
function updateUser(relationship, dataToUpdate) {}

function forEachUpdate(contactId, dataToUpdate) {
    let relationship = getRelationship(contactId);
    if (!relationship) {
        // no relationship found: create the user and map it to the contactId
        const newUser = createUser();
        relationship = createRelationship(contactId, newUser);
    }
    updateUser(relationship, dataToUpdate);
}

// main
forEachUpdate('sasi@hotmail.com', {email: 'sasi@hotmail.com'})
forEachUpdate('sasi@hotmail.com', {ip: '127.0.0.1'})

The problem happens when we don’t find a relationship even though it was written just a few milliseconds earlier. But how? A new entry was added previously, right? What happened was that the writes go to the primary replica while the reads go to a secondary replica. So reads and writes hit different servers, which do not hold exactly the same data at any given moment. This is called replication lag. Because of it, duplicate user entries were created in the users table.
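To make the race visible, here is a small self-contained sketch that simulates a primary and a lagging secondary. The in-memory maps, the LAG_MS delay and the numeric user ids are all made up for illustration; it only mimics the shape of the bug, not the real DB.

// Simulated primary/secondary with replication lag (illustrative only).
const primary = new Map();   // writes land here immediately
const secondary = new Map(); // receives the write only after LAG_MS
const LAG_MS = 100;

function writeRelationship(contactId, userId) {
  primary.set(contactId, userId);
  setTimeout(() => secondary.set(contactId, userId), LAG_MS); // replication
}

function readRelationship(contactId) {
  return secondary.get(contactId); // reads are served by the secondary
}

let nextUserId = 1;
function forEachUpdate(contactId, dataToUpdate) {
  let userId = readRelationship(contactId);
  if (!userId) {
    userId = nextUserId++;             // creates a brand-new user row
    writeRelationship(contactId, userId);
  }
  console.log('updating user', userId, 'with', dataToUpdate);
}

forEachUpdate('sasi@hotmail.com', { email: 'sasi@hotmail.com' });
// a few ms later, before the replica has caught up:
setTimeout(() => forEachUpdate('sasi@hotmail.com', { ip: '127.0.0.1' }), 10);
// logs "updating user 1 ..." then "updating user 2 ..." — a duplicate user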

What is the intermediate fix?

Pointing the read queries to the primary solved the issue. But there is a reason we have read replicas: so that the primary is not overloaded with all the read queries and can respond quickly to writes. So this was not a permanent fix. For a permanent fix I would have had to switch to a different database, for which I didn’t have the time.
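The post doesn't say which database was involved, but as an example of what "point reads to the primary" looks like, here is a rough sketch with a MongoDB-style driver. The connection string, db/collection names and the readPreference option are assumptions, not what was actually done.

// Illustrative only: forcing a read to the primary with the MongoDB Node driver.
// The connection string and collection names are hypothetical.
const { MongoClient } = require('mongodb');

async function getRelationshipFromPrimary(contactId) {
  const client = await MongoClient.connect('mongodb://host1,host2/mydb?replicaSet=rs0');
  const relationships = client.db('mydb').collection('relationships');
  // readPreference: 'primary' routes this query to the primary replica,
  // trading read scalability for read-your-own-writes behaviour.
  return relationships.findOne({ contactId }, { readPreference: 'primary' });
}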

Solution:

Every time I have to update some data, it's done sequentially. So, whenever a relationship is created, a copy of it is stored in Redis (an in-memory cache) too.

The pseudo-code will look something like the below.

function saveInRedis(contactId, user) {}
function saveInDB(contactId, user) {}
function createRelationship(contactId, user) {
    // write-through: persist in the DB and keep a copy in the cache
    saveInDB(contactId, user);
    saveInRedis(contactId, user);
}
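A more concrete version of that write-through step, sketched with the node-redis client, could look like the below. The key format and the one-hour TTL are my own assumptions for illustration, not part of the original setup.

// Illustrative sketch using node-redis; key naming and TTL are assumptions.
const { createClient } = require('redis');
const redis = createClient();
const ready = redis.connect(); // connect once, reuse the promise

async function saveInRedis(contactId, user) {
  await ready;
  // cache the relationship for an hour so stale entries eventually expire
  await redis.set(`relationship:${contactId}`, JSON.stringify(user), { EX: 3600 });
}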

Therefore, before every update Redis is checked first to see if there is any relationship; if there is none, then the DB is checked.

The getRelationship function has been transformed as below.

function getDataFromRedis(contactId) {}
function getRelationship(contactId) {
    let relationship = getDataFromRedis(contactId);
    if (!relationship) {
        // cache miss: check in DB
    }
    return relationship;
}
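Filled in a little further, the read path could look like the sketch below. getDataFromDB and the backfill into Redis are my own additions for illustration (reusing the saveInRedis and getDataFromRedis helpers from above), not code from the original system.

// Illustrative read path: Redis first, then the DB, then backfill the cache.
// getDataFromDB is a hypothetical helper.
async function getRelationship(contactId) {
  const cached = await getDataFromRedis(contactId);
  if (cached) return cached;

  // Cache miss: the entry may have expired, or the relationship may not exist yet.
  const fromDb = await getDataFromDB(contactId);
  if (fromDb) {
    await saveInRedis(contactId, fromDb); // backfill so later reads hit the cache
  }
  return fromDb;
}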
