DEV Community

Zakaria Maaraki

Implementation of a Distributed LRU Cache Using ZooKeeper

In this tutorial, we are going to see how to build a distributed LRU (Least Recently Used) cache, using ZooKeeper for leader election.

ZCache architecture

Before starting, you can find the source code of the project at the following link: https://github.com/zakariamaaraki/Distributed-LRU-Cache

What is ZooKeeper?

Apache ZooKeeper is an open-source project under the Apache Software Foundation. It's essentially a coordination service for distributed systems, offering a hierarchical key-value store.

ZooKeeper hierarchical key-value store

ZooKeeper is used to provide a distributed configuration service, leader-election mechanism, and service discovery / registry for large distributed systems.

To build our distributed cache, we are going to use two features provided by ZooKeeper:

  • Leader election, to elect the master (leader): we need only one instance to talk to (we'll see why below)
  • Service discovery / registry, to be able to forward requests to the master (leader)

Why leader election?

Suppose we have two instances (A and B) that we can both talk to, and suppose that the two instances are always synchronized. If we send request 1: Set(x,1) to instance A, and then request 2: Set(x,2) to instance B, there is no guarantee that request 1 is executed before request 2, for several reasons: instance A may crash and restart, or network latency may delay it.

The solution to this problem is to have only one instance to talk to: we choose one instance to be the master (leader), and we are allowed to talk only to the master.

Ok, but what happens if this instance goes down? It becomes a single point of failure, which is not good!
To solve this problem, we need an automatic mechanism that elects a new leader whenever the current leader goes down.

There are many algorithms for doing leader election. In this project, we rely on the recipe provided by ZooKeeper.

Leader Election with ZooKeeper

Doing leader election with ZooKeeper is very simple. One way of doing it is to use the sequence and ephemeral flags when creating the znodes that represent clients' proposals. The idea is to have a parent znode, say /cache-election, under which each participating instance creates a child znode /cache-election/p-0000X with both the sequence and ephemeral flags set.
With the sequence flag, ZooKeeper automatically appends a sequence number that is greater than any previously appended to a child of /cache-election. The instance that created the znode with the smallest appended sequence number is the leader.
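This ordering logic can be sketched in plain Java. The znode names below are hypothetical examples of what ZooKeeper would generate; in the real system they are read from the ensemble:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ElectionOrdering {

    // The znode with the smallest zero-padded sequence suffix wins,
    // so a lexicographic minimum is enough.
    static String leader(List<String> children) {
        return Collections.min(children);
    }

    // Non-leaders watch their immediate predecessor rather than the leader,
    // which avoids a thundering herd when the leader dies.
    static String nodeToWatch(List<String> children, String me) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted);
        int index = sorted.indexOf(me);
        return index <= 0 ? null : sorted.get(index - 1);
    }

    public static void main(String[] args) {
        // Hypothetical children of /cache-election
        List<String> children = List.of("p-0000002", "p-0000000", "p-0000001");
        System.out.println(leader(children));                   // p-0000000
        System.out.println(nodeToWatch(children, "p-0000002")); // p-0000001
    }
}
```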

Service Discovery

At the same time, we use ZooKeeper for service discovery, through Spring Cloud Zookeeper, which leverages ZooKeeper for service registration and discovery.
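With Spring Cloud Zookeeper, registration is mostly a matter of configuration. A minimal application.yml sketch could look like this (the service name and connect string are assumptions for illustration):

```yaml
spring:
  application:
    name: zcache                       # service name registered in ZooKeeper
  cloud:
    zookeeper:
      connect-string: localhost:2181   # address of the ZooKeeper ensemble
      discovery:
        enabled: true                  # register this instance for discovery
```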

Consistency vs Availability

Note that according to the CAP theorem, ZooKeeper is a CP system. This implies that it sacrifices availability in order to achieve consistency and partition tolerance. In other words, if it cannot guarantee correct behaviour, it will not respond to queries.
Using a CP system for service discovery is not ideal, so it might be better to use Eureka (an AP system) as the service discovery/registry instead of ZooKeeper, to guarantee availability; but in that case we would need to set up another cluster.

Implementation

CRUD Operations

Cache CRUD

Cache Data structure

The java.util.LinkedHashMap.removeEldestEntry() method lets the map decide whether to remove its eldest entry. It is invoked by put() and putAll() after a new element is added: if it returns true (here, when the size exceeds the capacity), the eldest entry is evicted. Combined with access order, the eldest entry is the least recently used one.

import java.util.LinkedHashMap;
import java.util.Map;

public class Cache<K, V> {

    public boolean isLeader;

    public Long startUpTime;

    private final Map<K, V> map;

    public Cache(final int capacity) {
        this.startUpTime = System.currentTimeMillis();

        // Access-ordered LinkedHashMap: iteration order is least recently
        // accessed first, which gives us LRU eviction via removeEldestEntry
        map = new LinkedHashMap<K, V>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once capacity is exceeded
                return size() > capacity;
            }
        };
    }

    public V get(K key) {
        return map.get(key);
    }

    public boolean containsKey(K key) {
        return map.containsKey(key);
    }

    public V remove(K key) {
        return map.remove(key);
    }

    public V set(K key, V value) {
        return map.put(key, value);
    }

    @Override
    public String toString() {
        return map.toString();
    }
}
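As a quick sanity check of the eviction behaviour, here is a standalone sketch that builds the same access-ordered LinkedHashMap as the Cache constructor above:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruDemo {

    // Same construction as in the Cache class: access-order flag set to true
    static <K, V> Map<K, V> lruMap(final int capacity) {
        return new LinkedHashMap<K, V>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> lru = lruMap(2);
        lru.put("a", 1);
        lru.put("b", 2);
        lru.get("a");    // touch "a": "b" is now the least recently used
        lru.put("c", 3); // capacity exceeded, so "b" is evicted
        System.out.println(lru.keySet()); // prints [a, c]
    }
}
```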

Leader Election code snippet

private void attemptForLeaderPosition() {

    // List all election znodes under /cache-election (no watch on the children)
    final List<String> childNodePaths = zooKeeperService.getChildren(LEADER_ELECTION_ROOT_NODE, false);

    // Sequence suffixes are zero-padded, so lexicographic order = numeric order
    Collections.sort(childNodePaths);

    int index = childNodePaths.indexOf(processNodePath.substring(processNodePath.lastIndexOf('/') + 1));
    if (index == 0) {
        // Our znode has the smallest sequence number: we are the leader
        log.info("[Process: " + nodeId + "] I am the new leader!");
        cache.isLeader = true;
    } else {
        // Otherwise, watch the znode just before ours; when it disappears,
        // we re-attempt the election
        final String watchedNodeShortPath = childNodePaths.get(index - 1);
        cache.isLeader = false;

        watchedNodePath = LEADER_ELECTION_ROOT_NODE + "/" + watchedNodeShortPath;

        log.info("[Process: " + nodeId + "] - Setting watch on node with path: " + watchedNodePath);
        zooKeeperService.watchNode(watchedNodePath, true);
    }
}
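The failover loop completes in the watch callback: when the watched predecessor znode is deleted, the instance re-runs attemptForLeaderPosition(). A standalone sketch of that decision follows; the nested enum is a stand-in for ZooKeeper's Watcher.Event.EventType, so the sketch runs without a live ensemble:

```java
public class ElectionWatcherSketch {

    // Stand-in for org.apache.zookeeper.Watcher.Event.EventType, so this
    // sketch compiles and runs without a ZooKeeper client on the classpath
    enum ZkEventType { NodeCreated, NodeDeleted, NodeDataChanged, NodeChildrenChanged }

    // Only the deletion of the watched (predecessor) znode matters here:
    // ephemeral znodes vanish when their session dies, so NodeDeleted means
    // the predecessor instance went down and the election must be re-run
    static boolean shouldReattemptElection(ZkEventType type) {
        return type == ZkEventType.NodeDeleted;
    }

    public static void main(String[] args) {
        System.out.println(shouldReattemptElection(ZkEventType.NodeDeleted));     // true
        System.out.println(shouldReattemptElection(ZkEventType.NodeDataChanged)); // false
    }
}
```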

There are a lot of things left to talk about, so I suggest you look at the whole code of the system by visiting this link: https://github.com/zakariamaaraki/Distributed-LRU-Cache
