When we Google best cache eviction policy, we get LRU which is an abbreviation of Least Recently Used as the top search result. In this blog, I am gonna decode LRU Cache Eviction Policy. Let's start by understanding the core goal for any cache eviction algorithm.
1. The Goal
The goal for any cache eviction algorithm or technique is to improve cache performance by ensuring the most commonly accessed resources in the application remain in the cache memory while the other resources get evicted. In this way, commonly accessed resources can be retrieved by the user or application modules almost instantaneously which reduces the no. of expensive repetitive queries and improves the overall performance of the application in turn. The cache eviction algorithm or policy is responsible to remove or evict data from the cache when it's full. LRU is one of the cache eviction policy.
2. The Defination
In LRU, as the name suggests the least recently used resource or data gets evicted from the cache when it's full.
For example, if we have a cache with a capacity of 3 items. First, we add an element 7, since the cache's max capacity is not reached, 7 can easily be accommodated inside cache.
Same will happen with the next two elements 2 and 9.
Now when we add next element say 5, as the max capacity of cache, is already reached, we need to evict least recently used element that is 7 and add 5. This is similar to Queue data structure since we are following FIFO.
If we have got 7 in place of 5 the 7 would have come in the front of the queue since it's recently used.
3. Laying the Foundation
3.1 HashMaps
Hashmap is a data structure which can search and insert data in time complexity O(1). For every entry in the hashmap, we have a key value pair. A key is a unique identifier of the value. Let's decode the working of the hashmaps.
Hashmaps store data in buckets. A Bucket is simply a List of Nodes of Linked List. Key-Value pairs of the hashmap are stored in the nodes of a Linked List.
The key is transformed into a hashcode using a hashing function. In java, we have a method
hashCode()
that converts and returns hashcode (integer value) of data types like Strings and Integers. For example, if we have a string "abc". Hashcode value of the string will be calculated as s[0]*31^(n - 1) + s[1]*31^(n - 2) + ... + s[n - 1] where s[i] is the ith character of the string, n is the length of the string.
The hashcode of "abc" is 97 * 31^2 + 98 * 31^1 + 99 = 93217 + 3038 + 99 = 96354.96354 is a very big no. and how we can determine in which bucket we can put it in? The answer is using the modulo function. Suppose we have 4 buckets initially. We will find the index of the bucket we need to put key-value pair by performing hashcode modulo no.of buckets. In this case 96354 % 4 which is 2, hence we are going to put the key abc and its value in bucket 2. If the hashcode is a negative value, we convert it into its absolute value before performing modulo.
Consider a scenario in which the modulo of a hashcode returns an index which already contains a node with another key-value pair. This scenario is known as collision and it is handled using Linked List. In this case, the node with the new key-value pair will be placed next to the previous node just like we add a node in Linked List.
To learn more on Hashmap, Initial Capacity and Load Factor refer this.
3.2 The Exercise
In the above section, we learned about the hashmap and it's working. Now let's implement hashmap without actually using the class HashMap.
We need:
- An ArrayList as Buckets
- Linked List containing key-value pair to handle collision
- A Hash function
- A Modulo function Let's start by creating a Node of the Linked List that will contain key-value pairs.
class Node {
String key;
String value;
Node next;
Node(String key, String value) {
this.key = key;
this.value = value;
}
}
Now, we will write a Linked List class that will contain a method to add node in case of collision.
class NewLinkedList {
public Node addNode(Node head, String key, String value) {
if(head == null ||(head.key == null && head.value == null)) {
head = new Node(key, value);
return head;
}
Node current = head;
while(current.next != null) {
current = current.next;
}
current.next = new Node(key,value);
return head;
}
}
Finally we will write HashMap class which will have get, put, hashFunction and moduloFunction method.
public class NewHashMap {
ArrayList<Node> bucket;
final int BUCKET_SIZE = 4;
NewHashMap() {
//initailize ArrayList bucket
bucket = new ArrayList<Node>(BUCKET_SIZE);
for(int i = 0; i < BUCKET_SIZE; i++) {
bucket.add(null);
}
}
private int hashFunction(String key) {
char[] keyCharArr = key.toCharArray();
int n = keyCharArr.length;
int hashcode = 0;
for(int i=0;i<n;i++) {
hashcode = hashcode + keyCharArr[i] * (31^(n - 1 - i));
}
return Math.abs(hashcode);
}
private int moduloFunction(int hashcode) {
return (hashcode % (BUCKET_SIZE - 1));
}
public String get(String key) {
//get the index of the bucket in which the key can be possibly stored
int index = moduloFunction(hashFunction(key));
//get the node at that index
Node node = bucket.get(index);
//iterate through linked list in search of the key
while(node != null) {
if(node.key.equals(key))
return node.value;
node = node.next;
}
return null;
}
public void put(String key, String value) {
//get the index of the bucket in which the key can be possibly stored
int index = moduloFunction(hashFunction(key));
//get the node at that index
Node node = bucket.get(index);
NewLinkedList list = new NewLinkedList();
//add the key-value in a node of the linked list
Node head = list.addNode(node, key, value);
bucket.set(index, head);
}
}
Note: This is not the perfect implementation since we are not handlining integers as Keys, Integer Overflow in hash-function and many other issues. This is just for the sake of developing an understanding.
4. The Endgame: Implementing LRU
First, let's discuss how to implement LRU. We are going to have the same implementation buckets, hashcode and modulo function but what will change in instead of using Linked List we will use Doubly Linked List. Why a Doubly Linked List? In order to remove the node, you have to let the next pointer of its previous node point to its next node, but in a single linked list, you can not easily get the reference(O(n)) to its previous the node when you only have the reference to itself. Thus, the Doubly Linked List solves this problem.
The most recently used key-value pair will be added to the front of Doubly Linked List and will go at the end as more key-value pairs are added to the List. Finally, when it reaches the extreme end or tail, it gets deleted from the List and hence evicted from the cache. The following showcases eviction when the capacity of the list is 3.
After key 5 is added, 7 get evicted.
Let's tweak the above code implementing hashmap to achieve this. We will implement LRU use HashMaps and Doubly Linked List.
- Write the Node class. Since it is an implementation of the Doubly Linked List, we will have prev variable that will point towards the previous node.
class Node {
Node prev, next;
int key, value;
Node(int key, int value) {
this.key = key;
this.value = value;
}
}
- Write the LinkedNodeList class. This is the main class where we will define moveToHead, addToHead and removeTail and few other methods.
//we are going to store head and tail of the list
Node dummyHead;
Node dummyTail;
LinkedNodeList(){
//initialize the variables which in this case is head and tail node with key and value as 0. The next of head is tail and prev of the tail is head. All the nodes with key-value will be added in between dummy Head and dummyTail.
dummyHead = new Node(0,0);
dummyTail = new Node(0,0);
dummyHead.next = dummyTail;
dummyTail.prev = dummyHead;
}
Now let write addToHead method. This is used when a new entry is introduced in the cache. Since its the most recent entry, it will be added to the head of the list of the respective bucket.
void addToHead(Node node){
//add node after dummyHead
Node tmp = dummyHead.next ;
dummyHead.next = node;
node.next = tmp;
node.prev = dummyHead;
tmp.prev = node;
}
Suppose we get an entry which is already present in the list but since its the recent entry, it needs to be moved to the head. To implement this we will define moveToHead node.
void moveToHead(Node node){
//next of prev of the node points to next of the node
node.prev.next = node.next;
//prev of next of the node points to prev of the node
node.next.prev = node.prev;
//the node is set after dummyHead
addToHead(node);
}
When the capacity of the list reaches it's maximum we will evict the tail. This is done using removeTail method.
void removeTail(){
//the node before dummyTail is removed and the next of prev of the evicted node will be dummyTail and prev of the dummyTail is prev of the evicted node.
Node newTail = dummyTail.prev.prev;
newTail.next = dummyTail;
dummyTail.prev = newTail;
}
Define the getter methods.
Node getTail(){
return dummyTail.prev;
}
Node getHead(){
return next;
}
- Write the LRUCache class. We are going to define get and put method here. Let's start by writing the constructor.
class LRUCache {
LinkedNodeList list;
Map<Integer, Node> map;
Integer capacity;
//initilaize capacity, map and LinkedNodeList
public LRUCache(int capacity) {
list = new LinkedNodeList();
map = new HashMap(capacity);
this.capacity = capacity;
}
Define the get method
public int get(int key) {
//get the node/head from the map
Node node = map.get(key);
//if node is null return -1
if(node == null){
return -1;
}
//since the key is most recently accessed, we are going to move it to the head and return its value.
list.moveToHead(node);
return node.val;
}
Define the put method.
public void put(int key, int value) {
//get the head node
Node node = map.get(key);
//if value exists in node we will move the node to the head (after dummyHead)
if(node != null){
list.moveToHead(node);
node.val = value;
//else we create a new node with key and value as parameters.The node is added after dummyHead. if the capacity exceeds the tail is removed.
} else {
Node newNode = new Node(key, value);
if(map.size() == capacity){
Node tail = list.getTail();
map.remove(tail.key);
list.removeTail();
}
map.put(key, newNode);
list.addToHead(newNode);
}
Ta-Da! We successfully implemented LRU Cache Eviction policy in JAVA.
Happy Coding.
Discussion