Last week, I wrote about caching and discussed different caching approaches and ways to keep your cache data in sync with the database. A cache is a super effective way to make your application or website more performant, as it can store frequently requested data in a fast-retrieval data storage layer instead of querying the database every time.
However, a cache is limited in size and memory. To manage what gets stored, the cache needs a policy to regulate what goes in (cached) and what goes out (evicted).
This post is dedicated to a common cache policy (that often comes up in coding interviews): the Least Recently Used (LRU) cache. An LRU cache will discard the least recently used item in the cache to make space for new item(s).
Implementing a Least Recently Used (LRU) cache traditionally involves a hash map and a doubly linked list.
The linked list would have the most-recently used item at the head of the list and the least-recently used item stored at the tail.
With the linked list alone, getting the least-recently used item takes O(1) time since we can look at the tail, but accessing any other specific item that's not the tail or head takes O(n) time since we would have to walk through the entire list.
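This is where the hash map comes in: by mapping each key to its linked list node, any node can be found in O(1) and then unlinked or relinked in O(1) via its pointers. A minimal sketch of that node structure (the class and field names are my own):

```python
class Node:
    """Doubly linked list node holding a cached key/value pair."""
    def __init__(self, key, value):
        self.key = key      # stored so the hash map entry can be evicted too
        self.value = value
        self.prev = None
        self.next = None

# Link two nodes: a (more recently used) <-> b (less recently used)
a = Node(1, "a")
b = Node(2, "b")
a.next, b.prev = b, a

# The hash map stores key -> Node, giving O(1) access to any node,
# which can then be moved within the list in O(1) via prev/next.
cache_map = {1: a, 2: b}
```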
Each time an item is accessed in the cache, go through the following steps:
- Look up the item in the hash map
  - If the item is in the hash map, hooray, it's a "cache hit" and it's already in the cache
    - Find the corresponding linked list node with the hash map
    - Move the item's linked list node to the head of the linked list. It's now the most recently used item.
  - If the item is not in the hash map, boo, it's a "cache miss" and you'll have to load the item into the cache
    - Cache full? Then an item will have to be evicted. Evict the least recently used item, the tail, by removing it from both the linked list and the hash map.
    - Create a new linked list node for the item and insert it at the head of the linked list
    - Add the item to the hash map, with the new node as the value
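The steps above can be sketched as a small class. This is a minimal illustration, not a production cache; the names (`_remove`, `_add_to_front`, the sentinel head/tail nodes) are my own, and the sentinels are just a convenience that avoids null checks when linking and unlinking:

```python
class Node:
    """Doubly linked list node holding a cached key/value pair."""
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.map = {}  # key -> Node
        # Sentinel nodes: head.next is most recently used,
        # tail.prev is least recently used.
        self.head = Node(None, None)
        self.tail = Node(None, None)
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def _add_to_front(self, node):
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node

    def get(self, key):
        if key not in self.map:
            return -1  # cache miss
        node = self.map[key]
        self._remove(node)
        self._add_to_front(node)  # now the most recently used item
        return node.value

    def put(self, key, value):
        if key in self.map:
            self._remove(self.map[key])
        elif len(self.map) >= self.capacity:
            lru = self.tail.prev  # least recently used item
            self._remove(lru)
            del self.map[lru.key]  # evict from both structures
        node = Node(key, value)
        self._add_to_front(node)
        self.map[key] = node
```

Both `get` and `put` run in O(1): the hash map locates the node, and the doubly linked pointers let us splice it out and re-insert it at the head without any traversal.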
Like I mentioned above, implementing an LRU cache often comes up in coding interviews. Leetcode features an LRU cache problem where you have to implement get and put operations for the cache:
- get(key) gets the value of the key if the key exists in the cache
- put(key, value) sets or inserts the value if the key is not already present
- If the cache has reached its capacity, it should invalidate the least recently used item before inserting a new item.
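As an aside, the expected get/put behavior can also be sketched compactly with Python's `collections.OrderedDict`, which remembers insertion order and supports moving a key to the end in O(1):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # oldest (LRU) key is first

    def get(self, key):
        if key not in self.cache:
            return -1
        self.cache.move_to_end(key)  # mark as most recently used
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
```

Under the hood, `OrderedDict` is itself backed by a hash map plus a doubly linked list, so this is really the same technique with the bookkeeping hidden.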
My solution below features a few classes, including