
Multilayer Caching in .NET

James Turner · Originally published at turnerj.com · 5 min read

Caching is a powerful tool in a programmer's toolbox but it isn't magic. It can help scale an application to a vast number of users, or it can be the very thing dragging your application down. Layered caching is a technique of stacking different types of cache on top of each other, with each layer playing to different strengths.

I was first inspired to the idea of multilayered caching by Nick Craver. He wrote a great article about how Stack Overflow does caching which has a lot of interesting insights - definitely worth checking out if you haven't already. It was his article that inspired me to create Cache Tower, my own multilayered caching solution for .NET with an emphasis on performance.

Using the example he illustrated in his post, our own computers already do multiple layers of caching:

  • L1/L2 CPU Cache
  • RAM
  • SSD/HDD (Pagefile)

The performance profiles of each of these are drastically different: the CPU caches are the fastest but also hold the least amount of data. This is probably the first important takeaway from caching - it's not just what you cache, it's how you cache it.

There is an interesting case at Cloudflare where they put unpopular items in RAM and more popular items into SSD storage, using a multilayered cache of RAM then SSD. While they have some extremely fast SSDs, it turns out that reading and writing to them at the same time incurs a performance penalty.
To avoid that penalty, they realised that keeping unpopular items (items never hit or hit only once) purely in RAM allowed their overall system to perform better. It may not be perfect but they got some interesting results!

Looking at caching from an application's point of view, the layers may look a bit different but the concept is still the same. We move from the fastest layers which have limited space to slower layers which have more space.

  • In-Memory Cache
  • Redis/Memcached
  • Database/File

While it might seem simple enough to implement yourself, there are a few considerations to keep in mind for building a scalable multilayered caching solution.

Keeping Cache Layers Up-to-Date

Scenario: You have multiple instances of an application with their own local caches (in-memory) while also having a shared cache (Redis).

As in a normal caching scenario, you want to avoid cache misses. In multilayered caching, there are two types of cache miss - close misses and complete misses. If your in-memory cache does not have the item but Redis does, that is a close cache miss. You will need to propagate the cache result back to your in-memory cache to achieve maximum performance.

You could do this via a background task, however that wouldn't scale: it would require iterating every key of one cache layer and comparing them to the keys in another.

To get the most benefit here, you will want to only propagate an item when you actually need it. This keeps your in-memory cache as small as it actually needs to be. Because we have to fetch the item from the shared cache anyway, we can spend a few extra cycles storing it in our local in-memory cache.

The extra time spent storing it in our in-memory cache should pale in comparison to the time required for a complete cache miss.
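As a rough sketch of this close-miss propagation (hypothetical types, not Cache Tower's actual API), layers can be ordered fastest to slowest, with a hit in a slower layer written back into the faster layers above it:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical layer abstraction - in practice the slower layers would be
// Redis, a database or the filesystem rather than a dictionary.
public interface ICacheLayer
{
    bool TryGet(string key, out string value);
    void Set(string key, string value);
}

public class DictionaryCacheLayer : ICacheLayer
{
    private readonly Dictionary<string, string> _store = new();
    public bool TryGet(string key, out string value) => _store.TryGetValue(key, out value);
    public void Set(string key, string value) => _store[key] = value;
}

public class LayeredCache
{
    private readonly ICacheLayer[] _layers; // ordered fastest to slowest

    public LayeredCache(params ICacheLayer[] layers) => _layers = layers;

    public string Get(string key)
    {
        for (var i = 0; i < _layers.Length; i++)
        {
            if (_layers[i].TryGet(key, out var value))
            {
                // Close miss: propagate the value back up to the faster layers
                for (var j = 0; j < i; j++)
                {
                    _layers[j].Set(key, value);
                }
                return value;
            }
        }
        return null; // Complete miss: the caller must go to the data source
    }
}
```

Note that the write-back only happens for keys that are actually requested, which is what keeps the in-memory layer small.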

Managing Evictions

Scenario: You have an in-memory cache and a filesystem cache for a single application instance.

Depending on your in-memory caching solution, you might already have an auto-eviction system - Microsoft's MemoryCache, for example, provides one. Unlike something like Redis though, caching to a file is both extremely slow and has no built-in method to auto-evict expired items.

While your code may treat expired cache items as misses, it's important to actually evict the expired records as they may be taking up precious space in memory, on disk or in a database. It seems straightforward enough: loop over the items known to be in the cache and evict any expired records.

It's important to consider that some cache layer technologies may have optimizations that allow bulk eviction of records instead of individual evictions. For example, a database cache layer would likely be able to query all expired items at once and run a single "delete" operation.

This bulk eviction "cleanup" is a good candidate for a background task - something where there are few instances of it and it can start the cleanup at regular intervals.
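A sketch of what such a cleanup pass might look like (hypothetical types): the expired keys are collected and removed in one batch rather than one disk operation per item. A database-backed layer would express the same idea as a single `DELETE FROM CacheEntries WHERE Expiry < @now` statement.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

// Hypothetical index over a file-based cache layer. The real layer would
// also delete the backing file for each evicted key.
public class FileCacheIndex
{
    private readonly ConcurrentDictionary<string, DateTime> _expiries = new();

    public void Track(string key, DateTime expiry) => _expiries[key] = expiry;

    // One cleanup pass, suitable for running from a background task on a
    // regular interval; returns how many entries were evicted.
    public int EvictExpired(DateTime now)
    {
        var expired = _expiries
            .Where(entry => entry.Value < now)
            .Select(entry => entry.Key)
            .ToList();

        foreach (var key in expired)
        {
            _expiries.TryRemove(key, out _);
            // Delete the backing file for this key here...
        }
        return expired.Count;
    }
}
```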

Background Refreshing (Stale vs Expired Cache Items)

Background refreshing isn't exclusive to a multilayer cache solution, however it can be invaluable for maximising performance in one.
The important part of background refreshing is working out the best time to refresh. Refreshing too early may put unnecessary strain on the data source, while refreshing too late may leave the data overly stale.

How the refresh is controlled is important too - you don't want to do it on a schedule, as the cache may refresh overly eagerly. Like propagating between cache layers, you only want to perform a refresh if the cache item is actively being hit.

To keep throughput up, we need to simultaneously return our "stale" cache item while triggering a refresh to update our data. This update of data needs to hit every cache layer too so other application instances can benefit from the refreshed data.
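A minimal sketch of that stale-while-refreshing pattern (hypothetical types; a real implementation also needs to write the refreshed value to every cache layer): a stale entry is returned immediately while at most one background refresh runs per key.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class RefreshingCache
{
    private record Entry(string Value, DateTime StaleAfter);

    private readonly ConcurrentDictionary<string, Entry> _entries = new();
    private readonly ConcurrentDictionary<string, bool> _refreshing = new();

    public void Set(string key, string value, TimeSpan staleAfter) =>
        _entries[key] = new Entry(value, DateTime.UtcNow + staleAfter);

    public string Get(string key, Func<Task<string>> refreshFromSource)
    {
        if (!_entries.TryGetValue(key, out var entry))
            return null; // Complete miss: the caller must fetch synchronously

        // Stale but present: serve it now, refresh once in the background.
        if (entry.StaleAfter < DateTime.UtcNow && _refreshing.TryAdd(key, true))
        {
            _ = Task.Run(async () =>
            {
                try
                {
                    var fresh = await refreshFromSource();
                    // Illustrative stale time; a real cache would reuse the
                    // entry's original settings and update every layer.
                    Set(key, fresh, TimeSpan.FromMinutes(5));
                }
                finally
                {
                    _refreshing.TryRemove(key, out _);
                }
            });
        }
        return entry.Value;
    }
}
```

The `_refreshing` table is what prevents a popular stale key from triggering a refresh on every request while the first refresh is still in flight.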

Distributed Locking

Scenario: You have multiple instances of an application with their own local caches (in-memory) while also having a shared cache (Redis).

If you're looking at a multilayered caching solution, you are likely running multiple instances of your application. If "Web Server 1" is already attempting to update Redis then "Web Server 2" doesn't need to waste any time doing the same. This is especially important to factor in if retrieving the original data is an expensive operation.

Distributed locking helps alleviate this, however there is a catch - you don't want multiple requests on the same server checking the distributed cache for a lock every time. If the same server already holds the lock, you will want to track that locally in-memory so the lock check is faster.
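That local fast path can be sketched like this (hypothetical types; the remote side would typically be a Redis SET with the NX option and an expiry):

```csharp
using System.Collections.Concurrent;

// Hypothetical abstraction over the shared lock, e.g. backed by Redis.
public interface IDistributedLock
{
    bool TryAcquire(string key);
}

// In-memory stand-in for the shared lock so the sketch is self-contained.
public class InMemoryDistributedLock : IDistributedLock
{
    private readonly ConcurrentDictionary<string, bool> _locks = new();
    public bool TryAcquire(string key) => _locks.TryAdd(key, true);
}

public class LocalFirstLock
{
    private readonly IDistributedLock _remote;
    private readonly ConcurrentDictionary<string, bool> _heldLocally = new();

    public LocalFirstLock(IDistributedLock remote) => _remote = remote;

    public bool TryAcquire(string key)
    {
        // Fast path: a thread on this server already holds the lock,
        // so skip the network round trip entirely.
        if (!_heldLocally.TryAdd(key, true))
            return false;

        if (_remote.TryAcquire(key))
            return true;

        // Another server holds it; clear our local marker.
        _heldLocally.TryRemove(key, out _);
        return false;
    }

    // A real implementation would also release the remote lock
    // (and guard against it expiring early).
    public void Release(string key) => _heldLocally.TryRemove(key, out _);
}
```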

Summary

Layered caching can provide the best of multiple different cache types. You can get the performance of an in-memory cache with the larger cache sizes of a Redis instance, database or file system. It won't automatically solve every caching performance problem, but in the right scenarios it can be an extremely useful tool.

I hope these tips can help you out with your own caching solution. If you don't want to roll your own, check out my library Cache Tower which supports these things and more.

TurnerSoftware / CacheTower

Cache Tower

An efficient multi-layered caching system for .NET

Overview

Computers have multiple layers of caching, from L1/L2/L3 CPU caches to RAM or even disk caches, each with a different purpose and performance profile. Why don't we do this with our code?

Cache Tower isn't a single type of cache; it's a multi-layer solution to caching, with each layer on top of another.

Officially supported cache layers include:

  • MemoryCacheLayer (built-in)
  • JsonFileCacheLayer (via CacheTower.Providers.FileSystem.Json)
  • ProtobufFileCacheLayer (via CacheTower.Providers.FileSystem.Protobuf)
  • MongoDbCacheLayer (via CacheTower.Providers.Database.MongoDB)
  • RedisCacheLayer (via CacheTower.Providers.Redis)

These various cache layers, configurable by you, are controlled through the CacheStack which allows a few other goodies.

  • Local refresh locking (a single instance of CacheStack prevents multiple threads refreshing at the same time)
  • Remote refresh locking (see details)
  • Optional stale-time for a cache entry (serve stale data while a background task refreshes the cache)
  • Remote eviction on refresh (see details)
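Wiring these layers together might look roughly like the following sketch. This is not runnable as-is: the type and method names (`CacheStack`, `GetOrSetAsync`, `CacheSettings`) are approximated from the README and may have changed, and `UserProfile` / `GetUserProfileFromDatabaseAsync` are placeholder names - check the repository for the current API.

```csharp
// Sketch only - names approximated from the Cache Tower README.
var cacheStack = new CacheStack(new ICacheLayer[]
{
    new MemoryCacheLayer(),              // fastest, smallest, per-instance
    new RedisCacheLayer(redisConnection) // slower, larger, shared
}, Array.Empty<ICacheExtension>());

var profile = await cacheStack.GetOrSetAsync<UserProfile>(
    $"user-{userId}",
    async oldValue => await GetUserProfileFromDatabaseAsync(userId),
    new CacheSettings(
        TimeSpan.FromDays(1),        // hard expiry
        TimeSpan.FromMinutes(30)));  // stale time: refresh in the background after this
```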

Discussion

Alex Sarafian

No longer than 5 years ago, I had designed such a system (a .NET assembly) to drive such scenarios. It had the concept of layers and you could always inject your own implementation per layer. Out of the box, I had implemented memory for L1 and Redis for L2, but a filesystem one was also possible for rich clients. There was also an option to synchronize each layer through a pub/sub scheme with out-of-the-box SignalR and Redis, with a focus on invalidating and not re-populating.

Good memories.

James Turner (Author)

Nice! If you don't mind me asking, what happened to it? Was it a commercial application for a particular business or OSS?

For me, the whole concept seemed pretty interesting after reading Nick Craver's blog post about how Stack Overflow does caching. At the time, I didn't see any real OSS solutions for it in the way I wanted to go. In-memory and Redis were my initial goal which I then stretched to MongoDB and Disk. I added pub/sub for distributed eviction too.

Alex Sarafian

Unfortunately, the organization I worked for back then was not open source friendly. The implementation was left at a proof of concept, but I generally tend to build my POCs well, so it was good. It was with full .NET Framework.

To give you some context, the application I worked on was built over a domain that is very file-system oriented (it's called DITA), which worked for rich clients, but we had issues porting it to the web. In addition to that, we had some performance issues, and because the back end was stateless, in-memory caching was not an option as it would add state. So the team was stuck in general and had always chosen to avoid caching and focus on raw performance, which they were really good at. They had the best SQL programmers I've met, to be honest. My role as an architect was to provide a foundation for how to add this capability to the stack, and this is what the POC did. It was never adopted though, because of prioritization and a general lack of understanding of how to cache.

If you ask me, it is not about the how but first and foremost about understanding what is being cached and what the implications are. You need first to classify your data in terms of volatility and cost, but most importantly on the impact of staleness. This is the biggest exercise of all and carries the biggest risk, because inevitably it will change the behavior of some parts of the app. Most shops I've worked with have a severe problem on exactly this topic and haven't thought about their data beyond performance. More agile and more modern stacks refer to this part of the problem as "eventual consistency", which scares traditional teams and their customers, and not without justification.

I've explained this because the implementation was not that difficult, to be honest. It was simpler than I expected, because caching itself is easy. The biggest problem was the above, and it becomes even more complicated when the types of cache providers change, because they connect function with choice and architecture.

James Turner (Author)

That was really interesting, thanks for sharing!

If you ask me, it is not about the how but first and foremost about understanding what is being cached and what the implications are. You need first to classify your data in terms of volatility and cost, but most importantly on the impact of staleness.

This, 100%. I've seen caching solutions where they just cache everything and later wonder why they run into obscure issues with the data. Not caching the right data for an appropriate amount of time can severely impact both performance and code maintenance.

ShaijuT

😄 Nice! But why would you want to store the cache in-memory? I think an in-memory cache is not good for high-traffic websites. If we store the cache on the same computer where the application is hosted, I think the application will not scale for large traffic. So isn't it better to make the cache distributed on another computer using Redis?

Sean Allin Newell

An in-memory cache hit is much faster than a Redis hit. My boss' previous job used messaging (like MSMQ/RabbitMQ) to do the propagation; it got sub-10ms response times on fully hot cache hits, from what I recall.

Distributed in-process, in-memory caching is a strategy to push high-traffic web app response times beyond what a distributed out-of-process caching system like Redis or Couchbase can give you.

It's basically applying the caching mechanisms of DNS and globally distributed CDNs in a web app.

ShaijuT

Can we decouple the in-memory cache, like a distributed cache, onto another computer?

James Turner (Author)

In-memory caching is great and is super fast but remember, this is about a multilayer cache where we can play to the strengths of multiple different types of cache. Redis is fast but in-memory in the same process will always be faster.

Distributed caches like Redis have two major benefits - they allow a common cache for multiple web applications and allow a cache to survive a web application restart. An in-memory cache really just accelerates cache hits even further.

If you had a single instance of a web app but were running both in-memory and Redis, while you would still gain performance (from in-memory), it really isn't giving you much more than in-memory by itself, besides the cache surviving web application restarts. It's when you go to multiple instances of your application that it pays off - your in-memory caches will only hold whatever is needed by that specific server, whereas Redis will hold everything for all servers.

If you're curious about a big site that does in-memory caching and Redis in .NET, that is exactly what Stack Overflow does.