DEV Community

Discussion on: Cache-Aside Pattern

David Kröll • Edited

Hi, very nice article with real background knowledge. But I think you missed the biggest advantage of a (self-written) in-memory cache: raw performance. There are no blocking I/O connections, no networking syscalls, etc.
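To illustrate the point about avoiding network round-trips, here is a minimal sketch of an in-process cache (this is not the shortcut implementation, just an illustrative example): a lookup is a mutex-guarded map read, with no sockets or serialization on the hot path.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a minimal in-process key/value store. A Get is a plain
// map read guarded by a mutex -- no network I/O, no syscalls.
type Cache struct {
	mu   sync.RWMutex
	data map[string]string
}

func NewCache() *Cache {
	return &Cache{data: make(map[string]string)}
}

func (c *Cache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
}

func (c *Cache) Get(key string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.data[key]
	return v, ok
}

func main() {
	c := NewCache()
	c.Set("asdf", "https://example.com")
	if v, ok := c.Get("asdf"); ok {
		fmt.Println(v)
	}
}
```

Compare this with an external cache like Redis, where every Get costs at least one network round-trip plus serialization.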

I've also developed such a system (it never went into real production) where I implemented an in-memory cache. In an example like the one you describe, where data is simply fetched from a database, there is no real need to deploy an external caching service. By optimizing MySQL and adding more memory to the instance, you may achieve performance close to Redis, with less complexity. You just have to make sure that the caches inside MySQL (for queries and results) are big enough to hold as many pages as possible. You would not have to deal with TTLs, cache hits/misses, and so on. Furthermore, you would keep a single source of truth.

Of course, you are welcome to read about the results I achieved with my in-memory cache (also written in Go). I've also conducted extensive performance evaluations.

davidkroell / shortcut

URL Shortener in Go

shortcut

Shortcut is a URL Shortener written in Go. It's fully featured with a JSON API and an in-memory cache.

Benchmark

For a URL shortener it's very important to satisfy as many requests per second as possible. Therefore, a benchmark was conducted: I measured the speed of shortcut with wrk 4.1.0 in a VMware Workstation 15 virtual machine. In order to find out what slows down the application, various approaches were pursued.

| | Host | Virtual Machine |
| --- | --- | --- |
| OS | Windows 10 (1803) | Fedora 29 (Linux 4.20) |
| CPU | Intel i7-7500 Dual Core | Intel i7-7500 Dual Core |
| Memory | 16GB | 4GB |
| Go version | | go 1.12 linux/amd64 |
| Database | | MySQL 8.0.15 Community Server |

Original

The original application writes every request to the database (a logging table). Furthermore, every request is logged to stdout. This code uses UUIDs as primary keys.

```
$ wrk -c 256 -d 5s -t 48 http://localhost:9999/asdf
Running 5s test @ http://localhost:9999/asdf
  48 threads and 256 connections
```
Husni Adil Makmur

Hi, thanks for the feedback.

In an example like the one you describe, where data is simply fetched from a database, there is no real need to deploy an external caching service.

Yeah, you're right: I used a very simple example where implementing the cache-aside pattern looks like overkill for the problem space. But I did this for the sake of simplicity. It all comes back to the real use case.


By optimizing MySQL and adding more memory to the instance, you may achieve performance close to Redis, with less complexity. You just have to make sure that the caches inside MySQL (for queries and results) are big enough to hold as many pages as possible. You would not have to deal with TTLs, cache hits/misses, and so on. Furthermore, you would keep a single source of truth.

By tuning MySQL, it should be able to withstand more traffic. But there are trade-offs there too: for instance, we need to be aware of the maximum number of database connections, and better machine specs cost more money.

I believe there's no silver bullet for this problem. There are benefits to separating out the read model: for example, if there's heavy locking on MySQL during writes, the customer-facing app will still serve quickly from a warm cache, versus no cache at all.
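The read path being discussed can be sketched in Go as follows (a minimal illustration of the cache-aside pattern; `fakeDB`, `loadFromDB`, and `GetURL` are hypothetical names standing in for MySQL and the real handler):

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// fakeDB stands in for MySQL in this sketch.
var fakeDB = map[string]string{"asdf": "https://example.com"}

var (
	mu    sync.RWMutex
	cache = map[string]string{}
)

var ErrNotFound = errors.New("not found")

// loadFromDB simulates the slow path that may block on database locks.
func loadFromDB(key string) (string, error) {
	v, ok := fakeDB[key]
	if !ok {
		return "", ErrNotFound
	}
	return v, nil
}

// GetURL implements the cache-aside read path: serve from the warm
// cache when possible, otherwise hit the database and backfill the
// cache so the next read avoids the database entirely.
func GetURL(key string) (string, error) {
	mu.RLock()
	v, ok := cache[key]
	mu.RUnlock()
	if ok {
		return v, nil // warm cache: served even while the DB is busy
	}
	v, err := loadFromDB(key)
	if err != nil {
		return "", err
	}
	mu.Lock()
	cache[key] = v
	mu.Unlock()
	return v, nil
}

func main() {
	v, _ := GetURL("asdf") // miss: loads from the DB, fills the cache
	fmt.Println(v)
	v, _ = GetURL("asdf") // hit: no DB access
	fmt.Println(v)
}
```

On a warm cache, reads never touch the database, which is exactly why heavy write locking doesn't stall the customer-facing path.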


Of course, you are welcome to read about the results I achieved with my in-memory cache (also written in Go). I've also conducted extensive performance evaluations.

This is interesting. I might have missed how you handle cache invalidation when some data is updated (not inserted) in the database during heavy reads. Will the app serve stale data for a longer period of time until no one is accessing the endpoint?

David Kröll

This is interesting. I might have missed how you handle cache invalidation when some data is updated (not inserted) in the database during heavy reads.

Thanks for the reminder, I may have missed this (this piece of software is already two years old). But the idea, of course, was to update the cache first and afterwards the database.

Will the app serve stale data for a longer period of time until no one is accessing the endpoint?

I do not fully understand your question, but I've created a manager goroutine which cleans up the cache based on the configuration here:
github.com/davidkroell/shortcut/bl...

Husni Adil Makmur

I do not fully understand your question

When a particular piece of data is still being accessed by many users within a specific time range, it will stay in the cache even though an update request comes in from another endpoint. Thus, new users get stale data (the old data from before the update).

Eventually, after no one has accessed the data for a period of time, it will be deleted by the cache manager. CMIIW.