
Christian Zink

Originally published at itnext.io

How to Cache Aggregated Data with Redis and Lua Scripts for a Scaled Microservice Architecture


A scaled microservice-based application with a large and growing amount of data faces the challenge of delivering aggregated data, such as top lists, efficiently.

In this article, I show you how to use Redis to cache the aggregated data, while the databases store the item/line data as the “source of truth” and use sharding to scale.

A single Redis instance can handle on the order of 100,000 operations per second.

My example data model with users, posts, and categories can serve as a basis for your own use cases.

Contents

  1. Example Use-Cases and Data Model

  2. Setup Redis and Implement the Top Categories

  3. Top Users, Latest User Posts, and the Inbox Pattern

  4. Lua Scripting for Atomicity

  5. Final Thoughts and Outlook


1. Example Use-Cases and Data Model

In the example microservice application, users can write posts in categories. They can also read the posts by category, including the author name. The newest posts are on top. The categories are fixed and seldom change.

See my previous post “How to use Database Sharding and Scale an ASP.NET Core Microservice Architecture” if you are interested in source code and more details about the example application.

Logical Data Model: users write posts, and each post belongs to one category.

Currently, one million users exist. Every day each user writes about ten posts.

Top 10 Categories

The top 10 categories will be displayed on the main page. In MySQL, this would require a statement like this:

SELECT CategoryId, COUNT(PostId) FROM Post GROUP BY CategoryId ORDER BY COUNT(PostId) LIMIT 10;

Executing this statement over millions of rows would be very slow, and running it on every page visit would be impossible.

Because of the large amount of data, I also decided to shard by category. So building the top list would also require merging the per-shard top lists from multiple databases, as the sketch below illustrates.
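
The merge itself is simple because every category lives in exactly one shard. A minimal, hypothetical C# sketch (the shard results are hard-coded here for illustration; in the application they would come from the GROUP BY statement above, executed per shard):

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical per-shard results of the GROUP BY statement above.
var shardResults = new List<List<(string CategoryId, long PostCount)>>
{
    new() { ("Category5", 100), ("Category1", 1) }, // shard 1
    new() { ("Category2", 10) }                     // shard 2
};

// Merge the per-shard lists and take the global top 10.
var top10 = shardResults
    .SelectMany(shard => shard)
    .OrderByDescending(c => c.PostCount)
    .Take(10);

foreach (var (categoryId, postCount) in top10)
    Console.WriteLine($"{categoryId}: {postCount}");

Even then, the expensive GROUP BY would still run on every shard. That is what caching the aggregates in Redis avoids.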


2. Setup Redis and Implement the Top Categories

Install Docker Desktop

Create the Redis container:

C:\dev>docker run --name redis -d redis

Connect to the container and start the redis-cli:

C:\dev>docker exec -it redis redis-cli
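
In application code, you would talk to Redis through a client library instead of redis-cli. A minimal connection sketch, assuming the StackExchange.Redis NuGet package and Redis reachable on localhost:6379 (for example, when the container is started with -p 6379:6379). The later sketches reuse such a db handle:

using System;
using StackExchange.Redis;

// One ConnectionMultiplexer per application; it is designed to be shared and reused.
var redis = ConnectionMultiplexer.Connect("localhost:6379");
IDatabase db = redis.GetDatabase();

Console.WriteLine(db.Ping()); // round-trip check against the Redis container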

Add Top Categories

The top categories (“CategoriesByPostCount”) are stored in a Redis sorted set (ZSET).

Add the first entry with ZADD and 99 posts for the category “Category5”:

127.0.0.1:6379> ZADD CategoriesByPostCount GT 99 "Category5"

It adds one entry:

(integer) 1

Add some more entries:

> ZADD CategoriesByPostCount GT 1 "Category1"

(integer) 1

> ZADD CategoriesByPostCount GT 10 "Category2"

(integer) 1

Update Category5:

> ZADD CategoriesByPostCount GT 100 "Category5"

(integer) 1

> ZADD CategoriesByPostCount GT 98 "Category5"

(integer) 0

The last command returns zero, and the score stays at 100. Because of the GT parameter, the score is only updated when the new value is greater than the current one. This helps handle situations where updates arrive out of order (post counts don’t decrease).
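
The same update can be issued from application code. A sketch, assuming StackExchange.Redis in a version that exposes the GT option via SortedSetWhen.GreaterThan:

using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost:6379");
var db = redis.GetDatabase();

// ZADD CategoriesByPostCount GT 100 "Category5"
// The score is only written when the new value is greater than the stored one.
db.SortedSetAdd("CategoriesByPostCount", "Category5", 100,
    SortedSetWhen.GreaterThan);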

Read Top Categories

Use ZRANGE to read the top 10 categories with their post counts:

> ZRANGE CategoriesByPostCount 0 9 WITHSCORES REV

1) "Category5"
2) "100"
3) "Category2"
4) "10"
5) "Category1"
6) "1"

Easily retrieve the second page (entries 11–20), etc:

ZRANGE CategoriesByPostCount 10 19 WITHSCORES REV
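
Reading the top list from application code, again as a StackExchange.Redis sketch:

using System;
using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost:6379");
var db = redis.GetDatabase();

// ZRANGE CategoriesByPostCount 0 9 WITHSCORES REV
var topCategories = db.SortedSetRangeByRankWithScores(
    "CategoriesByPostCount", 0, 9, Order.Descending);

foreach (var entry in topCategories)
    Console.WriteLine($"{entry.Element}: {entry.Score}");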

Prerequisites

The posts per category can be counted in SQL when a new post is created:

BEGIN TRANSACTION
INSERT INTO Post (...)
UPDATE Categories SET PostCount = PostCount + 1 WHERE CategoryId = ...
COMMIT TRANSACTION

This is possible because the database is sharded by category. All posts of one category are in the same database.


3. Top Users, Latest User Posts, and the Inbox Pattern

The posts of a single user are scattered over all shards. So it is not possible to run UPDATE User SET PostCount = PostCount + 1 in the same database transaction as the insert and then update Redis.

The operation in Redis has to be “idempotent”. The inbox pattern makes this possible.

Further reading: Outbox, Inbox patterns and delivery guarantees explained

Add Posts (with a race condition)

On every new post, add an entry to the user's PostsByTimestamp sorted set:

> ZADD {User:5}:PostsByTimestamp 3455667878 '{Title: "MyPostTitle", Category: "Category5", PostId: 13}'

(integer) 1

Then increment the post count in UsersByPostCount:

> ZINCRBY UsersByPostCount 1 "5"

To make it idempotent, check the result of adding the post to the inbox. Issuing the same command again returns zero (the entry already existed):

> ZADD {User:5}:PostsByTimestamp 3455667878 '{Title: "MyPostTitle", Category: "Category5", PostId: 13}'

(integer) 0

In that case, don’t increment UsersByPostCount. The sketch below shows this check in application code.
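
A sketch, assuming StackExchange.Redis; note that the two commands are still not atomic here:

using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost:6379");
var db = redis.GetDatabase();

const string userId = "5";
const string post = "{Title: \"MyPostTitle\", Category: \"Category5\", PostId: 13}";

// ZADD returns true only if the member was newly added to the inbox.
bool isNew = db.SortedSetAdd($"{{User:{userId}}}:PostsByTimestamp", post, 3455667878);

// Only count the post on its first delivery; a redelivered message is ignored.
if (isNew)
    db.SortedSetIncrement("UsersByPostCount", userId, 1);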

The ZADD to PostsByTimestamp and the ZINCRBY to UsersByPostCount have to be executed atomically. I will show you how to use a Redis Lua script to achieve this. But first, let’s read the top users and the latest user posts.

Read the Top Users and the Latest User Posts

Top 10 users:

> ZRANGE UsersByPostCount 0 9 WITHSCORES REV

1) "6"
2) "10"
3) "5"
4) "8"
5) "3"
6) "4"
7) "1"
8) "3"

The user with ID 6 has 10 posts, ID 5 has 8 posts, etc.

The latest posts of the user with ID 5:

> ZRANGE {User:5}:PostsByTimestamp 0 9 WITHSCORES REV

1) "{Title: \"MyPostTitle2\", Category: \"Category1\", PostId: 14}"
2) "3455667999"
3) "{Title: \"MyPostTitle\", Category: \"Category5\", PostId: 13}"
4) "3455667878"
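
The same reads from application code, as a StackExchange.Redis sketch:

using System;
using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost:6379");
var db = redis.GetDatabase();

// ZRANGE UsersByPostCount 0 9 WITHSCORES REV
var topUsers = db.SortedSetRangeByRankWithScores(
    "UsersByPostCount", 0, 9, Order.Descending);
foreach (var user in topUsers)
    Console.WriteLine($"User {user.Element}: {user.Score} posts");

// ZRANGE {User:5}:PostsByTimestamp 0 9 WITHSCORES REV
var latestPosts = db.SortedSetRangeByRankWithScores(
    "{User:5}:PostsByTimestamp", 0, 9, Order.Descending);
foreach (var post in latestPosts)
    Console.WriteLine($"{post.Score}: {post.Element}");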

4. Lua Scripting for Atomicity

Atomically Add Posts with Lua Scripting

A Redis Lua script can make adding the post to PostsByTimestamp and incrementing the user's post count atomic. But an extra counter per user is needed, so that all key parameters of the script map to the same Redis hash tag.

The curly braces in a key like “{User:5}:PostsByTimestamp” mark a Redis hash tag: all keys with the same tag are mapped to the same hash slot.

This Lua script tries to add a member to a sorted set. If the member is new, it also increments a counter. If the member already exists, it returns the current value of the counter:
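
if tonumber(redis.call('ZADD', KEYS[1], ARGV[1], ARGV[2])) == 1 then
    return redis.call('INCR', KEYS[2])
else
    return redis.call('GET', KEYS[2])
end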

Use EVAL to call the Lua script and pass “{User:8}:PostsByTimestamp” and “{User:8}:PostCount” as keys (one line on the command line):

> EVAL "if tonumber(redis.call('ZADD', KEYS[1], ARGV[1], ARGV[2])) == 1 then return redis.call('INCR', KEYS[2]) else return redis.call('GET', KEYS[2]) end" 2 {User:8}:PostsByTimestamp {User:8}:PostCount 3455667999 "{Title: \"MyPostTitle2\", Category: \"Category1\", PostId: 14}"

(integer) 1

Then use the returned count to set the score for user 8 in UsersByPostCount:

ZADD UsersByPostCount GT 1 "8"
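
From application code, the whole flow could look like this (a sketch, assuming StackExchange.Redis; ScriptEvaluate runs the script on the server):

using System;
using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost:6379");
var db = redis.GetDatabase();

const string script =
    "if tonumber(redis.call('ZADD', KEYS[1], ARGV[1], ARGV[2])) == 1 " +
    "then return redis.call('INCR', KEYS[2]) " +
    "else return redis.call('GET', KEYS[2]) end";

// Atomically add the post to the user's inbox and bump the per-user counter.
var postCount = (long)db.ScriptEvaluate(script,
    new RedisKey[] { "{User:8}:PostsByTimestamp", "{User:8}:PostCount" },
    new RedisValue[] { 3455667999L, "{Title: \"MyPostTitle2\", Category: \"Category1\", PostId: 14}" });

// Publish the new count to the global top list (GT: the score never decreases).
db.SortedSetAdd("UsersByPostCount", "8", postCount, SortedSetWhen.GreaterThan);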

Store the Script in Redis

For performance reasons, you can store the script in Redis:

> SCRIPT LOAD "if tonumber(redis.call('ZADD', KEYS[1], ARGV[1], ARGV[2])) == 1 then return redis.call('INCR', KEYS[2]) else return redis.call('GET', KEYS[2]) end"

"cd9222afab5eb8d579942016a8c22427eff99429"

Use the returned hash to call the script with EVALSHA:

> EVALSHA "cd9222afab5eb8d579942016a8c22427eff99429" 2 {User:8}:PostsByTimestamp {User:8}:PostCount 4455667999 "{Title: \"MyPostTitle3\", Category: \"Category1\", PostId: 20}"

(integer) 2
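
The same can be done from application code: ScriptLoad on a server connection returns the SHA1 hash, and ScriptEvaluate accepts that hash instead of the script text. A sketch, again assuming StackExchange.Redis (the library also offers LuaScript.Prepare, which manages script caching for you):

using System;
using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost:6379");
var db = redis.GetDatabase();
var server = redis.GetServer("localhost", 6379);

const string script =
    "if tonumber(redis.call('ZADD', KEYS[1], ARGV[1], ARGV[2])) == 1 " +
    "then return redis.call('INCR', KEYS[2]) " +
    "else return redis.call('GET', KEYS[2]) end";

// SCRIPT LOAD: store the script once and keep its SHA1 hash.
byte[] scriptHash = server.ScriptLoad(script);

// EVALSHA: call the cached script by hash.
var postCount = (long)db.ScriptEvaluate(scriptHash,
    new RedisKey[] { "{User:8}:PostsByTimestamp", "{User:8}:PostCount" },
    new RedisValue[] { 4455667999L, "{Title: \"MyPostTitle3\", Category: \"Category1\", PostId: 20}" });

Console.WriteLine(postCount); // 2: the second post in this user's inbox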

5. Final Thoughts and Outlook

In this article, you set up Redis and started with a simple use-case to cache aggregated data. Then you used the inbox pattern and Lua scripting for atomicity.

In one of my next articles, I will show you how to implement it in a C# ASP.NET Core microservice application.

Redis offers much more than I showed you in this article. You can explore the other commands and use cases and how they solve problems in your application. In a real-life application, you might have to use TTLs to automatically expire entries so that the cache does not grow without limit. Maybe you also need to scale Redis itself.

Please contact me if you have any questions, ideas, or suggestions.
