DEV Community

Switching from Memcache to Redis and Some Tips on Caching

Molly Struve (she/her) on January 04, 2020

Last month at DEV we made the switch from Memcache to Redis. This post talks about the why, the how, and also has a few gotchas to watch out for so...
 
Víctor Gil • Edited

Thanks for the detailed migration story and also the don't list, very useful!

A few comments about something that piqued my curiosity:
As a developer, it has been a long time since I used any caching directly. Back in 2013 we were using Oracle Coherence (closed proprietary software, but I did not have any choice) with Java (although I do not think that is relevant to this discussion), and the key of a cache entry would not change when its value changed.

I.e., reusing your example, the cache key would be just:
"user-follow-count-#{id}"
And then every time one more follower had to be added for a user, the application code would increment the value of the corresponding cache entry (and also in the database in a transactional fashion), but the key of the cache entry would remain the same.

In your example, I see that whenever the value of the cache entry changes, the key also changes since the timestamp is part of the key.
You said that the updated_at timestamp is present in the key "to help ensure that it doesn't get stale if a user is updated", but at first glance it would seem neater if the timestamp were part of the value (object) instead of the key, where it could serve the same purpose.

Also, having the timestamp included in the key means that just the user id is not enough to construct the desired key, which I think adds complexity to the client code.
How does the application code retrieve a specific cache entry value?
Does it use a wildcard for the timestamp part of the key?

To sum up, what are the advantages of having mutable keys when compared with the immutable cache keys which I have described above?
Thanks in advance!

Molly Struve (she/her)

What are the advantages of having mutable keys when compared with the immutable cache keys?

The advantage to having the keys change is that you never need to worry about updating or deleting them, which can add a lot of code complexity. Instead, if I have a user with an id and a last_followed_at timestamp that I use to store follower_count, then any time that last_followed_at timestamp changes (i.e., a follower is added or removed) my cache request:

Rails.cache.fetch("user-follow-count-#{id}-#{last_followed_at.rfc3339}", expires_in: 1.hour) do 
  followers.count
end

will create a new key to store the new count. The old key will simply expire. Until the count changes again, every request uses the same id and last_followed_at timestamp, so the cache returns the correct value.

If I do not use the last_followed_at timestamp, then every time the follower count changes I have to add code like the following to delete the old cache key. By using the timestamp, this code is not needed:

Rails.cache.delete("user-follow-count-#{id}")
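
The key-versioning idea described here can be sketched in plain Ruby without Rails. This is a minimal illustration, not DEV's code: the `User` struct, the `follow_count_key` and `cached_follow_count` helpers, and the Hash standing in for the cache are all assumptions, and `Time#iso8601` from the standard library stands in for the `rfc3339` call above.

```ruby
require "time"

# Hypothetical stand-in for the real User model.
User = Struct.new(:id, :last_followed_at, :followers)

# Build the cache key from the id plus the timestamp, so any change to
# last_followed_at produces a different key.
def follow_count_key(user)
  "user-follow-count-#{user.id}-#{user.last_followed_at.utc.iso8601}"
end

# Return the cached count for the current key, computing and storing it
# on a miss. `cache` is a plain Hash standing in for the cache store,
# so there is no expiry in this sketch.
def cached_follow_count(cache, user)
  cache[follow_count_key(user)] ||= user.followers.size
end

cache = {}
user = User.new(1, Time.utc(2020, 1, 4, 12, 0, 0), %w[ana ben])

cached_follow_count(cache, user)   # miss: computes and stores the count
cached_follow_count(cache, user)   # hit: same key, no recount

# A new follower bumps the timestamp, which changes the key.
user.followers << "cy"
user.last_followed_at = Time.utc(2020, 1, 4, 12, 5, 0)
cached_follow_count(cache, user)   # miss under the new key

# Both keys are now present; in a real cache the old one would expire.
puts cache.keys
```

Note that the sketch only stays correct if every change to the follower list also updates `last_followed_at`; if the timestamp is not bumped, the old key keeps serving the old count.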
 
Víctor Gil

Ok, now I fully understand it.

So the entries in your cache are actually immutable (both the key and the value never change) and whenever you need to store a new value (i.e., increase the number of followers for a specific user), you just create a new cache entry with the new key (the timestamp being the part which is different from the previous entry key) and also with the new value.
BTW, I previously missed the fact that the last_followed_at is an instance field of the User object.
And as you explained, this way the application code does not bother to delete the outdated cache entries because they will be purged by Redis automatically at the expiration time.

The only drawback I see with this approach is that you are keeping outdated entries in the cache for longer than strictly needed (until expiration time) but you also explained that data space is not a constraint in your case so far, hence, it is a fair compromise in order to remove complexity from the application code.
Everything makes sense now, thank you!

Molly Struve (she/her)

WOOT! Glad I was able to explain it better!

To be clear, we do this for a lot of really simple keys, but in the future we would likely remove any very large keys as soon as they become invalid rather than letting them hang around. Or, as you said, start removing them more aggressively if cache size becomes a problem.

Amer Mahmud

I am still confused by the reasoning you have provided, Molly. I may be missing something?

Redis inserts/updates a key using the SET command. It automatically overwrites the value of a key if the same key is provided again. So you do not have to worry about writing new code to update: wherever you currently save to Redis with a new key when the follower count changes, you can just save the follower count under the same old key?

With your current approach you first need to query for the "last_followed_at" value, and only then can you query the user-follow-count? (The "last_followed_at" value may be part of the User object which has already been retrieved, but you are still looking it up, yes?) But if you have the same key which is always updated, it is never stale.

And as you mention in your follow up comment, that you'll implement deletion for larger keys as they become invalid, but then you are introducing that same code complexity you were trying to avoid? (Though as per my understanding stated above, I don't believe there is code complexity to be added.)

I guess Victor's initial question remains unclear to me, "what are the advantages of having mutable keys when compared with the immutable cache keys"?
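
The alternative raised in this comment, one fixed key that is overwritten whenever the count changes, might look like the sketch below. It is an illustration only: `FakeRedis` and `FollowCountStore` are hypothetical names, the Hash-backed `set`/`get` merely imitate the overwrite behaviour of a Redis SET, and the database write that would normally accompany each change is left out.

```ruby
# Hypothetical in-memory stand-in for a Redis client: set overwrites, get reads.
# (A real Redis client would return stored values as strings.)
class FakeRedis
  def initialize
    @data = {}
  end

  def set(key, value)
    @data[key] = value
  end

  def get(key)
    @data[key]
  end
end

# Write-through: the key never changes, and every follow or unfollow
# overwrites the stored count under that same key.
class FollowCountStore
  def initialize(redis)
    @redis = redis
  end

  def key_for(user_id)
    "user-follow-count-#{user_id}"
  end

  # Must be called from every code path that changes the follower list.
  def write(user_id, followers)
    @redis.set(key_for(user_id), followers.size)
  end

  def read(user_id)
    @redis.get(key_for(user_id))
  end
end

store = FollowCountStore.new(FakeRedis.new)
followers = %w[ana ben]
store.write(1, followers)
followers << "cy"
store.write(1, followers)   # same key, value overwritten
puts store.read(1)          # prints 3
```

The trade-off in this sketch is where the bookkeeping lives: the fixed key needs an explicit write at every place the follower list changes, while the timestamped key needs the timestamp to change at those same places.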

Molly Struve (she/her)

The advantage to having the keys change is that you never need to worry about updating or deleting them, which can add a lot of code complexity. Instead, if I have a user with an id and a last_followed_at timestamp that I use to store follower_count, then any time that last_followed_at timestamp changes (i.e., a follower is added or removed) my cache request will create a new key to store the new count.

Rails.cache.fetch("user-follow-count-#{id}-#{last_followed_at.rfc3339}", expires_in: 1.hour) do 
  followers.count
end

The real advantage here is that Rails has this handy fetch method which, by default, looks for a key: if it is there, it returns the stored value; if it is not, it runs the block and stores the result. This means we can do ALL the work we need to with this key in this single fetch block rather than having to set up a set AND del command.
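
To make the fetch behaviour described above concrete, here is a minimal sketch of fetch-style semantics in plain Ruby. It is not the Rails implementation: `TinyCache`, its `Entry` struct, and the injectable `clock` are hypothetical, and `expires_in` is given in plain seconds instead of `1.hour`.

```ruby
# Minimal sketch of fetch-style semantics; not the Rails implementation.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  # `clock` is injectable so expiry can be exercised without sleeping.
  def initialize(clock: -> { Time.now })
    @clock = clock
    @entries = {}
  end

  # Return the stored value if the key is present and not expired;
  # otherwise run the block, store its result, and return it.
  def fetch(key, expires_in:)
    now = @clock.call
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > now

    value = yield
    @entries[key] = Entry.new(value, now + expires_in)
    value
  end
end

now = Time.utc(2020, 1, 4, 12, 0, 0)
cache = TinyCache.new(clock: -> { now })
calls = 0

2.times { cache.fetch("user-follow-count-1", expires_in: 3600) { calls += 1; 42 } }
puts calls   # prints 1: the second call was served from the cache

now += 3601  # move the fake clock past the one-hour expiry
cache.fetch("user-follow-count-1", expires_in: 3600) { calls += 1; 42 }
puts calls   # prints 2: the entry had expired, so the block ran again
```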

Lee Warrick

Ever written anything on "caching for beginners"?

Molly Struve (she/her)

I have written multiple posts on caching but none that are tailored specifically for beginners. What kinds of things do you think would be valuable to cover? I might add it to my list of things to write about 😊

Lee Warrick
  • What a cache does
  • When you should use caching
  • How to cache (front-end? back-end?)
  • Tools (Redis, etc.)

How did you learn about caching?

Molly Struve (she/her)

I learned about caching by reading about it and then using it on the job.

Lee Warrick

Makes sense. It's one of those things that's tough for beginners I think because it doesn't make a ton of sense on small personal projects. Sort of like testing.

You seem to know a lot about it, so if you're able to channel your inner-beginner and reflect on things you wish you knew or were explained better way back when you first learned it and turn that into an article... I think that would make an excellent post!

Molly Struve (she/her)

A little while ago I started a Level Up Your Ruby Skillz series and I just started a post that will make caching the next topic covered 😎

Kamil Biedrzycki

Nice guide, really happy to see how it worked! Also, a nice tip - if you use Redis for ActiveJob as well, it's good to keep namespaces separate for cache and jobs. Otherwise, while cleaning cache, you can drop enqueued jobs - oops! 🤷‍♂️
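
The namespace tip can be illustrated with a small plain-Ruby sketch. Everything in it is hypothetical: a Hash stands in for one shared Redis database, and the `cache:` and `jobs:` prefixes and the `clear_namespace` helper are made up for the example. In a Rails app the same separation would typically come from the cache store's `namespace` option or from using a separate Redis database or instance for jobs.

```ruby
# A Hash stands in for one shared Redis database (hypothetical keys throughout).
shared = {
  "cache:user-follow-count-1" => 3,
  "cache:user-follow-count-2" => 8,
  "jobs:queue:default"        => ["SendEmailJob"]
}

# Delete only the keys under the given namespace prefix.
def clear_namespace(store, namespace)
  store.delete_if { |key, _| key.start_with?("#{namespace}:") }
end

# Flushing everything, by contrast, would also drop the enqueued job.
clear_namespace(shared, "cache")
puts shared.keys   # prints jobs:queue:default
```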

Max Ong Zong Bao

Nice, I really love the war story on it and what to look out for when migrating.

Marvin Heilemann

Oh well, thank you! We are currently looking into caching DB requests and testing Redis atm. Have you tested Gunjs yet? I am wondering what the differences, use cases and pros/cons are between it and Redis. It seems faster and much better developed, from what I can see in the GitHub repo. Maybe you could write a follow-up post about that? :) I would love to read that as well!

Molly Struve (she/her)

Have you tested Gunjs yet?

We did not test that and will be sticking with Redis for the foreseeable future. But if you end up testing Gunjs I would love to know how it compares!

Edward Tam • Edited

Very nice!! I am just curious Molly, did you have a rough idea how much the flip could save before you made the suggestion?

Molly Struve (she/her)

Based on what I saw being cached in the app, 75 GB seemed absurd. My guess was that we would get down to between 5 and 10 GB, but we ended up even lower than that, which made me pretty happy. That guess was based on what I was seeing being cached and the fact that at my previous company we were caching way more data and our cache size was between 10 and 15 GB.