Last month at DEV we made the switch from Memcache to Redis. This post talks about the why, the how, and also has a few gotchas to watch out for so...
Thanks for the detailed migration story and also the "don't" list, very useful!
A few comments about something which triggered my curiosity:
As a developer, it has been a long time since I have used any caching directly. Back in 2013 we were using Oracle Coherence (closed proprietary software, but I did not have any choice) with Java (although I do not think that is relevant for this discussion), and the keys of the cache entries would not change when the values changed.
I.e., reusing your example, the cache key would be just:

`"user-follow-count-#{id}"`
And then, every time one more follower had to be added for a user, the application code would increment the value of the corresponding cache entry (and also update the database, in a transactional fashion), but the key of the cache entry would remain the same.
In your example, I see that whenever the value of the cache entry changes, the key also changes since the timestamp is part of the key.
You said that the `updated_at` timestamp is present in the key "to help ensure that it doesn't get stale if a user is updated", but at first look it would seem neater if the timestamp were part of the value (object) instead of being part of the key, and it could serve the same purpose. Also, having the timestamp included in the key means that the user id alone is not enough to construct the desired key, which I think adds complexity to the client code.
How does the application code retrieve a specific cache entry value?
Does it use a wildcard for the timestamp part of the key?
To sum up, what are the advantages of having mutable keys when compared with the immutable cache keys which I have described above?
Thanks in advance!
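The stable-key scheme described in this comment might be sketched like this (a plain Ruby hash stands in for the Coherence/Redis client, and the key format is borrowed from the post; everything else is illustrative):

```ruby
# A plain hash standing in for a Redis- or Coherence-style cache client.
cache = {}

# Stable key: built from the user id alone, and it never changes.
def follow_count_key(id)
  "user-follow-count-#{id}"
end

cache[follow_count_key(42)] = 10   # initial cached count

# One more follower: increment the cached value in place (alongside the
# database update, ideally in the same transaction); the key stays the same.
cache[follow_count_key(42)] += 1

cache[follow_count_key(42)]   # => 11
```

With real Redis this increment would typically be a single atomic `INCR` on that fixed key.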
The advantage to having the keys change is that you never need to worry about updating or deleting them, which can add a lot of code complexity. Instead, if I have a user with an `id` and a `last_followed_at` timestamp that I use to store `follower_count`, then any time that `last_followed_at` timestamp changes (i.e. a follower is added or removed), my cache request will create a new key to store the new count. The old key will simply expire. Now, every time I request that key until it changes again, I use the `id` and the `last_followed_at` timestamp, and the cache will return the correct value.

If I do not use the `last_followed_at` timestamp, then every time the follower count changes I have to add additional code to delete the old cache key. By using the timestamp, this code is not needed.

Ok, now I fully understand it.
So the entries in your cache are actually immutable (both the key and the value never change) and whenever you need to store a new value (i.e., increase the number of followers for a specific user), you just create a new cache entry with the new key (the timestamp being the part which is different from the previous entry key) and also with the new value.
BTW, I previously missed the fact that `last_followed_at` is an instance field of the `User` object.

And as you explained, this way the application code does not bother to delete the outdated cache entries, because they will be purged by Redis automatically at the expiration time.
The only drawback I see with this approach is that you are keeping outdated entries in the cache for longer than strictly needed (until their expiration time), but you also explained that cache space is not a constraint in your case so far; hence, it is a fair compromise in order to remove complexity from the application code.
Everything makes sense now, thank you!
WOOT! Glad I was able to explain it better!
To be clear, we do this for a lot of really simple keys, but in the future, any keys that are very large we would likely plan to remove them as soon as they become invalid rather than letting them hang around. Or, as you said, start removing them more aggressively if cache size becomes a problem.
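A minimal sketch of the pattern discussed above, with the timestamp baked into the key and reads going through a `fetch`-style block (pure Ruby; `FakeCache` is a hypothetical stand-in for `Rails.cache` backed by Redis with a TTL, and the key format and field names are illustrative):

```ruby
require "time"

# Stand-in for Rails.cache (normally backed by Redis with a TTL):
# fetch returns the cached value if the key exists; otherwise it runs
# the block, stores the result under the key, and returns it.
class FakeCache
  def initialize
    @store = {}
  end

  def fetch(key)
    @store.fetch(key) { @store[key] = yield }
  end

  def size
    @store.size
  end
end

User = Struct.new(:id, :last_followed_at, :follower_count)

cache = FakeCache.new
user  = User.new(42, Time.parse("2019-11-01 12:00:00 UTC"), 10)

# The key embeds last_followed_at, so it changes whenever a follow happens.
count_key = ->(u) { "user-follow-count-#{u.id}-#{u.last_followed_at.to_i}" }

cache.fetch(count_key.call(user)) { user.follower_count }   # stores 10

# A new follower arrives: the timestamp moves, so the next fetch misses
# the old key and stores the fresh count under a brand-new key.
user.last_followed_at = Time.parse("2019-11-02 12:00:00 UTC")
user.follower_count   = 11
cache.fetch(count_key.call(user)) { user.follower_count }   # => 11

# The stale entry is never deleted by application code; in Redis it
# would simply expire when its TTL runs out.
cache.size   # => 2
```

No writer ever has to invalidate anything: key construction alone decides which entry is current.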
I am still confused by the reasoning you have provided, Molly. I may be missing something?
Redis inserts/updates a key using the SET command, and it automatically overwrites the value of a key if the same key is provided again. So you do not have to worry about writing new code to update: wherever you are currently saving to Redis with a new key when the follower count changes, you could just save the follower count under the same old key?
With your current approach you first need to query for the `last_followed_at` value, and only then can you query the user-follow-count? (The `last_followed_at` value may be part of the `User` object which has already been retrieved, but you are still looking it up, yes?) But if you have a single key which is always updated, it is never stale.
And as you mention in your follow-up comment, you'll implement deletion for larger keys as they become invalid, but then you are introducing the same code complexity you were trying to avoid? (Though, as per my understanding stated above, I don't believe there is any code complexity to be added.)
I guess Victor's initial question remains unclear to me, "what are the advantages of having mutable keys when compared with the immutable cache keys"?
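For contrast, the overwrite approach suggested in this comment would look roughly like this (a Ruby hash stands in for Redis, with `[]=` playing the role of `SET`; the key name is illustrative):

```ruby
# A hash standing in for Redis; []= plays the role of SET, which
# overwrites the value when the key already exists.
cache = {}
key = "user-follow-count-42"   # no timestamp: the key never changes

cache[key] = 10   # SET user-follow-count-42 10

# Follower count changes: SET with the same key replaces the value in
# place, so there is no stale entry and no DEL is needed...
cache[key] = 11   # SET user-follow-count-42 11

# ...but every code path that changes the count must remember to write
# to the cache; the timestamp-in-key scheme pushes that bookkeeping
# into key construction instead.
cache[key]   # => 11
```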
The real advantage here is that Rails has this handy `fetch` method, which will by default look for a key: if it is there, it returns it; if it is not, it will set it. This means we can do ALL the work we need to with this key in a single `fetch` block, rather than having to set up a `set` AND a `del` command.

Ever written anything on "caching for beginners"?
I have written multiple posts on caching but none that are tailored specifically for beginners. What kinds of things do you think would be valuable to cover? I might add it to my list of things to write about 😊
How did you learn about caching?
I learned about caching by reading about it and then using it on the job.
Makes sense. It's one of those things that's tough for beginners I think because it doesn't make a ton of sense on small personal projects. Sort of like testing.
You seem to know a lot about it, so if you're able to channel your inner beginner, reflect on things you wish you knew or wish had been explained better way back when you first learned it, and turn that into an article... I think that would make an excellent post!
A little while ago I started a Level Up Your Ruby Skillz series and I just started a post that will make caching the next topic covered 😎
Nice guide, really happy to see how it worked! Also, a nice tip: if you use Redis for ActiveJob as well, it's good to keep separate namespaces for the cache and the jobs. Otherwise, while clearing the cache, you can drop enqueued jobs. Oops! 🤷‍♂️
Nice, I really love the war story and the pointers on what to look out for when migrating.
Oh well, thank you! We are currently looking into caching DB requests and testing Redis at the moment. Have you tested Gunjs yet? I'm wondering what the differences, use cases, and pros/cons are between it and Redis. It seems faster and much more actively developed, as far as I can see from the GitHub repo. Maybe you're gonna write a post about that next? :) Would love to read that as well!
We did not test that and will be sticking with Redis for the foreseeable future. But if you end up testing Gunjs I would love to know how it compares!
Very nice!! I am just curious, Molly: did you have a rough idea of how much the flip could save before you made the suggestion?
Based on what I saw being cached in the app, 75 GB seemed absurd. My guess was that we would get down to between 5 and 10 GB, but we ended up even lower than that, which made me pretty happy. That guess was based on what I was seeing being cached, and on the fact that at my previous company we were caching way more data and our cache size was between 10 and 15 GB.