Ridhwana Khan for The DEV Team

Posted on Feb 10, 2023

Caching at DEV

#performance #tutorial #opensource #webdev

We’ve always put a lot of effort into performance at DEV. We want our users to be able to see content almost instantaneously when interacting with the site. To do so, we’ve placed a big emphasis on caching. We’ve had to ask ourselves questions like: What are the right things to cache? At which layer in the stack would it be best to cache them? And how will this affect overall performance?

As someone reading this post, you’re most likely a user of DEV 🙌🏼, which means you know that DEV is a read-heavy site. Tens of thousands of users all around the world are reading content on our site at any point in the day. Much of this content (an article, for example) rarely changes once it has been created, so we can refer to it as “mostly” static content. There are also more interactive parts of DEV, like being able to react to and comment on posts, which provide us with interesting caching opportunities.

There are many types of caching that we apply in the application, however in this post we'll be discussing Response Caching specifically. At the core of our response caching strategy, we identify the static content and try to cache it so that users are not making a trip to our servers for each request.

Types of response caching on DEV

There are many types of response caching that occur in different layers of the application stack. One of the important decisions that you’ll make when developing features is to decide on which part of the stack you should implement caching.

💡 Do we implement caching to avoid hitting the origin server completely in the form of Edge caching?

💡 Do we implement caching at the view layer to reduce the number of database queries and complex rendering of the UI in the form of Fragment caching?

💡 Or do we implement Browser caching so that the request never has to leave the browser?

Each of these strategies has different use cases and sometimes we may end up using a combination of multiple caching techniques on one feature to achieve the most optimised result.

Have you ever wondered why the article page on DEV loads so quickly? Even when the page is really long and there are tons of images, it’s still pretty snappy! That’s mostly due to edge caching, and when the edge cache is being refreshed, you can thank Fragment caching for stepping in. The assets load pretty quickly on that page as well, and for that we can thank the browser cache for caching our JavaScript and CSS.

Let’s go into more detail about each of these types of caching.

⚡️ Edge Caching

Edge Caching lives between the browser and the origin server, thus reducing the need to make a trip to the origin server for every request. It’s where we add intermediary storage (ideally closer to the user) between the user and the server to store the data for a period of time.

Why do we edge cache at DEV?

There are two parts of edge caching at DEV that make it beneficial for the application:

Edge caching moves memory storage closer to the end users by adding a machine ahead of the origin server. This means that a user from South Africa will get content served from a point of presence in Cape Town which contains the edge cache instead of going all the way to the origin server in the United States - the trip ends up being much faster.
The edge cache holds a cached version of the page that the user is requesting, so it can respond without any re-computation, which makes the response time really fast.

Some of the benefits of the edge cache include reducing server load and stress on the origin server, improving content delivery and response times of requests thus reducing waiting time on web pages, and lightening the network load by reducing the amount of duplicate data.

At DEV we currently use Nginx or Fastly for our edge cache. In the future, we hope to allow for our configuration to be scalable enough to run through any caching intermediary.

Currently Fastly caches the content stored on our origin server at points of presence (POPs) around the world which then improves the user experience of our site.

Within our Fastly configuration we have shielding enabled. When one of Fastly’s POPs is used as an "origin shield", requests to the origin come from that single POP rather than from every POP. This reduces the load on the origin server and increases the chances of an end user request resulting in a cache HIT.

How does edge caching actually work?

When a user navigates to our site https://dev.to/, they first hit our edge cache. Within this layer, the edge server will either have a warm cache or a cold cache.

Usually, the first visit to a site after a cache is set up or after it expires will reach a “cold” cache. A cold cache is an empty one that does not have any data stored. When a cache is “cold”, then the request will make its way to the origin server to retrieve the data, and it will be labeled a cache “MISS”. However, when it does this it also retains the data that it got from the origin server within the cache. This is referred to as the process of ‘warming’ the cache. A warm cache will contain data that is already stored and prepared to serve users. If the cache is warm, the data is returned from the cache to the browser and it will be labeled as a “HIT”. Every subsequent user will hit a warm cache until we expire or purge the cache and the same process repeats itself.
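To make the cold and warm cache flow a little more concrete, here is a minimal sketch in plain Ruby. It is only a toy model and not how Fastly or Nginx are implemented: `EdgeCache`, the `origin` lambda and the example path are hypothetical names made up for illustration.

```ruby
# Toy model of an edge cache sitting in front of an origin server.
# EdgeCache and the origin lambda are hypothetical, for illustration only.
class EdgeCache
  def initialize(origin)
    @origin = origin # callable that renders a page for a given path
    @store = {}      # an empty store is a "cold" cache
  end

  # Returns [status, body], where status is "HIT" or "MISS".
  def get(path)
    return ["HIT", @store[path]] if @store.key?(path)

    body = @origin.call(path) # trip to the origin server
    @store[path] = body       # "warming" the cache for the next visitor
    ["MISS", body]
  end
end

origin_requests = 0
origin = lambda do |path|
  origin_requests += 1
  "<html>page for #{path}</html>"
end

cache = EdgeCache.new(origin)
first_status, = cache.get("/devteam/some-article")  # cold cache
second_status, = cache.get("/devteam/some-article") # warm cache
puts "#{first_status}, #{second_status}, origin requests: #{origin_requests}"
```

The second request never reaches the origin lambda, which is the whole point of keeping the cache warm.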

Expiring a cache

When caching objects it’s important to think about how long you want the cache to be around before it gets stale. One approach is to set a reasonably long cache lifetime and then purge the cache on certain conditions.

When we want our content to be edge cached we set the appropriate cache control headers. Here’s a snippet of the code that we use in the DEV codebase:



 before_action :set_cache_control_headers, only: %i[index show]

set_cache_control_headers is defined with the following configuration:



 def set_cache_control_headers(
   max_age = 1.day.to_i,
   surrogate_control: nil,
   stale_while_revalidate: nil,
   stale_if_error: 26_400
 )

   # Only public forems should be edge-cached based on current functionality.
   return unless Settings::UserExperience.public

   request.session_options[:skip] = true # no cookies

   RequestStore.store[:edge_caching_in_place] = true # To be observed downstream.

   response.headers["Cache-Control"] = "public, no-cache" # Used only by Fastly.
   response.headers["X-Accel-Expires"] = max_age.to_s # Used only by Nginx.
   response.headers["Surrogate-Control"] = surrogate_control.presence || build_surrogate_control(
     max_age, stale_while_revalidate: stale_while_revalidate, stale_if_error: stale_if_error
   )
 end

The max-age value helps the edge cache server calculate a Time To Live (TTL) for the cached content. The TTL, defined in seconds, is the maximum amount of time that the content will be used to respond to requests. Thereafter, the cached content needs to be revalidated by consulting the origin server. For DEV we set the default max-age to 1 day; however, in some cases we override this value. For example, when caching the feed we set the max-age to two hours. I encourage you to grep for set_cache_control_headers in the codebase to explore the cache lifetimes for the various controller actions.

In case you’re curious about those other values in the snippet of code above:

stale-while-revalidate tells caches that they may continue to serve a response after it becomes stale for up to the specified number of seconds, provided that they work asynchronously in the background to fetch a new one.

stale-if-error tells the caches that they may continue to serve a response after it becomes stale for up to the specified number of seconds in the case where the check for a fresh one fails (in most cases where there is an issue at the origin server).

One best practice worth outlining is to specify a short stale-while-revalidate value and a long stale-if-error value. The rationale in the Fastly docs is that if your origin is working, you don't want to subject users to content that is significantly out of date. But if your origin is down, you're probably much more willing to serve something old if the alternative is an error page.
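As a rough illustration of how these three values fit together, the sketch below builds a Surrogate-Control style header value and classifies what a cache may do with an object of a given age. Both helpers are hypothetical: `build_surrogate_control_sketch` is not the real `build_surrogate_control` from the DEV codebase (whose body isn't shown here), and `cache_decision` is a simplified model of the rules described above, not Fastly's actual logic.

```ruby
# Hypothetical helper: the real build_surrogate_control in the codebase may differ.
def build_surrogate_control_sketch(max_age, stale_while_revalidate: nil, stale_if_error: nil)
  directives = ["max-age=#{max_age}"]
  directives << "stale-while-revalidate=#{stale_while_revalidate}" if stale_while_revalidate
  directives << "stale-if-error=#{stale_if_error}" if stale_if_error
  directives.join(", ")
end

# Simplified model of what a cache may do with an object that is `age` seconds old.
def cache_decision(age, max_age:, stale_while_revalidate: 0, stale_if_error: 0, origin_healthy: true)
  return :serve_fresh if age < max_age

  staleness = age - max_age
  # Within the short window: serve the stale copy and refresh it in the background.
  return :serve_stale_and_revalidate if staleness <= stale_while_revalidate
  # Within the long window: only serve stale content if the origin is failing.
  return :serve_stale_on_error if !origin_healthy && staleness <= stale_if_error

  :fetch_from_origin
end

one_day = 24 * 60 * 60
puts build_surrogate_control_sketch(one_day, stale_while_revalidate: 30, stale_if_error: 26_400)
puts cache_decision(one_day + 10, max_age: one_day, stale_while_revalidate: 30, stale_if_error: 26_400)
puts cache_decision(one_day + 3_600, max_age: one_day, stale_while_revalidate: 30,
                    stale_if_error: 26_400, origin_healthy: false)
```

The 30 second stale-while-revalidate value is an assumption chosen for the example; the 26_400 default for stale-if-error comes from the snippet above.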

Purging a cache

Expiring a cache allows a cache to be populated with fresh data periodically, however, there are times when you’d want to refresh the cache based on actions. This is where purging becomes useful. Purging describes the act of explicitly removing content from the edge cache, rather than allowing it to expire or to be evicted.

You’ve read above that we expire the cache on an article page after one day, but what if after publishing the article, the author realizes that they made some typos and they update the article? In this case, we wouldn’t want readers to continue viewing the outdated version with the typos. We’d want to “purge” that cache so that we can get the latest version from the origin server. Hence, when creating a cache you want to evaluate the conditions for which you’d need to purge the cache. In the case of the article page, we purge the cache on some of these actions below:

The above are some of the main cases where we purge the article, however, it is not the exhaustive list.

You can read more about purging in the Fastly developer documentation.
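The difference between expiring and purging can be sketched with a small in-memory cache. This is a toy model under stated assumptions: `ExpiringCache`, its injectable `clock`, and the article hash are hypothetical, and a real purge would be an API call to the edge cache rather than a `Hash#delete`.

```ruby
# Hypothetical in-memory stand-in for an edge cache that supports expiry and purging.
class ExpiringCache
  Entry = Struct.new(:body, :expires_at)

  def initialize(ttl:, clock: -> { Time.now.to_i })
    @ttl = ttl       # lifetime of an entry, in seconds
    @clock = clock   # injectable so the behaviour can be tested without waiting
    @entries = {}
  end

  # Returns the cached body while it is still fresh, otherwise asks the block (the origin).
  def fetch(path)
    entry = @entries[path]
    return entry.body if entry && @clock.call < entry.expires_at

    body = yield
    @entries[path] = Entry.new(body, @clock.call + @ttl)
    body
  end

  # Explicitly removes an entry instead of waiting for it to expire.
  def purge(path)
    @entries.delete(path)
  end
end

article = { body: "Hello wrold" }
cache = ExpiringCache.new(ttl: 24 * 60 * 60)

cache.fetch("/article") { article[:body] }            # cached for a day
article[:body] = "Hello world"                        # the author fixes the typo
before_purge = cache.fetch("/article") { article[:body] }
cache.purge("/article")                               # purge on update
after_purge = cache.fetch("/article") { article[:body] }

puts before_purge
puts after_purge
```

Without the purge, readers would keep seeing the version with the typo until the one day lifetime ran out.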

Observing edge caching on DEV

We can observe edge caching on an article page on DEV.

When you load up a page like https://dev.to/devteam/top-7-featured-dev-posts-from-the-past-week-3id8, you’ll notice that the content of the article page loads really quickly but then the reactions take some time to come in. This is due to the fact that the first time we render the page, we hit an edge cache, and the article gets rendered from that cache. We then make a follow-up asynchronous request to get the reactions that are not needed immediately. From a user experience point of view, you’re most likely keeping your eye on the content to read the article before reacting to it. It's also useful to note that the comments are rendered along with the article on the first load, and this is mostly for SEO purposes.

In order to see whether a request is cached, you can click on the request in the network tab and look at the response headers. The response includes an x-cache header, written by Fastly, which indicates whether the request was a HIT or a MISS. It also includes an x-cache-hits header, which indicates the number of cache hits in each node.

These are useful headers to look out for to determine if the requests are being cached.
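If you'd rather inspect these headers in a script than in the network tab, a small helper along these lines could summarise them. This is a sketch under assumptions: `cache_summary` is a hypothetical helper, the header values are made up rather than captured from dev.to, and it assumes that multiple nodes are reported as comma-separated values.

```ruby
# Summarises Fastly-style cache headers from a hash of response headers.
# Assumes comma-separated values when more than one node reports a status.
def cache_summary(headers)
  normalized = headers.transform_keys { |name| name.to_s.downcase }
  statuses = normalized.fetch("x-cache", "").split(",").map(&:strip)
  hits = normalized.fetch("x-cache-hits", "").split(",").map { |count| count.strip.to_i }

  { statuses: statuses, hits: hits, any_hit: statuses.include?("HIT") }
end

# Made-up example values, for illustration only.
p cache_summary("X-Cache" => "MISS, HIT", "X-Cache-Hits" => "0, 3")
```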

⚡️ Fragment Caching

Fragment Caching is used to cache parts of views, thus reducing the need to re-compute complex views.

Why do we fragment cache at DEV?

In a typical Rails application, when a user visits a page on the site then a request to load the page would get sent to the Rails application. The application will then call the relevant controller which in turn will request the model for the data. The model fetches the data from the database and returns it to the controller. The controller, armed with the data, will render the view which is the user interface that you as a user will see on the web page.

However, that rendering can be slow for numerous reasons. Some of these include:

There can be expensive database queries in the view.
It may be a complex view with nested loops and we have tons of complex views at DEV.
There may also be many partials with some being nested which can increase rendering time.

Hence, to avoid the slowness that comes with the above problems, we sometimes cache the view to allow the request to complete more quickly.

How does fragment caching work?

Fragment caching removes the call to our Postgres database and reduces the time taken to compute a view in favor of storing the “fragment” in a memory cache (like Redis) as key-value objects. The key is provided within the Rails application and the fragment is stored as its value.

During view rendering, when Rails encounters a cache block it checks the Redis store for that cache key. If it finds the key, it reads the value (the fragment) and renders it in place of the block. If it does not find the key, it computes the view and writes the fragment to the Redis store under that key for next time.
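The read-or-compute behaviour described above can be sketched in a few lines of plain Ruby. This is not how Rails or Redis implement it: `FragmentStore` is a hypothetical class, and a Hash plays the role of the Redis store.

```ruby
# Minimal stand-in for a fragment cache; FragmentStore is a hypothetical name
# and a plain Hash plays the role of Redis.
class FragmentStore
  attr_reader :log

  def initialize
    @data = {}
    @log = []
  end

  # Returns the cached fragment for `key`, or renders it with the block and stores it.
  def fetch(key)
    if @data.key?(key)
      @log << "Cache read: #{key}"
      return @data[key]
    end

    fragment = yield # the expensive part: queries and view rendering
    @log << "Cache write: #{key}"
    @data[key] = fragment
  end
end

store = FragmentStore.new
renders = 0
2.times do
  store.fetch("whole-comment-area-9") do
    renders += 1
    "<section>...comments...</section>"
  end
end

puts store.log
puts "rendered #{renders} time(s)"
```

The block only runs on the first call; the second call is served from the store.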

Expiring a cache

A unique key is provided for every fragment that will be cached in our Rails view. Below is an example of one of our more complex cache keys, which relies on numerous identifiers:



<% cache("whole-comment-area-#{@article.id}-#{@article.last_comment_at}-#{@article.show_comments}-#{@discussion_lock&.updated_at}-#{@comments_order}-#{user_signed_in?}", expires_in: 2.hours) do %>
...

This view touches a few different resources, and hence you’ll notice the different dynamic aspects that make up the cache key. The cache key allows the cache to be purged each time it changes.

#{@article.id} ensures that we maintain a separate cache of the comments section for every article page. #{@article.last_comment_at} changes when a new comment is added, at which point we’d want to refresh the cache. If a user chooses not to show the comments (#{@article.show_comments}) or locks a discussion thread (#{@discussion_lock&.updated_at}), we want to refresh the fragment cache as well. The same applies if a parameter is passed through that changes the sorting order of the comments (#{@comments_order}). Finally, we show a different view fragment for logged-in and logged-out users (#{user_signed_in?}).

When any of the above parts of the cache key change, the lookup misses and we write a new entry to the Redis store.
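To see why a changed key acts like an expiry, here is a small sketch that builds the same style of key outside of Rails. `ArticleStub` and `comment_area_cache_key` are hypothetical stand-ins for the instance variables used in the view, and the timestamps are example values.

```ruby
# Hypothetical stand-in for the article record used in the view.
ArticleStub = Struct.new(:id, :last_comment_at, :show_comments)

# Builds a key in the same shape as the cache call in the view above.
def comment_area_cache_key(article, discussion_lock_updated_at:, comments_order:, signed_in:)
  "whole-comment-area-#{article.id}-#{article.last_comment_at}-#{article.show_comments}-" \
    "#{discussion_lock_updated_at}-#{comments_order}-#{signed_in}"
end

article = ArticleStub.new(9, Time.utc(2022, 8, 22, 10, 26, 59), true)
options = { discussion_lock_updated_at: nil, comments_order: "top", signed_in: true }

key_before = comment_area_cache_key(article, **options)

# A new comment bumps last_comment_at, which produces a different key,
# so the next lookup misses and a fresh fragment is written.
article.last_comment_at = Time.utc(2022, 8, 23, 9, 0, 0)
key_after = comment_area_cache_key(article, **options)

puts key_before
puts key_after
puts key_before == key_after
```

The old entry is never explicitly deleted in this model; it simply stops being looked up and falls out of the store once its expires_in elapses.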

Another way that we get fresh data is by expiring the cache. Just like with edge caching, we can set an expiry time after which the cache will be refreshed.

Observing fragment caching on DEV

We use fragment caching in numerous places in the DEV application. You can grep for <% cache in our codebase to view these instances. Some include:

These are just some of the instances where we use Fragment caching, and each of these views is cached for one of the reasons outlined above.

In order to observe Fragment caching you can run rails dev:cache. You can start off by clearing the cache with rake tmp:cache:clear to ensure that the first partial is rendered by the server. Thereafter, spin up the DEV server locally and navigate to the article page.

When you navigate to this page, you should be able to see the whole-comment-area... partial being logged.

On the first load when the cache is clear you will notice that we write to the cache



Cache write: views/articles/_full_comment_area:83798ee6f16a072196604860c4f72c64/whole-comment-area-9-2022-08-22 10:26:59 UTC-true--top-true ({:namespace=>nil, :compress=>true, :compress_threshold=>1024, :expires_in=>2 hours, :race_condition_ttl=>nil})
15:35:44 web.1       | [dd.env=development dd.service=rails-development dd.trace_id=781241634623167295 dd.span_id=0] Cache write: views/articles/show:136d4e5867aa13a4886bc993aba6558e/specific-article-extra-scripts-9-2022-12-19 14:55:37 UTC ({:namespace=>nil, :compress=>true, :compress_threshold=>1024, :expires_in=>24 hours, :race_condition_ttl=>nil})

However, on subsequent loads we’ll simply continue reading from the store.



15:28:44 web.1       | [dd.env=development dd.service=rails-development dd.trace_id=2192981961066220961 dd.span_id=0] Cache read: views/articles/_full_comment_area:83798ee6f16a072196604860c4f72c64/whole-comment-area-10-2023-01-25 13:02:56 UTC-true--top-true ({:namespace=>nil, :compress=>true, :compress_threshold=>1024, :expires_in=>2 hours, :race_condition_ttl=>nil})
15:28:44 web.1       | [dd.env=development dd.service=rails-development dd.trace_id=2192981961066220961 dd.span_id=0] Read fragment views/articles/_full_comment_area:83798ee6f16a072196604860c4f72c64/whole-comment-area-10-2023-01-25 13:02:56 UTC-true--top-true (0.4ms)

We realize that we may sometimes overuse Fragment caching in our application, and there is probably some room for cleanup and improvement. One aspect to keep an eye on is the complexity of the cache keys. If the cache keys are more complex than the cached content, you may end up in a situation where it takes longer to check the memory store for that key than it does to render the view, and at that point it's time to re-evaluate. Whilst Redis handles this very well and scales magnificently, sometimes the better path can be optimizing the database queries.

🪆 Russian Doll Caching

If you look through the application, you will see that we sometimes nest cached fragments inside other cached fragments - this is referred to as Russian Doll Caching. It ensures that our cached fragments are broken up into smaller pieces, which allows the outer cache to be rendered faster when only one of the nested fragments changes.

An example of Russian Doll Caching can be seen in the manner that we render our navigation links:



 <%= render partial: "layouts/sidebar_nav_link", collection: NavigationLink.where(id: navigation_links[:other_nav_ids]).ordered, as: :link, cached: true %>

NavigationLink.where(id: navigation_links[:other_nav_ids]).ordered renders a navigation link collection. Hence, using the collection attribute, we’re theoretically wrapping each navigation link layouts/sidebar_nav_link partial in a cache block. Each object in the collection contains the details of a navigation link, and so if any of that data is updated, the cache of that particular element will be invalidated whilst the outer cache and the other nested caches will remain unchanged.
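The benefit of nesting can be sketched with a toy cache that counts how often each level is rendered. Everything here is hypothetical: `NavLink`, `DollCache` and `render_sidebar` are made-up names, a Hash stands in for the cache store, and the outer key is assumed to be derived from the links' timestamps, which is not necessarily how the DEV views key their outer fragments.

```ruby
# Hypothetical stand-ins: NavLink mimics a record, DollCache mimics the cache store.
NavLink = Struct.new(:id, :name, :updated_at)

class DollCache
  attr_reader :renders

  def initialize
    @store = {}
    @renders = Hash.new(0) # how many times each level was actually rendered
  end

  def fetch(key, level)
    @store.fetch(key) do
      @renders[level] += 1
      @store[key] = yield
    end
  end
end

def render_sidebar(cache, links)
  versions = links.map { |link| "#{link.id}-#{link.updated_at}" }
  cache.fetch("sidebar/#{versions.join('/')}", :outer) do
    links.map do |link|
      # Each link gets its own nested fragment, keyed by id and timestamp.
      cache.fetch("nav_link/#{link.id}-#{link.updated_at}", :inner) { "<li>#{link.name}</li>" }
    end.join
  end
end

links = [NavLink.new(1, "Home", 1), NavLink.new(2, "Tags", 1)]
cache = DollCache.new

render_sidebar(cache, links) # renders the outer fragment and both links
render_sidebar(cache, links) # served entirely from the outer fragment
links[1].updated_at = 2      # one link changes
render_sidebar(cache, links) # outer is rebuilt, but only link 2 is re-rendered

p cache.renders
```

In this model the outer fragment is rebuilt after the change, but it reuses the untouched inner fragment for the first link, so only one link is rendered again.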

⚡️ Browser Caching

The Browser cache allows resources to be cached in one's browser, thus reducing the page load time and eliminating the need to go to the server. It stores resources in the web browser.

Why do we browser cache?

At DEV we cache our static assets like images, CSS, and JavaScript. We were early adopters of service workers as a form of browser caching, but we later removed them when we ran into many caching bugs. We mainly browser cache to speed up page loads and to minimize the load on the server.

How does the browser cache work?

When a user makes a request for the first time on a site, the browser will request those resources from the server and store some of the resources in the browser cache. On subsequent requests, these resources will then be returned from the browser instead of having to travel over the internet to the user.

In order to browser cache we set some headers in our production.rb:



 config.public_file_server.headers = {
   "Cache-Control" => "public, s-maxage=#{30.days.to_i}, max-age=#{3000.days.to_i}"
 }

Here, we set the Cache-Control header to allow public (intermediate) caching. max-age tells the browser that it may cache the asset for up to 3000 days, so repeat requests for it no longer hit the server. s-maxage applies to shared (surrogate) caches, in our case Fastly, which may cache the asset for 30 days.
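For a sense of what that header looks like once the durations are interpolated, here is a sketch in plain Ruby without ActiveSupport. `static_asset_cache_control` and `parse_cache_control` are hypothetical helpers written for this example; the 30 and 3000 day values come from the config above.

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Builds the same shape of header value as the production.rb snippet above.
def static_asset_cache_control(shared_days:, browser_days:)
  "public, s-maxage=#{shared_days * SECONDS_PER_DAY}, max-age=#{browser_days * SECONDS_PER_DAY}"
end

# Splits a Cache-Control value into a hash of directives.
# Directives without a value (like "public") are stored as true.
def parse_cache_control(value)
  value.split(",").map(&:strip).each_with_object({}) do |directive, parsed|
    name, seconds = directive.split("=", 2)
    parsed[name] = seconds ? seconds.to_i : true
  end
end

header = static_asset_cache_control(shared_days: 30, browser_days: 3000)
puts header

directives = parse_cache_control(header)
puts "browser may cache for #{directives['max-age'] / SECONDS_PER_DAY} days"
puts "shared caches may cache for #{directives['s-maxage'] / SECONDS_PER_DAY} days"
```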

When the browser parses the index.html file it looks for the different referenced script files. You’ll notice that these files are fingerprinted. Hence the request for those files will look like https://dev.to/assets/base-bb53898178e7a6557807ce845d06cd2d60fd1e7ab108f2bad351d5bb92ec53b9.js. This versioning technique binds the name of a file to its content, usually by adding the file hash to the name.

If a file has been updated on the server, the fingerprint changes as well which then results in the browser not having a reference to that filename in its cache. Hence, it refreshes the cache for that resource.
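Fingerprinting itself is simple to sketch. In a Rails app the asset pipeline handles this for you, so the helper below is purely illustrative: `fingerprinted_name` is a hypothetical function, and the choice of SHA-256 is an assumption based on the 64-character hash in the example URL.

```ruby
require "digest"

# Hypothetical helper: derives a file name from the file's content.
# Assumes a SHA-256 digest; the asset pipeline normally does this for you.
def fingerprinted_name(filename, content)
  extension = File.extname(filename)
  base = File.basename(filename, extension)
  "#{base}-#{Digest::SHA256.hexdigest(content)}#{extension}"
end

original = fingerprinted_name("base.js", "console.log('v1');")
unchanged = fingerprinted_name("base.js", "console.log('v1');")
updated = fingerprinted_name("base.js", "console.log('v2');")

puts original
puts original == unchanged # same content, same name: the browser can keep using its cached copy
puts original == updated   # new content, new name: the browser has to request the new file
```

Because the name changes whenever the content does, a very long max-age is safe: the browser never needs to ask whether a file it already has is still current.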

Observing browser caching

From the screenshot below you can see that our js file returns a 200 (from memory cache) which shows that the response was indeed served from the cache. If you look at the Request Url you will notice that the file name is fingerprinted to let the browser know whether the resource has changed on the server or not.

Conclusion

To conclude, I’d like to discuss some of the questions I ponder on when trying to determine whether or not to implement caching on a feature.

One of them is “How often do we see the data for that feature changing?” If the answer to that is very seldom and “relatively static” then caching is most likely the way to go.

Another question that I ask is at what layer does it make sense to implement the caching? Here I think about what needs to be cached and how often will it need to be refreshed.

If I think that caching is the way to go, then I explore whether it would be beneficial to implement only Fragment caching to cache any complex views on that page. This allows requests to still hit the server while reducing the load on the database by serving from the application cache.

Or do I need something more? Perhaps we anticipate a large load on the page, with successive hits from all around the world. In this case, edge caching will be more efficient in serving the page. Maybe it makes sense to do both, like we do in many parts of the DEV application.

There’s no right or wrong answer here, but as with all performance problems, we encourage our team members and contributors to analyze the problem carefully and run some tests in order to make the most informed decision.

And that’s all folks - I hope that you found this post useful. Please drop any feedback and comments below 👇🏽.

Top comments (17)

Chris Greening • Feb 10 '23

Fantastic read - thanks so much for sharing Ridhwana!

I love when The DEV Team releases technical posts explaining topics related to DEV.to as the platform, gives a really concrete reference point of these concepts in action with real-world applications

Ridhwana Khan • Feb 10 '23

Thanks Chris, I’m really glad you enjoyed it!

Let us know if there’s anything on the platform that you’re curious about and would be interested in learning more about.

Che • Feb 10 '23

Great write up, very interesting!

spO0q • Feb 10 '23

Nice read!

Cache invalidation is still one of the toughest problems for technical teams, especially when there are multiple layers of different kinds of caching systems, which is not uncommon in modern apps.

Ruslan Stelmachenko • Feb 11 '23

After reading the article I wonder what name for DEV environment is used by DEV's DEVs and how many times this confusion popped up in conversations? 😄