DEV Community

Pushpendra Singh


Web Caching, a quick guide.

Welcome, let's build a fast web app. There is so much content and competition on the internet that end users have plenty of options to choose from. How do you make sure a user decides to use your app? Are your services good enough for them to make that decision? No matter how good your services are, users arrive at your website with certain expectations, and the one thing almost every user looks for today is an immediate response and fast feedback. On average, a user waits only 2-3 seconds for a page to load before moving on. I know, that is so impatient, but we have to work with it nevertheless.

So why is page load time high? The time a page takes to load is the result of many factors, such as parsing and executing JS scripts, loading assets (images, fonts, CSS), and fetching data. Reducing page load time improves the user experience and helps meet those expectations. One solution is caching, which can solve the slow response problem to a reasonable extent.

When it comes to web performance and app optimization, caching is one of the common solutions you cannot overlook. It is built into browsers and HTTP, and is natively available to us developers to integrate into our apps without much configuration overhead. Let's learn when and how to use it.

What is caching?

Take a look at your desk and you will find some commonly used items there: pens, a notepad, snack bars, a water bottle, a mobile charger, etc. Since your desk has limited space, you only keep the things you find important and use frequently on it, while the rest goes in a cupboard or somewhere nearby, like your fridge 😉.

With this setup, when you feel thirsty you don't have to get up and walk to the kitchen to drink water; you can simply grab the bottle on your desk. This definitely saves time and effort.

Caching is similar: it is the concept of storing a copy of frequently used data close to your user to avoid re-fetching it. In caching we identify data that won't change often, or data that the user accesses repeatedly, and keep it close to the access point, that is, our application. The data stored or copied during caching is known as the cache.

A desk setup as a cache store is a very contrived example, but it helps with the mental model. 🤷‍♂️
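To make the desk analogy concrete, here is a minimal sketch of a cache with limited space. The class name `DeskCache`, the `slow_origin` function, and the least-recently-used eviction rule are all assumptions made for illustration; real caches may use other eviction policies.

```python
from collections import OrderedDict


class DeskCache:
    """A tiny least-recently-used cache with a fixed number of slots (hypothetical)."""

    def __init__(self, capacity, fetch_from_origin):
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin  # slow path, e.g. a network call
        self.items = OrderedDict()

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return self.items[key]
        value = self.fetch_from_origin(key)  # cache miss: walk to the "kitchen"
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used item
        return value


origin_calls = []


def slow_origin(key):
    """Stand-in for the origin server; records every call it receives."""
    origin_calls.append(key)
    return f"content of {key}"


cache = DeskCache(capacity=2, fetch_from_origin=slow_origin)
cache.get("/logo.png")   # miss, fetched from origin
cache.get("/logo.png")   # hit, no origin call
cache.get("/app.css")    # miss
cache.get("/font.woff")  # miss, and /logo.png is evicted to make room
```

Four lookups result in only three trips to the origin, and the desk never holds more than two items.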

It's not only about response time. Caching also helps us with:

  • Performance: it reduces the latency between request and response, and when our pages use cached data, the startup time decreases
  • Server load: fewer requests reach the server (lower requests per minute)
  • Network bandwidth: less data is transferred, which cuts cost

When implementing caching, keep in mind that not every request made by the client will reach the server, so server metrics will not give a true picture of client behavior.


Before moving on to implementing caching, let's understand where in our architecture we can have a cache store. Caching can be introduced at multiple levels of an app ecosystem:

  • Browser cache: a cache that’s built into the browser (memory cache or disk cache). It lives on the user's system and is the fastest and closest to the user. Example: Chrome
  • Proxy cache: the cache is shared by multiple users who go through the same proxy. On a cache hit, the network trip to the origin (server) is avoided, and the proxy server pulls the content from the origin server only once in a while. Examples: proxy servers, reverse proxies (CDNs)
  • Application cache: a cache stored at the application level, so data can be served directly from cache storage. It frees up compute resources. Examples: Redis, memcached, local storage


When we use stored data instead of fetching it on every use, we should be aware of the state the cache is in:

  • Cold cache: A cold cache is empty and results in mostly cache misses. A cache miss is when data has to be fetched from the origin.
  • Warm cache: The cache has started receiving requests and has begun retrieving objects and filling itself up.
  • Hot cache: All cacheable objects are retrieved, stored, and up to date.
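As a rough illustration of a cache warming up, the sketch below replays the same list of URLs twice against an initially empty cache and reports the hit ratio of each pass. The function name `hit_ratio_per_pass` and the URLs are hypothetical, and the cache is a plain dictionary with no expiry.

```python
def hit_ratio_per_pass(urls, passes):
    """Replay the same URL list several times and report the hit ratio of each pass."""
    cache = {}  # starts cold: empty, so the first lookups are all misses
    ratios = []
    for _ in range(passes):
        hits = 0
        for url in urls:
            if url in cache:
                hits += 1
            else:
                cache[url] = f"body of {url}"  # stand-in for a fetch from origin
        ratios.append(hits / len(urls))
    return ratios


ratios = hit_ratio_per_pass(["/", "/app.js", "/app.css", "/"], passes=2)
print(ratios)
```

The first pass is mostly misses because the cache is cold; by the second pass every object is already stored, so every lookup is a hit.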

Working with the Caching

The browser inspects the headers of the HTTP response generated by the web server to decide which responses it should cache.


The Expires header can still be seen on many sites. It was introduced in HTTP 1.0 but is less commonly relied on today:

  • Expires holds an expiration date after which the asset is considered invalid; this is an absolute value, the same for all clients
  • As an absolute date is used for validation, the cache lifetime depends on the client's clock. The date itself is expressed in GMT, so the web server's clock should be in sync with the client's
  • After the expiration date, the cached copy is not used and the browser makes a request to the server
  • The same resource expires at the same moment for all clients, which can cause a burst of simultaneous requests that hits the origin like a DDoS.
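The freshness check based on Expires can be sketched as follows. This is a simplified illustration, not how any particular browser implements it; the function name `is_fresh_by_expires` is hypothetical, and `now` is passed in explicitly so the effect of the client clock is visible.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header, now):
    """Return True while `now` is earlier than the absolute Expires date."""
    try:
        expires_at = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return False  # an unparseable date is treated as already expired
    return now < expires_at


expires = "Wed, 21 Oct 2015 07:28:00 GMT"
before = datetime(2015, 10, 21, 7, 0, 0, tzinfo=timezone.utc)
after = datetime(2015, 10, 21, 8, 0, 0, tzinfo=timezone.utc)

print(is_fresh_by_expires(expires, before))  # cached copy can still be used
print(is_fresh_by_expires(expires, after))   # browser goes back to the server
```

Because the comparison uses the client's idea of "now", a client whose clock runs an hour fast would treat this response as expired an hour too early.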

Cache Control

Introduced in HTTP 1.1, the Cache-Control header accepts a comma-delimited string of rules called directives. It lets you set whether a response may be cached and for how long.

  • public
    • The response can be stored in a shared cache
  • private
    • The response is intended for a single user and can't be stored in a shared cache (proxy servers)
  • no-store
    • The response is not stored in any cache, browser or proxy, under any condition
    • With no stored copy available, there is nothing to reuse
    • A new request is made to the origin every time
  • no-cache
    • Despite the name, the response can be stored, but it can't be reused without checking with the origin first
    • The stored copy is treated as if it were stale, so validation is always required
    • A validation request is made every time; if the server confirms the copy is unchanged, it is used
  • max-age
    • The time, in seconds, for which the response remains fresh
    • Preferred over the Expires header, as the time is relative to when the response was generated
    • Avoids the clock-sync issue, and expiry is spread out across clients, since the time is relative and not an absolute date
  • s-maxage
    • Like max-age, but applies only to shared caches (proxies, CDNs), where it takes precedence over max-age
  • must-revalidate
    • Once the response becomes stale, the cache must revalidate it with the origin and can't serve the stale copy
  • no-transform
    • Some CDNs have features that will transform images at the edge for performance gains, but setting the no-transform directive will tell the cache layer to not alter or transform the response in any way.
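To see how these directives combine, here is a minimal sketch that parses a Cache-Control value and works out how long a response may be reused without revalidation. Both function names are hypothetical, and the logic is deliberately simplified: it treats no-store and no-cache alike as "zero seconds", and it ignores Expires and heuristic freshness, which real caches also consider.

```python
def parse_cache_control(header):
    """Split a Cache-Control value into a {directive: value-or-True} dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip()] = value.strip().strip('"') or True
    return directives


def freshness_lifetime(directives, shared_cache):
    """Seconds a response may be reused without contacting the origin (simplified)."""
    if "no-store" in directives or "no-cache" in directives:
        return 0  # never reused without a trip to the origin
    if shared_cache:
        if "private" in directives:
            return 0  # shared caches must not store private responses
        if "s-maxage" in directives:
            return int(directives["s-maxage"])  # overrides max-age for proxies
    return int(directives.get("max-age", 0))  # assumption: no max-age means 0


rules = parse_cache_control("public, max-age=600, s-maxage=3600")
print(freshness_lifetime(rules, shared_cache=False))  # lifetime in the browser
print(freshness_lifetime(rules, shared_cache=True))   # lifetime at a proxy or CDN
```

With this header the browser keeps the response fresh for ten minutes, while a proxy or CDN may keep serving it for an hour.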

Before HTML5, meta tags were used inside HTML to specify cache-control. This is not encouraged now, as only browsers are able to parse the meta tag and understand it. Intermediate caches (proxy servers, CDNs) won’t.

Until the cached copy expires or is invalidated, the browser won’t make a call to the origin, regardless of whether the content has changed. Once it does go back to the origin, the following values help decide whether the cached copy is still valid:


  • ETag (Entity Tag)
    • ETags are identifiers, often hash values, associated with a specific version of the response
    • Different logic can be used to generate an ETag, such as file size, file content, etc.
    • The browser sends the stored ETag back in the If-None-Match request header; if the ETag has changed, the cached copy is considered expired and the server sends the latest data
    • A 304 (Not Modified) response is sent if the ETag has not changed
  • Last-Modified
    • A date-time value of when the content was last modified, which can be used to decide whether the cached copy is still valid
    • As a date is involved, the content must be timestamped
    • The timestamp should be independent of time zone (HTTP dates are expressed in GMT)
  • File names usually change when a site is built again, because build tools put a content hash in the names. In this case, files from the previous build whose names changed are no longer requested and are not used again by the browser.
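The ETag exchange can be sketched from the server's side as below. This is an illustration under stated assumptions: `make_etag` and `respond` are hypothetical names, the ETag is derived from a truncated SHA-256 of the body (only one of many possible schemes), and the comparison handles a single ETag rather than the lists or `*` that a real If-None-Match header may carry.

```python
import hashlib


def make_etag(body):
    """Derive an ETag from the response body (assumed scheme: truncated SHA-256)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body, if_none_match=None):
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match == etag:
        return 304, headers, b""  # client copy is still valid, send no body
    return 200, headers, body


# First visit: no validator yet, so the full body is returned.
status_1, headers_1, body_1 = respond(b"<h1>version 1</h1>")

# Revisit with unchanged content: the stored ETag matches.
status_2, _, body_2 = respond(b"<h1>version 1</h1>", if_none_match=headers_1["ETag"])

# Revisit after a deploy: content changed, so the ETag no longer matches.
status_3, _, body_3 = respond(b"<h1>version 2</h1>", if_none_match=headers_1["ETag"])

print(status_1, status_2, status_3)
```

The second request still costs a round trip, but the response carries no body, which is where the bandwidth saving comes from.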

Sometimes, when using caching, we have to serve users stale content.

Out of the box solution to scale

Since caching is such a powerful optimization technique, web/app architectures have a dedicated layer for it, and this layer is occupied by Content Delivery Networks (CDNs).


A content delivery network (CDN) is a group of geographically distributed servers, also known as points of presence (POPs). CDNs are used to cache static content closer to consumers. This reduces the latency between the consumer and the content or data needed.

CDN Example

A CDN can achieve scalable content delivery by distributing load among its servers, by serving client requests from servers that are close to requesters, and by bypassing congested network paths.

  • Distance reduction – reduce the physical distance between a client and the requested data
  • Hardware/software optimizations – improve the performance of server-side infrastructure through efficient load balancing, with RAM and SSDs used to provide high-speed access to cached objects
  • Reduced data transfer – employ techniques such as minification and file compression to reduce file sizes so that initial page loads occur quickly. Smaller file sizes mean quicker load times.
  • Security – keep a site secured with fresh TLS/SSL certificates, which ensure a high standard of authentication, encryption, and integrity. CDNs also provide protection against DoS attacks.

CDN cache hit ratio: this is the share of traffic, measured both in total bandwidth and in number of transactions, that is handled by the cache nodes rather than passed back to your origin servers.

CDN offload = (offloaded responses / total requests)
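As a small worked example of the formula above, the sketch below computes offload both by request count and by bandwidth. The function name `cdn_offload` and all the traffic numbers are made up for illustration.

```python
def cdn_offload(offloaded, total):
    """Fraction of traffic served by the CDN instead of the origin."""
    return offloaded / total if total else 0.0


# Hypothetical traffic for one hour.
total_requests = 1000
requests_served_by_cdn = 900

total_bytes = 500_000_000
bytes_served_by_cdn = 400_000_000

request_offload = cdn_offload(requests_served_by_cdn, total_requests)
bandwidth_offload = cdn_offload(bytes_served_by_cdn, total_bytes)

print(f"request offload: {request_offload:.0%}")
print(f"bandwidth offload: {bandwidth_offload:.0%}")
```

The two numbers can differ: here 90% of requests are offloaded but only 80% of the bytes, which would happen if the few requests that reach the origin are for larger responses.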

Origin servers still have an important role to play when using a CDN, since important server-side code and data, such as a database of hashed client credentials used for authentication, are typically maintained at the origin.

So now we know that web caching happens at multiple layers, and that it helps us improve our app's performance and reduce operating cost. If your site serves static content, caching is a must-have for your web app. See how frequent and redundant those HTTP calls are, and add caching to your application architecture accordingly. Let's give the user a better experience. Adios 👋

