I would like to share what I learned about how we cached our web app and then busted that cache on every deployment, so that users always see the latest changes.
Why do we need to cache?
Well, obviously: why would you do the same work again if you can save and reuse it?
In other words, let's say you need something that is a long journey away; you keep a copy somewhere nearby so you can access it quickly.
Before I go into more detail, I would like to brief you on my tech stack:
- React app using create-react-app
- Using aws s3 to upload and host the static build
- Cloudfront to cache it on the edge locations, with s3 origins
How we cached before and why it was not a good approach
We created a script that was used in our pipeline to deploy our app to s3.
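A minimal sketch of what that script did (the max-age value and the exclude pattern here are illustrative, not our exact values):

#!/bin/bash
S3BUCKETNAME="$1"

# Hashed assets under static/ get a long-lived cache header on s3
aws s3 sync build/static/ s3://$S3BUCKETNAME/static/ --delete \
  --cache-control "public, max-age=31536000"

# Everything else (index.html, manifest, etc.) is uploaded with no-cache
aws s3 sync build/ s3://$S3BUCKETNAME --delete --exclude "static/*" \
  --cache-control "no-cache"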
What's happening in the above script?
We are using the aws cli to upload the build to s3, using the s3 sync command, which upserts and deletes files. The important thing to notice is that we are handling the cache on s3, with the static folder being cached and the rest not.
Well, coming back to the tech stack: we are using s3 to upload builds and cloudfront to cache, so why the hell are we using s3 to cache.. :D
How did we manage to change the approach
We applied one of the SOLID rules, the single responsibility principle, which Robert C. Martin describes as:
A class should have one, and only one, reason to change.
So we use s3 just to upload the files, which is its single purpose:
#!/bin/bash
# Require the s3 bucket name as the first argument
if [[ "$1" != "" ]]; then
    S3BUCKETNAME="$1"
else
    echo "ERROR: Failed to supply S3 bucket name"
    exit 1
fi

# Upload new/changed files and delete removed ones - no cache headers here
aws s3 sync build/ s3://$S3BUCKETNAME --delete
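In the pipeline the script is invoked with the bucket name as its only argument, along these lines (the script name and bucket name are placeholders):

./deploy.sh my-app-bucket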
And we use cloudfront to create cache behaviors with unique path patterns, applied in order of priority.
As shown in the image above, we created three behaviors, each with a specific purpose: to cache or not to cache. Note that index.html is not cached, because it's the entry point of our app and we want this file to always be fresh on a new deployment; hence it will always be refetched from the s3 origin.
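If you want to double-check the behaviors from the CLI instead of the dashboard, something like this works (the distribution ID is a placeholder, and the path patterns shown are just roughly how our setup looked):

# Behaviors in priority order, roughly:
#   /static/*    -> long-lived caching policy (CRA's hashed assets)
#   /index.html  -> caching disabled, so the entry point is always refetched
#   Default (*)  -> caching disabled
aws cloudfront get-distribution-config --id YOUR_DISTRIBUTION_ID \
  | jq '.DistributionConfig.CacheBehaviors.Items[] | {PathPattern, CachePolicyId}'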
Moreover, you can use the existing cache policies provided by AWS or create your own here.
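To list the managed policies (such as CachingOptimized and CachingDisabled) from the CLI:

aws cloudfront list-cache-policies --type managed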
P.S: if you want to apply a cache header such as no-cache to a particular resource in the origin, then we can add the headers to the s3 origin objects, as I couldn't find an equivalent way to do the following in CloudFront:
aws s3 cp build/index.html s3://$S3BUCKETNAME/index.html --cache-control no-cache
Additionally, note: no-cache doesn't mean "don't cache"; it means the browser (or CDN) must check with the server (or "revalidate", as the spec calls it) before using the cached resource.
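A quick way to sanity-check what a resource is actually served with (the domain here is a placeholder for your distribution):

curl -I https://d1234example.cloudfront.net/index.html
# look for these response headers:
#   cache-control: no-cache
#   x-cache: Miss from cloudfront (or RefreshHit from cloudfront)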
Further reading
And basically that's it, folks! This is all we needed to manage the cache using the cloudfront dashboard.
Cache Busting
Well, one of the advantages of using the build produced by CRA is that it generates file names with a unique hash, which automatically get cache busted on cloudfront when we upload to s3.
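For example, a CRA build emits files like the following (the hashes are illustrative); when the bundle changes, the hash changes, so cloudfront treats the new file as a brand-new object and the stale cached copy is simply never referenced again:

build/static/js/main.2a4f8c1d.js
build/static/css/main.073c9b7a.css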
Well, another approach to cache busting is invalidating the cache, which is not a good approach as it's relatively slow and could get expensive fast, seeing as cloudfront gives you just 1,000 free invalidation paths per month and then charges $0.005 per invalidation path requested, as of the date of writing.
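For reference, this is what a manual invalidation looks like (the distribution ID is a placeholder), and it is exactly what we are trying to avoid:

aws cloudfront create-invalidation --distribution-id YOUR_DISTRIBUTION_ID --paths "/*"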
Besides, it’s pretty clear that CloudFront recommends Object Versioning or unique file names over invalidation…
If you’ll want to update your files frequently, we recommend that you primarily use file versioning
Conclusion
Here we learned how to:
- Manage the cache using cloudfront, rather than writing your own bash scripts and managing it yourself
- Make full use of cloudfront to update and reuse cache policies between different origins or distributions
- Add your own regex as a path pattern in a cache behavior
- Avoid invalidating the cache, as it's expensive and not the best approach
Would love to hear your thoughts on this approach.
Regards,