A reverse proxy is an application that sits in front of back-end applications and forwards client (e.g. browser) requests to them. Reverse proxies help increase scalability, performance, resilience, and security. To the client, the resources returned appear to originate from the proxy itself.
Amazon CloudFront is a CDN service built for high performance and security. Among its advantages, the one that matters to us is its global edge network, with low-latency, high-throughput connectivity.
A typical case where a reverse proxy cache is useful is when building HTTP APIs (API Gateway v2) on AWS. This type of API is designed with minimal features so that it can be offered at a lower price, so it lacks options such as edge optimization, API keys, per-client throttling, and caching (more detailed comparison here). Without caching, every request has to be processed by the origin servers, which increases the load on the backend and makes every request pay the full latency.
As it turns out, CloudFront solves this problem nicely.
For simplicity we will be using Serverless Framework v3 to handle AWS stack creation.
You need a basic Serverless API deployment up and running before you can configure a CloudFront distribution on top of it.
If you have no previous experience with Serverless, follow this link to learn how to set one up; just remember to select "HTTP API", since it is the one that has no built-in cache support.
Let's start by defining the type of API we need and a basic function to use as an example:
# serverless.yml
provider:
  name: aws
  # ...
  httpApi:
    name: "myapi"
    cors: true

functions:
  hello:
    handler: src/handler.hello
    events:
      - httpApi:
          path: /
          method: get
As for the data we return, let's sleep for 5 seconds using the Timers API to simulate a background process that "takes too long" to complete, something like this:
🎉 After a correct deployment, the api should be created successfully and ready to use.
Now that the API is live and usable, we can make a request by calling the endpoint provided:
curl --location --request GET 'https://644z4ooroe.execute-api.us-east-1.amazonaws.com/'
This returns the response we explicitly send back from our Lambda, BUT only after 5 seconds:
Notice the time taken to complete the request: 5.21s. Most of it is the 5 seconds we set the function to sleep; the rest is influenced by the Lambda spin-up time (known as a cold start). Consecutive requests reduce the time the script needs to return data, but only by a few ms.
So what happens if we cache this response so that we don't have to wait those 5s of processing time?
The process consists of creating a distribution that uses the API domain as its origin, enabling the distribution's built-in cache, and controlling how long responses stay cached with a TTL.
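To make the idea of a TTL-controlled cache in front of an origin concrete, here is a tiny in-memory sketch. It is only a conceptual model: CloudFront's real cache is distributed across edge locations and far more involved, and `createTtlCache`, `fetchOrigin` and the `now` clock parameter are hypothetical names used for illustration:

```javascript
// Conceptual model of a TTL cache in front of a slow origin.
// Not how CloudFront is implemented; it only illustrates hit, miss and expiry.
function createTtlCache(fetchOrigin, ttlMs, now = Date.now) {
  const entries = new Map(); // key -> { value, expiresAt }

  return async function get(key) {
    const entry = entries.get(key);
    if (entry && entry.expiresAt > now()) {
      // Still fresh: answer from the cache without touching the origin.
      return { value: entry.value, cache: "Hit" };
    }
    // Missing or expired: go to the origin and store the result for ttlMs.
    const value = await fetchOrigin(key);
    entries.set(key, { value, expiresAt: now() + ttlMs });
    return { value, cache: "Miss" };
  };
}
```

The first call for a key pays the full origin cost, every call within the TTL is served from memory, and the first call after the TTL expires goes back to the origin again.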
Following the Amazon CloudFront resource type reference, we will create the distribution directly from the Serverless template and connect it to the previously created API as our origin.
We need two resources to set this up: a cache policy and the distribution itself.
At the end of the serverless.yml file, let's create a new section, resources, where we can declare the resources that the sls deploy command will create for us inside AWS:
resources:
  Resources:
    # https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-cloudfront-cachepolicy.html
    mycachepolicy:
      Type: AWS::CloudFront::CachePolicy
      Properties:
        CachePolicyConfig:
          Name: mycachepolicy
          # Customize the TTL values below (in seconds)
          DefaultTTL: 86400
          MaxTTL: 86400
          MinTTL: 1
          ParametersInCacheKeyAndForwardedToOrigin:
            EnableAcceptEncodingGzip: true
            EnableAcceptEncodingBrotli: true
            CookiesConfig:
              CookieBehavior: none
            HeadersConfig:
              HeaderBehavior: none
            QueryStringsConfig:
              QueryStringBehavior: none
    # https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-cloudfront-distribution.html
    mydistribution:
      Type: AWS::CloudFront::Distribution
      Properties:
        DistributionConfig:
          Enabled: true
          Origins:
            # HttpApi is auto generated by serverless; the "https://" prefix is removed because it is not allowed in a domain name.
            # This uses the default API URL generated by AWS, if you have a custom api url, just replace it here
            - DomainName: !Select [1, !Split ["//", !GetAtt HttpApi.ApiEndpoint]]
              # this value should be moved to a custom global var instead of duplicating the same string below
              Id: mydistributiondomainid
              CustomOriginConfig:
                OriginProtocolPolicy: https-only
          DefaultCacheBehavior:
            CachePolicyId: !Ref mycachepolicy
            # legacy per-behavior field; AWS documents it as deprecated in favor of the cache policy's DefaultTTL above
            DefaultTTL: 300
            TargetOriginId: mydistributiondomainid
            ViewerProtocolPolicy: https-only
            # List of allowed methods served through the cache, only GET (and HEAD) in our case
            AllowedMethods:
              - GET
              - HEAD
          # all means all edge locations (recommended)
          PriceClass: PriceClass_All
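The `!Select [1, !Split ["//", ...]]` expression used for DomainName can look cryptic, so here is the same transformation written in plain JavaScript. It assumes the endpoint has the shape of the URL used in the curl example earlier (scheme, then host, no path):

```javascript
// Same idea as: !Select [1, !Split ["//", !GetAtt HttpApi.ApiEndpoint]]
const apiEndpoint = "https://644z4ooroe.execute-api.us-east-1.amazonaws.com";

const parts = apiEndpoint.split("//"); // !Split  -> ["https:", "644z4ooroe.execute-api..."]
const domainName = parts[1];           // !Select -> the element at index 1

console.log(domainName); // 644z4ooroe.execute-api.us-east-1.amazonaws.com
```

CloudFront wants a bare host name for the origin, which is why the scheme has to go.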
Don't forget to redeploy to update the AWS stack with the new config. If everything works, we should be able to make requests to the CloudFront URL, and it will cache the responses from the origin.
⚡️ Here are the final requests: the first goes to our origin, the second to the CloudFront cache.
🤯 As we can see, the difference in response time is huge, just from enabling a cache.
Remember that the very first request (a CloudFront miss) will take as long as the origin, because it is the one that populates the cache.
All the code is available here if you want to test it.
Hope it helps, cheers 🍻