Daniele Frasca for AWS Community Builders

Posted on Feb 15, 2023 • Edited on May 19, 2023

Get me there faster

#typescript #go #rust #softwaredevelopment

If you follow me on Twitter or have read some of my posts, especially in this "Serverless Latency" series, you may have noticed that I am no longer concentrating on the hello-world aspect of Serverless but on squeezing the best performance I can out of a Serverless application.

Latency is a complex subject, and it is composed of many factors.

Until now, I have talked mainly about the back-end part of the application: the front-door selection, where you should prefer Elastic Load Balancing over Amazon API Gateway for cost and speed (but only if you can work without some of the APIGW features); how to optimise the Lambda duration by applying best practices; and even moving to a faster runtime to gain the extra milliseconds.

Today I will explain how we can reach the front door (either an ALB or APIGW) as fast as possible. This will not be a networking article; I leave that to SRE/DevOps people.

When I am building an application, this application may expose an endpoint (front door), and the users could be in the same region or around the world.

I will not go into the multi-region setup because what I will talk about today applies to the single and multi-region design. However, if you want to read more about multi-region, I have written something about it, and you can find it all in the multi-region series.

As I was saying, I want to expose an endpoint for my users/clients, and let's imagine that many of them experience a high latency of up to 1 second. First, I would check the p99 metric of my backend service, and again, let's imagine that the p99 is 200 milliseconds. This means that many of my users lose up to 800 milliseconds somewhere outside my backend, and I can narrow the problem down to internet connection factors like:

Distance: The distance between the user and the service endpoint.
Network congestion: If the network is congested, it can slow down data transfer speed and increase latency.
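The arithmetic behind that estimate can be written down as a tiny sketch. The function name and its parameters are my own illustration, and the numbers are the imagined ones from the example above, not measurements.

```python
def network_overhead_ms(observed_ms: float, backend_p99_ms: float) -> float:
    """Latency that the backend itself cannot account for.

    Clamped at zero so that a fast request never reports a negative overhead.
    """
    return max(observed_ms - backend_p99_ms, 0.0)


# Imagined numbers from the example: 1 second seen by users, 200 ms backend p99.
overhead = network_overhead_ms(observed_ms=1000, backend_p99_ms=200)
print(f"{overhead:.0f} ms spent outside the backend")
```

Anything left over after subtracting the backend p99 is time spent between the user and the front door, which is the part this post is about.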

As I am not a networking expert, I check my application setup and see that I am using a standard Domain Name System (DNS) service like Amazon Route 53. Whatever DNS service I use, my users cross the public internet to reach my endpoint, which could be the cause of their high latency.

What options do I have? If I do my research and I want to stay only in the AWS world, I find:

Amazon CloudFront

Amazon CloudFront is a general-purpose CDN. CloudFront works by caching your content at edge locations worldwide so that when your end users request content, it can be delivered from the nearest edge location, reducing latency and improving performance.

You can also use CloudFront Origin Shield as an extra layer of cache that helps to minimise your origin's load.
Using CloudFront Origin Shield, I get the following:

Better cache hit ratio - because of the additional caching layer in front of my origin, all requests from all of CloudFront's caching layers to my origin go through Origin Shield.
Reduced origin load - reduce the number of simultaneous requests sent to my origin for the same object.
Better network performance - routing to Origin Shield remains on the CloudFront network all the way, which has a low latency connection to my origin.

Works at Layer 7 as an HTTP/S content delivery network
Supports HTTP(S) & WebSocket protocols
Supports content caching
Uses DNS-based routing
Uses Dynamic IP addresses
Uses native origin failover based on HTTP error codes
Supports any HTTP(S) based origin
Supports computing at the edge via CloudFront Functions & Lambda@Edge

AWS Global Accelerator

AWS Global Accelerator gives me two global static public IPs that act as a fixed entry point for my application. Because it uses Amazon's globally distributed network of Points of Presence (PoPs), my users' traffic is moved off the public internet and routed to the optimal AWS Region based on factors like latency, network health, and geography. This helps reduce latency and improve performance for end users, regardless of where they are located.

This can improve traffic performance by up to 60%, and in the unlikely event that all Route 53 health checkers globally are impacted, Global Accelerator continues to serve traffic to application endpoints.

Here is the AWS Global Accelerator Speed Comparison

Works at Layer 4 as a TCP/UDP proxy or global traffic manager
Supports HTTP as well as non-HTTP protocols running over TCP and UDP
Doesn't support content caching
Uses Anycast Routing
Uses 2 global static IP addresses
Uses built-in origin failover in <30 secs with no dependency on DNS TTLs
Supports ALB, NLB and EC2 instances as endpoints
Doesn't support edge compute functions

In summary, both AWS Global Accelerator and Amazon CloudFront are designed to help improve the performance of my applications, but they cover different, sometimes overlapping, use cases. Both move traffic over AWS's dedicated network backbone, but here are some differences:

CloudFront has no static IPs, while Global Accelerator gives me 2 static IP addresses that I can use as an entry point for my application.
CloudFront pricing is based on data transfer out and HTTP requests, while Global Accelerator charges a fixed hourly fee plus a premium on the data transferred.
CloudFront uses Edge Locations for caching, while Global Accelerator uses Edge Locations to find the optimal path to my origin.
CloudFront is designed to handle HTTP protocol, while Global Accelerator is best used for HTTP and non-HTTP protocols such as TCP and UDP.
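The differences listed above can be condensed into a small decision helper. This is only a sketch of my own reading of the two feature lists: the `Requirements` fields, the origin labels, and the order of the checks are assumptions made for illustration, not an official AWS rule set.

```python
from dataclasses import dataclass

CLOUDFRONT = "Amazon CloudFront"
GLOBAL_ACCELERATOR = "AWS Global Accelerator"

# Endpoint types the post lists for Global Accelerator.
GA_ORIGINS = {"alb", "nlb", "ec2"}


@dataclass(frozen=True)
class Requirements:
    protocol: str                     # "http", "websocket", "tcp" or "udp"
    needs_static_ip: bool = False
    needs_caching: bool = False
    needs_edge_compute: bool = False
    origin: str = "alb"               # e.g. "alb", "nlb", "ec2", "apigw", "lambda_url"


def choose_front_door(req: Requirements) -> str:
    # CloudFront handles HTTP(S) and WebSocket only, so raw TCP/UDP needs Global Accelerator.
    if req.protocol in {"tcp", "udp"}:
        return GLOBAL_ACCELERATOR

    # Caching, edge compute, and origins outside ALB/NLB/EC2 are CloudFront-only here.
    cloudfront_only = (
        req.needs_caching
        or req.needs_edge_compute
        or req.origin not in GA_ORIGINS
    )
    if req.needs_static_ip and cloudfront_only:
        return "no single service fits"
    if req.needs_static_ip:
        return GLOBAL_ACCELERATOR

    # Default for typical web and mobile HTTP traffic.
    return CLOUDFRONT


print(choose_front_door(Requirements(protocol="udp")))
print(choose_front_door(Requirements(protocol="http", origin="apigw")))
```

The helper deliberately ignores cost and measured latency, both of which can change the answer for a real workload.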

As you can see from this image:

AWS Global Accelerator works with Application Load Balancers, Network Load Balancers, and Amazon Elastic Compute Cloud (EC2) instances.

To quote AWS Global Accelerator FAQ:

Q: How is AWS Global Accelerator different from Amazon CloudFront?
A: AWS Global Accelerator and Amazon CloudFront are separate services that use the AWS global network and its edge locations around the world. CloudFront improves performance for both cacheable content (such as images and videos) and dynamic content (such as API acceleration and dynamic site delivery). Global Accelerator improves performance for a wide range of applications over TCP or UDP by proxying packets at the edge to applications running in one or more AWS Regions. Global Accelerator is a good fit for non-HTTP use cases, such as gaming (UDP), IoT (MQTT), or Voice over IP, as well as for HTTP use cases that specifically require static IP addresses or deterministic, fast regional failover. Both services integrate with AWS Shield for DDoS protection.

Reading the FAQ, the first thing that came to my mind was that AWS Global Accelerator is unsuitable for API acceleration. I should prefer Amazon CloudFront for typical applications (web, mobile) and AWS Global Accelerator for other types like IoT applications.

If my application serves:

Mobile
Web
IoT

I wonder what justifies the usage of AWS Global Accelerator over Amazon CloudFront because:

I can use CloudFront to reach the API, but I can also use the Global Accelerator
I can use the MQTT WebSocket or just HTTP. Both services move traffic to the AWS Backbone network
CloudFront has more edge locations, so I should reach my origin faster than with Global Accelerator.
Global Accelerator works only with NLB, ALB, and EC2, while CloudFront can also use APIGW and Lambda function URLs
CloudFront can be used for cache and no-cache scenarios

Of course, I can argue with myself and find specific cases:

I can build a specific API for each type of client
I can use one AppSync endpoint to cover all if I don't need WebSocket
I don't need to cache content, so there is no need for CloudFront

There are probably many more, but if I think about the WebSocket use case, I can use the MQTT protocol in the browser.

Amazon API Gateway WebSockets is suitable for 1-to-1 messaging (no fan-out) where a connection is never idle for more than 10 minutes.
AWS IoT Core can fan out to large numbers of clients and is generally much cheaper. It also doesn't need an API call to initiate the process. The IoT Core SDK provides the logic that makes it easier for frontends to reconnect. I can also configure device shadows in case frontends need to catch up on missed messages.
AWS AppSync WebSocket could be an alternative if I used GraphQL, but I think it is not compatible with IoT.

It is clear that, based on specific requirements, I can tip the scales in favour of either service.

There is a significant difference in how the two services use the edge locations.

CloudFront uses Edge Locations for caching, while Global Accelerator uses Edge Locations to find the optimal path to my origin.

CloudFront routes traffic based on the distribution's price class. My request may be routed to an unexpected edge location, increasing the overall latency for retrieving an object or reaching the origin.

To compare the performance between Amazon CloudFront and AWS Global Accelerator for a specific origin, I could:

Test access to the origin directly without either CloudFront or Global Accelerator
Test access via CloudFront and Global Accelerator from multiple geographic locations
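A comparison like this could be scripted roughly as follows. It is a minimal sketch and not the tool I used for my test: the three URLs are hypothetical placeholders, and the helper names (`fetch`, `measure_ms`, `summarise`, `compare`) are mine. Each sample opens a new connection, so connection setup is included in the timing.

```python
import math
import statistics
import time
import urllib.request
from typing import Callable

# Hypothetical endpoints: replace with the real origin, CloudFront and Global Accelerator URLs.
ENDPOINTS = {
    "direct": "https://origin.example.com/health",
    "cloudfront": "https://cdn.example.com/health",
    "global-accelerator": "https://ga.example.com/health",
}


def fetch(url: str) -> None:
    """Perform one GET request and read the whole body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        response.read()


def measure_ms(url: str, samples: int = 20,
               do_fetch: Callable[[str], None] = fetch) -> list[float]:
    """Time `samples` sequential requests and return the durations in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        do_fetch(url)
        timings.append((time.perf_counter() - start) * 1000)
    return timings


def summarise(timings: list[float]) -> dict[str, float]:
    """Median, nearest-rank p99 and maximum of the measured durations."""
    ordered = sorted(timings)
    p99_index = max(math.ceil(0.99 * len(ordered)) - 1, 0)
    return {
        "median": statistics.median(ordered),
        "p99": ordered[p99_index],
        "max": ordered[-1],
    }


def compare(endpoints: dict[str, str], samples: int = 20,
            do_fetch: Callable[[str], None] = fetch) -> dict[str, dict[str, float]]:
    """Measure every endpoint and return one summary per endpoint name."""
    return {name: summarise(measure_ms(url, samples, do_fetch))
            for name, url in endpoints.items()}
```

Calling `compare(ENDPOINTS)` from machines in several geographic locations gives one summary per path. A single location says little, because the benefit of either service depends on how far the client is from the origin.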

I did a basic test with the following:

On the left of the images, you have Amazon CloudFront (Use only North America and Europe) and on the right, AWS Global Accelerator.

From these basic tests, I see they perform the same where the Edge Locations overlap in North America and Europe, but AWS Global Accelerator makes the difference elsewhere. So unless I have a specific case like gaming or raw UDP/TCP communication, I would use Amazon CloudFront and accept the risk of the request being routed to an unexpected edge location, increasing the overall latency. Still, if, for example, I need to point many domains to one location (the root record of a domain needs to point to an IP address), I need to rely on static IPs (AWS Global Accelerator).
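To illustrate the static IP case, here is a sketch of a root (apex) A record that points at the two Global Accelerator addresses. The domain and the IP addresses are placeholders from the documentation ranges, and the helper name is mine. The dictionary follows the `ChangeBatch` shape that the Route 53 `change_resource_record_sets` API accepts; the API call itself is left out and would also need a hosted zone ID.

```python
def apex_a_record_change(domain: str, static_ips: list[str], ttl: int = 300) -> dict:
    """Build a Route 53 ChangeBatch that upserts an A record for the root of a domain."""
    return {
        "Comment": "Point the apex record at the Global Accelerator static IPs",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": domain,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip} for ip in static_ips],
                },
            }
        ],
    }


# Placeholder values: a real accelerator reports its own two static IPs.
change_batch = apex_a_record_change("example.com.", ["192.0.2.10", "198.51.100.10"])
print(change_batch["Changes"][0]["ResourceRecordSet"]["ResourceRecords"])
```

The same two addresses can be reused for every domain that should land on the same accelerator.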

Can I use both services together?
I can use Route 53 to point to the static IP of an AWS Global Accelerator and then configure Global Accelerator to direct traffic to my CloudFront distribution. This can improve the performance of my application by routing it through Global Accelerator's optimised network routes and using CloudFront's caching capabilities.

Here's an overview of how this might work:

Create an AWS Global Accelerator
Point AWS Global Accelerator to the Application Load Balancer
Add a rule to the listener that directs traffic based on the URL path pattern to my Amazon CloudFront distribution
Configure in Amazon CloudFront the path pattern that identifies the origin or origin group that I want CloudFront to route the request to

With this configuration, the user request is first routed to the static IP of the Global Accelerator. Global Accelerator then uses its listener and endpoint group to determine the best endpoint to send the request to, in this case, my Application Load Balancer. From there, the listener rule sends the request on to my Amazon CloudFront distribution.
Amazon CloudFront then retrieves and serves the requested content to the user.
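The listener rule from the overview might look like the sketch below. I am assuming here that the rule uses a redirect action, since an ALB forward action sends traffic to target groups. The listener ARN, the path pattern, and the CloudFront domain are placeholders, and the helper name is mine. The dictionary mirrors the parameters of the Elastic Load Balancing v2 `create_rule` API, but the call itself is not made.

```python
def cloudfront_redirect_rule(listener_arn: str, path_pattern: str,
                             cloudfront_domain: str, priority: int = 10) -> dict:
    """Build create_rule parameters that redirect matching paths to a CloudFront domain."""
    return {
        "ListenerArn": listener_arn,
        "Priority": priority,
        "Conditions": [
            {"Field": "path-pattern", "Values": [path_pattern]},
        ],
        "Actions": [
            {
                "Type": "redirect",
                "RedirectConfig": {
                    "Protocol": "HTTPS",
                    "Port": "443",
                    "Host": cloudfront_domain,
                    # Keep the original path and query string on the redirected URL.
                    "Path": "/#{path}",
                    "Query": "#{query}",
                    "StatusCode": "HTTP_302",
                },
            }
        ],
    }


# Placeholder values for illustration only.
rule = cloudfront_redirect_rule(
    listener_arn="arn:aws:elasticloadbalancing:REGION:ACCOUNT:listener/app/EXAMPLE",
    path_pattern="/static/*",
    cloudfront_domain="d111111abcdef8.cloudfront.net",
)
print(rule["Actions"][0]["RedirectConfig"]["Host"])
```

With a redirect, the client makes a second request straight to CloudFront, so it is worth measuring whether the extra round trip pays off for the paths involved.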
