DEV Community

Hugo Thomaz

Zonal autoshift: Automatically redirecting traffic in response to potential issues across Availability Zones

Hello everyone! 🤓

Today, I've chosen to discuss one of the Networking & Content Delivery announcements from AWS re:Invent 2023.

The new feature of Amazon Route 53 Application Recovery Controller, zonal autoshift, lets AWS shift our workload's traffic away from an Availability Zone when AWS identifies a potential failure affecting that zone, such as a problem with power, connectivity, or network devices. Such failures are rare, but they have happened before and they will happen again. To be prepared for them, we should design and deploy our applications and services across multiple Availability Zones in a Region to ensure high availability.

The older feature, "Zonal shift," released before the new "Zonal autoshift," lets us manually shift traffic away from an Availability Zone, for example when the error originates on our side (the customer side). However, when there is a potential issue at the Availability Zone level, it is hard to identify because we don't monitor the resources that AWS manages. Typically, we check the "AWS Health Dashboard" or rely on other people reporting the issue. During that period we lose time, and the application may be out of service.
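For comparison, a manual zonal shift can also be started through the API. The following is a minimal sketch, assuming a boto3 `arc-zonal-shift` client is passed in; the function name `start_manual_shift`, the default expiry, and the comment text are illustrative choices of mine, not part of the console walkthrough.

```python
def start_manual_shift(client, resource_arn, away_from_az_id,
                       expires_in="2h", comment="Manual shift during incident"):
    """Ask ARC to move traffic away from one Availability Zone.

    `client` is assumed to be boto3.client("arc-zonal-shift").
    `away_from_az_id` is an AZ ID such as "use1-az1", not an AZ name.
    """
    params = {
        "resourceIdentifier": resource_arn,  # ARN of the ALB or NLB
        "awayFrom": away_from_az_id,
        "expiresIn": expires_in,             # e.g. "30m" or "2h"
        "comment": comment,
    }
    return client.start_zonal_shift(**params)


# Hypothetical usage (placeholders, not real identifiers):
#   import boto3
#   client = boto3.client("arc-zonal-shift")
#   start_manual_shift(client, "<load-balancer-arn>", "use1-az1")
```

The shift expires on its own after `expiresIn`, so a forgotten manual shift does not keep a healthy zone out of rotation indefinitely.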

Now, with this release, you can configure zonal autoshift to safeguard your workloads from potential failures in an Availability Zone. AWS uses its own internal monitoring tools and metrics to determine when to initiate a network traffic shift.

Both zonal shift and zonal autoshift operate exclusively at the Application Load Balancer (ALB) or Network Load Balancer (NLB) level, and, at the time of writing, only when cross-zone load balancing is disabled.
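A quick way to check this requirement before opening the console is to inspect the load balancer attributes. This is a rough sketch, assuming the attribute list comes from the `elbv2` `describe_load_balancer_attributes` call; the helper name `cross_zone_disabled` is my own. For ALBs, cross-zone load balancing is controlled at the target group level, so there the same check would need to run against the target group attributes instead.

```python
CROSS_ZONE_KEY = "load_balancing.cross_zone.enabled"


def cross_zone_disabled(attributes):
    """Return True if cross-zone load balancing is off, False if it is on,
    and None if the attribute is not present in the list.

    `attributes` is a list of {"Key": ..., "Value": ...} dicts, the shape
    returned under "Attributes" by the elbv2 describe-attributes calls.
    """
    for attribute in attributes:
        if attribute["Key"] == CROSS_ZONE_KEY:
            return attribute["Value"].lower() == "false"
    return None


# Hypothetical usage (placeholder ARN):
#   import boto3
#   elbv2 = boto3.client("elbv2")
#   attrs = elbv2.describe_load_balancer_attributes(
#       LoadBalancerArn="<load-balancer-arn>")["Attributes"]
#   print(cross_zone_disabled(attrs))
```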

OK, with this information, let's see how to configure zonal autoshift.

Note: We won't delve into the deployment of the Elastic Load Balancer, assuming familiarity on your part. Instead, let's concentrate on configuring the Zonal Autoshift settings.

1 - Search for "Route 53 Application Recovery Controller" and open it.

2 - In the left pane, I select Zonal autoshift and click the Configure zonal autoshift button.

3 - I've chosen the load balancer for my demo application. It's essential to note that currently, only load balancers with cross-zone load balancing turned off are eligible for zonal autoshift.

4 - As you scroll down, you'll find the "Practice run blocked windows and dates" section, which is optional but important to configure. AWS periodically runs a practice shift away from an Availability Zone to confirm that the application keeps working smoothly if a zone fails. It's therefore advisable to block the business hours during which you don't want AWS to run this practice test, so that it cannot cause disruption at peak times. You can also block specific dates, for example public holidays.
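If you prefer to script this step, the blocked windows and dates are passed to the API as plain strings. The sketch below assumes the format I understand the practice run configuration to expect: windows as `Day:HH:MM-Day:HH:MM` in UTC and dates as `YYYY-MM-DD`. The helper names and the 09:00 to 18:00 business hours are hypothetical, so check the format against the current API reference before relying on it.

```python
from datetime import date, time

WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri")


def blocked_window(day, start, end):
    """Format one same-day blocked window, e.g. "Mon:09:00-Mon:18:00" (UTC)."""
    return f"{day}:{start:%H:%M}-{day}:{end:%H:%M}"


def business_hours_windows(start, end, days=WEEKDAYS):
    """Block the same hours on every given day of the week."""
    return [blocked_window(day, start, end) for day in days]


def blocked_dates(dates):
    """Format calendar dates as YYYY-MM-DD strings."""
    return [d.isoformat() for d in dates]


# Example: block 09:00-18:00 UTC on weekdays, plus one holiday.
windows = business_hours_windows(time(9, 0), time(18, 0))
holidays = blocked_dates([date(2024, 12, 25)])
```

Because the windows are in UTC, business hours in a local time zone have to be converted before they are passed in.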

5 - In the final section of the settings, you must configure at least the first CloudWatch alarm ARN, but I recommend creating a CloudWatch alarm for both fields.

The first alarm determines the outcome of a practice run: if the alarm does not transition into the ALARM state during the practice test, AWS considers the run successful. Creating an alarm that monitors the health of the application or service is therefore essential for this verification.

The second alarm serves as a safeguard: AWS will not start a practice test while this alarm is in the ALARM state. For instance, during business hours, the EC2 instance running your application might experience unusually high CPU utilization because of increased customer traffic. To avoid starting a practice test during such atypical events, you can create an alarm that monitors this condition and blocks the test accordingly.

Subsequently, tick the box to authorize AWS to shift our workloads from one Availability Zone to another when a failure at the Availability Zone level is identified.
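The console steps above can also be sketched with the API. This is a hedged example, assuming a boto3 `arc-zonal-shift` client is passed in and that its `create_practice_run_configuration` and `update_zonal_autoshift_configuration` operations take the parameters shown; the function names and argument names are my own and the ARNs are placeholders.

```python
def cloudwatch_alarm(alarm_arn):
    """Describe a CloudWatch alarm in the shape the practice run API expects."""
    return {"type": "CLOUDWATCH", "alarmIdentifier": alarm_arn}


def configure_autoshift(client, resource_arn, outcome_alarm_arn,
                        blocking_alarm_arn=None,
                        blocked_windows=(), blocked_dates=()):
    """Create the practice run configuration, then enable zonal autoshift.

    `client` is assumed to be boto3.client("arc-zonal-shift").
    Returns the practice run parameters that were sent.
    """
    params = {
        "resourceIdentifier": resource_arn,
        # Required: decides whether a practice run succeeded.
        "outcomeAlarms": [cloudwatch_alarm(outcome_alarm_arn)],
    }
    if blocking_alarm_arn:
        # Optional: prevents practice runs while this alarm is in ALARM.
        params["blockingAlarms"] = [cloudwatch_alarm(blocking_alarm_arn)]
    if blocked_windows:
        params["blockedWindows"] = list(blocked_windows)
    if blocked_dates:
        params["blockedDates"] = list(blocked_dates)

    client.create_practice_run_configuration(**params)
    # Equivalent of ticking the authorization box in the console.
    client.update_zonal_autoshift_configuration(
        resourceIdentifier=resource_arn,
        zonalAutoshiftStatus="ENABLED",
    )
    return params


# Hypothetical usage (placeholder ARNs):
#   import boto3
#   client = boto3.client("arc-zonal-shift")
#   configure_autoshift(client, "<load-balancer-arn>", "<outcome-alarm-arn>",
#                       blocking_alarm_arn="<blocking-alarm-arn>")
```

Passing the client in as an argument keeps the function easy to try against a stub before pointing it at a real account.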

Conclusion

In this post, I wanted to introduce zonal autoshift and show how to set it up. We can deploy this solution to make our infrastructure more resilient to Availability Zone failures. I hope you enjoyed it.

Reference Links:
https://aws.amazon.com/blogs/aws/top-announcements-of-aws-reinvent-2023/
https://aws.amazon.com/blogs/aws/zonal-autoshift-automatically-shift-your-traffic-away-from-availability-zones-when-we-detect-potential-issues/
