DEV Community

Muhammad Ahmad Khan

Upgrading EKS Versions with Zero Downtime Using a Custom Blue-Green Methodology

One of the most significant operational overheads of EKS is the version upgrade, which we have to tackle each quarter. AWS now offers extended support for older EKS versions, but extended support is billed at a higher per-cluster rate than standard support, so it is a cost you do not want to incur. AWS maintains its standard support and extended support timeline in the following documentation: Kubernetes Release Calendar.

Keeping EKS clusters up to date is essential for security, performance, and access to new features. However, upgrading Kubernetes clusters can be challenging, especially in production environments where downtime is not an option. In this blog post, we'll explore how you can use a custom blue-green methodology to upgrade AWS EKS clusters with minimal disruption.

The blue-green deployment strategy is a technique used to reduce downtime and risk when deploying updates to applications or infrastructure. For AWS EKS version upgrades, the traditional blue-green approach creates a new cluster with the updated Kubernetes version (green), migrates workloads from the old cluster (blue) to the new one, and then decommissions the old cluster. Our approach differs: the blue cluster exists only to serve traffic while the original cluster (green) is being upgraded. The original cluster always remains green, both before and after the EKS upgrade. Traffic shifts from green to blue only for the duration of the upgrade, and shifts back from blue to green once the upgrade succeeds.
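The traffic movement described above can be sketched with weighted DNS records. The post does not say which DNS service is used, so the example below assumes Amazon Route 53 accessed through boto3; the record name, load balancer hostnames, and the `green`/`blue` set identifiers are hypothetical placeholders. It is a minimal sketch, not a complete cutover tool.

```python
from typing import Any, Dict


def weighted_change_batch(record_name: str, green_target: str,
                          blue_target: str, blue_weight: int,
                          ttl: int = 60) -> Dict[str, Any]:
    """Build a Route 53 ChangeBatch that splits traffic between two clusters.

    blue_weight is the share (0-100) sent to the blue cluster; the original
    (green) cluster receives the remainder.
    """
    if not 0 <= blue_weight <= 100:
        raise ValueError("blue_weight must be between 0 and 100")

    def record(set_id: str, target: str, weight: int) -> Dict[str, Any]:
        # One weighted CNAME per cluster; Route 53 routes in proportion
        # to each record's weight relative to the sum of all weights.
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,
                "Weight": weight,
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Comment": f"blue={blue_weight} green={100 - blue_weight}",
        "Changes": [
            record("green", green_target, 100 - blue_weight),
            record("blue", blue_target, blue_weight),
        ],
    }


def apply_weights(route53: Any, hosted_zone_id: str,
                  batch: Dict[str, Any]) -> Any:
    """Send the batch using a boto3 Route 53 client passed in by the caller."""
    return route53.change_resource_record_sets(
        HostedZoneId=hosted_zone_id, ChangeBatch=batch
    )


# Example (assumes boto3 is installed and AWS credentials are configured):
#   import boto3
#   batch = weighted_change_batch("app.example.com.", "green-lb.example.com",
#                                 "blue-lb.example.com", blue_weight=10)
#   apply_weights(boto3.client("route53"), "HOSTED_ZONE_ID", batch)
```

Calling this repeatedly with a rising `blue_weight` moves traffic to the blue cluster before the upgrade, and the same calls with a falling `blue_weight` move it back afterwards. Passing the client in as a parameter keeps the weight calculation testable without AWS access.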

Steps to Perform an EKS Version Upgrade Using the Custom Blue-Green Methodology:

  1. Prepare for Upgrade: Before initiating the upgrade process, thoroughly review the release notes for the new Kubernetes version to understand any changes or potential compatibility issues with your workloads. This includes checking for deprecated or removed resources and APIs, both in your self-maintained application charts and in any open-source community charts. If you use a community chart, confirm that the installed chart version has been tested against the EKS version you are upgrading to.
  2. Create a New Blue Cluster or Upgrade the Already Created Blue Cluster: Using AWS EKS, provision a new Kubernetes cluster with the desired version. Ensure that the cluster configuration, including node groups, networking, and IAM roles, aligns with your requirements. For a newly created cluster, also set up DNS records and load balancers so that traffic can be directed to it gradually once workloads have been migrated. If your cluster is large and runs many Kubernetes ops tools installed through Helm charts, it saves time to keep the blue cluster around between upgrade cycles instead of rebuilding it each time. If you already maintain a blue cluster, prepare it for the upgrade by performing the following tasks:
    • Upgrade the EKS control plane from the console.
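    • Upgrade each EKS node group from the console, in order of priority, after setting an update config that limits how many nodes may be unavailable at once, and using Force Update. Note that Force Update proceeds even when pods cannot be drained because of a PodDisruptionBudget.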
    • Upgrade add-ons from the console.
  3. Deploy/Update Workloads: Use Kubernetes tools such as kubectl or Helm charts to deploy workloads onto the new blue cluster, or to update them in the existing blue cluster. Monitor performance and functionality to ensure a smooth transition.
  4. Validate and Test: Conduct thorough testing on this new/upgraded blue cluster to verify that all workloads and applications function as expected. Use automated testing tools and perform manual checks to validate performance, scalability, and reliability.
  5. Cut Over Traffic: Once validation is complete, gradually shift production traffic from the original (green) cluster to the upgraded blue cluster using weighted routing. As traffic increases, watch logs, pod scaling, and AWS Load Balancer metrics such as RequestCount closely for anomalies or performance issues.
  6. Upgrade Original Cluster: Now perform the same process on the original cluster: upgrade the EKS control plane, node groups, and add-ons, as well as the Kubernetes workloads. Once the upgrade is complete and you have verified that everything works, gradually shift traffic back from the blue cluster to the original one.
  7. Decommission/Scale Down Blue Cluster: After confirming that the original cluster is handling all traffic effectively, decommission/scale down all node groups of the blue cluster to release resources and reduce operational overhead.
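The console tasks in steps 2 and 6 (control plane, then node groups, then add-ons) can also be scripted. The following is a minimal sketch using the boto3 EKS client, under several assumptions not stated in the post: the node groups are EKS managed node groups that do not pin a custom AMI, the target add-on versions are supplied by the caller, and the cluster has few enough node groups that a single `list_nodegroups` page covers them. The cluster name and versions in the usage comment are hypothetical.

```python
from typing import Any, Dict, List


def upgrade_cluster(eks: Any, cluster: str, version: str,
                    addon_versions: Dict[str, str],
                    force_nodegroups: bool = False) -> List[str]:
    """Upgrade the control plane, then node groups, then add-ons.

    `eks` is a boto3 EKS client supplied by the caller. Returns a log of
    the steps performed, in order.
    """
    performed: List[str] = []

    # 1. Control plane. EKS moves one minor version at a time, so `version`
    #    should be the next minor release after the current one.
    eks.update_cluster_version(name=cluster, version=version)
    eks.get_waiter("cluster_active").wait(name=cluster)
    performed.append(f"control-plane:{version}")

    # 2. Managed node groups, one after another.
    for nodegroup in eks.list_nodegroups(clusterName=cluster)["nodegroups"]:
        eks.update_nodegroup_version(
            clusterName=cluster,
            nodegroupName=nodegroup,
            version=version,
            force=force_nodegroups,  # True ignores PodDisruptionBudget failures
        )
        eks.get_waiter("nodegroup_active").wait(
            clusterName=cluster, nodegroupName=nodegroup
        )
        performed.append(f"nodegroup:{nodegroup}")

    # 3. Add-ons, using the versions the caller chose for the new release.
    for addon, addon_version in addon_versions.items():
        eks.update_addon(
            clusterName=cluster, addonName=addon, addonVersion=addon_version
        )
        eks.get_waiter("addon_active").wait(
            clusterName=cluster, addonName=addon
        )
        performed.append(f"addon:{addon}")

    return performed


# Example (assumes boto3 is installed and AWS credentials are configured;
# the add-on version string is a placeholder, not a real release):
#   import boto3
#   upgrade_cluster(boto3.client("eks"), "blue-cluster", "1.29",
#                   {"coredns": "<target-addon-version>"})
```

Running the same function first against the blue cluster and later against the original cluster mirrors steps 2 and 6. The waiters block until each resource reports an active status, which keeps the three stages from overlapping.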

Best Practices for EKS Version Upgrades:

  • Automate wherever possible to streamline the upgrade process and reduce manual errors.
  • Implement monitoring and alerting to detect and respond to any issues promptly.
  • Maintain clear communication with stakeholders throughout the upgrade process to manage expectations and address concerns proactively.
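As one example of automating part of the preparation, the deprecated-API check from step 1 can be partly scripted. The sketch below scans already-parsed manifests (plain dictionaries, for instance the output of rendering a Helm chart and loading the YAML) against a small table of API versions that Kubernetes has stopped serving. The table is deliberately incomplete and only illustrates the idea; the Kubernetes Deprecation Guide remains the authoritative list, and the function and variable names here are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

# (apiVersion, kind) -> Kubernetes version in which the API is no longer served.
# A partial table for illustration only.
REMOVED_APIS: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("extensions/v1beta1", "Ingress"): (1, 22),
    ("networking.k8s.io/v1beta1", "Ingress"): (1, 22),
    ("batch/v1beta1", "CronJob"): (1, 25),
    ("policy/v1beta1", "PodDisruptionBudget"): (1, 25),
    ("policy/v1beta1", "PodSecurityPolicy"): (1, 25),
    ("autoscaling/v2beta1", "HorizontalPodAutoscaler"): (1, 25),
    ("autoscaling/v2beta2", "HorizontalPodAutoscaler"): (1, 26),
}


def parse_version(version: str) -> Tuple[int, int]:
    """Turn a version string such as '1.29' into (1, 29)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def find_removed_apis(manifests: Iterable[Dict[str, Any]],
                      target_version: str) -> List[str]:
    """Report manifests whose apiVersion is not served in target_version."""
    target = parse_version(target_version)
    findings: List[str] = []
    for manifest in manifests:
        api_version = manifest.get("apiVersion")
        kind = manifest.get("kind")
        removed_in = REMOVED_APIS.get((api_version, kind))
        # Flag the manifest only if the target is at or past the removal.
        if removed_in is not None and target >= removed_in:
            name = manifest.get("metadata", {}).get("name", "<unnamed>")
            findings.append(
                f"{kind}/{name} uses {api_version}, "
                f"removed in {removed_in[0]}.{removed_in[1]}"
            )
    return findings


# Example with two in-memory manifests:
#   find_removed_apis(
#       [{"apiVersion": "batch/v1beta1", "kind": "CronJob",
#         "metadata": {"name": "nightly"}},
#        {"apiVersion": "apps/v1", "kind": "Deployment",
#         "metadata": {"name": "web"}}],
#       "1.25",
#   )
```

A check like this can run in CI against rendered chart output before each upgrade cycle, so that removed APIs surface before the blue cluster is touched.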

AWS EKS provides a robust platform for managing Kubernetes clusters, and staying current with the latest versions is crucial for leveraging new features and ensuring security and performance. Adopting this custom blue-green methodology for version upgrades lets organizations minimize downtime, reduce risk, and maintain a seamless experience for end users. Combined with the best practices above, automation, and monitoring, it lets you navigate EKS version upgrades with confidence and keep your Kubernetes infrastructure up to date with minimal disruption.

References: Kubernetes Deprecation Guide.
