Things to consider before using AWS NLB with EKS pods

#aws #awskubernetes #awseks #awsnlb

As organizations embrace microservices architecture and container orchestration with Kubernetes, securing inter-service communication becomes paramount. One powerful tool in the AWS ecosystem for achieving this is the Network Load Balancer (NLB). In this blog post, we'll delve into things to consider before adopting the AWS NLB in your design.

In the realm of cloud computing, load balancing plays a crucial role in distributing incoming network traffic across multiple servers or resources. Among Amazon Web Services' (AWS) suite of load balancing options, the Network Load Balancer (NLB) stands out as a powerful tool for achieving high availability, scalability, and efficient traffic distribution. AWS NLB is a highly scalable and performant load balancer that operates at the transport layer (Layer 4) of the OSI model. It is designed for applications that require high throughput and low latency.

Key Concepts of AWS NLB

Target Groups
Target Groups are collections of resources (such as EC2 instances or, for EKS, pod IP addresses) that receive traffic from the NLB. They play a crucial role in defining how traffic is distributed.
Availability Zones
AWS NLB can distribute traffic across multiple Availability Zones (AZs) for high availability. Understanding AZs and their relationship with NLB is crucial for designing fault-tolerant architectures.
Health Checks
NLB monitors the health of registered targets by periodically sending health checks. Targets that fail these checks are removed from the pool, ensuring traffic is only routed to healthy instances.
Listeners
Listeners define the protocol and port on which the NLB accepts connections (TCP, UDP, TCP_UDP, or TLS). Unlike an Application Load Balancer, an NLB listener has no host- or path-based routing rules; each listener simply forwards traffic to a target group through its default action.
Cross-Zone Load Balancing
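This feature allows NLB to distribute traffic evenly across targets in all enabled AZs, improving fault tolerance and application availability. It is disabled by default on NLB; without it, each load balancer node sends traffic only to targets in its own AZ.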

Things to consider

Understand how NLB health check works
When you register a new target with the Network Load Balancer, the registration process is expected to take 3-5 minutes (180 to 300 seconds). Once registration completes, the NLB health check systems begin sending health checks to the target. A newly registered target must pass the configured number of consecutive health checks before it enters service and receives traffic. For example, with a 30-second health check interval and a healthy threshold of 3, the earliest a newly registered target can enter service is about 270 seconds after registration: at least 180 seconds for registration, plus another 90 (3 × 30) seconds to pass the health checks.
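The arithmetic above can be sketched as a small helper. This is only an illustration of the numbers quoted in this post; the function name and the 180/300-second registration bounds are assumptions taken from the text, not values returned by any AWS API.

```python
# Registration window quoted in this post (assumed, not read from AWS).
REGISTRATION_MIN_S = 180
REGISTRATION_MAX_S = 300


def seconds_until_in_service(interval_s, healthy_threshold,
                             registration_s=REGISTRATION_MIN_S):
    """Estimate the time from the register call until the target can take traffic.

    registration_s: time the NLB needs to process the registration.
    interval_s * healthy_threshold: time to pass the required consecutive checks.
    """
    return registration_s + interval_s * healthy_threshold


best_case = seconds_until_in_service(30, 3)
worst_case = seconds_until_in_service(30, 3, REGISTRATION_MAX_S)
print(best_case, worst_case)  # 270 390
```

Plugging in your own interval and threshold gives a rough lower bound to use when sizing rollout timeouts or readiness delays.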
Sometimes NLB sends traffic to the target before marking it healthy
This is a known issue, and the good news is that the AWS NLB team is aware of it, so I'm hopeful that a fix will arrive soon. Until then, keep this behavior in mind and set up your logging and alerting accordingly.
NLB continues to send traffic to the target which is in draining state
This is another known issue that the AWS NLB team is working on. When you deregister a target from the Network Load Balancer, it is expected to take 3-5 minutes (180-300 seconds) to process the de-registration request, after which the target no longer receives new connections. During this time the Elastic Load Balancing API reports the target in the 'draining' state, and the target continues to receive new connections until de-registration processing has completed. At the end of the configured de-registration delay, the target is no longer included in the describe-target-health response for the Target Group, and it returns 'unused' with reason 'Target.NotRegistered' when you query for that specific target.
Always set an appropriate de-registration delay using the following annotation:

service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: deregistration_delay.timeout_seconds=120

The initial state of a deregistering target is draining. By default, the load balancer changes the state of a deregistering target to unused after 300 seconds. This delay gives in-flight requests time to complete successfully.
If you enable the target group attribute for connection termination, connections to deregistered targets are closed shortly after the end of the deregistration timeout.
Enable connection termination on deregistration using

service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: deregistration_delay.connection_termination.enabled=true
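
Both settings above use the same annotation key, and a Kubernetes object can carry a given key only once, so the two attributes need to be combined into one comma-separated value. The sketch below builds that value; the helper name is hypothetical, and the output is meant to be pasted into your Service manifest rather than applied by this code.

```python
ANNOTATION_KEY = (
    "service.beta.kubernetes.io/aws-load-balancer-target-group-attributes"
)


def target_group_attributes(attrs):
    """Join target group attributes into a single key=value,key=value string."""
    parts = []
    for key, value in attrs.items():
        # Render booleans as lowercase "true"/"false".
        if isinstance(value, bool):
            value = "true" if value else "false"
        parts.append(f"{key}={value}")
    return ",".join(parts)


value = target_group_attributes({
    "deregistration_delay.timeout_seconds": 120,
    "deregistration_delay.connection_termination.enabled": True,
})
print(f"{ANNOTATION_KEY}: {value}")
```

The printed line is a single annotation carrying both the 120-second delay and connection termination.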



* Even if you set the de-registration delay to 0 seconds, the NLB still takes at least 2 minutes (120 seconds) to de-register the target.


* There is no dedicated AWS API that tells you when de-registration has completed. Instead, poll the "describe-target-health" API to check the status of the de-registration.
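
A minimal polling sketch for the point above, assuming a boto3 `elbv2` client is passed in (for example `boto3.client("elbv2")`). The function name, timeout, and poll interval are illustrative choices; the 'unused' state and 'Target.NotRegistered' reason are the values described earlier in this post.

```python
import time


def wait_until_deregistered(elbv2, target_group_arn, target_id, port,
                            timeout_s=600, poll_s=10, sleep=time.sleep):
    """Poll describe_target_health until the target is fully deregistered.

    Returns True once the target reports state 'unused' with reason
    'Target.NotRegistered', or False if timeout_s elapses first.
    """
    target = {"Id": target_id, "Port": port}
    waited = 0
    while True:
        resp = elbv2.describe_target_health(
            TargetGroupArn=target_group_arn,
            Targets=[target],  # query this specific target only
        )
        health = resp["TargetHealthDescriptions"][0]["TargetHealth"]
        if (health["State"] == "unused"
                and health.get("Reason") == "Target.NotRegistered"):
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s

# Example call (the ARN and pod IP are placeholders):
#   import boto3
#   wait_until_deregistered(boto3.client("elbv2"), "<target-group-arn>", "10.0.1.23", 8080)
```

A check like this could run in a pod's preStop hook or a deployment script so the pod is not terminated while the NLB is still sending it connections.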


By combining the NGINX Ingress Controller with AWS NLB in AWS EKS, you open up a world of possibilities for efficient, secure, and highly available ingress management. This integration solves critical use cases that may be challenging with other configurations. Elevate your ingress control capabilities and enhance the overall security and performance of your Kubernetes workloads in AWS EKS with this powerful combination. Happy Load Balancing!