Load balancing is the process of distributing network load evenly across several servers. It helps an application keep up with demand during peak traffic hours by spreading the work uniformly. The servers can live in a cloud, in a data center, or on-premises, and each can be either a physical server or a virtual one. Some of the main functions of a Load Balancer (LB) are:
Routes data efficiently
Prevents server overloading
Performs health checks for the servers
Provisions new server instances in the face of heavy traffic
In the 7-layer OSI model, load balancing occurs from layer 4 (Transport Layer) to layer 7 (Application Layer).
Different LB algorithms are effective for different kinds of traffic, i.e. depending on whether it is network layer traffic or application layer traffic.
Network layer traffic is routed by the LB based on TCP port, IP addresses, etc.
Application layer traffic is routed based on additional attributes such as HTTP headers and SSL, which also gives LBs content switching capabilities.
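To make the contrast concrete, here is a minimal Python sketch. It is not how any particular load balancer works: the function names, the pool names, and the `/images/` and `Accept` rules are all hypothetical and chosen only for illustration.

```python
import zlib


def route_l4(src_ip, src_port, dst_port, servers):
    """Network-level routing: only addresses and ports are visible."""
    # Hypothetical rule: hash the connection details onto one server.
    key = f"{src_ip}:{src_port}:{dst_port}".encode("ascii")
    return servers[zlib.crc32(key) % len(servers)]


def route_l7(path, headers, pools):
    """Application-level routing: the HTTP request itself can drive the choice."""
    # Hypothetical content switching rules.
    if path.startswith("/images/"):
        return pools["static"]
    if headers.get("Accept", "").startswith("application/json"):
        return pools["api"]
    return pools["default"]
```

The first function can only ever spread connections, while the second can send image requests and API calls to entirely different pools, which is what content switching means in practice.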
Round Robin: The traffic load is sent to the first available server, and that server is then moved to the back of the queue. If the servers are identical and there are no persistent connections, this algorithm can prove effective. There are 2 major variants of the Round Robin algorithm:
Weighted Round Robin: If the servers are not of identical capacity, this algorithm can be used to distribute load. A weight or efficiency parameter is assigned to each server in the pool, and load is distributed in the same cyclic fashion in proportion to those weights.
Dynamic Round Robin: The weights that indicate a server's capacity can also be calculated at runtime. Dynamic Round Robin sends requests to servers based on these runtime weights.
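The cyclic behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, assuming integer weights and hypothetical server names; production load balancers typically interleave weighted servers more smoothly than this.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that visits servers in a fixed cyclic order."""
    return cycle(servers)


def weighted_round_robin(weights):
    """Return an endless iterator where each server appears `weight` times per cycle.

    `weights` maps a server name to a positive integer weight (an assumption
    made for this sketch).
    """
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


rr = round_robin(["app-1", "app-2", "app-3"])
first_four = [next(rr) for _ in range(4)]
# first_four == ["app-1", "app-2", "app-3", "app-1"]

wrr = weighted_round_robin({"big": 3, "small": 1})
one_cycle = [next(wrr) for _ in range(4)]
# one_cycle == ["big", "big", "big", "small"]
```

A dynamic variant would rebuild the weighted cycle whenever the weights are recomputed from live measurements.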
Least Connections Algorithm: This algorithm tracks the number of active connections per server over a certain time and directs incoming traffic to the server with the fewest connections. This is super helpful in scenarios where persistent connections are required.
Weighted Least Connections Algorithm: This is similar to the Least Connections Algorithm above, but apart from the number of active connections to a server, it also takes the server's capacity into account.
Least Response Time Algorithm: This is again similar to the Least Connections Algorithm, but it also considers the response time of the servers. The request is sent to the server with the lowest response time.
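Each of these three selections boils down to picking a minimum over some per-server metric. Here is a minimal sketch, assuming the balancer already keeps the (hypothetical) dictionaries of active connections, capacities, and average response times shown below.

```python
def least_connections(active):
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def weighted_least_connections(active, capacity):
    """Pick the server with the lowest connections-to-capacity ratio."""
    return min(active, key=lambda server: active[server] / capacity[server])


def least_response_time(avg_response_ms):
    """Pick the server with the lowest average response time."""
    return min(avg_response_ms, key=avg_response_ms.get)


# Hypothetical snapshot of the pool.
active = {"app-1": 12, "app-2": 7, "app-3": 9}
capacity = {"app-1": 4, "app-2": 1, "app-3": 2}
avg_response_ms = {"app-1": 40.0, "app-2": 95.0, "app-3": 60.0}

plain_choice = least_connections(active)                        # "app-2"
weighted_choice = weighted_least_connections(active, capacity)  # "app-1"
fastest_choice = least_response_time(avg_response_ms)           # "app-1"
```

Note how the weighted variant prefers `app-1` even though it has the most connections, because its capacity is four times that of `app-2`.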
In hash-based algorithms, request parameters are hashed to determine where the request will be sent. The different types of algorithms based on this are:
Source/Destination IP Hash
The source and destination IP addresses are hashed together to determine the server that will serve the request. If a connection is dropped, the same request is routed to the same server upon retry.
URL Hash
The request URL is hashed to pick the server. This reduces duplication across server caches, because the same request object is not stored in many caches.
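Both hashing approaches can share the same helper and differ only in what goes into the hash. The sketch below is one simple way to do it, assuming a fixed server list and SHA-256 as the hash; the function names are hypothetical.

```python
import hashlib


def pick_by_hash(key, servers):
    """Map a string key onto one of the servers, deterministically."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def ip_hash(src_ip, dst_ip, servers):
    """Same source/destination pair always lands on the same server."""
    return pick_by_hash(f"{src_ip}->{dst_ip}", servers)


def url_hash(url, servers):
    """Same URL always lands on the same server, so only that server caches it."""
    return pick_by_hash(url, servers)
```

One caveat with this plain modulo approach: adding or removing a server changes the mapping for most keys. Consistent hashing is the usual way to limit that reshuffling.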
There are a few other algorithms as well which are as follows:
Least Bandwidth Algorithm: The Load Balancer selects the server that has consumed the least bandwidth over a recent window, such as the last 14 seconds.
Least Packets Algorithm: Similar to the above, the Load Balancer redirects traffic to the server that is transmitting the fewest packets.
Custom Load Algorithm: The Load Balancer selects the server based on its current load, which can be determined by memory, processing unit usage, response time, number of requests, etc.
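A custom load algorithm usually combines several metrics into a single score. The sketch below is only an illustration: the `ServerStats` fields, the weights (0.4, 0.3, 0.2, 0.1), and the normalisation limits are all assumptions, and a real deployment would tune them.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    name: str
    cpu: float          # CPU utilisation, 0.0 to 1.0
    memory: float       # memory utilisation, 0.0 to 1.0
    response_ms: float  # recent average response time in milliseconds
    requests: int       # requests currently being handled


def load_score(stats, max_response_ms=500.0, max_requests=100):
    """Combine the metrics into one number; lower means less loaded."""
    response = min(stats.response_ms / max_response_ms, 1.0)
    requests = min(stats.requests / max_requests, 1.0)
    return 0.4 * stats.cpu + 0.3 * stats.memory + 0.2 * response + 0.1 * requests


def pick_least_loaded(pool):
    """Return the server whose combined load score is lowest."""
    return min(pool, key=load_score)


pool = [
    ServerStats("app-1", cpu=0.9, memory=0.5, response_ms=100.0, requests=20),
    ServerStats("app-2", cpu=0.3, memory=0.6, response_ms=200.0, requests=50),
]
chosen = pick_least_loaded(pool)  # app-2: score 0.43 versus 0.57 for app-1
```

The Least Bandwidth and Least Packets algorithms follow the same pattern with a single metric in place of the combined score.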
At the application layer (layer 7), traffic can be distributed based on the contents of the request, so the LB can make a much better informed decision. The server's response passes back through the LB as well, so it can be tracked and used to estimate the server load more effectively.
One of the most significant algorithms used at this layer is the Least Pending Request Algorithm. It directs pending HTTP(S) requests to the most available server, and it helps absorb sudden spikes in requests by monitoring the server load.
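The idea can be sketched as a small class that counts in-flight requests per server. This is a minimal, single-threaded illustration; the class and method names are hypothetical, and a real balancer would need locking and timeouts.

```python
class LeastPendingBalancer:
    """Send each request to the server with the fewest unanswered requests."""

    def __init__(self, servers):
        # Number of requests sent to each server that have not completed yet.
        self.pending = {server: 0 for server in servers}

    def dispatch(self):
        """Choose a server for a new request and count it as pending."""
        server = min(self.pending, key=self.pending.get)
        self.pending[server] += 1
        return server

    def complete(self, server):
        """Record that a response came back from `server`."""
        if self.pending[server] > 0:
            self.pending[server] -= 1


lb = LeastPendingBalancer(["app-1", "app-2"])
first = lb.dispatch()   # "app-1" (tie, first server wins)
second = lb.dispatch()  # "app-2"
lb.complete("app-2")    # app-2 answers quickly
third = lb.dispatch()   # "app-2" again, since app-1 is still busy
```

Because a slow server accumulates pending requests, it automatically receives less new traffic during a spike.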
These are some of the well-known load balancing algorithms. When selecting the most suitable one, a number of factors need to be considered, e.g. high traffic, sudden spikes, etc. A good choice of algorithm helps in maintaining the reliability and performance of any application. Hence, a good understanding of these will prove helpful while designing large-scale distributed systems.
(The contents of this blog are my personal opinion and/or drawn from my own reading of a bunch of articles, and are in no way influenced by my employer.)