Software-based load balancing
Let’s now take a look at load balancing solutions implemented without a dedicated piece of hardware like the ADCs we’ve discussed in the previous posts. Although there are several software solutions available in the Unix/Linux world, I will focus primarily on Microsoft Windows technologies. In the future I plan to write a series of step-by-step tutorials, and I may later do the same for the Linux community.
DNS Load Balancing
DNS load balancing is a popular yet simple approach to balancing server requests. It basically consists of creating multiple DNS entries for the domain, meaning that the authoritative DNS server contains multiple “A” records for a single host.
Let’s imagine we want to balance the load on www.mywebsite.com, and we have three web servers with the IP addresses 184.108.40.206, 220.127.116.11, and 18.104.22.168. Each runs a complete copy of the website, so no matter which server a request is directed to, the same response is provided.
To implement this, simply create the following DNS entries:
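Using the three servers from the example above, the zone entries would look something like this (a sketch in BIND zone-file notation; the TTL and class shown are illustrative):

```
www.mywebsite.com.   300   IN   A   184.108.40.206
www.mywebsite.com.   300   IN   A   220.127.116.11
www.mywebsite.com.   300   IN   A   18.104.22.168
```

A short TTL is typically chosen here so that cached answers expire quickly and clients rotate across servers more evenly.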
When a DNS request comes to the DNS server to resolve the domain name, it gives out one of the server IP addresses based on a scheduling strategy, such as simple round-robin or geographical scheduling, thus redirecting the request to one of the servers in the group. Once the domain is resolved to one of the servers, subsequent requests from clients using the same local caching DNS server are sent to the same server, while requests coming through other local DNS servers are sent to other servers. This process is known as Round Robin DNS (RRDNS).
Generally, these are the steps whenever a DNS query is made:
- When a client attempts to access the Website, a local DNS lookup is performed to determine what the corresponding IP address is;
- The address request reaches the authoritative DNS server for the domain;
- The first time this query is made, the remote DNS server might return all the address records it has for the website;
- The local DNS server then determines what address record to return to the client;
- If all records are returned, the client will take the first one that it is given;
- The server replies to the client and serves the request;
- With each request, the Round Robin algorithm rotates the order of the returned addresses;
- Each DNS query will result in a client using a different IP address;
- This address rotation will distribute the sending of requests to the servers.
However, this approach has some important drawbacks:
- Some clients cache the lookups they’ve performed in order to improve performance, so subsequent queries may not be made because the address has already been resolved. The end result is that the same IP address record is returned to multiple clients, and this caching mechanism breaks the load balancing scheme;
- RRDNS doesn’t deal well with non-responsive machines. The DNS server has absolutely no means of monitoring the health of individual hosts, so a DNS server using the Round Robin algorithm could very well return the IP address of a server that has been turned off, or one that is up but whose services have crashed;
- Lastly, there are occasions when session state is important and a client needs to stay tied to the same server, which is something that cannot be done using Round Robin DNS.
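The rotation described in the steps above can be sketched in Python (a toy authoritative server, not a real DNS implementation; real servers return the full record set and rotate its order on each query):

```python
from collections import deque

class RoundRobinDNS:
    """Toy authoritative server: returns all A records, rotating their
    order on every query so each client tends to pick a different server."""

    def __init__(self, records):
        self.records = deque(records)

    def resolve(self):
        answer = list(self.records)   # full record set, in current order
        self.records.rotate(-1)       # the next query sees a rotated list
        return answer

dns = RoundRobinDNS(["184.108.40.206", "220.127.116.11", "18.104.22.168"])
for _ in range(3):
    # Most clients simply use the first record they are given
    print(dns.resolve()[0])
```

Note that this toy model also shows why caching breaks the scheme: if a resolver caches the first answer, it keeps reusing the same first record instead of seeing the rotation.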
Network Load Balancing
Microsoft’s Network Load Balancing (NLB) is a software-based solution that runs on every cluster node. It uses a hashing algorithm that takes the IP address, or IP address and port, of an incoming request and determines which node (host) in the cluster will process that request. Every node within the cluster receives every packet of traffic; the determination of which node is responsible for responding is made by applying a filter to each packet, so only one node ultimately ends up servicing a request.
The concept behind NLB is pretty simple: each server in a Load Balancing Cluster (a loose term that's unrelated to the Microsoft Cluster Service) is configured with a 'virtual' IP address, and this address is configured on all the servers participating in the 'cluster'. Whenever a request is made to this virtual IP, a network driver on each of these machines intercepts the request and re-routes it to one of the machines in the Load Balancing Cluster, based on rules that can be configured for each server in the cluster.
NLB acts as a virtual network device with its own IP address, and the real devices (the actual Ethernet ports) are associated with the Network Load Balancing software. Instead of using and advertising the ports' IP addresses, the system uses the NLB software's IP address, which ultimately makes the NLB software and its ports look like a single device to clients.
As machines join or leave the cluster, the algorithm must be re-executed so that the load is distributed correctly among all active nodes; this critically important re-evaluation process is called convergence. It is also important to realize that at no time is NLB aware of the actual load on any particular cluster node: NLB cannot determine whether a node’s CPU usage is extremely high or whether a node has little to no available memory to process a request.
Should a node in the cluster ever experience such resource saturation, NLB will continue sending it requests until the node stops sending heartbeat messages, at which time the node is automatically pulled from the cluster. However, if the node continues sending heartbeat messages, it will remain in the cluster, albeit in an unusable state.
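Convergence can be sketched by re-running the same ownership hash over the surviving node list (again using an illustrative hash in place of NLB's real algorithm; node names and client IPs are made up):

```python
import hashlib

def owner(ip, nodes):
    """Hash a client IP onto the list of currently active node names."""
    h = int.from_bytes(hashlib.md5(ip.encode()).digest()[:4], "big")
    return nodes[h % len(nodes)]

clients = ["10.0.0.%d" % i for i in range(1, 6)]

before = {c: owner(c, ["node1", "node2", "node3"]) for c in clients}
# node2 stops sending heartbeats -> convergence re-runs the algorithm
# over the remaining nodes, redistributing node2's clients
after = {c: owner(c, ["node1", "node3"]) for c in clients}

for c in clients:
    print(c, before[c], "->", after[c])
```

After convergence no client maps to the failed node, but note that some clients may also be shuffled between the surviving nodes, since the hash is taken over the new node count.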
In order to monitor the state of the cluster, all nodes broadcast a heartbeat message (roughly 1500 bytes) once per second to all other nodes, enabling any single node in the cluster to easily determine when a convergence operation is required. The frequency at which these heartbeat messages are broadcast can be set explicitly by changing the value of the AliveMsgPeriod registry key, but if this value is changed (the default is 1000 milliseconds), it must be changed on every node in the cluster to preclude NLB functionality problems.
NLB provides high availability for stateless applications like Web servers, scales by adding servers as the load increases, and can be used with almost any TCP- or UDP-based application. The term “stateless” refers to workloads that respond to each client request as an isolated transaction, meaning that requests handled before a given client request have no impact on the current transaction.
A good example of this is a Web server; for each request for a Web page, the server gathers all of the necessary information to present that Web page to the client. The server then gathers all of the information it needs for the next client request, and so on. Because each request supplies all of the information that a stateless server needs to complete the transaction, it is a relatively simple matter for any given request to be handled by any of the identical instances of a server workload running on any of the hosts of an NLB cluster.
The following diagram illustrates the usage of NLB:
Each server in the cluster is fully self-contained, which means it should be able to function without any other server in the cluster, with the exception of the database (if present, which is not part of the NLB cluster). This means each server must be configured separately and, if running a static site, all HTML files and images must be replicated across all servers. There are a number of significant benefits to deploying an NLB solution, including the following:
- Network Load Balancing is very efficient and can provide a very big performance improvement for each machine added into the cluster;
- NLB has a fault tolerance capability. Many other load balancing implementations, such as Round Robin DNS (RRDNS), continue to send requests to servers that have “died” until system administrators pick up on the fact that there is a problem and then manually perform a configuration change. The key is redundancy in addition to load balancing; if any machine in the cluster goes down, NLB will re-balance the incoming requests to the still running servers thus handling scenarios where a power supply has burnt out, a network card has gone bad, the primary hard disk has crashed, et cetera;
- With this level of redundancy, increasing load balancing capacity becomes simply a matter of adding machines to the cluster, which results in practically unlimited application scalability;
- NLB works with any TCP or UDP application-based protocol. This means that it’s possible to configure a variety of NLB clusters within an organization, and each one can have its own specific function. For example, one cluster may be dedicated to handling all Internet-originated HTTP traffic while another may be used to serve all intranet requests. If employees need to transfer files, there can be an FTP cluster acting as centralized file storage with closely monitored uploads and downloads;
- By far, one of the biggest advantages of NLB is its ease of use. NLB installs only a networking driver component – absolutely no special hardware is required. Not only does this facilitate the deployment of a load balancing solution, but it also significantly reduces costs.