Load Balancing (IV)

Hardware based load balancing

A hardware load-balancing device, also known as a Layer 4-7 router, is a computer appliance used to split network load across multiple servers based on factors such as CPU utilization, the number of active connections or overall server performance.

Using this kind of appliance minimizes the probability that any particular server will be overwhelmed and optimizes the bandwidth available to each computer or terminal. In addition, a hardware load-balancing device can minimize network downtime, facilitate traffic prioritization, provide end-to-end application monitoring, provide user authentication, and help protect against malicious activity such as Denial-of-Service (DoS) attacks.

The basic principle is that network traffic is sent to a shared address known as a virtual IP (VIP), or listening IP, which is attached to the load balancer. Once the load balancer receives a request on this VIP it must decide where to send it, and that decision is normally controlled by a load-balancing algorithm, a server health check or a rule set.

The request is then sent to the appropriate server, and the server produces a response that, depending on the type of load balancer in use, is sent either back to the load balancer (in the case of a Layer 7 device) or, as is more typical with a Layer 4 device, directly back to the end user (normally via its default gateway).
In the case of a proxy-based load balancer, the response from the web server can be returned to the load balancer and manipulated before being sent back to the user. This manipulation could involve content substitution or compression, and some top-end devices offer full scripting capability.
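
As a rough illustration, the decision step can be sketched in a few lines of Python. The back-end addresses, the health-check stub and the round-robin rule below are invented for the example; they simply stand in for whatever algorithm, health check or rule set a real device applies:

    import itertools

    # Hypothetical back-end pool sitting behind the VIP; addresses are invented.
    BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

    def is_healthy(server: str) -> bool:
        """Stand-in for a real health probe (ping, TCP connect or HTTP check)."""
        return True

    _pool = itertools.cycle(BACKENDS)

    def pick_backend() -> str:
        """Decision taken for each request arriving on the VIP: rotate through
        the pool (a simple round-robin rule) and skip servers that fail the
        health check."""
        for _ in range(len(BACKENDS)):
            candidate = next(_pool)
            if is_healthy(candidate):
                return candidate
        raise RuntimeError("no healthy back-end available")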

Load Balancing Algorithms

Load balancers use different algorithms to control traffic, with the specific goal of distributing load intelligently and/or maximizing the utilization of all servers within the cluster.

Random Allocation

In random allocation, traffic is assigned to a server picked at random from the group of destination servers. In such a case, one of the servers may be assigned many more requests to process while other servers sit idle; however, on average, each server gets an approximately equal share of the load thanks to the random selection. Although simple to implement, it can lead to the overloading of one or more servers while others are under-utilized.
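
A minimal sketch of random allocation in Python (the server names are invented for illustration):

    import random
    from collections import Counter

    # Hypothetical pool of destination servers.
    SERVERS = ["s1.example.com", "s2.example.com", "s3.example.com"]

    def assign() -> str:
        """Pick any server uniformly at random; over many requests the shares
        even out, but nothing prevents a short-term pile-up on one server."""
        return random.choice(SERVERS)

    # Over 10,000 simulated requests each server receives roughly a third.
    print(Counter(assign() for _ in range(10_000)))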

Load Balancing (III)

Before we go any deeper into the abyss of techniques and algorithms used in the load-balancing world, it is important to clarify some concepts and take a look at the most commonly used load-balancing terminology. The target audience of this blog is expected to know what the OSI Model is, so I won’t even bother to explain what the layers are...

Server health checking

Server health checking is the ability of the load balancer to run a test against the servers to determine if they are providing service:

  • Ping: This is the simplest method; however, it is not very reliable, as the server can be up whilst the web service is down;

  • TCP connect: This is a more sophisticated method that checks whether a service is up and running on a given port, e.g. port 80 for web, by trying to open a connection to that port on the real server;

  • HTTP GET HEADER: This makes an HTTP GET request to the web server and typically checks the response status line for something such as 200 OK;

  • HTTP GET CONTENTS: This makes an HTTP GET request and checks the actual content body for a correct response. It can be useful for checking a dynamic web page that returns 'OK' only if some application health checks pass, e.g. a back-end database query validates. This feature is only available on some of the more advanced products, but it is the superior method for web applications as it checks that the actual application is available (a rough sketch of these checks follows this list).
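
The TCP connect and HTTP checks above can be approximated with a few lines of Python; the host, port and expected body marker are assumptions, and a real load balancer would run such probes on a schedule with failure thresholds rather than as one-off calls:

    import socket
    import urllib.request

    def tcp_check(host: str, port: int = 80, timeout: float = 2.0) -> bool:
        """TCP connect check: is anything listening on the port?"""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    def http_status_check(url: str, timeout: float = 2.0) -> bool:
        """HTTP GET check: does the server answer 200 OK?"""
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status == 200
        except OSError:
            return False

    def http_content_check(url: str, expected: bytes = b"OK", timeout: float = 2.0) -> bool:
        """HTTP content check: does the page body contain the expected marker?"""
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return expected in resp.read()
        except OSError:
            return False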

Layer-2 Load Balancing

Layer-2 load balancing (also referred to as link aggregation, port aggregation, EtherChannel or Gigabit EtherChannel port bundling) is the bonding of two or more links into a single, higher-bandwidth logical link. Aggregated links also provide redundancy and fault tolerance if each of the aggregated links follows a different physical path.

Load Balancing (II)

Client Based Load Balancing

It might be easier to make the client code and resources highly available and scalable than to do so for the servers, since serving non-dynamic content requires fewer server resources. Before going into the details, let us consider a desktop application that needs to connect to servers on the Internet to retrieve data. If our theoretical desktop application generates more requests than the remote server can handle, we will need a load-balancing solution.

Client based load balancing

Instead of letting the client know of only one server from which to retrieve data, we can provide many servers—s1.mywebsite.com, s2.mywebsite.com, and so on. The desktop client randomly selects a server and attempts to retrieve data; if the server is not available, or does not respond within a preset time period, the client can select another server until the data is retrieved (see the sketch below). Unlike web applications—which store the client code (JavaScript code or Flash SWF) on the same server that provides data and resources—the desktop client is independent of the servers and can load balance across them from the client side to achieve scalability for the application.
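
A rough sketch of that client-side fallback logic in Python; the /data path and the timeout value are placeholders, and a production client would also want back-off and some memory of which servers have recently failed:

    import random
    import urllib.request

    # Pool of data servers published to the client (as in the example above).
    SERVERS = ["s1.mywebsite.com", "s2.mywebsite.com", "s3.mywebsite.com"]

    def fetch(path: str = "/data", timeout: float = 3.0) -> bytes:
        """Try servers in random order; on failure or timeout, fall back to the next one."""
        for host in random.sample(SERVERS, len(SERVERS)):
            try:
                with urllib.request.urlopen(f"http://{host}{path}", timeout=timeout) as resp:
                    return resp.read()
            except OSError:
                continue  # server down or too slow, try another
        raise RuntimeError("no server in the pool answered")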

Load Balancing (I)

Load Balancing

The steady growth of the Internet is causing many performance problems, including slow response times, network congestion and disruption of services, caused either by normal system overload or by cyber attacks (DDoS). The most widely used solution to minimize or solve these problems is load balancing.

Load balancing is the practice of dividing the amount of work that a computer has to do between two or more computers, so that more work gets done in the same amount of time and, in general, all users get served faster.

Load Balancing (sometimes also referred to as Network Load Balancing or Server Load Balancing) can also be described as the process of distributing service requests across a group of servers. This addresses several requirements that are becoming increasingly important in networks:

  • Increased scalability: When many content-intensive applications scale beyond the point where a single server can provide adequate processing power, it is increasingly important to have the flexibility to deploy additional servers quickly and transparently to end-users;


  • High performance: The highest performance is achieved when the processing power of servers is used intelligently. An advanced load balancing infrastructure can direct end-user service requests to the servers that are least busy and therefore capable of providing the fastest response time;


  • High availability and disaster recovery: The third benefit of load balancing is its ability to improve application availability. If an application or server fails, load balancing can automatically redistribute end-user service requests to other servers within a server cluster or to servers in another location;

On the Internet, companies whose Web sites get a great deal of traffic usually use load balancing. When a single Web server machine isn’t enough to handle the traffic on a Web site, it’s time to look into building a Web farm that uses multiple machines on the network acting as a single server. In a Web farm, services or applications can be installed onto multiple servers that are configured to share the workload. This type of configuration is a load-balanced cluster, which scales the performance of server-based programs, such as a Web server, by distributing client requests across multiple servers.

High Availability – Networks (II)

Redundant Protocols

If you read the previous posts, your network already has redundant links, and now you must decide how packets on the network will select their paths and avoid loops. This isn't a new problem; redundant paths have been addressed by protocols like Spanning Tree Protocol (STP) at Layer 2 and routing protocols like Open Shortest Path First (OSPF) at Layer 3. But these protocols can take 40 seconds or more to converge, which is unacceptable for critical networks, especially those with real-time applications like VoIP and video.
STP is a link management protocol that provides path redundancy while preventing undesirable loops in the network. For an Ethernet network to function properly, only one active path can exist between two stations. The protocol should be used in situations where you want redundant links, but not loops. Redundant links are as important as backups: a failure of your primary router activates the backup links so that users can continue to use the network. Without STP on the bridges and switches, such a failure can result in a loop.
To provide path redundancy, STP defines a tree that spans all switches in an extended network and forces certain redundant data paths into a standby (blocked) state. If one network segment in the STP becomes unreachable, or if STP costs change, the spanning-tree algorithm reconfigures the spanning-tree topology and reestablishes the link by activating the standby path.
An upgraded version of STP, called RSTP (Rapid Spanning Tree, 802.1w), cuts the convergence time of STP to about one second. One disadvantage of RSTP (and STP) is that only one of the redundant links can be active at a time, in an "active standby" configuration; another is that when STP changes the active path to another router, the gateway addresses of the clients must change as well. To avoid these problems, you must run the Virtual Router Redundancy Protocol (VRRP) along with STP or RSTP on your routers; VRRP emulates one virtual router address for the core routers and takes about three seconds to fail over.
The advantage of using VRRP is that you gain higher availability for the default path without having to configure dynamic routing or router discovery protocols on every end host. VRRP routers, viewed as a "redundancy group", share the responsibility for forwarding packets as if they "owned" the IP address corresponding to the default gateway configured on the hosts. One of the VRRP routers acts as the master and the others as backups; if the master router fails, a backup router becomes the new master. In this way, router redundancy is always provided, allowing traffic on the LAN to be routed without relying on a single router.
But because VRRP and RSTP work independently, it is possible for VRRP to designate one router as master while RSTP determines that the path to the backup router is the preferred path. In the worst case, this means that if the backup VRRP router receives traffic, it will immediately forward it to the master router for processing, adding a router hop.
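
The master/backup election described above can be pictured with a toy Python model; the router names, priorities and virtual address are invented, and real VRRP is of course negotiated through multicast advertisements between the routers rather than computed in one place:

    # Toy model of a VRRP redundancy group: the highest-priority router that is
    # still alive answers for the shared virtual gateway address.
    VIRTUAL_IP = "192.168.1.1"  # default gateway configured on the hosts
    PRIORITIES = {"router-a": 200, "router-b": 150, "router-c": 100}
    alive = {"router-a": True, "router-b": True, "router-c": True}

    def current_master() -> str:
        """The live router with the highest priority owns the virtual address."""
        candidates = [r for r, up in alive.items() if up]
        return max(candidates, key=lambda r: PRIORITIES[r])

    print(current_master())    # router-a owns 192.168.1.1
    alive["router-a"] = False  # the master fails...
    print(current_master())    # ...and router-b takes over the virtual address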

High Availability - Networks (I)

Redundant devices

Today’s networks are high-tech and usually high speed. Common to most Wide Area Network (WAN) designs is the need for a backup to take over in case of any type of failure of your main link. A simple scenario would be a single T1 connection from your core site to each remote office or branch office you connect with. What if that link went down? How would you continue your operations if it did?
Adding redundancy is the most common way to increase your uptime. First, make sure there's redundancy within your core router; redundant CPU cards, power supplies and fans usually can be added to chassis-based routers and switches, and some router and switch vendors have equipment with dual backplanes. With redundant CPU cards, you can force a failover to one card while you upgrade the second one, instead of having to bring the whole router down for the upgrade.
The goal of redundant topologies is to eliminate network downtime caused by a single point of failure. All networks need redundancy for enhanced reliability; this is achieved through reliable equipment and network designs that are tolerant of failures and faults, and networks should be designed to reconverge rapidly so that the fault is bypassed.
Network redundancy is a simple concept to understand. If you have a single point of failure and it fails you, then you have nothing to rely on. If you put in a secondary (or tertiary) method of access, then when the main connection goes down, you will have a way to connect to resources and keep the business operational.
The critical point is that highly reliable network equipment is expensive because it is designed not to break; it typically includes things like dual power supplies, watchdog processors and redundant disk systems.
A highly available system may be built out of less expensive network products but these components may lack the redundant power supplies or other features of high-reliability equipment, and therefore, they may fail more often than the more expensive equipment. However, if the overall network design takes into account the fact that equipment may fail, then end users will still be able to access the network even if something goes wrong.