High Availability Networks and Servers

Showing posts with label High Availability. Show all posts

How to use the Virtualization Lab (II)

Picking up from where I left, it was now time to change the setup into something very different. The first step was the creation of another VM inside Hyper-V to be used as an alternative source for iSCSI storage. I achieved this by installing the Microsoft iSCSI Target 3.3 on a new Server 2008 R2 x64 VM. I created this machine with two vhd files; one for the OS and the other one for the iSCSI storage.

I will now show you the steps taken to create three new iSCSI virtual disks:

Creation of the iSCSI target:

How to use the Virtualization Lab (I)

I finished last post on this series with a fully working cluster installed between two Hyper-V virtual machines (VM) using a virtual iSCSI solution installed on a Virtual Box VM as depicted in the next picture:

Before moving on in the process of adding complexity to the lab scenario, don't forget to safeguard your work; although this just a lab, it doesn't reduce the nuisance of having to reinstall everything in the event of any failure. So, create VM snapshots:

High Availability with Failover Clusters

Before moving on to the next chapter on my virtualization lab series, I think this might be a good opportunity to review some of the clustering options available today. I will use Windows Server Failover Clustering with Hyper-V because in today's world the trend is to combine Virtualization with High Availability (HA).

There are many ways to implement these solutions and the basic design concepts presented here can be adjusted to other virtualization platforms. Some of them will actually not guarantee a fault-tolerant solution, but most of them can be used in specific scenarios (even if only for demonstration purposes).

Two virtual machines on one physical server

In this scenario an HA cluster is built between two (or more) virtual machines on a single physical machine. Here we have a single physical server running Hyper-V and two child partitions where you run Failover Clustering. This setup does not protect against hardware failures because when the physical server fails, both (virtual) cluster nodes will fail. Therefore, the physical machine itself is a single point of failure (SPOF).

(Click to enlarge)

Failover Cluster Networking

The first step in the setup of a failover cluster is the creation of an AD domain because all the cluster nodes have to belong to the same domain. But before doing so, I changed the networks settings again in order to adjust them for this purpose.

LAB-DC:IP: 192.168.1.10
Gateway: 192.168.1.1 (Physical Router)
DNS: 127.0.0.1
Alternate DNS: 192.168.1.1

LAB-NODE1:
IP: 192.168.1.11
Gateway: 192.168.1.1
DNS: 192.168.1.10 (DC)
Alternate DNS: 192.168.1.1 (Physical Router)

LAB-NODE2:IP: 192.168.1.12
Gateway: 192.168.1.1
DNS: 192.168.1.10
Alternate DNS: 192.168.1.1

LAB-NODE3:
IP: 192.168.1.13
Gateway: 192.168.1.1
DNS: 192.168.1.10
Alternate DNS: 192.168.1.1

LAB-STORAGE:IP: 192.168.1.14
Gateway: 192.168.1.1
DNS: 192.168.1.10
Alternate DNS: 192.168.1.1

Therefore, I created a domain comprised of 5 machines; a DC and two member servers as Hyper-V VMs, a member server as a VMware VM and another member server as a VirtualBox VM.

So far I have demonstrated the possibility of integrating in the same logical infrastructure virtualized servers running on different platforms using different virtualization techniques; in this case we have VMs running in a Type 1 hypervisor (Hyper-V) and in two distinct Type 2 hypervisors (VMware Workstation and VirtualBox).

The option to create a network with static IP addresses is as valid as the alternative of using DHCP. Later on I plan to explore the several options provided by the cluster networking in Windows 2008 but for the time being I kept my network in a simple and basic configuration in order to proceed with the lab installation.

Hardware-Assisted Virtualization Explained

Hardware-assisted virtualization was first introduced on the IBM System/370 in 1972, for use with VM/370, the first virtual machine operating system. Virtualization was forgotten in the late 1970s but the proliferation of x86 servers rekindled interest in virtualization driven for the need for server consolidation; virtualization allowed a single server to replace multiple underutilized dedicated servers.

However, the x86 architecture did not meet the Popek and Goldberg Criteria to achieve the so called “classical virtualization″. To compensate for these limitations, virtualization of the x86 architecture has been accomplished through two methods: full virtualization or paravirtualization. Both create the illusion of physical hardware to achieve the goal of operating system independence from the hardware but present some trade-offs in performance and complexity.

Thus, Intel and AMD have introduced their new virtualization technologies, a handful of new instructions and — crucially — a new privilege level. The hypervisor can now run at "Ring -1"; so the guest operating systems can run in Ring 0.

Hardware virtualization leverages virtualization features built into the latest generations of CPUs from both Intel and AMD. These technologies, known as Intel VT and AMD-V respectively, provide extensions necessary to run unmodified virtual machines without the overheads inherent in full virtualization CPU emulation. In very simplistic terms these new processors provide an additional privilege mode below ring 0 in which the hypervisor can operate essentially leaving ring 0 available for unmodified guest operating systems.

Operating System-Level Virtualization Explained

This kind of server virtualization is a technique where the kernel of an operating system allows for multiple isolated user-space instances. These instances run on top of an existing host operating system and provide a set of libraries that applications interact with, giving them the illusion that they are running on a machine dedicated to its use. The instances are known as Containers, Virtual Private Servers or Virtual Environments.

Operating system level virtualization is achieved by the host system running a single OS kernel and through its control of guest operating system functionality. Under this shared kernel virtualization the virtual guest systems each have their own root file system but share the kernel of the host operating system.

Paravirtualization Explained

“Para“ is an English affix of Greek origin that means "beside," "with," or "alongside.” Paravirtualization is another approach to server virtualization where, rather than emulate a complete hardware environment, paravirtualization acts as a thin layer, which ensures that all of the guest operating systems share the system resources and work well together.

Under paravirtualization, the kernel of the guest operating system is modified specifically to run on the hypervisor. This typically involves replacing any privileged operations that will only run in ring 0 of the CPU, with calls to the hypervisor (known as hypercalls). The hypervisor in turn performs the task on behalf of the guest kernel and also provides hypercall interfaces for other critical kernel operations such as memory management, interrupt handling and time keeping.

Full Virtualization Explained

This is probably the most common and most easily explained kind of server virtualization. When IT departments were struggling to get results with machines at full capacity, it made sense to assign one physical server to every IT function taking advantage of cheap hardware A typical enterprise would have one box for SQL, one for the Apache server and another physical box for the Exchange server. Now, each of those machines could be using only 5% of its full processing potential. This is where hardware emulators come into play in an effort to consolidate those servers.

A hardware emulator presents a simulated hardware interface to guest operating systems. In hardware emulation, the virtualization software (usually referred to as a hypervisor) actually creates an artificial hardware device with everything it needs to run an operating system and presents an emulated hardware environment that guest operating systems operate upon. This emulated hardware environment is typically referred to as a Virtual Machine Monitor or VMM.

Hardware emulation supports actual guest operating systems; the applications running in each guest operating system are running in truly isolated operating environments. This way, we can have multiple servers running on a single box, each completely independent of the other. The VMM provides the guest OS with a complete emulation of the underlying hardware and for this reason, this kind of virtualization is also referred to as Full Virtualization.

Thus, full virtualization provides a complete simulation of the underlying hardware and is a technique used to provide support for unmodified guest operating systems. This means that all software, operating systems and applications, which can run natively on the hardware can also be run in the virtual machine.

The term unmodified refers to operating system kernels which have not been altered to run on a hypervisor and therefore still execute privileged operations as though running in ring 0 of the CPU. Full virtualization uses the hypervisor to coordinate the CPU of the server and the host machine's system resources in order to manage the running of guest operating systems without any modification. In this scenario, the hypervisor provides CPU emulation to handle and modify privileged and protected CPU operations made by unmodified guest operating system kernels.

The guest operating system makes system calls to the emulated hardware. These calls, which would actually interact with underlying hardware, are intercepted by the virtualization hypervisor which maps them onto the real underlying hardware. The hypervisor provides complete independence and autonomy of each virtual server to other virtual servers running on the same physical machine. Each guest server has its own operating system and it may even happen that one guest server is running Linux and the other is running Windows.

The hypervisor also monitors and controls the physical server resources, allocating what is needed to each operating system and making sure that the guest operating systems (the virtual machines) cannot disrupt each other. With full virtualization, the guest OS is not aware it is being virtualized and thus requires no modification.

Type 1 Hypervisor

Also known as Native or Bare-Metal Virtualization, this is a technique where the abstraction layer sits directly on the hardware and all the other blocks reside on top of it. The Type 1 hypervisor runs directly on the hardware of the host system in ring 0. The task of this hypervisor is to handle resource and memory allocation for the virtual machines in addition to providing interfaces for higher level administration and monitoring tools.The operating systems run on another level above the hypervisor.

Clearly, with the hypervisor occupying ring 0 of the CPU, the kernels for any guest operating systems running on the system must run in less privileged CPU rings. The Type 1 Hypervisor contains functionalities like CPU scheduling and Memory Management and, even though there is no Host OS, usually one of the Virtual Machines has certain privileged status (Control/Admin/Parent VM).

Unfortunately, most operating system kernels are written explicitly to run in ring 0 for the simple reason that they need to perform tasks that are only available in that ring, such as the ability to execute privileged CPU instructions and directly manipulate memory.

Also, depending on the architecture, the hypervisor may either contain the drivers for the hardware resources (referred to as a Monolithic Hypervisor) or the drivers may be retained at the Guest OS level itself (in which case it can be called a Microkernelized Hypervisor).

Since it has low level direct access to the hardware, a Type 1 hypervisor is more efficient than a hosted application and delivers greater performance because it uses fewer resources (no separate CPU cycles or memory footprint as in the case of a full-fledged Host OS).

The disadvantage of this model is that there is dependency on the hypervisor for the drivers (at least in case of the Monolithic Hypervisor). Besides, most implementations of the bare-metal approach require specific virtualization support at the hardware level (“Hardware-assisted”, to be discussed in a future post).

Examples of this architecture are Microsoft Hyper-V and VMware ESX Server.

Note: Microsoft Hyper-V (released in June 2008) exemplifies a Type 1 product that can be mistaken for a Type 2. Both the free stand-alone version and the version that is part of the commercial Windows Server 2008 product use a virtualized Windows Server 2008 parent partition to manage the Type 1 Hyper-V hypervisor. In both cases the Hyper-V hypervisor loads prior to the management operating system, and any virtual environments created run directly on the hypervisor, not via the management operating system.

Type 2 Hypervisor

Guest OS Virtualization is perhaps the easiest concept to understand. In this scenario the physical host computer system runs a standard unmodified operating system such as Windows, Linux, Unix or MacOS X, and the virtualization layer runs on top of that OS being in effect a hosted application. In this architecture, the VMM provides each virtual machine with all the services of the physical system, including a virtual BIOS, virtual devices and virtual memory. This has the effect of making the guest system think it is running directly on the system hardware, rather than in a virtual machine within an application.

The OS itself provides the abstraction layer (known as Type 2 Hypervisor), such that it allows other OSes to reside within; thus creating virtual machines. This architecture then can be called as Hosted Virtualization (since an OS is ‘hosting’ it), as depicted below.

The virtualization layer in the above figure contains the software needed for hosting and managing the virtual machines but its exact functionality can vary based on the different architectures from different vendors. In this approach, the Guest OS relies on the underlying Host OS for access to the hardware resources.

The multiple layers of abstraction between the guest operating systems and the underlying host hardware are not conducive to high levels of virtual machine performance. This technique does, however, have the advantage that no changes are necessary to either host or guest operating systems and no special CPU hardware virtualization support is required.

Examples of a Type 2 Hypervisor are Oracle's VirtualBox, VMware Server, VMware Workstation and Microsoft Virtual PC.

Embedded Hypervisor

In Kernel Level Virtualization the host operating system runs on a specially modified kernel which contains extensions designed to manage and control multiple virtual machines each containing a guest operating system.

The virtualization layer is embedded into an operating system kernel and each guest runs its own kernel, although restrictions apply in that the guest operating systems must have been compiled for the same hardware as the kernel in which they are running.

The real benefit with this approach is that the hypervisor code is dramatically leaner than either Type 1 or Type 2. With the hypervisor embedded into the Linux kernel, the guest operating systems benefit from excellent disk and network I/O performance.

Examples of kernel level virtualization technologies include User Mode Linux (UML) and Kernel-based Virtual Machine (KVM).

Full Virtualization Advantages

This approach to virtualization means that applications run in a truly isolated guest OS, with one or more of these guest OSs running simultaneously on the same hardware. Not only does this method support multiple OSes, it can support dissimilar OSes, differing in minor ways (for example, version and patch level) or in major ways (for example, completely different OSes like Windows and Linux);

The guest OS is not aware it is being virtualized and requires no modification. Full virtualization is the only option that requires no hardware assist or operating system assist to virtualize sensitive and privileged instructions. The hypervisor translates all operating system instructions on the fly and caches the results for future use, while user level instructions run unmodified at native speed;

The VMM provides a standardized hardware environment that the guest OS resides on and interacts with. Because the guest OS and the VMM form a consistent package, that package can be migrated from one machine to another, even though the physical machines the packages run upon may differ;

Full virtualization offers the best isolation and security for virtual machines, and simplifies migration and portability as the same guest OS instance can run virtualized or on native hardware.

Full Virtualization Limitations

The virtualization software hurts performance, which is to say that applications often run somewhat slower on virtualized systems than if they were run on unvirtualized systems. The hypervisor needs data processing, which means that part of the computing power of a physical server and related resources should be reserved for the hypervisor program. While the VMMs appears to solve the entire problem with regard to virtualized machines, it does bring in some level of performance degradation, caused by the extra processing (in terms of the ‘instruction translation’) that the hypervisor has to do. This can have a negative impact on overall server performance and slow down the applications;

The hypervisor must contain the interfaces to the resources of the machine; these interfaces are referred to as device drivers. Because hardware emulation uses software to trick the guest OS into communicating with simulated non-existent hardware, this approach has created some driver compatibility problems. The issue is that the hypervisor contains the device drivers and it might be difficult for new device drivers to be installed by users (unlike on your typical PC). Consequently, if a machine has hardware resources the hypervisor has no driver for, the virtualization software can’t be run on that machine. This can cause problems, especially for organizations that want to take advantage of new hardware developments.

Server Virtualization Explained

You have probably heard about lots of distinct types of server virtualization; full, bare metal, para-virtualization, guest OS, OS assisted, hardware assisted, hosted, OS level, kernel level, shared kernel, hardware emulation, hardware virtualization, hypervisor based, containers or native virtualization. Confusing, right?

Fear not my faithful readers; the whole purpose of this blog is exactly to explain these things so that everyone can have a clear view over issues usually restricted to a bunch of geeks. But keep in mind that some of these terms are popularized by certain vendors and do not have a common industry-wide acceptance. Plus, many of the terms are used rather loosely and interchangeably (which is why they are so confusing).

Although others classify the current virtualization techniques in a different way, I will use the following criteria:

On the following exciting chapters I will explain these techniques, one by one, but before that I believe it would be useful to give you a quick introduction to some underlying concepts.

Server Virtualization

Out of all three of the different types of virtualization discussed in this blog, I believe that server virtualization is the type of virtualization everybody is most familiar with. When people say "virtualization", they are usually referring to server virtualization because this is the main area of virtualization, whereby a number of “virtual machines” are created on one server meaning that multiple tasks can then be assigned to the one server, saving on processing power, cost and space.

Server virtualization inserts a layer of abstraction between the physical server hardware and the software that runs on the server allowing us to run multiple guest computers on a single host computer with those guest computers believing they are running on their own hardware. The physical machine is translated into one or more virtual machines (VMs). Each VM runs its own operating system and applications, and each utilizes some allocated portion of the server's processing resources such as CPU, memory, network access and storage I/O. This means that any network tasks that are happening on the server still appear to be on a separate space, so that any errors can be diagnosed and fixed quickly.

By doing this, we gain all the benefits of any type of virtualization: portability of guest virtual machines, reduced operating costs, reduced administrative overhead, server consolidation, testing & training, disaster recovery benefits, and more.

Storage Area Network

A Storage Area Network (SAN) is a dedicated high-performance subnet that provides access to consolidated, block level data storage and is primarily used to transfer data between computer systems and storage elements and among multiple storage elements, making storage devices, such as disk arrays, tape libraries, and optical jukeboxes, accessible to servers so that the devices appear like locally attached devices to the operating system.

A SAN typically has its own communication infrastructure that is generally not accessible through the local area network by other devices. A SAN moves data among various storage devices, allowing for the sharing data between different servers, and provides a fast connection medium for backing up, restoring, archiving, and retrieving data. SAN devices are usually installed closely in a single room, but they can also be connected over long distances, making it very useful to large companies.

SAN Benefits

The primary benefits of a SAN are:

High Availability: One copy of every piece of data is always accessible to any and all hosts via multiple paths;
Reliability: Dependable data transportation ensures a low error rate, and fault tolerance capabilities;
Scalability: Servers and storage devices may be added independently of one another and from any proprietary systems;
Performance: Fibre Channel (the standard method for SAN interconnectivity) has now over than 2000MB/sec bandwidth and low overhead, and it separates storage and network I/O;

RAID Concepts

The acronym RAID stands for Redundant Array of Inexpensive Disks and is a technology that provides increased storage functions and reliability through redundancy. It was developed using a large number of low cost hard drives linked together to form a single large capacity storage device that offered superior performance, storage capacity and reliability over older storage systems. This was achieved by combining multiple disk drive components into a logical unit, where data was distributed across the drives in one of several ways called "RAID levels".
This concept of storage virtualization and was first defined as Redundant Arrays of Inexpensive Disks but the term later evolved into Redundant Array of Independent Disks as a means of dissociating a low-cost expectation from RAID technology.
There are two primary reasons that RAID was implemented:

Redundancy: This is the most important factor in the development of RAID for server environments. A typical RAID system will assure some level of fault tolerance by providing real time data recovery with uninterrupted access when hard drive fails;

Increased Performance: The increased performance is only found when specific versions of the RAID are used. Performance will also be dependent upon the number of drives used in the array and the controller;

Hardware-based RAID

When using hardware RAID controllers, all algorithms are generated on the RAID controller board, thus freeing the server CPU. On a desktop system, a hardware RAID controller may be a PCI or PCIe expansion card or a component integrated into the motherboard. These are more robust and fault tolerant than software RAID but require a dedicated RAID controller to work.

Hardware implementations provide guaranteed performance, add no computational overhead to the host computer, and can support many operating systems; the controller simply presents the RAID array as another logical drive

Software-based RAID

Many operating systems provide functionality for implementing software based RAID systems where the OS generate the RAID algorithms using the server CPU. In fact the burden of RAID processing is borne by a host computer's central processing unit rather than the RAID controller itself which can severely limit the RAID performance.

Although cheap to implement it does not guarantee any kind of fault tolerance; should a server fail the whole RAID system is lost.

Hot spare drive

Both hardware and software RAIDs with redundancy may support the use of hot spare drives, a drive
physically installed in the array which is inactive until an active drive fails. The system then automatically replaces the failed drive with the spare, rebuilding the array with the spare drive included. This reduces the mean time to recovery (MTTR), but does not completely eliminate it. Subsequent additional failure(s) in the same RAID redundancy group before the array is fully rebuilt can result in data loss. Rebuilding can take several hours, especially on busy systems.

Cluster Node Configurations

The most common size for an high availability cluster is a two-node cluster, since that's the minimum required to assure redundancy, but many clusters consist of many more, sometimes dozens of nodes and such configurations can be categorized into one of the following models:

Active/Passive Cluster

In an Active/Passive (or asymmetric) configuration, applications run on a primary, or master, server. A dedicated redundant server is present to take over on any failure but apart from that it is not configured to perform any other functions. Thus, at any time, one of the nodes is active and the other is passive. This configuration provides a fully redundant instance of each node, which is only brought online when its associated primary node fails.
The active/passive cluster generally contains two identical nodes. Database applications single instances are installed on both nodes, but the database is located on shared storage. During normal operation, the database instance runs only on the active node. In the event of a failure of the currently active primary system, clustering software will transfer control of the disk subsystem to the secondary system. As part of the failover process, the database instance on the secondary node is started, thereby resuming the service.

This configuration is the simplest and most reliable but typically requires the most extra hardware.

Cluster Quorum Configurations

In simple terms, the quorum for a cluster is the number of elements that must be online for that cluster to continue running. Server clusters require a quorum resource to function and this, like any other resource, is a resource which can only be owned by one server at a time, and for which servers can negotiate for ownership. In effect, each element can cast one “vote” to determine whether the cluster continues running. The voting elements are nodes or, in some cases, a disk witness or file share witness. The quorum resource is used to store the definitive copy of the cluster configuration so that regardless of any sequence of failures, the cluster configuration will always remain consistent. Each voting element (with the exception of a file share witness) contains a copy of the cluster configuration, and the Cluster service works to keep all copies synchronized at all times.

When network problems occur, they can interfere with communication between cluster nodes. A small set of nodes might be able to communicate together across a functioning part of a network but not be able to communicate with a different set of nodes in another part of the network. This can cause serious issues. In this "split" situation, at least one of the sets of nodes must stop running as a cluster.
Negotiating for the quorum resource allows Server clusters to avoid "split-brain" situations where the servers are active and think the other servers are down.
To prevent the issues that are caused by a split in the cluster, the cluster software requires that any set of nodes running as a cluster must use a voting algorithm to determine whether, at a given time, that set has quorum. Because a given cluster has a specific set of nodes and a specific quorum configuration, the cluster will know how many "votes" constitutes a majority (that is, a quorum). If the number drops below the majority, the cluster stops running. Nodes will still listen for the presence of other nodes, in case another node appears again on the network, but the nodes will not begin to function as a cluster until the quorum exists again.

The failover mechanism

In the cluster, shared resources can be seen from all computers, or nodes. Each node automatically senses if another node in the cluster has failed, and processes running on the failed node continue to run on an operational one. To the user, this failover is usually transparent. Depicted in the next figure is a two-node cluster; both nodes have local disks and shared resources available to both computers. The heartbeat connection allows the two computers to communicate and detects when one fails; if Node 1 fails, the clustering software will seamlessly transfer all services to Node 2.

The Windows service associated with the failover cluster named Cluster Service has a few components:

Event processor;
Database manager;
Node manager;
Global update manager
Communication manager;
Resource (or failover) manager.

The resource manager communicates directly with a resource monitor that talks to a specific application DLL that makes an application cluster-aware. The communication manager talks directly to the Windows Winsock layer (a component of the Windows networking layer).
The failover process is as follows:

Clustering Basics

Clustering is the use of multiple computers and redundant interconnections to form what appears to be a single, highly available system. A cluster provides protection against downtime for important applications or services that need to be always available by distributing the workload among several computers in such a way that, in the event of a failure in one system, the service will be available on another.

The basic concept of a cluster is easy to understand; a cluster is two or more systems working in concert to achieve a common goal. Under Windows, two main types of clustering exist: scale-out/availability clusters known as Network Load Balancing (NLB) clusters, and strictly availability-based clusters known as failover clusters. Microsoft also has a variation of Windows called Windows Compute Cluster Server.

When a computer unexpectedly falls or is intentionally taken down, clustering ensures that the processes and services being run switch to another machine, or "failover," in the cluster. This happens without interruption or the need for immediate admin intervention providing a high availability solution, which means that critical data is available at all times.

Software based load balancing

Let’s now take a glance at the load balancing solutions implemented without the need for a dedicated piece of hardware like the ADCs we’ve discussed in the previous posts. Although there are several available software solutions for the Unix/Linux world, I will focus primarily on Microsoft Windows technologies. In the future I plan to write a series of step by step tutorials and then I might do it also for the Linux community.

DNS Load Balancing

DNS load balancing is a popular yet simple approach to balancing server requests and consists basically in creating multiple DNS entries in the DNS record for the domain meaning that the authoritative DNS server contains multiple “A” records for a single host.

Let’s imagine we want to balance the load on www.mywebsite.com, and we have three web servers with IP addresses of 64.13.192.120, 64.13.192.121, and 64.13.192.122 respectively, each is running a complete copy of the website, so no matter which server a request is directed to, the same response is provided.

To implement this, simply create the following DNS entries:

www.mywebsite.com    64.13.192.120
www.mywebsite.com    64.13.192.121
www.mywebsite.com    64.13.192.122

When a DNS request comes to the DNS server to resolve the domain name, it might give out one of the server IP addresses based on scheduling strategies, such as simple round-robin scheduling or geographical scheduling thus redirecting the request to one of the servers in a server group. Once the domain is resolved to one of the servers, subsequent requests from the clients using the same local caching DNS server are sent to the same server but request coming from other local DNSs will be sent to another server. This process is known as Round Robin DNS (RRDNS).

Hardware based load balancing

A hardware load-balancing device, also known as a layer 4-7 router, is a computer appliance that is used to split network load across multiple servers based on factors such as CPU processor utilization, the number of connections or the overall server performance.

The use of an this kind of appliances minimizes the probability that any particular server will be overwhelmed and optimizes the bandwidth available to each computer or terminal. In addition, the use of an hardware load-balancing device can minimize network downtime, facilitate traffic prioritization, provide end-to-end application monitoring, provide user authentication, and help protect against malicious activity such as Denial-of-Service (DoS) attacks.

The basic principle is that network traffic is sent to a shared IP called a virtual IP (VIP), or listening IP and this address is attached to the load balancer. Once the load balancer receives a request on this VIP it will need to make a decision on where to send it and this decision is normally controlled by a load balancing algorithm, a server health check or a rule set.

The request is then sent to the appropriate server and the server will produce a response that, depending on the type of load balancer in use, will be sent either back to the load balancer, in the case of a Layer 7 device, or more typically with a Layer 4 device, directly back to the end user (normally via its default gateway).
In the case of a proxy based load balancer, the request from the web server can be returned to the load balancer and manipulated before being sent back to the user. This manipulation could involve content substitution or compression and some top end devices offer full scripting capability.

Load Balancing Algorithms

Load balancers use different algorithms to control traffic and with the specific goal of intelligently distribute load and/or maximize the utilization of all servers within the cluster.

Random Allocation

In a random allocation, the traffic is assigned to any server picked randomly among the group of destination servers. In such a case, one of the servers may be assigned many more requests to process while the other servers are sitting idle. However, on average, each server gets an approximately equal share of the load due to the random selection. Although simple to implement it can lead to the overloading of one server or more while under-utilization of others.

Load Balancing (III)

Before we go any deeper into the abyss of all the techniques and algorithms used in the load balancing world it is important to clarify some concepts and notions and take a look at the most used load balancing terminology. The target audience of this blog is supposed to know what the OSI Model is and therefore I won’t even bother to explain what the layers are...

Server health checking

Server health checking is the ability of the load balancer to run a test against the servers to determine if they are providing service:

Ping: This is the most simple method, however it is not very reliable as the server can be up whilst the web service could be down;

TCP connect: This is a more sophisticated method which can check if a service is up and running like a service on port 80 for web. i.e. try and open a connection to that port on the real server;

HTTP GET HEADER: This will make a HTTP GET request to the web server and typically check for a header response such as 200 OK;

HTTP GET CONTENTS: This will make a HTTP GET and check the actual content body for a correct response. Can be useful to check a dynamic web page that returns 'OK' only if some application health checks work i.e. backend database query validates. This feature is only available on some of the more advanced products but is the superior method for web applications as its will check that the actual application is available.

Layer-2 Load Balancing

Layer-2 load balancing (also referred as link aggregation, port aggregation, ether channel or gigabit ether channel port bundling) is to bond two or more links into a single, higher-bandwidth logical link. Aggregated links also provide redundancy and fault tolerance if each of the aggregated links follows a different physical path.

Client Based Load Balancing

It might be easier to make the client code and resources highly available and scalable than to do so for the servers; serving non-dynamic content requires fewer server resources. Before going into the details, let us consider a desktop application that needs to connect to servers on the internet to retrieve data. If our theoretical desktop application generates more requests to the remote server than it can handle, we will need a load balancing solution.

Instead of letting the client know of only one server from which to retrieve data, we can provide many servers—s1.mywebsite.com, s2.mywebsite.com, and so on. The desktop client randomly selects a server and attempts to retrieve data. If the server is not available, or does not respond in a preset time period, the client can select another server until the data is retrieved. Unlike web applications—which store the client code (JavaScript code or Flash SWF) on the same server that provides data and resource—the desktop client is independent of the server and able to load balance servers from the client side to achieve scalability for the application.