Storage Networks and Servers

Showing posts with label Storage. Show all posts

How to use the Virtualization Lab (II)

Picking up from where I left, it was now time to change the setup into something very different. The first step was the creation of another VM inside Hyper-V to be used as an alternative source for iSCSI storage. I achieved this by installing the Microsoft iSCSI Target 3.3 on a new Server 2008 R2 x64 VM. I created this machine with two vhd files; one for the OS and the other one for the iSCSI storage.

I will now show you the steps taken to create three new iSCSI virtual disks:

Creation of the iSCSI target:

How to use the Virtualization Lab (I)

I finished last post on this series with a fully working cluster installed between two Hyper-V virtual machines (VM) using a virtual iSCSI solution installed on a Virtual Box VM as depicted in the next picture:

Before moving on in the process of adding complexity to the lab scenario, don't forget to safeguard your work; although this just a lab, it doesn't reduce the nuisance of having to reinstall everything in the event of any failure. So, create VM snapshots:

Failover Cluster Networking

The first step in the setup of a failover cluster is the creation of an AD domain because all the cluster nodes have to belong to the same domain. But before doing so, I changed the networks settings again in order to adjust them for this purpose.

LAB-DC:IP: 192.168.1.10
Gateway: 192.168.1.1 (Physical Router)
DNS: 127.0.0.1
Alternate DNS: 192.168.1.1

LAB-NODE1:
IP: 192.168.1.11
Gateway: 192.168.1.1
DNS: 192.168.1.10 (DC)
Alternate DNS: 192.168.1.1 (Physical Router)

LAB-NODE2:IP: 192.168.1.12
Gateway: 192.168.1.1
DNS: 192.168.1.10
Alternate DNS: 192.168.1.1

LAB-NODE3:
IP: 192.168.1.13
Gateway: 192.168.1.1
DNS: 192.168.1.10
Alternate DNS: 192.168.1.1

LAB-STORAGE:IP: 192.168.1.14
Gateway: 192.168.1.1
DNS: 192.168.1.10
Alternate DNS: 192.168.1.1

Therefore, I created a domain comprised of 5 machines; a DC and two member servers as Hyper-V VMs, a member server as a VMware VM and another member server as a VirtualBox VM.

So far I have demonstrated the possibility of integrating in the same logical infrastructure virtualized servers running on different platforms using different virtualization techniques; in this case we have VMs running in a Type 1 hypervisor (Hyper-V) and in two distinct Type 2 hypervisors (VMware Workstation and VirtualBox).

The option to create a network with static IP addresses is as valid as the alternative of using DHCP. Later on I plan to explore the several options provided by the cluster networking in Windows 2008 but for the time being I kept my network in a simple and basic configuration in order to proceed with the lab installation.

Virtualization (I)

Virtualization is a massively growing aspect of computing and IT creating "virtually" an unlimited number of possibilities for system administrators. Virtualization has been around for many years in some form or the other, the trouble with it, is being able to understand the different types of virtualization, what they offer, and how they can help us.

Virtualization has been defined as the abstraction of computer resources or as a technique for hiding the physical characteristics of computing resources from the way in which other systems, applications, or end users interact with those resources. In fact, virtualization always means abstraction. We make something transparent by adding layers that handle translations, causing previously important aspects of a system to become moot.

Storage Virtualization

The amount of data organizations are creating and storing is rapidly increasing due to the shift of business processes to Web-based digital applications and this huge amount of data is causing problems for many of them. First, many applications generate more data than can be stored physically on a single server. Second, many applications, particularly Internet-based ones, have multiple machines that need to access the same data. Having all of the data sitting on one machine can create a bottleneck, not to mention presenting risk from the situation where many machines might be made inoperable if a single machine containing all the application’s data crashes. Finally, the increase in the number of machines causes backup problems because trying to create safe copies of data is a tremendous task when there are hundreds or even thousands of machines that need data backup.

For these reasons, data has moved into virtualization. Companies use centralized storage (virtualized storage) as a way of avoiding data access problems. Furthermore, moving to centralized data storage can help IT organizations reduce costs and improve data management efficiency. The basic premise of storage virtualization solutions is not new. Disk storage has long relied on partitioning to organize physical disk tracks and sectors into clusters, and then abstract clusters into logical drive partitions (e.g., the C: drive). This allows the operating system to read and write data to the local disks without regard to the physical location of individual bytes on the disk platters.
Storage virtualization creates a layer of abstraction between the operating system and the physical disks used for data storage. The virtualized storage is then location-independent, which can enable more efficient use and better storage management. For example, the storage virtualization software or device creates a logical space, and then manages metadata that establishes a map between the logical space and the physical disk space. The creation of logical space allows a virtualization platform to present storage volumes that can be created and changed with little regard for the underlying disks.

The storage virtualization layer is where the resources of many different storage devices are pooled so that it looks like they are all one big container of storage. This is then managed by a central system that makes it all look much simpler to the network administrators. This is also a great way to monitor resources, as you can then see exactly how much you have left at a given time giving much less hassle when it comes to backups etc.

In most data centers, only a small percentage of storage is used because, even with a SAN, one has to allocate a full disk logical unit number (LUN) to the server (or servers) attaching to that LUN. Imagine that a LUN fills up but there is disk space available on another LUN. It is very difficult to take disk space away from one LUN and give it to another LUN. Plus, it is very difficult to mix and match storage and make it appear all as one.

Storage virtualization works great for mirroring traffic across a WAN and for migrating LUNs from one disk array to another without downtime. With some types of storage virtualization, for instance, you can forget about where you have allocated data, because migrating it somewhere else is much simpler. In fact, many systems will migrate data based on utilization to optimize performance.

Storage Area Network

A Storage Area Network (SAN) is a dedicated high-performance subnet that provides access to consolidated, block level data storage and is primarily used to transfer data between computer systems and storage elements and among multiple storage elements, making storage devices, such as disk arrays, tape libraries, and optical jukeboxes, accessible to servers so that the devices appear like locally attached devices to the operating system.

A SAN typically has its own communication infrastructure that is generally not accessible through the local area network by other devices. A SAN moves data among various storage devices, allowing for the sharing data between different servers, and provides a fast connection medium for backing up, restoring, archiving, and retrieving data. SAN devices are usually installed closely in a single room, but they can also be connected over long distances, making it very useful to large companies.

SAN Benefits

The primary benefits of a SAN are:

High Availability: One copy of every piece of data is always accessible to any and all hosts via multiple paths;
Reliability: Dependable data transportation ensures a low error rate, and fault tolerance capabilities;
Scalability: Servers and storage devices may be added independently of one another and from any proprietary systems;
Performance: Fibre Channel (the standard method for SAN interconnectivity) has now over than 2000MB/sec bandwidth and low overhead, and it separates storage and network I/O;

RAID Concepts

The acronym RAID stands for Redundant Array of Inexpensive Disks and is a technology that provides increased storage functions and reliability through redundancy. It was developed using a large number of low cost hard drives linked together to form a single large capacity storage device that offered superior performance, storage capacity and reliability over older storage systems. This was achieved by combining multiple disk drive components into a logical unit, where data was distributed across the drives in one of several ways called "RAID levels".
This concept of storage virtualization and was first defined as Redundant Arrays of Inexpensive Disks but the term later evolved into Redundant Array of Independent Disks as a means of dissociating a low-cost expectation from RAID technology.
There are two primary reasons that RAID was implemented:

Redundancy: This is the most important factor in the development of RAID for server environments. A typical RAID system will assure some level of fault tolerance by providing real time data recovery with uninterrupted access when hard drive fails;

Increased Performance: The increased performance is only found when specific versions of the RAID are used. Performance will also be dependent upon the number of drives used in the array and the controller;

Hardware-based RAID

When using hardware RAID controllers, all algorithms are generated on the RAID controller board, thus freeing the server CPU. On a desktop system, a hardware RAID controller may be a PCI or PCIe expansion card or a component integrated into the motherboard. These are more robust and fault tolerant than software RAID but require a dedicated RAID controller to work.

Hardware implementations provide guaranteed performance, add no computational overhead to the host computer, and can support many operating systems; the controller simply presents the RAID array as another logical drive

Software-based RAID

Many operating systems provide functionality for implementing software based RAID systems where the OS generate the RAID algorithms using the server CPU. In fact the burden of RAID processing is borne by a host computer's central processing unit rather than the RAID controller itself which can severely limit the RAID performance.

Although cheap to implement it does not guarantee any kind of fault tolerance; should a server fail the whole RAID system is lost.

Hot spare drive

Both hardware and software RAIDs with redundancy may support the use of hot spare drives, a drive
physically installed in the array which is inactive until an active drive fails. The system then automatically replaces the failed drive with the spare, rebuilding the array with the spare drive included. This reduces the mean time to recovery (MTTR), but does not completely eliminate it. Subsequent additional failure(s) in the same RAID redundancy group before the array is fully rebuilt can result in data loss. Rebuilding can take several hours, especially on busy systems.

How to use the Virtualization Lab (II)

How to use the Virtualization Lab (II)

How to use the Virtualization Lab (I)

How to use the Virtualization Lab (I)

How to Setup a Virtualization Lab (III)

How to Setup a Virtualization Lab (III)

Failover Cluster Networking

Virtualization (I)

Virtualization (I)

Storage Virtualization

High Availability Storage (II)

High Availability Storage (II)

Storage Area Network

SAN Benefits

High Availability Storage (I)

High Availability Storage (I)

RAID Concepts

Hardware-based RAID

Software-based RAID

Hot spare drive