You have probably heard about lots of distinct types of server virtualization: full, bare metal, para-virtualization, guest OS, OS assisted, hardware assisted, hosted, OS level, kernel level, shared kernel, hardware emulation, hardware virtualization, hypervisor based, containers or native virtualization. Confusing, right?
Fear not, my faithful readers: the whole purpose of this blog is to explain these things so that everyone can have a clear view of issues usually restricted to a bunch of geeks. But keep in mind that some of these terms are popularized by certain vendors and have no industry-wide acceptance. Plus, many of the terms are used rather loosely and interchangeably (which is why they are so confusing).
Although others classify the current virtualization techniques in a different way, I will use the following classification:
- Full Virtualization;
- Operating System-level Virtualization;
- Hardware assisted virtualization.
In the following chapters I will explain these techniques, one by one, but before that I believe it would be useful to give you a quick introduction to some underlying concepts.
Server Virtualization History
Virtualization was first developed in the 1960s to partition large mainframes for better hardware utilization. In 1964, IBM developed a Virtual Machine Monitor to run its various operating systems on its mainframes, providing a way to logically partition these big computers into separate virtual machines. These partitions allowed mainframes to multitask, that is, to run multiple applications and processes at the same time. Hardware was too expensive to leave underutilized, so it was designed for partitioning as a way to fully leverage the investment.
However, with the advent of cheap commodity hardware, virtualization lost its popularity and came to be viewed as a relic of an era when computing resources were scarce. This was reflected in the design of the x86 architecture, which did not provide enough support to implement virtualization efficiently. Virtualization was effectively abandoned during the 1980s and 1990s, when client-server applications and inexpensive x86 servers and desktops led to distributed computing. The broad adoption of Windows and the emergence of Linux as server operating systems in the 1990s established x86 servers as the industry standard.
With the cost of hardware going down and the complexity of software increasing, many administrators started running one application per server. This gave them isolation: one application could not interfere with any other. Over time, however, this practice led to a problem called server sprawl: too many servers, with average utilization between 5% and 15%. In addition to the cost of the hardware, all these servers had power and cooling requirements, and the old problem of poor hardware utilization surfaced again.
The growth in x86 server and desktop deployments led to new IT infrastructure and operational challenges, namely:
- Low Infrastructure Utilization;
- Increasing Physical Infrastructure Costs;
- Increasing IT Management Costs;
- Insufficient Failover and Disaster Protection.
Guests and hosts
A virtualized environment typically has two components: the host server and the guest virtual machine. The host server is the underlying hardware that provides computing resources, such as processing power, memory, disk and network I/O, and so on. The guest virtual machine is a completely separate and independent instance of an operating system and application software. Guests are the virtual workloads that reside on a host server and share that server's computing resources.
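The host and guest relationship can be pictured with a small Python sketch. This is only a toy model: the `Host` and `Guest` classes, their fields, and the memory-only admission rule are assumptions made for illustration, not how any real hypervisor accounts for resources.

```python
from dataclasses import dataclass, field


@dataclass
class Guest:
    """A guest VM: an independent OS instance with its own resource allotment."""
    name: str
    vcpus: int
    memory_gb: int


@dataclass
class Host:
    """The physical server whose resources the guests share."""
    cpus: int
    memory_gb: int
    guests: list = field(default_factory=list)

    def free_memory_gb(self) -> int:
        # Memory not yet handed out to any guest.
        return self.memory_gb - sum(g.memory_gb for g in self.guests)

    def place(self, guest: Guest) -> bool:
        """Admit the guest only if enough memory is left (no overcommit in this toy)."""
        if guest.memory_gb > self.free_memory_gb():
            return False
        self.guests.append(guest)
        return True


host = Host(cpus=16, memory_gb=64)
print(host.place(Guest("web", vcpus=4, memory_gb=16)))    # True
print(host.place(Guest("db", vcpus=8, memory_gb=32)))     # True
print(host.place(Guest("batch", vcpus=8, memory_gb=32)))  # False, only 16 GB left
print(host.free_memory_gb())                              # 16
```

Each guest sees its own allotment, while the host keeps the single ledger of what is actually available.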
Virtual Machine Monitor
System virtual machines are capable of virtualizing a full set of hardware resources, including a processor (or processors), memory and storage resources and peripheral devices. A Virtual Machine Monitor (VMM, also called a hypervisor) is the piece of software that provides the abstraction of a virtual machine. There are three properties of interest when analyzing the environment created by a VMM:
- Equivalence/Fidelity;
- Efficiency/Performance;
- Safety/Resource Control.
Popek and Goldberg Criteria
The Popek and Goldberg criteria are a set of conditions sufficient for a computer architecture to support system virtualization efficiently. They were introduced by Gerald Popek and Robert Goldberg in their 1974 article "Formal Requirements for Virtualizable Third Generation Architectures". The paper establishes three essential characteristics for system software to be considered a VMM:
- Equivalence/Fidelity: The software running under the VMM should exhibit a behavior essentially identical to that demonstrated when running directly on equivalent hardware, barring timing effects;
- Efficiency/Performance: A vast majority of machine instructions must be executed by the hardware without VMM intervention;
- Resource Control/Safety: The VMM must be in complete control of the virtualized resources.
Even though these requirements are derived under simplifying assumptions, they still represent a convenient way of determining whether a computer architecture supports efficient virtualization and provide guidelines for the design of virtualized computer architectures.
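To make the test concrete: the paper's main theorem, stated informally, says that a VMM can be built when every sensitive instruction is also a privileged one (so that it traps when run outside the most privileged mode). The sketch below checks that condition on two toy instruction sets. The "clean" architecture and its instruction names are hypothetical, and the x86 sets are a small illustrative subset, not a complete classification.

```python
def is_virtualizable(sensitive: set, privileged: set) -> bool:
    """Popek and Goldberg condition: sensitive instructions must all be privileged."""
    return sensitive <= privileged


def offenders(sensitive: set, privileged: set) -> set:
    """Sensitive instructions that would NOT trap in a less privileged mode."""
    return sensitive - privileged


# Hypothetical architecture that satisfies the condition.
clean_privileged = {"LOAD_PAGE_TABLE", "HALT", "SET_INTERRUPT_FLAG"}
clean_sensitive = {"LOAD_PAGE_TABLE", "SET_INTERRUPT_FLAG"}

# Illustrative subset of classic x86 (before hardware virtualization extensions).
x86_privileged = {"LGDT", "LIDT", "HLT"}
x86_sensitive = {"LGDT", "LIDT", "POPF", "SGDT", "SMSW"}

print(is_virtualizable(clean_sensitive, clean_privileged))  # True
print(is_virtualizable(x86_sensitive, x86_privileged))      # False
print(sorted(offenders(x86_sensitive, x86_privileged)))     # ['POPF', 'SGDT', 'SMSW']
```

A non-empty offender set is exactly what makes an architecture awkward to virtualize with a plain trap-and-emulate monitor.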
CPU Protection Levels
The x86 architecture offers a range of protection levels, also known as rings, in which code can execute and which operating systems and applications use to manage access to the computer hardware. Ring 0 has the highest privilege level, and it is in this ring that the operating system kernel normally runs. Code executing in ring 0 is said to be running in system space, kernel mode or supervisor mode. All other code, such as applications running on the operating system, operates in less privileged rings, typically ring 3.
Keep in mind that the CPU privilege level has nothing to do with operating system users. Whether you’re root, Administrator, guest, or a regular user, it does not matter. All user code runs in ring 3 and all kernel code runs in ring 0, regardless of the OS user on whose behalf the code operates.
The term "ring" (as it applies to x86 architecture machines and Windows) appears to refer to the original 80386 architecture reference manual's drawing of the four levels of protection as concentric circles of operations. Thus, Ring 3 was the outermost ring and was the most restricted, allowing only the execution of instructions that could not affect overall processor state. Ring 0 was the innermost ring and allowed total control of the processor.
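As a rough illustration of what the rings enforce, here is a toy model in Python. The `ToyCPU` class and the `GeneralProtectionFault` exception are invented for this sketch; real processors perform this check in hardware, and the privileged set shown is only a small sample.

```python
class GeneralProtectionFault(Exception):
    """Stands in for the fault the CPU raises on a privilege violation."""


class ToyCPU:
    # Small illustrative subset of instructions allowed only in ring 0.
    PRIVILEGED = {"HLT", "LGDT"}

    def __init__(self, ring: int):
        self.ring = ring  # current privilege level: 0 (kernel) to 3 (user)

    def execute(self, instruction: str) -> str:
        if instruction in self.PRIVILEGED and self.ring != 0:
            raise GeneralProtectionFault(
                f"{instruction} not allowed in ring {self.ring}"
            )
        return f"{instruction} executed in ring {self.ring}"


kernel = ToyCPU(ring=0)
app = ToyCPU(ring=3)

print(kernel.execute("HLT"))  # HLT executed in ring 0
print(app.execute("ADD"))     # ADD executed in ring 3
try:
    app.execute("HLT")
except GeneralProtectionFault as fault:
    print("fault:", fault)    # fault: HLT not allowed in ring 3
```

Note that nothing in the check depends on which OS user started the program, only on the ring the code runs in.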
Problems in x86 Virtualization
Unlike mainframes, the x86 architecture was not designed to support full virtualization. x86 operating systems are designed to run directly on the bare-metal hardware, and certain sensitive instructions expect the OS to be interacting directly with that hardware, making it difficult to realize ‘true’ virtualization (as defined by the Popek and Goldberg criteria).
Virtualizing the x86 architecture requires placing a virtualization layer under the operating system (which expects to be in the most privileged Ring 0) to create and manage the virtual machines that deliver shared resources. Further complicating the situation, some sensitive instructions can’t effectively be virtualized as they have different semantics when they are not executed in Ring 0. The difficulty in trapping and translating these sensitive and privileged instruction requests at runtime was the challenge that originally made x86 architecture virtualization look impossible.
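The problem can be sketched with a toy trap-and-emulate monitor. In this simplified model, which is an assumption made for illustration, the guest kernel runs outside Ring 0, `CLI` faults and hands control to the VMM, while a `POPF` that tries to clear the interrupt flag is quietly ignored by the CPU without any fault. `ToyVMM` and the `POPF_CLEAR_IF` label are hypothetical names.

```python
class ToyVMM:
    """Trap-and-emulate monitor for one deprivileged guest (a toy model)."""

    def __init__(self):
        self.virtual_interrupts_enabled = True  # the guest's view of its flag
        self.traps = []                         # instructions the VMM got to see

    def run_guest_instruction(self, instruction: str) -> None:
        if instruction == "CLI":
            # Faults at the guest's ring: the VMM takes over and emulates it.
            self.traps.append(instruction)
            self.virtual_interrupts_enabled = False
        elif instruction == "POPF_CLEAR_IF":
            # Sensitive but not privileged: outside ring 0 the CPU ignores
            # the interrupt-flag change and raises no fault, so the VMM
            # never learns that the guest wanted interrupts off.
            pass
        # Any other instruction runs directly on the hardware.


trapping = ToyVMM()
trapping.run_guest_instruction("CLI")
print(trapping.traps, trapping.virtual_interrupts_enabled)  # ['CLI'] False

silent = ToyVMM()
silent.run_guest_instruction("POPF_CLEAR_IF")
print(silent.traps, silent.virtual_interrupts_enabled)      # [] True
```

In the second case the guest believes interrupts are disabled while the VMM's state says otherwise, which is the kind of divergence that breaks equivalence.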
VMware was the first company to find a way around this obstacle, using a method called binary translation, in which a software layer known as the Virtual Machine Monitor (VMM) intercepted the problematic instructions and converted them into ‘safe’ instructions that could be virtualized.
In 1998, VMware figured out how to virtualize the x86 platform, developing binary translation techniques that allow the VMM to run in Ring 0 for isolation and performance, while moving the operating system to a user level ring with greater privilege than applications in Ring 3 but less privilege than the virtual machine monitor in Ring 0. This combination of binary translation and direct execution on the processor allows multiple guest OSes to run in full isolation on the same computer.
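The idea behind binary translation can be shown with a deliberately tiny sketch. It is not VMware's implementation: the instruction list, the `SENSITIVE` set, and the `vmm_emulate_*` call names are all hypothetical, and a real translator works on machine code and caches translated blocks.

```python
# Illustrative subset of instructions the VMM must not let run unmodified.
SENSITIVE = {"POPF", "SGDT", "CLI"}


def translate(block: list) -> list:
    """Rewrite a block of guest kernel code before it runs.

    Sensitive instructions are replaced by calls into the VMM, which
    emulates them against the virtual machine's state. Everything else
    is passed through unchanged for direct execution on the processor.
    """
    translated = []
    for instruction in block:
        if instruction in SENSITIVE:
            translated.append(f"CALL vmm_emulate_{instruction.lower()}")
        else:
            translated.append(instruction)
    return translated


guest_code = ["MOV", "POPF", "ADD", "CLI"]
print(translate(guest_code))
# ['MOV', 'CALL vmm_emulate_popf', 'ADD', 'CALL vmm_emulate_cli']
```

Because the rewrite happens before execution, it does not matter whether the original instruction would have trapped or not.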