SQL University: Virtualization Basics
This week we’re going to talk about a topic that has been gaining steam over the last few years and, as it has, has started impacting database administrators’ worlds more and more: virtualization. Why do I make this statement? Well, since the economy currently sucks, shops are finding ways to consolidate and make their dollars stretch a little further. Back in the day, when you had a new application you pretty much went out and bought yourself some new servers and went on your merry way. Now, when money’s tight, folks are a little less likely to go out and simply buy new equipment for each individual application. Not only is this option expensive, there are other factors to think about, such as space (the data center may not have capacity for new servers), electricity and cooling.
Enter virtualization. Virtualization lets you rein in this server sprawl by buying a single physical server, filling it with plenty of your typical resources such as CPU, memory and drives, and then creating virtual servers on that one piece of hardware that look/act/feel like independent servers. This week we’re going to cover some basics of virtualization and the stuff you need to know if you’re going that route in your shop.
First things first, we need to familiarize ourselves with some basic terminology. These concepts are the same no matter what brand of virtualization you use, so don’t worry about specifics. Later on we’ll dive into the different platforms/vendors and what they offer, but for now we’ll stick with general concepts and terms.
Earlier I talked about buying a physical box to house your virtual machines. This physical server is referred to as a host. The host contains all the physical resources that we will be allocating to our virtual environments, including memory, CPU, networking and disks (I/O). Granted, you can attach other types of storage to your host, such as a SAN or NAS (which is common), but for these lessons we will treat storage as direct attached storage (DAS).
The hypervisor, also referred to as a virtual machine manager/monitor, is essentially a special type of operating system that is installed on the hardware (host). Its purpose is to provide a layer between the hardware and the guests so that multiple operating systems can share a single host and its resources. In a very simplistic way, think of the hypervisor as the traffic cop between each guest and the resources on the host. If multiple guests are asking for memory or CPU, the hypervisor is the one that doles out the goodies to everyone in a quick and efficient way. The hypervisor is the “secret sauce” of virtualization and what makes all the magic happen.
When you create a virtual machine on your host, it is referred to as a guest. A guest, or virtual machine (VM), runs as an independent machine. The beauty of virtualization is that you can create a multitude of guests on a host, each potentially running a different operating system. Once configured, a guest VM looks/acts just like a regular server or machine on the network. Each guest can be independently configured with its own resources such as virtual processors, memory and virtual disks.
The guest is where you will be running SQL Server. When you remote desktop into this machine and go to Control Panel/Device Manager, keep in mind you’re not seeing real hardware; you’re looking at virtual hardware that is presented to you by the hypervisor.
Technically abstraction isn’t a virtualization term, but it’s a concept you’re going to need to be really familiar with when virtualizing. Abstraction essentially means that something is presented in a simplified form while the complexity underneath stays hidden, so you don’t have to worry about it for your purposes. An example of this in the database world would be a view. You create a simple view that you can select from, and it looks like you’re selecting from a single table. In reality, that view’s definition is often a join of two or more tables that produces the result set for the view. By creating a view, you simplify the work for the end user by letting them query one “table” instead of having to do the work of joining several tables to get what they need.
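To make the view example concrete, here is a small sketch using Python's built-in sqlite3 module rather than SQL Server, just so it runs anywhere. The table, column and view names are made up for illustration; the point is only that the final query reads from one "table" while the join stays hidden inside the view.

```python
import sqlite3

# Hypothetical schema: every table, column and view name here is invented
# for illustration and is not taken from any real system.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);

    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);

    -- The view hides the join: this is the complexity being abstracted away.
    CREATE VIEW customer_orders AS
    SELECT c.name, o.order_id, o.total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id;
""")

# The end user queries the view as if it were a single table.
rows = conn.execute(
    "SELECT name, order_id, total FROM customer_orders ORDER BY order_id"
).fetchall()
print(rows)
```

The person running that last SELECT never has to know two tables are involved, which is the same relationship a guest has with the hardware underneath it.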
This concept extends over to the world of virtualization quite a bit. When you create a guest on a host, you create it with a certain set of resources. For example, say you create a virtual machine guest with 1 virtual processor at 2.5 GHz and 2 GB of RAM (keeping this simple for now). The abstraction occurs when the hypervisor creates the guest: it tells the guest ‘you have X amount of memory and you have Y CPUs that are Z fast’. Behind the scenes, however, the host is doing something else. While the operating system on the guest says “I have 2 GB of memory”, the host is really only allocating as much memory as the guest needs at the time. So in reality the host might be allocating only, say, 128 MB of memory to that guest at a given moment. The 2 GB you “gave” the guest can almost be viewed as more of a max memory option.
If the guest becomes really active and requires more resources, the hypervisor gets this request, allocates those resources from the host and passes them to the guest, up to the maximum of what you allocated to the guest. During this whole process, the guest is never made aware of any of the shenanigans going on behind the scenes. It’s simply a server with 2 GB of memory doing its typical routines! Now, on a simple system you may never notice any performance issue with this, and a good hypervisor makes the process seamless, so you should never really see it affecting your performance…until you do.
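If it helps to see that memory behavior as code, here is a toy model in Python. It is only a sketch of the idea described above, not how any real hypervisor is implemented: the class, method and guest names are all hypothetical, and the 4 GB host size is an assumption picked for the example.

```python
class ToyHypervisor:
    """Toy model only. Real hypervisors are far more involved than this."""

    def __init__(self, host_memory_mb):
        self.host_memory_mb = host_memory_mb
        self.configured_mb = {}  # what each guest was "given" (its maximum)
        self.backed_mb = {}      # what the host has actually allocated so far

    def create_guest(self, name, configured_mb):
        # The guest is told it has configured_mb, but nothing is backed yet.
        self.configured_mb[name] = configured_mb
        self.backed_mb[name] = 0

    def free_host_memory(self):
        return self.host_memory_mb - sum(self.backed_mb.values())

    def request_memory(self, name, wanted_mb):
        """Guest needs wanted_mb in total; return the MB the host now backs."""
        # Never back more than the guest's configured maximum.
        target = min(wanted_mb, self.configured_mb[name])
        grow_by = max(0, target - self.backed_mb[name])
        # Never hand out more than the host physically has left.
        self.backed_mb[name] += min(grow_by, self.free_host_memory())
        return self.backed_mb[name]


hv = ToyHypervisor(host_memory_mb=4096)       # assumed host size
hv.create_guest("sql01", configured_mb=2048)  # the guest "has" 2 GB

idle = hv.request_memory("sql01", 128)    # quiet guest: only 128 MB backed
busy = hv.request_memory("sql01", 3000)   # busy guest: capped at its 2 GB
print(idle, busy, hv.free_host_memory())
```

Throughout all of this the guest's view never changes: it believes it owns 2048 MB from the moment it is created, while the number in `backed_mb` moves up and down behind its back.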
The same thing happens with processing power. We gave our guest one virtual processor that runs at 2.5 GHz, so we’d expect that a CPU-intensive process would get a CPU running at 2.5 GHz. Again, an administrator has the ability to throttle these resources from the hypervisor, so while Activity Monitor within your guest can show 100% utilization, on the host side the CPU may really only be 25% utilized.
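The arithmetic behind that 100% versus 25% gap can be sketched in a few lines of Python. This is a deliberately rough simplification: the function names and the `cap_fraction` parameter are hypothetical, and a real hypervisor's CPU scheduling involves much more than a single multiplier.

```python
def host_side_utilization(guest_utilization_pct, cap_fraction):
    """Share of one physical core in use, given what the guest reports.

    cap_fraction is a hypothetical throttle set by the administrator:
    0.25 means the virtual processor may use at most 25% of one core.
    """
    if not 0 <= guest_utilization_pct <= 100:
        raise ValueError("guest_utilization_pct must be between 0 and 100")
    if not 0 < cap_fraction <= 1:
        raise ValueError("cap_fraction must be in the range (0, 1]")
    return guest_utilization_pct * cap_fraction


def rough_effective_ghz(nominal_ghz, cap_fraction):
    """Very rough estimate of the clock speed the guest really gets."""
    return nominal_ghz * cap_fraction


host_pct = host_side_utilization(100, 0.25)  # guest is pegged at 100%
ghz = rough_effective_ghz(2.5, 0.25)         # the "2.5 GHz" virtual processor
print(host_pct, ghz)
```

So a guest that looks completely maxed out may be consuming only a quarter of a physical core, which is worth remembering before you trust the CPU numbers you see from inside the VM.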
This is where understanding resource allocation and abstraction becomes crucial in architecting a proper virtualization environment. Now that you understand some of the core concepts, in our next lesson we’ll talk about how SQL Server fits into this whole picture and what you need to account for to ensure your virtualization project goes well.