My personal experience with VMware’s ESXi. Part 1


I finally did it: I broke the piggy bank, purchased a bunch of hardware, and built myself a new “server” for my house. My intention was to VM the crap out of this thing in order to reduce the sheer number of single-purpose machines in my house.

It turns out that I’ll actually only be able to reduce the machine count by 1, but every little bit helps and I will be adding 5 times the functionality. So I guess considering what it could have been I’m still coming out ahead.
I had to scale back my original plans for the physical hardware for this server due to budget constraints. (You can see my original plan here). What I ended up with was a Biostar mobo combo deal from Newegg. It’s an AMD 64 6000 processor, a 3GHz dual core. The motherboard is a Biostar…something…cheap. The bundle also came with what I thought was 4GB of memory, which turned out to be only 2GB. Whatever. Good enough for now. According to the motherboard specs, it supports RAID 0, 1 and 10. I picked up two 1TB drives to plug into that board for the data and a 160GB drive for the OS. After getting through this whole VM thing, I’m not sure there is much merit to that configuration anymore.
After slapping all the hardware into a case I had lying around, I fired it up. To my delight, everything sprang to life. The drives spun up, the BIOS saw everything, I had video, life was good. So I put the server into its temporary space, where I would be doing most of the initial configuration before phasing out my existing hardware. I burned the ESXi ISO to a CD and popped it in the drive. The server started to boot up, loading file after file, with little dots after each file to show the progress. It looked like it was about to do something and BANG! Error! Now the fun begins.
Typically my experience with things like Linux, BSD, Mac, and web content software has been frustrating. I figured that would also be the case with ESXi. You get a bunch of people telling you how easy it is, how you should do this because it’s free and it’s awesome, and it’s soooo eeeaaasy it practically configures itself. “Yeah, just slap some hardware together and boot off the CD. It works great. I’m running it, it’s awesome.” You’re a fool if you don’t do whatever it is they are pushing. I find that these people are wrong almost 100% of the time. It’s never as easy as they say. “Yeah dude, just grab some parts from around your house, put ‘em all in a box, shake it up and BANG, frigg’n Linux will be running, oh, and it’s SOOO much better than Windows.” They will continue on with how my life will be so much better, I’ll lose weight, hair will grow where I need it and not where I don’t, my taxes will magically get done…oh, and did you know that Linux actually generates its own power and therefore, not only does it not need power to run, it will generate enough energy to run your old, archaic Windows boxes too!
In just about every case this couldn’t be further from the truth. I typically end up with a boot CD that doesn’t work, incompatible hardware, and an internet scavenger hunt to find all the components I need to make all the basic functions work. “Dude! Of course you need to get ‘red_pixel_GNU_V4.3.7.3.8.3.000012.TAR.Zip, green_pixel_x4V4.2.8.3.78-1.TAR and blue_pixel_1200ZV5.7.2.8.9.34.5.001-2-stable.TAR.GZ’. Linux comes with all the black pixels, but you need to download and compile the binaries for red, green and blue. Freak’n N00B.”
Anyway, after getting my first show-stopping error I figured this was falling right in with all the other nerd crap out there and I was going to be spending the next 6 weeks downloading little bits of code and programs from 15 different obscure basement web and FTP servers and sifting through thousands of pages of ‘documentation’ explaining how to implement all of this. My blood pressure started to rise. So I read the error. “ESXi requires 2000MB of memory to run only 1700MB found…” What? I know there is 2GB in there. What is the….oh… the onboard video was taking roughly 256MB. So I go digging through my nerd pile, find a 3rd party video card and slap it in. Boom! I get a little further, but I’m faced with another error, “failed to load lvmdriver”. Let the internet scavenger hunt begin. With the first search in Google, I find that this is a very common error. Somehow “failed to load lvmdriver” translates directly to “Your NIC isn’t supported”. Again to the nerd pile. Bam, I find an older USR PCI 10/100 NIC, slap it in and reboot. Same error. No problem, I’ve got an old 3COM PCI NIC hang’n around, I’ll slap THAT in. Come on, who doesn’t support 3COM NICs? I’ll tell you who: VMware, that’s who. So it’s back to Google. The second article I read is by someone with the same problem, and toward the end of the article he mentions that he tried a 3COM and it didn’t work either. So after consulting the HCL on VMware’s ESX page, it turns out that it pretty much only supports Broadcom and Intel NICs. Crap. I don’t have any of those, and typically those are embedded chipsets on name brand machines. I pack it up for the night and figure I’ll take a look around in some scrap machines at work in the morning and see what I can find.
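For what it’s worth, the memory problem is simple arithmetic once you know the onboard video is eating into system RAM. Here is a rough sketch of the check in Python. The 2000MB minimum comes straight from the error message; treating 2GB as 2048MB and the video share as exactly 256MB are my assumptions (the installer reported roughly 1700MB, so the real numbers were a bit different), and the function names are made up for illustration.

```python
# Rough sketch of the installer's memory check.
# Assumptions: 2GB installed = 2048MB, onboard video takes exactly 256MB.
ESXI_MIN_MB = 2000  # minimum quoted in the installer error message


def usable_memory_mb(installed_mb, shared_video_mb):
    """Memory left for the hypervisor after onboard video takes its share."""
    return installed_mb - shared_video_mb


def meets_minimum(installed_mb, shared_video_mb, minimum_mb=ESXI_MIN_MB):
    """True if the memory left over is at least the installer's minimum."""
    return usable_memory_mb(installed_mb, shared_video_mb) >= minimum_mb


print(usable_memory_mb(2048, 256))  # 1792
print(meets_minimum(2048, 256))     # False: onboard video enabled
print(meets_minimum(2048, 0))       # True: separate video card installed
```

With a separate video card nothing is carved out of system RAM, which is why swapping the card got me past this error.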
The next day I come home with an Intel NIC I found. I slap that puppy in and fire it up. SUCCESS! It boots up and starts running as advertised! I go through the install, which actually is VERY simple. Pretty hands off really. A few minutes later, I’ve got ESXi installed. The console interface is pretty limited, but it allows you to change the IP address and root password, and I think the host name.
So, there it is, ESXi in all its glory. On the main console screen is the URL where this ESXi host can be managed. It’s basically the URL of the machine. At this point you are done with your host box. You can pull the monitor off and never expect to need it again. All the management and configuration happens via the client on another workstation.
On one of my desktop machines, I enter the URL provided by the host machine into my browser. A webpage generated by the host appears and provides me with a link to download the vSphere client, which will allow me to create VMs and allocate resources. This is pretty cool. It’s very easy to use and understand: click click click and I’m starting to create my first VM.
My first VM is going to be a Windows 2003 Domain Controller and be the new DNS server for my network. Pretty simple setup. There’s a REALLY nice feature of the client that allows you to attach to an ISO anywhere on your client PC and it maps as a drive letter in vSphere. This drive letter then is available to all your VMs. For example, I’ve got the ISO for Windows 2003 Server on my desktop. I attach that ISO as drive letter “D” and my VM now sees it. In fact, ALL my VMs see it. Very cool.
I click on the “create new VM” button and I’m immediately prompted for the specs of my new VM. I’m asked how many processors I want to allocate to the VM, how much hard drive space (more on this later), what operating system I’m going to put on there, how much memory I want to give it, how many NICs, etc… Click click click. Done. I give this VM one processor (the dual core shows up as two processors), 1GB of memory and a 60GB partition on the hard drive for the OS. Not much else is going on this thing, so I figure I’m good.
So, I “power on” my first VM and the BIOS screen pops up and it starts to boot from my ISO file mapped as drive D. It launches right into the Server 2003 install I’m used to seeing. From this point on it was just like installing Windows Server 2003 on any machine. I go through the process. Thirty or so minutes later I’m done. I’ve got my Windows 2003 Server setup on a new domain and ready to go. I hit it from an RDP client, that works, I can get to the internet, that works, I hit a share via UNC path, that works. I start running the MS updates to get all the latest patches and goodies. Everything is working. Wow. It’s just like a regular machine. Sweet. On to the next VM.
Before I get too much further, I need to back up to before installing ESXi. When I first got video on my new system I jumped into the BIOS and configured my two 1TB hard drives into a mirror array. With only two drives my options were limited, so I went for redundancy over space. I figured I could always add more drives later if I wanted to. Fast forward to creating this first VM, and I noticed that ESXi saw my drives as two separate disks, completely ignoring the RAID configuration. After some research I found that my RAID controller wasn’t supported either. So, I went BACK into the BIOS, turned off the RAID function and now have two separate “data” drives. I figured I’d use one for all the little VMs and reserve one of the drives just for my file server. Eh, not ideal, but it was all I could think of to do.

VM number two is going to be my file server. It’s again a Windows 2003 Server joined to the domain, with one volume for the OS and a separate (1TB) volume for data. I create the VM with just the OS volume for now and go through the whole setup process. Done. Now to add the data volume. It’s a pretty simple process: click, add, set size… whoops! It’s a 1TB drive with nothing on it, but it’s only letting me partition 250GB of it. Not much of a file server; a few movies and I’m screwed. Hmmm… Let’s try it again… add, set size… BONK. 250GB is the max volume size. What the HECK?!?! Another scavenger hunt….
After several Google searches, I find some information related to the “block size” the drives were formatted with. Ok, I think I remember that setting from when I was setting up the drives, sure. I just took the default. Don’t do that. The default block size limits your volume size to 250GB. Crap. How do I change this? Back to Google.
It turns out there are a few ways of changing the block size; unfortunately they all blow away everything on ALL the drives on the VM host. So my other VM and the one I just built are getting smoked. Whatever, rebuild. Back to step 1.
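If you want to avoid my mistake, work out the block size before you format anything. The little Python sketch below picks the smallest block size that allows a virtual disk of a given size. The lookup table is an assumption on my part: it is what I understand the VMFS-3 limits to be (1MB block gives roughly 256GB, which lines up with the 250GB wall I hit), so check it against VMware’s documentation for your version. The names are made up for illustration.

```python
# Assumed VMFS-3 limits: block size (MB) -> approximate max virtual disk size (GB).
# These figures are NOT from my install notes; verify them for your ESXi version.
MAX_DISK_GB_BY_BLOCK_MB = {1: 256, 2: 512, 4: 1024, 8: 2048}


def smallest_block_size_mb(disk_gb):
    """Return the smallest block size (MB) that allows a virtual disk of disk_gb."""
    for block_mb in sorted(MAX_DISK_GB_BY_BLOCK_MB):
        if disk_gb <= MAX_DISK_GB_BY_BLOCK_MB[block_mb]:
            return block_mb
    raise ValueError(f"{disk_gb}GB is larger than any block size in the table allows")


print(smallest_block_size_mb(60))   # 1: the default is fine for a small OS disk
print(smallest_block_size_mb(931))  # 4: a "1TB" drive is about 931GB usable
```

A bigger block size only raises the ceiling, so picking the size for the largest disk you ever expect to create on that datastore seems like the safe choice.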
I remove all my HDDs from my ESXi host and re-add them using the correct block size for the maximum partition size I need. Then I rebuild my DC and my file server and allocate the space. Done. Nice. I’m in my happy place. However, as I’m doing the MS updates I notice that the performance SUCKS. It’s killing me, it’s taking forever to log in, to download stuff, etc… It’s just miserable. I know I only have 2GB of RAM in there, so that’s probably what it is. So, I sort of shelve the thing and start looking for places to buy PC6400 memory relatively cheap.
In the meantime, I have family come out and stay with me, I spend a week in Wisconsin, then work picks up, I’ve got commitments after work, etc… Well, finally I have a few days off of work and I decide to just suck it up and deal with the performance issues. At some point I’ll just drop some memory in there and everything will be fine. So I log back in to my ESXi vSphere management software and I get a message saying that my trial period is up and I need to register. Well, I’ve been ignoring that message all along, because I have it from SEVERAL reliable sources that even if the trial expires you can still use it. You just don’t get access to a ton of the advanced features like VMotion and vCenter, etc… Whatever, I’m not using that stuff so who cares? Sure enough, I click “OK” and I’m off and running again. I log into each of my VMs, install the latest patches, add some users, play with DNS and set up my DHCP scope, etc… Now I’m ready to build my 3rd VM.
My 3rd VM is going to be a 2003 Exchange server, again with one partition for the OS and one for the store. No big deal. I go to create a new VM and it errors out. Because my trial expired, it appears that I can use EXISTING VMs but I just can’t create any new ones. So now I’m stuck… again. I need to rebuild the whole thing ALL over again. I shelved it again. Maybe over Christmas I’ll get back to it.
Of course at this point I realize that most of my problems were somewhat self-inflicted. I’m not too sure why the performance is so bad right now though. I mean, my existing server is an OLD Gateway Athlon 850 with 768MB of RAM, running Server 2000, Exchange 2003, and a bunch of other stuff. It’s a DC, DNS, web server, etc., and it runs great actually. So, I’m trying to figure out how best to allocate those resources for performance on my VMware box.
So, in the meantime, I’ve got some research to do and some things to figure out. I’m going to try and start fresh over Christmas and see how far I can get. What I’m attempting to illustrate is that ESXi is not as easy as it sounds, or as most people say it is. It might be if you have 100% Dell hardware, a SAN, and a crap load of memory. I don’t know. I’ll keep you all posted as I attempt it for the 3rd time.

Comments

Your Problems

You mentioned that indeed some of your problems were "self inflicted." VMware's HCL is pretty narrow and they only support very specific equipment. The sum total of my ESX(i)/VI3/vSphere experience has been on Dell server hardware (PE2950, PE1955 blades, and M600 blades) with QLogic HBAs, Dell's RAID controllers, and either Dell/EMC or IBM SANs. I've never tried running it in the type of environment you've described which, compared to the stuff I've worked with, is the technological equivalent of popsicle sticks and duct tape.

I did once put ESXi 4 on a Dell Precision workstation and it worked fine, except for some of the performance issues you described on the VMs due to low memory. ESX doesn't need much to run, but the VMs need whatever hardware resources you can give them. The more processors and memory you've got, the better your VMs are going to run.

Most of the hardware I run it on is dual quad-core machines with 32GB of RAM on fibre-attached shared storage. In that configuration, I can easily run 8-12 VMs. VMware's unofficial/official rule (one I've never gotten in writing but only verbally in phone conversations or face-to-face meetings) is that you shouldn't run more than 4-6 VMs per physical core. In my experience, processor cores are rarely the issue. If you run 4 VMs per core on a dual quad-core server, that's 32 VMs (2x4x4=32). You're going to choke on memory long before the processor becomes the bottleneck.
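To make the cores-versus-memory point concrete, here is a back-of-the-envelope sketch in Python. The 4 VMs per core figure and the dual quad-core/32GB host are the numbers from above; the 2GB per VM figure is purely a hypothetical average I picked for illustration, and the function names are made up.

```python
def cpu_bound_vms(sockets, cores_per_socket, vms_per_core):
    """VM count allowed by the VMs-per-physical-core rule of thumb."""
    return sockets * cores_per_socket * vms_per_core


def memory_bound_vms(host_ram_gb, ram_per_vm_gb):
    """VM count that fits in host RAM, ignoring hypervisor overhead."""
    return int(host_ram_gb // ram_per_vm_gb)


cpu_limit = cpu_bound_vms(sockets=2, cores_per_socket=4, vms_per_core=4)
mem_limit = memory_bound_vms(host_ram_gb=32, ram_per_vm_gb=2)  # 2GB per VM is an assumption

print(cpu_limit)                  # 32
print(mem_limit)                  # 16
print(min(cpu_limit, mem_limit))  # 16: memory is the limit, not cores
```

Run the same numbers for a 2GB host with 1GB VMs and the memory limit comes out at 2 before the hypervisor takes its own share.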

Our general cluster configuration - which, thanks to the magic of VMware's VMotion/HA/DRS, balances VMs across the hosts in our cluster quite nicely - seems to have about a maximum of 8 VMs on our blades (again, dual quad-core/32GB of RAM). After that, you can see VMware start wanting to find other places to put VMs because memory utilization on the physical host gets up above 70%. Still, an 8:1 ratio is pretty good when it comes to ROI calculations, especially over the duration of the hardware depreciation schedule.

So, you might be trying to put 10 pounds of stuff into a 5 pound bag and that might be part of your issue. Well, that and (totally understandable from a personal environment standpoint) not ponying up the $5,000 per ESX license to make the whole thing official. (That price includes 3 years of support, VMotion, VCB, HA, DRS, and the VMware Converter tools, so it's pretty high. A more entry-level price is probably around $1,500.)

VMware used to have their GSX version that was totally free but behaved like their Workstation product and, as such, didn't have many of the nice enterprise features. I'm not sure they still have anything like this anymore but, if they do, that might be the better route to go in your situation.