Tuesday, January 20, 2009
Devices (Toys) I Love
Wireless Headphones
Staying on top of the latest technologies requires absorbing material from the variety of media through which folks publish information. Videos, podcasts and Flash tutorials supplement the many blogs and white papers but can be more disruptive to people around me. Headphones are the natural solution, but I have found that those pesky wires are always getting in the way when I'm at the gym or want to get up for a drink refill.
Imagine my elation when I came across the Motorola S9 ROKR wireless bluetooth stereo headphones. I first saw them at Best Buy and found the original $150 price tag way too hard to swallow. Once I convinced myself that this was something that would certainly make my life just a little bit better I was on the prowl for a deal. As usual, NewEgg.com delivered and $50 later I was the proud owner of some NCSU Red ROKRs.
The S9 ROKR uses Stereo Bluetooth 2.0 and provides what I consider excellent sound quality for what I was asking from them. I found them extremely comfortable, but this too is definitely a matter of taste. The best part? They have a built-in microphone for managing calls when using a cell phone or for use with Skype. I later found out, to my chagrin, that my BlackBerry Pearl does not support Bluetooth 2.0. Naturally my wife's Pink Motorola RAZR connected right away and worked like a dream, but she cares little about these things thus far.
My primary use of the S9 ROKR is with my laptop. I love charging up the battery before traveling and playing whatever media I choose without the fuss of wires. It also comes in handy when I want to watch some training videos and not disturb anyone with the geeky information. I'm able to lean back, move around, grab a snack or refill and simply enjoy life without wires. Since I'm not really interested in walking around the gym with my wife's pink RAZR I'm just going to wait for a new BlackBerry before I get to use them the way I want but I definitely see years of happy use with this rechargeable wireless wonder.
Can you See me? WebCam
The other little wonder that I have found myself loving is my Logitech QuickCam Deluxe for Notebooks webcam. This past fall my mom and sister decided it was time for new computers so I pulled some strings, got them some great deals from the good folks at Dell, and purchased them some webcams for Christmas. The plan was to provide my mom the ability to see her kids and grandkids (coming 2009) without leaving her living room. I set them both up with Skype accounts, configured their webcams and showed them how to use them.
Setup with Skype was a cinch and I found the resolution great for the web at 640x480. The webcam simply clips nicely on my laptop monitor and came with a fantastic little carrying case for my laptop bag. The built-in software provides some fun little "effects" but not something I've found much use for. My only disappointment is the inability to easily use it with my desktop system, but I bought a NOTEBOOK webcam so no complaints there.
My mom loves being able to communicate this way to stay in touch so mission accomplished. I'm sure I'll get some business use for it soon as we roll out new features with our new phone system upgrade, especially if it's going to continue to snow like it did today. Stay warm and take care.
Thursday, January 8, 2009
Death by Snapshot
One of the greatest features in VMware's many virtualization technologies is the ability to take snapshots of virtual machines while they are powered on. Put simply, VMware snapshots save the system state of the virtual machine, providing a restore point. This technology is available and works very well on a variety of VMware platforms including Workstation, Server, and ESX. Unfortunately this feature carries its own risk to the performance and stability of your environment. In this entry I will discuss the destructive event of a snapshot consuming the remaining available space on a datastore.
Recovery Outline at End of post
How snapshots work
In preparing for this entry I came across an extremely helpful post by Eric Siebert where he very clearly explains the many aspects of snapshots and how they work. I'll summarize a few important points in this section but highly recommend reading his full post.
Snapshots can be initiated and managed through either the VI Client or the command line. Once initiated, the snapshot makes the base VMDK read-only and writes all changes to a new file. Each new differential (delta) file begins at 16MB and grows in 16MB increments as changes are made to the VM, but naturally cannot exceed the size of the original base file. These files will continue to grow until the snapshot is committed (the changes are applied back to the base VMDK) or disk space is depleted, which makes for a stressful afternoon if this happens.
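To make the growth pattern concrete, here is a minimal Python sketch of the sizing rule described above. It is a simplified model rather than VMware's actual allocation logic: the function name and parameters are my own, and it assumes the file always rounds up to the next 16MB step and is capped at the base disk size.

```python
import math

INCREMENT_MB = 16  # initial size and growth step described in the text


def delta_file_size_mb(changed_mb, base_disk_mb):
    """Approximate delta file size (MB) after `changed_mb` of changes.

    Simplified model: the file starts at one 16MB increment, grows one
    increment at a time, and never exceeds the base disk size.
    """
    # Even with no changes written, the file occupies one increment.
    steps = max(1, math.ceil(changed_mb / INCREMENT_MB))
    # Cap at the size of the base VMDK.
    return min(steps * INCREMENT_MB, base_disk_mb)


# A 40GB base disk with 17MB of changes needs two increments.
print(delta_file_size_mb(17, 40 * 1024))
```

The cap matters for planning: in the worst case, a single long-lived snapshot can need roughly as much free space as the disk it belongs to.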
Drive space is gone!
I have seen it happen several times. Someone, or something, initiates a snapshot in your ESX environment and forgets to remove that snapshot when they have completed their task. My personal experience has found that this typically becomes a problem when backup software uses snapshots and the snapshot isn't merged back. Snapshot usage in this manner is common for backing up VMs since it allows the full disk to be placed in a read-only state so that the copy can continue without interrupting the machine's ability to operate.
A live snapshot delta file for most VMs will not likely grow very quickly, except for servers with higher amounts of disk I/O such as Exchange, SQL, or file shares (especially when using Windows Volume Shadow Copy). If a growing snapshot file isn't discovered and either applied or reverted, the snapshot could consume the remaining available storage. Once this happens, the VM with the active snapshot can no longer write its changes to the delta file, causing the server to stop. Additionally, any other VMs writing to the filled datastore will also be forced to shut down. Fortunately any virtual machines on the datastore without snapshots, or an active swap file, will continue to run.
Note: If you are dealing with a single snapshot then no additional space is required to commit that snapshot to the original VMDK file, but I personally feel better when I have some extra room to move.
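For a rough sense of how much warning you have, the back-of-the-envelope sketch below divides free space by the observed growth rate. The function name is my own and the 20GB of free space is a made-up figure for illustration. The growth rate borrows the 70GB-in-3-days figure from our own incident, and it assumes growth stays constant, which real workloads rarely do.

```python
import math


def hours_until_full(free_gb, growth_gb_per_day):
    """Estimate hours until the datastore fills, assuming steady growth."""
    if growth_gb_per_day <= 0:
        # Snapshots that are not growing never fill the datastore.
        return math.inf
    return free_gb / growth_gb_per_day * 24


# Hypothetical: 20GB free, delta files growing about 70GB every 3 days.
print(round(hours_until_full(20, 70 / 3), 1))
```

Treat the result as an upper bound at best, since a backup window or a busy Exchange server can burn through the remaining space much faster than the average rate suggests.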
Make Room
If you are like me and have storage claustrophobia, you want to make some room on your datastore. If the VM is a critical server, it would be advisable to move another VM to a different datastore so that the afflicted VM can be restarted more quickly and to reduce any risks from trying to move a VM with a snapshot. I've seen mixed information about migrating with snapshots and its feasibility. I'll leave it to the reader in their own situation, but my suggestion is to play it safe and move another system without running snapshots.
Storage VMotion immediately comes to mind for this situation. Unfortunately, as Chad Sakac clearly explains in his blog post, Storage VMotion requires creating a snapshot in order to operate. Consequently, with no drive space, we're left with the horrid task of intentionally taking down a server to make room. In my situation we took down our intranet server since it was one of the smaller servers, would take the least amount of time to migrate, and would cause the least impact on employee productivity. The time to complete this task will depend on several factors, particularly the type and speed of storage that you are using. It took about 15 minutes for us to migrate our server to a new datastore.
Apply Snapshots
Once drive space has been created you should be allowed to start up the VM and commit the snapshots. Don't forget to turn the migrated server back on!
I have yet to receive a consensus on whether applying the snapshot on a live machine is better than leaving it powered off, but if the server in question is a main production box then it may be worth giving it a shot. You can expect that the process will take longer on a live machine. It will certainly be performing better than it was a few minutes ago!
Applying the snapshots can be a long, grueling ordeal depending on the amount of space you have consumed. Be patient and do not be surprised if you see your task time out in the VI client. VirtualCenter will time out any task at 15 minutes, but your process will still be running. Check to see if your process is complete by keeping an eye on the datastore browser in the VI client. You will be looking to see that the delta files are no longer there, and you will also note that there is storage available in your datastore again. You may need to hit refresh occasionally in order to witness the disappearing files.
Our environment provided us with over 70GB of snapshot files over the course of 3 days, which took approximately 90 minutes to apply. Eric Siebert speaks to this in the second part of his snapshot post where he states that "A 100 GB snapshot can take 3-6 hours to merge back into the original disk." Suffice it to say that the larger the snapshot the longer it will take, and the more storage "cushion" you have on the datastore, the greater your risk for a long wait.
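If you want a ballpark figure before you start, the sketch below scales a known commit rate to a different snapshot size. The function names are my own, and it assumes commit time grows linearly with snapshot size, which is only a guess: storage speed and live I/O on the VM will move the number considerably.

```python
def commit_rate_gb_per_min(snapshot_gb, minutes):
    """Commit throughput observed for one past snapshot merge."""
    return snapshot_gb / minutes


def estimate_commit_minutes(snapshot_gb, rate_gb_per_min):
    """Scale a known rate to a new snapshot size (linear assumption)."""
    if rate_gb_per_min <= 0:
        raise ValueError("rate must be positive")
    return snapshot_gb / rate_gb_per_min


# Our merge: roughly 70GB in 90 minutes.
our_rate = commit_rate_gb_per_min(70, 90)
# At that rate a 100GB snapshot would take about 129 minutes.
print(round(estimate_commit_minutes(100, our_rate)))
```

That is noticeably faster than the 3-6 hours quoted for 100GB, so it is safer to plan around the slower figure unless you have numbers from your own storage.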
Recovery
Once the snapshots have merged back into the original file you can get the VM back up and running (if you haven't done so already) and then Storage VMotion the server you moved previously back to its original datastore, if that is available in your environment.
Prevention
VMware does not provide any tools natively for monitoring active snapshots in your ESX environment. Third-party applications are available to help automate the process of finding these active snapshots. I have not personally used them yet, but Jason Boche mentions a few of them in his blog where he briefly displays Xtravirt SnapHunter, RVTools, and Hyper9.
I will probably get my hands on a couple of these in the coming weeks and will certainly provide some posts. If you are in a fix to get some monitoring on your snapshots it looks as though SnapHunter can notify you via email when you have snapshots or even commit them if you so choose.
If you want to go low tech and only manage a few machines, you can check for snapshots by looking in the VI client or keeping an eye out for delta files in the datastore browser. Be vigilant regardless of your method for tracking active snapshots. It certainly doesn’t look good to the bosses when your highly robust ESX environment fails your company, especially when it can be easily prevented.
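The low-tech check can also be scripted. The sketch below walks a directory tree and lists anything matching `*-delta.vmdk`, largest first. It assumes the datastore is reachable as an ordinary filesystem path (for example under `/vmfs/volumes` on the service console) and that Python is available wherever you run it. Neither is a given in every environment, and the function name is my own.

```python
from pathlib import Path


def find_delta_files(datastore_root):
    """Return (path, size_in_bytes) for every snapshot delta file found.

    Results are sorted largest first so the worst offenders show up
    at the top of the list.
    """
    results = []
    for path in Path(datastore_root).rglob("*-delta.vmdk"):
        if path.is_file():
            results.append((path, path.stat().st_size))
    results.sort(key=lambda item: item[1], reverse=True)
    return results
```

Calling `find_delta_files("/vmfs/volumes/datastore1")` (a hypothetical path) and printing the result from a scheduled job would give you a crude daily report, with an empty list meaning no active snapshots were found.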
Despite the agony that can be caused by an unchecked snapshot, VMware's snapshot feature is a true saving grace for the administrator and should be used without too much trepidation. The ability to apply a patch, test a deployment, or change a configuration and then quickly revert the system is more than I'd be willing to give up. Just keep your eyes open to the snapshots that are out there and everything should run smoothly and optimally, which the bosses definitely appreciate.
Recovery Outline
- Identify the server(s) affected and determine priority on bringing them back online.
- Shut down and cold migrate another virtual machine from the filled datastore to a new location. Not always necessary if you have only a single snapshot since applying a single snapshot requires no disk space.
- Apply snapshots to the affected server. You may power on the server if you prefer but this will have an adverse effect on performance and cause this step to take longer.
- Be patient. A 100GB snapshot could take 3-6 hours to commit. VirtualCenter will time out your task after 15 minutes so don't panic.
- Monitor the Datastore Browser in the VI client and wait for the delta file(s) to disappear. You will likely need to refresh occasionally which may take a moment to process each time.
- Once the snapshot is committed you can safely turn on the VM (if you haven't already) and hopefully breathe a sigh of relief.
- Play it safe and set up a system for monitoring your VMs for active snapshots, either through automated software like SnapHunter from Xtravirt or by simply watching for delta files in the datastore browser.