Speaking of restores

By Juan Orlandini, Chief Technology Officer, North America & Distinguished Engineer
5/31/2011

vStorage VADP APIs

I mentioned before that the vStorage APIs are VMware's newer, faster, cooler, and actually usable way to handle backups and storage. It's time we explored the backup side a little more.

VADP (vStorage APIs for Data Protection) is a comprehensive set of APIs that VMware's ISV partners (backup software vendors, for this discussion) can use to build robust backup and recovery tools. The APIs cover three key areas we should talk about:

1. Discovery
2. Data access/transport
3. Recovery

Each of these is a rich topic, so I'm only going to cover them briefly to introduce them. I'd love to hear back from you if you want me to offer more details on any of these.

Discovery
Before you can protect something, you first have to know what you are protecting. Sounds simple, right? Well, it turns out that in VMware environments in particular this is actually kinda hard. Because of the very nature of VMware, the number of virtual machines (VMs) and their locations change constantly. We do a number of assessments for our customers, and in many cases we find that the backup teams are not aware of the changes the server and virtualization teams are making. It is not unusual for us to find that the backup team thought it was protecting everything, when in reality the VMware team had neglected to mention a significant number of additions. Oops.

Well, VADP gives the ISVs a way to solve this. The APIs let the backup software query either vCenter or the ESX/ESXi hosts directly to enumerate all of the VMs in the environment. With this, the backup software can discover what VMs exist, what their features are (hardware version, CBT enabled, etc.), and where they currently live. It's then up to the ISV to figure out how best to define a protection scheme for those machines, and the backup vendors are currently battling over a number of things on this front. The more advanced solutions use this information to determine whether an existing backup policy is protecting a VM and, if not, assign it to a default "fail safe" policy. For VMs with existing policies, they can also determine the best backup method and where to store the data. Some can even automatically determine whether it makes sense to use de-dupe, CBT (more on this later), full, incremental, or a combination of those for backups.

The key here is that for larger enterprise environments, static policy definitions are never going to be enough. VMware enables ISVs to be much more proactive about protecting the environment. Think of it this way: in the physical world, prior to VMware, how did you find out about a new machine on your network? Typically, the server, application, or database team would have to tell you that it exists. Oh and they also had to tell you that it didn't exist when it got retired. Unless some very sophisticated, complicated, and custom written automated discovery mechanism was put in place, the only way you were sure to capture all the new servers was by process. Process means people are involved. That means it was very liable to break. Nothing's perfect, but with VADP it's now possible to set up automated, rules-based discovery and protection of everything in your virtual environment. If the server team "forgets" to tell you about a new VM, you are still good. Yeah, discovery.
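To make the "fail safe" idea a bit more concrete, here is a minimal Python sketch of rules-based discovery. It does not call the real VADP or vCenter APIs. The VM record, the policy names, and the inventory list are all made up for illustration, and a real product would fill the inventory from a vCenter or host query.

```python
from dataclasses import dataclass

# Hypothetical record for what a discovery query might return per VM.
@dataclass
class VM:
    name: str
    hardware_version: int
    cbt_enabled: bool
    host: str

# Hypothetical name for the catch-all policy.
FAILSAFE_POLICY = "failsafe-daily-full"

def assign_policies(inventory, existing_policies):
    """Map every discovered VM to a policy, using the fail-safe one if none exists."""
    return {
        vm.name: existing_policies.get(vm.name, FAILSAFE_POLICY)
        for vm in inventory
    }

def retired(inventory, existing_policies):
    """Names that still have a policy but no longer show up in the inventory."""
    live = {vm.name for vm in inventory}
    return sorted(name for name in existing_policies if name not in live)

# Pretend this came back from a discovery run.
inventory = [
    VM("web01", 7, True, "esx-a"),
    VM("db01", 7, True, "esx-b"),
    VM("test-vm-17", 4, False, "esx-a"),  # nobody told the backup team about this one
]
existing_policies = {"web01": "gold", "db01": "gold", "old-app": "silver"}

assignments = assign_policies(inventory, existing_policies)
stale = retired(inventory, existing_policies)

for name, policy in assignments.items():
    print(f"{name}: {policy}")
print("policies with no VM behind them:", stale)
```

The point is that both directions are covered: the forgotten VM lands in the fail-safe policy, and the policy for the retired machine gets flagged instead of failing quietly every night.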

Data access/transport
Before you can back something up, you have to have a way of getting to the information. In traditional backups this is typically not a problem. You just install the client on the machine you want to back up and it gets direct access to all of the data. You can do some fancy stuff and do third-party copies (a.k.a. off-host backups), but that's not the norm. Well, with VMware, you can continue to do the client-in-the-VM thing, but that doesn't leverage VADP. What you get with VADP is the ability for a "backup" server to act as a proxy for your VM. The proxy can get to the data in several ways: through the network (NBD transport), or by directly accessing the VMDK images that constitute your VMs, either over the SAN or via SCSI hot add. All of these go through the VDDK library. With the network transport, the ESX/ESXi host is involved and moves the bits (in or out) for the proxy host. With the direct modes, the proxy host reads and writes the VMDK itself. Hot add requires the proxy to be a VM, so it still runs on an ESX/ESXi host. If your proxy is a separate physical machine with SAN access to the datastore LUNs, your ESX/ESXi hosts are not involved in any data movement. This is particularly good if you have lots and lots of data.
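As a rough illustration of how a backup product might pick a transport for a given proxy, here is a small Python sketch. The function and its parameters are my own invention, not part of VADP. It assumes the usual rule of thumb: SAN access for a physical proxy that can see the datastore LUNs, hot add for a virtual proxy that can reach the datastore, and the network (NBD) as the fallback.

```python
def choose_transport(proxy_is_virtual: bool, proxy_sees_datastore: bool) -> str:
    """Pick a transport label for a proxy (hypothetical helper, not a VADP call)."""
    if proxy_sees_datastore and not proxy_is_virtual:
        # Physical proxy zoned to the datastore LUNs: the ESX/ESXi hosts move no data.
        return "san"
    if proxy_sees_datastore and proxy_is_virtual:
        # Virtual proxy: the VMDK is attached to the proxy VM as an extra disk.
        return "hotadd"
    # No direct path to the storage: the ESX/ESXi host ships the bits over the LAN.
    return "nbd"

for virtual, sees in [(False, True), (True, True), (False, False), (True, False)]:
    kind = "virtual" if virtual else "physical"
    print(f"{kind} proxy, datastore visible={sees}: {choose_transport(virtual, sees)}")
```

Real products typically try the faster paths first and fall back to the network when they are not available, which is roughly what this ordering does.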

"Wait!" – you say. "Doesn't VCB use the same stuff?" Well, kinda. The concepts are similar, and some of the underlying code is similar, but the devil's in the details. We don't have time to get into the nitty gritty of why it's different, but let me tell you this: you no longer need a staging area, and it works now. With VCB, you had two choices of how to get at your data. The slow and the fast. The slow way would let you essentially mount the VMDK on the proxy host and you could do a file-level backup. But it was really ... really ... slow. In the fast option, you didn't mount the VMDK. Instead you made a copy of it from the VMFS file system to a temporary staging location. The backup application then had to copy from the temporary staging location to tape or disk or whatever. You were moving the data at least twice. Yuck.

With VADP you no longer need the staging area for the backups. Yup. That's a good thing. The backup application can read the data directly from the VM and put it on safe secondary storage. That's not all that VMware did. VMware put a ton of effort into refining the tool set. In my opinion, the best of these refinements is a thing called CBT (Changed Block Tracking). To leverage CBT, your VMs must be at hardware version 7 or later. Once CBT is enabled on a VM, backup software can, in very basic terms, say to VMware, "I'm doing a full backup; keep track of any changes after I'm done." When it comes to the next incremental backup, the backup software can say, "Hey VMware, what has changed since the last time I backed that VM up?" VMware, very efficiently, keeps track of the changes at the block level for each VM that has this turned on. So it just sends a message back to the backup application saying, "These are the blocks that have changed." The backup app can then use the same transports to gather only those blocks.
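Here is a toy Python model of that conversation. It is a sketch of the idea only: the class and function names are made up, the "disk" is a 16-byte string, and the tracking is reset on every query. The real vSphere mechanism works with change IDs tied to snapshots (as far as I know, the call is QueryChangedDiskAreas) rather than a simple reset.

```python
BLOCK_SIZE = 4  # bytes per block; absurdly small so the demo is easy to follow

class TrackedDisk:
    """Toy virtual disk that remembers which blocks were written to."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.changed = set()

    def write(self, offset: int, payload: bytes) -> None:
        self.data[offset:offset + len(payload)] = payload
        first = offset // BLOCK_SIZE
        last = (offset + len(payload) - 1) // BLOCK_SIZE
        self.changed.update(range(first, last + 1))

    def read_block(self, index: int) -> bytes:
        start = index * BLOCK_SIZE
        return bytes(self.data[start:start + BLOCK_SIZE])

    def query_changed_blocks(self) -> list:
        """'What has changed since the last time I asked?' Resets the tracker."""
        blocks = sorted(self.changed)
        self.changed.clear()
        return blocks

def full_backup(disk: TrackedDisk) -> dict:
    disk.query_changed_blocks()  # start tracking from a clean slate
    count = (len(disk.data) + BLOCK_SIZE - 1) // BLOCK_SIZE
    return {i: disk.read_block(i) for i in range(count)}

def incremental_backup(disk: TrackedDisk) -> dict:
    # Read only the blocks the tracker reports as changed.
    return {i: disk.read_block(i) for i in disk.query_changed_blocks()}

disk = TrackedDisk(b"AAAABBBBCCCCDDDD")
full = full_backup(disk)

disk.write(5, b"xy")   # lands inside block 1
disk.write(11, b"zz")  # straddles blocks 2 and 3

incr = incremental_backup(disk)

# Restoring means laying the incremental blocks over the full backup.
restored = {**full, **incr}
image = b"".join(restored[i] for i in sorted(restored))

print("blocks in full backup:", len(full))
print("blocks in incremental:", sorted(incr))
print("restore matches disk:", image == bytes(disk.data))
```

The incremental never scans the whole disk. It reads only the blocks the tracker hands back, and that is where the speed comes from.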

Voila! Super quick backups.

Data recovery
But, ahh ... it's not about backups. At some point, you will need to restore some, if not all, of the data in your environment. Fortunately, VMware learned a few things with VCB. What you can now do is leverage the same transports to quickly recover data directly to the data store that hosts your VMs. Yup. You no longer need a staging area for restores either! High five, VMware! Why's this so good? Well, with VCB you had to restore the data from secondary storage to a temporary disk location. Then you had to use VMware Converter to take that restore and shove it back into a VMFS/NFS data store and let vCenter know that it happened. Double yuck.

It gets even better. Some of the backup applications are even smarter than this. They are leveraging some very clever proprietary technology so that you can do file-level restores even though you did a block-level backup. The details of how this is done differ from vendor to vendor, but the gist of it is that they can parse through the backup streams and determine what files are there on the fly. Well ... some of them kinda do it on the fly and others really do it on the fly. At some point, I'll have to move on from this theoretical stuff and get into the weeds about product differences. But not today. The good news is that we now have an API from VMware and integration by the various ISVs that give you reliable backups and restores for your VMware environment.

Except for two things.

VADP does not provide a way to do application-consistent backups, and VADP still moves data.

Huh? I'll post about that next. You really should stop moving data when you are trying to protect it.
