The Approaching Backup (Hyper)Convergence #VFD5

When we talk about convergence in IT, it usually means bringing things together to make them easier to manage and use.  Network convergence, in the data center, brings together your storage and IP stacks, while hyperconvergence brings compute and storage together in a platform that can easily scale as new capacity is needed.

One area where we haven’t seen much convergence is the backup industry.  One new startup, fresh out of stealth mode, aims to change that by bringing together backup storage, compute, and virtualization backup software in a scalable, easy-to-use package.

I had the opportunity to hear from Rubrik, a new player in the backup space, at Virtualization Field Day 5.   My coworker, and fellow VFD5 delegate, Eric Shanks, has also written his thoughts on Rubrik.


Note: All travel and incidental expenses for attending Virtualization Field Day 5 were paid for by Gestalt IT.  This was the only compensation provided, and it did not influence the content of this post.


One of the challenges of architecting backup solutions for IT environments is that you need to bring together a number of disparate pieces, often from different vendors, and try to make them function as one.  Even when multiple components come from the same vendor, they’re often not integrated in a way that makes them easy to deploy.

Rubrik’s goal is to be a “Time Machine for private cloud” and to make backup so simple that you can have the appliance racked and starting backups within 15 minutes.  Their product, which hit general availability in May, combines backup software, storage, and hardware in a package that is easy to deploy, use, and scale.

They front this with an HTML5 interface and advanced search capabilities for virtual machines and files within the virtual machine file system.  Thanks to a local metadata cache, this works across both locally stored data and data that has been aged out to the cloud.

Because they control the hardware and software for the entire platform, Rubrik is able to engineer everything for the best performance.  They use flash in each node to store backup metadata and to ingest inbound data streams, deduplicating and compressing the data as it arrives.

Rubrik uses SLAs to determine how often virtual machines are protected and how long that data is saved.  Over time, that data can be aged out to Amazon S3.  They do not currently support replication to another Rubrik appliance in another location, but that is on the roadmap.
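To illustrate the idea, here is a minimal sketch of what an SLA-driven protection policy might look like. The field names and logic are my own invention for illustration, not Rubrik’s actual API:

```python
from dataclasses import dataclass

# Hypothetical sketch of an SLA-driven protection policy; field names
# are invented for illustration and are not Rubrik's actual API.
@dataclass
class BackupSLA:
    name: str
    snapshot_every_hours: int      # how often a VM is protected
    retention_days: int            # total time the backup data is kept
    archive_to_s3_after_days: int  # age-out threshold to Amazon S3

def tier_for(age_days: int, sla: BackupSLA) -> str:
    """Where a backup of the given age should live under this SLA."""
    if age_days > sla.retention_days:
        return "expired"
    if age_days > sla.archive_to_s3_after_days:
        return "s3"
    return "local"

gold = BackupSLA("gold", snapshot_every_hours=4,
                 retention_days=30, archive_to_s3_after_days=7)
print(tier_for(3, gold))   # -> local (recent backups stay on the appliance)
print(tier_for(14, gold))  # -> s3 (older backups age out to the cloud)
```

The appealing part of this model is that administrators declare intent (how often, how long, where) and the platform schedules the work, rather than administrators building job schedules by hand.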

Although there are a lot of cool features in Rubrik, it is a version 1.0 product.  It is missing some things that more mature products have, such as application-level item recovery and role-based access control, and it only supports vSphere in this release.  However, the vendor has committed to adding many more features, and support for additional hypervisors, in future releases.

You can watch the introduction and technical deep dive for the Rubrik presentation on YouTube.  The links are below.

If you want to see a hands-on review of Rubrik, you can read Brian Suhr’s unboxing post here.

Rubrik has brought an innovative and exciting product to market, and I look forward to seeing more from them in the future.

GPUs Should Be Optional for VDI

Note: I disabled comments on my blog in 2014 because of spammers. Please comment on this discussion on Twitter using the #VDIGPU hashtag.

Brian Madden recently published a blog arguing that GPU should not be considered optional for VDI.  This post stemmed from a conversation that he had with Dane Young about a BriForum 2015 London session on his podcast.

Dane’s statement that kicked off this discussion was:
“I’m trying to convince people that GPUs should not be optional for VDI.”

The arguments that were laid out in Brian’s blog post were:

1. You don’t think of buying a desktop without a GPU
2. They’re not as expensive as people think

I think these are poor arguments for adopting a technology.  GPUs are not required for general purpose VDI, and they should only be used when the use case calls for it.  There are a couple of reasons why:

1. It doesn’t solve user experience issues: User experience is a big issue in VDI environments, and many of the complaints from users have to do with their experience.  From what I have seen, a good majority of those issues have resulted from a) IT doing a poor job of setting expectations, b) storage issues, and/or c) network issues.

Installing GPUs in virtual environments will not resolve any of those issues, and the best practice is to turn off or disable graphics-intensive options like Aero to reduce the bandwidth used on wide-area network links.

Some modern applications, like Microsoft Office and Internet Explorer, will offload some processing to the GPU.  The software GPU in vSphere can easily handle these requirements with some additional CPU overhead.  CPU overhead, however, is rarely the bottleneck in VDI environments, so you’re not taking a huge performance hit by not having a dedicated hardware GPU.

2. It has serious impacts on consolidation ratios and user densities: There are three ways to do hardware graphics acceleration for virtual machines running on vSphere with discrete GPUs.

(Note: These methods only apply to VMware vSphere. Hyper-V and XenServer have their own methods of sharing GPUs that may be similar to this.)

  • Pass-Thru (vDGA): The physical GPU is passed directly through to the virtual machines on a 1 GPU:1 Virtual Desktop basis.  Density is limited to the number of GPUs installed on the host. The VM cannot be moved to another host unless the GPU is removed. The only video cards currently supported for this method are high-end NVIDIA Quadro and GRID cards.
  • Shared Virtual Graphics (vSGA): VMs share access to GPU resources through a driver that is installed at the host level, and the GPU is abstracted away from the VM. The software GPU driver is used, and the hypervisor-level driver acts as an interface to the physical GPU.  Density depends on configuration…and math is involved (note: PDF link) because the allocated video memory is split between the host’s RAM and the GPU’s RAM. vSGA is the only 3D graphics type that can be vMotioned to another host while the VM is running, even if that host does not have a physical GPU installed. This method supports NVIDIA GRID cards along with select Quadro, AMD FirePro, and Intel HD graphics cards.
  • vGPU: VMs share access to an NVIDIA GRID card.  A manager application is installed that controls the profiles and schedules access to GPU resources.  A profile is assigned to each virtual desktop, and it controls both the desktop’s resource allocation and the number of virtual desktops that can share the card. A Shared PCI device is added to VMs that need to access the GPU, and VMs may not be live-migrated to a new host while running. VMs may not start up if there are no GPU resources available to use.

Figure 1: NVIDIA GRID Profiles and User Densities
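The density math behind the two shared-GPU methods can be sketched with back-of-the-envelope arithmetic. The assumptions here are mine, not from the vendors’ documentation: vSGA reserves roughly half of each VM’s configured video memory on the physical GPU, and vGPU carves the card’s framebuffer into fixed-size profile slices:

```python
# Back-of-the-envelope density estimates for the shared-GPU methods above.
# Assumptions (illustrative, not vendor-documented): vSGA backs half of
# each VM's configured video memory with GPU RAM, the other half with
# host RAM; vGPU profiles are fixed slices of the card's framebuffer.

def vsga_density(gpu_mem_mb: int, vram_per_vm_mb: int = 512) -> int:
    # Only half of the configured video memory is reserved on the GPU.
    return gpu_mem_mb // (vram_per_vm_mb // 2)

def vgpu_density(framebuffer_mb: int, profile_mb: int) -> int:
    # Each vGPU profile is a fixed carve-out of the framebuffer.
    return framebuffer_mb // profile_mb

# Example: a single GPU with 4 GB of memory.
print(vsga_density(4096))       # -> 16 VMs per GPU at 512 MB video memory
print(vgpu_density(4096, 512))  # -> 8 VMs per GPU on a 512 MB profile
```

Either way, the card’s memory sets a hard ceiling on users per GPU, which is exactly why densities drop once every desktop is given graphics hardware.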

There is a hard limit to the number of users that you can place on a host when you give every desktop access to a GPU, so it would require additional hosts to meet the needs of the VDI environment.  That also means that hardware could be sitting idle and not used to its optimal capacity because the GPU becomes the bottleneck.

The alternative is to load up servers with a large number of GPUs, but there are limits to how many a server can hold.  This is usually determined by the number of available PCIe x16 slots and the available power, and a standard 2U rackmount server can usually handle only two cards.   This means I would still need to take on additional expenses to give all users a virtual desktop with some GPU support.
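To make the cost impact concrete, here is a quick sketch with purely illustrative numbers showing how a per-host GPU cap can force extra hosts even when CPU and memory could carry far more desktops:

```python
import math

# Illustrative numbers only: how a per-host GPU user cap forces extra
# hosts even when CPU and memory could carry more desktops.
def hosts_needed(total_users: int, per_host_cap: int) -> int:
    return math.ceil(total_users / per_host_cap)

users = 500
cpu_mem_cap = 100  # desktops per host sized on CPU/memory alone
gpu_cap = 32       # e.g. 2 cards x 16 GPU-backed desktops per card

print(hosts_needed(users, cpu_mem_cap))  # -> 5 hosts on compute alone
print(hosts_needed(users, gpu_cap))      # -> 16 hosts once every desktop needs a GPU
```

In this hypothetical, the GPU constraint more than triples the host count while leaving CPU and memory on each host underutilized, which is the consolidation penalty described above.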

Either way, you are taking on unnecessary additional costs.

There are few use cases that currently benefit from 3D acceleration.  Those cases, such as CAD or medical imaging, often have other requirements that make high user consolidation ratios unlikely and are replacing expensive, high-end workstations.

Do I Need GPUs?

So do I need a GPU?  The answer to that question, like any other design question, is “It Depends.”

It greatly depends on your use case, and the decision to deploy GPUs will be determined by the applications in your use case.  Some of the applications where a GPU will be required are:

  • CAD and BIM
  • Medical Imaging
  • 3D Modeling
  • Computer Animation
  • Graphic Design

You’ll notice that these are all higher-end applications where 3D graphics are a core requirement.

But what about Office, Internet Explorer, and other basic apps?  Yes, more applications are offloading some things to the GPU, but these are often minor things to improve UI performance.  They can also be disabled, and the user usually won’t notice any performance difference.

Even if they aren’t disabled, the software GPU can handle these elements.  There would be some additional CPU overhead, but as I said above, VDI environments are usually constrained by memory and have enough available CPU capacity to accommodate this.

But My Desktop Has a GPU…

So let’s wrap up by addressing the argument that, because all business computers have GPUs, the servers that host VDI environments should have them too.

It is true that all desktops and laptops come with some form of a GPU.  But there is a very good reason for this. Business desktops and laptops are designed to be general purpose computers that can handle a wide range of use cases and needs.  The GPUs in these computers are usually integrated Intel graphics cards, and they lack the capabilities and horsepower of the professional-grade NVIDIA and AMD products used in VDI environments.

Virtual desktops are not general purpose computers.  They should be tailored to their use case and the applications that will be running in them.  Most users only need a few core applications, and if they do not require that GPU, it should not be there.

It’s also worth noting that adding NVIDIA GRID cards to servers is a non-trivial task.  Servers require special factory configurations to support GPUs, and those configurations need to be certified by the graphics manufacturer.  There are two reasons for this: GPUs often draw more than the 75W that a PCIe x16 slot can provide, and they are passively cooled, requiring additional chassis fans.  Aside from one vendor on Amazon, these cards can only be acquired from OEM vendors as part of the server build.

The argument that GPUs should be required for VDI will make much more sense when hypervisors have support for mid-range GPUs from multiple vendors. Until that happens, adding GPUs to your virtual desktops is a decision that needs to be made carefully, and it needs to fit your intended use cases.  While there are many use cases where they are required or would add significant value, there are also many use cases where they would add unneeded constraints and costs to the environment. 

Countdown to Virtualization Field Day 5–#VFD5

In two weeks, I get the pleasure of joining some awesome members of the virtualization community in Boston for Virtualization Field Day 5. 

If you’re not familiar with Virtualization Field Day, it is one of the many Tech Field Day events put on by Stephen Foskett (@sfoskett) and the crew at Gestalt IT.  These events bring together vendors and members of the community to have technical discussions about the vendors’ products and offerings.   These events are streamed live on the Tech Field Day website, and there are many opportunities to interact with the delegates via Twitter by following the #VFD5 hashtag.

The vendors that will be sponsoring and presenting at Virtualization Field Day 5 are:

  • DataGravity
  • PernixData
  • Ravello
  • Scale Computing
  • VMTurbo

I will be joining an awesome group of delegates.

This will be my first time attending Virtualization Field Day as a delegate.  I’ve previously watched the events online and interacted with the delegates on Twitter. 

Keep watching this space, and the #VFD5 hashtag on Twitter, as there will be a lot more exciting stuff.