Nutanix Xtract for VMs #blogtober

One of the defining features of the Nutanix platform is simplicity.  Innovations like the Prism interface for infrastructure management and One-Click Upgrades for both the Nutanix software-defined storage platform and supported hypervisors have lowered the management burden of on-premises infrastructure.

Nutanix is now looking to bring that same level of simplicity to migrating virtual machines to a new hypervisor.  Nutanix has released a new tool today called Xtract for VMs.  This tool, which is free to all Nutanix customers, brings the same one-click simplicity that Nutanix is known for to migrating workloads from ESXi to AHV.

So how does Xtract for VMs differ from other migration tools?  First, it is an agentless migration tool.  Xtract communicates with vCenter to get a list of the VMs in the ESXi infrastructure, builds a migration plan, and synchronizes the VM data from ESXi to AHV.

During data synchronization and migration, Xtract will insert the AHV device drivers into the virtual machine.  It will also capture and preserve the network configuration, so the VM will not lose connectivity or require administrator intervention after the migration is complete.


By injecting the AHV drivers and preserving the network configuration during the data synchronization and cutover, Xtract is able to perform cross-hypervisor migrations with minimal downtime.  And since the original VM is not touched during the migration, rollback is as easy as shutting down the AHV VM and powering the ESXi VM back on, which significantly reduces the risk of cross-hypervisor migrations.

Analysis

The datacenter is clearly changing, and we now live in a multi-hypervisor world.  While many customers will still run VMware for their on-premises environments, there are many that are looking to reduce their spend on hypervisor products.  Xtract for VMs provides a tool to help reduce that cost while providing the simplicity that Nutanix is known for.

While Xtract is currently at version 1.0, I can see this technology becoming pivotal in helping customers move workloads between on-premises and cloud infrastructures.

To learn more about this new tool, you can check out the Xtract for VMs page on Nutanix’s website.

Coming Soon – The Virtual Horizon Podcast #blogtober

I’ve been blogging now for about seven or so years, and The Virtual Horizon has existed in its current form for about two or three years.

So what’s next?  Besides more blogging, that is…

It’s time to go multimedia.  In the next few weeks, I will be launching The Virtual Horizon Podcast.  The podcast will only partially focus on the latest in end-user computing; I also hope to cover topics such as career development, community involvement, and even technologies that exist outside of the EUC space.

I’m still working out some of the logistics and workflow, but the first episode has already been recorded.  It should post in the next couple of weeks.

So keep an eye out here.  I’ll be adding a new section to the page once we’re a bit closer to go-live.

Announcing Rubrik 4.1 – The Microsoft Release

Rubrik has made significant enhancements to their platform since coming out of stealth just over two years ago.  Thanks to an extremely aggressive release schedule, the product has grown from an innovative way to bring together software and hardware for virtualization backup into a robust data protection platform.

Yesterday, Rubrik announced version 4.1.  The latest version builds on the already strong offerings in the Alta release that came out just a few months ago.  This release is heavily focused on the Microsoft stack, along with a continued emphasis on cloud.

So what’s new in Rubrik 4.1?

Multi-Tenancy

The major enhancement is multi-tenancy support.  Rubrik 4.1 now supports dividing a single physical Rubrik cluster into multiple Organizations.  Organizations are logical management units inside a physical Rubrik cluster, and each organization can manage its own logical objects such as users, protected objects, SLA domains, and replication targets.  This new multi-tenancy model is designed to meet the needs of service providers, where multiple customers may use Rubrik as a backup target, as well as large enterprises that have multiple IT organizations.

In order to support the new multi-tenancy feature, Rubrik is adding role-based access control with multiple levels of access.  This will allow application owners and administrators to get limited access to Rubrik to manage their particular resources.

Azure, Azure Stack, and Hyper-V

One of the big focus areas of the Rubrik 4.1 release is Microsoft, and Rubrik has enhanced their Microsoft platform support.

The first major enhancement to Rubrik’s Microsoft platform offering is Azure Stack support.  Rubrik will be able to integrate with Azure Stack and provide protection to customer workloads running on this platform.

The second major enhancement is to the CloudOn App Instantiation feature.  CloudOn was released in Alta, and it enables customers to power on VM snapshots in the public cloud.  The initial release supported AWS, and Rubrik is now adding support for Azure.

SQL Server Always-On Support

Rubrik is expanding its agent-based SQL Server backup support to Always-On Availability Groups.  In the current release, Rubrik will detect if a SQL Server is part of an availability group, but it requires an administrator to manually apply an SLA policy to databases.  If there is a failover in the availability group, manual intervention is required to change the replica that is being protected.  This can be an issue with 2-node availability groups because a node failure or server reboot would cause a failover that could impact SLAs on the protected databases.

Rubrik 4.1 will now detect the configuration of a SQL Server, including availability groups.  Based on the configuration, Rubrik will dynamically select the replica to back up.  If a failover occurs, Rubrik will select a different replica in the availability group to use as a backup source.  This feature is only supported on synchronous commit availability groups.

Google Cloud Storage Support

Google Cloud is now supported as a cloud archive target, and all Google Cloud storage tiers are supported.

AWS Glacier and GovCloud Support

One feature that has been requested repeatedly since Rubrik launched is support for AWS Glacier for long-term retention.  Rubrik 4.1 now adds support for Glacier as an archive location.

Also in the 4.1 release is support for AWS GovCloud.  This will allow government entities running Rubrik to utilize AWS as a cloud archive.

Thoughts

Rubrik has had an aggressive release schedule since Day 1.  And they don’t seem to be letting up on quickly adding features.  The 4.1 release does not disappoint in this category.

The feature I’m most excited about is the enhanced support for SQL Always-On Availability Groups.  While Rubrik can detect if a database is part of an AG today, the ability to dynamically select the instance to back up is key for organizations that have smaller AGs or utilize the basic 2-node AG feature in SQL Server 2016.

 

vMotion Support for NVIDIA vGPU is Coming…And It’ll Be Bigger than you Think

One of the cooler tech announcements at VMworld 2017 was on display at the NVIDIA booth.  It wasn’t really an announcement, per se, but more of a demonstration of a long awaited solution to a very difficult challenge in the virtualization space.

NVIDIA displayed a tech demo of vMotion support for VMs with GRID vGPU running on ESXi.  Along with this demo was news that they had also solved the problem of suspend and resume on vGPU enabled machines, and these solutions would be included in future product releases.  NVIDIA announced live migration support for XenServer earlier this year.

Rob Beekmans (Twitter: @robbeekmans) also wrote about this recently, and his blog has video showing the tech demos in action.

I want to clarify that these are tech demos, not tech previews.  Tech Previews, in VMware EUC terms, usually means a feature that is in beta or pre-release to get real-world feedback.  These demos likely occurred on a development version of a future ESXi release, and there is no projected timeline as to when they will be released as part of a product.

Challenges to Enabling vMotion Support for vGPU

So you’re probably thinking “What’s the big deal? vMotion is old hat now.”  But when vGPU is enabled on a virtual machine, it requires that VM to have direct, but shared, access to physical hardware on the system – in this case, a GPU.  And vMotion never worked if a VM had direct access to hardware – be it a PCI device that was passed through or something plugged into a USB port.

If we look at how vGPU works, each VM has a shared PCI device added to it.  This shared PCI device provides shared access to a physical card.  To facilitate this access, each VM gets a portion of the GPU’s Base Address Register (BAR), the hardware-level interface between the machine and the PCI card.  In order to make this portable, there has to be some method of virtualizing the BAR.  A VM that migrates may not get the same address space on the BAR when it moves to a new host, and any changes to that would likely cause issues for Windows or any jobs that the VM has placed on the GPU.

There is another challenge to enabling vMotion support for vGPU.  Think about what a GPU is – it’s a (massively parallel) processor with dedicated RAM.  When you add a GPU to a VM, you’re essentially attaching a second system to the VM, and the data in the GPU framebuffer and processor queues needs to be migrated along with the CPU, system RAM, and system state.  This requires extra coordination to ensure that the GPU releases its state so it can be migrated to the new host, and it has to be done in a way that doesn’t impact performance for other users or applications that may be sharing the GPU.

Suspend and Resume is another challenge that is very similar to vMotion support.  Suspending a VM basically hibernates the VM.  All current state information about the VM is saved to disk, and the hardware resources are released.  Instead of sending data to another machine, it needs to be written to a state file on disk.  This includes the GPU state.  When the VM is resumed, it may not get placed on the same host and/or GPU, but all the saved state needs to be restored.

Hardware Preemption and CUDA Support on Pascal

The latest GRID release, GRID 5.0, added support for the Pascal-series cards.  Pascal-series cards include hardware support for preemption.  This is important for GRID because it uses time-slicing to share access to the GPU across multiple VMs.  When a time slice expires, the GPU moves on to the next VM.

This can cause issues when using GRID to run CUDA jobs.  CUDA jobs can be very long running, and the job is stopped when the time slice expires.  Hardware preemption enables long-running CUDA tasks to be interrupted and paused when the time slice expires, and those jobs are resumed when that VM gets a new time slice.

So why is this important?  In previous versions of GRID, CUDA was only available and supported on the largest profiles.  So to support applications that required CUDA in a virtual environment, an entire GPU would need to be dedicated to the VM.  This could be a significant overallocation of resources, and it significantly reduced the density on a host.  If a customer was using M60s, which have two GPUs per card, they may have been limited to four machines with GPU access if they needed CUDA support.

With Pascal cards and the latest GRID software, CUDA support is enabled on all vDWS profiles (the ones that end with a Q).  Now customers can provide CUDA-enabled vGPU profiles to virtual machines without having to dedicate an entire GPU to one machine.

This has two benefits.  First, it enables more features in the high-end 3D applications that run on virtual workstations.  Not only can these machines be used for design, they can now utilize the GPU to run models or simulations.

The second benefit has nothing to do with virtual desktops or applications.  It actually allows GPU-enabled server applications to be fully virtualized.  This potentially means things like render farms or, looking further ahead, virtualized AI inference engines for business applications or infrastructure support services.  One potentially interesting use case for this is running MapD, a database that runs entirely in the GPU, on a virtual machine.

Analysis

GPUs have the ability to revolutionize enterprise applications in the data center.  They can potentially bring artificial intelligence, deep learning, and massively parallel computing to business apps.

vMotion support is critical in enabling enterprise applications in virtual environments.  The ability to move applications and servers around is important to keeping services available.

By enabling hardware preemption and vMotion support, it now becomes possible to virtualize the next generation of business applications.  These applications will require a GPU and CUDA support to improve performance or utilize deep learning algorithms.  Applications that require a GPU and CUDA support can be moved around in the datacenter without impacting the running workloads, maintaining availability and keeping active jobs running so they do not have to be restarted.

This also opens up new opportunities to better utilize data center resources.  If I have a large VDI footprint that utilizes GRID, I can’t vMotion any running desktops today to consolidate them on particular hosts.  If I can use vMotion to consolidate these desktops, I can put the remaining GPU-equipped hosts to work on other tasks, such as acting as render farms or handling after-hours GPU data processing.

This may not seem important now.  But I believe that deep learning/artificial intelligence will become a critical feature in business applications, and the ability to turn my VDI hosts into something else after-hours will help enable these next generation applications.

 

#VMworld EUC Showcase Keynote – #EDW7002KU Live Blog

Good afternoon from Las Vegas.  The EUC Showcase keynote is about to start.  During this session, VMware EUC CTO Shawn Bass and GM of End User Computing Sumit Dhawan will showcase the latest advancements in VMware’s EUC technology portfolio.  So sit tight as we bring you the latest news here shortly.

3:30 PM – Session is about to begin.  They’re reading off the disclaimers.

3:30 PM – Follow hashtag #EUCShowcase on Twitter for real-time updates from EUC Champions like @vhojan and @youngtech

3:31 PM – The intro video covers some of the next-generation technologies like AI and Machine Learning, and how people are the power behind these technologies.  EUC is fundamentally about people and using technology to improve how people get work done.

3:34 PM – “Most of you are here to learn how to redefine work.” Sumit Dhawan

3:38 PM – Marginal costs of endpoint management will continue to increase due to the proliferation of devices and applications.  IoT will only make this worse.

3:39 PM – VMware is leveraging public APIs to build a platform to manage devices and applications.  The APIs provide device context along with identity, which allows the device to receive the proper level of security and management.  Workspace One combines identity and context seamlessly to deliver this experience to mobile devices.

3:42 PM – There is a tug of war between the needs of the business, such as security and application management, and the needs of the end user, such as privacy and personal data management.  VMware is using the Workspace One platform to deliver a balance between the needs of the employer and the needs of the end user without increasing marginal costs of management.

3:45 PM – Shawn Bass is now onstage.  He’s going to be showing a lot of demos.  Demos will include endpoint management of Windows 10, MacOS, and ChromeBook, BYO, and delivering Windows as a Service.

3:47 PM – Legacy Windows management is complex.  Imaging has a number of challenges, and delivering legacy applications poses an even more complex challenge.  Workspace One can provide the same experience for delivering applications to Windows 10 as users get with mobile devices.  The process allows users to self-enroll their devices by just entering their email and joining it to an Airwatch-integrated Azure AD.

Application delivery is simplified and performance is improved by using Adaptiva.  This removes the need for local distribution points.  Integration with Workspace One also allows users to self-service enroll in applications without having to open a ticket with IT or manually install software.

3:54 PM – MacOS support is enabled in Workspace One.  The user experience is similar to what users experience on Windows 10 devices and mobile devices – both for enrollment and application experience.  A new Workspace One app experience is being delivered for MacOS.

3:57 PM – Chromebook integration can be configured out of the box and have devices joined to the Workspace One environment.  It also supports the Android Google Play store integration and allows users to get a curated app-store experience.

3:59 PM – The core message of Workspace One is that one solution can manage mobile devices, tablets, and desktop machines, removing the need for point solutions and management silos.

4:01 PM – Capital One and DXC are on stage to talk about their experience around digital workspace.  The key message is that the workplace is changing from one where everyone is an employee to a gig economy where employees are temporary and come and go.  Bring-Your-Own helps solve this challenge, but it raises new challenges around security and access.

Capital One sees major benefits of using Workspace One to manage Windows 10.  Key features include the ability to apply an MDM framework to manage devices and removing the need for application deployment and imaging.

4:10 PM – The discussion has now moved into BYO and privacy.

4:11 PM – And that’s it for me folks.  I need to jet.

GRID 5.0–Pascal Support and More

This morning, NVIDIA announced the latest version of their graphics virtualization stack – NVIDIA GRID 5.0.  This latest release continues the trend that NVIDIA started two years ago when they separated the GRID software stack from the Tesla data center GPUs in the GRID 2.0 release.

GRID 5.0 adds several new key features to the GRID product line.  Along with these new features, NVIDIA is also adding a new Tesla card and rebranding the Virtual Workstation license SKU.

Quadro Virtual Data Center Workstation

Previous versions of GRID contained profiles designed for workstations and high-end professional applications.  These profiles, which ended in a Q, provided Quadro level features for the most demanding applications.  They also required the GRID Virtual Workstation license.

NVIDIA has decided to rebrand the professional series capabilities of GRID to better align with their professional visualization series of products.  The GRID Virtual Workstation license will now be called the Quadro Virtual Data Center Workstation license.  This change helps differentiate the Virtual PC and Apps features, which are geared towards knowledge workers, from the professional series capabilities.

Tesla P6

The Tesla P6 is the Pascal-generation successor to the Maxwell-generation M6 GPU.  It provides a GPU purpose-built for blade servers.  In addition to using a Pascal-generation GPU, the P6 also increases the amount of Framebuffer to 16GB.   The P6 can now support up to 16 users per blade, which provides more value to customers who want to adopt GRID for VDI on their blade platform.

Pascal Support for GRID

The next generation GRID software adds support for the Pascal-generation Tesla cards.  The new cards that are supported in GRID 5.0 are the Tesla P4, P6, P40, and P100.

The P40 is the designated successor to the M60 card.  It is a single-GPU board with 24GB of Framebuffer.  The increased framebuffer also allows for a 50% increase in density, and the P40 can handle up to 24 users per board compared to 16 users per M60 board.

Edit for Clarification – The comparison between the M60 and the P40 was done using the 1GB GRID profiles.  The M60 can support up to 32 users per board when assigning each VM 512MB of framebuffer, but this option is not available in GRID 5.0.  

On the other end of the scale is the P4. This is a 1U small form factor Pascal GPU with 8GB of Framebuffer.  Unlike other larger Tesla boards, this board can run on 75W, so it doesn’t need any additional power.  This makes it suitable for cloud and rack-dense computing environments.

In addition to better performance, the Pascal cards have a few key advantages over the previous generation Maxwell cards.  First, there is no need to use the gpumodeswitch utility to convert a Pascal board from compute mode to graphics mode.  There is still a manual step required to disable ECC memory on the Pascal boards, but this is built into the NVIDIA-SMI utility.  This change streamlines the GRID deployment process for Pascal boards.

The second advantage involves hardware-level preemption support.  In previous generations of GRID, CUDA support was only available when using the 8Q profile.  This dedicated an entire GPU to a single VM.  Hardware preemption support enables Pascal cards to support CUDA on all profiles.

To understand why hardware preemption is required, we have to look at how GRID shares GPU resources.  GRID uses round-robin time slicing to share GPU resources amongst multiple VMs, and each VM gets a set amount of time on the GPU.  When the time slice expires, the GPU moves onto the next VM.  When the GPU is rendering graphics to be displayed on the screen, the round-robin method works well because the GPU can typically complete all the work in the allotted time slice.  CUDA jobs, however, pose a challenge because jobs can take hours to complete.  Without the ability to preempt the running jobs, the CUDA jobs could fail when the time slice expired.

Preemption support on Pascal cards allows VMs with any virtual workstation profile to have access to CUDA features.  This enables high-end applications to use smaller Quadro vDWS profiles instead of having to have an entire GPU dedicated to that specific user.

Fixed Share Round Robin Scheduling

As mentioned above, GRID uses round robin time slicing to share the GPU across multiple VMs.  In this model, if a VM doesn’t have anything for the GPU to do, it is skipped and the time slice is given to the next VM in line.  This prevents the GPU from sitting idle when there are VMs that can utilize it, but it also means that some VMs may get more access to the GPU than others.

NVIDIA is adding a new scheduler option in GRID 5.0 called the Fixed Share Scheduler.  The Fixed Share scheduler grants each VM that is placed on the GPU an equal share of resources.  Time slices are still used in the fixed share scheduler, and if a VM does not have any jobs for the GPU to execute, the GPU will be idled during that time slice.

As VMs are placed onto, or removed from, a GPU, the share of resources available to each VM is recalculated, and shares are redistributed to ensure that all VMs get equal access.
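
To make the difference between the two schedulers concrete, here is a toy Python sketch of a single scheduling pass.  This is purely my own illustration of the concept – not NVIDIA’s implementation – and the VM names are made up.

# Toy model of GPU time-slice scheduling across vGPU-enabled VMs.
vms = [
    {"name": "vm1", "has_work": True},
    {"name": "vm2", "has_work": False},  # this VM has nothing for the GPU to do
    {"name": "vm3", "has_work": True},
]

def best_effort_pass(vms):
    """Default round robin: idle VMs are skipped, so busy VMs get extra GPU time."""
    return [vm["name"] for vm in vms if vm["has_work"]]

def fixed_share_pass(vms):
    """Fixed share: every VM keeps its slice; the GPU idles if the VM has no work."""
    return [vm["name"] if vm["has_work"] else "(GPU idle)" for vm in vms]

print("best effort:", best_effort_pass(vms))   # ['vm1', 'vm3']
print("fixed share:", fixed_share_pass(vms))   # ['vm1', '(GPU idle)', 'vm3']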

Enhanced Monitoring

GRID 5.0 adds new monitoring capabilities to the GRID platform.  One of the new features is per-application monitoring.  Administrators can now view GPU utilization on a per-application basis using the NVIDIA-SMI tool.  This new feature allows administrators to see exactly how much of the GPU resources each application is using.

License Enforcement

In previous versions of GRID, the licensing server basically acted as an auditing tool.  A license was required for GRID, but the GRID features would continue to function even if the licensed quantity was exceeded.  GRID 5.0 changes that.  Licensing is now enforced, and if a license is not available, the full GRID driver features will not function and users will get reduced graphics quality when they sign into their desktops.

Because licensing is now enforced, the license server has built-in HA functionality.  A secondary license server can be specified in the configuration of both the license server and the driver, and if the primary is not available, clients will fall back to the secondary.

Other Announced Features

Two GRID 5.0 features were announced at Citrix Synergy back in May.  The first was Citrix Director support for monitoring GRID.  The second feature is beta Live Migration support for XenServer.

Top 10 EUC Sessions at #VMworld 2017 Las Vegas

VMworld 2017 is just around the corner.  The premier virtualization conference will be returning to the Mandalay Bay convention center in Las Vegas at the end of August. 

There is one major addition to the EUC content at VMworld this year.  VMware has decided to move the Airwatch Connect conference, which covers VMware’s offerings in the mobility management space, from Atlanta and colocate it with VMworld.  So not only do attendees interested in EUC get great expert content on VMware’s Horizon solutions, they’ll get more content on Airwatch, mobility management, identity management, and IoT as well.

My top 10 EUC sessions for 2017 are:

  1. ADV1594BU – Beyond the Marketing: VMware Horizon 7.1 Instant Clones Deep Dive – This session, by Jim Yanik and Oswald Chen, is a technical deep dive into how Instant Clone desktops work.  This updated session will cover new features that have been added to Instant Clones since they were released in Horizon 7.  I’m often wary of “deep dive sessions,” but I’ve seen Jim give a similar presentation at various events and he does a great job talking through the Instant Clone technology in a way that all skill levels can understand it.  If you’re interested in VMware EUC, this is the one session you must attend as this technology will be relevant for years to come. 
  2. ADV1609BU – Deliver Any App, Any Desktop, Anywhere in the World Using VMware Blast Extreme – Blast Extreme is VMware’s new protocol that was officially introduced in Horizon 7.  Pat Lee and Ramu Panayappan will provide a deep dive into Blast Extreme.  Pat does a good job talking about Blast Extreme and how it works, and attendees will definitely walk away having learned something.
  3. ADV1681GU/ADV1607BU – Delivering 3D graphics desktops and applications in the real world with VMware Horizon, BEAT and NVIDIA GRID – VMware’s Kiran Rao and NVIDIA’s Luke Wignall talk about how Blast Extreme utilizes NVIDIA GPUs to provide a better user experience in End-User Computing environments.  This session was actually listed twice in the Content Catalog, so don’t worry if you miss one.
  4. ADV1583BU – Delivering Skype for Business with VMware Horizon: All You Need to Know – Official support for Skype for Business GA’d with Horizon 7.2.  This session will dive into how the new Skype for Business plugin works to provide a better telephony experience in EUC environments.
  5. ADV3370BUS – DeX Solutions: How Samsung and VMware are Pioneering Digital Transformation – Samsung DeX is a new capability of Samsung’s flagship phones that, when the phone is placed in a dock, lets it use a keyboard, mouse, and monitor to act as a virtual thin client endpoint while still having all the capabilities of a phone.  DeX has the potential to revolutionize how businesses provide endpoints and cellular phones to users.
  6. ADV1655GU – CTOs perspective on the Workspace 2020 and beyond: time to act now! – End-User Computing expert and technology evangelist Ruben Spruijt talks about the future of the end-user workspace and strategies on how to implement next-generation workspace technology.
  7. UEM1359BU – Best Practices in Migrating Windows 7 to Windows 10 – Windows 10 migrations are a hot topic, and almost every business will need a Windows 10 strategy.  This session will explore the best practices for migrating to Windows 10 in any type of organization.
  8. SAAM1684GU – Ask the Experts: How to Enable Secure Access from Personal/BYO Devices and All Types of Users with Workspace ONE – How do you enable secure remote access to company resources while allowing employees, contractors, and other types of workers to use their personal devices?  This group discussion will cover best practices for using VMware Workspace ONE to provide various levels of secure access to company resources from personal devices based on various context settings.  Unlike most sessions, this is a group discussion.  There are very few slides, and most of the session time will be devoted to allowing attendees to ask questions to the discussion leaders.
  9. ADV1588BU – Architecting Horizon 7 and Horizon Apps – A successful EUC environment starts with a solid architecture.  This session covers how to architect an integrated Horizon environment consisting of all components of the Horizon Suite. 
  10. vBrownbag TechTalks on EUC – There are three community-driven vBrownbag Tech Talks focusing on EUC led by EUC Champions.  These talks are:
    1. GPU-Enabled Linux VDI by Tony Foster – Tony will cover how to build GPU-enabled Linux virtual desktops in Horizon and some of the pain points he encountered while implementing this solution at a customer.
    2. Windows 10 and VDI – Better Come Prepared – Rob Beekmans and Sven Huisman will cover lessons they’ve learned while implementing Windows 10 in VDI environments.
    3. Leveraging User Environment Manager to Remove GPOs – Nigel Hickey will cover how to use VMware UEM as a Group Policy replacement tool.
  11. ADV1605PU – Ask the Experts: Practical Tips and Tricks to Help You Succeed in EUC – So this top 10 list will actually have 11 entries, and this one is a bit of shameless self-promotion.  This session is a repeat of last year’s EUC champions session featuring Earl Gay, VCDX Johan van Amersfoot, moderator Matt Heldstab, and me.  We’re answering your questions about EUC based on our experiences in the trenches.  Last year, we also had some prizes.

Bonus Session

There is one bonus session that you must put on your schedule.  It’s not EUC-related, but it is put on by two of the smartest people in the business today.  They were also two of my VCDX mentors.  The session is Upgrading to vSphere 6.5 the VCDX Way [SER2318BU] by Rebecca Fitzhugh and Melissa Palmer.  You should seriously check this session out as they’ll provide a roadmap to take your environment up to vSphere 6.5. 

Introducing Horizon 7.2

What’s new in Horizon 7.2?  A lot, actually.  There are some big features that will greatly impact customers, some beta features that are now generally available, and some scalability enhancements.

Horizon 7 Helpdesk Tool

One of the big drawbacks of Horizon is that it doesn’t have a very good tool for Helpdesk staff to support the environment.  While Horizon Administrator has role-based access control to limit what non-administrators can access and change, it doesn’t provide a good tool to enable a helpdesk technician to look up what desktop a user is in, grab some key metrics about their session, and remote in to assist the user.

VMware has provided a fling that can do some of this – the Horizon Toolbox.  But flings aren’t officially supported and aren’t always updated to support the latest version. 

Horizon 7.2 addresses this issue with the addition of the Horizon Helpdesk tool.  Horizon Helpdesk is an HTML5-based web interface designed for level 1 and 2 support desk engineers.  Unlike Citrix Director, which offers similar features for XenApp and XenDesktop environments, Horizon Helpdesk is integrated into the Connection Servers and installed by default.

Horizon Helpdesk provides a tool for helpdesk technicians who are supporting users with Horizon virtual desktops and published applications.  They’re able to log into the web-based tool, search for specific users, and view their active sessions and desktop and application entitlements.  Technicians can view details about existing sessions, including various metrics from within the virtual desktops, and use the tool to launch a remote assistance session.

Skype for Business Support is Now GA

The Horizon Virtualization Pack for Skype for Business was released under Technical Preview with Horizon 7.1.  It’s now fully supported on Windows endpoints in Horizon 7.2, and a private beta will be available for Linux-based clients soon. 

The Virtualization Pack allows Horizon clients to offload some of the audio and video features of Skype for business to the endpoint, and it provides optimized communication paths directly between the endpoints for multimedia calls.  What exactly does that mean?  Well, prior to the virtualization pack, multimedia streams would need to be sent from the endpoint to the virtual desktop, processed inside the VM, and then sent to the recipient.  The virtualization pack offloads the multimedia processing to the endpoint, and all media is sent between endpoints using a separate point-to-point stream.  This happens outside of the normal display protocol virtual channels.

image

Skype for Business has a number of PBX features, but not all of these features are supported in the first production-ready version of this plugin.  The following features are not supported:

  • Multiparty Conferencing
  • E911 Location Services
  • Call to Response Group
  • Call Park and Call Pickup from Park
  • Call via X (Work/Home, Cell, etc)
  • Customized Ringtones
  • Meet Now Conferencing
  • Call Recording

Instant Clone Updates

A few new features have been added to Horizon Instant Clones.  These new features are the ability to reuse existing Active Directory Computer accounts, support for configuring SVGA settings – such as video memory, resolution and number of monitors supported –  on the virtual desktop parent, and support for placing Instant Clones on local datastores.

Scalability Updates

VMware has improved Horizon scalability in each of the previous releases, usually by expanding the number of connections, Horizon Pods, and sites that are supported in a Cloud Pod Architecture.  This release is no exception.  CPA now supports up to 120,000 sessions across 12 Horizon Pods in five sites.  While the number of sites has not increased, the number of supported sessions has increased by 40,000 compared to Horizon 7.1.

This isn’t the only scalability enhancement in Horizon 7.2.  The largest enhancement is in the number of desktops that will be supported per vCenter Server.  Prior to this version of Horizon, only 2000 desktops of any kind were supported per vCenter.  This means that a Horizon Pod that supported that maximum number of sessions would require five vCenter servers.  Horizon 7.2 doubles the number of supported desktops per vCenter, meaning that each vCenter Server can now support up to 4000 desktops of any type. 

Other New Features

Some of the other new features in Horizon 7.2 are:

  • Support for deploying Full Clone desktops to DRS Storage Clusters
  • A Workspace One mode to configure access policies around desktops and published applications
  • Client Drive Redirection and USB Redirection are now supported on Linux
  • The ability to rebuild full clone desktops and reuse existing machine accounts.

Key Takeaways

The major features of Horizon 7.2 are very attractive, and they address things that customers have been asking for a while.  If you’re planning a greenfield environment, or it’s been a while since you’ve upgraded, there are a number of compelling reasons to look at implementing this version.

Integrating Duo MFA and Amazon WorkSpaces

I recently did some work integrating Duo MFA with Amazon WorkSpaces.  The integration work ran into a few challenges, and I wanted to blog about those challenges to help others in the future.

Understanding Amazon WorkSpaces

If you couldn’t tell by the name, Amazon WorkSpaces is a cloud-hosted Desktop-as-a-Service offering that runs on Amazon Web Services (AWS).  It utilizes AWS’s underlying infrastructure to deploy desktop workloads, either using licensing provided by Amazon or the customer.  The benefit of this over on-premises VDI solutions is that Amazon manages the management infrastructure.  Amazon also provides multiple methods to integrate with the customer’s Active Directory environment, so users can continue to use their existing credentials to access the deployed desktops.

WorkSpaces uses an implementation of the PCoIP protocol for accessing desktops from the client application, and the client application is available on Windows, Linux, MacOS, iOS, and Android.  Amazon also has a web-based client that allows users to access their desktop from a web browser.  This feature is not enabled by default.

Anyway, that’s a very high-level overview of WorkSpaces.

Understanding How Users Connect to WorkSpaces

Before we can talk about how to integrate any multi-factor authentication solution into WorkSpaces, let’s go through the connection path and how that impacts how MFA is used.  WorkSpaces differs from traditional on-premises VDI in that all user connections to the desktop go through a public-facing service.  This happens even if I have a VPN tunnel or use DirectConnect.

The following image illustrates the connection path between the users and the desktop when a VPN connection is in place between the on-premises environment and AWS:

ad_connector_architecture_vpn

Image courtesy of AWS, retrieved from http://docs.aws.amazon.com/workspaces/latest/adminguide/prep_connect.html

As you can see from the diagram, on-premises users don’t connect to the WorkSpaces over the VPN tunnel – they do it over the Internet.  They are coming into their desktop from an untrusted connection, even if the endpoint they are connecting from is on a trusted network.

Note: The page that I retrieved this diagram from shows users connecting to WorkSpaces over a DirectConnect link when one is present.  This diagram shows a DirectConnect link that has a public endpoint, and all WorkSpaces connections are routed through this public endpoint.  In this configuration, logins are treated the same as logins coming in over the Internet.

When multi-factor authentication is configured for WorkSpaces, it is required for all user logins.  There is no way to configure rules to apply MFA to users connecting from untrusted networks because WorkSpaces sees all connections coming in on a public, Internet-facing service.

WorkSpaces MFA Requirements

WorkSpaces only works with MFA providers that support RADIUS.  The RADIUS servers can be located on-premises or in AWS.  I recommend placing the RADIUS proxies in another VPC in AWS where other supporting infrastructure resources, like AD Domain Controllers, are located.  These servers can be placed in the WorkSpaces VPC, but I wouldn’t recommend it.

WorkSpaces can use multiple RADIUS servers, and it will load-balance requests across all configured RADIUS servers.  RADIUS can be configured on both the Active Directory Connectors and the managed Microsoft AD directory services options.

Note: The Duo documentation states that RADIUS can only be configured on the Active Directory Connector.  Testing shows that RADIUS works with both the Active Directory Connector and the managed Microsoft AD directory options.

Configuring Duo

Before we can enable MFA in WorkSpaces, Duo needs to be configured.  At least one Duo proxy needs to be reachable from the WorkSpaces VPC.  This could be on a Windows or Linux EC2 instance in the WorkSpaces VPC, but it will most likely be an EC2 instance in a shared services or management VPC or in an on-premises datacenter.  The steps for installing the Duo Proxies are beyond the scope of this article, but they can be found in the Duo documentation.

Before the Duo Authentication Proxies are configured, we need to configure a new application in the Duo Admin console.  This step generates the integration key and secret key that will be used by the proxy and configures how new users will be handled.  The steps for creating the new application in Duo are:

1. Log into your Duo Admin Console at https://admin.duosecurity.com

2. Log in with your user and complete the two-factor authentication method for your account.  Once you’ve completed the two-factor sign-on, you will have access to your Duo Admin panel.

3. Click on Applications in the upper left-hand corner.

4. Click on Protect an Application

5. Search for RADIUS.

6. Click Protect this Application  underneath the RADIUS application.

image

7. Record the Integration Key, Secret Key, and API Hostname.  These will be used when configuring the Duo Proxy in a few steps.

image

8. Provide a Name for the Application.  Users will see this name in their mobile device if they use Duo Push.

image

9. Set the New User Policy to Deny Access.  Deny Access is required because the WorkSpaces setup process will fail if Duo is configured to fail open or auto-enroll.

image

10. Click Save Changes to save the new application.

Get the AWS Directory Services IP Addresses

The RADIUS Servers need to know which endpoints to accept RADIUS requests from.  When WorkSpaces is configured for MFA, these requests come from the Directory Services instance.  This can either be the Active Directory Connectors or the Managed Active Directory domain controllers.

The IP addresses that will be required can be found in the AWS Console under WorkSpaces –> Directories.  The field to use depends on the directory service type you’re using.  If you’re using Active Directory Connectors, the IP addresses are in the Directory IP Address field.  If you’re using the managed Microsoft AD service, the Directory IP address field has a value of “None,” so you will need to use the  IP addresses in the DNS Address field.
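
If you prefer to pull these values programmatically, a quick boto3 sketch like the one below should list each directory and its addresses.  Treat this as a rough example – the region is a placeholder, and you should verify the response fields against the current WorkSpaces API documentation.

import boto3

# List WorkSpaces directories with their type and DNS/directory IP addresses.
workspaces = boto3.client("workspaces", region_name="us-east-1")  # adjust the region

response = workspaces.describe_workspace_directories()
for directory in response["Directories"]:
    print(
        directory.get("DirectoryId"),
        directory.get("DirectoryType"),      # e.g. AD_CONNECTOR or MICROSOFT_AD
        directory.get("DnsIpAddresses"),     # addresses to allow as RADIUS clients
    )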

Configure AWS Security Group Rules for RADIUS

By default, the AWS security group for the Directory Services instance heavily restricts inbound and outbound traffic.  In order to enable RADIUS, you’ll need to allow inbound and outbound UDP 1812 to and from the IPs or subnet where the proxies are located.  (A scripted alternative is shown after the console steps below.)

The steps for updating the AWS security group rules are:

1. Log into the AWS Console.

2. Click on the Services menu and select WorkSpaces.

3. Click on Directories.

4. Copy the value in the Directory ID field of the directory that will have RADIUS configured.

5. Click on the Services menu and select VPC.

6. Click on Security Groups.

7. Paste the Directory ID into the Search field to find the Security Group attached to the Directory Services instance.

8. Select the Security Group.

9. Click on the Inbound Rules tab.

10. Click Edit.

11. Click Add Another Rule.

12. Select Custom UDP Rule in the Type dropdown box.

13. Enter 1812 in the port range field.

14. Enter the IP Address of the RADIUS Server in the Source field.

15. Repeat Steps 11 through 14 as necessary to create inbound rules for each Duo Proxy.

16. Click Save.

17. Click on the Outbound Rules tab.

18. Click Edit.

19. Click Add Another Rule.

20. Select Custom UDP Rule in the Type dropdown box.

21. Enter 1812 in the port range field.

22. Enter the IP Address of the RADIUS Server in the Destination field.

23. Repeat Steps 19 through 22 as necessary to create outbound rules for each Duo Proxy.

24. Click Save.
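
If you manage your AWS environment with scripts instead of the console, the same rules can be added with a few lines of boto3.  This is a minimal sketch that assumes a single proxy; the security group ID and proxy address below are placeholders for your own values.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")   # adjust the region
sg_id = "sg-0123456789abcdef0"                        # security group attached to the directory
proxy_cidr = "10.0.10.5/32"                           # Duo Authentication Proxy address

udp_1812 = [{
    "IpProtocol": "udp",
    "FromPort": 1812,
    "ToPort": 1812,
    "IpRanges": [{"CidrIp": proxy_cidr, "Description": "Duo RADIUS proxy"}],
}]

# Allow RADIUS traffic to and from the Duo Authentication Proxy.
ec2.authorize_security_group_ingress(GroupId=sg_id, IpPermissions=udp_1812)
ec2.authorize_security_group_egress(GroupId=sg_id, IpPermissions=udp_1812)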

Configure the Duo Authentication Proxies

Once the Duo side is configured and the AWS security rules are set up, we need to configure our Authentication Proxy or Proxies.  This needs to be done locally on each Authentication Proxy.  The steps for configuring the Authentication Proxy are:

Note: This step assumes the authentication proxy is running on Windows.  The steps for installing the authentication proxy are beyond the scope of this article.  You can find the directions for installing the proxy at: https://duo.com/docs/

1. Log into your Authentication Proxy.

2. Open the authproxy.cfg file located in C:\Program Files (x86)\Duo Security Authentication Proxy\conf.

Note: The authproxy.cfg file uses the Unix newline setup, and it may require a different text editor than Notepad.  If you can install it on your server, I recommend Notepad++ or Sublime Text. 

3. Add the following lines to your config file.  This requires the Integration Key, Secret Key, and API Hostname from Duo and the IP Addresses of the WorkSpaces Active Directory Connectors or Managed Active Directory domain controllers.  Both RADIUS entries need to use the same RADIUS shared secret, and it should be a complex string.

[duo_only_client]

[radius_server_duo_only]
ikey=<Integration Key>
skey=<Secret Key>
api_host=<API Hostname>.duosecurity.com
failmode=safe
radius_ip_1=<Directory Service IP 1>
radius_secret_1=<RADIUS Shared Secret>
radius_ip_2=<Directory Service IP 2>
radius_secret_2=<RADIUS Shared Secret>
port=1812

4. Save the configuration file.

5. Restart the Duo Authentication Proxy Service to apply the new configuration.

Integrating WorkSpaces with Duo

Now that our base infrastructure is configured, it’s time to set up Multi-Factor Authentication in WorkSpaces.  MFA is configured on the Directory Services instance.  If multiple directory services instances are required to meet different WorkSpaces use cases, then this configuration will need to be performed on each directory service instance. 

The steps for enabling WorkSpaces with Duo are:

1. Log into the AWS Console.

2. Click on the Services menu and select Directory Service.

3. Click on the Directory ID of the directory where MFA will be enabled.

4. Click the Multi-Factor authentication tab.

5. Click the Enable Multi-Factor Authentication checkbox.

6. Enter the IP address or addresses of your Duo Authentication Proxies in the RADIUS Server IP Address(es) field.

7. Enter the RADIUS Shared Secret in the Shared Secret Code and Confirm Shared Secret Code fields.

8. Change the Protocol to PAP.

9. Change the Server Timeout to 20 seconds.

10. Change the Max Retries to 3.

11. Click Update Directory.

12. If RADIUS setup is successful, the RADIUS Status should say Completed.

image
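
This step can also be scripted against the Directory Service API.  The boto3 sketch below mirrors the console settings above; the directory ID, proxy IP, and shared secret are placeholders, and the RadiusSettings field names should be double-checked against the current AWS documentation before you rely on them.

import boto3

ds = boto3.client("ds", region_name="us-east-1")   # adjust the region

ds.enable_radius(
    DirectoryId="d-0123456789",                     # directory backing your WorkSpaces
    RadiusSettings={
        "RadiusServers": ["10.0.10.5"],             # Duo Authentication Proxy IP(s)
        "RadiusPort": 1812,
        "RadiusTimeout": 20,                        # matches the 20-second timeout above
        "RadiusRetries": 3,
        "SharedSecret": "your-radius-shared-secret",
        "AuthenticationProtocol": "PAP",
        "DisplayLabel": "Duo",
        "UseSameUsername": True,
    },
)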

Using WorkSpaces with Duo

Once RADIUS is configured, the WorkSpaces client login prompt will change.  Users will be prompted for an MFA code along with their username and password.  This can be the one-time password generated in the mobile app, a push, or a text message with a one-time password.  If the SMS option is used, the user’s login will fail, but they will receive a text message with a one-time password. They will need to log in again using one of the codes received in the text message.

image

Troubleshooting

If the RADIUS setup does not complete successfully, there are a few things that you can check.  The first step is to verify that port 1812 is open on the Duo Authentication Proxies.  Also verify that the security group attached to the directory services instance is configured to allow port 1812 inbound and outbound.

One issue that I ran into in previous setups was the New User Policy in the Duo settings.  If it is not set to Deny Access, the setup process will fail.  WorkSpaces is expecting a response code of Access Reject.  If it receives anything else, the process stalls out and fails about 15 minutes later.

Finally, review the Duo Authentication Proxy logs.  These can be found in C:\Program Files (x86)\Duo Security Authentication Proxy\log on Authentication Proxies running Windows.  A tool like WinTail can be useful for watching the logs in real-time when trying to troubleshoot issues or verify that the RADIUS Services are functioning correctly.
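
If you don’t have a tail utility handy, a few lines of Python will follow the log in real time.  This assumes the default authproxy.log file name in the log directory; adjust the path if your installation differs.

import time

LOG = r"C:\Program Files (x86)\Duo Security Authentication Proxy\log\authproxy.log"

with open(LOG, "r", errors="replace") as log_file:
    log_file.seek(0, 2)                  # start at the end of the file
    while True:
        line = log_file.readline()
        if line:
            print(line, end="")          # print new log entries as they arrive
        else:
            time.sleep(1)                # nothing new yet; poll again shortly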

Configuring Duo Security MFA for Horizon Unified Access Gateway

Note: After publishing, I decided to rework this blog post a bit to separate the AD-integrated Duo configuration from the Duo-only configuration.  This should make the post easier to follow.

In my last post, I went through the steps for deploying a Horizon Access Point/Unified Access Gateway using the PowerShell deployment script.  That post walked through the basic deployment steps.

The Unified Access Gateway supports multiple options for two-factor authentication, and many real-world deployments will use some form of two-factor when granting users access to their desktops and applications remotely.  The Unified Access Gateway supports the following two-factor authentication technologies:

  • RSA SecurID
  • RADIUS
  • Certificate Based/Smart-Cards

Because I’m doing this in a lab environment, I decided to use a RADIUS-based technology for this post.  I’ve been using Duo Security for a while because they support RADIUS, have a mobile app, and have a free tier.  Duo also supports VMware Horizon, although they do not currently have any documentation on integrating with the Access Point/Unified Access Gateway.

Duo Security for Multi-factor Authentication

Duo Security is a cloud-based MFA provider.  Duo utilizes an on-premises Authentication Proxy to integrate with customer systems.  In addition to providing their own authentication source, they can also integrate into existing Active Directory environments or RADIUS servers.

When using Active Directory as the authentication source, Duo will utilize the same username and password as the user’s AD account.  It will validate these against Active Directory before prompting the user for their second authentication factor.

Understanding Unified Access Gateway Authentication Path

Before I configure the Unified Access Gateway for two-factor authentication with Duo, let’s walk through how the appliance handles authentication for Horizon environments and how it compares to the Security Server.  There are some key differences between how these two technologies work.

When the Security Server was the only option, two-factor authentication was enabled on the Connection Servers.  When users signed in remotely, the security server would proxy all authentication traffic back to the connection server that it was paired with.  The user would first be prompted for their username and their one-time password, and if that validated successfully, they would be prompted to log in with their Active Directory credentials.  The reliance on connection servers meant that two sets of connection servers needed to be maintained – one internal facing without multi-factor authentication configured and one external facing with multi-factor configured.

Note: When using Duo in this setup, I can configure Duo to use the same initial credentials as the user’s AD account and then present the user with options for how they want to validate their identity – a phone call, push to a mobile device, or passcode.  I will discuss that configuration below.

The Unified Access Gateway setup is, in some ways, similar to the old Security Server/Connection Server setup.  If 2FA is used, the user is prompted for the one-time password first, and if that is successful, the user is prompted for their Active Directory credentials.  But there are two key differences here.  First, the Unified Access Gateway is not tied to a specific Connection Server, and it can be pointed at a load-balanced pool of connection servers.  The other difference is 2FA is validated in the DMZ by the appliance, so 2FA does not need to be configured on the Connection Servers.

Horizon has a couple of multi-factor authentication options that we need to know about when doing the initial configuration.  These two settings are:

  • Enforce 2-factor and Windows user name matching (matchWindowsUserName in UAG) – enabled when the MFA and Windows user names match.
  • Use the same username and password for RADIUS and Windows authentication (WindowsSSOEnabled in UAG) – Used when the RADIUS server and Windows have the same password.  When this setting is enabled, the domain sign-on prompt is skipped.

If my two-factor authentication system supports Active Directory authentication, I can use my Windows Username and Password to be authenticated against it and then receive a challenge for a one-time password (or device push).  If this second factor of authentication is successful, I will be automatically signed into Horizon.

Configuring Duo

The Duo environment needs to be configured before Horizon MFA can be set up.  Duo requires an on-premises authentication proxy.  This proxy acts as a RADIUS server, and it can run on Windows or Linux.  The steps for installing the Duo authentication proxy are beyond the scope of this article.

Once the authentication proxy is installed, it needs to be configured.  Duo has a number of options for configuration, and it’s important to review the documentation.  I’ll highlight a few of these options below.

The first thing we need to configure is the Duo client.  The client determines what type of primary authentication is performed.  Duo supports three methods:

  • ad_client: Duo uses Active Directory for primary authentication.  Users sign in with their Active Directory username and password.
  • radius_client: Duo uses the specified RADIUS server, such as Microsoft NPS or Cisco ACS, for primary authentication.
  • duo_only_client: Duo does not perform primary authentication.  The authentication proxy only performs secondary authentication, and primary authentication is handled by the configured system.

I don’t have a RADIUS server configured in my lab, so I did testing with both the ad_client and the duo_only_client. I prefer a single sign-on solution in my lab, so this setup will be configured with the ad_client.

The authentication proxy configuration is contained in the authproxy.cfg file located in C:\Program Files (x86)\Duo Security Authentication Proxy\conf.  Open this file in a text editor and add the following lines:

[ad_client]
host=DC.IP.0.1
host_2=DC.IP.0.2
service_account_username=serviceaccount
service_account_password=serviceaccountpassword
search_dn=DC=domain,DC=com

Duo includes a number of options for configuring an Active Directory client, including encrypted passwords, LDAPS support, and other security features.  The configuration above is a simple configuration meant for a lab or PoC environment.

Before the RADIUS server settings can be configured, a new application needs to be created in the Duo Administration Console.  The steps for this are:

1. Log into the Duo Admin Console.

2. Click Applications

3. Click Protect an Application

4. Scroll down to VMware View and select “Protect this Application.”

5. Copy the Integration Key, Secret Key, and API Hostname.

6. Change the username normalization option to “Simple.”

7. Click “Save Changes.”

The next step is to configure the authentication proxy as a RADIUS service.  Duo also has a number of options for this.  These determine how Duo interacts with the client service.  These options include:

  • RADIUS_Auto: The user’s authentication factor, and the device that receives the factor, is automatically selected by Duo.
  • RADIUS_Challenge: The user receives a textual challenge after primary authentication is complete.   The user then selects the authentication factor and device that it is received on.
  • RADIUS_Duo_Only: The RADIUS service does not handle primary authentication, and the user’s passcode or factor choice is used as the RADIUS password.

Duo also has multiple types of authentication factors.  These options are:

  • Passcode: A time-based one-time password that is generated by the mobile app.
  • Push: A challenge is pushed to the user’s mobile device with the Duo mobile app installed.  The user approves the access request to continue sign-in.
  • Phone Call: Users can opt to receive a phone call with their one-time passcode.
  • SMS: Users can opt to receive a text message with their one-time passcode.

RADIUS_Challenge will be used for this setup.  In my testing, the combination of an ad_client and RADIUS_Auto meant that my second authentication factor was a push to the mobile app.  In most cases, this was fine, but in situations where my laptop was online and my phone was not (such as on an airplane), I was unable to access the environment.  The other option is to use RADIUS_Duo_Only with the Duo_Only_Client, but I wanted a single sign-on experience.

The RADIUS server configuration for the Horizon Unified Access Gateway is:

[radius_server_challenge]
ikey=[random key generated by the Duo Admin Console]
skey=[random key generated by the Duo Admin Console]
api_host=api-xxxx.duosecurity.com [generated by the Duo Admin Console]
failmode=secure
client=ad_client
radius_ip_1=IP Address or Range
radius_secret_1=RadiusSecret
port=1812
prompt_format=short

The above configuration is set to fail secure.  This means that if the Duo service is not available, users will be unable to log in.  The other option that was selected was a short prompt format.  This displays a simple text message with the options that are available and prompts the user to select one.

Save the authproxy.cfg file and restart the Duo Authentication Proxy service for the new settings to take effect.

The next step is to configure the Unified Access Gateway to use RADIUS.  The following lines need to be added to the [Horizon] section:

authMethods=radius-auth
matchWindowsUserName=true
windowsSSOEnabled=true

A new section needs to be added to handle the RADIUS configuration.  The following lines need to be added to create the RADIUS section and configure the RADIUS client on the Unified Access Gateway.  If more than one RADIUS server exists in the environment, you can add _2 to the end of hostName and authPort (for example, hostName_2 and authPort_2).

[RADIUSAuth]
hostName=IP.to.RADIUS.Server [IP of the primary RADIUS server]
authType=PAP
authPort=1812
radiusDisplayHint=XXX Token

Save the UAG configuration file and deploy, or redeploy, your Unified Access Gateway.  When the deployment completes, and you log in externally, you should see the following screens:

image

image

Duo-Only Authentication

So what if I don’t want an AD-integrated single sign-on experience?  What if I just want users to enter a code before signing in with their AD credentials?  Duo supports that configuration as well.  There are a couple of differences compared to the configuration for the AD-enabled Duo environment.

The first change is in the Duo authentication proxy config file.  No AD client configuration is required.  Instead of an [ad_client] section, a single line entry of [duo_only_client] is required.  The RADIUS Server configuration can also mostly stay the same.  The only required changes are to change the client from ad_client to duo_only_client and the section name from [radius_server_challenge] to [radius_server_duo_only].

The Access Point configuration will change.  The following configuration items should be used instead:

authMethods=radius-auth && sp-auth
matchWindowsUserName=true
windowsSSOEnabled=false