The Virtual Horizon Lab – February 2020

It’s been a while since I’ve done a home lab update – in fact, the last one was over four years ago.  William Lam’s home lab project and an upcoming appearance on a future episode of “Hello from My Home Lab” with Lindy Collier have convinced me that it’s time for an update.

My lab has both changed and grown since that last update.  Some of this was driven by vSphere changes – vSphere 6.7 required new hardware to replace my old R710s.  Changing requirements, new technology, and replacing broken equipment have also driven lab changes at various points.

My objectives have changed a bit too.  At the time of my last update, there were four key technologies and capabilities that I wanted in my lab.  These have changed as my career and my interests have changed, and my lab has evolved with it as well.  Today, my lab primarily focuses on end-user computing, learning Linux and AI, and running Minecraft servers for my kids.

vSphere Overview

The vSphere environment is probably the logical place to start.  My vSphere environment now consists of two vCenter Servers – one for my compute workloads and one for my EUC workloads.  The compute vCenter has two clusters – a 4-node cluster for general compute workloads and a 1-node cluster for backup.  The EUC vCenter has a single 2-node cluster for running desktop workloads.

Both environments run vSphere 6.7U3 and utilize the vCenter Server virtual appliance.  The EUC cluster utilizes VSAN and Horizon.  I don’t currently have NSX-T or vRealize Operations deployed, but those are on the roadmap to be redeployed.

Compute Overview

My lab has grown a bit in this area since the last update, and this is where the most changes have happened.

Most of my 11th generation Dell servers have been replaced, and I only have a single R710 left.  They were initially replaced by Cisco C220 M3 rackmounts, but I’ve switched back to Dell.  I preferred the Dell servers due to cost, availability, and HTML5-based remote management in the iDRACs.  Here are the specs for each of my clusters:

Compute Cluster – 4 Dell PowerEdge R620s with the following specs:

The R620s each have a 10GbE network card, but these cards are for future use.

Backup Cluster – 1 Dell PowerEdge R710 with the following specs:

This server is configured with local storage for my backup appliance.  This storage is provided by 1TB SATA SSDs.

VDI Cluster – 2 Dell PowerEdge R720s with the following specs:

  • 2x Intel Xeon E5-2630 Processors
  • 96 GB RAM
  • NVIDIA Tesla P4 Card

Like the R620s, the R720s each have 10GbE networking available.

I also have an R730; however, it is not currently being used in the lab.

Network Overview

When I last wrote about my lab, I was using a pair of Linksys SRW2048 switches.  I’ve since replaced these with a pair of 48-port Cisco Catalyst 3560G switches.  One of the switches has PoE, and the other is a standard switch.  In addition to switching, routing has been enabled on these switches, and they act as the core router in the network.  HSRP is configured for redundancy.  These uplink to my firewall. Traffic in the lab is segregated into multiple VLANs, including a DMZ environment.

I use Ubiquiti AC-Lite APs for my home Wi-Fi.  The newer ones support standard PoE, which is provided by one of the Cisco switches.  The UniFi management console is installed on a Linux VM running in the lab.

For network services, I have a pair of Pi-hole appliances running as virtual machines in the lab.  I also have Avi Networks deployed for load balancing.

Storage Overview

There are two main options for primary storage in the lab.  Most primary storage is provided by Synology.  I’ve upgraded my Synology DS1515+ to a DS1818+.  The Synology appliance has four 4TB WD Red drives for capacity and four SSDs.  Two of the SSDs are used for a high-performance datastore, and the other two are used as a read-write cache for my primary datastore.  The array presents NFS-backed datastores to the VMware environment, and it also presents CIFS file shares.

VSAN is the other form of primary storage in the lab.  The VSAN environment is an all-flash deployment in the VDI cluster, and it is used for serving up storage for VDI workloads.

The Cloud

With the proliferation of cloud providers and cloud-based services, it’s inevitable that cloud services work their way into home lab setups. My lab is no exception.

I use a few different cloud services in operating my lab, spread across several SaaS and cloud providers. These include:

  • Workspace ONE UEM and Workspace ONE Access
  • Office 365 and Azure – integrated with Workspace ONE through Azure AD
  • Amazon Web Services – management integrated into Workspace ONE Access, S3 as an offsite repository for backups
  • Atlassian Cloud – Jira and Confluence Free Tier integrated into Workspace ONE with Atlassian Access

Plans Going Forward

Home lab environments are dynamic, and they need to change to meet the technology and education needs of their users. My lab is no different, and I’m planning to grow it and its capabilities over the next year.

Some of the things I plan to focus on are:

  • Adding 10 GbE capability to the lab. I’m looking at some Mikrotik 24-port 10GbE SFP+ switches.
  • Upgrading my firewall
  • Implementing NSX-T
  • Deploying VMware Tunnel to securely publish out services like Code-Server
  • Putting my R730 back into production
  • Expanding my knowledge around DevOps and building pipelines to find ways to bring this to EUC
  • Working with Horizon Cloud Services and Horizon 7

Installing and Configuring the NVIDIA GRID License Server on CentOS 7.x

The release of NVIDIA GRID 10 included a new version of the GRID license server.  Rather than do an in-place upgrade of the existing Windows-based license servers in my lab, I decided to rebuild them on CentOS.

Prerequisites

In order to deploy the NVIDIA GRID license server, you will need two servers.  The license servers should be deployed in a highly available architecture, since the features enabled by the GRID drivers will not function if a license cannot be checked out.  These servers should be fully patched.  All of my CentOS boxes run without a GUI, and all of the install steps will be done through the console, so you will need SSH access to the servers.
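Patching the servers is just a standard CentOS update and reboot; something like the following, run on both servers, will bring them current before the install begins:

    sudo yum update -y
    sudo reboot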

The license servers only require 2 vCPU and 4GB of RAM for most environments.  The license server component runs on Tomcat, so we will need to install Java and the Tomcat web server.  We will do that as part of our install.  Newer versions of Java default to IPv6, so if you are not using this technology in your environment, you will need to disable IPv6 on the server.  If you don’t, the license server will not be listening on any IPv4 addresses. While there are other ways to change Java’s default behavior, I find it easier to just disable IPv6 since I do not use it in my environment.

The documentation for the license server can be found on the NVIDIA docs site at https://docs.nvidia.com/grid/ls/2019.11/grid-license-server-user-guide/index.html

Installing the Prerequisites

First, we need to prepare the servers by installing and configuring our prerequisites.  We need to disable IPv6, install Java and Tomcat, and configure the Tomcat service to start automatically.

If you are planning to deploy the license servers in a highly available configuration, you will need to perform all of these steps on both servers.

The first step is to disable IPv6.  As mentioned above, Java appears to default to IPv6 for networking in recent releases on Linux.

The steps to do this are:

  1. Open the sysctl.conf file with the following command (substitute your preferred editor for nano).

    sudo nano /etc/sysctl.conf

  2. Add the following two lines at the end of the file:

    net.ipv6.conf.all.disable_ipv6 = 1
    net.ipv6.conf.default.disable_ipv6 = 1

  3. Save the file.
  4. Reboot to allow the changes to take effect.
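After the reboot, you can confirm the change took effect by checking the sysctl value and looking for any remaining IPv6 addresses.  A value of 1 and no inet6 entries mean IPv6 is disabled:

    sysctl net.ipv6.conf.all.disable_ipv6
    ip addr | grep inet6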

Note: There are other ways to prevent Java from defaulting to IPv6.  These methods usually involve making changes to the application parameters when Java launches.  I selected this method because it was the easiest route to implement and I do not use IPv6 in my lab.

After the system reboots, the install can proceed.  The next steps are to install and configure Java and Tomcat.

  1. Install Java and Tomcat using the following command:

    sudo yum install -y java tomcat tomcat-webapps

  2. Enable the tomcat service so that it starts automatically on reboot.

    sudo systemctl enable tomcat.service

  3. Start Tomcat.

    sudo systemctl start tomcat.service
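Before moving on, it’s worth confirming that Tomcat started cleanly and is listening on its default port of 8080.  A quick check like this should show an active service and return the default Tomcat page:

    sudo systemctl status tomcat.service
    curl http://localhost:8080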

Finally, we will want to configure our JAVA_HOME variable.  The license server includes a command line tool, nvidialsadmin, that can be used to configure password authentication for the license server management console, and that tool requires a JAVA_HOME variable to be configured.  These steps will create the variable for all users on the system.

  1. Run the following command to see the path to the Java install:

    sudo alternatives --config java

  2. Copy the path to the Java folder, which is in parentheses.  Do not include anything after “jre/”.
  3. Create a Bash plugin for Java with the following command:

    sudo nano /etc/profile.d/java.sh

  4. Add the following lines to the file:

    export JAVA_HOME=(Your Path to Java)
    export PATH=$PATH:$JAVA_HOME/bin

  5. Save the file.
  6. Reboot the system.
  7. Test to verify that the JAVA_HOME variable is set up properly:

    echo $JAVA_HOME
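For reference, a completed /etc/profile.d/java.sh will look something like the example below.  The path shown is only an example – use the path returned by the alternatives command on your system.

    # Example path only – substitute the path from 'alternatives --config java', ending at jre/
    export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.242.b08-0.el7_7.x86_64/jre
    export PATH=$PATH:$JAVA_HOME/bin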

Installing the NVIDIA License Server

Now that the prerequisites are configured, the NVIDIA license server software can be installed.  The license server binaries are stored on the NVIDIA Enterprise Licensing portal, and they will need to be downloaded on another machine and copied over using a tool like WinSCP.
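If your management machine runs Linux or macOS, scp works just as well as WinSCP for getting the installer onto each server.  The hostname and user below are only examples:

    # Copy the installer to a license server (repeat for the second server)
    scp setup.bin username@license-server-1:/tmp/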

The steps for installing the license server, once the installer has been copied to the servers, are:

  1. Set the binary to be executable.

    chmod +x setup.bin

  2. Run the setup program in console mode.

    sudo ./setup.bin -i console

  3. The first screen is a EULA that will need to be accepted.  To scroll down through the EULA, press Enter until you get to the EULA acceptance.
  4. Press Y to accept the EULA.
  5. When prompted, enter the path for the Tomcat WebApps folder.  On CentOS, this path is:
    /usr/share/tomcat
  6. When prompted, press 1 to enable firewall rules for the license server.  This will enable the license server port on TCP 7070.
    Since this is a headless server, the management port on TCP 8080 will also need to be enabled.  This will be done in a later step.
  7. Press Enter to install.
  8. When the install completes, press enter to exit the installer.

After the install completes, the management port firewall rules will need to be configured.  While the management interface can be secured with usernames and passwords, this is not configured out of the box.  The normal recommendation is to use the browser on the local machine to set the configuration, but since this is a headless machine, that isn’t available either.  For this step, I’m applying the rules to an internal zone and restricting access to the management port to the IP address of my management machine.  The steps for this are:

  1. Create a firewall rule for port TCP port 8080.

    sudo firewall-cmd --permanent --zone=internal --add-port=8080/tcp

  2. Create a firewall rule for the source IP address.

    sudo firewall-cmd --permanent --zone=internal --add-source=Management-Host-IP/32

  3. Reload the firewall daemon so the new rules take effect:

    sudo firewall-cmd --reload
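You can verify that the new rules are in place before continuing:

    sudo firewall-cmd --zone=internal --list-all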

Configuring the License Server For High Availability

Once the firewall rules for accessing the management port are in place, the server configuration can begin.  These steps will consist of configuring the high availability features.  Registering the license servers with the NVIDIA Licensing portal and retrieving and applying licenses will not be handled in this step.

In order to set the license servers up for high availability, you will need two servers running the same version of the license server software.  You will also need to identify which servers will be the primary and secondary servers in the infrastructure.

  1. Open a web browser on your management machine and go to http://<primary license server hostname or IP>:8080/licserver
  2. Click on Configuration
  3. In the License Generation section, fill in the following details:
    1. Backup URI:
      http://<secondary license server hostname or IP>:7070/fne/bin/capability
    2. Main URI:
      http://<primary license server hostname or IP>:7070/fne/bin/capability
  4. In the Settings for server to server sync between License servers section, fill in the following details:
    1. Synchronization to fne enabled: True
    2. Main FNE Server URI:
      http://<primary license server hostname or IP>:7070/fne/bin/capability
  5. Click Save.
  6. Open a new browser window or tab and go to http://<secondary license server hostname or IP>:8080/licserver
  7. Click on Configuration
  8. In the License Generation section, fill in the following details:
    1. Backup URI:
      http://<secondary license server hostname or IP>:7070/fne/bin/capability
    2. Main URI:
      http://<primary license server hostname or IP>:7070/fne/bin/capability
  9. In the Settings for server to server sync between License servers section, fill in the following details:
    1. Synchronization to fne enabled: True
    2. Main FNE Server URI:
      http://<primary license server hostname or IP>:7070/fne/bin/capability
  10. Click Save.
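With the settings saved on both servers, a quick sanity check from the management machine is to request the capability URL used above from each server.  Any HTTP response, even an error page, confirms that the license service is listening on TCP 7070:

    curl http://<primary license server hostname or IP>:7070/fne/bin/capability
    curl http://<secondary license server hostname or IP>:7070/fne/bin/capability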

Summary

After completing the high availability setup section, the license servers are ready for the license file.  In order to generate and install this, the two license servers will need to be registered with the NVIDIA licensing service.  The steps to complete those tasks will be covered in a future post.