Home Lab Update

Back in October of 2014, I wrote a post about the (then) current state of my home lab. My lab has grown a lot since then, and I’ve started building a strategy around it that covers the technologies I want to learn and the capabilities I’ll need to accomplish those learning goals.

I’ve also had some rather spectacular failures in the last year.  Some of these failures have been actual lab failures that have impacted the rest of the home network.  Others have been buying failures – equipment that appeared to meet my needs and was extremely cheap but ended up having extra costs that made it unsuitable in the long run.

Home Lab 1.0

I’ve never really had a strategy when it comes to my home lab. I purchased new hardware when I either outgrew something and needed more capacity or had to replace broken equipment. If it could be repurposed, an older device would be “promoted” from running an actual workload to providing storage or some other dedicated service.

But this became unsustainable when I switched over to a consulting role. There were too many things I needed, or wanted, to learn and try out that would require additional capacity. My lab was also a mishmash of equipment, and I wanted to standardize on specific models. This has two benefits: it ensures a standard set of capabilities across all components of the lab, and it simplifies both upgrades and management.

The other challenge I wanted to address as I developed a strategy was separating the “home network” from the lab. While there would still be some overlap, such as wireless and Internet access, it was possible to take down my entire network when I had issues in my home lab. This actually happened on one occasion last August when the vDS in my lab corrupted itself and brought everything down.

The key technologies that I wanted to focus on with my lab are:

  1. End-User Computing:  I already use my lab for the VMware Horizon Suite.  I want to expand my VDI knowledge to include Citrix. I also want to spend time on persona management and application layering technologies like Liquidware Labs, Norskale, and Unidesk.
  2. Automation: I want to extend my skillset to include automation.  Although I have vRO deployed in my lab, I have never touched things like vRealize Automation and Puppet.  I also want to spend more time on PowerShell DSC and integrating it into vRO/vRA.  Another area I want to dive back into is automating Horizon environments – I haven’t really touched this subject since 2013.
  3. Containers: I want to learn more about Docker and the technologies surrounding it, including Kubernetes, Swarm, and other tools in this stack.  This is the future of IT.
  4. Nutanix: Nutanix has a community edition that provides their hyperconverged storage technology along with the Acropolis Hypervisor.  I want to have a single-node Nutanix CE cluster up and running so I can dive deeper into their APIs and experiment with their upcoming Citrix integration.  At some point, I will probably expand that cluster to three nodes and use it for a home “private cloud” that my kids can deploy Minecraft servers into.

There are also a couple of key capabilities that I want in my lab.  These are:

  1. Remote Power Management:  This is the most important factor when it comes to my compute nodes.  I don’t want them running 24×7, but at the same time, I don’t want to have to call up my wife and have her turn things on when I’m traveling.  Any server I buy needs some sort of integrated remote management, preferably with an API, that does not require an external IP KVM or Wake-on-LAN (see the power management sketch after this list).
  2. Redundancy: I’m trying to avoid single points of failure whenever possible.  Since much of my equipment is off-lease or used, I want to make sure that a single failure doesn’t take everything down.  I don’t have redundancy on all components – my storage, for instance, is a single Synology device due to budget constraints.  Network and compute, however, are redundant.  Future lab roadmaps will address storage redundancy through hyperconverged offerings like ScaleIO and Nutanix CE.
  3. Flexibility: My lab needs to be able to shift between a number of different technologies.  I need to be able to jump from EUC to cloud to containers without having to tear things down and rebuild them.  Since my lab is virtualized, I need enough capacity to build these environments and keep them maintained in a powered-off state.
  4. Segregation: A failure in the lab should not impact key home network services such as wireless and Internet access.
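
Since the iDRAC Enterprise cards in these servers support IPMI over LAN, remote power control can be scripted without an external IP KVM. Below is a minimal Python sketch of what that looks like – it assumes ipmitool is installed, IPMI over LAN is enabled on each iDRAC, and the host names, addresses, and credentials shown are hypothetical placeholders rather than my actual configuration.

```python
import subprocess

# Hypothetical iDRAC addresses and credentials -- substitute your own.
HOSTS = {
    "r710-workload-01": "192.168.10.21",
    "r710-workload-02": "192.168.10.22",
}
IPMI_USER = "root"
IPMI_PASS = "calvin"  # Dell's factory default; change it on a real iDRAC

def power(idrac_ip: str, action: str) -> str:
    """Send a chassis power command (status/on/off) to an iDRAC via IPMI over LAN."""
    result = subprocess.run(
        ["ipmitool", "-I", "lanplus", "-H", idrac_ip,
         "-U", IPMI_USER, "-P", IPMI_PASS, "chassis", "power", action],
        capture_output=True, text=True, check=True)
    return result.stdout.strip()

if __name__ == "__main__":
    for name, ip in HOSTS.items():
        print(f"{name}: {power(ip, 'status')}")
        # power(ip, "on")  # uncomment to power a host on remotely
```

Wrapping ipmitool this way keeps the dependency list short; the same thing could be done with remote racadm or a native IPMI library, but the chassis power commands cover the “turn it on while I’m traveling” use case.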

What’s in Home Lab 1.0

The components of my lab are:

Compute

With one exception, I’ve standardized my compute tier on Dell 11th Generation servers.  I went with these particular servers because there are a number of off-lease boxes on eBay, and you can usually find a good deal on servers that come with large amounts of RAM.  RAM prices are also fairly low, and other components like iDRACs are readily available.

I have also standardized on the following components in each server:

  • iDRAC Enterprise for Remote Management
  • Broadcom 5709 Dual-Port Gigabit Ethernet
  • vSphere 6 Update 1 with the Host Client and Synology NFS Plugin installed

I have three vSphere clusters in my lab.  These clusters are:

  • Management Cluster
  • Workload Cluster
  • vGPU Cluster

The Management cluster consists of two PowerEdge R310s.  These servers have a single Xeon X3430 processor and 24GB of RAM.  This cluster is not built yet because I’ve had some trouble locating compatible RAM – the fairly common 2Rx4 DIMMs do not work with this server, but I think I’ve found some 2Rx8 or 4Rx8 DIMMs that should.  The management cluster uses standard switches: each host has one standard switch for storage and another for all other traffic.

The Workload cluster consists of two PowerEdge R710s.  These servers have a pair of Xeon E5520 processors and 96GB of RAM.  My original plan was to give each host 72GB of RAM, but I ended up at 96GB because I had a bunch of 8GB DIMMs left over from my failed R310 upgrades and didn’t want to pay return shipping or restocking fees.  The Workload cluster is configured with one vDS for storage, another vDS for VM traffic, and a standard switch for management and vMotion traffic.

The vGPU cluster is the only cluster that doesn’t follow the hardware standards.  The server is a Dell PowerEdge R730 with 32GB of RAM.  The server is configured with the Dell GPU enablement kit and currently has an NVIDIA GRID K1 card installed.

My Nutanix CE box is a PowerEdge R610 with 32GB of RAM.

Storage

The storage tier of my lab consists of a single Synology Diskstation 1515+.  It has four 2 TB WD Red drives in a RAID 10 and a single SSD acting as a read cache.  A single 2TB datastore is presented to my ESXi hosts using NFS.  The Synology also has a couple of CIFS shares for things like user profiles and network file shares.
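
To show how that single datastore gets presented, here is a pyVmomi sketch that mounts the NFS export on every host in the inventory – the vCenter address, credentials, Synology hostname, export path, and datastore name are all hypothetical placeholders, and it assumes NFSv3.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Hypothetical lab details -- substitute your own vCenter and Synology values.
VCENTER = "vcenter.lab.local"
NAS_HOST = "synology.lab.local"
NAS_PATH = "/volume1/vmware"
DATASTORE = "synology-nfs01"

ctx = ssl._create_unverified_context()  # lab only; don't skip cert checks in production
si = SmartConnect(host=VCENTER, user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        if DATASTORE in [ds.name for ds in host.datastore]:
            continue  # already presented to this host
        spec = vim.host.NasVolume.Specification(
            remoteHost=NAS_HOST, remotePath=NAS_PATH,
            localPath=DATASTORE, accessMode="readWrite", type="NFS")
        host.configManager.datastoreSystem.CreateNasDatastore(spec)
        print(f"Mounted {DATASTORE} on {host.name}")
    view.Destroy()
finally:
    Disconnect(si)
```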

Network

The network tier consists of a Juniper SRX100 firewall and a pair of Linksys SRW2048 switches.  The switches are not stacked, but they have similar configurations for redundancy.  Each server and the Synology are connected to both switches.

I have multiple VLANs on my network to segregate different types of traffic.  Storage, vMotion, and management traffic are all on their own VLANs.  Other VLANs are dedicated to different types of VM traffic.
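
On the hosts that use standard switches, keeping those VLANs consistent is mostly a matter of creating identically tagged port groups on each host. Here is a pyVmomi sketch of that pattern – the vSwitch name, port group name, VLAN ID, and connection details are hypothetical placeholders, with the same caveats as the storage sketch above.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Hypothetical values -- adjust to match your own vSwitch names and VLAN plan.
VSWITCH = "vSwitch0"
PORTGROUP = "VM-Desktops"
VLAN_ID = 20

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        existing = [pg.spec.name for pg in host.config.network.portgroup]
        if PORTGROUP in existing:
            continue  # port group already defined on this host
        spec = vim.host.PortGroup.Specification(
            name=PORTGROUP, vlanId=VLAN_ID,
            vswitchName=VSWITCH, policy=vim.host.NetworkPolicy())
        # Creates the VLAN-tagged port group on the host's standard switch.
        host.configManager.networkSystem.AddPortGroup(portgrp=spec)
        print(f"Created {PORTGROUP} (VLAN {VLAN_ID}) on {host.name}")
    view.Destroy()
finally:
    Disconnect(si)
```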

That’s the overall high-level view of the current state of my home lab.  One component I haven’t spent much time on so far is my Horizon design.  I will cover that in depth in an upcoming post.