Notes from VAPP5483 – Virtualizing Active Directory the Right Way
Active Directory Overview
Windows Active Directory multi-master replication conundrum
Writes originate from any DC
Changes must converge
- Eventually
- Preferably on time
Why virtualize Active Directory
- Virtualization is mainstream at this point
- Active Directory is fully supported in virtual environments
- Active Directory is virtualization friendly -> Distributed multi-master model, low resource requirements
- Domain Controllers are interchangeable -> if one breaks, it can be replaced. Cattle, not pets
- Physical domain controllers waste compute resources
Common Objections to DC Virtualization
- Fear of the stolen VMDK -> no different from a stolen server or backup tape
- Privilege Escalation -> vCenter privileges are separate
- Have to keep certain roles physical -> no technical reason for this, can seize or move roles if needed
- Deviates from standards/build process -> helps standardization
- Time Keeping in VMs is hard -> Presenters agree
Time Sync Issues
Old way – VMs get time from ESXi
Changed to use Windows time tools
KB 1189 -> time sync with host still happens on vMotion or Guest OS reboot
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1189
Demo -> moving PDC emulator to host with bad clock
If the time on the host is more than 1 year behind, NTP cannot update or fix the time
How do we determine the correct time
Ask ESXi host?
This could be OK if…
- Host times are always right
- CMOS doesn’t go bad
- Rogue operations don’t happen
- Security is a thing other people worry about
Reality – Stuff happens…
vSphere default behavior corrects time on the PDC emulator
Can cause a lot of issues in impacted Windows Forests
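A pre-check along these lines could flag hosts whose clocks should not be trusted before a PDC emulator lands on them. This is a minimal sketch in Python, not something shown in the session: the function name, the host names and the 5-minute tolerance (the default Kerberos clock-skew limit) are assumptions, and how the host clock readings are collected is left out.

```python
from datetime import datetime, timedelta, timezone

def hosts_with_bad_clock(host_clocks, reference, tolerance=timedelta(minutes=5)):
    """Return {host: offset} for hosts whose clock is off by more than tolerance.

    host_clocks: dict of host name -> time reported by that host (hypothetical input)
    reference:   time from a source that is trusted (for example reliable NTP)
    """
    bad = {}
    for host, reported in host_clocks.items():
        offset = reported - reference
        # abs() works on timedelta, so hosts ahead and behind are both caught
        if abs(offset) > tolerance:
            bad[host] = offset
    return bad

if __name__ == "__main__":
    ref = datetime(2015, 9, 1, 12, 0, tzinfo=timezone.utc)
    clocks = {
        "esx-a": ref + timedelta(seconds=2),   # healthy host
        "esx-b": ref - timedelta(days=400),    # dead CMOS battery, clock far behind
    }
    for host, offset in hosts_with_bad_clock(clocks, ref).items():
        print(f"{host} is off by {offset}; keep the PDCe away from it")
```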
Preventing Bad Time Sync
- Ensure hardware clock is correct
- Configure reliable NTP
- Disable DRS on PDCe
- Use Host-Guest Affinity for PDCes
- Advanced Settings to disable Time Sync -> KB 1189
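The advanced settings from KB 1189 can be audited with a small script. The following is a minimal sketch in Python, assuming the VM's .vmx contents are available as text. The function names are hypothetical, the key list should be verified against the current KB 1189, and treating keys as case-insensitive is an assumption.

```python
import re

# Advanced settings that KB 1189 lists for fully disabling host-to-guest
# time sync. Verify this list against the KB before relying on it.
TIME_SYNC_KEYS = [
    "tools.syncTime",
    "time.synchronize.continue",
    "time.synchronize.restore",
    "time.synchronize.resume.disk",
    "time.synchronize.shrink",
    "time.synchronize.tools.startup",
    "time.synchronize.tools.enable",
    "time.synchronize.resume.host",
]

# Matches lines of the form: key = "value"
_LINE = re.compile(r'^\s*([\w.]+)\s*=\s*"([^"]*)"\s*$')

def parse_vmx(text):
    """Parse .vmx text into a dict with lower-cased keys."""
    settings = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            settings[match.group(1).lower()] = match.group(2)
    return settings

def missing_time_sync_settings(vmx_text):
    """Return the keys that are absent or not set to "0"/"FALSE"."""
    settings = parse_vmx(vmx_text)
    return [
        key for key in TIME_SYNC_KEYS
        if settings.get(key.lower(), "").lower() not in ("0", "false")
    ]

if __name__ == "__main__":
    sample = 'tools.syncTime = "0"\ntime.synchronize.continue = "0"\n'
    for key in missing_time_sync_settings(sample):
        print(f"not disabled: {key}")
```

An empty result means every listed sync path is switched off for that VM.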
Best Practices
Don’t use WAN for Auth – Place domain controllers locally
Distribute FSMO Roles
Use Effective RBAC – don’t cross roles unless needed, give rights only to trusted operators
To P2V or Not – don’t do it unless you hate yourself
Use Anti-Affinity Rules -> don’t have DCs on the same hosts, use host rules to place important
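Whether the anti-affinity rule is actually holding can be checked from an inventory export. A minimal sketch in Python, assuming a mapping of DC VM name to current ESXi host has already been pulled from vCenter; the function name and the VM and host names are hypothetical.

```python
from collections import defaultdict

def hosts_with_multiple_dcs(placement):
    """Return {host: [dc, ...]} for every host running more than one DC.

    placement: dict of DC VM name -> ESXi host name (hypothetical input)
    """
    by_host = defaultdict(list)
    for dc, host in placement.items():
        by_host[host].append(dc)
    return {host: sorted(dcs) for host, dcs in by_host.items() if len(dcs) > 1}

if __name__ == "__main__":
    placement = {"dc01": "esx-a", "dc02": "esx-b", "dc03": "esx-a"}
    for host, dcs in hosts_with_multiple_dcs(placement).items():
        print(f"{host} runs {', '.join(dcs)}: anti-affinity violated")
```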
Sizing
vCPU – under 10K users, 1 vCPU, over that, start with 2 vCPU
RAM – a DC is effectively a database server and the database is held in RAM, so more RAM is better; a perfmon counter shows cache usage
Networking – VMXNET3
Storage – Space that it needs plus room to grow
DNS –
70% of issues are DNS issues
AD requires effective DNS
DNS solution – doesn’t matter if Windows or Appliance, but must be AD-Aware
Avoid pointing a domain controller's DNS settings only at itself, otherwise DNS cannot start
Virtual Disk -> Caching MS KB 888794
Preventing USN Rollback
AD is a distributed directory service that relies on logical-clock-based replication (update sequence numbers, or USNs)
Each DC keeps track of all transactions and tags them with a GUID
If a DC is snapshotted and rolled back, local DC will believe it is right, but all others will know it is bad and refuse to replicate with it. This is called USN rollback
Demo USN rollback
If you have 2008 R2 and below DCs, they will stop replicating. Both will still advertise as domain controllers
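The rollback problem can be illustrated with a toy model. This is a deliberately simplified sketch in Python, not how AD is implemented: real replication also uses invocation IDs and up-to-dateness vectors, and the class and method names here are hypothetical. Each DC numbers its own writes with a USN, and each partner remembers the highest USN it has already pulled.

```python
import copy

class DC:
    def __init__(self, name):
        self.name = name
        self.usn = 0          # highest USN assigned locally
        self.changes = []     # (usn, description) for writes originated here
        self.watermark = {}   # partner name -> highest USN already pulled

    def write(self, description):
        self.usn += 1
        self.changes.append((self.usn, description))

    def pull_from(self, partner):
        """Pull new changes; return None if the partner looks rolled back."""
        seen = self.watermark.get(partner.name, 0)
        if partner.usn < seen:
            # Partner reports a lower USN than we already received from it
            return None
        new = [desc for usn, desc in partner.changes if usn > seen]
        self.watermark[partner.name] = partner.usn
        return new

if __name__ == "__main__":
    dc1, dc2 = DC("dc1"), DC("dc2")
    dc1.write("a"); dc1.write("b")
    snapshot = copy.deepcopy(dc1)        # VM snapshot taken at USN 2
    dc1.write("c"); dc1.write("d")
    print(dc2.pull_from(dc1))            # all four changes replicate
    dc1 = copy.deepcopy(snapshot)        # revert: dc1 is back at USN 2
    print(dc2.pull_from(dc1))            # None: dc2 notices the lower USN
    dc1.write("e"); dc1.write("f"); dc1.write("g")
    print(dc2.pull_from(dc1))            # only "g"; "e" and "f" reuse USNs 3 and 4
```

The last step shows why the reverted DC is dangerous: it believes it is healthy, but writes that reuse already-seen USNs never reach its partners.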
VM-Generation ID – exposes counter to guest
- 2012 and newer. Operating system level feature and must be supported by hypervisor
- vSphere 5.0 Update 2 and newer
- Attribute is tracked in local copy of database on local domain controller, triggered by snapshots and snapshot rollback
Provides protection against USN rollback
Invented specifically for virtual domain controllers, allows for cloning of domain controllers
Demo – Clone a Domain Controller
Domain Controller must have software and services that support cloning – agents have to support cloning
Do NOT hot clone a domain controller. Must be in powered off state
Do not clone a DC that holds FSMO roles
Can Clone the PDCe, must power up reference domain controller before powering on clone
DNS must work
Do not sysprep the system
DC Safeguard allows a DC that has been reverted/restored to function as a DC
How it works:
- VM Generation ID checked on DC boot, when a snapshot is created, or when the VM is reverted to an old snapshot. VM Generation-ID on VM is checked against the copy in the local database.
- If it differs, RID Pool dumped and new RID pool issued
- When Generation ID has changed, AD will detect it and remediate it
- RID pool discarded, get new RID Pool and objects are re-replicated. VM essentially becomes a new DC
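The safeguard steps above can be sketched roughly as follows. This is a minimal illustration in Python under stated assumptions, not the actual Windows implementation: the class, the string generation IDs and the RID ranges are hypothetical. Resetting the invocation ID is not mentioned in the notes above; it is included because it is how the DC presents itself to its partners as a new replication source.

```python
import uuid

class VirtualDC:
    def __init__(self, vm_generation_id):
        # Copy of the VM-Generation ID kept in the local AD database
        self.stored_generation_id = vm_generation_id
        self.invocation_id = uuid.uuid4()
        self.rid_pool = range(1000, 1500)   # hypothetical RID range

    def check_generation_id(self, current_generation_id, issue_rid_pool):
        """Run the safeguard check; return True if remediation happened.

        current_generation_id: value the hypervisor exposes to the guest
        issue_rid_pool:        callable returning a fresh RID pool
                               (stands in for a request to the RID master)
        """
        if current_generation_id == self.stored_generation_id:
            return False                     # no snapshot revert or clone detected
        self.rid_pool = issue_rid_pool()     # discard old pool, take a new one
        self.invocation_id = uuid.uuid4()    # appear as a new replication source
        self.stored_generation_id = current_generation_id
        return True

if __name__ == "__main__":
    dc = VirtualDC("gen-1")
    print(dc.check_generation_id("gen-1", lambda: range(2000, 2500)))  # normal boot
    print(dc.check_generation_id("gen-2", lambda: range(2000, 2500)))  # after a revert
```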