Thursday, 26 May 2016

Hyper-V Virtualization 101: Hardware and Performance

This is a post made to the SBS2K Yahoo List.

***

VMQ on Broadcom Gigabit NICs needs to be disabled at the NIC port driver level, not in the OS or VM settings. Broadcom has not respected the VMQ spec for its Gigabit NICs at all, and I'm not so sure they have started to do so yet either. :S
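A minimal sketch of doing that from an elevated PowerShell prompt on the host; "NIC1" and "NIC2" are placeholder adapter names, and Disable-NetAdapterVmq toggles the same VMQ keyword you would otherwise flip in the driver's advanced properties in Device Manager:

    # Show the current VMQ state for all physical NICs on the host
    Get-NetAdapterVmq | Format-Table Name, InterfaceDescription, Enabled

    # Disable VMQ on the Broadcom Gigabit ports at the driver level
    # ("NIC1" and "NIC2" are placeholders - substitute your port names)
    Disable-NetAdapterVmq -Name "NIC1","NIC2"

    # Confirm the change took
    Get-NetAdapterVmq -Name "NIC1","NIC2" | Format-Table Name, Enabled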

In the BIOS:

  • ALL C-States: DISABLED
  • Power Profile: MAX
  • Intel Virtualization Technology (VT-x): ENABLED
  • Intel Virtualization Technology for Directed I/O (VT-d): ENABLED
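With the BIOS set as above, here's a quick sanity check from the OS before the Hyper-V role goes on; Get-ComputerInfo needs PowerShell 5.1, and the HyperV* requirement properties stop reporting once the hypervisor itself is running:

    # Confirm the virtualization features are visible to the OS (PowerShell 5.1+)
    Get-ComputerInfo -Property "HyperV*"

    # Confirm the active power plan - pair the BIOS MAX profile with High Performance
    powercfg /getactivescheme
    powercfg /setactive SCHEME_MIN    # SCHEME_MIN is the alias for the High Performance plan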

For the RAID setup we'd max out the available drive bays on the server: go with smaller-capacity drives and more spindles to reach the required volume. This gains us more IOPS, which are critical in smaller virtualization settings.
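As a rough illustration of why spindle count wins, here's a back-of-the-napkin IOPS estimate; the ~140 IOPS per 10K SAS drive and the RAID 6 write penalty of 6 are generic planning figures, not measurements from any particular box:

    # Back-of-the-napkin effective IOPS for a RAID 6 set (rule-of-thumb numbers)
    $disks        = 8      # spindles in the array
    $iopsPerDisk  = 140    # ~10K SAS planning figure
    $writePenalty = 6      # RAID 6: roughly 3 reads + 3 writes per logical write
    $readPct      = 0.7    # assumed 70/30 read/write mix

    $raw       = $disks * $iopsPerDisk
    $effective = ($raw * $readPct) + (($raw * (1 - $readPct)) / $writePenalty)
    "{0} spindles: ~{1} raw IOPS, ~{2:N0} effective IOPS at a 70/30 mix" -f $disks, $raw, $effective

Double the spindle count with drives of half the capacity and the usable volume stays the same while the effective IOPS roughly doubles, which is the point of maxing out the drive bays.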

Go GHz over cores. In our experience we are running mostly 2 vCPU and 3 vCPU VMs, so pushing those threads through the CPU pipeline quicker gets things done faster than having more threads in parallel at slower clock speeds.

One RAM stick per memory channel is preferred, with all sticks identical. The cost of 32GB DIMMs has come down, so check them out for your application. Intel's CPUs are set up in three tiers, each supporting a particular memory speed. Purchase the RAM speed that matches the CPU tier; don't purchase faster RAM, as that's more expensive and thus money wasted.

Be aware of NUMA boundaries for the VMs. Each CPU has one or more memory controllers, and each controller manages a chunk of the RAM attached to that CPU. When a VM is set up with more vRAM than is available on one NUMA node, that memory gets split across nodes, and that costs performance.
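A quick way to see where those boundaries sit on a given host and whether VMs are being allowed to straddle them (a minimal sketch using the in-box Hyper-V module):

    # Show each NUMA node's processors and memory on this Hyper-V host
    Get-VMHostNumaNode

    # Is the host allowed to span a VM's memory across NUMA nodes?
    Get-VMHost | Select-Object NumaSpanningEnabled

    # Compare against what the VMs have been assigned
    Get-VM | Select-Object Name, ProcessorCount, MemoryStartup | Sort-Object MemoryStartup -Descending

If a VM's startup RAM is larger than what a single node reports, that VM is going to span, which is exactly the performance hit described above.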

Bottlenecks not necessarily in order:

  • Disk subsystem is vastly underperforming (in-guest latency and in-guest/host Disk Queue Length are the key measures; see the counter sample after this list)
    • Latency: triple-digit milliseconds = BAD
    • Disk Queue Length: greater than # of disks / 2 in a RAID 6 set = BAD (with 8 disks in RAID 6 a DQL of 4-5 is okay)
  • vCPUs assigned to a VM is greater than the number of physical cores minus one on a single CPU (the CPU pipeline has to juggle all of those vCPU threads in parallel)
  • vRAM assigned spans NUMA nodes or takes up too much of a single NUMA node's memory
  • Broadcom Gigabit VMQ left enabled at the port level
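Here's one way to sample those disk counters with Get-Counter, run on the host and again inside the guest (the counter paths assume English-language counter names):

    # Sample latency (seconds per transfer) and queue length every 5 seconds for a minute
    $counters = "\PhysicalDisk(*)\Avg. Disk sec/Transfer",
                "\PhysicalDisk(*)\Current Disk Queue Length"
    Get-Counter -Counter $counters -SampleInterval 5 -MaxSamples 12 |
        ForEach-Object { $_.CounterSamples | Format-Table Path, CookedValue -AutoSize }

Avg. Disk sec/Transfer reports in seconds, so anything at 0.100 or above is the triple-digit-millisecond territory called out above.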

The key in all of this, though, and it's absolutely CRITICAL, is this: Know your workloads!

All of the hardware and software performance knowledge in the world won’t help if we don’t know what our workloads are going to be doing.

An unhappy situation is spec'ing out a six- to seven-figure hyper-converged solution and having the client come back and say, "Take it away, I'm fed up with the poor performance." In that case the vendor over-promised and under-delivered.

Some further reading:

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
Our Cloud Service

Thursday, 12 May 2016

RDMA via RoCE 101 for Storage Spaces Direct (S2D)

We’ve decided to run with RoCE (RDMA over Converged Ethernet) for our Storage Spaces Direct (S2D) proof of concept (PoC).


  • (4) Intel Server Systems R2224WTTYS
    • Dual Intel Xeon processors, 256GB ECC RAM, dual Mellanox ConnectX-3, and Intel X540-T2 NICs
    • Storage is a mix of 10K SAS and Intel SATA SSDs to start
  • (2) Mellanox MSX1012 56Gbps Switches
  • (2) NETGEAR XS712T 10GbE Switches
  • (2) Cisco SG500x-48 Gigabit Switches
  • APC 1200mm 42U Enclosure
  • APC 6kVA 220V UPS with extended runtime batteries
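RoCE only behaves when DCB/PFC is configured end to end, on the hosts and on the switches. The following is a minimal host-side sketch of the usual SMB Direct pattern; the adapter names, priority 3, and the 50% bandwidth reservation are placeholders to tune for your fabric, not our finalized PoC settings:

    # Data Center Bridging feature on each node
    Install-WindowsFeature Data-Center-Bridging

    # Tag SMB Direct (TCP 445) traffic with 802.1p priority 3
    New-NetQosPolicy "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3

    # Enable priority flow control only on the SMB priority
    Enable-NetQosFlowControl -Priority 3
    Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7

    # Reserve bandwidth for SMB and apply QoS to the RDMA ports ("Mellanox Port 1/2" are placeholders)
    New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 50 -Algorithm ETS
    Enable-NetAdapterQos -Name "Mellanox Port 1","Mellanox Port 2"

    # Don't let the switch push DCB settings down to the host
    Set-NetQosDcbxSetting -Willing $false

    # Verify RDMA is active on the adapters
    Get-NetAdapterRdma

The matching PFC and ETS settings have to be in place on the MSX1012s as well, which is a big part of why we've pulled the resources below together.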

The following is a list of resources we’ve gathered together as of this writing:

This is by far not the most comprehensive of lists. The best place to start, in our opinion, is with Didier's video and the PDF of the slides from that video, then move on to the Mellanox resources.

We’ll update this blog post as we come across more materials and eventually get a process guide in place.

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
Our Cloud Service