Tuesday 22 November 2016

Something to be Thankful For

There are many things to be grateful for. For us, it's our family, friends, business, and so much more.

This last weekend we were reminded in a not so subtle way just how fragile life can be.


The other driver, two of my kids, and myself were all very fortunate to walk away with no bones broken or blood spilled. We had a big grug later in the day when we are all finally back together at home.

Dealing with the soreness and migraines since the accident are a small price to pay for the fact that we are all okay.

And fortunately, the other driver took full responsibility for the critical error in judgement that caused the accident so no insurance scrambles will be dealt with.

We are truly thankful to be alive today.

Have a great Thanksgiving to our US neighbours. And for everyone, give those special folks in life a hug I sure have been! ;)

Philip Elder
Microsoft High Availability MVP
Co-Author: SBS 2008 Blueprint Book
Our Cloud Service

Tuesday 15 November 2016

What’s in a Lab? Profit!

Our previous post on Server Hardware: The Data Bus is Playing Catch-Up has had a lot of traction.

Our tweets on I.T. companies not having a lab for their solutions sales engineers and technicians has had a lot of traction.

So, let’s move forward with a rather blunt opinion piece shall we? ;)

What client wants to drop $25K on an 800bhp blown 454CID engine then shovel it in to that Vega/Monza only to find the car twisted into a pretzel on the first run and very possibly the driver with serious injuries or worse?


Image credit

Seriously, why wouldn’t the same question be asked by a prospect or client that is about to drop $95K or more on a Storage Spaces Direct (S2D) cluster that the I.T. provider has _never_ worked with? Does the client or prospect even think of asking that question? Are there any references with that solution in production? If the answer is “No” then get the chicken out of that house!

In the automotive industry folks ask those questions especially when they have some serious coin tied up in the project … at least we believe they would based on previous experience.

Note that there are a plethora of videos on YouTube and elsewhere showing the results of so-called “tuners” blowing the bottom end out of an already expensive engine. :P

In all seriousness though, how can an I.T. company sell a solution to a client that they’ve never worked with, put together, tested, or even _seen_ before?

It really surprised me to be chatting with a technical architect that works for a large I.T. provider when they told me their company doesn’t believe there is any value in providing a lab for them.

S2D Lab Setup

A company that keeps a lab, refreshes it every so often, stands to gain so much more than folks that count the beans may see.

For S2D, the following is a good place, and inexpensive, to start:

  • Typical 4-node S2D lab based on Intel Server Systems
    • R2224WTTYSR Servers: $15K each
    • Storage
      • Intel 750 Series NVMe $1K/Node
      • Intel 3700 Series SATA $2K/Node
      • Seagate/HGST Spindles $3K/Node
    • Mellanox RDMA Networking: $18K (MSX1012X + 10GbE CX-3 Adapters)
    • NETGEAR 10GbE Networking: $4K (XS716T + X540-T2 or X550-T2)
    • Cost: ~$75K to $85K

The setup should look something like this:


S2D Lab (Front)


S2D Lab (Rear)

Note that we have two extra nodes for a Hyper-V cluster setup to work with S2D as a SOFS only solution.

Okay, so the bean counters are saying, “what do we get for our $100K hmmm?”

Point 1: We’ve Done It

The above racked systems images go into any S2D Proposal with an explanation that we’ve been building these hyper-converged clusters since Windows Server 2016 was in its early technical preview days. The prospect that sees the section outlining our efforts to fine tune our solutions on our own dime places our competitors at a huge disadvantage.

Point 2: References

With our digging in and testing from the outset we would be bidding on deals with these solutions. As a result, we are one of the few with go-to-market ready solutions and will have deployed them before most others out there even know what S2D is!

Point 3: Killer and Flexible Performance

Most solutions we would be bidding against are traditional SAN style configurations. Our hyper-converged S2D platform provides a huge step up over these solutions in so many ways:

  1. IOPS: NVMe utilized at the cache layer for real IOPS gains over traditional SAN either via Fibre Channel or especially iSCSI.
  2. Throughput: Our storage can be set up to run huge amounts of data through the pipe if required.
  3. Scalability: We can start off small and scale out up to 16 nodes per cluster.
    • 2-8 nodes @ 10GbE RDMA via Mellanox and RoCEv2
    • 8-16 nodes @ 40GbE RDMA via Mellanox and RoCEv2
      • Or, 100GbE RDMA via Mellanox and RoCEv2

This begs the question: How does one know how one’s solution is going to perform if one has never deployed it before?

Oh, we know: “I’ve read it in Server’s Reports”, says the lead sales engineer. ;)

Point 4: Point of Principle

It has been mentioned here before: We would never,_ever_, deploy a solution that we’ve not worked with directly.


For one, because we want to make sure our solution would fulfil the promises we’ve made around it. We don’t want to be called to come and pick up our high availability solution because it does not do what it was supposed to do. We’ve heard of that happening for some rather expensive solutions from other vendors.

Point 5: Reputation

Our prospects can see that we have a history, and a rather long one at that, of digging in quite deep both in our own pockets but also of our own time to develop our solution sets. That also tells them that we are passionate about the solutions we propose.

We _are_ Server’s Reports so we don’t need to rely on any third party for a frame of reference! ;)


Finally, an I.T. company that invests in their crew both in lab kit, time, training, and mentorship will find their crew quite passionate about the solutions they are selling and working with. That translates into sales but also happy clients that can see for themselves that they are getting a great value for their I.T. dollars.

I.T. Services Companies get and maintain a lab! It is worth it!

Philip Elder
Microsoft High Availability MVP
Co-Author: SBS 2008 Blueprint Book
Our Cloud Service

Saturday 12 November 2016

Server Hardware: The Data Bus is Playing Catch-Up

After seeing the Mellanox ConnectX-6 200Gb announcement the following image came to mind:


Image credit

The Vega/Monza was a small car that some folks found the time to stuff a 454CID Chevy engine into then drop a 671 or 871 series roots blower on (they came off trucks back in the day). The driveline and the "frame" were then tweaked to accommodate it.

The moral of the story? It was great to have all of that power but putting down to the road was always a problem. Check out some of the "tubbed" Vega images out there to see a few of the ways to do so.

Our server hardware today does not, unfortunately, have the ability to be "tubbed" to allow us to get things moving.

PCI Express

The PCI Express (PCIe) v3 spec (Wikipedia) at a little over 15GB/Second (that's Gigabytes not Gigabits) across a 16 lane connector falls far short of the needed bandwidth for a dual port 100Gb ConnectX-5 part.

As a point of reference, the theoretical throughput of one 100Gb port is about 12.5GB/Second. That essentially renders the dual port ConnectX-5 adapter a moot point as that second port has very little left for it to use. So, it becomes essentially a "passive" port to a second switch for redundancy.

A quick search for "Intel Server Systems PCIe Gen 4" yields very little in the way of results. We know we are about due for a hardware step as the "R" code (meaning refresh such as R2224WTTYSR) is coming into its second to third year in 2017.

Note that the current Intel Xeon Processor E5-2600 v4 series only has a grand total of 40 PCI Express Generation 3 lanes available. Toss in two PCIe x16 wired lanes with two ConnectX-4 100Gb adapters and that's going to be about it for real throughput.

Connectivity fabric bandwidth outside the data bus is increasing in leaps and bounds. Storage technologies such as NVMe and now NVDIMM-N, 3D XPoint, and other such memory bus direct storage technologies are either centre stage or coming on to the stage.

The current PCIe v3 pipe is way too small. The fourth generation PCI Express pipe that is not even in production is _already_ too small! It's either time for an entirely new bus fabric or a transitioning of the memory bus into either a full or intermediate storage bus which is what NVDIMM-N and 3D XPoint are hinting at.

Oh, and one more tiny point: Drawing storage into the memory bus virtually eliminates latency ... almost.

Today's Solutions

Finally, one needs to keep in mind that the server platforms we are deploying on today have very specific limitations. We've already hit some limits in our performance testing (blog post: Storage Configuration: Know Your Workloads for IOPS or Throughput).

With our S2D solutions looking to three, five, or more years of service life these limitations _must_ be at the forefront of our thought process when in discovery and then solution planning.

If not, we stand to have an unhappy customer calling us to take the solution back after we deploy or a call a year or two down the road when they hit the limits.


Author's Note: I was just shy of my Journeyman's ticket as a mechanic, in a direction towards high-performance, when the computer bug bit me. ;)

Philip Elder
Microsoft High Availability MVP
Co-Author: SBS 2008 Blueprint Book
Our Cloud Service