Wednesday, 7 December 2016

AutoDiscover “Broken” in Outlook 2016

We have a client that has their services spread across a number of different cloud systems.

Recently, users started “losing” their connection to their hosted Exchange mailbox, with Outlook coughing up some really strange errors that were not very helpful.

We ran the gamut trying to figure things out, since calls to the hosted Exchange provider, and eventually to the company hosting their web site, always came back with “There’s a problem with Outlook”.


What we’ve managed to figure out is that Outlook 2016 will _always_ run an AutoDiscover check even if we’re manually setting up the mailbox for _any_ Exchange ActiveSync (EAS) connection. It must be some sort of new “security” feature in Outlook 2016.

What does that mean?

It means that when something changes unbeknownst to us, things break. :(

In this case, it was the combination of the AutoDiscover setup in Outlook for EAS connections and the web host changing something on their end, as things were working for a _long_ time before the recent problems. Or, a recent update to Outlook 2016 changed things on the AutoDiscover side that revealed what was happening on the www hosting side.

Okay, back to the problem at hand. This is the prompt we would get when setting up a new mailbox, and that eventually all users with existing mailbox connections started getting:


Internet E-mail

Enter your user name and password for the following server:


Well, our mailboxes are on a third party and not HostGator. So, on to chatting and eventually phoning them after opening a ticket with the Exchange host and hearing back that the problem was elsewhere.

Unfortunately, HostGator was not very helpful via chat or phone when we initially reached out. Outlook was always the problem, they claimed.

So, we set up a test mailbox on the hosted Exchange platform and went to our handy Microsoft tool: Microsoft Remote Connectivity Analyzer.

We selected the Outlook Autodiscover option and ran through the steps setting up the mailbox information, then the CAPTCHA a few times ;-), and received the following results:


We now had concrete evidence that HostGator was not honouring the DNS setup we had for this domain which was not on their system.

A question was sent out to a fellow MVP on Exchange and their reply back was “HostGator had a URLReWrite rule in place for IIS/Apache that was grabbing the AutoDiscover polls from Outlook and sending them to their own servers.”

During that time we created the /AutoDiscover folder and put a test file in it. The problem still happened.

Okay, back on the phone with HostGator support. The first call had two “escalations” associated with it unfortunately with no results. A second call was made after seeing the MVP response with a specific request to HostGator: Delete the URLReWrite rule that was set up on this client’s site within the last month.

They could not do it. Nothing. Nada. Zippo. :(

So, for now our workaround was to move the DNS A record for @ to the same IP as the hosted Exchange service’s AutoDiscover IP to at least get Outlook to fail on the initial domain poll.
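To see why moving the root A record helps, it can be useful to list the endpoints a client typically probes for a given e-mail address. The following is a minimal Python sketch, assuming the commonly documented lookup order for a client that is not joined to the mailbox's domain; actual Outlook 2016 builds may behave differently. The function name `autodiscover_candidates` and the `example.com` address are hypothetical and only for illustration.

```python
def autodiscover_candidates(email_address):
    """Return the AutoDiscover endpoints a client commonly tries, in order.

    Assumption: this is the commonly documented order for a client that
    is not joined to the mailbox's domain. Real Outlook builds may differ.
    """
    domain = email_address.rsplit("@", 1)[1].lower()
    return [
        # 1. Root domain over HTTPS: the poll the web host was answering.
        f"https://{domain}/autodiscover/autodiscover.xml",
        # 2. The autodiscover host name over HTTPS.
        f"https://autodiscover.{domain}/autodiscover/autodiscover.xml",
        # 3. Plain HTTP on the autodiscover host, expecting a redirect.
        f"http://autodiscover.{domain}/autodiscover/autodiscover.xml",
        # 4. DNS SRV record lookup.
        f"_autodiscover._tcp.{domain}",
    ]


if __name__ == "__main__":
    for step, target in enumerate(autodiscover_candidates("user@example.com"), 1):
        print(step, target)
```

The first entry is the one that matters here: if the web host answers on the root domain, the client never gets as far as the records we set up for the hosted Exchange service.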

Moral of the story?

We’re moving all of our client’s web properties off HostGator to a hosting company that will honour the setup we implement, and we will use the Microsoft Remote Connectivity Analyzer to test things out thoroughly.

Philip Elder
Microsoft High Availability MVP
Co-Author: SBS 2008 Blueprint Book
Our Cloud Service

Tuesday, 22 November 2016

Something to be Thankful For

There are many things to be grateful for. For us, it's our family, friends, business, and so much more.

This last weekend we were reminded in a not so subtle way just how fragile life can be.


The other driver, two of my kids, and I were all very fortunate to walk away with no bones broken or blood spilled. We had a big group hug later in the day when we were all finally back together at home.

Dealing with the soreness and migraines since the accident is a small price to pay for the fact that we are all okay.

And fortunately, the other driver took full responsibility for the critical error in judgement that caused the accident so no insurance scrambles will be dealt with.

We are truly thankful to be alive today.

Happy Thanksgiving to our US neighbours. And for everyone, give those special folks in your life a hug. I sure have been! ;)


Tuesday, 15 November 2016

What’s in a Lab? Profit!

Our previous post on Server Hardware: The Data Bus is Playing Catch-Up has had a lot of traction.

Our tweets on I.T. companies not having a lab for their solutions sales engineers and technicians have had a lot of traction.

So, let’s move forward with a rather blunt opinion piece, shall we? ;)

What client wants to drop $25K on an 800bhp blown 454CID engine then shovel it in to that Vega/Monza only to find the car twisted into a pretzel on the first run and very possibly the driver with serious injuries or worse?



Seriously, why wouldn’t the same question be asked by a prospect or client that is about to drop $95K or more on a Storage Spaces Direct (S2D) cluster that the I.T. provider has _never_ worked with? Does the client or prospect even think of asking that question? Are there any references with that solution in production? If the answer is “No” then get the chicken out of that house!

In the automotive industry folks ask those questions especially when they have some serious coin tied up in the project … at least we believe they would based on previous experience.

Note that there is a plethora of videos on YouTube and elsewhere showing the results of so-called “tuners” blowing the bottom end out of an already expensive engine. :P

In all seriousness though, how can an I.T. company sell a solution to a client that they’ve never worked with, put together, tested, or even _seen_ before?

It really surprised me to be chatting with a technical architect that works for a large I.T. provider when they told me their company doesn’t believe there is any value in providing a lab for them.

S2D Lab Setup

A company that keeps a lab, and refreshes it every so often, stands to gain so much more than folks that count the beans may see.

For S2D, the following is a good, and inexpensive, place to start:

  • Typical 4-node S2D lab based on Intel Server Systems
    • R2224WTTYSR Servers: $15K each
    • Storage
      • Intel 750 Series NVMe $1K/Node
      • Intel 3700 Series SATA $2K/Node
      • Seagate/HGST Spindles $3K/Node
    • Mellanox RDMA Networking: $18K (MSX1012X + 10GbE CX-3 Adapters)
    • NETGEAR 10GbE Networking: $4K (XS716T + X540-T2 or X550-T2)
    • Cost: ~$75K to $85K

The setup should look something like this:


S2D Lab (Front)


S2D Lab (Rear)

Note that we have two extra nodes for a Hyper-V cluster setup to work with S2D as a SOFS only solution.

Okay, so the bean counters are saying, “what do we get for our $100K hmmm?”

Point 1: We’ve Done It

The above racked systems images go into any S2D Proposal with an explanation that we’ve been building these hyper-converged clusters since Windows Server 2016 was in its early technical preview days. The prospect that sees the section outlining our efforts to fine tune our solutions on our own dime places our competitors at a huge disadvantage.

Point 2: References

With our digging in and testing from the outset we would be bidding on deals with these solutions. As a result, we are one of the few with go-to-market ready solutions and will have deployed them before most others out there even know what S2D is!

Point 3: Killer and Flexible Performance

Most solutions we would be bidding against are traditional SAN style configurations. Our hyper-converged S2D platform provides a huge step up over these solutions in so many ways:

  1. IOPS: NVMe utilized at the cache layer for real IOPS gains over traditional SAN either via Fibre Channel or especially iSCSI.
  2. Throughput: Our storage can be set up to run huge amounts of data through the pipe if required.
  3. Scalability: We can start off small and scale out up to 16 nodes per cluster.
    • 2-8 nodes @ 10GbE RDMA via Mellanox and RoCEv2
    • 8-16 nodes @ 40GbE RDMA via Mellanox and RoCEv2
      • Or, 100GbE RDMA via Mellanox and RoCEv2

This begs the question: How does one know how one’s solution is going to perform if one has never deployed it before?

Oh, we know: “I’ve read it in Server’s Reports”, says the lead sales engineer. ;)

Point 4: Point of Principle

It has been mentioned here before: We would never, _ever_, deploy a solution that we’ve not worked with directly.


For one, because we want to make sure our solution would fulfil the promises we’ve made around it. We don’t want to be called to come and pick up our high availability solution because it does not do what it was supposed to do. We’ve heard of that happening for some rather expensive solutions from other vendors.

Point 5: Reputation

Our prospects can see that we have a history, and a rather long one at that, of digging in quite deep, both into our own pockets and into our own time, to develop our solution sets. That also tells them that we are passionate about the solutions we propose.

We _are_ Server’s Reports so we don’t need to rely on any third party for a frame of reference! ;)


Finally, an I.T. company that invests in their crew through lab kit, time, training, and mentorship will find their crew quite passionate about the solutions they are selling and working with. That translates into sales but also happy clients that can see for themselves that they are getting great value for their I.T. dollars.

I.T. services companies: get and maintain a lab! It is worth it!


Saturday, 12 November 2016

Server Hardware: The Data Bus is Playing Catch-Up

After seeing the Mellanox ConnectX-6 200Gb announcement the following image came to mind:



The Vega/Monza was a small car that some folks found the time to stuff a 454CID Chevy engine into then drop a 671 or 871 series roots blower on (they came off trucks back in the day). The driveline and the "frame" were then tweaked to accommodate it.

The moral of the story? It was great to have all of that power but putting it down to the road was always a problem. Check out some of the "tubbed" Vega images out there to see a few of the ways to do so.

Our server hardware today does not, unfortunately, have the ability to be "tubbed" to allow us to get things moving.

PCI Express

The PCI Express (PCIe) v3 spec (Wikipedia) at a little over 15GB/Second (that's Gigabytes not Gigabits) across a 16 lane connector falls far short of the needed bandwidth for a dual port 100Gb ConnectX-5 part.

As a point of reference, the theoretical throughput of one 100Gb port is about 12.5GB/Second. That essentially renders the dual port ConnectX-5 adapter a moot point as that second port has very little left for it to use. So, it becomes essentially a "passive" port to a second switch for redundancy.
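The arithmetic behind those two figures can be sketched in a few lines of Python. This is a rough illustration only, assuming the published PCIe 3.0 signalling rate of 8 GT/s per lane with 128b/130b encoding, and ignoring protocol overhead on both the PCIe and Ethernet sides. The function names are hypothetical.

```python
# PCIe 3.0 signalling: 8 GT/s per lane with 128b/130b encoding.
PCIE3_TRANSFERS_PER_S = 8e9
PCIE3_ENCODING = 128 / 130


def pcie3_gbytes_per_s(lanes):
    """Theoretical one-direction PCIe 3.0 bandwidth in GB/s."""
    return PCIE3_TRANSFERS_PER_S * PCIE3_ENCODING * lanes / 8 / 1e9


def ethernet_gbytes_per_s(gbits, ports=1):
    """Theoretical line rate in GB/s for a number of Ethernet ports."""
    return gbits * ports / 8


slot = pcie3_gbytes_per_s(16)                    # a little over 15 GB/s
one_port = ethernet_gbytes_per_s(100)            # 12.5 GB/s
two_ports = ethernet_gbytes_per_s(100, ports=2)  # 25 GB/s

print(f"x16 slot:                 {slot:.2f} GB/s")
print(f"one 100Gb port:           {one_port:.2f} GB/s")
print(f"left for the second port: {slot - one_port:.2f} GB/s")
print(f"shortfall for both ports: {two_ports - slot:.2f} GB/s")
```

Real-world numbers will land below these theoretical ceilings, which only makes the squeeze on that second port tighter.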

A quick search for "Intel Server Systems PCIe Gen 4" yields very little in the way of results. We know we are about due for a hardware step as the "R" code (meaning refresh such as R2224WTTYSR) is coming into its second to third year in 2017.

Note that the current Intel Xeon Processor E5-2600 v4 series only has a grand total of 40 PCI Express Generation 3 lanes available per processor. Toss in two PCIe x16 wired slots with two ConnectX-4 100Gb adapters and that's going to be about it for real throughput.

Connectivity fabric bandwidth outside the data bus is increasing in leaps and bounds. Storage technologies such as NVMe and now NVDIMM-N, 3D XPoint, and other such memory bus direct storage technologies are either centre stage or coming on to the stage.

The current PCIe v3 pipe is way too small. The fourth generation PCI Express pipe that is not even in production is _already_ too small! It's either time for an entirely new bus fabric or a transitioning of the memory bus into either a full or intermediate storage bus which is what NVDIMM-N and 3D XPoint are hinting at.

Oh, and one more tiny point: Drawing storage into the memory bus virtually eliminates latency ... almost.

Today's Solutions

Finally, one needs to keep in mind that the server platforms we are deploying on today have very specific limitations. We've already hit some limits in our performance testing (blog post: Storage Configuration: Know Your Workloads for IOPS or Throughput).

With our S2D solutions looking to three, five, or more years of service life these limitations _must_ be at the forefront of our thought process when in discovery and then solution planning.

If not, we stand to have an unhappy customer calling us to take the solution back after we deploy or a call a year or two down the road when they hit the limits.


Author's Note: I was just shy of my Journeyman's ticket as a mechanic, in a direction towards high-performance, when the computer bug bit me. ;)


Monday, 24 October 2016

Windows Server 2016 Feature Comparison Summary

This is a direct snip of the differences between Windows Server Standard and Datacenter from this PDF:

Game Changing

A _lot_ of folks use the words “game changing” to somehow set their products and/or services apart from others. Most of the time when we hear those words the products and/or services are “ho-hum” with little to no real impact on a company’s or user’s day-to-day work/personal life.
We’ve heard those words in our industry _a lot_ over the last number of years. The reality has been somewhat different for most of us.
We believe, based on our experience deploying inexpensive storage (clustered Scale-Out File Server/Storage Spaces) and compute (clustered Hyper-V) high availability solutions that Windows Server 2016 is indeed _game changing_ on three fronts:
  1. Compute
  2. Storage
  3. Networking
The Software Defined Data Centre (SDDC) story in Windows Server 2016 has traditional cluster, storage, and networking vendors concerned. In our opinion, deeply concerned. Just watch key vendors’ stocks, as we’ve been doing these last four or five years, to see just how much of an impact the Server 2016 SDDC story will have over the next two to five years. Some stocks have already reflected the inroads Hyper-V and, more recently (the last two years), Storage Spaces have made into their markets. We’re a part of that story! :)
Using true commodity hardware we are able to set up an entire SDDC for a fraction of the cost of traditional data centre solutions. Not only that, with the Hyper-Converged Storage Spaces Direct platform we can provide lots of IOPS and compute in a small footprint without all of the added complexity and expense in traditional data centre solutions.

First Server 2016 Deployment

We’ve already deployed our first Windows Server 2016 Clustered Storage Spaces cluster on the General Availability (GA) bits while in Las Vegas two weeks ago:
That’s two Intel Server Systems R1208JP4OC 1U servers and a Quanta QCT JB4602 JBOD outfitted with (6) 200GB HGST SAS SSDs and (54) 8TB Seagate NearLine SAS drives. We are setting up for a Parity space to provide maximum storage availability as this client produces lots of 4K video. (EDIT NOTE: Updated the drive size)
Cost of the highly available storage solution is a fraction of the cost we’d see from Tier 1 storage or hardware vendors.

Going Forward

It’s no secret that we are excited about the Server 2016 story. We plan on posting a lot more about why we believe Windows Server 2016 is a game changer with the specifics around the above mentioned three areas. We may even mention some of the vendors’ stock tickers to add to your watch list too! ;)

Who is MPECS Inc.?

A bit of a shameless plug.
We’ve been building SDDC setups for small to medium hosting companies along with SMB/SME consultants and clients that are concerned about “Cloud being their data on someone else’s computer” since 2008/2009.
Our high availability solutions are designed with SMB (think sub $12K for a 2-node cluster) and SME (think sub $35K cluster) in mind along with SMB/SME focused hosting providers (think sub $50K to start). Our solutions are flexible and can be designed with the 3 year and 5 year stories, or more, in mind.
We run our Cloud on-premises, hybrid, or in _our_ Cloud that runs on _our_ highly available solutions.
Curious how we can help you? Then, please feel free to ask!
Have a great day and thanks for reading.

Tuesday, 18 October 2016

Some Thoughts on the New Windows Server Per Core Licensing Model

This is a post sent to the SBS2K Yahoo Group.


Let’s take things from this perspective: Nothing has changed at our level.

How’s that?

We deploy four, six, and eight core single socket and dual socket servers.

We have not seen the need for four socket servers since one can set things up today with eight processors in one 2U space (4 nodes) without too much finagling and end up with way more performance.

Until Intel has conquered the GHz to core count ratio we will be deploying high GHz low core count CPUs for the time being. Not that eight cores is a “low” count in our world. ;)

Most of us will never see the need to set up a dual socket server with more than 12 cores, with 16 cores being a less common setup for us.

Our sweet spot right now in dual socket server cluster nodes is the E5-2643v4 at 6 cores and 3.4 GHz. For higher intensity workloads we run with the E5-2667v4 at 8 cores and 3.2 GHz. Price wise, these are the best bang for the buck relative to core count versus GHz.

With the introduction of the two node configuration for Storage Spaces Direct (S2D) we have an option to provide high availability (HA) for our smaller clients using two single socket 1U servers (R1208SPOSHOR or Dell R330) for a very reasonable cost. A Datacenter license is required for each node. Folks may balk at that, but keep this in mind:


What does that mean? It means that we can SPLA the Datacenter license whether the client leases the equipment, which was standard fare back in the day, or they own it but we SaaS the Windows Server licenses. Up here in Canada those licenses are about $175 per 16 cores. That’s $350/Month for an HA setup. We see a _huge_ market for this setup in SMB and SME. Oh, and keep in mind that we can then be very flexible about our VM layout. ;)
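As a rough illustration of that monthly figure, here is a small Python sketch. It assumes the Windows Server 2016 per-core minimums (at least 8 core licenses per processor and at least 16 per server) and simply pro-rates the $175 per 16 cores quoted above; the pro-rating and the function names are our own assumptions for illustration, not a price list.

```python
SPLA_PRICE_PER_16_CORES = 175  # monthly figure quoted in the post


def licensed_cores(sockets, cores_per_socket):
    """Cores to license under the Windows Server 2016 minimums:
    at least 8 per processor and at least 16 per server."""
    per_socket = max(cores_per_socket, 8)
    return max(sockets * per_socket, 16)


def monthly_spla_cost(nodes, sockets, cores_per_socket):
    """Monthly Datacenter cost, pro-rated from the 16-core price.

    Assumption: pricing scales linearly with licensed cores.
    """
    cores = licensed_cores(sockets, cores_per_socket)
    return nodes * cores * SPLA_PRICE_PER_16_CORES / 16


if __name__ == "__main__":
    # Two single socket 1U nodes, four cores each, as in the 2-node S2D setup.
    print(monthly_spla_cost(nodes=2, sockets=1, cores_per_socket=4))
```

Any node with 16 or fewer cores lands on the same 16-core minimum, which is why the two node setup works out to two times the base figure.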

The licensing change reminds me of VMware’s changes a number of years back where they received so much backpressure that the “Core Tax” changes got reverted. So far, there’s not been a lot of backpressure that we’ve seen about this change. But then, the bulk of the on-premises SMB/SME world, where we spend most of our time, doesn’t deploy servers with more than 16 cores.

In the end, as I stated at the beginning, nothing has changed for us.

We’re still deploying “two” Server Standard licenses, now in this case with 16 cores per server, for our single box solutions with four VMs. And, we’re deploying “four” Server Standard licenses for our two node Clustered Storage Spaces and Hyper-V via shared storage that also yields four VMs in that setting.

If, and when, we move into the higher density core counts for our cluster setups, or even standalone boxes, we will cross that bridge when we come to it.
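The “two” and “four” counts follow from each Standard license covering two VMs on a fully licensed server, with every cluster node licensed for all of the VMs that could land on it. A minimal Python sketch of that counting, assuming servers at or under 16 cores so that one license covers the whole box (the function name is hypothetical):

```python
import math


def standard_licenses(vms, nodes=1):
    """Server Standard licenses needed for a VM count.

    Assumptions: each license covers two VMs on a server with 16 or
    fewer cores, and every node is licensed for all VMs so that they
    can fail over to any node.
    """
    per_node = math.ceil(vms / 2)
    return per_node * nodes


if __name__ == "__main__":
    print(standard_licenses(4))           # single box with four VMs
    print(standard_licenses(4, nodes=2))  # two node cluster with four VMs
```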

Have a great day everyone and thanks for reading. :)


Tuesday, 20 September 2016

Windows Server 2016: Storage Spaces Direct (S2D) and VMware Virtual SAN IOPS?

We’re not VMware experts here. Let’s make that very clear. 

However, we do have the ear of many a VMware expert so we are able to ask: Is the Windows Server 2016 hyper-converged Storage Spaces Direct (S2D) setup similar to the VMware Virtual SAN platform? Apples to Apples if you will?

The resounding answer has been, “Yes!”

That leads us to the following articles that are about Intel NVMe all-flash hyper-converged setups:

  1. Record Performance of All Flash NVMe Configuration – Windows Server 2016 and Storage Spaces Direct - IT Peer Network
    • An Intel based four node S2D setup
  2. VMware Virtual SAN 6.2 Sets New Record - IT Peer Network
    • A VMware based 8 node Virtual SAN 6.2 setup

What’s interesting to note is the following:

  1. Intel/S2D setup at 4 nodes
    • 3.2M IOPS at 4K Read
      • 930K IOPS at 70/30 Read/Write
  2. VMware setup at 8 nodes
    1. 1.2M IOPS @ Capacity Mode
      • 500K IOPS at 70/30 Read/Write
    2. ~830K IOPS @ Balance Mode
      • 800K IOPS at 70/30 Read/Write
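Since the two setups have different node counts, a per-node view makes the comparison a little easier to read. The Python sketch below just divides the headline figures listed above by the node count. It is a rough normalization only: the hardware and test conditions in the two articles differ, so treat the output as a conversation starter and not as a benchmark. The function name is hypothetical.

```python
def iops_per_node(total_iops, nodes):
    """Divide a cluster-wide IOPS figure evenly across its nodes."""
    return total_iops / nodes


# Headline figures as listed above: (label, total IOPS, node count).
results = [
    ("Intel/S2D 4K Read", 3_200_000, 4),
    ("Intel/S2D 70/30 Read/Write", 930_000, 4),
    ("VMware Capacity Mode", 1_200_000, 8),
    ("VMware Capacity Mode 70/30 Read/Write", 500_000, 8),
]

for label, total, nodes in results:
    print(f"{label}: {iops_per_node(total, nodes):,.0f} IOPS per node")
```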

What is starting to become apparent, at least to us, is that the folks over at the newly minted Dell EMC _really_ need to pay attention to Microsoft’s feature depth in Windows Server 2016. In fact, that goes without saying for all vendors into virtualization, high performance computing (HPC), storage, and data centre products.

We’re pretty excited about the hyper-converged S2D feature set in Windows Server 2016. So much so, we have invested quite heavily in our Proof-of-Concept (PoC).


The bill of materials (BoM) for the setup so far:

  • S2D Nodes (4)
    • Intel Server Systems R2224WTTYS with dual E5-2640, 256GB ECC, Intel JBOD HBA, Intel X540-T2 10GbE, Mellanox 56GbE NICs, Intel PCIe 750 NVMe, Intel SATA SSDs, and Seagate 10K SAS spindles
  • Hyper-V Nodes (2)
    • Intel Server Systems R2208GZ4GC with dual E5-2650, 128GB ECC, Intel X540-T2 10GbE, and Mellanox 56GbE NICs
  • Lab DC
    • An Intel Server setup including S1200BT series board and Intel Xeon E3 processor.

Our testing includes using the S2D setup as a hyper-converged platform but also as a Scale-Out File Server (SOFS) cluster destination for a Hyper-V cluster on the two Hyper-V nodes. Then, some testing of various configurations beyond.

We believe Windows Server 2016 is looking to be one of the best server operating systems Microsoft has ever released. Hopefully we won’t be seeing any major bugs in the production version!


Tuesday, 26 July 2016

Some Disaster Recovery Planning On-Premises, Hybrid, and Cloud Thoughts

This was a post to the SBS2K Yahoo list in response to a comment about the risks of encrypting all of our domain controllers (which we have been moving towards for a year or two now). It’s been tweaked for this blog post.


We’ve been moving to 100% encryption in all of our standalone and cluster settings.

Encrypting a setup does not change _anything_ as far as Disaster Recovery Plans go. Nothing. Period.

The “something can go wrong there” attitude should apply to everything from on-premises storage (we’ve been working with a firm that had Gigabytes/Terabytes of data lost due to the previous MSP’s failures) and services to Cloud resident data and services.

No stone should be left unturned when it comes to backing up data and Disaster Recovery Planning. None. Nada. Zippo. Zilch.

The new paradigm from Microsoft and others has migrated to “Hybrid” … for the moment. Do we have a backup of the cloud data and services? Is that backup air-gapped?

Google lost over 150K mailboxes a number of years back. We worked with one panicked caller who lost everything, with no way to get it back. What happens then?

Recently, a UK VPS provider had a serious crash and, as it turns out, lost _a lot_ of data. Where are their clients now? Where are their clients’ businesses after such a catastrophic loss?

Some on-premises versus cloud based backup experiences:

  • Veeam/ShadowProtect On-Premises: Air-gapped (no user access to avoid *Locker problems), encrypted, off-site rotated, and high performance recovery = Great.
  • Full recovery from the Cloud = Dismal.
  • Partial recovery of large files/numerous files/folders from the Cloud = Dismal.
  • Garbage In = Garbage Out = Cloud backup gets the botched bits in a *Locker event.
  • Cloud provider’s DC goes down = What then?
  • Cloud provider’s Services hit a wall and failover fails = What then (this was a part of Google’s earlier mentioned problem, methinks)?
    • ***Remember, we’re talking Data Centers on a grand scale where failover testing has been done?!?***
  • At Scale:
    • Cloud/Mail/Services providers rely on a myriad of systems to provide resilience
      • Most Cloud providers rely on those systems to keep things going
    • Backups?
      • Static, air-gapped backups?
      • “Off-Site” backups?
        • These do not, IMO, exist at scale
  • The BIG question: Does the Cloud service provider have a built-in backup facility?
    • Back up the data to local drive or NAS either manually or via schedule
    • Offer a virtual machine backup off their cloud service

There is an assumption (and we all know what that means, right?) that seems to be prevalent among top tier cloud providers: that their resiliency systems will be enough to protect them from that next big bang. But, will they? We seem to already have examples of the “not”.

In conclusion to this rather long winded post I can say this: It is up to us, our clients’ trusted advisors, to make bl**dy well sure our clients’ data and services are properly protected and that a down-to-earth backup exists of their cloud services/data.

We really don’t enjoy being on the other end of a phone call that starts with “OMG, my data’s gone, the service is offline, and I can’t get anywhere without it!” :(

Oh, and BTW, our SBS 2003/2008/2011 Standard/Premium sites all had 100% Uptime across YEARS of service. :P

We did have one exception in there due to an inability to cool the server closet as the A/C panel was full. Plus, the building’s HVAC had a bunch of open primary push ports (hot in winter cold in summer) above the ceiling tiles which is where the return air is supposed to happen. In the winter the server closet would hit +40C for long periods of time as the heat would settle into that area. ShadowProtect played a huge role in keeping this firm going plus technology changes over server refreshes helped (cooler running processors and our move to SAS drives).


Some further thoughts and references in addition to the above forum post.

The moral of this story is quite simple. Make sure _all_ data is backed up and air-gapped. Period.


Thursday, 14 July 2016

Hyper-V and Cluster Important: A Proper Time Setup

We’ve been deploying DCs into virtual settings since 2008 RTM. There was not a lot of information on virtualizing anything back then. :S

In a domain setting having time properly synchronized is critical. In the physical world the OS has the on board CMOS timer to keep itself in check along with the occasional poll of servers (we use specific sets for Canada, US, EU/UK, and other areas).

In a virtual setting one needs to make sure that time sync between host and guest PDCe/DC time authority is disabled. The PDCe needs to get its time from an outside, and accurate, source. The caveat with a virtual DC is that it no longer has a physical connection with the local CMOS clock.

What does this mean? We’ve seen high load standalone and clustered guests have their time skew before our eyes. It’s not as much of a problem in 2012 R2 as it was in 2008 RTM/R2 but time related problems still happen.
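To put a number on “skew”, the standard NTP calculation compares four timestamps from one request/response exchange. The Python sketch below shows that calculation on made-up timestamps. It is an illustration under stated assumptions and not the method from the linked posts: the function names are hypothetical, and the 300 second limit is the default Kerberos clock skew tolerance in Active Directory, which a given domain may have changed.

```python
KERBEROS_MAX_SKEW_SECONDS = 300  # AD default tolerance; may differ per domain


def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP offset and round-trip delay, in seconds.

    t1: client send time (client clock)
    t2: server receive time (server clock)
    t3: server send time (server clock)
    t4: client receive time (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def skew_is_dangerous(offset, limit=KERBEROS_MAX_SKEW_SECONDS):
    """True when the measured offset exceeds the tolerance."""
    return abs(offset) > limit


if __name__ == "__main__":
    # Made-up exchange: the guest clock is 2 seconds behind the server.
    offset, delay = ntp_offset_and_delay(100.00, 102.05, 102.06, 100.11)
    print(f"offset {offset:+.2f}s, delay {delay:.2f}s")
    print("dangerous:", skew_is_dangerous(offset))
```

A positive offset means the local clock is behind the time source. A guest that drifts under load will show that offset growing between polls.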

This is an older post but outlines our dilemma: MPECS Inc. Blog: Hyper-V: Preparing A High Load VM For Time Skew.

This is the method we use to set up our PDCe as authority and all other DCs as slaves: Hyper-V VM: Set Up PDCe NTP Time Server plus other DC’s time service.

In a cluster setting we _always_ deploy a physical DC as PDCe: MPECS Inc. Blog: Cluster: Why We Always Deploy a Physical DC in a Cluster Setting. The extra cost is 1U and a very minimal server to keep the time and have a starting place if something does go awry.

In higher load settings where time gets skewed, scripting the guest DC to sync with a time server more frequently means the time server will probably send a Kiss-O-Death packet (blog post). When that happens the PDCe will move on through its list of time servers until there are no more. Then things start breaking and clusters in the Windows world start stalling or failing.

As an FYI: A number of years ago we had a client call us to ask why things were wonky with the time and some services seemed to be offline. To the VMs everything seemed to be in order but their time was whacked as was the cluster node’s time.

After digging in and bringing things back online by correcting the time on the physical DC, the cluster nodes, and the VMs everything was okay.

When the owner asked why things went wonky the only explanation I had was that something in the time source system must have gone bad which subsequently threw everything out on the domain.

They indicated that a number of users had complained about their phone’s time being whacked that morning too. Putting two and two together there must have been a glitch in the time system providing time to our client site and the phone provider’s systems. At least, that’s the closest we could come to a reason for the time mysteriously going out on two disparate systems.


Tuesday, 5 July 2016

Outlook Crashes - Exchange 2013 or Exchange 2016 Backend

We’ve been deploying Office/Outlook 2016 and Exchange 2016 CU1 in our Small Business Solution (SBS) both on-premises and in our Cloud.

Since we’re deploying Exchange with a single common name certificate we are using the method for AutoDiscover.

A lot of our searching turned up “problems” around AutoDiscover but pretty much all of them were red herrings.

It turns out that Microsoft has deprecated the RPC over HTTPS setup in Outlook 2016. What does this mean?

MAPI over HTTPS is the go-to for Outlook communication with Exchange going forward.

Well, guess what?

MAPI over HTTPS is _disabled_ out of the box!

In Exchange PowerShell check on the service’s status:

Get-OrganizationConfig | fl *mapi*

To enable:

Set-OrganizationConfig -MapiHttpEnabled $true

Then, we need to set the virtual directory configuration:

Get-MapiVirtualDirectory -Server EXCHANGESERVER | Set-MapiVirtualDirectory -InternalUrl -ExternalUrl

Verify the settings took:

Get-MapiVirtualDirectory -Server EXCHANGESERVER | FL InternalUrl,ExternalUrl

And finally, test the setup:

Test-OutlookConnectivity -RunFromServerId EXCHANGESERVER -ProbeIdentity OutlookMapiHttpSelfTestProbe

EXCHANGESERVER needs to be changed to the Exchange server name.

Hat Tip: Mark Gossa: Exchange 2013 and Exchange 2016 MAPI over HTTP


Thursday, 26 May 2016

Hyper-V Virtualization 101: Hardware and Performance

This is a post made to the SBS2K Yahoo List.


VMQ on Broadcom Gigabit NICs needs to be disabled at the NIC Port driver level. Not in the OS. Broadcom has not respected the spec for Gigabit NICs at all. I’m not so sure they have started to do so yet either. :S

In the BIOS:

  • ALL C-States: DISABLED
  • Power Profile: MAX
  • Intel Virtualization Features: ENABLED
  • Intel Virtualization for I/O: ENABLED

For the RAID setup we’d max out the available drive bays on the server. Go with smaller capacity drives and more spindles to achieve the required volume. This gains us more IOPS which are critical in smaller virtualization settings.

Go GHz over Cores. In our experience we are running mostly 2vCPU and 3vCPU VMs so ramming through the CPU pipeline quicker gets things done faster than having more threads in parallel at slower speeds.

Single RAM sticks per channel preferred with all being identical. Cost of 32GB DIMMs has come down. Check them out for your application. Intel’s CPUs are set up in three tiers. Purchase the RAM speed that matches the CPU tier. Don’t purchase faster RAM as that’s more expensive and thus money wasted.

Be aware of NUMA boundaries for the VMs. Each CPU may have one or more memory controllers, and each controller manages a chunk of RAM attached to that CPU. When a VM is set up with more vRAM than what is available on one memory controller that memory gets split up. That costs in performance.

Bottlenecks not necessarily in order:

  • Disk subsystem is vastly underperforming (in-guest latency and in-guest/host Disk Queue Length are key measures)
    • Latency: Triple digits = BAD
    • Disk Queue Length: > # Disks / 2 in RAID 6 = BAD (8 disks in RAID 6 then DQL of 4-5 is okay)
  • vCPUs assigned is greater than the number of physical cores – 1 on one CPU (CPU pipeline has to juggle those vCPU threads in parallel)
  • vRAM assigned spans NUMA nodes or takes up too much volume on one NUMA node
  • Broadcom Gigabit VMQ at the port level
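The rules of thumb in the list above can be sketched as a few simple checks. This is a minimal sketch, not a monitoring tool: the function names are ours, latency is assumed to be in milliseconds, and the `slack` parameter is an assumption added so that a Disk Queue Length of 4-5 on eight RAID 6 disks still passes, as the example in the list suggests.

```python
def disk_latency_bad(latency_ms: float) -> bool:
    """Triple-digit latency (100 ms or more) is flagged as bad."""
    return latency_ms >= 100


def disk_queue_bad(queue_length: float, disks_in_raid6: int, slack: float = 1.0) -> bool:
    """Flag a Disk Queue Length above (disks / 2) plus a little slack.

    The slack of 1 is an assumption made for this sketch.
    """
    return queue_length > disks_in_raid6 / 2 + slack


def vcpus_over_limit(vcpus: int, physical_cores_per_cpu: int) -> bool:
    """Flag a VM with more vCPUs than (physical cores - 1) on one CPU."""
    return vcpus > physical_cores_per_cpu - 1


if __name__ == "__main__":
    print(disk_latency_bad(120))    # triple digits
    print(disk_queue_bad(5, 8))     # 8 disks in RAID 6, DQL of 5
    print(vcpus_over_limit(8, 8))   # 8 vCPUs on an 8 core CPU
```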

The key in all of this though and it’s absolutely CRITICAL is this: Know your workloads!

All of the hardware and software performance knowledge in the world won’t help if we don’t know what our workloads are going to be doing.

An unhappy situation is spec’ing out a six to seven figure hyper-converged solution and having the client come back and say, “Take it away I’m fed up with the poor performance”. In this case the vendor over-promised and under-delivered.


Thursday, 12 May 2016

RDMA via RoCE 101 for Storage Spaces Direct (S2D)

We’ve decided to run with RoCE (RDMA over Converged Ethernet) for our Storage Spaces Direct (S2D) proof of concept (PoC).


  • (4) Intel Server Systems R2224WTTYS
    • Dual Intel Xeon Processors, 256GB ECC, Dual Mellanox ConnectX-3, and x540-T2
    • Storage is a mix of 10K SAS and Intel SATA SSDs to start
  • (2) Mellanox MSX1012 56Gbps Switches
  • (2) NETGEAR XS712T 10GbE Switches
  • (2) Cisco SG500x-48 Gigabit Switches
  • APC 1200mm 42U Enclosure
  • APC 6KV 220v UPS with extended runtime batteries

The following is a list of resources we’ve gathered together as of this writing:

This is, by far, not the most comprehensive of lists. The best place to start in our opinion is with Didier’s video and the PDF of the slides in that video. Then move on to the Mellanox resources.

We’ll update this blog post as we come across more materials and eventually get a process guide in place.


Tuesday, 26 April 2016

Remote Desktop Services 301: Some Advanced RDS Setup Guidelines

Here are some of the key tenets we’ve picked up deploying standalone and farm based Remote Desktop Services (RDS) for our Small Business Solution (SBS) on-premises and in our cloud.

Virtual Desktop Infrastructure, or VDI for short, covers Remote Desktop Services standalone, farm, and desktop operating system deployments.

This list, while not totally comprehensive, covers a lot of ground.

  • Hardware, Storage, and Networking
    • GHz and Cores are king
      • Balance core count and GHz versus cost
      • NOTE: Server 2016 licensing will be, and is as of this writing, based on core count!
        • Today is 2 sockets tomorrow is a server total of 16 cores
        • Additional Cores purchased in pairs
        • Example 1: Dual Socket 8 Core pair = 16 Cores total so OK in current and 2016 licensing
        • Example 2: Dual Socket 12 Core pair = 24 Cores total so base license of 16 Cores plus a purchase of 4 licenses (2 cores per license) would be required
        • NOTE: Examples may not line up with actual license terms! Please verify when 2016 goes live.
    • RAM is cheap so load up now versus later
    • Balanced RAM setup is better than Unbalanced
      • Balanced: 4x 16GB in primary memory channel slot
        • Best performance
      • Unbalanced: 4x 16GB in primary and 4x 8GB in secondary
        • Performance hit
    • ~500MB of RAM per user per session to start
    • 25-50 IOPS per user depending on workloads/workflows RDS/VDI
      • Average 2.5” 10K SAS is ~250 to 400 IOPS depending on stack format (stripe/block sizes)
    • Latency kills
      • Direct Attached SAS or Hyper-Converged is best
      • Lots of small reads/writes
    • Average 16bpp RDS session single monitor 23” or 24” wide: ~95KB/Second
      • Average dual monitor ~150KB/Second
      • Bandwidth use is reduced substantially with a newer OS serving and connecting remotely (RDP version)
  • LoBs (Line of Business applications)
    • QuickBooks, Chrome, Firefox, and Sage are huge performance hogs in RDSH
      • Be mindful of LoB requirements and provision wisely
    • Keep the LoB database/server backend somewhere else
    • In larger settings dedicate an RDSH and RemoteApp to the resource hog LoB(s)
  • User Profile Disks
    • Office 2013 and Exchange 2013 are a wash in this setting
      • Search is difficult if not broken
    • Search Index database will bloat and fill the system disk! (Blog post with “fix”)
    • Office 2016, though still a bit buggy as of this writing, and Exchange 2016 address UPDs and search
    • Be mindful of network fabrics between UPDs and RDSH(s)
    • Set the UPD initial size a lot larger than needed as it can’t be changed later without a series of manual steps
      • UPDs are dynamic
      • Keep storage limits in mind because of this
  • Printing
    • Printers with PCL 5/6 engines built-in are preferred
      • Host-based printers are a no-go for us
    • HP Professional series LaserJet printers are our go-to
    • HP MFPs are preferred over Copiers
      • Copier engines tend to be hacked into Windows while the HP MFP is built as a printer out of the box
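As a rough worked example of the RAM and IOPS rules of thumb in the list above, here is a minimal sizing sketch. The function name and defaults are ours: it assumes ~500MB of RAM per user, 25-50 IOPS per user, and the low-end figure of 250 IOPS per 10K SAS disk. It ignores RAID write penalties and host overhead, so treat the output as a starting point only.

```python
import math


def rds_sizing(users, ram_mb_per_user=500, iops_per_user=(25, 50), disk_iops=250):
    """Estimate RAM, IOPS range, and a minimum spindle count for a user count."""
    iops_low = users * iops_per_user[0]
    iops_high = users * iops_per_user[1]
    return {
        "ram_gb": users * ram_mb_per_user / 1024,
        "iops_low": iops_low,
        "iops_high": iops_high,
        # Spindles needed to cover the high end of the IOPS range
        "min_disks": math.ceil(iops_high / disk_iops),
    }


if __name__ == "__main__":
    for key, value in rds_sizing(40).items():
        print(key, value)
```

Plug in the real numbers from watching an in-place workload where possible. The defaults are only where we would start.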

Previous post on the matter: Some Remote Desktop Session Host Guidelines.

Here’s a snippet of Intel’s current Intel Xeon Processor E5-2600v4 line sorted by base frequency in GHz:


Depending on the deployment type we’re deploying either E5-2643v4 or E5-2667v4 processors for higher density setups at this time. We are keeping at or under eight cores per socket unless we absolutely require more due to the upcoming sockets to cores changes in Windows Server licensing.
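The core count arithmetic from the licensing examples earlier in this post can be sketched as follows. This is a sketch of those examples only, not of the actual license terms: it assumes a base license covering 16 cores per server and additional licenses covering 2 cores each, and the names are ours.

```python
import math

BASE_CORES = 16      # cores covered by the base license (per the examples)
CORES_PER_PACK = 2   # cores covered by each additional license


def additional_core_licenses(sockets: int, cores_per_socket: int) -> int:
    """Return how many 2-core licenses are needed beyond the 16 core base."""
    total_cores = sockets * cores_per_socket
    extra_cores = max(0, total_cores - BASE_CORES)
    return math.ceil(extra_cores / CORES_PER_PACK)


if __name__ == "__main__":
    print(additional_core_licenses(2, 8))    # Example 1: dual socket, 8 cores each
    print(additional_core_licenses(2, 12))   # Example 2: dual socket, 12 cores each
```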

If you’d like a copy of that spreadsheet ping us or visit the Intel Ark site.


Friday, 26 February 2016

Security: A Sample E-mail How-To Guide For End Users

With the plethora of e-mail borne Office documents with active macros in them that pull down malware/ransomware, we sent out the following e-mail to all of our clients for distribution internally.


Good day everyone,

It’s gotten to the point now where we are considering a universal restriction on incoming Office Documents. By that we mean plucking them right out of the e-mail via ExchangeDefender by default.

We have somehow travelled back to the 1990s where the bad guys are setting up Office documents with a Macro, an automatic script that runs when the document gets opened, that goes on to pull down their nefarious malware or ransomware.

Here are some steps to help protect us:

  1. Microsoft Office has a Save As PDF feature built-in. Please have all outside folks send a PDF instead of an Office document
    1. This is especially critical for Resumes. All job postings _must_ request PDF and note that Office documents would be deleted on the spot!
    2. If collaboration is required for Office documents use ShareFile
    3. Preferred over Dropbox since security is questionable with the Dropbox service
  2. Most Office documents that have Macros built-in have an “m” in the extension
    1. [screenshot]
    2. Save the Office document to Downloads and verify!
    3. If extensions are not shown then right click the file and left click on Properties
    4. [screenshot]
  3. Users _should_ be prompted:
    1. [screenshot]
  4. Obviously, the answer should be to NOT click that button
  5. If they do, there is one last cause for pause
    1. [screenshot]
  6. This is what happens if I try and click on something that is Macro driven _before_ clicking Enable Content
    1. [screenshot]
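For anyone who wants to automate the extension check in step 2, here is a minimal sketch. The function name is ours and the extension list covers the common macro-enabled Office formats. Keep in mind that older formats such as .doc and .xls can carry macros too, so a "False" here does not mean the file is safe.

```python
from pathlib import PurePath

# Common macro-enabled Office extensions (note the "m" in most of them)
MACRO_EXTENSIONS = {
    ".docm", ".dotm",                             # Word
    ".xlsm", ".xltm", ".xlam",                    # Excel
    ".pptm", ".potm", ".ppsm", ".ppam", ".sldm",  # PowerPoint
}


def is_macro_enabled(filename: str) -> bool:
    """Return True when the file's final extension is a macro-enabled one."""
    return PurePath(filename).suffix.lower() in MACRO_EXTENSIONS


if __name__ == "__main__":
    for name in ("Resume.docm", "Invoice.PDF", "Budget.xlsm", "Notes.docx"):
        print(name, is_macro_enabled(name))
```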

Along with the need to be mindful of any Microsoft Office attachments in our e-mail we should also remember the following:

  1. Never click on a link in an e-mail without at the least verifying its destination:
    1. [screenshot]
    2. Hover the mouse cursor over the link to verify
    3. As a rule: Never, ever, click on a link in an e-mail. Go to the web site after opening a new browser window (IE, Firefox, Chrome, Safari)
  2. It may _look_ like it came from someone you know but never trust that. Call and ask!
    1. There are a few exceptions to this rule thus make sure to hover your mouse over the link before clicking!
    2. Advanced users can check the headers
      1. [screenshot]
      2. [screenshot]
      3. Follow the flow from origin server to destination server
  3. Don’t save important sites’ information in the browser
    1. Banking IDs and passwords
    2. CRA and other critical sites’ IDs and passwords
    3. Do not disable the secondary question for any computer
      1. Banking sites use this feature to help protect the account as one example
      2. Answer the question, it only takes a couple seconds and could save your savings!
  4. Never call the 800 number that comes up in a Search for Support!
    1. Go to the manufacturer’s web site and click on the Support link to find the correct phone number
  5. Never believe a pop-up message that says your computer is infected with something!
    1. And never, EVER, call the 800 number on that pop-up!
    2. Don’t click anywhere. Save and close your work if needed, then reboot!
    3. Do NOT click anywhere in the pop-up window. Looks are deceiving as all areas of that pop-up = YES/ACCEPT/CONTINUE
  6. Never volunteer a credit card number or banking information to anyone
    1. Social Security/Social Insurance Numbers too!
    2. Folks can garner a lot about us online. Never volunteer any information when asked via any incoming call/e-mail/forum
    3. Always call them back!
  7. Caller says they are from the bank, CRA, or other seemingly critical business?
    1. Ask for their badge number, an 800 number to call, and an extension
    2. Open a browser and verify the 800 number belongs to the bank/CRA/CritBiz.
    3. Then call them back after hanging up if the number proves true!
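The link hovering check from the first item in the list above comes down to comparing where a link says it goes with where it actually goes. Here is a minimal sketch of that comparison. The function name and the example addresses are made up for illustration, and a matching host is no guarantee that a link is safe.

```python
from urllib.parse import urlparse


def hosts_match(shown: str, actual: str) -> bool:
    """Compare the host in the link text with the host in the real destination."""
    def host(url: str) -> str:
        # Link text often omits the scheme, so add one before parsing
        if "://" not in url:
            url = "http://" + url
        return (urlparse(url).hostname or "").lower()

    return host(shown) == host(actual)


if __name__ == "__main__":
    # Looks like the bank, but the real destination is a different host
    print(hosts_match("www.example.com", "http://example.com.evil.test/login"))
    print(hosts_match("https://www.example.com/login", "https://www.example.com/account"))
```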

While the above list is far from complete, by following these guidelines we can greatly reduce the chances of a malware or ransomware infection.

And, as always, e-mail or call if you are not sure about something!


Please feel free to use this as a template for training users!

Have a great weekend everyone. It’s +10C here and much like an awesome Spring day!


Thursday, 25 February 2016

Some Thoughts On Security Layering for SMB and SME

We are by no means masters of security for our SMB and SME clients. Since we have to wear many hats we sometimes need to bring folks in that can help us to fine tune the security layers in our clients’ networks.

Here are some of the Pearls (blog category) that we have garnered over the years. This was posted originally to the SBS2K Yahoo List and has been modified for this post.


Layering is important.

Some examples follow.

Windows Firewall

  • Windows Firewall is managed by Group Policy
    • All Profiles: ON
    • All Profiles: Block ON
    • All Profiles: Logging ON
    • All Profiles: Pop-Up for new services ON
    • DOMAIN Profile: Custom Inbound rule sets for required services beyond the default.
    • Private and Public Profile: INBOUND BLOCK ALL
      • If data sharing is required then a small and inexpensive NAS should be set up

Mail Sanitation and Continuity

ExchangeDefender (xD), for us, is one of the principal ways we keep bad stuff outside of the network.
Why allow it to hit the edge in the first place? Plus, since the WAN IP is not published via MX, it eliminates SMTP Auth attacks among others. Interested? Ping us and we’ll set you up.

Edge (Router)

A solid edge device, we use SonicWALL, with a BLOCK ALWAYS rule for ALL outbound traffic is a key element. Rule sets for outbound traffic are very specific and tailored to a client’s needs.

  • Examples:
    • DNS queries to non on-premises DNS servers are blocked. All DNS queries must go through the on-premises DCs.
    • On-Premises edge only or DCs can have the DNS Forwarders set to DNS filtering services.
    • SMTP traffic outbound only from the on-premises Exchange server. Or, local copiers/MFPs to ISP SMTP server IP only
    • Inbound is HTTPS via ANY
    • SMTP via xD subnets only.
    • RDP on ANY port should NEVER be published to the Internet.
      • RD Gateway with Network Level Authentication is a must today.
      • Any exceptions require a static IP on the source end to allow inbound rule filtering based on IP.
      • Look up TSGrinder if not sure why…
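The outbound side of the examples above is a default-deny policy with a short list of exceptions. Here is a minimal sketch of that logic. It is not SonicWALL configuration, the IP addresses are made up, and the rule model (source IP plus destination port) is a simplification we chose for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllowRule:
    source_ip: str
    dest_port: int
    description: str


# Hypothetical addresses for the on-premises DC and Exchange server
ALLOW_RULES = [
    AllowRule("10.0.0.10", 53, "DNS queries from the on-premises DC"),
    AllowRule("10.0.0.20", 25, "SMTP from the on-premises Exchange server"),
]


def outbound_allowed(source_ip: str, dest_port: int) -> bool:
    """Allow only traffic matching an explicit rule; everything else is blocked."""
    for rule in ALLOW_RULES:
        if rule.source_ip == source_ip and rule.dest_port == dest_port:
            return True
    return False  # BLOCK ALWAYS is the default


if __name__ == "__main__":
    print(outbound_allowed("10.0.0.10", 53))   # DC doing DNS
    print(outbound_allowed("10.0.0.55", 53))   # workstation trying outside DNS
```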

Ransomware Protection

Third Tier’s Ransomware Protection Kit is another layer of protection. Everything is contained in this kit to deploy a very tight layer of protection against today’s Ransomware.

Microsoft Office Group Policy Security

Office Group Policy structures with Macros disabled by default, non-local sources blocked, and other security settings for Office files provide another layer.

  • This one gives users grief because they need a few extra steps to get to the documents.
  • We’ve started requesting that clients have a PDF only policy on their Jobs listing pages and such.

A User Focused Effort

IMNSHO, A/V at the endpoint has become virtually useless today. Things seem to be a lot more targeted on the virus side with ransomware taking over as the big cash cow. We still install A/V on all endpoints. :)

What we are saying is that the principal portion of the risk of infection comes via the user.

A well trained user means the risk of infection drops substantially.

A user’s browsing habits and link clicking are the two key areas of training we focus on. Sites visited are another.

We suggest that clients adopt a company policy of allowing browsing for business related tasks only while connected to the company’s network resources. This policy can further reduce exposure.

Part of our training regimen is a somewhat regular e-mail from an outside account to users to test and challenge them:

  • Link hovering to discover the true destination
  • Attached Word doc with *BUZZ WRONG* when opened
  • Just because it SAYS it’s “FROM” someone we know doesn’t mean it is!

Backup Protection

Oh, and protect the backup loop (blog post on closing the backup loop)!

BTW, we just heard about another NAS based backup that was ransomware encrypted as a result of the destination folder being open to users.

Anyone, and I mean ANYONE, that has a backup structure, whether NAS or HDD based, that allows users and admins access outside of the backup software username and password setup needs to close that loop NOW. Not on the To Do List, not for tomorrow, not next week, but NOW.

Just in case: Close that Backup Loop Now.

Hyper-V Standalone Setups

One more point of order: In standalone Hyper-V settings leave the host in workgroup mode.

No one on the network should have the admin username and password to that host. No. One. It should be documented somewhere but not public knowledge.

Please feel free to add the layers you use to this post via comments.

Thanks for reading!


Tuesday, 16 February 2016

Cluster 101: Some Hyper-V and SOFS Cluster Basics

Our focus here at MPECS Inc. has grown into providing cluster-based solutions to clients near and far over the last eight years or so as well as cluster infrastructure solutions for small to medium I.T. shops.

There were so many misconceptions when we started the process to build out our first Hyper-V cluster in 2008.

The call in to us was for a large food manufacturing company that had a very specific requirement for their SQL, ColdFusion, and mail workloads to be available. The platform of choice was the Intel Modular Server with an attached Promise VTrak E310sD for extra storage.

So, off we went.

We procured all of the hardware through the Intel and Promise demo program. There was _no_ way we were going to purchase close to $100K of hardware on our own!

Back then, there was a dearth of documentation … though that hasn’t changed all that much! ;)

It took six months of trial and error plus working with Intel, Promise, LSI, and finally contacts at Microsoft to figure out the right recipe for standing up a Hyper-V cluster.

Once we had everything down we deployed the Intel Modular Server with three nodes and the Promise VTrak E310sD for extra storage.

Node Failure

One of the first discoveries: A cluster setup does not mean the workload stays up if the node it’s on spontaneously combusts!

What does that mean? It means that when a node suddenly goes offline because of a hardware failure the guest virtual machines get moved over to an available node in a powered off state.

To the guest OS it is as if someone hit the reset button on the front of a physical server. And, as anyone that has experienced a failed node knows the first prompt when logging in to the VM is the “What caused the spontaneous restart” prompt.

Shared Storage

Every node in a Hyper-V cluster needs identical access to the storage the VHD(x) files are going to reside on.

In the early days, there really was not a lot of information indicating exactly what this meant, especially since we decided right from day one to avoid any possible solution set based on iSCSI. Direct Attached Storage (DAS) via SAS was the way we were going to go. The bandwidth was vastly superior with virtually no latency. No other shared storage in a cluster setting could match the numbers. And, to this day the other options still can’t match DAS based SAS solutions.

It took some time to figure out, but in the end we needed a Shared Storage License (SharedLUNKey) for the Intel Modular Server setup and a storage shelf with the needed LUN Sharing and/or LUN Masking plus LUN Sharing depending on our needs.

We had our first Hyper-V cluster!

Storage Spaces

When Storage Spaces came along in 2012 RTM we decided to venture into Clustered Storage Spaces via 2 nodes and a shared JBOD. That process took about two to three months to figure out.

Our least expensive cluster option based on this setup (blog post) is deployed at a 15 seat accounting firm. The cost versus the highly available workloads benefit ratio is really attractive. :)

We have also ventured into providing backend storage via Scale-Out File Server clusters for Hyper-V cluster frontends. Fabric between the two starts with 10GbE and SMB Multichannel.


All Broadcom and vendor rebranded Broadcom NICs require VMQ disabled for each network port!

A best practice for setting up each node is to have a minimum of four ports available. Two for the management network and Live Migration network and two for the virtual switch team. Our preference is for a pair of Intel Server Adapter i350-T4s set up as follows:

  • Port 0: Management Team (both NICs)
  • Port 1 and 2: vSwitch (no host OS access both NICs)
  • Port 3: Live Migration networks (LM0 and LM1)

For higher end setups, we install at least one Intel Server Adapter X540-T2 to bind our Live Migration network to each port. In a two node clustered Storage Spaces setting the 10GbE ports are direct connected.

Enabling Jumbo Frames is mandatory for any network switch and NIC carrying storage I/O or Live Migration.


In our experience GHz is king over cores.

The maximum amount of memory per socket/NUMA node that can be afforded should be installed.

All components that can be should be run in pairs to eliminate as many single points of failure (SPFs) as is possible.

  • Two NICs for the networking setup
  • Two 10GbE NICs at the minimum for storage access (Hyper-V <—> SOFS),
  • Two SAS HBAs per SOFS node
  • Two power supplies per node

On the Scale-Out File Server cluster and Clustered Storage Spaces side of things one could scale up the number of JBODs to provide enclosure resilience thus protecting against a failed JBOD.

The new DataON DNS-2670 70-bay JBOD supports eight SAS ports per controller for a total of 16 SAS ports. This would allow us to scale out to eight SOFS nodes and eight JBODs using two pairs of LSI 9300-16e (PCIe 8x) or the higher performance LSI 9302-16e (PCIe 16x) SAS HBAs per node! Would we do it? Probably not. Three or four SOFS nodes would be more than enough to serve the eight direct attached JBODs. ;)

Know Your Workloads

And finally, _know your workloads_!

Never, ever, rely on a vendor for performance data on their LoB or database backend. Always make a point of watching, scanning, and testing an already in-place solution set for performance metrics or the lack thereof. And, once baselines have been established in testing the results remain private to us.

The three key ingredients in any standalone or cluster virtualization setting are:

  1. IOPS
  2. Threads
  3. Memory

A balance must be struck between those three relative to the budget involved. It is our job to make sure our solution meets the workload requirements that have been placed before us.


We’ve seen a progression in the technologies we are using to deploy highly available virtualization and storage solutions.

While the technology does indeed change over time the above guidelines have stuck with us since the beginning.


Thursday, 11 February 2016

Philip’s Ultra-Healthy and Quick Technician’s Breakfast!

We’re all super busy. Eating breakfast is an important part of the day, with a home grown meal being way better than most anything a fast food place can serve. It’s a lot less expensive in the long run plus the time savings is huge!

This breakfast meal assumes the person is in some sort of regular exercise routine which is also an important part of keeping ourselves healthy. Right? ;)

  • Breakfast Sloppy Toast
    • (3) Large Eggs
    • ~125ml to ~200ml of Half & Half Cream
    • (2) Whole Grain, 12 Grain, or other such solid bread
    • 1/8” slices of Cheddar, Marble, Havarti, Mozza, or other favourite cheese
    • A good chunk of baby spinach
    • A proper Pyrex microwave dish and cover
      • Plastic containers melt into the food :P
      • Ours is just larger WxL wise than the bread slices and tall enough to host the lot

With the above:

  1. Break the 3 eggs into the Pyrex dish
  2. Start whisking
  3. Add cream until well frothed
  4. Place first slice of bread in the mix
  5. Cover the bread with the cheese slices
  6. Drop spinach in and evenly distribute
  7. Place second slice of bread in
  8. Use a spat to flip the stack over
  9. Press in to allow mix to soak into the new slice
  10. Cover and microwave for 5:15 at 60%
    1. Let sit for about a minute after the cycle completes
  11. Microwave for about 1:45 to 2:45 at 60% depending on microwave power

Once it’s done let it cool off for a good five minutes.

Total time put in to the above: Less than 4 minutes.

Time savings over the week?

Assuming a minimum 15-20 minute wait at Timmys (Tim Hortons) that’s easily 15 minutes per day or more.

Cost savings?

Two breakfast egg sandwiches with cheese and bacon on an English muffin is $6.20. The savings can be quite substantial.

I usually have a couple of Vietnamese bird peppers to chow on while eating the above to accentuate the flavour. ;)

While this breakfast is not for everyone, it follows a 50/25/25 rule for protein/fat/carbs. Tie that in to a good regular cardio workout and we’re good to go!

Thanks for reading. :)


Tuesday, 9 February 2016

Hyper-V 101: What Windows Server Media Should I Use?

This may seem like a bit of a silly N00b style post but there’s a good reason for it.

How many of us are using Windows Server Media to install hosts via USB Flash then guests via ISO?

I venture to guess almost all of us.

Okay, POLL Time: What is the _date stamp_ on the Setup.EXE located on that flash/ISO?

As of today, if it’s a date earlier than November 22, 2014 then it’s _too old_ to be used in production systems:


Please log on to the Microsoft Volume Licensing Service Centre, MSDN, or TechNet to download a newer ISO.

Then update the flash drives used to install Hyper-V hosts and nodes.

It should be Standard Operating Procedure (SOP) to keep operating system load sources up to date.
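The date stamp check on Setup.EXE can be sketched as a quick script. This is a minimal sketch: the function name is ours, it reads the file's last-modified time, and the default cutoff is simply the November 22, 2014 date mentioned in this post, which will need to move forward as newer media is released.

```python
import os
from datetime import date, datetime

# Cutoff from this post; anything older is considered too old for production
CUTOFF = date(2014, 11, 22)


def media_too_old(setup_path: str, cutoff: date = CUTOFF) -> bool:
    """Return True when the file's last-modified date is earlier than the cutoff."""
    modified = datetime.fromtimestamp(os.path.getmtime(setup_path)).date()
    return modified < cutoff


if __name__ == "__main__":
    # Hypothetical path to the flash drive or mounted ISO
    path = r"E:\setup.exe"
    if os.path.exists(path):
        print("Too old!" if media_too_old(path) else "Date stamp is okay.")
```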


Monday, 8 February 2016

Remote Desktop Session Host: The System Partition is Getting Full?

In our on-premises and Cloud Desktop RDS environments we’ve discovered a number of different things that cause problems for users in Windows Server 2012 R2, Exchange 2013, and Office 2013.

All of our Remote Desktop Session Hosts (RDSHs) are set up with two VHDX files. One for the operating system and one for the data and User Profile Disks (UPDs).

Unfortunately, while UPDs give us great flexibility by allowing us to keep them on the network and thus avoid local profile pains in an RDS Farm setting, they have a number of negative impacts on user experience and RDSH health.

One that impacts both is the mysterious filling up of the system partition.

As it turns out, Outlook 2013 and Exchange 2013 plus UPDs means Outlook search is almost completely broken.

But, that doesn’t stop the Windows Server Search Service from doing its best to catalog everything anyway!

What does that mean?

Well, eventually we have a search database that can grow to epic proportions.

Since all of our OS partitions are rather small we end up with session hosts getting their system partition filled rather quickly on a busy RDSH. This is especially true in a Farm setting.

So, what are our options?

Well, we could disable the Windows Search Service. This would be a bad idea since users wouldn’t be able to find _anything_ anymore. We’d go from the occasional complaint to constant complaints. So, not good.

The alternative is to reset the Windows Search index.

  1. Start –> Indexing Options
  2. Advanced button (UAC)
  3. Click the Rebuild button

And, voila! In some cases we get 45GB to 60GB of space back in short order!
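Since the index will grow again after a rebuild, it helps to keep an eye on free space so we know when another reset is due. Here is a minimal sketch of a free space check. The function names and the 10GB threshold are our assumptions, not a recommendation.

```python
import shutil


def free_space_gb(path: str) -> float:
    """Return the free space, in GB, on the partition that holds the path."""
    return shutil.disk_usage(path).free / 1024 ** 3


def partition_low(path: str, min_free_gb: float = 10.0) -> bool:
    """Return True when free space has dropped below the threshold."""
    return free_space_gb(path) < min_free_gb


if __name__ == "__main__":
    # "." is used so the sketch runs anywhere; on an RDSH this would be the C: partition
    print(round(free_space_gb("."), 1), "GB free")
    if partition_low("."):
        print("Time to rebuild the search index?")
```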


Saturday, 6 February 2016

Hyper-V Virtualization 101: Some Basics Around Hyper-V Setups

Here are some pearls learned over our years working with Hyper-V:
  • Host/Node Setup
    • Make sure the host and all nodes in a cluster have the BIOS Settings identical (All settings)
    • Leave ~1GB to 1.5GB physical RAM to the host
    • We leave 1.5GB of space on a LUN dedicated to a VM
    • We leave ~50GB to ~100GB of free space on a shared LUN/Partition
    • We set the MiniDump option and lock the swap file down to 840MB
      • wmic RECOVEROS set DebugInfoType = 3
    • Always set up a standalone host with two partitions: OS and Data
  • Hyper-V
    • Hyper-V lays out a file equivalent in size to the vRAM assigned to VMs. We must have space for them.
    • Snapshots/CheckPoints create differencing disks. These _grow_ over time.
    • Deleting Snapshots/CheckPoints requires enough free space to create an entirely new Parent VHDX.
    • vRAM assigned to the VM should not traverse NUMA nodes (performance) (more on hardware).
    • vCPUs = Threads in CPU and must be processed in parallel thus # physical cores - 1 is best.
    • GHz is preferred over CPU Core counts for most workloads.
  • Storage
    • Be aware of the IOPS required to run _all_ workloads on the host/nodes.
    • More, smaller spindles are better than fewer, larger spindles = More IOPS.
    • 10GbE should be the minimum bandwidth considered for any iSCSI deployments.
    • At least _two_ 10GbE switches are mandatory for the storage path
  • Networking
    • Broadcom physical NIC ports must always have VMQ turned off (blog post)
    • We prefer to use Intel Gigabit and 10Gb Ethernet Server Adapters
    • We start with a minimum of 4 physical ports in our hosts/nodes
  • UPS Systems
    • UPS Systems should have at least 1-1.5 Hours of runtime.
    • Host/Nodes and storage should be tested for shutdown durations.
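A couple of the space and RAM pearls above lend themselves to quick arithmetic. Here is a minimal sketch with names of our own choosing. It assumes each VM is described by its VHDX size and assigned vRAM in GB, uses the low end of the ~50GB to ~100GB free space figure, and reserves 1.5GB of physical RAM for the host. It does not account for Snapshot/CheckPoint growth.

```python
def shared_storage_needed_gb(vms, free_reserve_gb=50):
    """Sum VHDX sizes, the vRAM-sized files Hyper-V lays out, and a free space reserve.

    vms is a list of (vhdx_gb, vram_gb) tuples.
    """
    vhdx_total = sum(vhdx for vhdx, _ in vms)
    memory_files = sum(vram for _, vram in vms)
    return vhdx_total + memory_files + free_reserve_gb


def vram_available_gb(host_ram_gb, host_reserve_gb=1.5):
    """Physical RAM left for VMs after leaving some for the host."""
    return host_ram_gb - host_reserve_gb


if __name__ == "__main__":
    vms = [(100, 8), (200, 16)]  # two hypothetical VMs
    print(shared_storage_needed_gb(vms), "GB of storage needed")
    print(vram_available_gb(64), "GB of RAM available for vRAM")
```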

There are quite a lot of these types of posts on our blog. Please click through the category tags to find more!

Thanks for reading.


Friday, 5 February 2016

Some Remote Desktop Session Host Guidelines

We’ve put about four years and two versions into our Small Business Solution (SBS). We have it running on-premises on standalone Hyper-V servers as well as on Hyper-V clusters (Clustered Storage Spaces and Hyper-V cluster we just deployed for a 15 seat accounting firm).

It is the foundation for the Cloud Office services we’ve been offering for the last year or so.

Since our Cloud Office solution runs in Remote Desktop Services we figured we’d share some pearls around delivering Remote Desktop Session Host based environments to clients:

  • ~512MB/User is cutting it tight
  • ~20 to 25 users in a 12GB to 16GB vRAM Hyper-V VM works okay with 2-3 vCPUs
  • RDP via 8.1 RDP clients saturates a 1Mb DSL uplink at ~13-15 users depending on workload
  • ALL browsers can bring the RDSHs to their knees
  • Printing can be a bear to manage (Use Universal Print Drivers and Isolation where possible)
  • Group Policy configuration and lockdown is mandatory
  • Two partitions with User Profile Disks (UPDs), if used, on the second partition
  • NOTE: UPDs + Office 2013 and earlier + Exchange 2013 and earlier = Broken Search!!!
  • NOTE: RDSH Search Indexes for Outlook OSTs in UPDs can fill up the C: partition!
    • Office 2016 and Exchange 2016 together are supposed to address the broken search situation in RDSH setups where UPDs are used. We have yet to begin testing the two together.

Our Cloud Office (SBS) is running on clusters we’ve designed based on Scale-Out File Server and Hyper-V.

Need a clustered solution for your SMB/SME clients? Drop us a line. They are _very_ affordable. ;)


Thursday, 4 February 2016

Protecting a Backup Repository from Malware and Ransomware

With the abundance of malware and ransomware it’s absolutely necessary that we take the time to examine our backup structures.

  1. Volume Shadow Copies
    • Obviously not a “backup”
    • Most ransomware today kills these
  2. Backup to Disk/NAS
    • Rotated or streamed off-site
  3. Cloud Backup
    • Streamed off-site
  4. Backup Tiers
    1. Current, off-site 1, off-site 2, 6 Month, 12 Month, ETC…

With our last mile issue up here we are very careful about anything Cloud since most upload speeds are not capable enough nor are the download speeds capable of a decent recovery time.

Now, what is _the most important_ aspect to our backup setup?


It must be a closed loop!

What does that mean?

That means that at no point in the backup structure can anyone have access to the backups via the network or console.

Now, since almost all of our backups are streamed across the wire it takes a bit of a process to make sure our loop is closed.

  • NAS
    • ShadowProtect user with unique pass phrase (SPUP) and MOD on the repository root folder
      • Other than the NAS Admin account no other user account is set up with access
      • Turn on the NAS Recycle Bin!
        • Most ransomware creates a new file then deletes the old one
        • Create a separate username and folder structure for user facing resources!
  • ShadowProtect
    • Network destination set up with SPUP
  • ShadowProtect Backups
    • Encrypted AES 256-bit with a long pass phrase
  • ImageManager
    • All managed backups are set up to be accessed via SPUP only
      • No repository, whether NAS or USB HDD is left with Users MOD
      • No repository is left without a restricted username and password protecting it!

Recently, we heard of a domain joined standalone Hyper-V server getting hit by ransomware. As a rule we don’t join a standalone Hyper-V to the guest domain. This is just one more reason for us not to do so.

And finally, some of the more obvious aspects around backups and domain operation in general:

  • Users are Standard Users on the domain
    • If they absolutely need local admin because they are still running QuickBooks 2009 then make that choice
    • Standard User accounts have _NO_ access to any aspect of the backup loop
      • None, Nada, Zippo, Zilch! ;)
    • Domain Admin accounts should have no access to any aspect of the backup loop
      • Many client sites have one or two users (hopefully not more?!?!?) that know these credentials
    • Access via UNC will pop up an authentication dialogue box.
      • Use the SPUP and _do not save_ the credentials!
  • Backups are managed by us, spot recovered by us, and quarterly bare metal/hypervisor restored by us
    • No client intervention other than perhaps the off-site rotation (we do this too)
  • If some user or users insist on running as DOMAIN ADMINs then REMOVE Admin’s MOD from USB HDD/NAS NTFS/File System
    • Leave only the SPUP with MOD
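One way to spot-check that the loop really is closed on an NTFS repository is to list every Allow entry that does not belong to an approved account. This is a minimal sketch only. The function name, the account names, and the path are hypothetical, and the allowed list is an assumption that should match whatever accounts are actually meant to have access (the SPUP, plus the NAS Admin account where applicable).

```powershell
# Hypothetical helper: returns every Allow rule whose identity is NOT on the
# approved list. An empty result means only approved accounts have access.
function Find-BackupAclViolation {
    param(
        [Parameter(Mandatory)][object[]]$AccessRules,
        [Parameter(Mandatory)][string[]]$AllowedIdentities
    )
    $AccessRules | Where-Object {
        $_.AccessControlType -eq 'Allow' -and
        $AllowedIdentities -notcontains $_.IdentityReference.ToString()
    }
}

# Example against a real folder on Windows (path and account are placeholders):
# $rules = (Get-Acl -Path 'E:\Backups').Access
# Find-BackupAclViolation -AccessRules $rules -AllowedIdentities 'BACKUPSRV\spup'
```

Anything the function returns, such as Users or Domain Admins with MOD, is an account that can still reach the backups and needs to come off the repository.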

So, what spawned this blog post?

Hearing of a ShadowProtect destination NAS getting wiped out by ransomware. This should not be possible on our managed networks ever!

What spawned our lockdown of the backup structures?

Many years back we had a user who neglected to rotate the tape libraries and a faulty BackupExec that reported everything as rosy. Then their server went full-stop and we had to recover (one aspect of the recovery in an SBS environment).

When we arrived, the person rotating the magazines turned sheet white when we asked for the off-site magazines. Oops. :(

We dropped BackupExec as their support failed to help us after three days of wrangling (Thursday afternoon until we cut the cord at 1730Hrs Saturday evening). Over four to five days we did end up recovering the full 650GB of data, short of 24 files belonging to one of the firm’s partners.

After that we went to all of our clients and proposed a managed backup strategy where we took care of all aspects of the backup. They all approved the changes after hearing what happened at the one firm. ;)

So, we tested and switched all of our clients to ShadowProtect 3.x and set up all backups so that no user could access them.

In our not so humble opinion, backups are not, and should never be, a user’s responsibility.

Thus, they should never have access to them even if they rotate them!

TIP: Need to do a side-by-side recovery or migration? ForensiT’s User Profile Wizard


Thursday, 28 January 2016

Cluster: A Simple Cluster Storage Setup Guide

In a cluster setting we have a set way to configure our shared storage whether it resides on a SOFS (Scale-Out File Server) cluster or some sort of network based storage.

First, the process to set up the storage itself:

  1. Configure the LUN
    • LUN ID must be identical for all Hyper-V nodes for SAN/NAS
  2. Connect all nodes to the storage
    • iSCSI Target for SAN/NAS
  3. Format NTFS and set OFFLINE on Node01
  4. On Node02 and up, ignore the Initialize prompt in Disk Management and set the disk OFFLINE
    • This step is optional depending on the setup

When it comes to the storage we configure the following LUNs for all of our cluster setups:

  1. 1.5GB LUN
    • Set up for the Witness Disk
    • Add to Cluster Storage but NOT CSV
  2. ???GB LUN
    • Sum of all physical RAM on the nodes plus 150GB
    • Add to Cluster Shared Volumes
    • All Hyper-V nodes set to deliver VM settings files to this location
    • Don’t forget that Hyper-V writes a file equivalent in size to the assigned memory for _all_ VMs running on the cluster or standalone host!
  3. Minimum 50% Storage LUN x2
    • Divide the remaining storage into two or more LUNs depending on workload and storage requirements
    • A minimum of 2 LUNs allows for storage load to be shared across the SAN’s two storage controllers, the two iSCSI networks, and the two or more Hyper-V nodes
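The sizing rules above work out to some simple arithmetic. Here is a rough sketch, assuming the usable storage figure is known in GB and the remainder is split evenly. The function names and parameters are our own placeholders for illustration, not part of any module.

```powershell
# Hypothetical helper: sizes the VM settings LUN as the sum of physical RAM
# across all nodes plus 150GB, per the rule of thumb above.
function Get-SettingsLunSizeGB {
    param(
        [Parameter(Mandatory)][int[]]$NodeRamGB
    )
    $totalRam = ($NodeRamGB | Measure-Object -Sum).Sum
    return [int]$totalRam + 150
}

# Hypothetical helper: takes what is left after the 1.5GB witness LUN and the
# settings LUN and splits it evenly into two or more data LUNs (whole GB).
function Get-DataLunSizeGB {
    param(
        [Parameter(Mandatory)][double]$UsableStorageGB,
        [Parameter(Mandatory)][int[]]$NodeRamGB,
        [int]$LunCount = 2
    )
    if ($LunCount -lt 2) { throw 'Use at least two data LUNs.' }
    $witnessGB  = 1.5
    $settingsGB = Get-SettingsLunSizeGB -NodeRamGB $NodeRamGB
    $remaining  = $UsableStorageGB - $witnessGB - $settingsGB
    if ($remaining -le 0) { throw 'Not enough storage left for data LUNs.' }
    return [math]::Floor($remaining / $LunCount)
}

# Two nodes with 256GB of RAM each on 10,000GB of usable storage:
Get-SettingsLunSizeGB -NodeRamGB 256, 256                      # 662
Get-DataLunSizeGB -UsableStorageGB 10000 -NodeRamGB 256, 256   # 4668
```

An even split is only a starting point. The actual division depends on workload and storage requirements as noted above.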

In a SOFS setting we set up a File Share Witness for our Hyper-V compute clusters and deliver the HA shares via SMB Multichannel and a minimum of 10GbE for the VHDX files.


The PowerShell steps for any of the above are here to avoid copy and paste issues.

Set Default Paths:

Set-VMHost -VirtualHardDiskPath "C:\ClusterStorage" -VirtualMachinePath "C:\ClusterStorage\Volume1"

We point the VHDX setting to the CSV root just in case. Our PowerShell scripts for setting up VMs put the VHDX files into the right storage location.

Set Quorum Up:

Set-ClusterQuorum -NodeAndDiskMajority "Cluster Virtual Disk (Witness Disk)"
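For the "Add to Cluster Storage" and "Add to Cluster Shared Volumes" steps in the LUN list, a sketch along these lines should work from one of the cluster nodes. The disk name 'Cluster Disk 2' is a placeholder and an assumption on our part. Check the names the cluster actually assigns before promoting anything to CSV, and leave the witness disk out.

```powershell
# Run on a cluster node. Requires the FailoverClusters module.
Import-Module FailoverClusters

# Add every disk the cluster can see but does not yet own to Cluster Storage
Get-ClusterAvailableDisk | Add-ClusterDisk

# List the disk resources so the assigned names can be confirmed
Get-ClusterResource | Where-Object { $_.ResourceType.Name -eq 'Physical Disk' }

# Promote a non-witness disk to Cluster Shared Volumes.
# 'Cluster Disk 2' is a placeholder name, not a value from this setup.
Add-ClusterSharedVolume -Name 'Cluster Disk 2'
```

Repeat the last command for each storage LUN. The witness disk stays in Cluster Storage only.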


Tuesday, 26 January 2016

Opinion: Yammer Needs To Go

With all due respect, Microsoft really needs to cut and run from Yammer.

We bill into the hundreds of dollars per hour.

Yammer is such a bleed on time that it is costing us substantially to try and get any kind of use out of it.

We’ve all but abandoned the platform except for necessary tasks.

We already wasted a couple of hours this week trying to get to our content downloads for our business needs.

We can’t imagine the impact to Microsoft employee productivity that Yammer has had. But, given our experience it has to be substantial.

Yammer needs to go. In our not so humble opinion it’s worse than a 1988 Yugo and its 3-6 month air cooled engine life.



Microsoft trying to fix it is no different than trying to fix a Yugo. In our opinion it’s throwing good money after bad.



Monday, 25 January 2016

Windows 7, 10, and Skylake?

There’s been a lot of furor and FUD around whether Windows 7 will be blocked on the upcoming Skylake platform.

Has anyone seen a revision to the x86 instruction set? We certainly haven’t. We’d most certainly have seen something somewhere about it.

There most certainly have been changes to the processor’s periphery, in the form of additional instruction sets for various tasks the CPU would perform in a given era. The Intel/AMD platforms are littered with their various efforts over the last two decades.

The only way we see something like this happening, where an OS won’t run on a certain platform, is by hard coding that into the OS. In the case of Windows 7, that would mean backporting a patch of some sort. Folks would figure out RPQ what patch caused the hard-code and that would be the end of that for many.

We realize Microsoft wants folks to move to Windows 10. There’s a huge revenue opportunity to be had there.

However, going about it surreptitiously with sneaky ads, involuntary downloads to users’ computers, unannounced delivery of the OS to ADDS based environments (coming to a domain near you), and any other clandestine methodology will do, and is doing, a lot of harm.

Pushing, pulling, and dragging folks who are kicking and screaming to be left where they are is not a good way to go about it.

