Tuesday 26 July 2016

Some Disaster Recovery Planning On-Premises, Hybrid, and Cloud Thoughts

This was a post to the SBS2K Yahoo list in response to a comment about the risks of encrypting all of our domain controllers (which we have been moving towards for a year or two now). It’s been tweaked for this blog post.


We’ve been moving to 100% encryption in all of our standalone and cluster settings.

Encrypting a setup does not change _anything_ as far as Disaster Recovery Plans go. Nothing. Period.

The “something can go wrong there” attitude should apply to everything, from on-premises storage and services (we’ve been working with a firm that lost Gigabytes/Terabytes of data due to the previous MSP’s failures) to Cloud resident data and services.

No stone should be left unturned when it comes to backing up data and Disaster Recovery Planning. None. Nada. Zippo. Zilch.

The new paradigm from Microsoft and others has migrated to “Hybrid” … for the moment. Do we have a backup of the cloud data and services? Is that backup air-gapped?

Google lost over 150K mailboxes a number of years back. We worked with one panicked caller who lost everything, with no way to get it back. What happens then?

Recently, a UK VPS provider had a serious crash and, as it turns out, lost _a lot_ of data. Where are their clients now? Where are their clients’ businesses after such a catastrophic loss?

Some on-premises versus cloud based backup experiences:

  • Veeam/ShadowProtect On-Premises: Air-gapped (no user access to avoid *Locker problems), encrypted, off-site rotated, and high performance recovery = Great.
  • Full recovery from the Cloud = Dismal.
  • Partial recovery of large files/numerous files/folders from the Cloud = Dismal.
  • Garbage In = Garbage Out = Cloud backup gets the botched bits in a *Locker event.
  • Cloud provider’s DC goes down = What then?
  • Cloud provider’s Services hit a wall and failover fails = What then? (This was a part of Google’s earlier mentioned problem, methinks.)
    • ***Remember, we’re talking Data Centers on a grand scale where failover testing has been done?!?***
  • At Scale:
    • Cloud/Mail/Services providers rely on a myriad of systems to provide resilience
      • Most Cloud providers rely on those systems to keep things going
    • Backups?
      • Static, air-gapped backups?
      • “Off-Site” backups?
        • These do not, IMO, exist at scale
  • The BIG question: Does the Cloud service provider have a built-in backup facility?
    • Back up the data to local drive or NAS either manually or via schedule
    • Offer a virtual machine backup off their cloud service
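Where a provider does let us pull an export down to a local drive or NAS, the copy is only worth something if it matches the source. Here is a minimal sketch of a copy-and-verify step in Python. The function names and the folder layout are made up for illustration, and it assumes the export has already been downloaded to `source_dir` and that `backup_dir` sits outside of it. It is not any vendor's tool, and the drive still has to be disconnected and rotated off-site afterwards to count as air-gapped.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source_dir: Path, backup_dir: Path) -> list[str]:
    """Copy every file under source_dir into backup_dir, then compare hashes.

    Returns the relative paths of any files whose copy does not match
    the source. An empty list means every copy verified.
    Assumption: backup_dir is not located inside source_dir.
    """
    mismatches = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        target = backup_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source_file, target)  # copies data and timestamps
        if sha256_of(source_file) != sha256_of(target):
            mismatches.append(relative.as_posix())
    return mismatches
```

Calling `copy_and_verify(Path("export"), Path("nas_backup"))` with two hypothetical folders returns an empty list when every file copied cleanly. Note that this only proves the copy matches the source: if the source already holds *Locker-encrypted files, the copy will faithfully hold them too, which is why the rotated, offline sets matter.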

There is an assumption (and we all know what that means, right?) that seems to be prevalent among top tier cloud providers: that their resiliency systems will be enough to protect them from that next big bang. But will they? We seem to already have examples of the “not”.

In conclusion to this rather long-winded post I can say this: It is up to us, our clients’ trusted advisors, to make bl**dy well sure our clients’ data and services are properly protected and that a down-to-earth backup exists of their cloud services/data.

We really don’t enjoy being on the other end of a phone call that goes, “OMG, my data’s gone, the service is offline, and I can’t get anywhere without it!” :(

Oh, and BTW, our SBS 2003/2008/2011 Standard/Premium sites all had 100% Uptime across YEARS of service. :P

We did have one exception in there due to an inability to cool the server closet as the A/C panel was full. Plus, the building’s HVAC had a bunch of open primary push ports (hot in winter, cold in summer) above the ceiling tiles, which is where the return air is supposed to flow. In the winter the server closet would hit +40C for long periods of time as the heat would settle into that area. ShadowProtect played a huge role in keeping this firm going, and technology changes over server refreshes helped too (cooler running processors and our move to SAS drives).


Some further thoughts and references in addition to the above forum post.

The moral of this story is quite simple. Make sure _all_ data is backed up and air-gapped. Period.

Philip Elder
Microsoft High Availability MVP
Co-Author: SBS 2008 Blueprint Book
Our Cloud Service
