Wednesday, 25 April 2018

Working with and around the Cloud Kool-Aid

The last year and a half has certainly had its challenges. I've been on a road of both discovery and recovery after an accident in November of 2016 (blog post).

Most certainly, one of the discoveries is that my tolerance for fluff, especially marketing fluff, has been greatly reduced. Time is precious, even more so when one's faculties can be limited by a head injury. :S

Microsoft's Cloud Message

It was during one of the last feedback sessions at MVP Summit 2018 that a startling realization came about: There's still anger, and to some degree bitterness, towards Microsoft and the cloud messaging of the last ten to twelve years. My session at SMBNation 2012 had some glimpses into that anger and struggle about our business and its direction.

After the MVP Summit 2018 session, when discussing it with a Microsoft employee that I greatly respect, his response to my apology for the glimpse into my anger and bitterness was, "You have nothing to apologize for". That affirmation brought a lot home.

One realization is this: The messaging from Microsoft, and others, around Cloud has not changed. Not. One. Bit.

That messaging started out oh so many years ago, when BPOS was launched, as, to paraphrase, "Your I.T. Pro business is going to die. Get over it": change business models or else.

The messaging was "successful" to some degree as the number of I.T. Pro consultants and small businesses that hung up their guns during that first four to six year period was substantial.

And yet, it wasn't all that successful, as much of the SMB-focused Microsoft Partner Network basically left Cloud sales off the table when dealing with their clients.

Today, the content of the message, and to some degree the method of delivering it, may be somewhat masked, but it is still the same: Cloud or die.

At this last MVP Summit yet another realization came when listening to a fellow MVP and some Blue Badges (Microsoft employees) discussing various things around Cloud and Windows. It had never occurred to me that the pain we were feeling out on the street would also be felt within Microsoft, and to some degree within other vendors adopting a Cloud service.

The recent internal shuffle in Microsoft really brought that home.

On-Premises, Hybrid, and/or Cloud

We have a lot of Open Value Agreements in place to license our clients' on-premises solution sets.

Quite a few of them came up for renewal this spring. Our supplier's Microsoft licensing contact, and the contractor (v-) that kept calling, were trying to push us into Cloud Solution Provider (CSP) for all of our clients' licensing.

Much of what was said in those calls was:

  • Clients get so much more in features
  • Clients get access anywhere
  • Clients are so much more agile.
  • Blah, blah, blah
  • Fluff, fluff, fluff

The Cloud Kool-Aid was being poured out by the decalitre. ;)

So, our response was, "Let's talk about our Small Business Solution (SBS)" and its great features and benefits, and how our clients get full features on-premises, via the Internet, or anywhere in between. And, oh, it's location and device agnostic. We can also run it on-premises or in someone else's Cloud.

That usually led to some sort of stunned silence on the other end of the phone.

It's as if the on-premises story has somehow been forgotten or folks have developed selective amnesia around it.

What's neat, though, is that our on-premises highly available solutions are selling really well, especially for folks that want cloud-like resilience for their own business.

That being said, there _is_ a place for Cloud.

As a rule, Cloud is a great way to extend on-premises resources for companies that experience severe business swings such as construction companies that have slowdowns due to winter. The on-premises solution set can run the business through the quieter months then things get scaled-up during summer in the Cloud. In this case the Cloud spend is equitable.

Business Principled Clarity

There are two very clear realities for today's I.T. Pro and SMB/SME I.T. Business:

  1. On-Premises is not going away
  2. Building a business around Cloud is possible but difficult

The on-premises story is not going to change. Repeat the Cloud message over and over and, to some degree, it becomes "truth"; that's an old adage. However, the realities on the ground remain ... despite the messaging.

Okay, there may be an exception in the smaller ten-or-fewer-seat business, where going all-in on Cloud may make sense (make sure to add all of those bills up, and be sitting down when doing so!).

That being said, our smallest High Availability client is 15 seats with a disaggregate converged cluster. That was before our Storage Spaces Direct Kepler-47 was finalized; that solution starts at a third of the cost.

For the on-premises story there are two primary principles operating here:

  1. The client wants to own it
  2. The client wants full control over their data and its access

Cloud vendors are not obligated to say anything, and in many cases can't, when law enforcement shows up to snoop or even, in some cases, to remove the vendor's physical server systems.

Many businesses are very conscious of this fact. Plus, many governments have a deep reach into other countries as the newly minted, as of this writing, EU privacy laws seem to be demonstrating.

Now, as far as building a business around another's Cloud offerings there are two ways that we see that happening with some success:

  1. Know a Cloud Vendor's products through and through
  2. Build a MSP (Managed Service Provider) business supporting endpoints

The first seems to be really big right now. There are a lot of I.T. companies out there selling cloud with no idea of how to put it all together. The companies that do know how are growing in leaps and bounds.

The MSP method is, and has been, a way to keep that monthly income going. But, don't count on it being there for too much longer as _all_ Cloud vendors are looking to kill the managed endpoint in some way.

Our Direction

So, where do we fit in all of this?

Well, our business strategy has been pretty straightforward:

  1. Keep developing and providing cloud-like services on-premises with cloud-like resilient solutions for our clients
  2. Hybrid our on-premises solutions with Cloud when the need is there
  3. Continue to help clients get the most out of their Cloud services
  4. Cultivate our partnerships with SMB/SME I.T. organizations needing HA Solutions

We have managed to re-work our business model over the last five to ten years and we've been quite successful at it. Though, it is still a work in progress and probably will remain so given the nature of our industry.

We're pretty sure we will remain successful at it as we continue to put a lot of thought and energy into building and keeping our clients and contractors happy.

Ultimately, that goal has not changed in all of the years we've been in business.

We small to medium I.T. shops have the edge over every other I.T. provider out there.

"How is that?", you might ask.

Well, we _know_ how to run a small to medium business and all of the good and bad that comes with it.

That translates into great products and services to our fellow SMB/SME business clients. It really is that easy.

The hard part is staying on top of all of the knowledge churn happening in our field today.


Finally, as far as the anger, and to some degree bitterness, goes: Time. It will take time before it is fully dealt with.

In the mean time ...

A friend of mine, Tim Barrett, did this comic many years ago.


The comic definitely puts an image to the Cloud messaging and its results. :)

Let's continue to build our dreams doing what we love to do.

Have a fantastic day and thanks for reading!

Philip Elder
Microsoft High Availability MVP
Co-Author: SBS 2008 Blueprint Book
Our Web Site
Our Cloud Service

Tuesday, 23 January 2018

Storage Spaces Direct (S2D): Sizing the East-West Fabric & Thoughts on All-Flash

Lately we've been seeing some discussion around the amount of time required to resync a S2D node's storage after it has come back from a reboot for whatever reason.

Unlike a RAID controller where we can tweak rebuild priorities, S2D does not offer the ability to do so.

It is very much a good thing that the knobs and dials are not exposed for this process.


Because, there is a lot more going on under the hood than just the resync process.

While it does not happen as often anymore, there were times when someone would reach out about a performance problem after a disk had failed. After a quick look through the setup, the Rebuild Priority setting turned out to be the culprit: someone had tweaked it from its usual 30% of cycles to 50%, 60%, or even higher, thinking that the rebuild should be the priority.
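While those rebuild knobs are hidden in S2D, the progress of a repair can still be watched. Here's a minimal sketch using the in-box Get-StorageJob cmdlet (Windows Server 2016 and up); the 30-second poll interval is just an arbitrary choice:

```powershell
# Watch S2D repair/resync jobs until they complete.
# Get-StorageJob is part of the in-box Storage module on Windows Server 2016+.
while (Get-StorageJob | Where-Object JobState -eq 'Running') {
    Get-StorageJob |
        Select-Object Name, JobState, PercentComplete,
                      @{N='GBProcessed'; E={[math]::Round($_.BytesProcessed / 1GB, 1)}},
                      @{N='GBTotal';     E={[math]::Round($_.BytesTotal / 1GB, 1)}} |
        Format-Table -AutoSize
    Start-Sleep -Seconds 30
}
```

Run it on any cluster node after a node returns from a reboot to see how far along the resync is.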

S2D Resync Bottlenecks

There are two key bottleneck areas in a S2D setup when it comes to resync performance:
  1. East-West Fabric
    • 10GbE with or without RDMA?
    • Anything faster than 10GbE?
  2. Storage Layout
    • Those 7200 RPM capacity drives can only handle ~110MB/Second to ~120MB/Second sustained

The two are not mutually exclusive; depending on the setup, they can play together to limit performance.
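To put the capacity-drive number in perspective, here's a back-of-the-envelope estimate; the dirty-data size and rebuild share are illustrative assumptions, not measurements:

```powershell
# Rough resync-time estimate -- all figures are assumptions for illustration.
$dirtyData    = 4TB      # extents to resync after a node comes back (assumed)
$hddRate      = 115MB    # sustained sequential rate of one 7200 RPM capacity drive
$rebuildShare = 0.3      # rough share of cycles a rebuild typically gets

$hoursFullRate = $dirtyData / $hddRate / 3600
$hoursShared   = $hoursFullRate / $rebuildShare

"{0:N1} hours at full rate, {1:N1} hours at a 30% rebuild share" -f $hoursFullRate, $hoursShared
```

Per-drive math understates the aggregate throughput across many spindles, but it shows why a large hybrid volume can take many hours to come back into sync.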

The physical CPU setup may also come into play but that's for another blog post. ;)

S2D East-West Fabric to Node Count

Let's start with the fabric setup that the nodes use to communicate with each other and pass storage traffic along.

This is a rule of thumb that was originally born out of a conversation at a MVP Summit a number of years back with a Microsoft fellow that was in on the S2D project at the beginning. We were discussing our own Proof-of-Concept that we had put together based on a Mellanox 10GbE and 40GbE RoCE (RDMA over Converged Ethernet) fabric. Essentially, at 4-nodes a 40GbE RDMA fabric was _way_ too much bandwidth.

Here's the rule of thumb we use for our baseline East-West Fabric setups. Note that we always use dual-port NICs/HBAs.
  • Kepler-47 2-Node
    • Hybrid SSD+HDD Storage Layout with 2-Way Mirror
    • 10GbE RDMA direct connect via Mellanox ConnectX-4 LX
    • This leaves us the option to add one or two SX1012X Mellanox 10GbE switches when adding more Kepler-47 nodes
  • 2-4 Node 2U 24 2.5" or 12/16 3.5" Drives with Intel Xeon Scalable Processors
    • 2-Way Mirror: 2-Node Hybrid SSD+HDD Storage Layout
    • 3-Way Mirror: 3-Node Hybrid SSD+HDD Storage Layout
    • Mirror-Accelerated Parity (MAP): 4 Nodes Hybrid SSD+HDD Storage Layout
    • 2x Mellanox SX1012X 10GbE Switches
      • 10GbE RDMA direct connect via Mellanox ConnectX-4 LX
  • 4-7 Node 2U 24 2.5" or 12/16 3.5" Drives with Intel Xeon Scalable Processors
    • 4-7 Nodes: 3-Way Mirror: 4+ Node Hybrid SSD+HDD Storage Layout
    • 4+ Nodes: Mirror-Accelerated Parity (MAP): 4 Nodes Hybrid SSD+HDD Storage Layout
    • 4+ Nodes: Mirror-Accelerated Parity (MAP): All-Flash NVMe cache + SSD
    • 2x Mellanox Spectrum Switches with break-out cables
      • 25GbE RDMA direct connect via Mellanox ConnectX-4/5
      • 50GbE RDMA direct connect via Mellanox ConnectX-4/5
  • 8+ Node 2U 24 2.5" or 12/16 3.5" Drives with Intel Xeon Scalable Processors
      • 3-Way Mirror: Hybrid SSD+HDD Storage Layout
      • Mirror-Accelerated Parity (MAP): Hybrid SSD+HDD Storage Layout
      • Mirror-Accelerated Parity (MAP): All-Flash NVMe cache + SSD
      • 2x Mellanox Spectrum Switches with break-out cables
        • 50GbE RDMA direct connect via Mellanox ConnectX-4/5
        • 100GbE RDMA direct connect via Mellanox ConnectX-4/5
    Other than the Kepler-47 setup, we always have at least a pair of Mellanox ConnectX-4 NICs in each node for East-West traffic. It's our preference to separate the storage traffic from the rest.

    All-Flash Setups

    There's a lot of talk in the industry about all-flash.

    It's supposed to solve the biggest bottleneck of them all: Storage!

    The catch is, bottlenecks are moving targets.

    Drop in an all-flash array of some sort and all of a sudden the storage to compute fabric becomes the target. Then, it's the NICs/HBAs on the storage _and_ compute nodes, and so-on.

    If you've ever changed a single coolant hose in an older high-mileage car, you'll see what I mean very quickly. ;)

    IMNSHO, at this point in time, unless there is a very specific business case for all-flash and the fabric in place allows for all that bandwidth with virtually zero latency, all-flash is a waste of money.

    One business case would be for a cloud services vendor that wants to provide a high IOPS and vCPU solution to their clients. So long as the fabric between storage and compute can fully utilize that storage and the market is there the revenues generated should more than make up for the huge costs involved.

    Using all-flash as a solution to a poorly written application or set of applications is questionable at best. But, sometimes, it is necessary as the software vendor has no plans to re-work their applications to run more efficiently on existing platforms.

    Caveat: The current PCIe bus just can't handle it. Period.

    A pair of 100Gb ports on one NIC/HBA can't be fully utilized due to the PCIe bus bandwidth limitation. Plus, we deploy with two NICs/HBAs for redundancy.

    Even with the addition of more PCIe Gen 3 lanes in the new Intel Xeon Scalable Processor Family we are still quite limited in the amount of data that can be moved about on the bus.

    S2D Thoughts and PoCs

    The Storage Spaces Direct (S2D) hyper-converged or SOFS only solution set can be configured and tuned for a very specific set of client needs. That's one of its beauties.

    Microsoft remains committed to S2D and its success. Microsoft Azure Stack is built on S2D so their commitment is pretty clear.

    So is ours!

    Proof-of-Concept (PoC) Lab
    S2D 4-Node for Hyper-Converged and SOFS Only
    Hyper-V 2-Node for Compute to S2D SOFS
    This is the newest addition to our S2D product PoC family:
    Kepler-47 S2D 2-Node Cluster

    The Kepler-47 picture is our first one. It's based on Dan Lovinger's concept we saw at Ignite Atlanta a few years ago. Components in this box were similar to Dan's setup.

    Our second generation Kepler-47 is on the way to being built now.
    Kepler-47 v2 PoC Ongoing Build & Testing

    This new generation will have an Intel Server Board DBS1200SPLR with an E3-1270v6, 64GB ECC, Intel JBOD HBA I/O Module, TPM v2, and Intel RMM. OS would be installed on a 32GB Transcend 2242 SATA SSD. Connectivity between the nodes will be Mellanox ConnectX-4 LX running at 10GbE with RDMA enabled.

    Storage in Kepler-47 v2 would be a combination of one Intel DC P4600 Series PCIe NVMe drive for cache, two Intel DC S4600 Series SATA SSDs for the performance tier, and six HGST 6TB 7K6000 SAS or SATA HDDs for capacity. The PCIe NVMe drive will be optional due to its cost.

    We already have one or two client/customer destinations for this small cluster setup.


    Storage Spaces Direct (S2D) rocks!

    We've invested _a lot_ of time and money in our Proof-of-Concepts (PoCs). We've done so because we believe the platform is the future for both on-premises and data centre based workloads.

    Thanks for reading! :)

    Philip Elder
    Microsoft High Availability MVP
    MPECS Inc.
    Co-Author: SBS 2008 Blueprint Book
    Our Web Site
    Our Cloud Service

    Monday, 18 December 2017

    Cluster: Troubleshooting an Issue Using Failover Cluster Manager Cluster Events

    When we run into issues the first thing we can do is poll the nodes via the Cluster Events log in Failover Cluster Manager (FCM).

    1. Open Failover Cluster Manager
    2. Click on Cluster Events in the left hand column
    3. Click on Query
    4. Make sure the nodes are ticked in the Nodes: section
    5. In the Event Logs section:
      • Microsoft-Windows-Cluster*
      • Microsoft-Windows-FailoverClustering*
      • Microsoft-Windows-Hyper-V*
      • Microsoft-Windows-Network*
      • Microsoft-Windows-SMB*
      • Microsoft-Windows-Storage*
      • Microsoft-Windows-TCPIP*
      • Leave all defaults checked
      • OPTION: Hardware Events
    6. Critical, Error, Warning
    7. Events On
      • From: Events On: 2017-12-17 @ 0800
      • To: Events On: 2017-12-18 @ 2000
    8. Click OK
    9. Click Save Query As...
    10. Save it
      • Copy the resultant .XML file for use on other clusters
      • Edit the node value section to change the node designations or add more
    11. Click on Save Events As... in FCM to save the current list of events for further digging

    Use the Open Query option to get to the query .XML and tweak the dates for the current date and time, add specific Event IDs that we are looking for, and then click OK.

    We have FCM and Hyper-V RSAT installed on our cluster's physical DC by default.
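For those who prefer the command line, roughly the same query can be run with PowerShell's Get-WinEvent. This is a hedged sketch: the log list mirrors the FCM query above, and the one-day window is an arbitrary example:

```powershell
# Pull Critical/Error/Warning events from the cluster-relevant logs on each node.
$logs = 'Microsoft-Windows-FailoverClustering*', 'Microsoft-Windows-Hyper-V*',
        'Microsoft-Windows-SMB*', 'Microsoft-Windows-Storage*', 'System'

foreach ($node in (Get-ClusterNode).Name) {
    Get-WinEvent -ComputerName $node -FilterHashtable @{
        LogName   = $logs
        Level     = 1, 2, 3                  # Critical, Error, Warning
        StartTime = (Get-Date).AddDays(-1)
    } -ErrorAction SilentlyContinue |
        Select-Object MachineName, TimeCreated, LogName, Id, Message
}
```

Pipe the results to Export-Csv for the same "save for further digging" step done in FCM.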

    Philip Elder
    Microsoft High Availability MVP
    MPECS Inc.
    Co-Author: SBS 2008 Blueprint Book
    Our Web Site
    Our Cloud Service

    Saturday, 9 December 2017

    PowerShell TotD: Hyper-V Live Move a specific VHDX file

    There are times when we need to move one of two VHDX files associated with a VM.

    The following is the PowerShell to do so:

    Poll Hyper-V Host/Node for VM HDD Paths

    Get-VM * | Select-Object *Path, @{N="HDD"; E={$_.HardDrives.Path}} | Format-List

    Move a Select VHDX

    Move-VMStorage -VMName VMName -VHDs @(@{"SourceFilePath" = "X:\Hyper-V\Virtual Hard Disks\VM-LALoB_D0-75GB.VHDX"; "DestinationFilePath" = "Y:\Hyper-V\Virtual Hard Disks\VM-LALoB_D0-75GB.VHDX"})

    Move-VMStorage Docs

    The Move-VMStorage Docs site has the full syntax for the PowerShell command.


    While the above process can be initiated in the GUI, PowerShell allows us to initiate a set of moves for multiple VMs. That saves a lot of time versus the mouse.
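For example, here is a hedged sketch of a bulk move: it walks every VM on the host and live-moves any VHDX found on an assumed source volume X: over to Y:. The drive letters and paths are placeholders, not from the original post:

```powershell
# Live-move every VHDX on volume X: to volume Y:, for all VMs on this host.
# X: and Y: are assumed placeholders -- adjust to the real source/destination.
Get-VM | ForEach-Object {
    foreach ($disk in $_.HardDrives) {
        if ($disk.Path -like 'X:\*') {
            $dest = $disk.Path -replace '^X:', 'Y:'
            Move-VMStorage -VMName $_.VMName -VHDs @(
                @{ 'SourceFilePath' = $disk.Path; 'DestinationFilePath' = $dest }
            )
        }
    }
}
```

The hashtable keys match the single-VHDX example above, so the loop is just that command applied across the host's inventory.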

    By the way, TotD means: Tip of the Day.

    Thanks for reading! :)

    Philip Elder
    Microsoft High Availability MVP
    MPECS Inc.
    Co-Author: SBS 2008 Blueprint Book
    Our Web Site
    Our Cloud Service

    Thursday, 9 November 2017

    Intel Server System R2224WFTZS Integration & Server Building Thoughts

    We have a brand new Intel Server System R2224WFTZS that is the foundation for a mid to high performance virtualization platform.


    Intel Server System R2224WFTZS 2U

    Below it sits one of our older lab servers, an Intel Server System SR2625URLX 2U. Note the difference in the drive caddies.

    That change is welcome as the caddy no longer requires a screwdriver to set the drive in place:


    Intel 2.5" Tool-less Drive Caddy

    What that means is the time required to get 24 drives installed in the caddies went from half an hour or more to five or ten minutes. That, in our opinion, is a great leap ahead!

    The processors for this setup are Intel Xeon Gold 6134s with 8 cores running at 3.2GHz with a peak of 3.7GHz. We chose the Gold 6134 as a starting place as most of the other CPUs have more than eight cores thus pushing up the cost of licensing Microsoft Windows Server Standard or Datacenter.


    Intel Xeon Gold 6134, Socket, Heatsink, and Canadian Loonie $1 Coin

    The new processors are huge!

    The jump in scale from the E3-1200 and E5-2600 series is dramatic. It reminds me of the Pentium Pro's girth next to the lesser desktop/server processors of the day.


    Intel Xeon Processor E3-1270 sits on the Intel Xeon Gold 6134

    The server is nearly complete.


    Intel Server System R2224WFTZS Build Complete

    Bill of Materials

    In this setup the server's Bill of Materials (BoM) is as follows:

    • (2) Intel Xeon Gold 6134
    • 384GB via 12x 32GB Crucial DDR4 LRDIMM
    • Intel Integrated RAID Module RMSP3CD080F with 7 Series Flash Cache Backup
    • Intel 12Gbps RAID Expander Module RES3TV360
    • (2) 150GB Intel DC S3520 M.2 SSDs for OS
    • (5) 1.9TB Intel DC S4600 SATA SSDs for high IOPS tier
    • (19) 1.8TB Seagate 10K SAS for low to mid IOPS tier
    • Second Power Supply, TPM v2, and RMM4 Module

    It's important to note that, when setting up a RAID controller instead of a Host Bus Adapter (HBA) that does JBOD only, we require the flash cache backup module. In this particular unit one needs to order the mounting bracket: AWTAUXBBUBKT

    I'm not sure why we missed that, but we've updated our build guides to reflect the need for it going forward.

    One other point of order: the rear 2.5" hot swap drive bay kit (A2UREARHSDK2) does not come installed from the factory in the R2224WFTZS as it did in the R2224WTTYS. I'm still not sold on M.2 for the host operating system as the modules are not hot-swap capable. That means, if one dies, we have to down a node in order to change it. With the rear hot swap bay we can do just that: swap out the 2.5" SATA SSD that's being used for the host OS.

    For the second set of two 10GbE ports we used an Intel X540-T2 PCIe add-in card as the I/O modules are not in the distribution channel as of this writing.

    NOTE: One requires a T30 Torx screwdriver for the heatsinks! After installing the processor, please make sure to start all four nuts prior to tightening. From there, snug each one up gradually, starting with the two middle nuts and then the outer nuts, similar to the process for installing a head on an engine block. This provides an even amount of pressure from the middle of the heatsink outwards.

    Firmware Notes

    Finally, make sure to update the firmware on all components before installing an operating system. There are some key fixes in the motherboard firmware updates as of this writing (BIOS 00.01.0009 ReadMe). Please make sure to read through to verify any caveats associated with the update process or the updates themselves.

    Next up on our build process will be to update all firmware in the system, install the host operating system and drivers, and finally run a burn-in process. From there, we'll run some tests to get a feel for the IOPS and throughput we can expect from the two RAID arrays.

    Why Build Servers?

    That's got to be the burning question on some minds. Why?

    The long and the short of it is because we've been doing so for so many years it's a hard habit to kick. ;)

    Actually, the reality is much more mundane. We continue to be actively involved in building out our own server solutions for a number of reasons:

    • We can fine tune our solutions to specific customer needs
      • Need more IOPS? We can do that
      • Need more throughput? We can do that
      • Need a blend of the two, as is the case here? We can do that too
    • Direct contact with firmware issues, interoperability, and stability
      • Making the various firmware bits play nice together can be a challenge
    • Driver issues, interoperability, and stability
      • Drivers can be quite finicky about what's in the box with them
    • Hardware interoperability
      • Our parts bin is chock full of parts that refused to work with one another
      • On the other hand our solution sets are known good configurations
    • Cost
      • Our server systems are a fraction of the cost of Tier 1
    • Overall system configuration
      • As Designed Stability out of the box
    • He said, she said
      • Since we test our systems extensively prior to deploying we know them well
      • Software Vendors that point the finger have no leg to stand on as we have plenty of charts and graphs
      • Performance issues are easier to pinpoint in software vendor's products
      • We remove the guesswork around an already configured Tier 1 box

    Business Case

    The business case is fairly simple: there are _a lot_ of folks out there who do not want to cloud their business. We help those customers with a highly available solution set and our business cloud, giving them all of the cloud goodness while keeping their data on-premises.

    We also help I.T. Professional shops who may not have the skill-set on board but have customers needing High Availability and a cloud-like experience deployed on-premises.

    For those customers that do want to cloud their business we have a solution set for the Small to Medium I.T. Shops that want to provide multi-tenant solutions in their own data centres. We provide the solution and backend support at a very reasonable cost while they spend their time selling their cloud.

    All in all, we've found ourselves a number of different great little niches for our highly available solutions (clusters) over the last few years.

    Thanks for reading! :)

    Philip Elder
    Microsoft High Availability MVP
    MPECS Inc.
    Co-Author: SBS 2008 Blueprint Book
    Our Web Site
    Our Cloud Service
    Twitter: @MPECSInc

    Friday, 3 November 2017

    A Little Plug for Mellanox and RoCE RDMA

    RoCE (RDMA over Converged Ethernet) via Mellanox NICs and switches is our primary fabric choice for Storage Spaces Direct (S2D) and Scale-Out File Server (SOFS) to Hyper-V compute cluster fabric.

    With the Mellanox MSX1012X 10GbE switch we can deploy a pair of them along with a pair of ConnectX-4 Lx dual port NICs per node for about the same cost as a pair of NETGEAR XS716T 10GbE switches and a pair of Intel X540/X550-T2 10GbE RJ45 based NICs per node.

    We have a great business relationship with Mellanox. They are great folks to work with and their product support is second to none.

    I was honoured to be asked to use a portion of my presentation for MVPDays to create the following video that is resident on Mellanox's YouTube channel.

    Hopefully the video comes out okay as embedding it was a bit of a chore.

    Thanks for reading and have a great weekend!

    Philip Elder
    Microsoft High Availability MVP
    MPECS Inc.
    Co-Author: SBS 2008 Blueprint Book
    Our Cloud Service
    Twitter: @MPECSInc

    Wednesday, 1 November 2017

    Error Fix: Event 7034 Service Control Manager - Server, BITS, Task Scheduler, Windows Management Instrumentation, Shell Hardware Detection Crashes

    This has just recently started to pop up on networks we manage.

    All of the following are Event ID 7034 Service Control Manager service terminated messages:

    • The Windows Update service terminated unexpectedly. It has done this 3 time(s).
    • The Windows Management Instrumentation service terminated unexpectedly. It has done this 3 time(s).
    • The Shell Hardware Detection service terminated unexpectedly. It has done this 3 time(s).
    • The Remote Desktop Configuration service terminated unexpectedly. It has done this 3 time(s).
    • The Task Scheduler service terminated unexpectedly. It has done this 3 time(s).
    • The User Profile Service service terminated unexpectedly. It has done this 3 time(s).
    • The Server service terminated unexpectedly. It has done this 3 time(s).
    • The IP Helper service terminated unexpectedly. It has done this 2 time(s).
    • The Device Setup Manager service terminated unexpectedly. It has done this 3 time(s).
    • The Certificate Propagation service terminated unexpectedly. It has done this 2 time(s).
    • The Background Intelligent Transfer Service service terminated unexpectedly. It has done this 3 time(s).
    • The System Event Notification Service service terminated unexpectedly. It has done this 2 time(s).

    It turns out that all of the above are tied into SVCHost.exe and guess what:

    Log Name: Application
    Source: Application Error
    Date: 10/23/2017 5:09:57 PM
    Event ID: 1000
    Task Category: (100)
    Level: Error
    Keywords: Classic
    Faulting application name: svchost.exe_DsmSvc, version: 6.3.9600.16384, time stamp: 0x5215dfe3
    Faulting module name: DeviceDriverRetrievalClient.dll, version: 6.3.9600.16384, time stamp: 0x5215ece7
    Exception code: 0xc0000005
    Fault offset: 0x00000000000044d2
    Faulting process id: 0x138
    Faulting application start time: 0x01d34c5c3f589fe7
    Faulting application path: C:\Windows\system32\svchost.exe
    Faulting module path: C:\Windows\System32\DeviceDriverRetrievalClient.dll
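Since all of the crashing services ride in svchost.exe, the two event streams can be pulled side by side with Get-WinEvent to confirm the common thread. A hedged sketch; the seven-day window is an arbitrary example:

```powershell
# Pull the Service Control Manager 7034 crashes and the matching
# Application Error 1000 events to confirm svchost.exe is the common thread.
$since = (Get-Date).AddDays(-7)

Get-WinEvent -FilterHashtable @{
    LogName = 'System'; ProviderName = 'Service Control Manager'
    Id = 7034; StartTime = $since }

Get-WinEvent -FilterHashtable @{
    LogName = 'Application'; ProviderName = 'Application Error'
    Id = 1000; StartTime = $since } |
    Where-Object Message -match 'svchost\.exe'
```

If the 1000 events all name DeviceDriverRetrievalClient.dll, the Group Policy driver-search settings below are the place to look.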

    A contractor of ours, for whom we deployed a greenfield AD and cluster, was the one who figured it out. WSUS and the Group Policy settings were deployed this last weekend, with everything in our Cloud Stack running smoothly until then.

    The weird thing is, we have had these settings in place for years now without any issues.

    The following are the settings changed at both sites:

    System/Device Installation
    Specify search order for device driver source locations: Not Configured
    2014-02-11: Enabled by Philip Elder.
    2017-11-01: Not Configured by Philip Elder.
    Specify the search server for device driver updates: Not Configured
    2014-02-11: Enabled by Philip Elder.
    2017-11-01: Not Configured by Philip Elder.

    System/Driver Installation
    Turn off Windows Update device driver search prompt: Not Configured
    2017-10-28: Disabled by Philip Elder.
    2017-11-1: Returned to Not Configured by Philip Elder

    System/Internet Communication Management/Internet Communication settings
    Turn off Windows Update device driver searching: Not Configured
    2014-02-11: Disabled by Philip Elder.
    2017-11-01: Not Configured by Philip Elder.

    It is important to note that when working with Group Policy settings a comment should be made in each setting if at all possible. Then, when it comes to troubleshooting an errant behaviour that turns out to be Group Policy related we are better able to figure out where the setting is and when it was set. In some cases, a short description of the "Why" the setting was made helps.

    Philip Elder
    Microsoft High Availability MVP
    MPECS Inc.
    Co-Author: SBS 2008 Blueprint Book
    Our Cloud Service
    Twitter: @MPECSInc

    Tuesday, 31 October 2017

    Xeon Scalable Processor Motherboard CPU-Soft Lockup Fix

    The new Intel Purley based Intel Server Boards S2600WF, S2600BP, and S2600ST Product Family use a new BMC (Baseboard Management Controller) video subsystem.

    As a result, some operating systems, mostly *NIX based, will choke on install as they may not have the driver built-in.

    Intel Technical Advisory: Intel® Server Board S2600WF, S2600BP and S2600ST Product Family fail to initialize the operating system video driver for the ASPEED* Base Management Controller (BMC).

    That document points to ASPEED's site for downloading an up-to-date driver that fixes the problem.

    Root Cause
    Full root cause of this issue has been determined. Intel has confirmed that the failure has no bearing on system performance, it only impacts local video graphics. In detail, when the operating system loads, the OS-embedded ASPEED* video driver is not able to access a portion of the BMC memory space, therefore the process stalls.

    On Windows Server based configurations we need to update the driver once the OS is installed. The default VGA driver that comes built-in to the OS works just fine.

    Philip Elder
    Microsoft High Availability MVP
    MPECS Inc.
    Co-Author: SBS 2008 Blueprint Book
    Our Cloud Service
    Twitter: @MPECSInc

    Thursday, 26 October 2017

    Fujitsu ScanSnap N1800: E-mail Button Greyed Out Fix

    We have moved a ScanSnap N1800 onto a new greenfield setup in a side-by-side migration we've been running.

    In this case, the Exchange server is on-premises with the appropriate Anonymous MFP Relay setup configured.

    Searching about turned up a simple fix, though not one we would prefer: enable a mailbox in Exchange for the scanner's account.

    Once we did that, the e-mail button did indeed appear and work, with subsequent scan-and-send tests being successful.

    Note that the account being used has a ridiculously long password that never changes and is restricted on the domain. So, the attack surface is relatively small.
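    For what it's worth, a long random password like that takes only a few lines of Python to generate. This is a minimal sketch using the standard library's `secrets` module; the function name, alphabet, and length here are arbitrary illustration choices, not what we actually use:

    ```python
    import secrets
    import string

    def service_account_password(length: int = 32) -> str:
        """Generate a long random password for a restricted service account."""
        # secrets draws from a cryptographically secure source, unlike random
        alphabet = string.ascii_letters + string.digits + "!#$%&*+-=?@_"
        return "".join(secrets.choice(alphabet) for _ in range(length))
    ```

    Since the scanner's account never logs on interactively and nobody types the password, length costs nothing; 32 random characters from an alphabet that size is well beyond brute-force reach.
    
    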


    Tuesday, 24 October 2017

    Some thoughts on Windows Server 1709 and Where to Find It

    We were looking for the new Software Assurance benefit based download of Windows Server 1709 in Microsoft's Volume Licensing Service Center.

    It took a bit to realize that the download was not tied into "Windows Server 2016" and its available downloads.

    The following two items show up in search for "Windows Server":



    When we click through and try to download either one, they both point to the same download.

    Keep in mind that 1709 is a Server Core-only option and that a new release ships every six months. Plus, the service life of each release is 18 months.

    That means that adopting the Semi-Annual Channel (SAC) release of Windows Server would require a significant investment in both testing prior to deployment and in deploying the OS on a regular basis.
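    The servicing math behind that cadence is simple; a quick Python sketch of the 18-month window (the function name is ours, and the month arithmetic assumes the day of month exists in the target month):

    ```python
    from datetime import date

    def sac_end_of_servicing(release: date, months: int = 18) -> date:
        """Add the 18-month SAC servicing window to a release date."""
        # Month arithmetic: carry whole years, then re-index the month
        carry, month_index = divmod(release.month - 1 + months, 12)
        return release.replace(year=release.year + carry, month=month_index + 1)

    # Windows Server 1709 became available in October 2017, so under the
    # 18-month policy its servicing window runs to the spring of 2019.
    print(sac_end_of_servicing(date(2017, 10, 17)))  # → 2019-04-17
    ```

    In other words, a shop on SAC is re-testing and re-deploying its server OS roughly every release just to stay inside the window.
    
    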

    Also keep in mind that Software Assurance is required for access to SAC.

    Is it of value? For those businesses looking to adopt newer and better features on a quicker cadence, yes, there is value in it.

    For those that are looking for long-term stability in their deployments, the Long-Term Servicing Branch/Channel (LTSB/C) is the way to go.

    For us, we are in a "Wait and see" mode as our focus is currently Storage Spaces and Hyper-V along with Storage Spaces Direct clusters.

    As far as SAC being a Server Core-only option goes, we don't have a problem with that now, do we? ;)

    Realistically though, there may be a lot of really neat features and abilities that only appear in the SAC branch of Windows Server as we go along. That has yet to be seen, but given Microsoft's push to add value to Software Assurance over the last number of years, one can comfortably wager that there will be extra value in that branch of the OS.


    Tuesday, 17 October 2017

    Microsoft Groove to Spotify Move

    When Microsoft announced the discontinuation of the Groove Music Pass there was a lot of disappointment around here.

    Groove offered the ability to check out all sorts of music over the years without having to fork out a buck or two on a song that ended up not being listened to.

    With the ability to download music to four different devices for offline listening it was a really good value for the money.

    So, off we go into the migration from Groove to Spotify.

    First off, this was one of the smoothest transitions ever experienced. Everything moved over without a hitch. It took a bit but nothing was lost in the process!

    Score one for Microsoft and Spotify!

    Score two for Spotify: The $15/Month Premium Family Plan for up to five folks under one roof was the clincher.

    We were looking to obtain two more Groove Music Pass accounts. One for my wife and the other for our daughter. That would have been expensive!

    A couple of settings in the Spotify app to take note of after upgrading to Premium:

    • Settings - Music Quality: Enable High quality streaming (Premium only)
    • Settings - Social: Disable Automatically make new playlists public
    • Settings - Social: Enable Private session

    The last two are personal preference but as a rule we will make sure our kid's apps are set up this way.


    Sunday, 17 September 2017

    Fix: Unable to Message Skype Contacts

    This was a weird one. Skype on my Windows Phone stated "Messaging Unavailable" for my Dad.

    The Skype apps would not present any form of his contact.

    While logged in to the OneDrive site I saw a Chat bubble. I clicked on it and typed in his name.

    Lo and behold, his contact came up BLOCKED in red.


    I didn't do that and nowhere else in the Skype ecosystem on any of my devices did that status come up.

    So, I unblocked him but still no joy.

    I had him log on to the OneDrive site, click the chat bubble beside the bell, and search for me. Sure enough, I came up BLOCKED.

    After he unblocked me, our mutual contacts lit up.

    Yo Skype, what the chicken?!?


    Saturday, 16 September 2017

    Storage Live Migration: Where's the Move Status?

    We're in the process of re-working a client's cluster setup (Dell MD3220 DAS + (2) Dell R520 Hyper-V Nodes) by adding drives and a new LUN.

    Note that until the new RAID array on the MD3220 has finished initializing the drives will remain Read-Only.

    A Storage Live Migration (SLM) can be initiated from within Failover Cluster Manager, Hyper-V Manager, or via PowerShell (Move-VMStorage on TechNet).

    To see how things are moving along we can check in Hyper-V Manager:

    The VHDX in question is over 500GB in size and it's taking a while to move!

    Note that the VM remains online all the while and also once the SLM completes.

    Once we've completed our storage re-organization we have some new workloads to configure.


    Wednesday, 6 September 2017

    Client E-mail Warning for the Current Malware Campaigns

    This went out this morning. The first place in any "security strategy" should be to train the human.


    I hope you had a great summer!

    With anti-spam services getting better and better, the malicious folks out there are getting a lot more subtle in their efforts. Plus, we're seeing an uptick of baddies in the Inbox.

    Things to note in the message below:

    1. The FROM domain does not match the domain in the link
    2. After hovering the mouse over the Here link, the URL listed contains a bunch of gibberish
    3. Watch for language, spelling, and grammar errors as there tend to be a lot of them
    4. Is the Subject and/or Sender legit? Call them first!
    5. Do NOT open any Word documents and especially do NOT click Enable Macros if prompted!
    6. Be cautious with any PDF attachments. If in doubt, call the sender or forward it here with a question.
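    The first check in that list, FROM domain versus link domain, is the easiest to automate. A minimal Python sketch using only the standard library; the function name and the regex for pulling `href`s out of an HTML body are our own, purely for illustration:

    ```python
    import re
    from urllib.parse import urlparse

    def mismatched_link_hosts(from_addr, html_body):
        """Return link hosts that do not belong to the sender's domain."""
        sender_domain = from_addr.rsplit("@", 1)[-1].lower()
        suspects = []
        for url in re.findall(r'href=["\']?(https?://[^"\' >]+)', html_body, re.I):
            host = (urlparse(url).hostname or "").lower()
            # Flag any link whose host is not the sender's domain or a subdomain of it
            if host != sender_domain and not host.endswith("." + sender_domain):
                suspects.append(host)
        return suspects
    ```

    For example, `mismatched_link_hosts("billing@contoso.com", '<a href="http://x7.badsite.example/pay">Here</a>')` flags `x7.badsite.example`, which is exactly the hover-over-the-Here-link check done by hand.
    
    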


    NOTE: We are seeing _a lot_ of compromised e-mail addresses and mailboxes as a result of users opening something or clicking on something they should not have.

    One attack vector is a macro-enabled Word document that harvests both e-mail addresses and contacts to send out _replies_ to a legitimate e-mail thread/conversation. If the Word document gets opened and a prompt comes up for enabling macros, the Word document is BAD. CLOSE Word and SHIFT+DELETE the e-mail!

    If in doubt, don’t open or click on it! Do _not_ hesitate to call or forward the questionable content!

    Thank you and have a wonderful day! :)


    Monday, 4 September 2017

    Enable 2FA (Two Factor Authentication) Everywhere It's Available!

    Yes, it's a bit of an extra inconvenience.

    But, that inconvenience may save the account and any data associated with it from being hijacked!

    As an example, after logging into my Microsoft ID and heading into the Security section I can check and see if there is anything out of the ordinary.


    And, lo and behold, what do I find? That I've attempted to log on from some interesting places!

    2FA is enabled on this Microsoft ID and all others. Amazon, Blogger, Microsoft, and any other service that offers 2FA has it enabled.

    There's absolutely no way in this day and age that it should not be used.
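    As an aside for the curious: the rotating six-digit codes that authenticator apps produce follow RFC 6238 (TOTP), which is just an HMAC over a shared secret and the current 30-second time step. A minimal sketch in Python, standard library only:

    ```python
    import hashlib
    import hmac
    import struct

    def totp(secret, timestamp, digits=6, step=30):
        """Compute an RFC 6238 time-based one-time password."""
        # The moving factor is the number of 30-second steps since the Unix epoch
        counter = struct.pack(">Q", timestamp // step)
        digest = hmac.new(secret, counter, hashlib.sha1).digest()
        # Dynamic truncation per RFC 4226: take 4 bytes at an offset from the digest
        offset = digest[-1] & 0x0F
        code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
        return str(code % 10 ** digits).zfill(digits)

    # RFC 6238 test vector: this secret at time 59 yields "94287082" (8 digits)
    print(totp(b"12345678901234567890", 59, digits=8))  # → 94287082
    ```

    That is why 2FA helps even when the password leaks: the attacker still needs the current code, and it expires every 30 seconds.
    
    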

    Thanks for reading. :)
