Monday 13 May 2019

New Blog Post: Azure Web App: Display Server Requests Metrics on Dashboard

New Blog Post: Azure Web App: Display Server Requests Metrics on Dashboard

We have a new blog site: MPECS Inc. Blog.
Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
Our Cloud Service
Twitter: @MPECSInc

New Blog Post: Azure Web Apps: Subdomain DNS A Record Error


New blog post: Azure Web Apps: Subdomain DNS A Record Error

We have a new blog site: http://blog.mpecsinc.com!


Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
Our Cloud Service
Twitter: @MPECSInc

Saturday 11 May 2019

New Blog Location: http://blog.mpecsinc.com

Good day all!

After close to twelve years, it's time to start anew.

Blogger made some changes that no longer allow image posting via Open Live Writer. The built-in Blogger editor is not a very happy place to be, as it's limited and requires an Internet connection.

So, we've set up a new location: http://blog.mpecsinc.com (SSL coming soon!).

The new blog feed is here: http://blog.mpecsinc.com/feed/

We will be posting all new content over at the new blog. And, as time permits, we'll copy the most popular posts over to keep things in one place as much as possible.

Thank you all for reading and for your support over these last twelve years! :)
 
Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
Our Cloud Service
Twitter: @MPECSInc

Thursday 2 May 2019

SharePoint Online: Setting up a WebDav File/Windows Explorer Favourite and/or Mapped Drive

We’re working on setting up a project collaboration site in Office 365 SharePoint Online.

One of the simplest things to do to streamline document access is to set up a Favourite/Quick Access link in users' File/Windows Explorer.

To do so:

  1. Open the SharePoint Site in Internet Explorer
  2. In Internet Options –> Trusted Sites set the slider to Low
  3. Add the site to the Trusted Sites list and Apply & OK
  4. Click on the Documents link
  5. Click the Return to classic SharePoint link bottom left of the browser
  6. Click the LIBRARY tab
  7. Click the Open with Explorer button under Connect & Export
    1. A credentials prompt may happen here
  8. Drag the folder in the Address Bar to Favourites/Quick Access
  9. Right click on the new shortcut and click on Properties
  10. On the General tab:
    1. Click and hold at the left of the Location: UNC path
    2. Drag the mouse to the right to highlight the entire path and release the mouse button
    3. Right click on the highlight and Copy
    4. The UNC path will look like: \\YOURDOMAINURI.sharepoint.com@SSL\DavWWWRoot\

That UNC path can be used to map a drive via Group Policy Preferences so that all users will have access via File/Windows Explorer.
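
As a sketch, the same mapping can be done ad hoc from PowerShell; the "contoso" tenant and site path below are placeholders for illustration, not an actual site:

# Map S: to the document library via WebDAV (the WebClient service must be running).
# "contoso" and "sites\Projects" are hypothetical; substitute your own tenant and site.
New-PSDrive -Name 'S' -PSProvider FileSystem -Persist -Scope Global `
    -Root '\\contoso.sharepoint.com@SSL\DavWWWRoot\sites\Projects\Shared Documents'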

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.s2d.rocks !
Our Web Site
Our Cloud Service

Tuesday 9 April 2019

IMPORTANT: CRITICAL Firmware Update for Intel SSD DC S4510 and S4610

The Intel SSD DC S4510 and S4610 series SATA SSDs have a critical flaw: after 1,700 hours of cumulative idle time, meaning powered on but not working, they brick.

The following is from the Release Notes for the update:

Intel® Solid State Drive DC S4510 and S4610 Series Revision History

Date: March 2019
Firmware: XC311102 (MR1), XCV10110 (MR1)

The following changes are included in this firmware update:

Resolved issue related to intermittent drive drop during initial boot.

Resolved 1.92TB and 3.84TB SKUs may become unresponsive at 1,700 hrs. of cumulative Idle Power On Hours.

Intel direct download can be found here: Intel® SSD Data Center Tool (Intel® SSD DCT)

For OEM-provided applications, check with the vendor's support site to find out if there is an update available.
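
As a rough sketch, with the Intel SSD DCT installed the check and update would look something like the following from an elevated prompt (the drive index is illustrative):

# List Intel SSDs along with their current firmware versions.
isdct show -intelssd
# Stage the updated firmware on the drive at index 0.
isdct load -intelssd 0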

This is a _critical data loss scenario_ and should be dealt with ASAP!

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.s2d.rocks !
Our Web Site
Our Cloud Service

Tuesday 29 January 2019

iPhone - Add a Virtual Home Button

While helping out with a new iPhone XR, the following was seen:

Tap on it and:

Figuring out just what that was turned out to be a bit of a challenge since it was put there by an Apple employee when the phone was purchased and set up.

Well, after asking around and getting an answer from the ever-knowing Merv Porter (he has way better search skills than I do), it turns out it's called the Virtual Home Button and it lives in the AssistiveTouch menu.

To turn it on is quite simple:
  1. Settings
  2. General
  3. Accessibility
  4. AssistiveTouch
Turn that on. The menu above can be customized using Customize Top Level Menu… and the button itself can have several different touch abilities.

The button can be moved and shifts about the screen depending on what's being done or read.
It fades to translucent when not in use and disappears when trying to get a screenshot of it. The above were taken with my own iPhone's camera.

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.s2d.rocks !
Our Web Site
Our Cloud Service

Tuesday 15 January 2019

Custom Intel X299 Workstation: Intel VROC RAID 1 NVMe WinSat Disk Score

We just finished a custom build for a client of ours in the US.

The machine is extremely fast but quiet.

image

After kicking the tires a bit with Windows 10 Pro 64-bit and some software installs post burn-in, we get the following performance out of the Intel NVMe RAID 1 pair:

C:\Temp>winsat disk
Windows System Assessment Tool
> Running: Feature Enumeration ''
> Run Time 00:00:00.00
> Running: Storage Assessment '-ran -read -n 0'
> Run Time 00:00:00.77
> Running: Storage Assessment '-seq -read -n 0'
> Run Time 00:00:02.38
> Running: Storage Assessment '-seq -write -drive C:'
> Run Time 00:00:01.64
> Running: Storage Assessment '-flush -drive C: -seq'
> Run Time 00:00:00.45
> Running: Storage Assessment '-flush -drive C: -ran'
> Run Time 00:00:00.38
> Dshow Video Encode Time                      0.00000 s
> Dshow Video Decode Time                      0.00000 s
> Media Foundation Decode Time                 0.00000 s
> Disk  Random 16.0 Read                       1020.73 MB/s          8.8
> Disk  Sequential 64.0 Read                   3203.52 MB/s          9.3
> Disk  Sequential 64.0 Write                  1456.24 MB/s          8.8
> Average Read Time with Sequential Writes     0.090 ms          8.8
> Latency: 95th Percentile                     0.146 ms          8.9
> Latency: Maximum                             0.316 ms          8.9
> Average Read Time with Random Writes         0.058 ms          8.9
> Total Run Time 00:00:05.91

The machine is destined for a surveying company that's getting into high-end image and video work with drones.

All in all, we are very happy with the build and we're sure they will be too!

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.s2d.rocks !
Our Web Site
Our Cloud Service

Friday 11 January 2019

Some Thoughts on the S2D Cache and the Upcoming Intel Optane DC Persistent Memory

Intel has a very thorough article that explains what happens when the workload data volume on a Storage Spaces Direct (S2D) Hyper-Converged Infrastructure (HCI) cluster starts to "spill over" to the capacity drives in an NVMe/SSD cache with HDD capacity setup.

Essentially, any workload data that needs to be shuffled over to the hard disk layer will suffer a performance hit and suffer it big time.

In a setup where we would have either NVMe PCIe Add-in Cards (AiCs) or U.2 2.5" drives for cache and SATA SSDs for capacity, the performance hit would not be as drastic, but it would still be felt depending on workload IOPS demands.

So, what do we do to make sure we don't shortchange ourselves on the cache?

We baseline our intended workloads using Performance Monitor (PerfMon).

Here is a previous post that has an outline of what we do along with links to quite a few other posts we've done on the topic: Hyper-V Virtualization 101: Hardware and Performance
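
As a minimal sketch of that kind of baseline, assuming the standard PhysicalDisk counters and an illustrative one-hour window:

# Sample disk IOPS and latency every 15 seconds for one hour.
$counters = '\PhysicalDisk(_Total)\Disk Transfers/sec',
            '\PhysicalDisk(_Total)\Avg. Disk sec/Read',
            '\PhysicalDisk(_Total)\Avg. Disk sec/Write'
Get-Counter -Counter $counters -SampleInterval 15 -MaxSamples 240 |
    Export-Counter -Path 'C:\Temp\DiskBaseline.blg' -FileFormat BLG

The resulting .blg file opens right up in PerfMon for eyeballing peaks and percentiles.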

We always try to have the right amount of cache in place for the workloads of today, but also for the workloads of tomorrow across the solution's lifetime.

S2D Cache Tip

TIP: When looking to set up an S2D cluster we suggest running with a higher count of smaller-capacity cache drives versus just two larger drives.

Why?

For one, we get a lot more bandwidth/performance out of three or four cache devices versus two.

Secondly, in a 24-drive 2U chassis if we start off with four cache devices and lose one we still maintain a decent ratio of cache to capacity (1:6 with four versus 1:8 with three).

Here are some starting points based on a 2U S2D node setup we would look at putting into production.

  • Example 1 - NVMe Cache and HDD Capacity
    • 4x 400GB NVMe PCIe AiC
    • 12x xTB HDD (some 2U platforms can do 16 3.5" drives)
  • Example 2 - SATA SSD Cache and Capacity
    • 4x 960GB Read/Write Endurance SATA SSD (Intel SSD D3-4610 as of this writing)
    • 20x 960GB Light Endurance SATA SSD (Intel SSD D3-4510 as of this writing)
  • Example 3 - Intel Optane AiC Cache and SATA SSD Capacity
    • 4x 375GB Intel Optane P4800X AiC
    • 24x 960GB Light Endurance SATA SSD (Intel SSD D3-4510 as of this writing)

One thing to keep in mind when it comes to a 2U server with 12 front-facing 3.5" drives along with four or more internally mounted 3.5" drives is heat and available PCIe slots. The additional drives can also constrain which processors are able to be installed due to thermal restrictions.

Intel Optane DC Persistent Memory

We are gearing up for a lab refresh when Intel releases the "R" code Intel Server Systems R2xxxWF series platforms hopefully sometime this year.

That's the platform Microsoft set an IOPS record with, set up with S2D and Intel Optane DC persistent memory:

We have yet to see any type of compatibility matrix for how/what/where Optane DC can be set up, but one should be coming soon!

It should be noted that they will probably be frightfully expensive, with the value seen in online transaction processing setups where every microsecond counts.

TIP: Excellent NVMe PCIe AiC for lab setups that are Power Loss Protected: Intel SSD 750 Series

image

Intel SSD 750 Series Power Loss Protection: YES

These SSDs can be found on most auction sites with some being new and most being used. Always ask for an Intel SSD Toolbox snip of the drive's wear indicators to make sure there is enough life left in the unit for the thrashing it would get in a S2D lab! :D

Acronym Refresher

Yeah, gotta love 'em! Being dyslexic has its challenges with them too. ;)

  • IOPS: Input/Output Operations per Second
  • AiC: Add-in Card
  • PCIe: Peripheral Component Interconnect Express
  • NVMe: Non-Volatile Memory Express
  • SSD: Solid-State Drive
  • HDD: Hard Disk Drive
  • SATA: Serial ATA
  • Intel DC: Data Centre (US: Center)

Thanks for reading!

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.s2d.rocks !
Our Web Site
Our Cloud Service

Thursday 10 January 2019

Server Storage: Never Use Solid-State Drives without Power Loss Protection (PLP)

Here's an article from a little while back with a very good explanation of why one should not use consumer grade SSDs anywhere near a server:

While the article points specifically to Storage Spaces Direct (S2D) it is also applicable to any server setup.
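
A quick way to see what the drives themselves report, sketched here with the in-box storage cmdlets on Windows Server 2016 and up:

# IsPowerProtected should read True on enterprise class drives with PLP.
Get-PhysicalDisk | Get-StorageAdvancedProperty |
    Select-Object FriendlyName, IsPowerProtected, IsDeviceCacheEnabled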

The impetus behind this post is pretty straightforward, via a forum we participate in:

  • IT Tech: I had a power loss on my S2D cluster and now one of my virtual disks is offline
  • IT Tech: That CSV hosted my lab VMs
  • Helper 1: Okay, run the following recovery steps that help ReFS get things back together
  • Us: What is the storage setup in the cluster nodes?
  • IT Tech: A mix of NVMe, SSD, and HDD
  • Us: Any consumer grade storage?
  • IT Tech: Yeah, the SSDs where the offline Cluster Storage Volume (CSV) is
  • Us: Mentions above article
  • IT Tech: That's not my problem
  • Helper 1: What were the results of the above?
  • IT Tech: It did not work :(
  • IT Tech: It's ReFS's fault! It's not ready for production!

The reality of the situation was that there was live data sitting in the volatile cache DRAM on those consumer grade SSDs that got lost when the power went out. :(

We're sure that most of us know what happens when even one bit gets flipped. Error Correction on memory is mandatory for servers for this very reason.

To lose an entire cache worth across multiple drives is pretty much certain death for whatever sat on top of them.

Time to break-out the backups and restore.

And, replace those consumer grade SSDs with Enterprise Class SSDs that have PLP!

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.s2d.rocks !
Our Web Site
Our Cloud Service

Monday 7 January 2019

Security: Direct Internet Connections for KVM over IP devices such as iLO Advanced, iDRAC Enterprise, Intel RMM, and others = BAD

While discussing firmware and updating firmware, this situation, experienced vicariously a number of years ago, came to light:

ISP: Excuse me sir, but we have a huge volume of SPAM coming out of [WAN RMM IP]
Admin: Huh?
ISP: We are seeing huge volumes of SMTP traffic outbound from [WAN RMM IP]
Admin: Oh?
* Checks non-existent documentation
Admin: Um, is that IP assigned to us?
ISP: Yes sir, along with [WAN SSL IP for internal services]
Admin: Hmmm …
[PAUSE]
Admin: Oh, wait, I think I know … [unplugs iLO/iDRAC/RMM from switch connected to ISP modem]
Admin: Is it better now?
ISP: Oh yeah, what did you do?
Admin: Oh, I fixed it.

One should never plug an RMM/iLO/iDRAC type device directly into the Internet, right?

We probably blogged about this in the past, but it definitely bears repeating as we still encounter situations where these devices are plugged directly into the Internet!

Happy New Year everyone! :)

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.s2d.rocks !
Our Web Site
Our Cloud Service

Saturday 22 December 2018

VS Code for PowerShell: IntelliSense Code Highlighting Not Working?

This is what VS Code looks like on _all_ machines used in the shop and elsewhere but one:

image

Note the great colour coding going on. That's IntelliSense working its magic to render the code in colour!

The errant machine looks like this:

image

Note the distinct lack of colour coding going on. :(

We removed and re-installed VS Code, the Extensions, and anything else we could with no change.

After an ask on a PowerShell list, a suggestion was made to check the VS Code theme.

Sure enough, on the problematic one:

image

VS CODE: Dark (Visual Studio) Colour Theme

While on a functioning setup:

image

VS CODE: Dark+ (default dark)

For whatever reason, the Dark (Visual Studio) colour theme seems to break things.
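
For reference, the working theme can also be set directly in the user's settings.json; the value below is the theme label as shown above:

{
    "workbench.colorTheme": "Dark+ (default dark)"
}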

Now that we have that figured out, we can move on to coding! :D

Merry Christmas and a Happy New Year to everyone!

Hat Tip: Shawn Melton MVP

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.s2d.rocks !
Our Web Site
Our Cloud Service

Wednesday 12 December 2018

Intel Technology Provider for 2019

We just received word of our renewal for the Intel Technology Provider program:

image

We've been system builders since the company began in 2003, and I was building systems for more than a decade before that!

One of the comments that gets made on a somewhat frequent basis is something along the lines of being a "Dinosaur". ;)

Or, this question gets asked quite a lot, "Why?"

There are many reasons for the "Why". Some that come off the top of our heads are:

  • We design solutions that meet very specific performance needs such as 150K IOPS, 500K IOPS, 1M IOPS and more
  • Our solutions get tested and thrashed before they ever get sold
    • We have a parts bin with at least five figures' worth of broken vendors' promises
  • We have a solid understanding of component and firmware interactions
  • Our systems come with guaranteed longevity and performance
    • How many folks can say that when "building" a solution in a Vendor's "Solution Tool"?
  • We avoid the finger pointing that can happen when things don't pass muster

The following is one of our lab builds: a two node Storage Spaces Direct (S2D) cluster utilizing 24 Intel SSD DC-4600 or D3-4610 SATA series SSDs flat, meaning no cache layer. The upper graphs are built in Grafana, while the bottom left is Performance Monitor watching the RoCE (RDMA over Converged Ethernet via Mellanox) and the bottom right is the VMFleet WatchCluster PowerShell output.

image

We just augmented the two node setup with 48 more Intel SSD D3-4610 SATA SSDs for the other two nodes and are waiting on a set of Intel SSD 750 series NVMe PCIe AiCs (Add-in-Card) to bring our 750 count up to 3 per node for NVMe cache.

Why the Intel SSD 750 Series? They have Power Loss Protection built-in. Storage Spaces Direct will not allow a cache device to hold any data in its local cache if that cache is volatile. What becomes readily apparent is that writing straight through to NAND is a very _slow_ process relative to having that cache power protected!

We're looking to hit 1M IOPS flat SSD and well over that when the NVMe cache setup gets introduced. There's a possibility that we'll be seeing some Intel Optane P4800X PCIe AiCs in the somewhat near future as well. We're geared-up for a 2M+ run there. :D
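
To give a sense of the single-target sanity checks we run before a full VMFleet sweep, here's a DiskSpd sketch; the flags and test file path are illustrative rather than our exact sweep parameters:

# 4K random, 30% write, 8 threads with 32 outstanding I/Os each, 60 seconds,
# 10GB test file, software/hardware caching off (-Sh), latency stats on (-L).
.\diskspd.exe -b4K -d60 -o32 -t8 -r -w30 -Sh -L -c10G C:\ClusterStorage\Volume01\test.dat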

Here's another test series we were running to saturate the node's CPUs and storage to see what kind of numbers we would get at the guest level:

image

Again, the graphs in the above shot are Grafana based.

The snip below is our little two node S2D cluster (E3-1270v6, 64GB ECC, Mellanox 10GbE RoCE, 2x Intel DC-4600 SATA SSD Cache, 6x 6TB HGST SATA) pushing 250K IOPS:

image

We're quite proud of our various accomplishments over the years with our high availability solutions running across North America and elsewhere in the world.

We've not once had a callback asking us to come pick up our gear and refund the payment because it did not meet the needs of the customer as promised.

Contrary to the "All in the Cloud" crowd, there is indeed a niche for those of us that provide highly available solution sets to on-premises clients. Those solutions allow them to have the uptime they need without the extra costs of running all-in on the cloud or hybrid with peak resources in the cloud. Plus, they know where their data is.

Thanks for reading!

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.commodityclusters.com
Our Web Site
Our Cloud Service

Tuesday 11 December 2018

OS Guide: Slipstream Updates Using DISM and OSCDImg *Updated

We have found out that we need to have the May Servicing Stack Update (SSU) KB4132216 _and_ the latest SSU, currently KB4465659, in the Updates_WinServ folder we drop the Cumulative Update into for the Windows Server 2016 slipstream run.

image

Note that the current version of the script points to Server 2019. Please use that as a base to tweak and create a set of folders for Windows Server 2016 and Windows 10 updates.
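
As a rough sketch of the core steps the script wraps (the paths and image index below are placeholders):

# Mount the install image (check the edition index first with /Get-WimInfo).
Dism /Mount-Wim /WimFile:C:\Slip\install.wim /Index:1 /MountDir:C:\Slip\Mount
# Inject everything in the updates folder; the SSUs need to apply before the CU.
Dism /Image:C:\Slip\Mount /Add-Package /PackagePath:C:\Slip\Updates_WinServ
# Commit the changes and unmount.
Dism /Unmount-Wim /MountDir:C:\Slip\Mount /Commit
# Rebuild a bootable ISO from the updated media folder.
Oscdimg -m -o -u2 -udfver102 -bootdata:2#p0,e,bC:\Media\boot\etfsboot.com#pEF,e,bC:\Media\efi\microsoft\boot\efisys.bin C:\Media C:\WS2016-Updated.iso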

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.s2d.rocks !
Our Web Site
Our Cloud Service

Thursday 6 December 2018

Error Fix: Trust Relationship is Broken

Here's a quick post on fixing a broken trust situation when the local administrator username and password are a known commodity.

On Windows 7:

  1. Windows Explorer
  2. Right click My Computer/This PC --> Properties
  3. Change settings for Computer Name
  4. Change button
  5. Domain currently reads: DOMAIN.LOCAL
    1. Change it to DOMAIN (delete .LOCAL)
    2. Enter domain admin credentials when prompted
  6. Reboot

That process will fix things for Windows 7. If PowerShell is up to date there, then for it and all other operating systems:

  1. Log on with local admin user
  2. Reset-ComputerMachinePassword -Credential DOMAIN\DomainAdmin
  3. Log off
  4. Log on with domain user account

That's it.
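
One more approach worth having in the back pocket (a sketch; it accomplishes much the same thing as step 2 above):

# Verifies the secure channel to the domain and repairs it in place if broken.
Test-ComputerSecureChannel -Repair -Credential DOMAIN\DomainAdmin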

If you know any other methods, especially for situations where the local admin username and password are unknown or all local admin accounts are disabled, feel free to comment or ping!

Thanks for reading. :)

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.commodityclusters.com
Our Web Site
Our Cloud Service

Monday 3 December 2018

A word of caution: Verify! Everything!

We get to work with a whole host of clients and their industries, but as a contractor we also get to work with a wide variety of IT Pros and IT companies.

Many times we get involved in a situation that's an outright pickle.

Something has gone sideways and the caller is looking for some guidance, some direction, and a bit of handholding because they are at a loss.

Vendor Blame Game

Some of those times the caller is in the middle of a tongue-wagging session between a set of vendors, each blaming the other for the pickle.

We were in that situation back in the day when a Symantec BackupExec (BUE) solution failed to restore. The client site had two HP DAT tape libraries with BUE firing "All Good" reports.

We found out those reports were bad when the client's main file server went blotto.

We were in between the storage vendor and their products, Symantec, and HP. It was not a pretty scene at all. In the end, it was determined _by us_ that BUE was the source of the problem because it did not do any kind of verify on the backups being written to tape despite the setting being there to do so.

We were fortunate that we had multiple redundant systems in place and managed to get most of the data back, except for weeks' worth of one partner's work. We had to build out a new domain though.

So, why the blog post?

Because, it's still happening today.

Verify, Verify, and Verify Again

We _highly suggest_ verifying that all backup products and backup services are doing what they say they are doing.

If the service provider charges for a test failover then do it anyway. Charge the fee back to the client, because once the process has run, successful or not, things are in a better place either way.

Never, _ever_ walk into a disaster recovery situation without having tested the products that are supposed to save the client's business. Period.

Yeah, there are times where something may happen before that planned failover. That's not what we're talking about here.

What we are after is testing to make sure that the vendor's claims are indeed true and that the solution set is indeed working as planned.

The last place we need to find out that our client's backups are _not_ working is when their servers, virtual machines, or cloud services vendors have gone blotto.

Out of the Cloud Backup

We always make sure we have a way to back up any cloud vendor's services to our client's site. It just makes sense.

Our trust is a very fickle thing.

When it comes to our clients' data we don't give our full trust to any vendor or solution set.

We _always_ test the backup and recovery processes so that we're not blindsided by things not going as planned or any "hidden fees" for accessing the client's data in a disaster recovery situation.

Philip Elder
Microsoft High Availability MVP
MPECS Inc.
Co-Author: SBS 2008 Blueprint Book
www.s2d.rocks !
Our Web Site
Our Cloud Service