MPECS Inc. Blog: 2 Node SR1695GPRX + VTrak E610sD Hyper-V Cluster Is Live

Tuesday, 14 June 2011

We are still working out how best to configure the MPIO settings, but for the most part we are quite happy with the results of our configuration testing.

A Hardware Problem

It seems that one of the Adaptec cards has taken exception to the current setup. We have not yet determined whether the card has failed or needs a firmware reset. We need to determine _which_ card is causing the following on the VTrak:

47243 New Events

Last Event: 47272, Port 2 Ctrl 1, Info, Jun 14, 2011 08:44:20 – Host interface has logged out.

The corresponding log-in event follows the one above. Note that none of the other three connections is throwing this error. As a result, we are pretty sure the culprit is one of the Adaptec cards, since we connected like with like on the controllers.
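To see at a glance which port and controller the log-outs cluster on, the event list could be tallied with a few lines of Python. This is only a rough sketch: the pattern is inferred from the single event line quoted above, so a real VTrak export may well be laid out differently, and `EVENT_RE` and `count_logouts` are our own hypothetical names, not anything provided by the VTrak.

```python
import re
from collections import Counter

# Pattern inferred from the one event line quoted in the post (an assumption);
# a real VTrak event export may use a different layout.
EVENT_RE = re.compile(
    r"(?P<id>\d+),\s*Port\s+(?P<port>\d+)\s+Ctrl\s+(?P<ctrl>\d+),\s*"
    r"(?P<severity>\w+),\s*"
    r"(?P<when>\w{3}\s+\d{1,2},\s+\d{4}\s+\d{2}:\d{2}:\d{2})"
    r"\s*[–-]\s*(?P<message>.*)"
)


def count_logouts(lines):
    """Count 'logged out' events per (controller, port) pair."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match and "logged out" in match.group("message"):
            key = (int(match.group("ctrl")), int(match.group("port")))
            counts[key] += 1
    return counts


# The only sample used here is the event line shown in the post.
sample = [
    "Last Event: 47272, Port 2 Ctrl 1, Info, Jun 14, 2011 08:44:20 "
    "– Host interface has logged out.",
]

for (ctrl, port), total in count_logouts(sample).items():
    print(f"Ctrl {ctrl} Port {port}: {total} log-out event(s)")
```

Fed the full event list, a tally like this should show whether the log-outs are confined to a single port on Controller 1.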

Adaptec SAS:
- Server 1: Port x on Controller 1
- Server 2: Port x on Controller 1
Intel 1064e SAS:
- Server 1: Port x on Controller 2
- Server 2: Port x on Controller 2

We cabled things this way for exactly this reason. We will be working through the VTrak’s Web GUI as well as the console session connected via serial cable to see if we can figure out which port is which on Controller 1.
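The reasoning behind the like-with-like cabling can be sketched as a simple lookup. The controller number in an event narrows the fault to one HBA type, while the server stays unknown until the port is identified. The names `CABLING`, `PORT_TO_SERVER`, and `suspect` are hypothetical, and the port-to-server table is deliberately left empty because we have not worked it out yet.

```python
# HBA type cabled to each VTrak controller, as listed in the post.
CABLING = {1: "Adaptec SAS", 2: "Intel 1064e SAS"}

# Which server sits on which port is still unknown ("Port x" in the post),
# so this table is intentionally empty. Fill it in once the ports are traced.
PORT_TO_SERVER = {1: {}, 2: {}}


def suspect(ctrl, port):
    """Describe the likely culprit for an event on the given controller and port."""
    hba = CABLING.get(ctrl, "unknown HBA")
    server = PORT_TO_SERVER.get(ctrl, {}).get(port)
    return f"{hba} on {server or 'undetermined server'}"


# The event quoted in the post was on Port 2 of Controller 1.
print(suspect(1, 2))
```

Once the ports on Controller 1 are traced, adding them to `PORT_TO_SERVER` would turn the "undetermined server" into a specific node.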

We have a pair of LSI 3442E-R SAS controllers on the way. We will replace both of the Adaptec cards with the LSI cards, flatten everything, and stand up a cluster for the third time.

Live Migration Success

This is a happy sign:

We initiated a Live Migration of the Windows Server 2008 R2 Standard Remote Desktop Services VM from Node-99 to Node-90.

The Live Migration ran successfully as shown above. The RDS VM was Live Migrated to Node-90. We left things alone for a while then Live Migrated the VM back to Node-99 without issue.

Cluster Events

There are a few Errors and one Critical error in the logs. All of them stem from a problem with bringing additional LUNs online, formatting them, adding them as available storage in Failover Cluster Manager (FCM), and then making them Cluster Shared Volumes.

The problems stemmed from a deadlock condition in the Cluster Resource Control Manager, which we believe was caused by the problem path through the failed/flaky Adaptec card.

Conclusion

We are now confident enough to start proposing this Highly Available Failover Cluster to our clients.

Cost-wise, this HA Hyper-V cluster will be a very economical insurance policy against any single point of failure.

Philip Elder
MPECS Inc.
Microsoft Small Business Specialists
Co-Author: SBS 2008 Blueprint Book

*Our original iMac was stolen (previous blog post). We now have a new MacBook Pro courtesy of Vlad Mazek, owner of OWN.
