We lose it almost continually throughout the day.
This particular box is our first Intel SR1630HGP Server System with an Intel Xeon Processor X3460 and 16GB of RAM.
This is what the %windir%\MiniDump directory looks like right now:
Our handy Crash Analyzer Wizard that comes with our Software Assurance and MDOP benefits tells us:
Click on that Details button and we see:
CLOCK_WATCHDOG_TIMEOUT (101)
An expected clock interrupt was not received on a secondary processor in an MP system within the allocated interval. This indicates that the specified processor is hung and not processing interrupts.
Arguments:
Arg1: 0000000000000019, Clock interrupt time out interval in nominal clock ticks.
Arg2: 0000000000000000, 0.
Arg3: fffff88001e5d180, The PRCB address of the hung processor.
Arg4: 0000000000000002, 0.
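For anyone comparing several of these dumps, the argument values are easier to read once converted out of hex. Here is a minimal Python sketch that parses the four Arg lines exactly as they appear above. The function name `parse_bugcheck_args` and the regular expression are our own illustration and not part of any Microsoft tool.

```python
import re

# Lines copied from the Crash Analyzer output above.
DUMP_TEXT = """\
Arg1: 0000000000000019, Clock interrupt time out interval in nominal clock ticks.
Arg2: 0000000000000000, 0.
Arg3: fffff88001e5d180, The PRCB address of the hung processor.
Arg4: 0000000000000002, 0.
"""

# Hypothetical pattern: "ArgN: <hex value>, <description>"
ARG_LINE = re.compile(r"^Arg(\d):\s*([0-9a-fA-F]+),\s*(.*)$")


def parse_bugcheck_args(text):
    """Return {arg_number: (integer_value, description)} for each ArgN line."""
    args = {}
    for line in text.splitlines():
        match = ARG_LINE.match(line.strip())
        if match:
            number, hex_value, description = match.groups()
            args[int(number)] = (int(hex_value, 16), description)
    return args


args = parse_bugcheck_args(DUMP_TEXT)
timeout_ticks = args[1][0]   # Arg1: timeout interval in nominal clock ticks
prcb_address = args[3][0]    # Arg3: PRCB address of the hung processor

print(f"Timeout interval: {timeout_ticks} clock ticks")
print(f"Hung processor PRCB: 0x{prcb_address:016x}")
```

Run against the text above, Arg1 works out to 25 nominal clock ticks.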
Now, it’s not like we have not seen this one before:
- Hyper-V On Nehalem CPUs Error – CLOCK_WATCHDOG_TIMEOUT Critical Hotfix
- Hyper-V CLOCK_WATCHDOG_TIMEOUT Error Within 24 Hours On Fresh H-V 2K8 R2
This particular server hit by the freezes was actually the one from the above posts. It has been sitting off on a shelf as we have not had the need for it up until this current project we are working on.
We did run the hotfix on the server a while back, but apparently that version was not good enough, since the freezes continued even after we brought all of the server board’s firmware and the RAID controller’s firmware up to date. We did this just before bringing it back online.
The above screenshot of the MiniDump directory does not do the situation justice as the server has frozen a lot more times than that . . . leading us to look at other options for the setup we need.
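To put a number on the dumps that did get written, a short script can tally them by day. This is only a rough sketch: it assumes the dumps carry a `.dmp` extension and that the file modification time reflects when the crash happened, and the helper name `count_dumps_by_day` is our own. Since the server froze more often than it wrote a dump, the totals should be read as a lower bound.

```python
import os
from collections import Counter
from datetime import date
from pathlib import Path


def count_dumps_by_day(dump_dir):
    """Count *.dmp files in dump_dir, grouped by last-modified date."""
    counts = Counter()
    for dump in Path(dump_dir).glob("*.dmp"):
        day = date.fromtimestamp(dump.stat().st_mtime)
        counts[day] += 1
    # Sort by date so the output reads chronologically.
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Assumes the default location: %windir%\MiniDump
    windir = os.environ.get("windir", r"C:\Windows")
    minidump_dir = Path(windir) / "MiniDump"
    if minidump_dir.is_dir():
        for day, total in count_dumps_by_day(minidump_dir).items():
            print(f"{day}: {total} dump(s)")
    else:
        print(f"No MiniDump directory found at {minidump_dir}")
```

Reading that directory may require an elevated prompt on the server.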
But then, while searching for further info on the problem, we found that the KB975530 hotfix has apparently received a newer version.
Here is a screenshot of the hotfix directory with our original update and the one that we just received:
There was not a lot of time between v3 and v4, given that this post is being written in late March.
It now remains to be seen if this fourth iteration of the patch will actually straighten out the timing problem happening with the Nehalem CPUs.
The server:
- Intel Server System SR1630HGP 1U
- Intel Xeon Processor X3460
- Intel RAID Controller RS2BL080
- Battery Backup Unit
- 16GB Kingston ECC DDR3 RAM
- 450GB 15K Seagate SAS drives in RAID 5
- DVDROM optical drive
***
This post has been sitting open for the last couple of hours while we were busy with other things. After consistently freezing earlier today, the server has been up and running with the two server VMs and five desktop VMs without a hiccup.
Hopefully that will be the situation from now on!
Philip Elder
MPECS Inc.
Microsoft Small Business Specialists
Co-Author: SBS 2008 Blueprint Book
*Our original iMac was stolen (previous blog post). We now have a new MacBook Pro courtesy of Vlad Mazek, owner of OWN.