Thursday 5 March 2009

Intel SRCSASRB Firmware 1.20.72-0572 = Fatal Firmware Error (Correction Post)

This is a repost to correct my mistaken identification of the RAID controller’s model number! We had two identical boxes on the bench while this testing was going on. One had the SRCSASBB8i (Server Core box) and the other had the SRCSASRB (SBS 2008).

My apologies for making that mistake!

I discovered the mistake when we opened up the box to swap out the RAID controller for another SRCSASBB8i and saw that the installed card was actually an SRCSASRB. I am correcting that mistake with the Intel folks too, as we have called in to them to escalate the problem beyond the first tier.

We changed out the card with a known good card, updated that RAID controller’s firmware to 1.20.72-0572 and let it run overnight. Well, this morning we came in and one of the hot swap caddy LEDs was solid and the system was locked up.

The only other step for us now is to swap out the hard disks with new ones and see if the problem reappears.

The odd thing is that we do have this RAID controller with this firmware installed in a number of other server configurations and we have not been experiencing any lockups there. The problem may be particular to this one setup.

*Repost Starts Here*

We have just started to implement the SRCSASRB RAID controller with the new 1.20.72-0572 firmware in it. This particular card is in our bench SBS 2008 box.

We stress test any new RAID controller firmware for a couple of weeks prior to implementing the update in any client environment where it is applicable.

This is what we have been seeing with the newest firmware update on the SRCSASRB:


Intel SRCSASRB

[Fatal, 3] … Fatal firmware error: Line 156 in ../../raid/1078in.c
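For anyone who wants to scan an exported controller event log for this message, here is a minimal Python sketch. It assumes the log has been saved as plain text, one event per line, and the pattern is based only on the single message quoted above. The function name is made up for illustration and is not part of any Intel tool.

```python
import re

# Assumed pattern: built from the one message quoted in this post,
# not from any documented log format.
FATAL_FW = re.compile(
    r"Fatal firmware error: Line (?P<line>\d+) in (?P<source>\S+)"
)


def find_fatal_firmware_errors(log_lines):
    """Return a (line_number, source_file) tuple for each matching event."""
    hits = []
    for text in log_lines:
        match = FATAL_FW.search(text)
        if match:
            hits.append((int(match.group("line")), match.group("source")))
    return hits


# The sample is the event exactly as it appears above, elision included.
sample = ["[Fatal, 3] … Fatal firmware error: Line 156 in ../../raid/1078in.c"]
print(find_fatal_firmware_errors(sample))
```

Counting the hits per firmware revision over a stress-test run would be one way to compare revisions, though that part is left out here.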

The server will be completely locked up with either one or both of the RAID 1 array drive lights being solid green.

The 320GB Seagate drives have been tested and are in the clear.

The Intel SRCSASRB and SRCSASBB8i share the same engine, though the boards are completely different: the BB8i has a PCI-E 8x interface where the RB has a PCI-E 4x interface.

In our testing of what seems to be the same firmware on the SRCSASBB8i, we have not seen an error like this at all.

We will back the SRCSASRB off to the previous firmware revision 1.12.172_0470 and run things up again to see if we run into the same problem. If we do, then we may be dealing with a problematic RAID controller and not bad firmware.

UPDATE 2009-03-05: Please note that the RAID controller we are dealing with is not the SRCSASBB8i, it is the SRCSASRB. My bad for that. We had two identical boxes on the bench, one with each model of RAID controller and I got them mixed up!

My apologies for the mistaken identification.

Philip Elder
MPECS Inc.
Microsoft Small Business Specialists

*All Mac on SBS posts will not be written on a Mac until we replace our now missing iMac!


2 comments:

Manuel W. said...

Maybe you've already covered this, but why do you choose to hand-build servers? It seems like every month there are several posts about firmware problems and major low-level driver & BIOS issues. I hope that doesn't come across as glib or judgmental, I am genuinely interested in hearing your approach.

Philip Elder Cluster MVP said...

Manuel,

It makes business sense for us to have our servers built here.

For one, the RMA process with some hardware vendors is less than stellar with a lot of hoops to jump through.

With Intel, we get immediate technical support and 24 hour turnaround on parts that need to be RMAd. No hassles, no muss, no fuss.

Another side to it is the ability to get to know how the hardware operates and interacts, not only within the box itself but also with the OS riding on top of it. It helps us to know when we are dealing with a hardware issue or a software issue.

It also eliminates the possibility of being caught in the middle of a who-is-responsible fight between the hardware vendor and the software vendor for a problem. BTDT, and thanks but no thanks.

Those are a few off the top.

Thanks for the comment.

Philip