
TylerL

macrumors regular
Original poster
Jan 2, 2002
207
291
Drives me nuts that there's no cohesive list of Shutdown Causes...

I have an Xserve3,1 that is occasionally (anywhere from once a day to once a week) shutting down entirely.
No kernel panic...simply turning off. I can still power it back on via LOM, for whatever that's worth. Server Monitor shows all thermals and fan activity as healthy.
Console lists "Previous Shutdown Cause: -105".
I've also encountered this behavior while testing with other operating systems (Linux, 10.6, 10.11).
...and AXD (Apple Xserve Diagnostics) says all components pass with flying colors...
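For anyone who wants to see how often each code turns up in a saved log, here is a rough Python sketch. It assumes only the line format quoted above ("Previous Shutdown Cause: -105"). The function name and the sample log lines are made up for illustration, and it says nothing about what any code means.

```python
import re
from collections import Counter

# Matches the line format quoted in this thread, e.g.
# "Previous Shutdown Cause: -105". The code may be negative.
CAUSE_RE = re.compile(r"Previous Shutdown Cause:\s*(-?\d+)")


def tally_shutdown_causes(log_text):
    """Count how often each shutdown cause code appears in log_text.

    Hypothetical helper: it only counts codes, it does not interpret them.
    """
    return Counter(int(m.group(1)) for m in CAUSE_RE.finditer(log_text))


# Made-up sample lines for illustration, not real log output.
sample = (
    "kernel: Previous Shutdown Cause: -105\n"
    "kernel: some unrelated line\n"
    "kernel: Previous Shutdown Cause: 5\n"
    "kernel: Previous Shutdown Cause: -105\n"
)

counts = tally_shutdown_causes(sample)
for code, n in sorted(counts.items()):
    print(f"cause {code}: seen {n} time(s)")
```

Feeding it an exported Console log as one string would show whether -105 is the only code ever recorded or whether other codes are mixed in.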

I have seen no similar reports in web searches, and I'm hoping someone here can shed some light on what -105 points to...

Thanks!
 

TylerL

macrumors regular
Original poster
Jan 2, 2002
207
291
Update: I finally got it to fail while I was nearby, and saw the "CPUB OVERTEMP" LED lit before I manually restarted the server.
I did a burn-in test by pegging all CPU cores at 100% and watched the temperature sensors all day. All readings were within proper limits. Curiouser and curiouser...
 

DeltaMac

macrumors G5
Jul 30, 2003
13,757
4,583
Delaware
A few (mostly) random thoughts. Maybe one will be your answer.
If it's always that CPUB OVERTEMP LED, you should check that the CPU heatsink is installed properly. A 6- or 7-year-old system will likely work better if the heatsink thermal compound is refreshed.
You could also then try swapping the CPU A and CPU B positions.

You can find out how to do all that in the Xserve technician guide (repair manual).
If you don't already have that manual, you can search the net for the filename xserve_early2009.pdf
That manual showed up in the first search for me.
Very helpful if you want to try reworking the CPU heat sinks.
Be cautious about the tiny 2-pin connectors for the CPU temp sensors. Make sure those temp sensor connectors are snug, and not damaged in any way.
Maybe also physically check that all fans in the fan array are running.

None of that really explains why there is no issue when you try the CPU stress test, however.
I can't find any reference anywhere to shutdown cause -105, but I will throw this out to you: an intermittent component in the power supply.
Does your Xserve have dual power supplies, or a single power supply?
 

TylerL

macrumors regular
Original poster
Jan 2, 2002
207
291
Thanks for your reply.
It's always CPUB OVERTEMP, and I'm planning on swapping the two processors' positions today.
The funny thing is...the CPUs are never overly hot! Even immediately after a shutdown event, I can comfortably touch the copper bottom of the heatsink of CPUB. iStat Menus states CPUB never gets above 55°C.

I have two power supplies in this Xserve and have tried using each one individually during testing, with no change, though I haven't actually removed either one entirely.

I've also noticed that I can get the error to appear at will by initiating a restart. Rather than restarting as soon as the OS completes its housekeeping, the Xserve turns off with the OVERTEMP LED lit. I've reset the SMC and NVRAM for good measure. Very interesting problem...
 

DeltaMac

macrumors G5
Jul 30, 2003
13,757
4,583
Delaware
The reason that I asked about single or dual power supplies: if you have only one power supply, that might be contributing to (or even causing) the shutdowns. And that's the reason you would have dual supplies in the first place.
I would also swap the power supply positions.

If the heat sink is not installed properly, you could find that it is still relatively cool despite an overtemp, because the heat sink (or, more correctly, the thermal compound) is not conducting the heat away from the heat source. Keep in mind that the heat sink itself is NOT the source of the heat. Its purpose is to draw the heat away from the sensitive components. If it does not do that, then overheating is the result, despite what you might feel from the heat sink surface. (The thermal compound loses efficiency with age, for example.)
Or, the temp sensor is not reporting the temps correctly.
Swapping the CPU positions MIGHT tell the tale, if you still get CPUB overtemps and CPUA does not report overtemp.
Of course, you would re-do the thermal compound when you swap the CPU positions, so the issue MAY be gone as a result.
Let us know how it goes!
 

TylerL

macrumors regular
Original poster
Jan 2, 2002
207
291
Before tearing it all apart, I tried removing CPUB first, and couldn't get the server to fail through repeated restarts.

I was surprised that the heatsinks and thermal sensors and logic board were designed in a way that you can't simply swap the two CPUs, heatsinks intact.
Got halfway through detaching the CPUs from the heatsinks when I realized my thermal paste tube had dried up :) Oh well. It'll be an exercise for later in the week…
If it does come to this... if you've dealt with CPU replacements in the past, is it possible to use a regular "lidded" CPU instead? It seems the lid only adds a fraction of a millimeter to the CPU height, and lidded CPUs are much more available on eBay than Xserve/Mac Pro-specific delidded CPUs.
 

DeltaMac

macrumors G5
Jul 30, 2003
13,757
4,583
Delaware
I have never tried that. I have seldom worked on Xserves at all. I have replaced a couple of CPUs, but with Apple replacements, while the machines were still in warranty.
 