
nonlen

macrumors member
Original poster
Mar 29, 2014
49
1
Hi, I have a problem with my upgraded Mac Pro 4,1/5,1: 2 x 2.66 GHz CPUs, 48 GB of RAM, a Samsung M.2 512 GB system drive, 5 other SSDs/HDs, a Radeon RX 580, and macOS Mojave.

The problem is this: when I do a "cold" boot, the Mac boots as usual, but after 5 to 10 minutes it freezes, and after a while it shuts down and starts again. Sometimes I have to force it to shut down by pressing the power button and then start it again. This happens only once, at most twice in a row, and after that I can use it for hours or even days without any issues. It seems to take longer to crash if I just boot the machine and do nothing serious with it; if I open a Logic project or a video, it freezes/crashes earlier.

If I shut it down after a longer period of use and immediately boot again, there are no issues. I have done a clean Mojave install a couple of times, but it didn't make any difference.



Any ideas what is causing this issue? Thank you for your help.



Attached panic report
 

Attachments

  • Panic report 300423.pdf
    42.5 KB · Views: 127
The first thing you need to check when you have a com.apple.driver.AppleIntelCPUPowerManagement NMI kernel panic is the CPU tray Northbridge heatsink. Always start your diagnosis there, then check the RTC battery voltage and look for any oxidation on the DIMMs.

The most probable cause is failed Northbridge push-pins, even more so because it's an early-2009 Mac Pro.
 
Thank you all. I forgot to say that this problem started more than 3 years ago, after I hadn't even turned the Mac on for 6-7 months. Since then I have used the machine only occasionally, but the problem remains the same.
 
I just removed the Samsung M.2 system drive and booted from a clean Mojave SSD. Same crash after a few minutes. I attached a few reports; I don't know if that helps.
 

Attachments

  • iStat report.pdf
    30.3 KB · Views: 114
  • Image.png
    Image.png
    125.9 KB · Views: 69
  • iStat menus crash.pdf
    149.7 KB · Views: 85

Did you inspect the CPU tray Northbridge heatsink push-pins? If you keep letting the CPU tray overheat, you will damage it permanently and will need a replacement.
 
Well, I'm afraid I have neither the skill nor the tools to do it. I have to find someone to fix that. I am just wondering: if it's a heating problem, why does it crash after a "cold start-up" and then work perfectly for hours? It has never crashed while in use unless some audio plugin caused it.
 

The Northbridge can go over 90ºC in minutes when you run something that taxes the CPU or RAM. If the machine is not under load, the air from the CPU heatsink can probably keep the Northbridge temperature low enough that it keeps "working".

Anyway, your Mac Pro definitely has a hardware problem: NMIs are non-maskable interrupts, and these are hardware-related crashes. It could be the Northbridge heatsink, a failing PSU, or oxidation.

The CPU tray should be the first thing to inspect, since it's a mere visual inspection that anyone can do, while for the PSU you need specialised equipment and expertise to diagnose - unless you replace it with a known working one.
 
Thank you Tsialex. I really appreciate your kindness. I have to find someone to fix my Mac. Thanks again.
 
iStat temperatures with a small Logic project running. Northbridge T-diode????
 

Attachments

  • Näyttökuva 2023-5-5 kello 19.12.41.png
    Näyttökuva 2023-5-5 kello 19.12.41.png
    583.6 KB · Views: 92

Shut down immediately; you are burning the Northbridge and permanently damaging this CPU tray. The expected maximum normal T-diode temperature is 75ºC; anything over 105ºC damages it.
 
This is the reference thread about repairing the Northbridge heatsink:


In my opinion, this is no doubt the best solution:

 
Before a possible repair, I'd like to know whether it is possible, or even likely, that the heatsink is broken after such temperatures (in my case 128 °C, if the sensor is correct). If it is broken, is it possible to replace it, and does that make sense considering the cost?
 
The heatsink itself should be fine. However, the rivet is broken, so there is no good contact between the heatsink and the NB chip. That is why the NB diode temperature is high and the heatsink temperature is low (the heat cannot transfer from the chip to the heatsink).

If the above is correct, then you only need to replace the rivets, not the heatsink. The cost is very minimal.

There are many ways to hold the heatsink in position. Even a very cheap zip tie can do the job (of course, this is not the recommended way to fix this issue, but it can work). Buying a proper rivet / screw replacement only costs a few dollars.

The heatsink is a piece of metal designed to transfer heat. 128°C is nothing to it. Besides, the heatsink itself never really reaches that temperature (if it did, it would be taking heat from the chip, which would mean it is working). The problem now is that the heatsink cannot touch the chip. From your screen capture, the heatsink temperature is just 51°C. There is no way for it to suffer any thermal damage at that temperature.

On the other hand, 128°C definitely exceeds the limit of the NB chip. Although the chip has internal protection to throttle itself, that provides little help if the heatsink is completely detached. This is why Tsialex asked you to shut down the cMP ASAP. In fact, 128 is most likely just the software display limit; the actual temperature is most likely higher than that.
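
To put the numbers from this thread side by side, here is a minimal Python sketch. It only assumes the two thresholds quoted above (75ºC as the expected normal maximum and 105ºC as the damage point); the function names are made up for illustration, and none of this reads real sensors.

```python
# Thresholds as quoted in this thread (not taken from a chipset datasheet).
NORMAL_MAX_C = 75.0   # expected maximum normal T-diode temperature
DAMAGE_C = 105.0      # readings above this are said to damage the Northbridge


def classify_tdiode(temp_c: float) -> str:
    """Return a rough status label for a Northbridge T-diode reading."""
    if temp_c > DAMAGE_C:
        return "critical: shut down now"
    if temp_c > NORMAL_MAX_C:
        return "too hot: inspect heatsink"
    return "normal"


def diode_heatsink_gap(tdiode_c: float, heatsink_c: float) -> float:
    """Chip temperature minus heatsink temperature, in degrees C.

    A large gap hints that heat is not reaching the heatsink (poor contact).
    """
    return tdiode_c - heatsink_c


if __name__ == "__main__":
    # Readings from the iStat screenshot discussed above.
    print(classify_tdiode(128.0))
    print(diode_heatsink_gap(128.0, 51.0))
```

With the readings from the screenshot (128°C on the diode, 51°C on the heatsink) the diode lands in the critical band, and the gap between chip and heatsink is 77 degrees, which fits the broken-rivet explanation.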
 
Before a possible repair, I'd like to know whether it is possible, or even likely, that the heatsink is broken after such temperatures (in my case 128 °C, if the sensor is correct).

The heatsink is probably fine; what is broken are the push-pins that fix it to the CPU tray PCB.

If it is broken, is it possible to replace it, and does that make sense considering the cost?

Since the CPU tray Northbridge cooked itself at such a high temperature, it's possible that the chipset die has warped. When this unfortunately happens, the CPU tray will continue to overheat even with new push-pins correctly installed on the heatsink assembly, and you need to replace the CPU tray since it's not economically viable to repair it.

The first step is to repair the heatsink assembly, apply new thermal paste and finally test the CPU tray. If the T-diode temperature drops to around 72 to 75ºC and there are no crashes, your CPU tray was probably saved.
 
Thank you tsialex. Those push-pins seem to be hard to find, at least here in Europe. Do you know of any place where I can find the correct pins?
 

Last year a guy on eBay-FR was selling a repair kit; maybe someone from Europe can help you more. The reference thread has posts from people in Europe who had to source the replacement pins.

Go to a good hardware store and try to find the screw and nylon-insert nut; it's the best and cheapest solution.
 
Thanks again. Will these do the job? (See attached picture)
 

Attachments

  • IMG_0028.png
    IMG_0028.png
    1.6 MB · Views: 73

They work, but the iFixit suggestion is not the best solution, since it requires that you drill the base of the CPU tray. I linked the best one from the start:

 
Ok, thanks again. Just to be sure...so this is all I need...

M3-.5mm x 14mm stainless steel button-head screw and an M3-.5mm stainless steel nylon-insert locking nut, re-using the original spring.
 

Yes. Did you find the original springs? Check that the springs are not missing.
 
I haven't checked that yet, but what makes you think that the springs are not there? Anyway, it will take 1-2 weeks until I have some time for this. I will report back, whatever the result is. Thanks again. 👍
 