Status
The first post of this thread is a WikiPost and can be edited by anyone with the appropriate permissions. Your edits will be public.

startergo

macrumors 603
Sep 20, 2018
5,020
2,282
Just so you know how easily the 970 Pro overheats: I just moved my 970 Pro, my main boot drive, to an adapter without a heatsink to do some tests, and 5 minutes after boot DriveDx warned me that sensor 2 of my 970 Pro was at 74ºC.

Just tested my 970 Pro running Mojave. First I did a CPU+GPU stress test for 15 minutes: no change in temperature at all (42C). Now I am doing a Blackmagic stress test on the drive for 25 minutes. The temperature rose from 42 to 57C on the Lycom DT120.
 

tsialex

Contributor
Jun 13, 2016
13,454
13,601
Just tested my 970 Pro running Mojave. First I did a CPU+GPU stress test for 15 minutes: no change in temperature at all (42C). Now I am doing a Blackmagic stress test on the drive for 25 minutes. The temperature rose from 42 to 57C on the Lycom DT120.
During testing, sensor 1 was at 51ºC and sensor 2 was at 74ºC.
 

startergo

macrumors 603
Sep 20, 2018
5,020
2,282

Attachments

  • Screen Shot 2019-02-10 at 6.20.46 PM.png
    168.1 KB · Views: 211

Reindeer_Games

macrumors 6502
Nov 29, 2018
286
228
Pueblo, CO
Your geographic location, the temperature of the room the machine sits in, and how aggressively you have your fans set all play into these temps, and then finally how hard you run the drives, which is where the heatsinks come into play. It also depends on whether it's summer or winter. If you can control the room temperature yourself (like with a home machine), cool, but if you can't (like in an office, or when it's summer), heatsinks can save you unnecessary headaches. The PX1 is just one of many; I've had good results with a cheap heatsink and a rubber band when necessary.

Additionally, your fans typically aren't spooling up when you use Recovery, which is where heatsinks really come into play.
 
  • Like
Reactions: dabotsonline

startergo

macrumors 603
Sep 20, 2018
5,020
2,282
Idle temperatures:
I guess my health is also 30% ;)
 

Attachments

  • Screen Shot 2019-02-10 at 6.51.14 PM.png
    140.8 KB · Views: 206

tsialex

Contributor
Jun 13, 2016
13,454
13,601
Does that mean that you are 30% dead or 30% alive? ;)

What is the lifetime write value?
Both screens. When my 970 Pro overheated, DriveDx bounced in the Dock warning about it until I shut my Mac down.

DriveDX - 970Pro.b.png
DriveDX - 970Pro.a.png

One thing I noticed: sensor 2 doesn't store the peak temperature (74ºC is not there), but rather a median over time since power-up, or over the drive's lifetime, or something like it.
 

startergo

macrumors 603
Sep 20, 2018
5,020
2,282
I wonder what that % on the health screen means. I bet that on a new 970 Pro it won't be at 100% either. What is the warranty on these, 3 or 5 years?
 

tsialex

Contributor
Jun 13, 2016
13,454
13,601
I wonder what that % on the health screen means. I bet that on a new 970 Pro it won't be at 100% either. What is the warranty on these, 3 or 5 years?

The 970 Pro has a 5-year warranty. My problem is how fast it went to 74ºC: I wasn't benchmarking or anything, this was just with Spotlight indexing. I'm not going to use an M.2 blade without a heatsink anymore.

I installed four SM951-AHCI 256GB drives, all NOS, last Friday. The temperature sensor rating starts at 100% and goes down as the drive heats up. See below:

DriveDX - SM951-AHCI 256GB.a.png
DriveDX - SM951-AHCI 256GB.c.png
 

startergo

macrumors 603
Sep 20, 2018
5,020
2,282
The 970 Pro has a 5-year warranty.

I installed four SM951-AHCI 256GB drives, all NOS, last Friday. The temperature sensor rating starts at 100% and goes down as the drive heats up. See below:

View attachment 821134 View attachment 821135

My problem is how fast it went to 74ºC: I wasn't benchmarking or anything, this was just with Spotlight indexing.
I wonder if Samsung Magician will show the same info... I mean the same temperature sensor degradation? If it degrades at the same pace, it will die in less than 2 months.
 
  • Like
Reactions: dabotsonline

tsialex

Contributor
Jun 13, 2016
13,454
13,601
I wonder if Samsung Magician will show the same info... I mean the same temperature sensor degradation? If it degrades at the same pace, it will die in less than 2 months.
I don't have Windows at this moment, so I can't check that.

Just one thing: it's summer here, and right now it's 35ºC. I think it was around 37 or 39ºC the day the 970 Pro got to 74ºC.
 
  • Like
Reactions: dabotsonline

startergo

macrumors 603
Sep 20, 2018
5,020
2,282
"Status vs Current Health Rating
Both Status and Health Rating are used to display the state of health indicator, but there is one key difference in their behavior in DriveDx.

Health Rating always reflects current state of health indicator (SMART attribute), but Status could also include forecasting to the future.

Some events could (with very high probability) indicate impending problems with drive, but it is impossible to convert them into the current “health rating”. One of the typical examples are “bad sectors” related health indicators (SMART attributes). For example, according to the statistics – after bad sectors first reallocation, drives are over 14 times more likely to fail within 60 days than drives without bad sector reallocation counts. In such cases DriveDx will set status as “FAILING”, but (at the same time) current health rating could be even 100%."

So it looks like the status indicator is more like a prediction of failure.
I don't have Windows at this moment, so I can't check that.

Just one thing: it's summer here, and right now it's 35ºC. I think it was around 37 or 39ºC the day the 970 Pro got to 74ºC.
After you told me where to look, I saw that my second sensor was also at 74C after a 30-minute Blackmagic test. I may check with a temperature gun.
 
  • Like
Reactions: dabotsonline

tsialex

Contributor
Jun 13, 2016
13,454
13,601
"Status vs Current Health Rating
Both Status and Health Rating are used to display the state of health indicator, but there is one key difference in their behavior in DriveDx.

Health Rating always reflects current state of health indicator (SMART attribute), but Status could also include forecasting to the future.

Some events could (with very high probability) indicate impending problems with drive, but it is impossible to convert them into the current “health rating”. One of the typical examples are “bad sectors” related health indicators (SMART attributes). For example, according to the statistics – after bad sectors first reallocation, drives are over 14 times more likely to fail within 60 days than drives without bad sector reallocation counts. In such cases DriveDx will set status as “FAILING”, but (at the same time) current health rating could be even 100%."

So it looks like the status indicator is more like a prediction of failure.
After you told me where to look, I saw that my second sensor was also at 74C after a 30-minute Blackmagic test. I may check with a temperature gun.
Temperature guns don't work well with non-reflective surfaces; the best way to do this is with a thermocouple and a logger. I have both, but can't do that right now.
 
  • Like
Reactions: dabotsonline

flowrider

macrumors 604
Nov 23, 2012
7,321
3,003
Hmm, never noticed the 2nd temp sensor. Only my 970 Pros have 2 sensors. The SM951 and my 840 SSDs all have only 1 sensor.

All my SSDs have cooling. My 840s sit in heatsinks. My SM951 is on a PX1 and my 970 Pros are on an I/O Crest.

Lou
 

startergo

macrumors 603
Sep 20, 2018
5,020
2,282
Temperature guns don't work well with non-reflective surfaces; the best way to do this is with a thermocouple and a logger. I have both, but can't do that right now.
upload_2019-2-10_17-44-2.png

upload_2019-2-10_17-45-18.png

Here is the Samsung Magician SMART information for both of my 970 Pros. The temps for the 2 sensors look pretty close, in contrast with DriveDx.
 

AidenShaw

macrumors P6
Feb 8, 2003
18,667
4,677
The Peninsula
View attachment 821150
View attachment 821151
Here is the Samsung Magician SMART information for both of my 970 Pros. The temps for the 2 sensors look pretty close, in contrast with DriveDx.
I see about the same on my ThinkPad.

t480s.jpg

Need to figure out the scaling and units, though. 311° seems a bit high for both Fahrenheit and Celsius. It could be Kelvin (311 K ≈ 38°C). (Laptop is basically idle - just a couple of SSH sessions to Linux systems, and my home office is 19°.)

Anyway, it seems like assigning a "health" figure to temperatures isn't well defined. If the current temperature is 70% of the range between min and max temperatures, do you get a 30%? For non-spinners, being close to the max temperature means that the drive may throttle to avoid exceeding the max - but longevity shouldn't be affected. The drive will throttle to protect itself.

On the other hand, if your drive has a "lifetime write guarantee" of 500TiB, and if your current writes total 250TiB - then a health score of 50% on "lifetime writes" is pretty obvious.
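
To make the two cases above concrete, here is a rough Python sketch. It is only a guess at how such scores could be computed, not how DriveDx or Samsung Magician actually do it. The function names and the 0-90C temperature range are made up for illustration.

```python
def write_health(written_tib: float, rated_tib: float) -> float:
    """Percent of the rated write endurance still remaining, clamped to 0..100."""
    used_fraction = written_tib / rated_tib
    return max(0.0, min(100.0, (1.0 - used_fraction) * 100.0))


def temperature_health(current_c: float, min_c: float, max_c: float) -> float:
    """Hypothetical linear rating: 100% at min_c, 0% at max_c, clamped to 0..100."""
    position = (current_c - min_c) / (max_c - min_c)
    return max(0.0, min(100.0, (1.0 - position) * 100.0))


# 250 TiB written against a 500 TiB rating.
print(write_health(250, 500))  # 50.0

# 63C is 70% of the way through an assumed 0-90C range, so roughly 30% "health".
print(temperature_health(63, 0, 90))
```

The write score has an obvious meaning because the rating is a fixed budget. The temperature score depends entirely on which min and max the tool picks, which is why the figure is hard to interpret.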
 
  • Like
Reactions: dabotsonline

startergo

macrumors 603
Sep 20, 2018
5,020
2,282
I see about the same on my ThinkPad.


Need to figure out the scaling and units, though. 311° seems a bit high for both Fahrenheit and Celsius. It could be Kelvin (311 K ≈ 38°C). (Laptop is basically idle - just a couple of SSH sessions to Linux systems, and my home office is 19°.)

Anyway, it seems like assigning a "health" figure to temperatures isn't well defined. If the current temperature is 70% of the range between min and max temperatures, do you get a 30%?

On the other hand, if your drive has a "lifetime write guarantee" of 500TiB, and if your current writes total 250TiB - then a health score of 50% on "lifetime writes" is pretty obvious.

Actually, pretty funny: 134 in hex is 308 decimal, which read as Kelvin is 34.85C, and 139 in hex is 313, or 39.85C.
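
A quick way to check that arithmetic, assuming the raw values really are hexadecimal whole Kelvin (that is my reading of the screenshots, not something the tools document). The function name is made up.

```python
def raw_hex_kelvin_to_celsius(raw_hex: str) -> float:
    """Interpret a raw hex SMART value as whole Kelvin and convert it to Celsius."""
    kelvin = int(raw_hex, 16)  # e.g. "134" -> 308
    return round(kelvin - 273.15, 2)


for raw in ("134", "139"):
    # Prints the raw hex value, its decimal (Kelvin) value, and the Celsius result.
    print(raw, int(raw, 16), raw_hex_kelvin_to_celsius(raw))
# 134 308 34.85
# 139 313 39.85
```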
 

bsbeamer

macrumors 601
Sep 19, 2012
4,313
2,713
I've seen the random restart without warning related to a few things in the past:
- Bad stick of RAM
- Failing SSD or HDD
- Failing GPU
- Failing PSU

Install DriveDx and check your drive health. Not just for boot drive. It's usually an easy way to check for potential drive issues.

Check your installed RAM, check for lights, and check for system reporting.

Failing GPU (or a GPU issue) can be harder to diagnose. Sometimes it's specific actions that trigger it. Usually only see this in multiple GPU machines that are either underpowered, or when running specific tasks for CUDA or 3D rendering.

Failing PSU can be harder to diagnose or trigger. Would suggest you use a UPS with your machine.
 

BelgianBoy

macrumors regular
Jun 19, 2018
112
15
Belgium
I've seen the random restart without warning related to a few things in the past:
- Bad stick of RAM
- Failing SSD or HDD
- Failing GPU
- Failing PSU

Install DriveDx and check your drive health. Not just for boot drive. It's usually an easy way to check for potential drive issues.

Check your installed RAM, check for lights, and check for system reporting.

Failing GPU (or a GPU issue) can be harder to diagnose. Sometimes it's specific actions that trigger it. Usually only see this in multiple GPU machines that are either underpowered, or when running specific tasks for CUDA or 3D rendering.

Failing PSU can be harder to diagnose or trigger. Would suggest you use a UPS with your machine.

I have a drive that is "failing": too many reallocated bad sectors. Other problems too: temp, startups, etc.
Funny, I hadn't had problems before I installed my new PCIe NVMe SSD.
 

bsbeamer

macrumors 601
Sep 19, 2012
4,313
2,713
Is the failing drive your boot drive or another?
Is there a heatsink on your NVMe drive? Thermal pads? Properly installed?

Recommend using Carbon Copy Cloner and clone EVERYTHING to an external drive ASAP.

May just be coincidental timing with your issues. I'm using NVMe with 6 other SATA SSDs in an MP5,1, and none report issues in DriveDx.
 

BelgianBoy

macrumors regular
Jun 19, 2018
112
15
Belgium
The boot drive (NVMe) has a heavy heatsink (kryoM.2 evo PCIe 3.0 x4 adapter for M.2 NGFF PCIe SSD, M-Key, with passive cooler) and runs at 35-38° Celsius. The failing drive is the original MP 5,1 drive (old enough!). Already ordered a replacement.
 