Are there any journal articles or warranty/technical docs from Apple which show this?
From Apple, unfortunately, I’ve only seen anecdotal evidence. In general:



The results seem to be based on luck (manufacturing defects a.k.a. “silicon lottery”) and storage use (i.e., not just how much data but in what size chunks are written).




I definitely agree that there are a ton of advantages to the Mac SoC. The problem is the use case I have (ML, deep learning, AI), which is guaranteed to produce a lot of writes on the SSD, so a more robust solution is needed, as I wish to extend the life of my little silver wonder (Mac mini M2 Pro) as much as possible.
I’m not familiar with that category of software. Is there a way to specify the “scratch disk” (i.e., cache file location) like in video and image editing apps?
 
TBH this was my first thought, as those I know in a similar position utilise enterprise cloud solutions or, in the case of a large organisation, their own specific enterprise-level HW, again generally in the cloud.

Q-6
Hiya, thanks for the reply.

Not an enterprise, just an individual doing ML and programming an AI project, but this kind of argument is why companies are utilising AI more and more, leaving normal developers and FOSS developers in the dust with little recourse but to come up with their own solutions. That is why I want to run the Mac from an external SSD, which has been done many times before on older Macs (2018 and earlier) and should be possible with the M2 Pro mini.

Regardless, consumer SSDs of 1TB and below cost less than £50, and I have accounted for this. £50 is less than £79-£229, and it still allows me to achieve a somewhat low-cost solution.

The point is to have a solution that can cater to normal people who may wish to explore AI-related programming, which unfortunately requires a lot of writes.

AI is here, and for people to be competitive they need to learn it and how to utilise it.
 
Hi Leman, thanks for the reply.

So, in other words, you do not have an empirical estimate and your statements about the storage endurance of Apple SSDs are mere speculation.

Based on known and real-world variables, published documents, and peer-reviewed journal articles (actual scientific research), yes.

Personally, I would not want to gamble with my technology, and I do not have the time to stress test an M2 Pro Mac mini. I'm happy for anyone else to try, but I am unsure whether Apple would be happy to refund you after they did it.

"Accidental damage" is precisely defined by the insurance terms as physical damage due to handling or an external event. Internal component failures are not accidental damage. At most they could argue that a failed SSD is wear and tear. But that will be hard given they never advertise any limits.

This is again a slippery slope. Why dump the whole responsibility onto Apple? If anything, my use case is special, and I would not want to force a price hike for AppleCare+. I prefer having a roundabout solution that Apple supports, such as having kernel access when configured (Startup Security Utility > Reduced Security, allowing kernel extensions from identified developers, with Gatekeeper then doing its job).

Unfortunately this does not happen when I run macOS from an external NVMe drive, and as a result libraries and other FOSS that I use fail, as they cannot get kernel access regardless of disabling SIP and Gatekeeper on the external volume. I am stuck running App Store apps only, which does not suit my use case.

Wait a moment, 2,500TB? As in, two thousand five hundred TBs? Assuming 24/7 operation that's almost 1GB/s of continuous writes to storage! That is an absolutely insane figure that will require specialised enterprise-level equipment. Discussing this kind of need in the context of personal computer storage is like comparing the logistics capabilities of a family car to those of an industrial port.

Sad reality of working with ML and AI. I am not an enterprise, but this is why big business will leave third-party developers and FOSS developers in the dust if they do not evolve their workflows. I have just started this journey to try, and I recommend it to all who are willing.

IMO more people need to be learning AI; it's not only a boom industry, but without these skills some jobs may disappear or be taken over by people who can use it.



So, with that said, for me, and this is only my opinion, if it means anything: the devices need to cater for AI development, and considering that Apple has a Neural Engine in the SoC, I would think that the ability to upgrade the SSD would make sense.
 
Hi MacCheetah3,

Thanks for the reply.
I’m not familiar with that category of software. Is there a way to specify the “scratch disk” (i.e., cache file location) like in video and image editing apps?

In terms of AI/ML, from my understanding, you could call the process of writing to and reading from the drive at a continuous rate a form of cache, but it's actually streaming data on and off at high speed.

Basically, in ML you are limited by the bottleneck of the slowest device/area.

So this could be bus speeds, interface speeds, the speed of the storage media, the processor (often the GPU), power, the model in creation or in use, etc. (please bear with me, I am still learning).

Now, the speed at which the AI can get data off the drive and process it is important, as it should be as fast as the GPU/model you are using. The AI can then draw conclusions based on the data it is being fed and recall these conclusions from a long-term database.

What you need to remember about the SSD in this scenario is that nothing is stored on it for long. We are not talking about one single file; this is a short-term database, and what matters is how fast this data is streamed off the drive. Hence the large amount of data writes I am mentioning may seem excessive, but in AI this is the bare minimum to be functional.

Ideally you need the AI to scale with the workload required, which is why a Kubernetes cluster will be needed in future, as then I can add and remove hardware resources (elasticity) as needed.

I just have to get this project done first in the short term, and a replaceable SSD at a reasonable price (there are already existing solutions, and I am not a company with unlimited disposable income, so there has to be realistic pricing for the general consumer) is the only solution.
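For what it's worth, a lot of tooling lets you point its scratch and temp locations somewhere other than the boot drive. Below is a minimal Python sketch of the idea; the external volume path is hypothetical, and whether a given library actually honours these settings depends entirely on that library, so treat it as a starting point rather than a recipe.

import os
import tempfile

# Hypothetical external volume used as the scratch target.
SCRATCH = "/Volumes/ExternalNVMe/ml-scratch"
os.makedirs(SCRATCH, exist_ok=True)

# Generic knobs many tools respect: the POSIX temp directory and Python's tempfile module.
# Framework-specific cache settings exist too, but they vary, so check each library's docs.
os.environ["TMPDIR"] = SCRATCH
tempfile.tempdir = SCRATCH

# Write intermediate results explicitly to the scratch volume rather than
# wherever the current working directory happens to be.
checkpoint_path = os.path.join(SCRATCH, "checkpoint-0001.bin")
with open(checkpoint_path, "wb") as f:
    f.write(b"\x00" * 1024)  # placeholder payload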
 
You can try Innie. You may need to disable SIP, but if it works for you, you can always notarize the kext.

Hi Startergo,

Thanks for the response!

I'll take a proper look soon, thank you for the recommendation.

Looking at it in passing, it is definitely a testing point to see if kernel access is achievable on the external SSD if the Mac thinks it's an internal one.

I think for now I will let the Apple engineers have a go before I potentially do anything that could invalidate the Apple warranty.

I have a case open, and my posting on this forum was to see if others have had this experience; specifically, kernel access being refused when booting off an external SSD, regardless of disabling Gatekeeper and SIP.
 
Hiya, thanks for the reply.

Not an enterprise, just an individual doing ML and programming an AI project, but this kind of argument is why companies are utilising AI more and more, leaving normal developers and FOSS developers in the dust with little recourse but to come up with their own solutions. That is why I want to run the Mac from an external SSD, which has been done many times before on older Macs (2018 and earlier) and should be possible with the M2 Pro mini.

Regardless, consumer SSDs of 1TB and below cost less than £50, and I have accounted for this. £50 is less than £79-£229, and it still allows me to achieve a somewhat low-cost solution.

The point is to have a solution that can cater to normal people who may wish to explore AI-related programming, which unfortunately requires a lot of writes.

AI is here, and for people to be competitive they need to learn it and how to utilise it.
As said, go with external TB4/USB4 SSDs for the data. Surely the SW is not so inflexible that it expects the app & data to be on the same drive. Also, remember the Mac Mini is an entry-level system and not designed for such massive writes to the SSD.

In general I believe Apple's SSDs are of good quality; however, Apple definitely abuses its position on upgrades, which should be reined in. Bottom line for you is, you're in a niche use case. If you need to complete the project you will likely have to make compromises, such as external SSDs, or hope the Mini holds up long enough.

The larger, more expensive Macs offer far greater RAM, and potentially a RAM drive would alleviate the wear on the SSD; equally, if working on a budget, you need to make do as is...
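(For reference, the classic macOS RAM-disk recipe can be scripted. A rough Python sketch follows; the 8 GiB size and the "RAMDisk" volume name are arbitrary choices, and everything on the RAM disk is lost on eject or reboot.)

import subprocess

# Size in 512-byte sectors: 8 GiB = 16,777,216 sectors (adjust to taste).
SECTORS = 8 * 1024 * 1024 * 1024 // 512

# Attach an unmounted RAM-backed device and capture its /dev/diskN identifier.
device = subprocess.check_output(
    ["hdiutil", "attach", "-nomount", f"ram://{SECTORS}"],
    text=True,
).strip()

# Format and mount it as /Volumes/RAMDisk.
subprocess.run(["diskutil", "erasevolume", "HFS+", "RAMDisk", device], check=True)

# When finished (contents are gone after eject or reboot):
# subprocess.run(["diskutil", "eject", device], check=True)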

Q-6
 
Personally, I would not want to gamble with my technology, and I do not have the time to stress test an M2 Pro Mac mini. I'm happy for anyone else to try, but I am unsure whether Apple would be happy to refund you after they did it.
Well, that's generally how it works professionally. You purchase the right system for the job and just hammer it. If you move on to a more demanding project, you replace it. No sense overspending as technology marches on; if all works out, the systems pay for themselves rapidly.

As long as your Mac is under warranty, Apple will cover any non-user-caused (liquid ingress, obvious physical abuse, etc.) repair related to HW failures, up to and including replacing the computer. I wouldn't overly worry about it. I always pushed my systems hard, and they have to be notebooks, which adds more complexity, and they survived the abuse...

I always travel with two systems, likely a Mac & a PC, and don't worry about failure. Should one go down on the job, I'd just replace it once home.

Q-6
 
Thank you for your reply Queen6.

Surely the SW is not so inflexible that it expects the app & data to be on the same drive.


Unfortunately this seems to be the case. You can think of it as a running environment where the AI will be "thinking"/processing the data.
You could push some of the work to an external drive for sure, but the reality is that the drive the AI is running on will get a huge number of writes.

I guess you can look at it the same way you would look at other software like video editing: you perform some writes when the software is run, or when the software is doing work like video encoding.

The encoding happens on the main system, and then the result is output to a location of your choosing.
In general I believe Apple's SSDs are of good quality; however, Apple definitely abuses its position on upgrades, which should be reined in.

Indeed, my experience of Apple products has been of very high quality and standard; it's one of the reasons I love them so.

Bottom line for you is, you're in a niche use case. If you need to complete the project you will likely have to make compromises, such as external SSDs, or hope the Mini holds up long enough.

For the consumer market, indeed niche; in terms of the market as a whole, it's booming!

If cost-effective solutions are not given to consumers, then I fear general FOSS and software development may be stifled.

Heck, GitHub uses Microsoft's machine learning to train its AI (OpenAI, in which Microsoft owns a majority share), so as to program without programmers. This is what we will be competing with, and if we do not make our own solutions, then the open market of ideas will fail and roll into the same behaviours some free-to-play games offer.

Which they may be forced to ramp up considering Unity's recent change of ToS, requiring massive passive compensation from developers if the user base gets over a set number…

Ref: https://www.axios.com/2023/09/13/unity-runtime-fee-policy-marc-whitten

And this is only part of the craziness that will continue until people develop alternatives. 

All I am saying is that Apple could really make a name for itself by providing consumer-grade solutions for AI development problems that are cost-effective and affordable.

How much would an additional NVMe slot cost? Could they shift more units if people found out they catered to AI development and FOSS with competitive pricing and some flexibility?

I see this as a massive opportunity!

The larger, more expensive Macs offer far greater RAM, and potentially a RAM drive would alleviate the wear on the SSD; equally, if working on a budget, you need to make do as is...

Indeed, I agree here, hence my aim to get kernel access on an external SSD, which Apple has stated they allow; it is just that my system, or the M2 in general, may be having problems doing so.

The investigation continues!

I will be off for the rest of the day now, as I have to provide Apple with some additional reports to try and resolve this.



Again, I am only posting here to see if others who have the M2 Pro mini or the like have experienced issues getting kernel access when booting off an external NVMe.

Thanks again all for your input!
 
Based on known and real-world variables, published documents, and peer-reviewed journal articles (actual scientific research), yes.

Given that there are no published documents or scientific research regarding the endurance of Apple SSDs, what you say essentially boils down to "uneducated guess".

Sad reality of working with ML and AI. I am not an enterprise, but this is why big business will leave third-party developers and FOSS developers in the dust if they do not evolve their workflows. I have just started this journey to try, and I recommend it to all who are willing.

IMO more people need to be learning AI; it's not only a boom industry, but without these skills some jobs may disappear or be taken over by people who can use it.



So, with that said, for me, and this is only my opinion, if it means anything: the devices need to cater for AI development, and considering that Apple has a Neural Engine in the SoC, I would think that the ability to upgrade the SSD would make sense.

Be that as it may, continuously writing 1GB/s in no way constitutes normal use of consumer equipment. This kind of requirement will burn out pretty much any SSD in a couple of days. I have no idea why you need to write so much data or how you can do this kind of work on a Mac Mini, but a replaceable SSD will be as useful here as a pair of steel-cap work boots when running over a minefield. As already mentioned, this requires special, and usually very expensive, equipment. Your use case does not translate to even the most extreme professional demands (in the domain of personal workstations, at least).
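(For the arithmetic behind that figure: 2,500 TB per month divided by the roughly 2.6 million seconds in a month works out to about 0.96 GB/s.)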
 
The other thing to consider is how NAND memory fails. NAND includes internal self-monitoring and control of access to the actual registers and banks. This is consistently in use as part of the wear-leveling process; as a side effect, bad/failed sites are avoided. While there are other failure mechanisms, this is the most common. Up to a point this remains unnoticeable, since any loss just reduces the memory available during wear leveling. Eventually, though, it will begin to erode the available addressable memory.

The result is that NAND SSDs typically “fail” in a different fashion than spinning-platter HDDs. No speed reduction, no clicking; they don’t run hotter. When NAND SSDs “fail”, their reported capacity slowly decreases. This can make determining how much loss constitutes a failure a matter of interpretation: 0.1%? 1%? 10%? Etc.

Having excess capacity can help prolong SSD life. Writes to the same logical location are constantly being remapped internally at the chip level, so even if your write command targets the same address, the physical location eventually “migrates”.

A side effect is that securely erasing NAND memory (so it’s permanently non-recoverable) is very, very difficult. Multiple commanded writes are not a sufficient approach. Some sort of physical destruction is usually needed. Sticking the chips in a microwave oven works.
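To put the wear budget into rough numbers, here is a back-of-the-envelope sketch in Python. The TBW rating, write rates, and workload penalty are illustrative assumptions, not Apple (or any other vendor's) figures.

# Rough SSD lifetime estimate from a TBW rating and a sustained host write rate.
# All numbers below are illustrative assumptions, not vendor specifications.

def estimated_lifetime_days(tbw_rating_tb: float,
                            host_writes_tb_per_day: float,
                            workload_penalty: float = 1.0) -> float:
    """TBW ratings assume a particular workload; a penalty > 1 roughly models
    the extra wear (write amplification) caused by lots of small random writes."""
    effective_budget_tb = tbw_rating_tb / workload_penalty
    return effective_budget_tb / host_writes_tb_per_day

# A hypothetical 600 TBW consumer drive written at ~80 TB/day (roughly 1 GB/s sustained):
print(estimated_lifetime_days(600, 80))      # ~7.5 days
# The same drive under a typical desktop load of ~0.05 TB/day:
print(estimated_lifetime_days(600, 0.05))    # ~12,000 days, i.e. decades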
 
Given that there are no published documents or scientific research regarding the endurance of Apple SSDs, what you say essentially boils down to "uneducated guess".
You confuse me no end; define "uneducated guess"? This research is quantitative and empirical, testing common SSDs across most manufacturers.

Do you think a "brand" makes the difference when it comes to actual scientific research / testing?
Or should it be what they put into the product?

What makes Apple internal SSDs different besides the SSD controller being part of the SoC?
Are the memory bank chips specific to Apple?
Do they fabricate them in some special way?
How does this make them 100% resistant to SSD wear?

In the consumer's eyes, if the company is loyal and trustworthy (which Apple tends to be), then they can go by this and give their loyalty in return. But in reality, Apple is a business and will do what is best for business. The question really is: what is best for business? And that is why I continue to believe that a company investing in consumer-level AI research, and not simply offering a high-end enterprise solution, would increase its sales and change the market fully, bringing on a consumer-level AI boom and not just an industrial one.

Be that as it may, continuously writing 1GB/s in no way constitutes normal use of consumer equipment.

Normal is relative, and depends on the project a person is doing at the time, so I think when you say normal, you mean normal for you?

I have no idea why you need to write so much data or how you can do this kind of work on a Mac Mini, but a replaceable SSD will be as useful here as a pair of steel-cap work boots when running over a minefield. As already mentioned, this requires special, and usually very expensive, equipment.

I have said many times why I am using this, and why replaceable SSDs are needed. Consider this: £40 a time for a 1TB NVMe Samsung 970 Pro vs £1.3k to replace a Mac mini M2 Pro... hmmm

Regarding the steel-toe-cap boots, I fear you have no army experience and perhaps have not been in cadets? Army boots and steel-toe-cap boots are essentially the same thing, so I do not understand your remark about being "as useful here as a pair of steel-cap work boots when running over a minefield".

Essentially they would be very useful indeed. So... my recommendation is useful?

Please see link: https://fromyourtactics.com/tactical-boots-vs-work-boots-are-tactical-boots-good-work-boots/

I think I could easily have a beer with you and we could debate the night away! ;)

As already mentioned, this requires special, and usually very expensive, equipment. Your use case does not translate to even the most extreme professional demands (in the domain of personal workstations, at least).

It is only expensive right now from Apple; other suppliers offer alternatives which can easily be set up, but I want it to work with Apple, as I love Apple.

And besides, engineers have already told me that kernel-mode access should be available when booting from a Thunderbolt 4 / USB4 SSD if I have allowed this. I am still waiting for them to come back to me, and I hope it is fixable with a firmware update, as this would solve my problem.

Your use case does not translate to even the most extreme professional demands (in the domain of personal workstations, at least).

Those demands are evolving in real time, and I am simply trying to meet them.
 
A side effect is that securely erasing NAND memory (so it’s permanently non-recoverable) is very, very difficult. Multiple commanded writes are not a sufficient approach. Some sort of physical destruction is usually needed.
On Apple silicon, without the Secure Enclave's keys the data on the NAND SSD is indistinguishable from noise. No need for physical destruction.
 
On Apple silicon, without the Secure Enclave's keys the data on the NAND SSD is indistinguishable from noise

The effectiveness of encryption, especially given the evolving status of quantum computing, is a moving target, and highly dependent on levels of paranoia and timelines. Government users can be very paranoid.
 
Apple has been using a very similar way of booting since Intel. I believe it goes back to the T1 or T2 Macs? There's a fascinating video on how the boot process works from DEF CON, I believe from 2018 or 2019.

I’d encourage you to give it a look.
It's a little different from that.

Apple SoCs boot via a process I'll call "iBoot" (the actual name of one of the pieces of software involved). This was first created for iOS devices, and has been extended in major ways to support Apple Silicon Macs. iOS devices didn't have to support multiple installed operating systems side by side, each with its own security policy, booting from an external drive, and some other things. Macs need these features, so iBoot had to get a lot more complex to support them.

Intel CPUs boot using UEFI firmware. UEFI stands for Unified Extensible Firmware Interface; it was created by Intel in the early 2000s. Apple adopted UEFI when they switched Macs from PowerPC CPUs to Intel. (As an aside, back then they did put out requests for feedback on what firmware to use with Intel CPUs, and one of the options was to continue using Open Firmware, the same firmware standard Apple used on Power Macs.)

T1 and T2 Macs are odd ducks. They have both Intel CPUs and Apple SoCs. However, they aren't exceptions to the above rules: the Apple SoC boots using iBoot and the Intel CPU boots using UEFI.

I've watched the talk you mention, it's good! The summary of it is that in T2 Macs, Apple wanted to make UEFI considerably more secure than was possible within the letter of the UEFI spec. There were several boot security features they wanted for Macs which were easy in iBoot but impossible in UEFI. These were all things having to do with the very early boot environment as UEFI just begins to execute.

So, they turned the Intel UEFI boot process into something managed by the T2. While the T2 starts up, the x86 CPU is held in reset. Once the T2 is running, it does setup work on behalf of the x86 CPU and checks the cryptographic signature of the x86 UEFI image. Once the T2 is satisfied that everything's ready to start UEFI safely, it injects the UEFI image into memory visible to the x86 and lets the x86 exit reset so it can begin booting.
 
And besides, engineers have already told me that kernel-mode access should be available when booting from a Thunderbolt 4 / USB4 SSD if I have allowed this. I am still waiting for them to come back to me, and I hope it is fixable with a firmware update, as this would solve my problem.
What do you mean by "kernel mode access", and why would you need it for this application? All you need is storage. Machine learning should be a regular userspace process.

Also, what would be different about booting from an external? You gave a fairly weird answer before that's hard to make any sense out of.

Also, who cares where you boot from, why can't you just tell your software to use whatever storage you want it to use? It's almost inconceivable that it's so badly written that it forces all writes to go only to the boot drive.

A lot of the things you've been saying in this thread lead me to believe that you're in way over your head, have no idea what you're doing, and have been freaking out because you googled things, misunderstood them, and leapt to the wrong conclusions.

Also I would echo @leman in saying that if you're training ML models on such a tremendously large data set that you need 2500 TB/month worth of writes, why on earth are you doing that kind of work on a Mac Mini? How did you ever plan to do all this on its internal drive? The normal machine to "acquire" for this kind of work is (if you're only doing it once or twice) not to buy, but rent a big AWS instance for a month or two. One stuffed with a bunch of GPUs to accelerate model training. (This is one of the many reasons why I suspect you and/or your boss have no idea what you're doing.)
 
There are some possibilities out there if you are talented enough to solder chips, or you can just go and use an Acasis TBU405 with a WD SN770, SN850 or SN850X.


For my MBA M2 2022 it was a no-brainer to buy 1TB of internal storage, as I would have had to add the cost of a TB hub and the Acasis, which would have meant less portability and a higher price.

With a Mac mini the story is different, and with the Studio I would guess someone could do the Mac mini soldering (video with DosDude1) on the NVMe without a controller card as well. This could lead to some add-on solution.

This could also be a repair solution for the internal storage.
 
You confuse me no end; define "uneducated guess"? This research is quantitative and empirical, testing common SSDs across most manufacturers.

You are making guesses about the endurance of an SSD without having tested said SSD or looked at the empirical data pertaining to it. That's your own admission. This makes what you say about the endurance of Apple SSDs an uneducated guess.

Normal is relative, and depends on the project a person is doing at the time, so I think when you say normal, you mean normal for you?

"Normal" as in what these devices are designed for and what people in general use them for. You say you are writing 2500TB per month. Enterprise SSDs have endurance around 10000TBW (some more, some less), which means you will burn one out in under half a year.

I have never heard of any workloads requiring writes on this scale; this is well above any usual datacenter/server/workstation use. The only thing that comes to mind is folks who work on large particle accelerators and have to write a lot of sensor data quickly, but they don't do it 24/7.
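(As a rough check: 10,000 TBW divided by 2,500 TB written per month is four months.)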

I have said many times why I am using this, and why replaceable SSDs are needed. Consider this: £40 a time for a 1TB NVMe Samsung 970 Pro vs £1.3k to replace a Mac mini M2 Pro... hmmm

A Samsung 970 Pro has an endurance of up to 1200 TBW; it's entirely unusable for your needs as it will fail within one or two weeks of you hammering it with your insane amount of writes (probably earlier, because the controller likely won't be able to keep up with your demands).


It is only expensive right now from Apple; other suppliers offer alternatives which can easily be set up, but I want it to work with Apple, as I love Apple.

For 1GB/s continuous writes? Sorry, but you appear to be very confused. Consumer SSDs support the speeds you need but they don't have the endurance. Enterprise SSDs that have the speed and endurance to run at these conditions for at least a couple of months are expensive and bulky, and they use either PCIe or SAS. There is nothing "affordable" about that, and it wouldn't fit in a Mac Mini. But you can buy a Mac Pro; you should be able to afford it if you can keep buying a new SSD every week.

And by the way, most current enterprise SSDs are rated for 1-3 DWPD for 3-5 years. But you will completely overwrite a large enterprise drive in just a few hours. No equipment is designed for this kind of abuse.
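(As a rough illustration, assuming a hypothetical 7.68 TB enterprise drive: 1 DWPD for 5 years is about 7.68 × 365 × 5 ≈ 14,000 TBW, and at roughly 1 GB/s that drive's full capacity is written in a little over two hours.)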

Those demands are evolving in real time, and I am simply trying to meet them.

As I said before, your demands strike me as ludicrously high. I don't know what exactly it is you are trying to do, and I don't understand your explanations. Your arguments would be easier to stomach if you didn't claim that a fairly mundane workload (ML training/inference) on an underpowered computer (M2 Pro Mac Mini) required storage infrastructure in excess of a large datacenter's.
 
A lot of the things you've been saying in this thread lead me to believe that you're in way over your head, have no idea what you're doing, and have been freaking out because you googled things, misunderstood them, and leapt to the wrong conclusions.

Also I would echo @leman in saying that if you're training ML models on such a tremendously large data set that you need 2500 TB/month worth of writes, why on earth are you doing that kind of work on a Mac Mini? How did you ever plan to do all this on its internal drive? The normal machine to "acquire" for this kind of work is (if you're only doing it once or twice) not to buy, but rent a big AWS instance for a month or two. One stuffed with a bunch of GPUs to accelerate model training. (This is one of the many reasons why I suspect you and/or your boss have no idea what you're doing.)

Amen to that. My impression is that they are either logging every single training step to disk (which serves no purpose; also, would the M2 Pro even support training at that kind of speed?) or doing some other kind of nonsense. Or maybe it's actually 2.5TB/month and not 2500TB/month, in which case a regular Mac SSD should last for over 15 years.

At any rate, I agree with you that these posts strike me as confused and incoherent. Especially the claims that this is typical use for machine learning. I'm not a machine learning expert, but I've been working in empirical research as a data analysis specialist for two decades and I've never seen anything like that.
 
Amen to that. My impression is that they are either logging every single training step to disk (which serves no purpose; also, would the M2 Pro even support training at that kind of speed?) or doing some other kind of nonsense. Or maybe it's actually 2.5TB/month and not 2500TB/month, in which case a regular Mac SSD should last for over 15 years.

At any rate, I agree with you that these posts strike me as confused and incoherent. Especially the claims that this is typical use for machine learning. I'm not a machine learning expert, but I've been working in empirical research as a data analysis specialist for two decades and I've never seen anything like that.
Amen too!

I worked for a company that provided services for betting shops, and the display updates of the sports data were writing TBs to the poor SSDs, which made them fail.

The solution I came up with was to just use a RAM disk (and optimise the code). The failure and service rate dropped dramatically after that.

On the other hand, if the amount of data written really is that high, I would investigate my programming.

Sometimes it helps to use your own brain.
 
"Normal" as in what these devices are designed for and what people in general use them for. You say you are writing 2500TB per month. Enterprise SSDs have endurance around 10000TBW (some more, some less), which means you will burn one out in under half a year.

I have never heard of any workloads requiring writes on this scale; this is well above any usual datacenter/server/workstation use. The only thing that comes to mind is folks who work on large particle accelerators and have to write a lot of sensor data quickly, but they don't do it 24/7.
Workloads like that are very common in bulk data processing. When you have more data than memory, the natural approach is designing algorithms in terms of sequential scans, sorting, and joins. And that involves constant writing and reading of temporary files. Early Unix and VAX/VMS systems did that a lot by necessity. Database engines use similar approaches. MapReduce, which used to be popular back in the day, was essentially a rebranding of the same idea.

Many people were skeptical of SSDs 15-20 years ago, because it seemed impossible to extend their lifetime beyond a few weeks, or at most a few months, in I/O heavy workloads. But as memory/storage capacity grew faster than most workloads, issues like that became less relevant. If SSD lifetime is a few weeks at 100% load but the average load is only 0.5%, the SSD can be expected to last a decade.
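As a concrete illustration of that pattern, here is a minimal external merge sort sketch in Python: sort fixed-size chunks in memory, spill them to disk as sorted runs, then stream-merge the runs. The chunk size and the use of newline-terminated text lines are arbitrary choices.

import heapq
import os
import tempfile

def external_sort(lines, chunk_size=1_000_000, workdir=None):
    """Sort an arbitrarily large stream of newline-terminated text lines in
    bounded memory by spilling sorted runs to temporary files and merging them."""
    run_paths = []
    chunk = []

    def spill():
        chunk.sort()
        fd, path = tempfile.mkstemp(dir=workdir, text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(chunk)
        run_paths.append(path)
        chunk.clear()

    for line in lines:
        chunk.append(line)
        if len(chunk) >= chunk_size:
            spill()
    if chunk:
        spill()

    # Stream-merge the sorted runs; every run is read back sequentially from disk.
    files = [open(p) for p in run_paths]
    try:
        yield from heapq.merge(*files)
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)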
 
Workloads like that are very common in bulk data processing. When you have more data than memory, the natural approach is designing algorithms in terms of sequential scans, sorting, and joins. And that involves constant writing and reading of temporary files. Early Unix and VAX/VMS systems did that a lot by necessity. Database engines use similar approaches. MapReduce, which used to be popular back in the day, was essentially a rebranding of the same idea.

Absolutely, but in bulk processing you are usually dealing with existing (large) data, which makes it a read-heavy and not write-heavy operation. And each processing step reduces the size of the data — for many use cases the aggregated result can even stay in memory during the entire scan (when I do this kind of processing I like to periodically write out intermediate results in case something goes wrong so that I can restart from the middle, but these are small data packages).

The use case described here is data being constantly generated at an incredibly high rate. 1GB/s — that's a TB every 15 minutes! And it sounds like the poster is continuously overwriting this data as they don't mention petabyte-class storage systems (in fact, they seem to believe that their work can be done on a Samsung 970 Pro, which is doubly confusing to me as that SSD tops out at 1TB and will run out of endurance in mere weeks, if not earlier due to the controller being overwhelmed). What kind of use case is that? When is that data being consumed? If they can generate the data with that speed I assume they must also have the capability to process it at the same speed. Why even write it out then? None of this makes any sense to me.
 
Absolutely, but in bulk processing you are usually dealing with existing (large) data, which makes it a read-heavy and not write-heavy operation. And each processing step reduces the size of the data — for many use cases the aggregated result can even stay in memory during the entire scan (when I do this kind of processing I like to periodically write out intermediate results in case something goes wrong so that I can restart from the middle, but these are small data packages).
Another common pattern is starting from a manageable amount of data, generating massive amounts of intermediate data, and collapsing it to a manageable amount of final data.

Large-scale graph operations are one example. Once upon a time, I had graphs with billions of nodes, where each node was labeled by a single character. I wanted to search the graphs to find paths with a given label. An optimized representation of the graph was ~10 GB and the index was a bit larger, but building the index required a lot of work and I/O. The basic idea was generating all paths of length 2^k (e.g. 128 or 256), or shorter prefixes of them if the label of the prefix already defines the initial node uniquely.

The algorithm was something like this: You have a list of paths of length (up to) n, and you want to double the length to (up to) 2n. You create two copies of the list, sort one copy by the initial node and another by the (one-past) final node. Then you scan over the lists, join paths when the nodes match, and write a list of new paths. Then you sort the new paths by their labels, scan the list, and mark paths with labels that define the initial node uniquely. If you don't have enough memory, pretty much every step involves duplicating the data and writing it to disk.

The year was something like 2016, so we could assume at least 256 GB RAM, which was enough for doing the joins in memory for any connected component of the graph (but not for the full graph). On the other hand, we were still using HDDs. In the optimized version I ended up writing, a typical run took 20-30 hours. We had two sets of path lists on disk (~500 GB each), and the average write (read) rate was 70 (80) MB/s. Both rates should exceed 1 GB/s if you run the low-memory version of the algorithm on a computer with a fast SSD but less RAM.
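If I have understood the doubling step correctly, its in-memory core looks roughly like the sketch below, with paths represented as (start_node, end_node, label) tuples; the disk spilling, sorting, and prefix-uniqueness pruning described above are all omitted.

from collections import defaultdict

def double_paths(paths):
    """One doubling step: join each path whose end node matches another path's
    start node, producing paths of up to twice the length.
    A path is a (start_node, end_node, label) tuple."""
    by_start = defaultdict(list)
    for p in paths:
        by_start[p[0]].append(p)              # one copy, indexed by start node

    longer = []
    for start, end, label in paths:           # scan the other copy by end node
        for _, end2, label2 in by_start.get(end, ()):
            longer.append((start, end2, label + label2))
    return longer

# Tiny example: two length-1 paths that chain into one length-2 path.
print(double_paths([(1, 2, "a"), (2, 3, "b")]))   # [(1, 3, 'ab')]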
 
Hi NT1440, thanks for the response.

Please can you let me know how this problem is hypothetical?



All SSDs have a finite amount of life, which is stated by manufacturers as a TBW (total bytes written) capacity in warranty documentation. Research continues in industry to extend SSD life as much as physics will allow. So I am confused as to how this is hypothetical?

Perhaps your setup does not produce that many writes?

Please let me know how you maximise the life of your Mac SSD; is there something you can recommend I try?

Some references for you regarding TBW.



The issue is hypothetical because the computer itself is more likely to die before modern SSDs. TBW is so high with modern SSDs that you're unlikely to ever hit that limit, even if doing a lot of heavy 4K video editing. For example, the Samsung 980 Pro series of SSDs have TBWs of 600x the listed capacity (e.g., 250GB - 150TBW, 500GB - 300TBW, 1TB - 600 TBW, 2TB - 1200 TBW), meaning you could fully rewrite the entire drive 600 times. Crucial's 2TB MX500 SSD has a TBW of 700TB, which is good for 350 complete rewrites of the drive. Enterprise SSDs, such as those used in datacenters and services such as AWS, Apple's own iCloud datacenters, etc. have even higher TBW ratings, and MTBF ratings in the millions of hours. The Microsoft article you cited is from 2019, which is an eternity in the tech sector.

The attached screenshot is from Crucial's spec sheet for the MX500 series. Note that the MTTF is equivalent to roughly 75,000 days of continuous usage.

[Attached screenshot: Crucial MX500 spec sheet]

On a side note, there is no way an individual is writing 2500TB a month to any SSD.
 
The issue is hypothetical because the computer itself is more likely to die before modern SSDs. TBW is so high with modern SSDs that you're unlikely to ever hit that limit, even if doing a lot of heavy 4K video editing. For example, the Samsung 980 Pro series of SSDs have TBWs of 600x the listed capacity (e.g., 250GB - 150TBW, 500GB - 300TBW, 1TB - 600 TBW, 2TB - 1200 TBW), meaning you could fully rewrite the entire drive 600 times. Crucial's 2TB MX500 SSD has a TBW of 700TB, which is good for 350 complete rewrites of the drive. Enterprise SSDs, such as those used in datacenters and services such as AWS, Apple's own iCloud datacenters, etc. have even higher TBW ratings, and MTBF ratings in the millions of hours. The Microsoft article you cited is from 2019, which is an eternity in the tech sector.

The attached screenshot is from Crucial's spec sheet for the MX500 series. Note that the MTTF is equivalent to roughly 75,000 days of continuous usage.

[Attached screenshot: Crucial MX500 spec sheet]

On a side note, there is no way an individual is writing 2500TB a month to any SSD.
Thank you, you put it way better than I could have.
 