They also have other useful features like surge-protected USB ports, per-port MMUs, and memory controller QoS, and the M5 GPU just doubled integer multiplication throughput - but you won't find any of these things on the marketing sheet. Not a strong argument IMO.
Perhaps Apple has implemented full end-to-end ECC RAM without telling anybody. That would not be an argument against it. On the contrary. It would show that Apple also believes it's a good idea to have.
 
When they rolled out the new Mac Pro using Apple Silicon, I was shocked that they used the same enclosure. On one level I understand the logic of reusing an existing enclosure, but it really highlighted the deficiencies of this $6,000 - $10,000+ computer. With Apple Silicon and most everything already soldered onto the logic board, the size, cooling, and expansion bays were useless in the Mac Pro.

I love the design of the case, and I was looking for a knock-off to build my own PC for years, so it's not like I have anything against the design - it's fantastic.
In 5-10 years you can get them dirt cheap from OWC.
 
Just a few comments on this:

- What makes you think that Apple wants or needs to train models on their own hardware (inference is a different matter)?
- M3 Ultra is slow for ML because it still lacks ML acceleration. M5 Ultra might be less slow.
- The latest macOS beta introduces Infiniband support, which is the protocol Nvidia uses to build large AI clusters. This would allow you to connect multiple Studios together and use them as a distributed ML accelerator at a lower price than an equivalent hypothetical Mac Pro. And it is very possible that this is what Apple will use internally to link multiple Max or Ultra class chips into coherent compute clusters.
Your posts make me smarter, Leman--they're objective, full of expertise and insight, and clear about where expertise ends. The opposite of most MR posts which make me dumber.
 
The comparisons you mentioned are totally biased.
ChatGPT is trained using nVidia chips. They didn’t need to design their own chip, and neither does Apple.

Whether Apple chooses to or not is their own decision - it's build versus buy - but it is not a necessity for them to make their own server chip.
 
ChatGPT is trained using nVidia chips. They didn’t need to design their own chip, and neither does Apple.

Whether Apple chooses to or not is their own decision - it's build versus buy - but it is not a necessity for them to make their own server chip.
We are talking about Apple, not OpenAI. Apple is using their own closed-ecosystem with their own hardware and software. That's a huge difference. Yeah, they could use other services but eventually, they would develop and use their own. Besides, tell that to Apple Intelligence.
 
We are talking about Apple, not OpenAI. Apple is using their own closed-ecosystem with their own hardware and software. That's a huge difference. Yeah, they could use other services but eventually, they would develop and use their own. Besides, tell that to Apple Intelligence.
Tesla makes their own batteries but they do not make their own tires.

No company NEEDS to develop their own chip in order to train their own AI. I think Apple is so far behind right now that it makes little sense to spend even more time on a chip program, when there are already so many options. Apple has a long history of successfully porting code from one processor to another, so if they train their engine on nVidia or whatever, they can always make a chip later and port to it. Or maybe they’ll never make their own AI chip, beyond putting AI features into their mainstream chips.

Maybe they will, maybe they won’t.
 
Tesla makes their own batteries but they do not make their own tires.

No company NEEDS to develop their own chip in order to train their own AI. I think Apple is so far behind right now that it makes little sense to spend even more time on a chip program, when there are already so many options. Apple has a long history of successfully porting code from one processor to another, so if they train their engine on nVidia or whatever, they can always make a chip later and port to it. Or maybe they’ll never make their own AI chip, beyond putting AI features into their mainstream chips.

Maybe they will, maybe they won’t.
Then why would Apple make their OWN chips to replace Intel, AMD, and Nvidia? Your logic already fails, and it ignores the fact that Apple built their own ecosystem a while ago. Ironically, they don't even use Nvidia GPUs for AI and had to use Google TPUs.

Most importantly, you forgot the fact that AI is dominated by Nvidia's ecosystem unless companies make their own workflows and AI chips, and it doesn't work that way. Eventually, Apple will NEED their own AI chip, because others are already doing that for their own AI workflows and models.

And guess what? Google, Microsoft, Amazon, and Meta have already made their own chips for AI. This only proves why Apple needs their own NPU for their own AI.
 
Then why would Apple make their OWN chips to replace Intel, AMD, and Nvidia? Your logic already fails, and it ignores the fact that Apple built their own ecosystem a while ago. Ironically, they don't even use Nvidia GPUs for AI and had to use Google TPUs.

Most importantly, you forgot the fact that AI is dominated by Nvidia's ecosystem unless companies make their own workflows and AI chips, and it doesn't work that way. Eventually, Apple will NEED their own AI chip, because others are already doing that for their own AI workflows and models.

And guess what? Google, Microsoft, Amazon, and Meta have already made their own chips for AI. This only proves why Apple needs their own NPU for their own AI.
Why doesn’t Apple make their own DRAM and flash memory chips? They resell tens of millions of those chips.

To be honest, I do not think you understand the basics of semiconductors, which is that you need to sell very high volume in order to cover the expense. Apple sells tens of millions of chips in their phones and computers but they don’t need millions of chips to train an AI model. Once trained, they can run it on their own silicon or someone else’s, if they think their data center demands would justify building their own chips for this purpose.
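To put rough numbers on that volume point, here's a quick back-of-the-envelope sketch in Python. The figures (a $500M design/NRE cost, $300 per die, the unit counts) are purely hypothetical placeholders I picked for illustration, not real Apple or foundry numbers - the shape of the result is what matters.

```python
# Back-of-the-envelope illustration of the volume argument above.
# All figures are hypothetical placeholders, not real Apple/TSMC numbers.

def amortised_cost(nre_usd: float, unit_cost_usd: float, units: int) -> float:
    """Per-chip cost = per-die manufacturing cost + share of the fixed design (NRE) cost."""
    return unit_cost_usd + nre_usd / units

NRE = 500e6    # assumed one-time design/mask cost for a leading-edge chip
UNIT = 300.0   # assumed per-die manufacturing cost

print(f"50k units (data-centre scale): ${amortised_cost(NRE, UNIT, 50_000):,.0f} per chip")
print(f"50M units (iPhone/Mac scale):  ${amortised_cost(NRE, UNIT, 50_000_000):,.0f} per chip")
```

At a few tens of thousands of units the fixed cost dominates; at phone volumes it all but disappears, which is the whole argument.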
 
Why doesn’t Apple make their own DRAM and flash memory chips? They resell tens of millions of those chips.

To be honest, I do not think you understand the basics of semiconductors, which is that you need to sell very high volume in order to cover the expense. Apple sells tens of millions of chips in their phones and computers but they don’t need millions of chips to train an AI model. Once trained, they can run it on their own silicon or someone else’s, if they think their data center demands would justify building their own chips for this purpose.
Again, your logic is flawed and biased. I told you, tell that to the companies making their own chips for AI despite Nvidia GPUs being available. Also, Apple already has their own AI servers with the M2 Ultra, which is far from good enough for AI. Clearly, Apple disagrees with you after all.
 
Again, your logic is flawed and biased. I told you, tell that to the companies making their own chips for AI despite Nvidia GPUs being available. Also, Apple already has their own AI servers with the M2 Ultra, which is far from good enough for AI. Clearly, Apple disagrees with you after all.
What Apple chooses to do and what Apple must do are different things. They can do it one way or they can do it another.
 
What Apple chooses to do and what Apple must do are different things. They can do it one way or they can do it another.
And yet, Apple already has their own AI servers with slow chips. That is a fact you cannot deny.
 
We are talking about Apple, not OpenAI. Apple is using their own closed-ecosystem with their own hardware and software. That's a huge difference. Yeah, they could use other services but eventually, they would develop and use their own. Besides, tell that to Apple Intelligence.

They do training on third-party hardware, inference on their own. I don't see what's so difficult about this concept. Do you know how ML models are trained and what they actually look like?
 
The kind of memory errors that ECC RAM is designed to correct (single bit-flips caused by cosmic radiation or electromagnetic interference) are relatively rare occurrences in general, and errors affecting working RAM are even more rare. On a typical home computer or laptop, a bit-flip might cause some effect in software less than a dozen times a year if that machine were left on all day every day, and in most cases, that effect would be unnoticeable. It might manifest itself as an incorrect colour in one or a small group of pixels on a photo or a frame of video during playback, or it might change an 8 to a 9 on a spreadsheet. On a very, very rare occasion, it might affect running code and cause a program to crash.
I don't know if your estimate is correct, but "less than a dozen times a year" issues are common, not rare. Consumer devices are usually on 24/7, as people prefer putting the devices to sleep over full shutdown. (With mobile devices, you don't really have a choice.) In a household with multiple devices, "less than a dozen times a year" can easily become "at least once a week". If 5% of the issues are severe enough to annoy the user, that frequency is high enough to affect the perceived quality of the device.

Many of the error frequency estimates you can find online are probably little more than naive extrapolation. For example, Google reported 25k to 70k errors / billion hours / Mbit in the late 2000s. That would mean hourly errors in most devices, if distributed evenly. But they also reported that only between 1/8 and 1/2 of their servers (depending on server type) and between 3% and 21% of the memory modules saw any errors during the year.
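A quick sanity check of that extrapolation, using only the figures quoted above (25k-70k errors per billion device-hours per Mbit) and an assumed 16 GiB machine - my own arithmetic, nothing more:

```python
# Naive, evenly-distributed extrapolation of the late-2000s Google figures
# quoted above. The point: a uniform rate would imply errors every hour on an
# ordinary machine, which the per-module breakdown (3%-21% affected) contradicts.

def errors_per_hour(ram_gib: float, errors_per_billion_hours_per_mbit: float) -> float:
    mbits = ram_gib * 1024 * 8                                # GiB -> Mbit
    return errors_per_billion_hours_per_mbit * mbits / 1e9

for rate in (25_000, 70_000):
    print(f"16 GiB @ {rate:>6} err/Gh/Mbit: {errors_per_hour(16, rate):.1f} errors/hour")
# -> roughly 3 to 9 errors per hour if the rate were spread evenly
```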
 
Do you dislike journaling file systems? And before you say that filesystem corruption is more frequent than bit errors in RAM
Y'know, it's worth at least reading the first paragraph of Wikipedia on a subject before basing your whataboutism on it:
A journaling file system is a file system that keeps track of changes not yet committed to the file system's main part by recording the goal of such changes in a data structure known as a "journal", which is usually a circular log. In the event of a system crash or power failure, such file systems can be brought back online more quickly with a lower likelihood of becoming corrupted. (https://en.wikipedia.org/wiki/Journaling_file_system)
Nothing to do with "bit errors" (Mass storage devices/filesystems/drive controllers have been using checksums/parity etc. since long before journaling filesystems) and everything to do with mass storage being non-volatile and - unlike RAM - expected to keep its data structures intact in the face of software crashes, disconnections, power-failures etc. that might happen in the middle of a complex update.
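To make that distinction concrete, here's a toy write-ahead-journal sketch in Python (my own simplification, not how APFS or any real filesystem is implemented; the file names and functions are made up): it protects the *structure* of an interrupted update, and does nothing at all about flipped bits.

```python
# Toy journaling sketch: log the intent of a change before applying it, so an
# interrupted update can be replayed after a crash. There is no checksum or
# bit-error handling here - that is a separate mechanism entirely.
import json
import os

JOURNAL = "journal.log"

def journaled_write(path: str, data: str) -> None:
    # 1. Record the intended change and force it to stable storage.
    with open(JOURNAL, "a") as j:
        j.write(json.dumps({"path": path, "data": data}) + "\n")
        j.flush()
        os.fsync(j.fileno())
    # 2. Apply the change to the main data.
    with open(path, "w") as f:
        f.write(data)
    # 3. Mark the transaction complete by clearing the journal.
    open(JOURNAL, "w").close()

def recover() -> None:
    # On startup after a crash: replay anything logged but possibly not applied.
    if os.path.exists(JOURNAL):
        with open(JOURNAL) as j:
            for line in j:
                entry = json.loads(line)
                with open(entry["path"], "w") as f:
                    f.write(entry["data"])
        open(JOURNAL, "w").close()
```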

If there's a "RAM equivalent" of Journaling it's somewhere in the OS's memory management code, nothing to do with ECC.

"Slow things down" is not a good counterargument to reliability.
Performance vs. reliability vs. cost is always a trade-off. For most personal computers, the trade-off is not worth it.
Nobody here is saying that data centres with dozens of machines don't benefit from ECC. Apple haven't made data centre hardware since the XServe, and I'm not sure that ticked all the "high availability" boxes.
 
Take a step back. Journaling incurs a slight performance overhead for increased reliability. Just like ECC. My point is that if you value your data even slightly you should want both. Ideally with a filesystem that checksums data, but Apple doesn't care about our data as long as we don't...
For most personal computers, the trade-off is not worth it.
I don't agree. The performance impact is negligible. It's an existing proven technology that improves reliability and serves as an early-warning system for some kinds of hardware failure (and software attacks). If it were universally deployed the cost would come down.

As long as everybody believes Intel's marketing that it's just for servers we will all suffer from random crashes and data corruptions and never know whether RAM is to blame. Which it sometimes is, we know that. We just don't know when.
 
But they also reported that only between 1/8 and 1/2 of their servers (depending on server type) and between 3% and 21% of the memory modules saw any errors during the year.
Which is the point - the typical server sees < 1 memory error per year, which is likely the true rate for 'cosmic ray'-type events. If you're running a data centre with hundreds of servers and downtime costs $$$/hour, that can still add up to something significant. If you're running a personal computer or three, you're unlikely to be affected.

Average rates are next to useless for this sort of thing, and are pushed up by a few chips/systems that experience catastrophic failures, so it's a mystery why reports like the ones you cite keep using them. Median rates would be better - graphs better still: these things often follow a "bathtub curve" with most failures happening either due to manufacturing faults that show up shortly after installation or wear-and-tear that shows up in "later life", with a long, near-fault-free period in between where the rate is far below the average.
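A trivial illustration of the mean-vs-median point, with completely made-up numbers chosen only to show the skew (a fleet of 1,000 machines where a handful of failing modules log thousands of errors):

```python
# Made-up fleet: the vast majority of machines see zero or one error a year,
# a handful of dying modules log thousands. The mean looks alarming; the
# median - i.e. the typical machine - is zero.
import statistics

errors_per_year = [0] * 900 + [1] * 95 + [5_000, 8_000, 12_000, 20_000, 30_000]

print("mean:  ", statistics.mean(errors_per_year))    # ~75 errors per machine
print("median:", statistics.median(errors_per_year))  # 0 errors per machine
```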

In a household with multiple devices, "less than a dozen times a year" can easily become "at least once a week". If 5% of the issues are severe enough to annoy the user, that frequency is high enough to affect the perceived quality of the device.
Except household devices (which rarely use ECC RAM) simply don't fail due to memory errors anything close to "a dozen times a year" - if they did, you'd notice.
 
There are billions of personal computers and smartphones; somebody is being affected.
You really don't understand the concepts of probability and risk, do you?

Somebody is being struck by lightning. The odds are about one in a million per year, so with 8 billion people it must be a daily occurrence (on average, of course). For the average person on the street, that's simply not worth worrying about.
Now, if I owned a chain of golf courses in areas especially prone to thunderstorms, with a million people passing through every year, it's quite probable that one of my customers is going to get zapped at some stage, so I'd want to check my insurance & staff training, put up safety warnings etc. Nobody here is saying that large data centres - with so many servers and so much RAM that memory errors are an issue - don't benefit from ECC, just that it's irrelevant to most personal computer applications.
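For the record, the arithmetic behind that analogy, using the rough odds stated above (order-of-magnitude only):

```python
# Order-of-magnitude check on the lightning analogy above.
p_per_person_per_year = 1e-6      # ~1-in-a-million annual odds, as stated
population = 8e9

strikes_per_year = p_per_person_per_year * population
print(f"{strikes_per_year:.0f} strikes/year ~= {strikes_per_year / 365:.0f} per day")
# -> about 8,000 a year, i.e. roughly 20+ a day worldwide - yet still a
#    negligible risk for any individual
```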
 
You really don't understand the concepts of probability and risk, do you?
And you are ignoring that the risk doesn't only increase with numbers but also with age. Without ECC how do we even know exactly how great the risk is for consumer devices? They don't run in controlled environments like servers. We have a readily available solution, let's use it.
 
As long as everybody believes Intel's marketing that it's just for servers we will all suffer from random crashes and data corruptions and never know whether RAM is to blame.
I think you're the one actually falling for "Intel's marketing". Intel don't want you to think "ECC is only for servers". Intel want you to think that "ECC is essential for any professional work so only Xeon will do". The issue with Intel is that they only support ECC on their Xeon processors (which include both the server-class Xeons and personal workstation chips like the Xeon-W). They're using that to lock people who actually need (or think they need) ECC into their premium-priced Xeon CPUs.

Now, Linus Torvalds may well find ECC useful for testing and debugging the Linux kernel - but I'm gonna go out on a limb here and suggest that Linus Torvalds isn't really a typical personal computer user...

And, yeah, maybe more support for ECC would bring prices down a bit - but at the end of the day an ECC DDR5 DIMM needs a couple more physical RAM chips to store the check bits, and doing full-blown in-band ECC on LPDDR sacrifices a similar percentage of your total RAM & memory bandwidth (although there is apparently some sort of internal ECC built in to LPDDR). 16GB of ECC RAM is always likely to be more expensive than 16GB of non-ECC RAM (although whether Apple could just eat that as part of their ridiculous RAM upgrade prices is an interesting question).
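Roughly quantifying that overhead with the classic 64+8 SECDED scheme (the exact layout of DDR5 DIMMs and in-band LPDDR schemes varies, so treat this as a ballpark, and the 16 GB figure is just an example):

```python
# Classic SECDED ECC stores 8 check bits per 64 data bits.
data_bits, check_bits = 64, 8

dimm_overhead = check_bits / data_bits                         # extra chips on a DIMM
inband_loss_gb = 16 * check_bits / (data_bits + check_bits)    # carved out of a 16 GB part

print(f"extra storage for check bits: {dimm_overhead:.1%}")    # 12.5%
print(f"in-band ECC on a 16 GB part:  ~{inband_loss_gb:.1f} GB reserved")
```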

What I don't see from anybody arguing for ECC-for-all is any actual data on error rates - or evidence of crashes/corruption caused by memory errors on the type of on-package LPDDR5x used in Apple Silicon - which is significantly different from regular DDR5 DIMMs. Just unsupported claims that memory errors must be happening and not being recognised...
 
And you are ignoring that the risk doesn't only increase with numbers but also with age.
Yes, it's called "wearing out"... but solid state components like DRAM have long working lives & will probably outlive moving components, PSU capacitors or even NAND Flash (which inevitably dies after a finite number of writes).

Without ECC how do we even know exactly how great the risk is for consumer devices?
...because nobody is actually coming up with any solid evidence to show that this is a problem that exists. If you want to claim that all consumer devices need ECC to solve this "risk" then it's your burden of proof to show that the risk exists.

Consumer devices get killed by dust, overheating, power spikes, failed firmware updates, solder whiskers etc. and - even if they die from RAM failure - there's no guarantee that ECC will be of any help unless it's conveniently preceded by an increasing rate of errors.

Know what mission-critical data centres that actually need ECC also have? Someone on-call to come and swap out components at short notice & elaborate redundancy plans so that they can do this without interrupting service. Not much use knowing your RAM is about to die otherwise.
 