Aye, it's been a while since HBM was used in a consumer product - what, 7 years and counting? I mean, Nvidia is about to release the DGX Station with HBM, but I really doubt that one could call that consumer with a straight face:


I mean, it is technically a "desktop workstation", but I suspect you're looking at several tens of thousands of dollars, and at that point one might as well count the PCIe versions of Hopper/Blackwell as "consumer" too.

HBM3e, and yup, it's still very expensive. HBM4 is coming up, and the new, even higher-throughput interconnect is said to be even more expensive - like, a lot more.

[…] But who knows, right? Maybe Apple will not only use HBM in their data centers but also release a not-really-consumer product with it into the wild, a la the DGX Station above, as a Mac Pro for $40,000 base (I'm making the number up).
If it contains GB300 (see photo at link: two Grace, four Blackwell Ultra, and more), then it will be closer to $140,000… So a $40,000 Mac Pro with comparable (if not quite on the same level) performance might seem like a real bargain!
 
  • Haha
Reactions: crazy dave
VEGA10 (Radeon RX VEGA 56 / 64) and VEGA20 (Radeon VII) also use HBM.
Oh yeah, I knew about those, but IIRC the HBM didn’t help much with graphics performance (though it did make them compute monsters - memories from the crypto craze).

Aye, it's been a while since HBM was used in a consumer product - what, 7 years and counting? I mean, Nvidia is about to release the DGX Station with HBM, but I really doubt that one could call that consumer with a straight face:


I mean, it is technically a "desktop workstation", but I suspect you're looking at several tens of thousands of dollars, and at that point one might as well count the PCIe versions of Hopper/Blackwell as "consumer" too.


HBM3e, and yup, it's still very expensive. HBM4 is coming up, and the new, even higher-throughput interconnect is said to be even more expensive - like, a lot more.

As for HBM versus LPDDR, there was a long discussion with lots of hard numbers in the link @tenthousandthings gave in the previous posts. Short version: it depends on the application and how much RAM you need. Basically, for the same amount of RAM and a much smaller physical package (because it's stacked), you get much higher bandwidth with HBM, and on a per-bandwidth basis it's just as energy efficient. That's how Nvidia gets multi-TB/s bandwidth on 144-288GB of HBM for the professional Hopper and Blackwell GPUs. However, the cost is quite high, and for many applications it's overkill. High-bandwidth LPDDR makes the most sense for all of Apple's (consumer-facing) products. If Apple were to design a data-center-specific die with HBM and wanted to dual-use it to help defray development costs, they'd probably have to at least redesign the memory controller for the consumer product. I don't know how easy that is or how much Apple would have to change.
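
To put rough numbers on the Nvidia side of that (a back-of-envelope sketch; the per-stack figures are approximate and the stack counts are just illustrative):

```python
# Rough sketch: how stacked HBM reaches multi-TB/s figures.
# Assumptions: a 1024-bit HBM3e stack at ~8 GT/s, 24-36 GB per stack.

def stack_bandwidth_gbs(width_bits: int = 1024, rate_gts: float = 8.0) -> float:
    """Peak bandwidth of one stack in GB/s: lanes x transfers/s / 8 bits per byte."""
    return width_bits * rate_gts / 8

for stacks, gb_per_stack in [(6, 24), (8, 36)]:
    tbs = stacks * stack_bandwidth_gbs() / 1000
    print(f"{stacks} stacks: {stacks * gb_per_stack} GB of RAM at ~{tbs:.1f} TB/s")
# 6 stacks: 144 GB of RAM at ~6.1 TB/s
# 8 stacks: 288 GB of RAM at ~8.2 TB/s
```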

But who knows, right? Maybe Apple will not only use HBM in their data centers but also release a not-really-consumer product with it into the wild, a la the DGX Station above, as a Mac Pro for $40,000 base (I'm making the number up).
Thanks for the explanation. I figured there was a data center use case for HBM, since I was aware of Nvidia’s hardware, but I assumed the conversation revolved around consumer products.

And that makes sense: if Apple’s serious about using their own hardware for data center stuff, then HBM would be useful there.

Still, with that explanation, I have doubts about any of their consumer products using HBM, given that Apple is very much “prosumer” oriented.
 
  • Like
Reactions: crazy dave
I would like to point out that HBM or not HBM is a red herring. What's important is having a memory solution that can support your task. It pretty much boils down to two things (simplified): performance and cost.

To recap: HBM is essentially "parallel DRAM" - you stack multiple RAM chips and give them a very wide interface. Typically we see 1024 data signaling lanes per module. While the signal rate is comparable to other modern DDR solutions, the sheer number of signaling lanes means very high bandwidth. In terms of theoretical peak performance, one stack is the same as running 16 traditional 64-bit DDR modules in parallel. The same reasoning applies to LPDDR5.
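
A quick sanity check of that 16x equivalence, using nominal per-pin rates (real-world numbers will differ a bit):

```python
# Peak theoretical bandwidth = bus width (bits) x transfer rate (GT/s) / 8.
def bandwidth_gbs(width_bits: int, rate_gts: float) -> float:
    return width_bits * rate_gts / 8

hbm3_stack   = bandwidth_gbs(1024, 6.4)  # one 1024-bit HBM3 stack: 819.2 GB/s
ddr5_channel = bandwidth_gbs(64, 6.4)    # one 64-bit DDR5-6400 channel: 51.2 GB/s

print(hbm3_stack / ddr5_channel)  # 16.0 -> one stack = sixteen 64-bit channels
```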

And this is where it gets interesting. Implementers avoid super-wide RAM in practice because having that many signaling lanes on the board is tricky. First, they occupy space, and there are only so many signal traces you can put on a mainboard. These traces also need to be fairly long, which brings problems with power consumption and signal quality. Overall, this tends to be a logistical nightmare. There are server mainboards offering up to 12-channel DDR5 (a 768-bit interface), and they are very large, power-hungry, and do not support the fastest RAM standards. From what I remember, the latest EPYC boards top out at 500-600GB/s - hardly impressive.
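
That ballpark is easy to verify (assuming DDR5-4800 to DDR5-6000, roughly what current server platforms run):

```python
# 12-channel DDR5 = 12 x 64 bits = a 768-bit interface.
for mts in (4800, 6000):
    print(f"DDR5-{mts}: {768 * mts / 8 / 1000:.0f} GB/s")
# DDR5-4800: 461 GB/s
# DDR5-6000: 576 GB/s
```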

HBM avoids these problems by using an interposer to connect all the data and signaling links. One can use advanced packaging methods like through-silicon vias here, which have a considerably smaller footprint and result in lower power consumption. The RAM and the processing unit also sit closer together. This ends up being considerably more efficient if you want to maximize bandwidth without blowing your power and space budget. The drawback is the price - this kind of complex packaging is expensive, and HBM modules are not as readily available. So there is a good reason why, in practice, HBM is limited to high-end data center applications.

There is yet another practical solution for high-bandwidth applications - GDDR. Here one focuses on transmitting as much data per pin as possible, pushing the signaling infrastructure to its limits and using advanced signal encoding. The result is fast, but also very power-hungry, which again makes it less suitable for scalable, reliable solutions.
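
To illustrate the per-pin angle (rates are approximate and depend on the exact part; the GDDR7 number is from the announced spec):

```python
# Approximate per-pin data rates in Gbps. GDDR pushes each pin much harder,
# while HBM wins on sheer lane count rather than per-pin speed.
per_pin_gbps = {
    "DDR5": 6.4,
    "HBM3": 6.4,
    "GDDR6X": 21.0,  # PAM4 signal encoding
    "GDDR7": 32.0,   # PAM3 signal encoding
}

# E.g. a 384-bit GDDR6X card: ~1 TB/s, but with very power-hungry memory I/O.
print(384 * per_pin_gbps["GDDR6X"] / 8)  # 1008.0 GB/s
```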

So that means HBM is the way to go, right? Well, not quite. Recall that the core problem is logistics - signaling logistics, to be precise. Traditional RAM fails here because traditional signal traces do not scale. HBM wins because it uses advanced packaging for the wiring. Which is precisely how Apple has tackled the issue: they use "traditional" (albeit heavily customized) RAM modules and an HBM-like wiring solution to connect them to the SoC. This is obviously more expensive than the usual way, but much cheaper than actual HBM, and it can be used at much larger scale because you are not limited by supply-constrained HBM. Power consumption is another big advantage - it's fairly unique to use a 512-bit RAM interface as your main memory while still delivering 10+ hours of battery life.
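
The same arithmetic applied to Apple's approach (assuming a 512-bit bus at LPDDR5X-8533 rates, in line with published Max-class figures):

```python
# A 512-bit on-package LPDDR5X interface at 8533 MT/s:
print(512 * 8533 / 8 / 1000)  # ~546 GB/s of "traditional" RAM bandwidth
```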

The point I am trying to make here is that HBM and Apple's RAM packaging leverage the same solution to the same fundamental problem. Hence "poor man's HBM". This does not mean that Apple does not need HBM or won't be using HBM in some of their products. It is obvious that they need more RAM bandwidth to support larger models, so if they want to do that, they will need to scale. Can their custom solution support it? Absolutely. Will it be economically feasible? Who knows. There are different things to explore here - I remember seeing a patent that mentioned placing RAM modules on both sides of the package, for example, which would double bandwidth and capacity.
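
If they did that at today's rates (pure speculation on my part, just extrapolating the patent idea):

```python
# Doubling a 512-bit LPDDR5X-8533 interface by populating both package sides:
print(2 * 512 * 8533 / 8 / 1000)  # ~1.09 TB/s - into single-stack-HBM territory
```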
 
Last edited:
I would like to point out that HBM or not HBM is a red herring. What's important is having a memory solution that can support your task. […]
I wonder if any lessons can be applied from the R1 memory? It’s another kind of “near memory”, like HBM or GDDR. Very low latency, and it utilizes many more pins than DRAM, but perhaps too low capacity to replace it?
 
I wonder if any lessons can be applied from the R1 memory? It’s another kind of “near memory”, like HBM or GDDR. Very low latency, and it utilizes many more pins than DRAM, but perhaps too low capacity to replace it?

IIRC there was a rumour that they’d be starting a transition to LLW RAM, like on the R1, for their other devices in 2026/2027 (can’t remember the exact year).
 
  • Like
Reactions: OptimusGrime
I wonder if any lessons can be applied from the R1 memory? It’s another kind of “near memory”, like HBM or GDDR. Very low latency, and it utilizes many more pins than DRAM, but perhaps too low capacity to replace it?

There are not many details about the R1; however, looking at the available information, it seems to follow the same idea as HBM or Apple's on-package LPDDR. The main difference is likely even tighter integration, for better performance and lower power consumption.
 
There are not many details about the R1; however, looking at the available information, it seems to follow the same idea as HBM or Apple's on-package LPDDR. The main difference is likely even tighter integration, for better performance and lower power consumption.
So would you say it would be a good fit for Apple’s ambitions?
 
So would you say it would be a good fit for Apple’s ambitions?

I don’t have a judgement on this. Not much is known about this tech, its costs, or its limitations. What I would mention is that chiplet-style RAM is hardly an economical solution for main memory. It could work as a large cache, though.
 
I don’t have a judgement on this. Not much is known about this tech, its costs, or its limitations. What I would mention is that chiplet-style RAM is hardly an economical solution for main memory. It could work as a large cache, though.
That makes sense. I’ve read that it is used as pseudo-SRAM on the Vision Pro.
 
Last edited:
According to CTEE, first AMD and then Apple will launch products using TSMC's SoIC this year.
According to the supply chain, AMD is the first company to introduce SoIC, and Apple will follow suit in the second half of this year by introducing SoIC advanced packaging for M5 chips.
https://www.ctee.com.tw/news/20250325700068-430501 [Traditional Chinese]
 
I wonder if TSMC helped information about the 2nm HVM get out to counter negative messaging about onshore (Taiwanese) yield and dampen misinformation about expected costs when the fabs make their way to the US (which should only add 10% to each wafer).

Yesterday's news didn't have the AMD scoop - but it did hint at OpenAI's interest in TSMC 2nm (...wondering if SoftBank is now involved).
 
The best description of the priorities of the Apple Silicon team is found in a February 2023 interview with Anand Shimpi. It is well worth reading that transcript if you are truly interested in understanding what Apple is doing.

"But really the thing that we see, that the iPhone and the iPad have enjoyed over the years, is this idea that every generation gets the latest of our IPs, the latest CPU IP, the latest GPU, media engine, neural engine, and so on and so forth, and so now the Mac gets to be on that cadence too."

Yet segmentation seems to be less of a perceptual reality and more of a conscious strategy for AAPL these days…
 
Speaking of codenames, am I right in thinking we still haven’t seen “Hidra”? The rumor was that it was for the Mac Pro. I believe it was generally thought to be the M4 Ultra, but that seems unlikely now, so what is it? M5 Pro/Max/Ultra?

I guess it makes sense that SoIC development could have a different timeline, with work on the chiplets beginning earlier, fundamentally changing how the A-series and the M-series relate to one another.
 
Speaking of codenames, am I right in thinking we still haven’t seen “Hidra”? The rumor was that it was for the Mac Pro. I believe it was generally thought to be the M4 Ultra, but that seems unlikely now, so what is it? M5 Pro/Max/Ultra?

I guess it makes sense that SoIC development could have a different timeline, with work on the chiplets beginning earlier, fundamentally changing how the A-series and the M-series relate to one another.
Yeah, my bet is that it’s the M5 Ultra.
 
Speaking of codenames, am I right in thinking we still haven’t seen “Hidra”? The rumor was that it was for the Mac Pro. I believe it was generally thought to be the M4 Ultra, but that seems unlikely now, so what is it? M5 Pro/Max/Ultra?

I guess it makes sense that SoIC development could have a different timeline, with work on the chiplets beginning earlier, fundamentally changing how the A-series and the M-series relate to one another.
Yeah, my bet is that it’s the M5 Ultra.

Nah, it's the M5 Extreme...! ;^p
 
Maybe it’s the chiplet rumor overall. One chiplet, one head. It might refer to the whole M5 series.
 
Huh! On the recent codenames leak, which includes mysterious future silicon codenamed “Sotra,” here’s something:

Hidra is a Norwegian island. So is Sotra

Hidra = M5 Ultra
Sotra = M7 Ultra

Komodo and Borneo are also islands. Also Baltra. So are Donan and Brava.

I know it doesn’t mean anything, that’s the whole point of a codename, but I wasn’t aware they were all islands…
 
Last edited:
Huh! On the recent codenames leak, which includes mysterious future silicon codenamed “Sotra,” here’s something:

Hidra is a Norwegian island. So is Sotra

Hidra = M5 Ultra
Sotra = M7 Ultra

Komodo and Borneo are also islands. So are Donan and Brava.

I know it doesn’t mean anything, that’s the whole point of a codename, but I wasn’t aware they were all islands…
Nice find. Can’t believe I/many of us went all in on the Hydra thing. We’ve played too much Heroes of Might and Magic in our days.

That pretty much throws an ’Extreme’ option out the window. It’s just new island-themed codename branding, possibly referenced because of the upcoming chiplets. Neither Brava nor Donan was really an island, though.

Edit: I see Donan and Brava were islands, too. In that case I don’t think it means anything at all.
 
Yes, agree, just a fun fact. I deleted a clause (preserved in the @Populus reply above) after “something” as soon as I realized they are all island names.
Yeah, even Hidra is an island. Maybe Hydra? Or that’s just in the MCU (Marvel).

I’m afraid we’re all eager to know more about the upcoming M5 family of SoCs because of how close it is - just a few months away from being revealed. Hopefully more details will leak soon…
 
But since we had Mac identifiers 17,1 and 17,2 leak, at least we know Hidra is an M5. And it would probably not be a MBP, which comes in 3+ identifiers. And it won’t be a Mac Studio or MBA, which just got upgraded.

That leaves two configs of the Mac Pro, or the Mac mini, or a regular iMac plus an iMac Pro comeback?
 