I don't think that is going to be enough bandwidth. For that kind of chip, you really want 200+ GB/s... so something "HBM-like". Frankly, I'm starting to think that Apple will do a DIY HBM with stacked LPDDR chips and a very wide memory bus, with 8 memory controllers or more. They already stack RAM on top of the iPhone chip, so I don't see why it wouldn't be possible.
It would definitely be bandwidth constrained, but I do think 4x LPDDR5-6400 modules could reach 200+ GB/s (barely). I added 50% to the speed of the two LPDDR4X modules in the Mac mini to account for the higher speed rating of LPDDR5 and then doubled it to account for the extra channels. Pretty lazy math and I could be wrong.
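To sanity-check that lazy math, here's the standard peak-bandwidth formula (the 64-bit-per-module channel width is my assumption; the LPDDR4X figures match the current Mac mini):

```python
def peak_bandwidth_gbs(transfer_mts, bus_width_bits):
    """Peak theoretical bandwidth in GB/s: transfers/s x bytes per transfer."""
    return transfer_mts * (bus_width_bits / 8) / 1000

# Mac mini today: 2x LPDDR4X-4266 on a 128-bit bus
print(peak_bandwidth_gbs(4266, 128))   # ~68.3 GB/s

# Hypothetical: 4x LPDDR5-6400 on a 256-bit bus
print(peak_bandwidth_gbs(6400, 256))   # 204.8 GB/s
```

So doubling the channels and moving to LPDDR5-6400 does land just over 200 GB/s, as claimed.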
I don't think it's unprecedented. There are laptops shipping with large, hot GPUs. A mobile RTX 2080 is 545 mm² with a TDP of 80 W in the Max-Q configuration. An Apple SoC with a 12+4 CPU and a 32-core GPU will probably have a combined TDP of around 60-70 W. That shouldn't be much of a challenge to cool in the current 16" chassis.
This is a really useful comparison. I see where your head is at, but I think I can show you what I mean by unprecedented. Let's say, for the sake of argument, the M99 has a 65 W TDP.
1. 65 W over 3 square centimeters is actually much more heat dense than 80 W over 5.5 square centimeters: 21.7 W/cm² vs 14.5 W/cm².
2. Most of these machines using the RTX 2080 are four to five centimeters thick. That provides for a lot more cooling than the roughly three-centimeter-thick MBP16.
3. GPUs have a built-in advantage: because they work in parallel, they tend to distribute heat evenly.
4. Lastly, that 80W TDP number for the 2080 Max Q includes VRAM. We're already looking at a more heat-dense part with less space for cooling, and we haven't even added high bandwidth memory yet.
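The heat-density comparison in point 1 is easy to check; note the 3 cm² die area and 65 W TDP for the Apple part are my guesses, not known figures:

```python
def heat_density(tdp_watts, area_cm2):
    """Power density in W/cm^2 averaged over the die area."""
    return tdp_watts / area_cm2

apple_soc = heat_density(65, 3.0)   # hypothetical big Apple SoC
rtx_2080m = heat_density(80, 5.5)   # mobile RTX 2080, ~545 mm^2 die
print(round(apple_soc, 1), round(rtx_2080m, 1))  # 21.7 14.5
```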
I am asking around about this btw, so I'll let you know if someone more knowledgeable than me weighs in. All I can really say right now is that I don't think anyone has put a part this heat-dense in an enclosure this small.
For high-end configs (like the Mac Pro), I don't really see them doing monolithic chips — yields will probably be abysmal. But a multi-chip package, with CPU+GPU dies connected to a shared I/O+cache die (possibly stacked with RAM) — that should be doable. AMD does it with Zen3 and it seems to work just fine.
But then again, it's Apple we are talking about. I can totally see them using a 1000 mm² monolithic die that costs $1000 to make just to prove a point. Still cheaper than the Xeons and the GPUs they have to buy from a third party.
I guess I can see it. Some kind of weird memory and I/O die designed to communicate with a separate GPU and CPU that have their own memory controllers. I don't know enough about memory to really put it in my mind's eye, though. Wouldn't separate GPU and CPU controllers be competing for access to memory channels?
No, that’s 12 High Performance cores and 4 High Efficiency cores.
But let us be honest with ourselves.
I think the 16” MacBook Pro will get at maximum 8 High Performance cores, 4 High Efficiency cores and up to 16-core GPU (12- to 16-cores depending on configuration).
That's certainly where my head was at before the Bloomberg article. But I don't mind being more optimistic if that's what the evidence suggests. The 8+16 design idea was always rooted in the original Bloomberg APU leak - it just sort of takes that, assumes the die will be 150 mm² or less, and assumes Apple really likes multiples of four. Reasonable assumptions. But it also had the major failing that it would fall just short of the Radeon Pro 5600M.
The Bloomberg article presents a different, but fully coherent strategy: design a big APU, let the yield fall where it may, and bin it aggressively to cover the MBP14, MBP16, and the iMacs.
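For a feel of why die size pushes you toward that bin-everything strategy, here's the classic Poisson yield model (the defect density and die areas below are illustrative assumptions, not real foundry figures):

```python
import math

def poisson_yield(area_mm2, defects_per_cm2=0.1):
    """Fraction of fully defect-free dies under a simple Poisson defect model."""
    area_cm2 = area_mm2 / 100
    return math.exp(-defects_per_cm2 * area_cm2)

# A ~150 mm^2 APU vs a ~1000 mm^2 monolithic monster
print(round(poisson_yield(150), 2))    # ~0.86
print(round(poisson_yield(1000), 2))   # ~0.37
```

Aggressive binning recovers a lot of the "bad" dies - a die with one defective GPU core gets a core disabled and sold as a lower config - which is exactly why the let-yield-fall-where-it-may approach can still make economic sense.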