Apple Mac Pro, how will Apple manage that huge package without using chiplets?

Ulfric · Dec 9, 2020

For anyone who hasn't read the recent rumor from Bloomberg: Apple is preparing new chips for its pro lineups.

Later in 2021, for higher-end desktop

32+? core CPU
64, 128 core GPU, ‘several times faster than the current graphics modules Apple uses from Nvidia and AMD in its Intel-powered hardware’

Now there are several questions that need to be answered.

Will it be an SoC? If yes then it will be huge & its thermal envelope will be huge as well. Will it be feasible? We don't know the yields of TSMC's 5nm. Now Apple may bear the cost but eventually they would pass the cost onto customers. So, in the end it will just increase the price of future Mac Pro. Monolithic die will just increase the cost.

Another thing that needs to be addressed is whether the RAM & GPU will be included in the package or not. Many of the ASIC functions in the M1 SoC could be transferred to the GPU (things like exporting, encoding, decoding etc.), but we don't know if Apple would do that or not.

ArPe · Dec 9, 2020

The logic board of a Mac Pro is big enough for a large SoC but they will also likely have a proprietary PCIE slot that interfaces directly with UMA for a larger GPU.

leman · Dec 9, 2020

Nvidia ships chips as large as 628mm2 with TDPs close to 300 watts. Intel Xeon dies are up to 694mm2 with TDPs of 200 watts, and they run at over 300 watts, sustained, in the Mac Pro.

Using a multi-chip package (chiplet route) is probably more economical, but I don't think it's out of the question that Apple will choose a monolithic die route. Looking at the M1 die shot, they probably should be able to fit a 32-core CPU and a 128-core GPU into a single 700-800mm2 die, and while these are likely to be crazy expensive due to yield issues, it's not like Apple has to sell them. Even if they end up with a cost of $2000 per chip, it is still going to be significantly cheaper for them than buying a state of the art Xeon or Threadripper and a high-end GPU.

Besides, a large chip can still be binned. Bloomberg reports that future Apple configurations will include 64-core and 128-core GPUs. While we should take these reports with a large grain of salt, I can imagine that the practical yields can be improved dramatically if one is prepared to fuse off up to 64 defective GPU cores on the die...

Ulfric said:
Another thing that needs to be addressed is whether the RAM & GPU will be included in the package or not. Many of the ASIC functions in the M1 SoC could be transferred to the GPU (things like exporting, encoding, decoding etc.), but we don't know if Apple would do that or not.

RAM and GPU are most certainly going to be included on the package. The big question is whether the GPU is going to be included on-chip

leman · Dec 9, 2020

ArPe said:
The logic board of a Mac Pro is big enough for a large SoC but they will also likely have a proprietary PCIE slot that interfaces directly with UMA for a larger GPU.

My bet goes for potentially multiple CPU+GPU+RAM boards with some sort of fast cross-board interconnect for a NUMA-like setup. There is preliminary support for these configurations in Metal.

Ulfric · Dec 9, 2020

leman said:
RAM and GPU are most certainly going to be included on the package. The big question is whether the GPU is going to be included on-chip

What I meant by package is SiP (system in Package). If RAM will be on same die & non upgradable then it makes the entire pro moniker useless.

Frantisekj · Dec 9, 2020

leman said:
Nvidia ships chips as large as 628mm2 with TDPs close to 300 watts. Intel Xeon dies are up to 694mm2 with TDPs of 200 watts, and they run at over 300 watts, sustained, in the Mac Pro.

Using a multi-chip package (chiplet route) is probably more economical, but I don't think it's out of the question that Apple will choose a monolithic die route. Looking at the M1 die shot, they probably should be able to fit a 32-core CPU and a 128-core GPU into a single 700-800mm2 die, and while these are likely to be crazy expensive due to yield issues, it's not like Apple has to sell them. Even if they end up with a cost of $2000 per chip, it is still going to be significantly cheaper for them than buying a state of the art Xeon or Threadripper and a high-end GPU.

Besides, a large chip can still be binned. Bloomberg reports that future Apple configurations will include 64-core and 128-core GPUs. While we should take these reports with a large grain of salt, I can imagine that the practical yields can be improved dramatically if one is prepared to fuse off up to 64 defective GPU cores on the die...

Ulfric said:

Another thing that needs to be addressed is whether the RAM & GPU will be included in the package or not. Many of the ASIC functions in the M1 SoC could be transferred to the GPU (things like exporting, encoding, decoding etc.), but we don't know if Apple would do that or not.

RAM and GPU are most certainly going to be included on the package. The big question is whether the GPU is going to be included on-chip

Is it possible that Apple will mix a chip with onboard memory plus optional RAM ports, to ramp capacity up to an amount similar to what the current Mac Pro can take?

leman · Dec 9, 2020

Ulfric said:
What I meant by package is SiP (system in Package). If RAM will be on same die & non upgradable then it makes the entire pro moniker useless.

Not necessarily. If the non-upgradeable RAM is several times faster than any slotted memory would be, what would you choose?

Frantisekj said:
Is it possible that Apple will mix a chip with onboard memory plus optional RAM ports, to ramp capacity up to an amount similar to what the current Mac Pro can take?

That could be a compromise between performance, price and upgradeability I suppose.

Ulfric · Dec 9, 2020

leman said:
Not necessarily. If the non-upgradeable RAM is several times faster than any slotted memory would be, what would you choose?

In some of the workloads capacity does matter.

leman · Dec 9, 2020

Ulfric said:
In some of the workloads capacity does matter.

On-package memory does not necessarily mean sacrificing capacity, only upgradeability. Apple could be using something like stacked DDR5 with a lot of memory channels (12 or even 16). That would give you bandwidth of up to 800GB/s and capacities of multiple TBs.
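A back-of-the-envelope check of that bandwidth figure, as a sketch only: the 64-bit channel width and the DDR5-6400 transfer rate below are assumptions picked for illustration, not something stated in the post. Theoretical peak bandwidth is channels times bytes per transfer times transfers per second.

```python
def peak_bandwidth_gbs(channels: int, bus_width_bits: int, mega_transfers_per_s: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes gives MB/s; divide by 1000 to get GB/s.
    return channels * bytes_per_transfer * mega_transfers_per_s / 1000


# Assumed: 64-bit channels running at 6400 MT/s (DDR5-6400).
for channels in (12, 16):
    print(f"{channels} channels: {peak_bandwidth_gbs(channels, 64, 6400):.1f} GB/s")
```

Under these assumptions 16 channels come out a little above 800 GB/s and 12 channels a little above 600 GB/s. Sustained real-world bandwidth would be lower than this theoretical peak.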

filterdecay · Dec 9, 2020

Ulfric said:
What I meant by package is SiP (system in Package). If RAM will be on same die & non upgradable then it makes the entire pro moniker useless.

pro just means getting work done. If the new macpro has better efficiency with ram then I could get by with maybe only 32gb? Who knows. I'm hoping a return to the $2500 base spec mac pro.

guzhogi · Dec 9, 2020

Ulfric said:
‘several times faster than the current graphics modules Apple uses from Nvidia and AMD in its Intel-powered hardware’

Considering how Apple used low- to middle-end graphics, that's not saying much. I'd like to see how these supposed GPUs compare to top of the line 3rd party GPUs like the GeForce RTX 3090 (or whatever is current at the time Apple ships this).

ArPe said:
The logic board of a Mac Pro is big enough for a large SoC but they will also likely have a proprietary PCIE slot that interfaces directly with UMA for a larger GPU.

I also wonder if Apple will make some sort of proprietary connection like Nvidia's NVLink? I'd still like to see some kind of expandability/upgradeability, though. However, Apple seems to like to do proprietary things.

JohnnyGo · Dec 9, 2020

My guess is that there are three routes Apple could choose:
1) SoC with CPU+GPU and RAM on package (like M1) plus external slots for upgrades of both RAM and GPU
2) SoC with CPU and RAM on package PLUS A SEPARATE GPU chip plus external slots for upgrades of both RAM and GPU
3) SoC with CPU only plus external slots for base+ upgrades of both RAM and GPU

IMHO Apple has to have some form of interconnect/slots for future user upgrades for both RAM and GPU if we’re talking Mac Pro.

Ethosik · Dec 9, 2020

filterdecay said:
pro just means getting work done. If the new macpro has better efficiency with ram then I could get by with maybe only 32gb? Who knows. I'm hoping a return to the $2500 base spec mac pro.

Yes...people confuse Pro with Enthusiast on this site all the time and have for years and years. As a "pro" I would like to not have to deal with opening the computer case. As an "Enthusiast" I do like opening my computer case and upgrading components (my gaming PC for example). I have two computers for two separate needs. When I want to get work done, I do not want to deal with opening my case up. I get a new machine when my needs change and I make up the price of the machine when I finish a few jobs for people.

JeepGuy · Dec 9, 2020

filterdecay said:
pro just means getting work done. If the new macpro has better efficiency with ram then I could get by with maybe only 32gb? Who knows. I'm hoping a return to the $2500 base spec mac pro.

I think those days are gone, our only hope is a mini pro.

deconstruct60 · Dec 9, 2020

Ulfric said:
For anyone who hasn't read the recent rumor from Bloomberg: Apple is preparing new chips for its pro lineups.

Later in 2021, for higher-end desktop

32+? core CPU

64, 128 core GPU, ‘several times faster than the current graphics modules Apple uses from Nvidia and AMD in its Intel-powered hardware’

Now there are several questions that need to be answered.

Will it be an SoC?

Probably. However, I suspect that we will not see the combination of "max" big CPU cores paired up with "max" GPU core count at all.

128 GPU cores is more of a MBP 16" or iMac 27" thing. Unless Apple has chased away all of the 3rd party GPU options, a count that large won't come integrated into the Mac Pro.

The current M1 is about 120mm^2. If about 45% of that is the 4 big cores, System Level Cache (SLC), 8 GPU cores and RAM controllers, then that subset is about 54mm^2.

5 * 54mm^2 is 270mm^2. 270 + 120 = 390mm^2. That would be tractable in 12 months or so at 5nm.

6 * 54mm^2 is 324mm^2. 324 + 120 = 444mm^2. I suspect that is probably more in the zone where Apple would 'quit' with a monolithic die.
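The area estimate above can be reproduced in a few lines. This is only a sketch of the post's own arithmetic: the 120mm^2 die size and the 45% share are the post's rough guesses, and the function and constant names are made up for illustration.

```python
M1_AREA_MM2 = 120.0     # approximate M1 die area, as guessed in the post
CHUNK_FRACTION = 0.45   # share taken by 4 big cores + SLC + 8 GPU cores + RAM controllers


def die_area_mm2(extra_chunks: int) -> float:
    """Baseline M1 plus a number of replicated compute 'chunks'."""
    chunk_area = M1_AREA_MM2 * CHUNK_FRACTION   # about 54 mm^2
    return M1_AREA_MM2 + extra_chunks * chunk_area


for n in (5, 6):
    print(f"{n} extra chunks: about {die_area_mm2(n):.0f} mm^2")
```

Five extra chunks give about 390mm^2 and six give about 444mm^2, matching the figures in the post. The model ignores any extra interconnect or I/O area that a scaled-up design would need.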

If they start to punt on more GPU cores once they get to the 32 GPU core mark, then they can double the number of big CPU cores in that 54mm^2 block. For example:

chunk1 : 4 big , 8 GPU , more SLC , another set of RAM controllers.
chunk2 : 4 big , 8 GPU , more SLC , another set of RAM controllers.
chunk3 : 4 big , 8 GPU , more SLC , another set of RAM controllers.
chunk4 : 8 big , more SLC , another set of RAM controllers.
chunk5 : 8 big , more SLC , another set of RAM controllers

Coupled to the baseline's 4 big, 8 GPU and the rest, that nets to 32 big, 32 GPU, 6 SLC blocks and 6 memory controllers.

For the > 32 GPU SoC I think Apple will make the opposite trade off: put a cap on big CPU cores and soak up that CPU core allocation by doubling up on GPU core blocks. So if the big CPU core count caps at 12, then they could get the following:

chunk1 : 4 big , 8 GPU , more SLC , another set of RAM controllers
chunk2 : 4 big , 8 GPU , more SLC , another set of RAM controllers
chunk3 : 16 GPU , more SLC , another set of RAM controllers
chunk4 : 16 GPU , more SLC , another set of RAM controllers
chunk5 : 8 GPU

Coupled to the baseline's 4 big, 8 GPU and the rest, that nets to 12 big (+ 4 small), 64 GPU and 5 memory controllers.
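The two chunk layouts above can be tallied mechanically. A small sketch, with the chunk contents copied from the lists in the post and all names (`BASELINE`, `CPU_HEAVY`, `GPU_HEAVY`, `total`) invented for illustration:

```python
from collections import Counter

# The M1-like baseline every layout starts from (big cores, GPU cores, SLC block, RAM controllers).
BASELINE = Counter(big=4, gpu=8, slc=1, mem_ctrl=1)

# First layout: three mixed chunks, then two CPU-only chunks.
CPU_HEAVY = [Counter(big=4, gpu=8, slc=1, mem_ctrl=1)] * 3 + [Counter(big=8, slc=1, mem_ctrl=1)] * 2

# Second layout: two mixed chunks, two GPU-only chunks, one bare 8-GPU chunk.
GPU_HEAVY = (
    [Counter(big=4, gpu=8, slc=1, mem_ctrl=1)] * 2
    + [Counter(gpu=16, slc=1, mem_ctrl=1)] * 2
    + [Counter(gpu=8)]
)


def total(chunks):
    """Sum the baseline and all extra chunks."""
    result = Counter(BASELINE)
    for chunk in chunks:
        result.update(chunk)
    return result


print("CPU-heavy:", dict(total(CPU_HEAVY)))
print("GPU-heavy:", dict(total(GPU_HEAVY)))
```

The first layout sums to 32 big cores and 32 GPU cores with 6 memory controllers, the second to 12 big cores and 64 GPU cores with 5 memory controllers, which agrees with the totals quoted in the post.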

For the 128 GPU cores... I doubt that would come in a SoC package. At least not at 5nm.

32 CPU and 128 GPU cores in one large "mega" package would probably force them into chiplets. But I also don't see the point of that unless they are totally wrapped up in a future dogma of Apple-only GPUs that are only iGPUs. Apple's lack of SMT and extra super wide instruction dispatch is highly leveraged on bigger and deeper caches. If they go to chiplets and one unified System Level Cache, that will introduce substantive latency. That in turn will put a drag on the performance that the more unified, monolithic implementations achieve. I suspect Apple is going to try to avoid that. Even with a monolithic die, as the SLC gets much bigger, keeping uniform, extra low latency access is going to be tricky (more snooping, farther distances, more chatter/traffic, etc.).

Additionally, for 32-128 GPU cores, at some point it is probably not going to make sense to couple them to LPDDR4 (or LPDDR5). LPDDR5 is better than previous generations of GDDR implementations, but better than HBM and GDDR6X (or newer)? Not really. Unless they keep those GPU cores at relatively low clocks, the bandwidth contention is going to get quite high once multiple concurrent workloads are put on different subsections.

Apple is out to buy nobody's iGPUs for the Macs that ship with only an iGPU.
Apple is probably out to cut the number of dGPUs they buy way down (minimally remove them from the MBP 16" and iMac 21-24", and possibly also from the iMac 27"). That actually gets them into the slippery slope zone with 3rd parties. The sales of iMac Pro and Mac Pro class systems are relatively small (probably in the sub-100K per year run rate zone). If they bring back eGPUs and add those sales, is there something big enough to keep a 3rd party interested in doing driver and software support work or not? Weave in some BTO iMac 27" and then it is probably not an issue (not a super happy but decent enough market for AMD to put the time in).

If Apple is killing off 3rd party GPU kernel access and 3rd party options, then the 64 and 128 core parts may be a dGPU deviation off the more tightly coupled Unified Memory iGPU driver model. Once operating as a dGPU it doesn't have to be a "chiplet". They can dump all the SoC baseline logic/fixed function cores and CPU cores. There is a pretty good chance of Apple both making a grab and also having to backfill where vendors all just 'quit' (Apple has to fill the role because they have painted themselves into another corner).

Ulfric said:
If yes then it will be huge & its thermal envelope will be huge as well. Will it be feasible?

If they keep the LPDDR4 (LPDDR5) memory, and if 6 chunks were all 20W each, then that is in the 120W range. That too is tractable. Core counts are probably not as much of an issue as increasing the extremely high I/O bandwidth out (shifting to PCI-e v4 or v5). There are some substantive things that Apple punted on to get to lower power (which makes sense for an iPad Pro and not so much for a higher end desktop system).

Ulfric said:
We don't know the yields of TSMC's 5nm. Now Apple may bear the cost but eventually they would pass the cost onto customers. So, in the end it will just increase the price of future Mac Pro. Monolithic die will just increase the cost.

There is an opportunity cost for Apple. Pragmatically there is an upper limit to how many 5nm wafers they can get. If they can get 4-5x as many iPhone/iPad SoCs out of a wafer and sell 4x as many devices then that might get the allocation.
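To put rough numbers on the "4-5x as many SoCs per wafer" point, here is a sketch using a commonly used gross dies-per-wafer approximation (wafer area divided by die area, minus an edge-loss term). The 88mm^2 phone SoC size is an assumption for illustration, the 390mm^2 desktop die is the estimate from earlier in this post, and defects, scribe lines and reticle limits are ignored.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Approximate count of whole dies on a round wafer, before any yield loss."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    # Edge-loss correction: partial dies around the circumference are unusable.
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


phone_soc = gross_dies_per_wafer(88.0)     # assumed iPhone/iPad-class die size
desktop_soc = gross_dies_per_wafer(390.0)  # the 390 mm^2 estimate from this post

print(f"phone-class dies per wafer:   {phone_soc}")
print(f"desktop-class dies per wafer: {desktop_soc}")
print(f"ratio: {phone_soc / desktop_soc:.1f}x")
```

With these assumed sizes the ratio comes out at roughly five to one before yield is considered, and yield losses would widen the gap further in favor of the smaller die.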

Ulfric said:
Another thing that needs to be addressed is whether the RAM & GPU will be included in the package or not. Many of the ASIC functions in the M1 SoC could be transferred to the GPU (things like exporting, encoding, decoding etc.), but we don't know if Apple would do that or not.

As the package gets bigger mounting the RAM gets more tractable. ( at least for GPU sizes. )

RAM on the package does become an issue if Apple is going to try to track the max RAM capacities of the higher end desktops. Getting to triple digit GB RAM on package is going to be problematical even for DDR5. Quad digit ( > 1TB) even more so. But Apple may just choose to backslide there. ( let some customers 'go' to cover the average user capacity at even higher margins. )

deconstruct60 · Dec 9, 2020

leman said:
On-package memory does not necessarily mean sacrificing capacity, only upgradeability. Apple could be using something like stacked DDR5 with a lot of memory channels (12 or even 16). That would give you bandwidth of up to 800GB/s and capacities of multiple TBs.

Multiple TBs???? Multiple GBs. I somewhat doubt Apple will even have memory controllers (and TLBs) that can address a TB (let alone multiples).

And the pin-out and trace fan-out of 12-16 channels looks like what, 2-D floorplan-wise (other than relatively quite large)?

deconstruct60 · Dec 9, 2020

leman said:
Nvidia ships chips as large as 628mm2 with TDPs close to 300 watts. Intel Xeon dies are up to 694mm2 with TDPs of 200 watts, and they run at over 300 watts, sustained, in the Mac Pro.

Using a multi-chip package (chiplet route) is probably more economical, but I don't think it's out of the question that Apple will choose a monolithic die route. Looking at the M1 die shot, they probably should be able to fit a 32-core CPU and a 128-core GPU into a single 700-800mm2 die, and while these are likely to be crazy expensive due to yield issues, it's not like Apple has to sell them. Even if they end up with a cost of $2000 per chip, it is still going to be significantly cheaper for them than buying a state of the art Xeon or Threadripper and a high-end GPU.

There is little good reason to go super duper package on the GPU. At some point multiple GPUs are better than just one dogmatically coupled to the CPU cores. Too tight of coupling thermally was the corner they painted themselves into with the Mac Pro 2013. Really isn't a good reason to revisit that with a Mac Pro SoC later in time.

Apple needs just a "good enough" iGPU in the Mac Pro SoC. That way all Macs have an iGPU. If Apple wants to chase the very top end of dGPU performance, then a decoupled GPU would be a much saner way to go. A slightly different driver model with much bigger NUMA latencies between the shared Memory spaces and more targeted and localized caches for data that isn't actively shared as much.

Besides, the memory workload patterns will diverge more as the GPU and CPU core counts diverge. 20-30 of one and 64-100 of another isn't going to be uniform. The differences in patterns are going to become far more pronounced if both are loaded up with high workloads. The same memory controller trying to do two different jobs likely will run into issues.

leman said:
Besides, a large chip can still be binned. Bloomberg reports that future Apple configurations will include 64-core and 128-core GPUs. While we should take these reports with a large grain of salt, I can imagine that the practical yields can be improved dramatically if one is prepared to fuse off up to 64 defective GPU cores on the die...

Fusing off 64 cores isn't going to improve yields at all. Those 64 still sucked up space on the wafer. Fusing off 120-200mm^2 zones on a wafer are sunk costs; not "saving money".
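Both sides of this disagreement can be seen in a toy model. The sketch below assumes a simple Poisson defect model, and every number in it is hypothetical: a 750mm^2 die, half of it GPU, 128 equal-sized GPU cores, and 0.001 defects per mm^2. It compares the fraction of dies that are fully working with the fraction that could still be sold if up to 64 GPU cores may be fused off. The wafer cost per die is the same in both cases, which is the point made above; what binning changes is how many of those dies are sellable.

```python
import math


def zero_defect_prob(area_mm2: float, d0_per_mm2: float) -> float:
    """Poisson model: probability that a region of this area has no defects."""
    return math.exp(-area_mm2 * d0_per_mm2)


def sellable_fraction(die_mm2, gpu_mm2, gpu_cores, max_fused, d0_per_mm2):
    """Fraction of dies with a flawless non-GPU area and at most max_fused bad GPU cores."""
    p_core_bad = 1.0 - zero_defect_prob(gpu_mm2 / gpu_cores, d0_per_mm2)
    # Binomial probability that no more than max_fused GPU cores are defective.
    p_gpu_ok = sum(
        math.comb(gpu_cores, k) * p_core_bad**k * (1.0 - p_core_bad) ** (gpu_cores - k)
        for k in range(max_fused + 1)
    )
    return zero_defect_prob(die_mm2 - gpu_mm2, d0_per_mm2) * p_gpu_ok


# Hypothetical inputs, chosen only to illustrate the shape of the trade-off.
DIE_MM2, GPU_MM2, D0 = 750.0, 375.0, 0.001
full = sellable_fraction(DIE_MM2, GPU_MM2, 128, 0, D0)
binned = sellable_fraction(DIE_MM2, GPU_MM2, 128, 64, D0)
print(f"fully working dies: {full:.0%}")
print(f"sellable with up to 64 GPU cores fused off: {binned:.0%}")
```

Under these made-up inputs the sellable share rises from a bit under half to roughly two thirds, but every die still consumed the full 750mm^2 of wafer. Whether that is "better yield" or just "less waste" depends on which of the two arguments above one is making.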

leman · Dec 9, 2020

deconstruct60 said:
Multiple TBs???? Multiple GBs. I somewhat doubt Apple will even have memory controllers (and TLBs) that can address a TB (let alone multiples).

And the pin-out and trace fan-out of 12-16 channels looks like what, 2-D floorplan-wise (other than relatively quite large)?

Well, the current Mac Pro supports over a TB of RAM, and I don’t see Apple downgrading here. TLBs are probably not a problem with 16KB pages. As to the hardware implementation, I am totally clueless. But 16 64-bit channels is still only 1024 bits, and HBM2 is wider than that. They would obviously need some sort of interposer technology, but it’s not like this hasn’t been done before... Furthermore, since DDR5 seems to be designed with stacking in mind, I think this could be an attractive option.

leman · Dec 9, 2020

deconstruct60 said:
Too tight of coupling thermally was the corner they painted themselves into with the Mac Pro 2013. Really isn't a good reason to revisit that with a Mac Pro SoC later in time.

I don't see the parallel here. Back then the entire platform was thermally constrained and they didn't anticipate the TDP of GPUs doubling over a couple of years. With their own silicon they know what they are delivering. If 300 watts is enough for them to deliver a state of the art CPU and GPU, then why not?

deconstruct60 said:
Apple needs just a "good enough" iGPU in the Mac Pro SoC. That way all Macs have an iGPU. If Apple wants to chase the very top end of dGPU performance, then a decoupled GPU would be a much saner way to go. A slightly different driver model with much bigger NUMA latencies between the shared Memory spaces and more targeted and localized caches for data that isn't actively shared as much.

Maybe you are right, but this breaks the programming model. I think one of the best parts of Apple Silicon is that it gives you certain guarantees that you can use in your code. Just like gaming consoles. If Mac Pro deviates from these guarantees, pro software developers will suddenly have a much harder job. As I've said before, I can see some sort of NUMA interface between various clusters, where each cluster has a CPU and GPU that share memory, but can also communicate with other CPU+GPU pairs. I think that putting a non-local GPU on a Mac Pro would go against the philosophy of Apple Silicon. But who knows, you are making a good point as well.

deconstruct60 said:
Besides, the memory workload patterns will diverge more as the GPU and CPU core counts diverge. 20-30 of one and 64-100 of another isn't going to be uniform. The differences in patterns are going to become far more pronounced if both are loaded up with high workloads. The same memory controller trying to do two different jobs likely will run into issues.

Why? The processors don't have their own memory controllers. They connect to memory via a large shared cache. The memory controller just takes care of all the outstanding cache synchronization or prefetch requests. Maybe I am missing something, but I just don't see a problem. Whether you are streaming large amount of data for the GPU, or accessing random patterns via a CPU-side pointer chase, a cache line fetch is a cache line fetch. Apple likely has some sort of hardware in place to pool these and make them more efficient.

deconstruct60 said:
Fusing off 64 cores isn't going to improve yields at all. Those 64 still sucked up space on the wafer. Fusing off 120-200mm^2 zones on a wafer are sunk costs; not "saving money".

"Saving money" in the sense of "we can still sell them in our $6000 model and save the good ones for the $10000 model". Could be decent business

Krevnik · Dec 9, 2020

leman said:
Maybe you are right, but this breaks the programming model. I think one of the best parts of Apple Silicon is that it gives you certain guarantees that you can use in your code. Just like gaming consoles. If Mac Pro deviates from these guarantees, pro software developers will suddenly have a much harder job. As I've said before, I can see some sort of NUMA interface between various clusters, where each cluster has a CPU and GPU that share memory, but can also communicate with other CPU+GPU pairs. I think that putting a non-local GPU on a Mac Pro would go against the philosophy of Apple Silicon. But who knows, you are making a good point as well.

The penalties for moving stuff around make things trickier for thread management once you start looking at multiple chips (be it AMD's CCX or SMP). Something Apple could take on, but makes things more complicated in a different way.

But I will point out the moment you need multiple SoCs, chiplets start looking good as a way to consolidate common components that don't benefit from the duplication (like I/O & Secure Enclave). If those SoCs have different memory spaces, the cure is now worse than the disease.

leman said:
"Saving money" in the sense of "we can still sell them in our $6000 model and save the good ones for the $10000 model". Could be decent business

Using enormous dies where you have to fuse off some large percentage of them means you are wasting considerable die space, which has a cost. So you have to trade off the waste vs using an appropriately sized die with better yields. Or if you are Intel, you just charge an arm and a leg until someone comes along to tell you otherwise. This sort of management is Cook's bread and butter, so I'm curious how Apple tackles this, honestly.

It's partly why AMD's CCX dies are ingenious. Instead of having a big 16-core part that gets fused down to 6-8 cores, they have a smaller part with better yields, which is also more flexible towards meeting demand in the market, depending on what people are actually buying.

I honestly wouldn't be too surprised if there's something chiplet like though.
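A minimal sketch of why a smaller part yields better, assuming a simple Poisson defect model. The 400mm^2 total area and the defect density of 0.001 per mm^2 are hypothetical, and packaging cost, inter-die links and edge loss are all ignored. It compares the wafer area consumed per good product when the same logic is built as one die or split into two equal chiplets.

```python
import math


def silicon_per_good_part(total_area_mm2: float, pieces: int, d0_per_mm2: float) -> float:
    """Average wafer area consumed per good product built from `pieces` equal dies."""
    piece_area = total_area_mm2 / pieces
    piece_yield = math.exp(-piece_area * d0_per_mm2)  # Poisson zero-defect yield
    # Each good piece costs piece_area / piece_yield of silicon on average.
    return pieces * piece_area / piece_yield


# Hypothetical inputs for illustration only.
AREA_MM2, D0 = 400.0, 0.001
mono = silicon_per_good_part(AREA_MM2, 1, D0)
split = silicon_per_good_part(AREA_MM2, 2, D0)
print(f"monolithic:   {mono:.0f} mm^2 of wafer per good part")
print(f"two chiplets: {split:.0f} mm^2 of wafer per good part")
```

In this model the penalty for the monolithic die grows exponentially with area, which is the non-linear cost being described. It says nothing about the latency and packaging costs that chiplets add on the other side of the ledger.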

Frantisekj · Dec 9, 2020

leman said:
Not necessarily. If the non-upgradeable RAM is several times faster than any slotted memory would be, what would you choose?

That could be a compromise between performance, price and upgradeability I suppose.

So on chip RAM would become level 4 or 5 cache 🙂

EntropyQ3 · Dec 11, 2020

Krevnik said:
The penalties for moving stuff around make things trickier for thread management once you start looking at multiple chips (be it AMD's CCX or SMP). Something Apple could take on, but makes things more complicated in a different way.

But I will point out the moment you need multiple SoCs, chiplets start looking good as a way to consolidate common components that don't benefit from the duplication (like I/O & Secure Enclave). If those SoCs have different memory spaces, the cure is now worse than the disease.

Using enormous dies where you have to fuse off some large percentage of them means you are wasting considerable die space, which has a cost. So you have to trade off the waste vs using an appropriately sized die with better yields. Or if you are Intel, you just charge an arm and a leg until someone comes along to tell you otherwise. This sort of management is Cook's bread and butter, so I'm curious how Apple tackles this, honestly.

It's partly why AMD's CCX dies are ingenious. Instead of having a big 16-core part that gets fused down to 6-8 cores, they have a smaller part with better yields, which is also more flexible towards meeting demand in the market, depending on what people are actually buying.

I honestly wouldn't be too surprised if there's something chiplet like though.

The big advantage for AMD with chiplets was that they could address desktop and server markets across the full range with both minimal outlay and minimal design time. Efficient, but not necessarily optimal from a performance standpoint. Note that they haven’t taken this path with their GPUs, despite the die size (520mm2), nor has Nvidia (628mm2).

Apple’s situation is not the same, nor (as far as we know) are they shooting for the server market. Honestly, if I were Apple, I would simply ditch the Mac Pro in its current concept, and serve distinct target markets (video?) with dedicated supplementary hardware rather than make a huge, very low volume chip with a ton of more general processing elements.

That said, I see no particular reason why Apple should have yield problems with chips in the 300mm2 range. AMD sells 520mm2 7nm dies along with 16 GB of GDDR6 + power and cooling at just over $500. Profitably. And that’s not even getting into the new game consoles that are architecturally similar to Apple’s new Macs with die sizes of 300-350mm2 at 7nm, 16 GB of GDDR6, 1TB of fast SSD and UHD Blu-Ray players. At $500. We really have no reason to exaggerate the difficulty of production or cost of the new Mac SoCs.

richinaus · Dec 11, 2020

xWhiplash said:
Yes...people confuse Pro with Enthusiast on this site all the time and have for years and years. As a "pro" I would like to not have to deal with opening the computer case. As an "Enthusiast" I do like opening my computer case and upgrading components (my gaming PC for example). I have two computers for two separate needs. When I want to get work done, I do not want to deal with opening my case up. I get a new machine when my needs change and I make up the price of the machine when I finish a few jobs for people.

You described me. I have no interest in tinkering with work computers, and they are there as tools to get a job done. I just get the right tools for the jobs.

If there is an equivalent of the old trash can size Mac Pro that can sit on my desk and deliver great performance out of the box, I will be super happy. And I can see no reason at all why this won’t happen based on the M1 so far.

Krevnik · Dec 11, 2020

EntropyQ3 said:
That said, I see no particular reason why Apple should have yield problems with chips in the 300mm2 range. AMD sells 520mm2 7nm dies along with 16 GB of GDDR6 + power and cooling at just over $500. Profitably. And that’s not even getting into the new game consoles that are architecturally similar to Apple’s new Macs with die sizes of 300-350mm2 at 7nm, 16 GB of GDDR6, 1TB of fast SSD and UHD Blu-Ray players. At $500. We really have no reason to exaggerate the difficulty of production or cost of the new Mac SoCs.

Note the sort of yield issues facing Nvidia and AMD on their new dies at these sizes though. And it's not so much about the architecture, it's the scale. Also note how much smaller those console dies are in order to keep costs where they want them.

These rumors are of designs that would rival designs that are already enormous in their own right, and are on two large dies as it is. I'm not saying it's impossible, but rather the costs aren't linear, and that there are reasons to not just expand the size of the die indefinitely (or just fuse insanely large parts to supply something that can be done on a much smaller die with better costs). If you think Apple can reach these levels and stay in the 300-350mm2 range on 5nm, then sure, yields will probably be fine. That said, I'm a little skeptical that these sort of rumored chips would be in that size range.

But keep in mind a chunk of what you quoted was me also arguing against multiple SoCs and the complexities those entail, not necessarily advocating for chiplets.

EntropyQ3 said:
The big advantage for AMD with chiplets was that they could address desktop and server markets across the full range with both minimal outlay and minimal design time. Efficient, but not necessarily optimal from a performance standpoint.

I'd argue that for Apple, the Mac Pro is a place where reducing their outlay and costs makes sense, due to the size of the market. That said, just making the customer eat those costs is another option, but I wonder if Apple sees an opportunity to make the Mac Pro a cheaper workstation again, or just doesn't care.

I have also pointed out the chiplet performance issues in the post you quoted. Although it has apparently turned out to be a smaller hit in practice than originally thought by engineers. For the sort of workloads that something like the Mac Pro would be using high core counts for, it might not be a bad trade off.

EntropyQ3 said:
Apple’s situation is not the same, nor (as far as we know) are they shooting for the server market. Honestly, if I were Apple, I would simply ditch the Mac Pro in its current concept, and serve distinct target markets (video?) with dedicated supplementary hardware rather than make a huge, very low volume chip with a ton of more general processing elements.

For some of us, the supplementary hardware is more CPU cores, unfortunately.

I generally agree that going gonzo with the cores like you see in Threadripper or Epyc isn't what Apple has in mind. However, offloading everything to an ASIC isn't feasible either.

dmccloud · Dec 12, 2020

EntropyQ3 said:
That said, I see no particular reason why Apple should have yield problems with chips in the 300mm2 range. AMD sells 520mm2 7nm dies along with 16 GB of GDDR6 + power and cooling at just over $500. Profitably. And that’s not even getting into the new game consoles that are architecturally similar to Apple’s new Macs with die sizes of 300-350mm2 at 7nm, 16 GB of GDDR6, 1TB of fast SSD and UHD Blu-Ray players. At $500. We really have no reason to exaggerate the difficulty of production or cost of the new Mac SoCs.

While production of larger die sizes should not be an issue for TSMC, the console market is a bad analogy from a cost perspective because of the pricing structure used by console manufacturers. Both Microsoft and Sony sell their consoles at a bare minimum profit, if not an outright loss. For example, the PS5 costs around $450 to manufacture and the Xbox Series X costs right around that $500 mark. These companies make the bulk of their revenues off the licensing fees paid by game and peripheral developers, as well as their services such as Game Pass Ultimate (Xbox) and PlayStation Network. Nintendo also engages in this practice. For retailers such as Best Buy, Target, GameStop, Walmart, etc. there is no real profit made from the sale of physical media, which is a big reason why digital sales and subscriptions have become increasingly emphasized over the years. This latter reason is why GameStop has emphasized their used software/hardware business so much over the last several years. Even the base M1 Air probably costs Apple between $700 and $850 to build, mainly because of the cost of the screen.