
quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
If they were planning ahead, why didn't they plan ahead to do a larger M1 die years ago also? If Apple chose not to allocate resources to do the work, then it wouldn't be done. However, that non-allocation would have been a planned choice. The issue is that if there was an early window for the M2, then there was an early window for the M1 also.

The bigger issue with this theory is why an M2-big would come before the regular M2. If Apple needs a "bigger than A15" die to do 'pipe cleaning' of the new fab, then the M2 is already bigger ( 120-130mm2 versus 80-90mm2 ). If they want, they can collect some M2s binned as -1 GPU and/or -1 P core to sell. And the M2 has a higher average selling price.

The other major problem with this "grand plan" is that the next-generation A-series has to start high-volume production every year in June in order to have parts in relatively very large volume by late September. If the iPhone weren't on a relatively rigid timeline, then there would be flex time to tweak the process with a pipe cleaner. The other problem is that Apple needs hundreds if not thousands of A-series chips early for field validation testing around the world. They are cranking out numerous "risk production" A-series wafers anyway.

There is a narrow window in the first couple of months of TSMC "at risk" production where running 250-500mm2 dies through might help and probably has a bigger impact.

To have had time to learn and adjust from the pipe-cleaning exercise, the bigger die production would have had to start back in the Jan-March time frame. There isn't much to validate that that happened.


The A-series sells in at least one, if not two or more, orders of magnitude higher volume than the bigger M-series dies will. The bigger the M-series die, the larger that order-of-magnitude gap is likely to be. The A-series product line isn't some poor, impoverished product line. For example, if a $50 A-series processor sells 100M units, then a 15% cut of that is $750M. If a $350 M-series processor sells 10M units, then a 15% cut of that is $525M. There is actually more money in the A-series pot than in that particular M unit.

The order-of-magnitude gap in units sold means the processor would need to be up in the $1,000 range to make a difference. The vast majority of the Mac lineup doesn't support that kind of price just for the main SoC die.

For a company like AMD that doesn't have a "mega scale" volume product to lean on, would shifting costs onto a relatively very low-run, very high-markup product make sense? Sure. For Apple... not really. Apple has die volume that the AMD of 10 years ago had wet dreams about.
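A trivial check of the revenue arithmetic quoted above (the prices, volumes, and 15% cut are the quoted poster's hypotheticals, not real figures):

```python
# Hypothetical revenue comparison from the quoted post: price x units x 15% cut.
a_series_pot = 50 * 100_000_000 * 0.15    # $50 A-series part, 100M units
m_series_pot = 350 * 10_000_000 * 0.15    # $350 M-series part, 10M units
print(f"A-series pot: ${a_series_pot/1e6:.0f}M vs M-series pot: ${m_series_pot/1e6:.0f}M")  # $750M vs $525M
```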
While I understand your cost analysis of the Mac vs. iOS device SoCs, I think Apple probably does not care much about the cost of each die. It would have been factored into the final sale price of the respective products anyway. Besides, for the higher-end Macs that come with more expensive Intel CPUs and AMD GPUs, Apple may still come out ahead in terms of BOM costs.

As to the planning of the SoC designs vs. product placements, only Apple really knows. We're just having fun speculating here.

At the moment, we're probably conditioned by past releases and revenue share to assume iPhones will have priority over Macs. Well, Apple is rather well known for not following the traditional playbook.

Anyway, kind of disappointed, though, that no Apple Silicon Macs were announced today. Oh well.
 

jdb8167

macrumors 601
Nov 17, 2008
4,859
4,599
Hopefully this wait means we will get some version of the M2 rather than a larger M1.
I was kind of disappointed that there were no Apple silicon announcements at WWDC, but I agree that this makes it more likely that we will see a new microarchitecture and perhaps a node change for the higher-end MacBook SoCs. I have to admit that the rumored timing of the new 5nm+ or 4nm node going into production last month made it unlikely that Apple could ship products before the end of July. A more than 30-day delay before shipping seemed too long.
 
  • Like
Reactions: Roode

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
While I understand your cost analysis of the Mac vs. iOS device SoCs, I think Apple probably does not care much about the cost of each die. It would have been factored into the final sale price of the respective products anyway. Besides, for the higher-end Macs that come with more expensive Intel CPUs and AMD GPUs, Apple may still come out ahead in terms of BOM costs.

Perhaps I wasn't clear enough in my write-up. Those are not raw BOM costs for the SoCs. Those are "charge back" prices for the modules in the products (with the Apple Tax already woven in). That analysis isn't about die costs to Apple at all. It was about the prices customers were paying.

Apple doesn't buy dies; they buy wafers from TSMC.

Die costs:


Crank the defect density up to 0.3 defects/cm2 on a 300mm wafer, with a 5nm wafer cost of about $20,000/wafer:

A14: approximately 9.4 x 9.4 mm (~88 mm2), fab yield 77%, good dies ~627 ---> ~$32/die (where the good dies are paying for the bad ones too).

M1: approximately 11 x 11 mm (~120 mm2), fab yield 68%, good dies ~411 ---> ~$48/die

Even if there were an even bigger die:

M-way-bigger: 12 x 28 mm (~330 mm2), fab yield 40%, good dies ~77 ---> ~$259/die

The M1 at $350 per package would have huge margin in it even after paying for more expensive packaging.
So if test and packaging doubled those costs to $64, $96, and $518, and they are charged back to the host systems at $50-80, $350, and $1,000, there is lots of margin (never mind that the defects are already paid for). Yeah, the total amount of margin in relative dollar figures on the $1,000 option is "juicier", but that really isn't about paying for the defects.

At a defect density of 1.0, the A14 is still (~358 good) around $56/die with all the defects paid for (although the effective yield is now down in the 44% range... still higher than the "way bigger" die though!).

If the M2 and the A15 (at approximately the same die sizes) both shared the costs of the drive down to volume production, that would lower the ramp costs for both. The M2 could get a bigger share at the front if they really wanted it to.
The problem with using big dies as a "pipe cleaner" is that the yield rate at a constant defect density drops off pretty fast. In the 0.3 example above, the effective A14 yield is almost twice as good. Once in the sub-50% range, there needs to be lots of money to cover that up (or relatively lots more spent on defect harvesting).

When the defect density is relatively quite high... primarily TSMC should be eating that cost (they are primarily running wafers through to figure things out. There might be a few customers helping to defray that, but it isn't the customers' 'job' to pay TSMC to sort out their mess).
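As a rough illustration of the napkin math above, here is a minimal sketch assuming the Murphy yield model and a simple gross-die estimate for a 300mm wafer. The good-die counts depend on which die-per-wafer calculator is used, so the results land in the same ballpark as the figures above rather than matching them exactly:

```python
import math

def dies_per_wafer(die_w_mm, die_h_mm, wafer_d_mm=300.0):
    """Rough gross dies per round wafer (ignores scribe lines and edge exclusion)."""
    area = die_w_mm * die_h_mm
    r = wafer_d_mm / 2.0
    return int(math.pi * r ** 2 / area - math.pi * wafer_d_mm / math.sqrt(2.0 * area))

def murphy_yield(die_area_mm2, d0_per_cm2):
    """Murphy's yield model: ((1 - e^(-D*A)) / (D*A))^2, with A in cm^2."""
    da = d0_per_cm2 * die_area_mm2 / 100.0
    return ((1.0 - math.exp(-da)) / da) ** 2

def cost_per_good_die(die_w_mm, die_h_mm, d0, wafer_cost=20_000.0):
    """Wafer cost spread over good dies only -- the good dies pay for the bad ones."""
    gross = dies_per_wafer(die_w_mm, die_h_mm)
    good = gross * murphy_yield(die_w_mm * die_h_mm, d0)
    return wafer_cost / good

for name, w, h in [("A14", 9.4, 9.4), ("M1", 11, 11), ("M-way-bigger", 12, 28)]:
    y = murphy_yield(w * h, 0.3)
    print(f"{name}: yield {y:.0%}, ~${cost_per_good_die(w, h, 0.3):.0f}/die at D0 = 0.3/cm2")
```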
 
  • Like
Reactions: jdb8167

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
Hopefully this wait means we will get some version of the M2 rather than a larger M1.

Depends upon where the bottleneck is. Mini-LED screens could have caused a slide. Panel TCONs (display controllers) have had hiccups, etc. There have been lots of automobile production shutdowns over quirky stuff like backup cameras and infotainment subsystems.

The closer it gets to overlapping with the iPhone volume launch, the less sense it makes. Production for the Fall iPhones is going to be a huge drain on the available "bleeding edge" wafer starts for Apple.
 

bobcomer

macrumors 601
May 18, 2015
4,949
3,699
Do you work for Apple?
Wow, never been accused of that before. Most people think of me as an Intel fanboy around here, but neither of these is true. :)

I'm not upgrading from my 2018 iPad Pro to an M1 iPad just for Siri, a largely useless so-called assistant, to be 0.1s faster. Comment of the day :p
Me neither, though my iPad Pro is a 2019 if I remember correctly. The miniLED does tempt me, but 12.9 is too big. I actually use Siri some...
 

EntropyQ3

macrumors 6502a
Mar 20, 2009
718
824
Perhaps I wasn't clear enough in my write-up. Those are not raw BOM costs for the SoCs. Those are "charge back" prices for the modules in the products (with the Apple Tax already woven in). That analysis isn't about die costs to Apple at all. It was about the prices customers were paying.

Apple doesn't buy dies; they buy wafers from TSMC.

Die costs:

Crank the defect density up to 0.3 defects/cm2 on a 300mm wafer, with a 5nm wafer cost of about $20,000/wafer:

A14: approximately 9.4 x 9.4 mm (~88 mm2), fab yield 77%, good dies ~627 ---> ~$32/die (where the good dies are paying for the bad ones too).

M1: approximately 11 x 11 mm (~120 mm2), fab yield 68%, good dies ~411 ---> ~$48/die

Even if there were an even bigger die:

M-way-bigger: 12 x 28 mm (~330 mm2), fab yield 40%, good dies ~77 ---> ~$259/die

The M1 at $350 per package would have huge margin in it even after paying for more expensive packaging.
So if test and packaging doubled those costs to $64, $96, and $518, and they are charged back to the host systems at $50-80, $350, and $1,000, there is lots of margin (never mind that the defects are already paid for). Yeah, the total amount of margin in relative dollar figures on the $1,000 option is "juicier", but that really isn't about paying for the defects.

At a defect density of 1.0, the A14 is still (~358 good) around $56/die with all the defects paid for (although the effective yield is now down in the 44% range... still higher than the "way bigger" die though!).

If the M2 and the A15 (at approximately the same die sizes) both shared the costs of the drive down to volume production, that would lower the ramp costs for both. The M2 could get a bigger share at the front if they really wanted it to.
The problem with using big dies as a "pipe cleaner" is that the yield rate at a constant defect density drops off pretty fast. In the 0.3 example above, the effective A14 yield is almost twice as good. Once in the sub-50% range, there needs to be lots of money to cover that up (or relatively lots more spent on defect harvesting).

When the defect density is relatively quite high... primarily TSMC should be eating that cost (they are primarily running wafers through to figure things out. There might be a few customers helping to defray that, but it isn't the customers' 'job' to pay TSMC to sort out their mess).
You don't have to speculate on (high) defect densities for 5nm. TSMC described them at their recent technology symposium. It's actually better than 7nm (!); graphs are, for instance, in AnandTech's write-up. That means they are at roughly 0.05 defects per cm2 today.

Now, there are more parameters of course - you can design for yield, for instance by being able to bypass a CPU core/GPU core/NPU or a certain cache area if a defect shows up in any of those areas. On the other hand, you may have strict demands on frequency vs. power that force you to discard some percentage (though honestly, I can't see why Apple would choose to mess up their own yields on Mac chips on that account).

Also, according to industry insiders, the 5nm wafer cost is a lot lower, although no one can really figure out what Apple pays. Their relationship with TSMC and their process progression is too intimate.

Sum total: your analysis is good napkin math, but it seems to overestimate cost-per-die significantly. (Fixed costs are unfortunately harder to estimate than running material costs, which makes cost estimates for relatively low-volume parts difficult.)
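To put a number on that, re-running the Murphy-model sketch from the earlier post at roughly 0.05 defects/cm2 (napkin math again, not TSMC data) shows how much the big-die yield penalty shrinks:

```python
import math

def murphy_yield(die_area_mm2, d0_per_cm2):
    """Murphy's yield model: ((1 - e^(-D*A)) / (D*A))^2, with A in cm^2."""
    da = d0_per_cm2 * die_area_mm2 / 100.0
    return ((1.0 - math.exp(-da)) / da) ** 2

for name, area_mm2 in [("A14 ~88 mm2", 88), ("M1 ~120 mm2", 120), ("330 mm2 die", 330)]:
    print(f"{name}: {murphy_yield(area_mm2, 0.3):.0%} at D0=0.3 vs "
          f"{murphy_yield(area_mm2, 0.05):.0%} at D0=0.05")
```

Under those assumptions even a ~330 mm2 die comes out around 85%, which is why the earlier cost-per-die figures look pessimistic at today's reported defect densities.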
 
Last edited:

theorist9

macrumors 68040
May 28, 2015
3,880
3,060
The AS Mac Pro needs to be big enough to handle the thermals of a scaled AS chip, while remaining quiet.

While AS is more thermally efficient than Intel on the CPU side, is it any more thermally efficient than NVIDIA/AMD on the GPU side?

Assuming linear scaling from the current 8-core AS GPU, a 128-core AS GPU would have processing power approximately equivalent to the NVIDIA RTX A6000, which has a 300W TDP.
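For reference, that linear-scaling arithmetic looks roughly like this, taking the commonly cited ~2.6 TFLOPS FP32 for the 8-core M1 GPU and Nvidia's quoted ~38.7 TFLOPS FP32 for the RTX A6000 (both figures are approximate, and the scaling is deliberately naive):

```python
# Naive linear scaling of M1 GPU compute up to 128 cores vs. an RTX A6000.
# The M1 and A6000 figures are approximate published numbers, not measurements.
M1_8CORE_TFLOPS = 2.6      # ~FP32 throughput of the 8-core M1 GPU
A6000_TFLOPS = 38.7        # Nvidia's quoted FP32 peak for the RTX A6000
A6000_TDP_W = 300

scaled_128 = M1_8CORE_TFLOPS / 8 * 128
print(f"128-core linear estimate: {scaled_128:.1f} TFLOPS "
      f"vs RTX A6000: {A6000_TFLOPS} TFLOPS at {A6000_TDP_W} W TDP")
```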
 

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
While AS is more thermally efficient than Intel on the CPU side, is it any more thermally efficient than NVIDIA/AMD on the GPU side?

Thermally efficient at doing what? GPGPU tasks or putting graphics images up on the screen? For the latter, Apple's GPU uses a number of techniques to reduce the amount of computation and data transmission and/or copying that needs to be done. That won't necessarily transfer over to large-scale bulk computation. If there are 40 billion numbers to compute, making them smaller isn't going to help if you have to maintain accuracy (they also aren't going to compress well, unless you're throwing out zeros, e.g., a sparse matrix).


Assuming linear scaling from the current 8-core AS GPU, a 128-core AS GPU would have processing power approximately equivalent to the NVIDIA RTX A6000, which has a 300W TDP.

Apple's GPU isn't going to scale anywhere near the A6000's bandwidth. So again, it depends upon what computations with what data. For something that fits in cache, memory I/O throughput doesn't matter as much.

The number of ALUs * GHz is a peak theoretical score. If there is no bandwidth to back that up, then it will remain a mostly theoretical score.
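As an illustration of that peak number, a sketch using the commonly cited (unofficial) M1 GPU figures of 1,024 FP32 ALUs and a ~1.28 GHz clock:

```python
# Peak theoretical FP32 = ALUs x 2 FLOPs per FMA x clock.
# The ALU count and clock below are commonly cited estimates, not official Apple specs.
alus = 8 * 128           # 8 GPU cores x 128 FP32 ALUs per core
clock_ghz = 1.278        # approximate M1 GPU clock
peak_tflops = alus * 2 * clock_ghz / 1000.0
print(f"Peak theoretical FP32: ~{peak_tflops:.1f} TFLOPS")  # ~2.6 TFLOPS, if the bandwidth is there to feed it
```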

If Apple does the scaling so that each chiplet has GPU cores and the memory I/O is also distributed over the chiplets, then this probably won't scale as linearly as most folks are projecting it to. Second, the A6000 largely doesn't have to compete with the CPU cores for bandwidth to/from the VRAM.

Decent chance Apple is going to increase GPU core count skewed toward workloads that drive 3-4 screens with different stuff on them, rather than apply a higher core count to much bigger compute that drives a single screen "faster". There is also an efficiency dimension in how well it juggles multiple, more weakly related tasks.

I doubt Apple is going to aim at the 3090/A6000/6900 parts of the competitive lineups. An iGPU is a double-edged sword. It drives higher energy efficiency but also probably gives up peak bandwidth (as that tends to consume much more power).



P.S. If you buy an A6000 to do Deep Learning workloads driven by CUDA-optimized libraries, you'll get a different energy efficiency versus a 3090. Similarly, an A6000 versus a 3090 driving some high-frame-rate screen is different as well.

 
Last edited:

leman

macrumors Core
Original poster
Oct 14, 2008
19,521
19,678
Thermally efficient at doing what? GPGPU tasks or putting graphics images up on the screen? For the latter, Apple's GPU uses a number of techniques to reduce the amount of computation and data transmission and/or copying that needs to be done. That won't necessarily transfer over to large-scale bulk computation. If there are 40 billion numbers to compute, making them smaller isn't going to help if you have to maintain accuracy (they also aren't going to compress well, unless you're throwing out zeros, e.g., a sparse matrix).

Thermally efficient at GPU compute. M1 already delivers 2.6Tflops at only 10W of power. This is a level of efficiency Nvidia or AMD can only dream about. And sure, Apple has a process advantage but that’s only a minor contributing factor.

Apple's GPU isn't going to scale anywhere near the A6000's bandwidth. So again, it depends upon what computations with what data. For something that fits in cache, memory I/O throughput doesn't matter as much.

It’s true that Apple GPUs probably won’t have that ~1TB/s pure memory bandwidth. But they also have huge caches compared to classic GPUs, and Navi2 demonstrates quite clearly what is possible. With 512MB on-chip cache Apple won’t need massive RAM bandwidth to challenge even the fastest workstation GPUs.

If Apple does the scaling so that each chiplet has GPU cores and the memory I/O is also distributed over the chiplets, then this probably won't scale as linearly as most folks are projecting it to. Second, the A6000 largely doesn't have to compete with the CPU cores for bandwidth to/from the VRAM.

In terms of bandwidth per GPU ALU, the difference between the A6000 and M1 is practically negligible. And M1 has larger cache to offset the rest.
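A quick sanity check of that claim using published approximate figures (768 GB/s and 10,752 CUDA cores for the A6000; ~68 GB/s of LPDDR4X bandwidth and 1,024 ALUs for the M1):

```python
# Rough bandwidth-per-ALU comparison using approximate published figures.
a6000_bw_gbs, a6000_alus = 768.0, 10752   # RTX A6000: GDDR6 bandwidth, CUDA cores
m1_bw_gbs, m1_alus = 68.25, 1024          # M1: LPDDR4X-4266 on a 128-bit bus, 8-core GPU

print(f"A6000: {a6000_bw_gbs / a6000_alus:.3f} GB/s per ALU")  # ~0.071
print(f"M1:    {m1_bw_gbs / m1_alus:.3f} GB/s per ALU")        # ~0.067
```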

I doubt Apple is going to aim at the 3090/A6000/6900 parts of the competitive lineups. An iGPU is a double-edged sword. It drives higher energy efficiency but also probably gives up peak bandwidth (as that tends to consume much more power).

Of course they will be aiming for those levels - and beyond. Do you really think they will want to release a Mac Pro slower than Intel machines? You are perfectly correct with regards to bandwidth, but you are neglecting the effect the cache has on the total result.

And by the way, Nvidia‘s next supercomputer platform utilizes an iGPU. They have publicly acknowledged that they have reached the wall with dGPU designs. Modern compute and ML workloads require a high level of horizontal interaction, so just throwing more and more high-latency RAM bandwidth is not working anymore.
 

AgentMcGeek

macrumors 6502
Jan 18, 2016
374
305
London, UK
The M1 offers, with around 5W of GPU power, what a GTX can achieve with more than 100W. That gives you an idea of how efficient it is vs. the competition.
 

pshufd

macrumors G4
Oct 24, 2013
10,149
14,574
New Hampshire
The M1 offers, with around 5W of GPU power, what a GTX can achieve with more than 100W. That gives you an idea of how efficient it is vs. the competition.

My GTX 1050 Ti's TDP is 70 watts. It can also drive three 4K displays and an additional QHD display. The 1050 is an old card from back around 2016; the newer cards on newer processes are likely far more efficient.
 

AgentMcGeek

macrumors 6502
Jan 18, 2016
374
305
London, UK
Display capabilities have little to do with GPU power in our case. The M1's limitations on displays come from the chip design. I'm sure the M1X will be designed with support for multiple 4K displays.
 
  • Like
Reactions: Tagbert

MrGunny94

macrumors 65816
Dec 3, 2016
1,148
675
Malaga, Spain
Honestly, I'm very curious to see Apple's own interpretation of high-end computing in their laptop lineup. I ditched my 16" for the M1 Air and I'm very, very surprised by how much work I can get done with the M1 and without a fan. I don't know why I should go back to the Pro lineup, since the Air already gives me a good amount of computing power.

However, I want a Mini-LED display, and this Air was bought as a temporary computer while I sell my 16", which was on its last legs due to that Intel chip and the hot VRAM from the 5500M.
 

pshufd

macrumors G4
Oct 24, 2013
10,149
14,574
New Hampshire
Display capabilities have little to do with GPU power in our case. The M1's limitations on displays come from the chip design. I'm sure the M1X will be designed with support for multiple 4K displays.

The M1 mini that I have supports multiple 4K monitors, just not enough of them. Display capabilities are capabilities, though. A comparison to a six-year-old video card is unfair.
 

jeanlain

macrumors 68020
Mar 14, 2009
2,461
955
The M1 offers, with around 5W of GPU power, what a GTX can achieve with more than 100W. That gives you an idea of how efficient it is vs. the competition.
Apple themselves say that the M1 GPU is 3 times more power-efficient than the competition, not 20 times.
 

pshufd

macrumors G4
Oct 24, 2013
10,149
14,574
New Hampshire
Apple themselves say that the M1 GPU is 3 times more power-efficient than the competition, not 20 times.

It's a five-year-old card on 14 nm. AMD is on a 7 nm process for their APUs and GPUs today. I think Nvidia is using Samsung's 8 nm process. Intel is using 10 nm, and their 11th-gen stuff looks competitive with the M1. Apple is likely comparing with 10th-gen Intel and AMD, but both stepped up their games this year.


[Attached screenshot]
 

jeanlain

macrumors 68020
Mar 14, 2009
2,461
955
It's a five-year-old card on 14 nm. AMD is on a 7 nm process for their APUs and GPUs today. I think Nvidia is using Samsung's 8 nm process. Intel is using 10 nm, and their 11th-gen stuff looks competitive with the M1. Apple is likely comparing with 10th-gen Intel and AMD, but both stepped up their games this year.
We don't know which GPU Apple compares the M1 to, but I don't think it's 10th-gen Intel. The Iris Xe with 96 EUs consumes about 25W and is slower than the M1 at 10W.
AMD Vega mobile isn't better.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,521
19,678
Apple themselves say that the M1 GPU is 3 times more power-efficient than the competition, not 20 times.

It depends on which product you are talking about. Many gaming GPUs are low-binned parts that run particularly hot; the best parts are usually reserved for laptops. The M1 is roughly equivalent to a GTX 1650 Max-Q, which is an underclocked, binned TU117 chip running at 35 watts. The desktop version is a lower-quality part that has some of its shader units disabled but runs at a much higher frequency with a TDP of ~70 watts, and it ends up being significantly faster most of the time.

It's a five-year-old card on 14 nm. AMD is on a 7 nm process for their APUs and GPUs today. I think Nvidia is using Samsung's 8 nm process. Intel is using 10 nm, and their 11th-gen stuff looks competitive with the M1. Apple is likely comparing with 10th-gen Intel and AMD, but both stepped up their games this year.

The transition to 7/8nm didn't give Nvidia that much, roughly 20-25% at the same power consumption. And we still haven't seen the performance of the 35W 3050 parts...

We don't know which GPU Apple compares the M1 to, but I don't think it's 10th-gen Intel. The Iris Xe with 96 EUs consumes about 25W and is slower than the M1 at 10W.
AMD Vega mobile isn't better.

We have a fairly good idea: the GPUs closest to the M1's performance are the older GeForce 1050/1050 Ti (desktop) or the newer mobile 1650 Max-Q. The brand-new 3050 with a 30-50 W TDP will be up to 50% faster of course, but the upcoming 16-core Apple GPUs with a TDP of 20-25 W will blow it out of the water.
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
It depends on which product you are talking about. Many gaming GPUs are low-binned parts that run particularly hot; the best parts are usually reserved for laptops. The M1 is roughly equivalent to a GTX 1650 Max-Q, which is an underclocked, binned TU117 chip running at 35 watts. The desktop version is a lower-quality part that has some of its shader units disabled but runs at a much higher frequency with a TDP of ~70 watts, and it ends up being significantly faster most of the time.

The transition to 7/8nm didn't give Nvidia that much, roughly 20-25% at the same power consumption. And we still haven't seen the performance of the 35W 3050 parts...

We have a fairly good idea: the GPUs closest to the M1's performance are the older GeForce 1050/1050 Ti (desktop) or the newer mobile 1650 Max-Q. The brand-new 3050 with a 30-50 W TDP will be up to 50% faster of course, but the upcoming 16-core Apple GPUs with a TDP of 20-25 W will blow it out of the water.
First blush implies it isn't that great...
 