
MRMSFC

macrumors 6502
Jul 6, 2023
371
381
It feels like there hasn't been a performance jump, at least in the P-cores, since the A14. What happened?
Off the top of my head, it seems the processor is no longer scaling linearly with memory bandwidth the way it used to, when "easy" gains could come simply from increasing the bandwidth feeding the cores.

EDIT:
Reading between the lines of the keynote, I would guess that the processor team was focused on improving the NPU and GPU.

Granted I’m not privy to how Apple’s SoC team is organized, but I suspect it’s tightly integrated and shifts priorities around a lot.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,675
The following chart, created by our own @Andropov, together with data posted by @name99 earlier, demonstrates that A-series performance shows a clear linear trend over time. We are talking about a fairly constant 250-300 GB6 CPU points per generation, on average. A similar trend can be observed for MC scores (with more variation there, most likely due to the E-cores).

In fact, there appears to be a bigger jump in performance from A16 to A17 than from A13 to A14. It's just that linear improvements on lower scores yield higher percentages. And by the way, the difference between the A16 and early A17 benchmarks is comparable to the SC difference between Intel Rocket Lake and Intel Alder Lake, which was universally praised as the single largest generational improvement in x86 in the last decade. So I feel like we have been applying opportunistic metrics to these processors.

Bottom line: performance has improved steadily and linearly since at least the A10 and shows no signs of slowing down. The only reason we are having this conversation is that we assumed the improvements were multiplicative when in fact they are additive. Which, frankly, makes much more sense given how hardware advancements actually work in practice.

[Chart: GB6 single-core score by A-series generation, by @Andropov]
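
To see why constant additive gains read as shrinking percentage gains, here's a quick back-of-the-envelope sketch in Python with made-up round numbers (not actual GB6 scores), assuming the roughly 275-point-per-generation step described above:

```python
# Illustrative only: a constant additive gain of ~275 GB6 points per
# generation (the middle of leman's 250-300 range), from a made-up
# A10-era baseline of 1000.
score = 1000.0
for gen in ["A11", "A12", "A13", "A14", "A15", "A16", "A17"]:
    new_score = score + 275  # same absolute step every generation
    pct = (new_score / score - 1) * 100
    print(f"{gen}: {new_score:.0f} GB6 ({pct:+.1f}%)")
    score = new_score
```

The absolute step is identical every year, yet the headline percentage falls from about +28% down to about +10%, which is the whole perceived "slowdown".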
 
Last edited:

sack_peak

Suspended
Sep 3, 2023
1,020
959
Intel desktop chips vs Apple iPhone chips

[Image: perf-trajectory.png — performance trajectory of Intel desktop chips vs Apple iPhone chips]



[Chart: A-series GB6 scores, quoted from leman's post above]


I'd like to add some data points to this to make more sense in relation to what was the 1st iPhones to have them

Year of Release | iPhone chip | Die shrink | iPhone model
2023 | A17 Pro | 3nm | iPhone 15 Pro
2022 | A16 Bionic | 5nm | iPhone 14 Pro
2021 | A15 Bionic | 5nm | iPhone 13 Pro
2020 | A14 Bionic | 5nm | iPhone 12 Pro
2019 | A13 Bionic | 7nm | iPhone 11 Pro
2018 | A12 Bionic | 7nm | iPhone Xs
2017 | A11 Bionic | 10nm | iPhone 8
2016 | A10 Fusion | 16nm | iPhone 7
2015 | A9 | 14nm | iPhone 6s

Future die shrinks (TSMC node names) that Apple always has first priority on:

- 2024: N2 2nm
- 2026: A14 1.4nm
- 2028: A10 1.0nm
- 2030: A7 0.7nm
- 2032: A5 0.5nm
- 2034: A3 0.3nm
- 2036: A2 0.2nm

Imagine if you only replaced your phone every 8 years because replacing the battery were that easy.

So: 2015 14nm > 2023 3nm > 2031 0.7nm.
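
For what it's worth, that 8-year cadence is just a geometric extrapolation; a tiny sketch of the arithmetic (node labels being marketing names, not literal feature sizes):

```python
# Illustrative only: extrapolate the poster's 8-year shrink cadence.
# 14nm (2015) -> 3nm (2023) is a ~4.7x shrink; applying the same
# ratio again gives ~0.64nm for 2031, near the ~0.7nm quoted above.
node = 14.0
ratio = 3.0 / 14.0
for year in (2015, 2023, 2031):
    print(f"{year}: ~{node:.2f} nm-class")
    node *= ratio
```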
 
Last edited:

sack_peak

Suspended
Sep 3, 2023
1,020
959
There are those of us disappointed that the (unconfirmed) Geekbench SC score for this mobile SoC doesn't outperform the best currently shipping desktop x86 chips.
Can anyone confirm whether the A17 Pro outperforms the top-end 2019 Mac Pro Xeon?
 

Macintosh IIcx

macrumors 6502a
Jul 3, 2014
625
612
Denmark
Quote (leman): The following chart, created by our own @Andropov, together with data posted by @name99 earlier, demonstrates that A-series performance shows a clear linear trend over time. …
Something else to keep in mind: it can be very challenging to move to a new process node (3nm) and rewrite an architecture at the same time without running into nasty bugs (remember Intel's tick-tock concept). So, since Apple was rewriting the GPU architecture, I'm not surprised they decided to go conservative with the CPU this time. Remember that this is a chip where failure is not an option for the company.
 

PaulD-UK

macrumors 6502a
Oct 23, 2009
906
510
Quote: Future die shrinks (TSMC node names) that Apple always has first priority on: 2024: N2 2nm … 2036: A2 0.2nm

Quote: According to Apple, these (M3) transistors are so small that some of their elements are only 12 silicon atoms wide…

So by 2036 Apple will be at the Top Quark + Higgs Boson quantum level.
Haha, can’t wait :)
 

sack_peak

Suspended
Sep 3, 2023
1,020
959
Quote (PaulD-UK): So by 2036 Apple will be at the Top Quark + Higgs Boson quantum level. Haha, can't wait :)
No human problem can't be licked with enough R&D money, self-interest, and smarts.

Just look at COVID vaccines... years before the pandemic, Bill Gates said in a TED Talk that it would take 3 years to get a vaccine done for any future pandemic.

It took the world about a year or two. The incentive to vaccinate 8 billion souls was too tempting not to pursue.
 
Last edited:

kiranmk2

macrumors 68000
Oct 4, 2008
1,666
2,308
I posted in the news thread that comparing the A17 to the A16 doesn't really predict a lot about M3 improvements. There has been an assumption that the M2 was based on the A15 and that the M3 would skip the "disappointing" A16 and instead be based on the A17, given the release schedules. However, what if the A17 is actually just the A16 that was supposed to launch last year, but was pulled late in development because of how power-hungry the GPU/ray tracing was? This would explain why the A17 seems to simply scale with frequency (i.e. no real IPC improvements) and GPU core count.

Whether they managed to get the GPU power demand in check, or just spent the 3nm process efficiency gains on it, can be argued about. But I feel a better predictor of the M3's improvement over the M2 is the A17's improvement over the A15. From that (after a very cursory glance at some Geekbench 6 scores), I would say the M3 single-core score is going to be 20-25% faster than the M2's.
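
A back-of-the-envelope version of that projection, with placeholder GB6 scores chosen only to illustrate the method (not confirmed numbers):

```python
# Sketch of the reasoning above: scale the M2's single-core score by
# the A17/A15 generational ratio. All scores are rough placeholders
# picked to land in the quoted 20-25% range, not measured results.
a15_sc = 2350  # placeholder A15 GB6 single-core score
a17_sc = 2900  # placeholder early-A17 GB6 single-core score
m2_sc = 2600   # placeholder M2 GB6 single-core score

ratio = a17_sc / a15_sc                  # ~1.23 with these numbers
m3_sc_estimate = m2_sc * ratio
print(f"Projected M3 SC: ~{m3_sc_estimate:.0f} ({(ratio - 1) * 100:+.0f}%)")
```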
 

cbum

macrumors member
Jun 16, 2015
57
42
Baltimore
Quote: According to Apple, these (M3) transistors are so small that some of their elements are only 12 silicon atoms wide…

So by 2036 Apple will be at the Top Quark + Higgs Boson quantum level.
Personally, I'm looking forward to finally reaching negative widths, taking advantage of spooky interactions...
 

dmccloud

macrumors 68040
Sep 7, 2009
3,142
1,899
Anchorage, AK
Quote (kiranmk2): …what if the A17 is actually just the A16 that was supposed to launch last year, but was pulled late in development because of how power-hungry the GPU/ray tracing was? …

The A16 was still a 5nm design and the A17 Pro is a 3nm design, so it's highly doubtful the A16 was originally designed for 3nm. Regarding the concerns over IPC compared to the A16: switching to a new process often means delaying internal improvements that might affect IPC or other performance aspects. This is where Intel's "tick-tock" approach came into play at various times over the years.
 

scottrichardson

macrumors 6502a
Original poster
Jul 10, 2007
716
293
Ulladulla, NSW Australia
Some key takeaways, the A17...

- has SUBSTANTIALLY MORE transistors: 19 billion, 19% more than the A16
- runs at slightly LOWER power: ~7.5 watts versus ~8.5 watts on the A16 (to be confirmed)
- increases clock speeds on the P-cores by 10%; we don't know if the E-cores also got a boost
- P-cores have wider decode and execution units and improved branch prediction
- GPU cores have improved efficiency, mesh shaders, and dedicated hardware ray-tracing silicon
- has an additional GPU core, for a total of 6 versus 5 in the A16
- doubles the performance of the Neural Engine to 35 TOPS
- adds AV1 decode support
- adds a USB-C controller

So, if one is to take an objective look at that:

✅ higher clock speed
✅ 19% more transistors
✅ same or lower power consumption
✅ extra GPU core
✅ dedicated ray-tracing hardware
✅ 2x faster Neural Engine
✅ AV1 decoder
✅ USB-C

This is a SUBSTANTIAL upgrade. It's easy to get hung up on the ~10% speed bump, but there's a lot more going on here. That 10% is really only the P-core performance. What we need to see are real-world results once you factor in workloads that utilise the Neural Engine, as well as instances where AV1 is being decoded. Tasks bound to the Neural Engine will be TWICE as fast. That's SIGNIFICANT.

I would say Apple has utilised ALL of the purported benefits of TSMC's 3nm process in a very balanced way.

✅ more efficient = SAME battery life with 19% more hardware, and 10% speed increase

It could have been balanced differently. If they had kept the GPU at 5 cores, we'd probably have gained some battery life and consumed less power, assuming the base design of the GPU is largely the same, with the ADDITION of mesh shaders and ray-tracing, which only come into play when those features are engaged.

Or they could have pushed the clock slightly higher while dropping back to 5 GPU cores, which might have pushed the performance gain to, say, 12%, but we'd be missing a GPU core and battery life would remain the same.

It's all trade-offs, right?

I guess where most people are disappointed is that the apparent architectural changes to the P-cores have not resulted in substantial clock-for-clock (IPC) improvements. Someone smarter than I am will be able to explain how widening the decode/execution units affects IPC versus clock speed.

By the looks of things, the P-cores may be very similar to the A16 - which is not a bad thing. Moving to a new process node can introduce its own bugs and issues, so keeping the main logic of the CPU cores the same means less to have to fix. Instead, using the new node to add additional silicon to other areas while ramping clock speed is a safer bet. It's likely then, that the A18 may have some further architectural changes to the processor cores over A17.

How does this relate to the M3 generation?

So I guess the big question is: is this A17 the precursor to M3? In all likelihood, yes.

Given that we know the A16 was meant to ship with the ray-tracing GPU, AND considering the M1 was based on the A14 and the M2 on the A15, one might ASSUME the M3 is going to be based on the A16. HOWEVER, the A17 is largely what the A16 was meant to be, ray-tracing included. So one can conclude that the M3 will have feature parity with the A17.

The bigger question to ask is what DIFFERENCES will there be between the M3 Mac chips and the A17.

What differences were there between an M1 and an A14? What differences were there between an M2 and an A15?

Ignoring core-counts...

Clocks will be faster on the M3 than on the A17, that's almost a certainty. Something in the 3.9-4.0GHz range is a realistic estimate.

Will memory subsystem be different?

Lots to discuss!
 

Chuckeee

macrumors 68040
Aug 18, 2023
3,064
8,724
Southern California
Quote (scottrichardson): This is a SUBSTANTIAL upgrade. It's easy to get hung up on the ~10% speed bump, but there's a lot more going on here. …

All that AND you can use it to make phone calls too!!!
 

Confused-User

macrumors 6502a
Oct 14, 2014
852
987
The surprising numbers and other info we're seeing are leading me to wonder if all our expectations are wrong.

While these aren't exactly the same CPU cores as the A16's, they aren't very different. Assume the A17 has a GB6 score of 2900 and clocks at 3778MHz, while the A16 scores 2650 at 3460MHz. Then the isoclock performance boost is (2900/3778) / (2650/3460) = 0.7676 / 0.7659, or... essentially nothing. A meaningless fraction of a percent.
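
Spelling that isoclock comparison out in code, with the same assumed scores and clocks as above:

```python
# Isoclock (points-per-MHz) comparison using the assumed numbers above.
a17_score, a17_mhz = 2900, 3778
a16_score, a16_mhz = 2650, 3460

a17_per_mhz = a17_score / a17_mhz  # ~0.7676 points per MHz
a16_per_mhz = a16_score / a16_mhz  # ~0.7659 points per MHz

uplift = a17_per_mhz / a16_per_mhz - 1
print(f"Isoclock uplift: {uplift * 100:+.2f}%")  # ~+0.2%, i.e. noise
```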

What's happening here? Even for bad old Intel this would be an embarrassment. At least, if performance was a major goal. But maybe it wasn't, at least for CPUs, this year.

And then there's the GPUs. No performance uplift at all- the extra 20% comes from the sixth "core". Of course they're rearchitected to support RT, which is a big deal, if less than critical for a phone. (This assumes the same clock. We don't know that yet for sure, and it'll be interesting to see.)

So maybe what we're seeing here is the original A16-N3 design, which they had to shelve when they realized N3 wasn't going to be ready in time: built for N3 but with only a few other changes. Perhaps they decided that N3B is a dead-end process and they'd rather spend their engineering resources on a process that has legs (N3E and follow-ons). For the A17 CPUs, the clock gains from N3B, plus some tiny improvements in branch prediction and a few small bits widened here and there (rather than the significantly wider arch we figured on), are good enough, since no processor existing or expected in the next year can likely match even the A16.

This would dovetail well with the disappointing and (to me at least) surprising rumor of no M3s shipping this year. Maybe they put all their CPU efforts into an N3E core, which will ship next year in the M3 as their first N3E chip. It would then follow later in the A18 - perhaps even with some modest adjustments.

Of course this would be a huge turnaround - we're used to the phone SoC setting the pace, and everything else using the building blocks from that chip. (This is what I meant about our expectations being wrong.) But that doesn't have to be the case, and in fact it tends to be a headache for marketing: "Why is my $15K Mac Pro (or $6K Mac Studio) slower than my $1K iphone??"... yes, single-core of course. And if future cores shipped first on Macs, the lower initial volume (compared to iPhones) would have other advantages as well.

It will be interesting to see the M3 when it finally ships. It's been an *intensely* frustrating year so far...
 

name99

macrumors 68020
Jun 21, 2004
2,410
2,318
Quote (scottrichardson): I guess where most people are disappointed is that the apparent architectural changes to the P-cores have not resulted in substantial clock-for-clock (IPC) improvements. …
This is a nicely balanced article. In terms of CPU IPC and micro-architecture (my primary interest) I think one could make two related claims:

- designing CPUs gets harder as you get closer to the leading edge. How do you deal with this? One possibility is that (at least for anything risky) you duplicate the functionality, old-style and new-style, and hide the new style behind "chicken bits" (i.e. bits the OS can flip once the new style is known to be safe). It's possible that there's a lot of new material hidden behind chicken bits, and we don't see its value either because it doesn't work, or because the new-style implementation was sized to match old-style performance, so we could test it but no more than that. Depending on the pipeline, some of this may even be fixed/scaled up and working well on M3 or Pro/Max? Certainly it would fit Apple's general "never do two things when you can reuse something" philosophy to use A chips *also* as a test vector for fancy functionality to be added to M chips.

(What sort of functionality do I mean? There's a bunch of stuff related to the high end, like virtualization, handling large pages, a cache protocol that scales to very large designs, etc. This stuff is HARD, and spinning multiple test chips is expensive. But if Apple can get basic testing done on A and M chips, where it doesn't matter if it fails [A] or has poor scaling [M] that's a real win.)

- We all knew that N3 was not going to offer much in terms of scaling, because while the transistors are smaller, the wiring is not shrinking as much. Backside power delivery will offer a one-time help for that, with slow improvement after that as more functionality (clocks, after power) moves to the backside. But wiring is going to be an ongoing problem, only getting worse as the transistors get smaller (forksheet, then CFET). It may be necessary to rethink a lot of elements of design given this fact.
One reason Apple did so well once they got started is that they aggressively based their designs on the fact that transistors and wiring were getting cheaper while frequency was not improving as much. They are surely more aware than I am of how this balance changes going forward (transistors continuing to get cheaper while wires are not), and design should adapt accordingly. Perhaps the A17 (maybe even starting with the A16) is a gradual redesign to prepare for this future?

Intel and AMD haven't yet had to confront this; their transistor density is a third or so of Apple's, so their wiring is much less dense. Of course the price they pay for that is larger and much hotter chips... Are they adapting their designs appropriately? Who knows?
nVidia is the other interesting case, and more like Apple in terms of seeing the layout of the future and adapting as necessary...
 

scottrichardson

macrumors 6502a
Original poster
Jul 10, 2007
716
293
Ulladulla, NSW Australia
Quote (name99): One possibility is that (at least for anything risky) you duplicate the functionality, old-style and new-style, and hide the new style behind "chicken bits"… Certainly it would fit Apple's general "never do two things when you can reuse something" philosophy to use A chips *also* as a test vector for fancy functionality to be added to M chips.

This is super-interesting. Is there any evidence of such activity on previous M1 and M2, or A14-A16, silicon?
 

MayaUser

macrumors 68040
Nov 22, 2021
3,177
7,196
Quote (Chuckeee): All that AND you can use it to make phone calls too!!!
Here, we care about the M3... iPhones are fast enough, but for the Mac everything matters; too fast will never be fast enough. So the A17 is a baseline for understanding what to expect from the rest of the M3 family.
I think this is a great update: all that extra hardware without drawing more power.
Even the iPad Pro and MacBook Air with M3 will get an over-15% CPU boost, if the M3 is based on the A17 and not the A16.
 
Last edited:

sack_peak

Suspended
Sep 3, 2023
1,020
959
Quote (scottrichardson): Some key takeaways, the A17... This is a SUBSTANTIAL upgrade. It's easy to get hung up on the ~10% speed bump, but there's a lot more going on here. …
You forgot to mention that the iPhone 15 lineup weighs significantly less than the iPhone 14 lineup while being slightly smaller in volume, at the same battery life.

Apple could have maintained the previous weight and volume and provided more battery life instead.

Personally, I'd prefer more battery life, approaching that of a 2000 Nokia 3310, at:

- the same weight & volume as the iPhone 14 lineup
- the A17 Pro's current improved performance per watt
- the A17 Pro's current improved raw performance
- the A17 Pro's current improved power consumption

A contributing factor to the 2014 iPhone 6 & 6 Plus "Bendgate" was the design focus on thinner and lighter devices.
 
Last edited:

jeanlain

macrumors 68020
Mar 14, 2009
2,459
953
Quote (leman): The following chart, created by our own @Andropov, together with data posted by @name99 earlier, demonstrates that A-series performance shows a clear linear trend over time.
Which means that the percentage improvement over the previous generation is decreasing.
 