
Retskrad

macrumors regular
Apr 1, 2022
200
672
The current Apple division is riding on the coattails of the microarchitecture built by the pre-M1 guys, who have since left Apple. Those guys literally changed computing. Starting from 2020, it has given Apple a 5-year lead over its competitors when it comes to efficiency.
 

Boil

macrumors 68040
Oct 23, 2018
3,478
3,173
Stargate Command
  • M3 Extreme
  • 4.20GHz clock speeds
  • 64-core CPU (48P/16E)
  • 160-core GPU (w/hardware ray-tracing)
  • 64-core Neural Engine
  • 960GB ECC LPDDR5X RAM (sixteen 64GB chips)
  • 2TB/s UMA bandwidth
  • 4200 SC
  • 69000 MC
  • Nice
  • ;^p
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
If the A17 CPU and GPU cores consume more power than the CPU and GPU cores of previous generations, could the A17-based M3 MBA and iPad Pro have thermal issues in everyday tasks?
 

Chuckeee

macrumors 68040
Aug 18, 2023
3,065
8,730
Southern California
Video codec questions: The A17 includes a hardware video decoder. Any chance the M3 machines could include a hardware AV1 encoder? Maybe even more than just a single channel?

BTW, does anyone have any additional details on how the hardware video decoders/encoders are implemented? Are they integral to the SoC? The only ones I worked with (many years ago) were custom ASICs; I would be very shocked if that is the current approach.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
BTW, does anyone have any additional details on how the hardware video decoders/encoders are implemented? Are they integral to the SoC? The only ones I worked with (many years ago) were custom ASICs; I would be very shocked if that is the current approach.

AFAIK, that’s exactly how it works, just that the module is integrated into the SoC itself. It’s an additional specialized coprocessor.
 

Moka Akashiya

macrumors member
Nov 19, 2020
85
219
If the A17 CPU and GPU cores consume more power than the CPU and GPU cores of previous generations, could the A17-based M3 MBA and iPad Pro have thermal issues in everyday tasks?
If the Geekerwan review is correct, and if the process node is not updated for the M3 chips, they can consume the same power at the same clock speed. This means you can get higher temperatures and less stable performance in demanding tasks like games, 3D rendering and code compilation, plus slightly less battery life and slightly more battery degradation with this "overclocking" approach, if 3nm really does not provide an efficiency boost. The M2 MBA's max CPU temperature is already sad; I don't think pushing to 109°C before throttling is good for the nearby internal components in the long term.
 

MayaUser

macrumors 68040
Nov 22, 2021
3,178
7,200
Result Screenshot 2023-09-24 at 10.54.51.png
 
  • Like
Reactions: scottrichardson

Icelus

macrumors 6502
Nov 3, 2018
422
578
The A17 E-core adds a third SIMD unit – 384-bit wide SIMD execution in an efficiency core! Looks like it handles most of the usual things, including FADD, but not multiplies (MUL/FMUL/PMUL/SQDMULH). I'm not sure about 3-input operations – they benchmark at 2-per-cycle, but more testing is required to determine if that's a bottleneck in a single unit, or elsewhere.

 
  • Like
Reactions: Xiao_Xi

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
A17 Pro microarchitecture according to Geekerwan:
1695543749871.png


A17 Pro microarchitecture according to Dougall:
Quick first impressions of the new P-core in the A17 Pro:
* 9 wide decode/frontend (up from 8), for 9-per-cycle MOVs (register and immediate) and NOPs.
* 8 integer units (up from 6), four of which can handle flag operations (up from 3).
* 6-per-cycle ADR (up from 4).
* Load/store/SIMD throughput seems unchanged.
* No sign of MTE or SME.
* 2-per-cycle FCMP throughput (up from 1)
* 3 (untaken) branches per cycle (up from 2)
* I also see a 0.25-cycle latency reduction on some floating-point-adds (including FMAs). More investigation needed – latency should be an integer – could be a testing error, or an existing optimisation revealed by the frontend changes. But this might imply a 1-in-4 chance of executing with 1c lower latency.
* ~320 coalesced-entry ROB (vs A15: ~293, A16: ~270)
* ~163 entry integer scheduler, with two 12-entry non-scheduling queues
* 60 entry load/store scheduler, with 10-entry non-scheduling queue
* 160 entry FP/SIMD scheduler (4x40), with 14-entry non-scheduling queue
The A17 E-core adds a third SIMD unit – 384-bit wide SIMD execution in an efficiency core! Looks like it handles most of the usual things, including FADD, but not multiplies (MUL/FMUL/PMUL/SQDMULH). I'm not sure about 3-input operations – they benchmark at 2-per-cycle, but more testing is required to determine if that's a bottleneck in a single unit, or elsewhere.

Some other changes (doubled SIMD LDP (S/D) throughput, doubled SIMD->GPR throughput), but that's the most obvious one.

IPC increase according to Geekerwan:
1695547312919.png


Dougall's comment on the IPC increase:
With the clock speed increase, the IPC gain looks bizarrely slim: 33% more integer units turns into ~3.5% more IPC.

I suspect there are two issues at play. The first is that increasing clock speed decreases IPC. Cache-misses take more cycles, and there's less per-cycle memory bandwidth. The other is that A14-A16 were already very wide, and a lot of code was already latency-bound, so the extra integer units sometimes just don't help.
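
The first point in Dougall's comment (a higher clock lowers IPC because a cache miss costs the same time but more cycles) can be put into a rough sketch. This is only a toy additive-stall model: the function name `ipc`, the core CPI, the miss rate, the 100 ns miss latency and both clock values are hypothetical numbers picked for illustration, not measurements of any Apple core.

```python
def ipc(core_cpi, misses_per_kilo_instr, miss_latency_ns, clock_ghz):
    """Instructions per cycle under a simple additive stall model.

    All parameters are illustrative assumptions, not measured values.
    """
    # A miss costs a fixed time in ns, so its cost in cycles grows with clock.
    stall_cycles_per_miss = miss_latency_ns * clock_ghz
    stall_cpi = misses_per_kilo_instr / 1000.0 * stall_cycles_per_miss
    return 1.0 / (core_cpi + stall_cpi)


# Hypothetical workload: identical core and miss rate, only the clock changes.
slow = ipc(core_cpi=0.20, misses_per_kilo_instr=1.0,
           miss_latency_ns=100.0, clock_ghz=3.4)
fast = ipc(core_cpi=0.20, misses_per_kilo_instr=1.0,
           miss_latency_ns=100.0, clock_ghz=3.8)

print(f"IPC at 3.4 GHz: {slow:.3f}")
print(f"IPC at 3.8 GHz: {fast:.3f}")
print(f"Throughput ratio (IPC x clock): {fast * 3.8 / (slow * 3.4):.3f}")
```

In this model the faster clock still wins on overall throughput, but its IPC is lower, so a wider backend has to first win back that loss before any IPC gain shows up in a benchmark.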
 

PgR7

Cancelled
Sep 24, 2023
45
13
Isn't it a bit mean of Apple to increase the peak power consumption of the chip, knowing that Geekbench is designed not to throttle the chip? Each test lasts a few seconds and then lets the chip cool for a few seconds before the next test.

You can get extra performance for free by doing that: the chip peaks at 14W, but the thermal dissipation of the chassis is 4.2W for sustained loads, while the A16 peaks at 9W and the 14 PM dissipates 4.4W. That's where a lot of the performance gains in this generation come from. That's why in some tests that run longer the differences are not big, like AnTuTu and 3DMark Wild Life Extreme Unlimited.

It's like having a car with a 100 horsepower engine where every 5 minutes you can press a button and get 10 seconds of 200 extra horsepower. Then you boast online about the insane acceleration of your car while pressing that button, and the rest of the time you have a 100 hp car.

PS: In the Max Tech comparison video of the 14 PM vs 15 PM, in the 3DMark test, the graph shows the same performance after about the 50-second mark.
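
The peak-versus-sustained argument above can be put into rough numbers. The sketch below uses the 14W/4.2W and 9W/4.4W figures quoted in this post; the 0.5W power draw between bursts and both function names are assumptions for illustration only, and the model is a steady-state average that ignores the heat capacity of the chassis (which lets a real phone hold peak power for a while before throttling).

```python
def average_power(peak_w, idle_w, burst_s, rest_s):
    """Time-averaged power of a repeating burst/rest cycle, in watts."""
    return (peak_w * burst_s + idle_w * rest_s) / (burst_s + rest_s)


def max_burst_fraction(peak_w, idle_w, sustained_w):
    """Largest fraction of time at peak power whose average stays
    within the sustained dissipation budget (steady-state model)."""
    if peak_w <= sustained_w:
        return 1.0  # the chip can run at peak indefinitely
    return max(0.0, (sustained_w - idle_w) / (peak_w - idle_w))


IDLE_W = 0.5  # hypothetical power between bursts, not from the thread

a17 = max_burst_fraction(peak_w=14.0, idle_w=IDLE_W, sustained_w=4.2)
a16 = max_burst_fraction(peak_w=9.0, idle_w=IDLE_W, sustained_w=4.4)

print(f"A17 Pro / 15 PM: at peak up to {a17:.0%} of the time")
print(f"A16 / 14 PM:     at peak up to {a16:.0%} of the time")
```

Under these assumptions the newer chip can spend a smaller share of its time at peak than the older one, which fits with short benchmarks showing larger gains than long ones.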
 

scottrichardson

macrumors 6502a
Original poster
Jul 10, 2007
716
293
Ulladulla, NSW Australia
Passmark CPU benchmark ranks. I think it is phenomenal that the A17 "mobile phone" chip is sitting in 5th place in the entire world for single-threaded performance. With a boost to clock speeds likely to occur for the M3 over the A17, we could see the M3 right up there as the fastest single-threaded chip, no?

Regardless, Apple is RIGHT there guys with the fastest chips on the planet, including the 14900KF.

Screenshot 2023-09-24 at 6.17.50 pm.png
 

Andropov

macrumors 6502a
May 3, 2012
746
990
Spain
Isn't it a bit mean of Apple to increase the peak power consumption of the chip, knowing that Geekbench is designed not to throttle the chip? Each test lasts a few seconds and then lets the chip cool for a few seconds before the next test.

You can get extra performance for free by doing that: the chip peaks at 14W, but the thermal dissipation of the chassis is 4.2W for sustained loads, while the A16 peaks at 9W and the 14 PM dissipates 4.4W. That's where a lot of the performance gains in this generation come from. That's why in some tests that run longer the differences are not big, like AnTuTu and 3DMark Wild Life Extreme Unlimited.
The thing is, most tasks you do on a phone (other than gaming) are not sustained. If you're editing photos on a phone, for example, it's more likely that you're applying filters to a single image rather than doing a batch image processing job for thousands of them. People usually don't use phones that way.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
Isn't it a bit mean of Apple to increase the peak power consumption of the chip, knowing that Geekbench is designed not to throttle the chip? Each test lasts a few seconds and then lets the chip cool for a few seconds before the next test.

You can get extra performance for free by doing that: the chip peaks at 14W, but the thermal dissipation of the chassis is 4.2W for sustained loads, while the A16 peaks at 9W and the 14 PM dissipates 4.4W. That's where a lot of the performance gains in this generation come from. That's why in some tests that run longer the differences are not big, like AnTuTu and 3DMark Wild Life Extreme Unlimited.

It's like having a car with a 100 horsepower engine where every 5 minutes you can press a button and get 10 seconds of 200 extra horsepower. Then you boast online about the insane acceleration of your car while pressing that button, and the rest of the time you have a 100 hp car.

PS: In the Max Tech comparison video of the 14 PM vs 15 PM, in the 3DMark test, the graph shows the same performance after about the 50-second mark.

That’s pretty much why I think that this core was designed for desktop use.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
Passmark CPU benchmark ranks. I think it is phenomenal that the A17 "mobile phone" chip is sitting in 5th place in the entire world for single-threaded performance. With a boost to clock speeds likely to occur for the M3 over the A17, we could see the M3 right up there as the fastest single-threaded chip, no?

Passmark is a famously bad microbenchmark that doesn’t predict real-world performance very well. The A17 Pro is probably particularly favored thanks to its wider int backend. That doesn’t change the fact that the A17 is very impressive, of course.
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
The thing is, most tasks you do on a phone (other than gaming) are not sustained. If you're editing photos on a phone, for example, it's more likely that you're applying filters to a single image rather than doing a batch image processing job for thousands of them. People usually don't use phones that way.
So in other words, Apple’s engineers know what they are doing?

Maybe folks should cool down and wait for real-world usage feedback?

Maybe it is as @leman posited: most likely the A17 Pro is laying the foundation for the M3, geared towards notebook and desktop performance.
 

PgR7

Cancelled
Sep 24, 2023
45
13
Passmark CPU benchmark ranks. I think it is phenomenal that the A17 "mobile phone" chip is sitting in 5th place in the entire world for single-threaded performance. With a boost to clock speeds likely to occur for the M3 over the A17, we could see the M3 right up there as the fastest single-threaded chip, no?

Regardless, Apple is RIGHT there guys with the fastest chips on the planet, including the 14900KF.

View attachment 2279145
It is for the first 10 seconds 😂
 

PgR7

Cancelled
Sep 24, 2023
45
13
The thing is, most tasks you do on a phone (other than gaming) are not sustained. If you're editing photos on a phone, for example, it's more likely that you're applying filters to a single image rather than doing a batch image processing job for thousands of them. People usually don't use phones that way.
Well, gaming is a sustained workload; it's a use case where you want the performance. I would say people charging the phone and using it at the same time can get thermal throttling, and the battery will stop charging. I would say it will also happen when recording video in summer. For someone who is constantly loading tweets, Instagram posts and TikToks, just constantly exceeding 4.2W will end up causing thermal throttling.
 

PgR7

Cancelled
Sep 24, 2023
45
13
It becomes a bit less funny when you put it in a laptop. Then suddenly Apple has a passively cooled laptop that can go toe to toe with the fastest desktop CPUs on the planet.

“Lost all their engineers”, they say.
A passively cooled M2 laptop only ties the desktop CPU in single-core usage; in multi-core the desktop CPU smokes it, and on sustained loads even more so.
 