
senttoschool

macrumors 68030
Original poster
Nov 2, 2017
2,571
5,325
  1. M3 NPU is only 18 TOPS vs 35 TOPS on A17 Pro
  2. M3 has "Dynamic Cache" in hardware but A17 Pro does not
  3. No mentions of wider CPU cores like A17 Pro.
  4. M3's P core is "only 30% faster" than M1. M1 Max GB6 = 2375 * 1.3 = 3087.5. A17 Pro is ~2950. M3's P core is much slower than expected if it's based on A17 Pro.
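For anyone who wants to check the arithmetic in point 4, here is a minimal Python sketch. The scores are the ones quoted in this post (2375 for M1 Max, roughly 2950 for A17 Pro); the variable names are only illustrative, and the result is a projection from Apple's marketing claim, not a measured M3 score.

```python
# GB6 single-thread figures as quoted in the post (not re-measured here).
m1_max_gb6_st = 2375    # M1 Max score quoted above
a17_pro_gb6_st = 2950   # approximate A17 Pro score quoted above
claimed_uplift = 1.30   # Apple's "30% faster than M1" claim

# Projection: apply the claimed uplift to the M1 Max score.
projected_m3 = m1_max_gb6_st * claimed_uplift

# How far the projection sits above the A17 Pro score, as a fraction.
lead_over_a17 = projected_m3 / a17_pro_gb6_st - 1

print(f"Projected M3 ST score: {projected_m3:.1f}")
print(f"Projected lead over A17 Pro: {lead_over_a17:.1%}")
```

On these numbers the projected M3 lands only a few percent above the phone chip, which is the gap the post is pointing at.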

 

senttoschool

macrumors 68030
Original poster
Nov 2, 2017
2,571
5,325
It actually said wider cores:
View attachment 2304722

That vague graph is not helpful because we don't know what the metric is. In GB6, 15% faster than M2 is more than 30% faster than M1.
I did not see this. That makes it slightly more likely that it's somehow related to A17 Pro. But yes, I also noticed that 15% faster than M2 is slightly more than 30% faster than M1. Still less than expected.

I expected 26% faster than M2. A15 --> A17 Pro = 1.26x faster ST GB6. Therefore, M3 should be roughly 26% faster ST over M2 in ST GB6. It should be around 3500 score. Instead, M3 is closer to 3100. A significant difference in expectation.
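A rough sketch of the two comparisons in this post, using only figures quoted in the thread. It assumes, purely for illustration, that Apple's "30% over M1" and "15% over M2" claims refer to the same workload, which Apple has not stated.

```python
# Apple's two marketing claims, as quoted in this thread.
m3_over_m1 = 1.30   # "30% faster than M1"
m3_over_m2 = 1.15   # "15% faster than M2"

# ASSUMPTION: both claims describe the same benchmark.
# Under that assumption, M2 would have to be this much faster than M1.
implied_m2_over_m1 = m3_over_m1 / m3_over_m2

# The poster's expectation versus the projection from Apple's claim.
expected_m3_score = 3500    # from A15 -> A17 Pro = 1.26x, per the post
projected_m3_score = 3100   # rounded projection, per the post
shortfall = 1 - projected_m3_score / expected_m3_score

print(f"Implied M2 over M1: {implied_m2_over_m1:.3f}x")
print(f"Projection falls short of expectation by {shortfall:.1%}")
```

If M2 is more than about 13% faster than M1 in GB6, then "15% over M2" is the stronger of the two claims, which is the point made above.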
 

Gnattu

macrumors 65816
Sep 18, 2020
1,027
1,401
I did not see this. That makes it slightly more likely that it's somehow related to A17 Pro. But yes, I also noticed that 15% faster than M2 is slightly more than 30% faster than M1. Still less than expected.

I expected 26% faster than M2. A15 --> A17 Pro = 1.26x faster ST GB6. Therefore, M3 should be roughly 26% faster ST over M2 in ST GB6. It should be around 3500 score. Instead, M3 is closer to 3100. A significant difference in expectation.
A15 to A17 Pro includes a large frequency bump, but the M2 series already runs at a very high clock, so it is hard to push it further.
 

deconstruct60

macrumors G5
Mar 10, 2009
12,311
3,902
  1. M3 NPU is only 18 TOPS vs 35 TOPS on A17 Pro


Lots of vendors play "fast and loose" with TOPS figures. If you compare generation n's FP16 TOPS to generation n+1's FP8 TOPS, you are mostly comparing apples to oranges.

Qualcomm's super-duper numbers are mostly based on limbo-ing down to INT4.

The relevant issue is: trillion operations of WHAT? FP16, BF16, INT8, and INT4 are four substantially different things. Apple doesn't say in the write-ups I've seen. An NPU that is 60% faster gen-over-gen on FP16 is better than hand-waving by dropping down to INT8 or INT4 and getting a 2x. Well, duh: you dumped gobs of resolution, so yeah, it gets easier to count the longer vector as more ops.





 

leman

macrumors Core
Oct 14, 2008
19,302
19,285
It’s the same core. I know this is a rumors forum, but let’s not go crazy.

  1. M3 NPU is only 18 TOPS vs 35 TOPS on A17 Pro

Depends on how one counts. I’m sure it’s the same NPU; maybe the marketing department did an oopsie.

  1. M3 has "Dynamic Cache" in hardware but A17 Pro does not

Yes it does. They mentioned better support for complex shaders on A17. It’s the same thing.

  1. No mentions of wider CPU cores like A17 Pro.

Yes they did.

  1. M3's P core is "only 30% faster" than M1. M1 Max GB6 = 2375 * 1.3 = 3087.5. A17 Pro is ~2950. M3's P core is much slower than expected if it's based on A17 Pro.

Yes, M3 is likely clocked at 3.9 GHz. That’s lower than expected, sure, and it refutes my hypothesis of A17 cores having better frequency scaling. It’s still the same core though.
 

senttoschool

macrumors 68030
Original poster
Nov 2, 2017
2,571
5,325
Depends on how one counts. I’m sure it’s the same NPU; maybe the marketing department did an oopsie.
You really think so? What evidence do you have?

There was no mention of the NPU in the video as far as I remember. I'm not going to believe that Apple erroneously forgot to mention the NPU in the video and put the wrong TOPS figure on its marketing website. The 18 TOPS figure is still live on the website.

Yes they did.
No need to quote that and mention it again. @Gnattu already mentioned it and I crossed it out long ago.

What's going on @leman? Working OT on M3 speculation lately? I notice that you're much more defensive lately.
 

leman

macrumors Core
Oct 14, 2008
19,302
19,285
You really think so? What evidence do you have?

Well, look, it’s like this. Either Apple has backported the M2 ANE to 3nm and slightly overclocked it, or they use the already existing new 3nm ANE found in the A17. Which do you think is more likely?

Besides, the quoted throughput for the M3 ANE is almost exactly half of the A17 ANE. This makes me think that they are using different metrics (it would be an odd thing to do but nothing that didn’t happen before). Maybe one is INT8 and another FP16. Or FP16 and FP32.
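To illustrate the "different metrics" idea, here is a minimal sketch. The 18 and 35 TOPS figures are the ones quoted in this thread. The rule that throughput doubles each time the operand width halves is an assumption for illustration only, as is the guess that the A17 Pro figure is INT8 and the M3 figure FP16; Apple has confirmed neither. The function name `normalize_tops` is made up for this example.

```python
# Marketing figures quoted in the thread.
m3_tops = 18
a17_pro_tops = 35

# Ratio of the two quoted figures (close to 2x).
ratio = a17_pro_tops / m3_tops

# ASSUMPTION: ops per second scale inversely with operand width,
# i.e. halving the width doubles the quoted TOPS.
BITS = {"FP32": 32, "FP16": 16, "INT8": 8, "INT4": 4}

def normalize_tops(tops, quoted_precision, target_precision):
    """Rescale a TOPS figure between precisions under the assumption above."""
    return tops * BITS[quoted_precision] / BITS[target_precision]

# HYPOTHETICAL: treat the A17 Pro figure as INT8 and convert it to FP16.
a17_as_fp16 = normalize_tops(a17_pro_tops, "INT8", "FP16")

print(f"A17 Pro / M3 quoted ratio: {ratio:.2f}")
print(f"A17 Pro figure re-expressed as FP16: {a17_as_fp16} TOPS vs M3's {m3_tops}")
```

Under that guess the two engines come out nearly identical, but this only shows the numbers are consistent with a precision mismatch, not that one occurred.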

The conclusive evidence will be provided by comparing the die shots. I’m not trained in this so I can’t do this. Hopefully more knowledgeable people will chime in.

I notice that you're much more defensive lately.

I love speculating like any other person here (and you know it), but lately there has been too much arbitrary conjecture for my taste. I think speculation should be grounded in reality to some degree. Like, why would one expect 30% perf improvement at iso power if TSMC was very clear that it will be 15% at most?

That said, I am a bit disappointed with the M3 announcement. I was hoping for higher clocks and a more substantial redesign (finally SVE support, for example). My theory about better scaling of the N3 cores at higher frequencies is also pretty much out of the window. And I’m not happy at all about the bean-counter min-maxing approach they took with the M3 family; I think they should have taken that $20-30 hit per system and delivered a more consistent lineup (and I say it as an Apple shareholder). In short, I think there are plenty of things to criticize with this announcement (as well as plenty of things to be excited about). We don’t have to make stuff up.
 

senttoschool

macrumors 68030
Original poster
Nov 2, 2017
2,571
5,325
Well, look, it’s like this. Either Apple has backported the M2 ANE to 3nm and slightly overclocked it, or they use the already existing new 3nm ANE found in the A17. Which do you think is more likely?

Besides, the quoted throughput for the M3 ANE is almost exactly half of the A17 ANE. This makes me think that they are using different metrics (it would be an odd thing to do but nothing that didn’t happen before). Maybe one is INT8 and another FP16. Or FP16 and FP32.
I suppose they could have deemed the NPU not important this generation and, in order to increase yields, decided to chop off half of it.
 

senttoschool

macrumors 68030
Original poster
Nov 2, 2017
2,571
5,325
It's a free world.
At least one very qualified person is saying it's a different GPU arch.

So now we have 3 pieces of evidence:

1. Unexpectedly much slower CPU ST if it's the same arch as A17 Pro
2. Different GPU features
3. Half the NPU of A17 Pro

So you still think it's "nonsense" for at least bringing this topic up?

Occam's razor suggests that Apple's M series has deviated from its A series. Maybe not all of it. But certainly a large chunk of it, it seems.
 

MayaUser

macrumors 68030
Nov 22, 2021
2,869
6,163
At least one very qualified person is saying it's a different GPU arch.

So now we have 3 pieces of evidence:

1. Unexpectedly much slower CPU ST if it's the same arch as A17 Pro
2. Different GPU features
3. Half the NPU of A17 Pro

So you still think it's "nonsense" for at least bringing this topic up?

Occam's razor suggests that Apple's M series has deviated from its A series. Maybe not all of it. But certainly a large chunk of it, it seems.
So if they are not based on A17, then on what?
 

leman

macrumors Core
Oct 14, 2008
19,302
19,285
At least one very qualified person is saying it's a different GPU arch.

I don’t know who this person is. I looked through their twitter history and they are talking about industry trends and market prognosis. It doesn’t seem to me like they are a GPU engineer or programmer.

Don’t get me wrong, maybe they know something I don’t. But I’d rather hear it from the horse’s mouth. When introducing A17 Apple said they have redesigned the GPU to have better efficiency with complex workloads. This is exactly what Dynamic Cache is about. So following Occam’s Razor it makes sense for me to assume that they are the same thing.

As to the rest.

1. Unexpectedly much slower CPU ST if it's the same arch as A17 Pro

I am not sure how you came to this conclusion? We don’t know the clocks nor the benchmark scores for these cores. Physical layout looks identical to die shots of A17. I’d at least wait for some GB results to make these claims.

2. Different GPU features

See above.
3. Half the NPU of A17 Pro

That’s the funny one. We had people identify the ANE in the M3 series. It looks bigger than the A17 component and has a larger block of cache attached to it. It also looks very different from the ANE in M2. I am therefore inclined to assume that Apple was reporting different numbers (e.g. INT8 vs FP16) and that the M3 ANE even has larger buffers to support larger models.
 

altaic

macrumors 6502a
Jan 26, 2004
650
432
I don’t know who this person is. I looked through their twitter history and they are talking about industry trends and market prognosis. It doesn’t seem to me like they are a GPU engineer or programmer.

Don’t get me wrong, maybe they know something I don’t. But I’d rather hear it from the horse’s mouth. When introducing A17 Apple said they have redesigned the GPU to have better efficiency with complex workloads. This is exactly what Dynamic Cache is about. So following Occam’s Razor it makes sense for me to assume that they are the same thing.
The individual GPU cores look identical to me, however the Dynamic Cache functionality could reside as part of the other GPU support logic or perhaps the SLC. So, it's possible it's only part of the M3 family.

I think this is the sort of thing people will have to confirm in software, or alternatively keep an eye out for an update of the Metal feature set tables.
 

Macintosh IIcx

macrumors 6502a
Jul 3, 2014
609
595
Denmark
The individual GPU cores look identical to me, however the Dynamic Cache functionality could reside as part of the other GPU support logic or perhaps the SLC. So, it's possible it's only part of the M3 family.

I think this is the sort of thing people will have to confirm in software, or alternatively keep an eye out for an update of the Metal feature set tables.
I don’t know, I think it is just a marketing thing that they reserved that for the M3 presentation as it probably is more useful for pro apps than gaming on an iPhone. We have to remember that distinction between general public iPhone users and pro app users on the Mac – it requires different marketing even if the arch is the same.
 

leman

macrumors Core
Oct 14, 2008
19,302
19,285
The individual GPU cores look identical to me, however the Dynamic Cache functionality could reside as part of the other GPU support logic or perhaps the SLC. So, it's possible it's only part of the M3 family.

If this Dynamic Caching does what I think it does it will be part of the GPU core/shader scheduler functionality, as it manages resource allocation inside the GPU cores, presumably even as the shader is executed.

I think this is the sort of thing people will have to confirm in software, or alternatively keep an eye out for an update of the Metal feature set tables.

I think testing for this in software might be tricky, as we don't know what other changes have occurred (like bigger register files). With Dynamic Caching one should see better shader occupancy on complex workloads, and this should be visible in the Metal profiler.
 

leman

macrumors Core
Oct 14, 2008
19,302
19,285
I did some tests and can confirm that A17 Pro has Dynamic Caching.

I wrote a shader that uses a lot of local variables on a dynamic conditional path that is never taken. On my M1 Max this has reduced the maximal number of threads per GPU core from 1024 to 448, indicating that the system has allocated the register space for the worst case (and as a consequence, fewer threads can fit into the GPU core memory simultaneously). I also observed a 20% reduction in performance vs. an equivalent shader without the expensive conditional path.

On the A17 Pro, there is no difference: it's always 1024 threads and the performance stays the same whether the expensive path is disabled in code or not. So the register memory is allocated lazily, on demand. This is very impressive work from Apple.
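As a toy model of why worst-case register allocation cuts the thread count, here is a small Python sketch. Only the 1024 and 448 thread counts come from the measurements above. The per-core register budget and the per-thread register counts are hypothetical values picked so the model reproduces those two numbers, and the 32-wide thread grouping is likewise treated as an assumption. This is not how the actual test was written (that was a Metal shader) and not a description of Apple's hardware.

```python
MAX_THREADS_PER_CORE = 1024       # ceiling reported in the post
SIMD_WIDTH = 32                   # ASSUMPTION: threads are resident in groups of 32
REGISTER_SLOTS_PER_CORE = 28_672  # HYPOTHETICAL budget, chosen to match the post

def resident_threads(slots_per_thread):
    """Threads that fit in the register budget, rounded down to whole SIMD groups."""
    fit = REGISTER_SLOTS_PER_CORE // slots_per_thread
    fit -= fit % SIMD_WIDTH
    return min(MAX_THREADS_PER_CORE, fit)

worst_case_slots = 64    # HYPOTHETICAL: demand of the never-taken expensive branch
common_path_slots = 28   # HYPOTHETICAL: demand of the path that actually runs

# Static allocation reserves the worst case for every thread up front.
static_threads = resident_threads(worst_case_slots)

# Lazy (on-demand) allocation only pays for what the running path uses.
lazy_threads = resident_threads(common_path_slots)

print(f"Static allocation: {static_threads} threads "
      f"({static_threads / MAX_THREADS_PER_CORE:.1%} occupancy)")
print(f"Lazy allocation:   {lazy_threads} threads")
```

The point of the sketch is only the shape of the effect: reserving registers for a branch that never executes lowers occupancy, while on-demand allocation keeps it at the ceiling.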

Hope this can put the speculation to rest. A17 and M3 use the same GPU cores.
 

leman

macrumors Core
Oct 14, 2008
19,302
19,285
Hybrid? Its own design? Forked design? Even if it's based on the A17 Pro, it certainly looks like scaling is not the same as M1 and M2 given the ST & NPU speeds.

The only question mark is the NPU, but on the die shots it looks bigger than the one in either M2, A16, or A17, so it's likely more capable.

GPU is the same and has the same features (see my post above). CPU shows the same scaling, looks the same on the die shot, and improvements mentioned by Apple are consistent with improvements for A17. I don't think there is much room to speculate that these cores are different.
 

komuh

macrumors member
May 13, 2023
39
10
Besides, the quoted throughput for the M3 ANE is almost exactly half of the A17 ANE. This makes me think that they are using different metrics (it would be an odd thing to do but nothing that didn’t happen before). Maybe one is INT8 and another FP16. Or FP16 and FP32.
ANE does not support FP32, so you are probably right: it would be INT8 vs FP16/BF16 results. Or maybe they made small changes from the A-series and moved to a more desktop-oriented model with a smaller NPU, with the die space used for extra GPU performance? (Who knows.)

Or they traded theoretical FLOPs for extra cache/memory? Making use of even half of the NPU power is hard right now.
 

prime17569

macrumors regular
May 26, 2021
192
490
I said this in another thread, but I'll repeat it here.

Continuing the trend that started with the first Mac Studio, all M3 Macs have a 15 in their model identifier (e.g. Mac15,x), which matches the iPhone15,x notation used for the iPhone 14 Pro and regular iPhone 15 models that have the A16.

This, and the fact that the M3 series chips didn't get the updated 35 TOPS neural engine from A17 Pro, suggests that the M3 series chips are based on the core designs that were originally intended to go into the A16 before TSMC's N3B node was delayed and A16 had to be backported to 4nm.

The M3's chip ID also supports this. Generally, for a given A chip, the corresponding base M chip has a chip ID number that is increased by two. For example:
A14: T8101 -> M1: T8103
A15: T8110 -> M2: T8112
A16: T8120 -> M3: T8122

A17 is T8130, so a base M-series chip based on it would be expected to be T8132. However, M3 is T8122, suggesting that it is based (at least in part) on A16, which is T8120.
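The "+2" pattern can be written down as a short Python check. The chip IDs are the ones listed in this post; the helper name `expected_m_id` is made up for this example, and the rule itself is just the observed pattern, not anything Apple documents.

```python
# (A-series chip, its ID, M-series chip, its ID) as listed in the post.
KNOWN_PAIRS = [
    ("A14", "T8101", "M1", "T8103"),
    ("A15", "T8110", "M2", "T8112"),
    ("A16", "T8120", "M3", "T8122"),
]

def expected_m_id(a_chip_id):
    """Apply the observed pattern: same prefix letter, numeric part plus two."""
    prefix, number = a_chip_id[0], int(a_chip_id[1:])
    return f"{prefix}{number + 2}"

# Confirm the pattern holds for every listed pair.
for a_name, a_id, m_name, m_id in KNOWN_PAIRS:
    status = "matches" if expected_m_id(a_id) == m_id else "does not match"
    print(f"{a_name} {a_id} -> {m_name} {m_id}: {status} the +2 pattern")

# What an A17-derived base M chip would be called under the same pattern.
a17_id = "T8130"
print(f"A17 {a17_id} -> expected {expected_m_id(a17_id)}, but M3 is T8122")
```

All three listed pairs fit the pattern, with M3 pairing with A16's T8120 rather than A17's T8130.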
 

senttoschool

macrumors 68030
Original poster
Nov 2, 2017
2,571
5,325
I said this in another thread, but I'll repeat it here.



The M3's chip ID also supports this. Generally, for a given A chip, the corresponding base M chip has a chip ID number that is increased by two. For example:
A14: T8101 -> M1: T8103
A15: T8110 -> M2: T8112
A16: T8120 -> M3: T8122

A17 is T8130, so a base M-series chip based on it would be expected to be T8132. However, M3 is T8122, suggesting that it is based (at least in part) on A16, which is T8120.
That would explain the less than stellar ST and NPU.
 

leman

macrumors Core
Oct 14, 2008
19,302
19,285
The M3's chip ID also supports this. Generally, for a given A chip, the corresponding base M chip has a chip ID number that is increased by two. For example:
A14: T8101 -> M1: T8103
A15: T8110 -> M2: T8112
A16: T8120 -> M3: T8122

A17 is T8130, so a base M-series chip based on it would be expected to be T8132. However, M3 is T8122, suggesting that it is based (at least in part) on A16, which is T8120.

Or maybe they are just incrementing ID in order...

At any rate, I hope that people like Dougall Johnson will get their hands on an M3 machine soon enough to clear this up.
 