Are we sure M3 is based on A17 Pro?

senttoschool · Oct 30, 2023

M3 NPU is only 18 TOPS vs 35 TOPS on A17 Pro
M3 has "Dynamic Cache" in hardware but A17 Pro does not
~~No mentions of wider CPU cores like A17 Pro.~~
M3's P core is "only 30% faster" than M1. M1 Max GB6 = 2375 * 1.3 = 3087.5. A17 Pro is ~2950. M3's P core is much slower than expected if it's based on A17 Pro.
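
For anyone who wants to check the arithmetic in the last point, here is a minimal Python sketch. It only uses the figures quoted above (2375 for M1 Max, ~2950 for A17 Pro, and the "30% faster" claim); the function and variable names are made up for illustration, and the result is a projection, not a measured M3 score.

```python
# Figures quoted in the post above; nothing here is a measured M3 result.
M1_MAX_GB6_ST = 2375           # Geekbench 6 single-thread score, M1 Max
A17_PRO_GB6_ST = 2950          # approximate Geekbench 6 single-thread score, A17 Pro
CLAIMED_UPLIFT_OVER_M1 = 1.30  # "30% faster than M1" from the announcement


def project_score(baseline: float, uplift: float) -> float:
    """Scale a baseline score by a claimed speed-up factor."""
    return baseline * uplift


m3_projected = project_score(M1_MAX_GB6_ST, CLAIMED_UPLIFT_OVER_M1)
lead_over_a17 = m3_projected / A17_PRO_GB6_ST - 1

print(f"Projected M3 ST score: {m3_projected:.1f}")
print(f"Projected lead over A17 Pro: {lead_over_a17:.1%}")
```

Under these numbers the projection lands at about 3087.5, roughly 4.7% above the A17 Pro figure.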

Gnattu · Oct 30, 2023

It actually said wider cores:

That vague graph is not helpful because we don't know what the metric is. In GB6, 15% faster than M2 is more than 30% faster than M1.

senttoschool · Oct 30, 2023

Gnattu said:
It actually said wider cores:

That vague graph is not helpful because we don't know what the metric is. In GB6, 15% faster than M2 is more than 30% faster than M1.

I did not see this. That makes it slightly more likely that it's somehow related to A17 Pro. But yes, I also noticed that 15% faster than M2 is slightly more than 30% faster than M1. Still less than expected.

I expected 26% faster than M2. A15 --> A17 Pro = 1.26x faster ST GB6. Therefore, M3 should be roughly 26% faster ST over M2 in ST GB6. It should be around 3500 score. Instead, M3 is closer to 3100. A significant difference in expectation.

Gnattu · Oct 30, 2023

senttoschool said:
I did not see this. That makes it slightly more likely that it's somehow related to A17 Pro. But yes, I also noticed that 15% faster than M2 is slightly more than 30% faster than M1. Still less than expected.

I expected 26% faster than M2. A15 --> A17 Pro = 1.26x faster ST GB6. Therefore, M3 should be roughly 26% faster ST over M2 in ST GB6. It should be around 3500 score. Instead, M3 is closer to 3100. A significant difference in expectation.

A15 to A17 Pro includes a large frequency bump, but the M2 series already runs at a very high clock, so it is hard to push it further.

deconstruct60 · Oct 30, 2023

senttoschool said:
M3 NPU is only 18 TOPS vs 35 TOPS on A17 Pro

There are lots of vendors that play "fast and loose" with TOPS figures. If you compare generation 'n' FP16 TOPS to generation 'n+1' FP8 TOPS, then you are mainly playing apples-to-oranges.

Qualcomm's super-duper numbers are mostly based on limbo-ing down to INT4.

The relevant issue is: trillion operations of WHAT? FP16, BF16, INT8, and INT4 are four substantially different things. Apple doesn't say in the write-ups I've seen. An NPU that is 60% faster gen-over-gen on FP16 is better than hand-waving by dropping down to INT8 or INT4 and getting a 2x. Well, duh: you dumped gobs of resolution, so yeah, it gets easier to count the longer vector as more ops.
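
To make the apples-to-oranges point concrete, here is a rough Python sketch that rescales a quoted TOPS figure to a common operand width. It assumes, purely for illustration, that throughput scales inversely with operand width (so halving the width doubles the op count). Apple has not said which precision either figure refers to, so the INT8 and FP16 assignments below are hypothetical, and `normalise_tops` is a made-up helper name.

```python
# Operand widths in bits for the formats mentioned in this thread.
BITS = {"FP16": 16, "BF16": 16, "INT8": 8, "INT4": 4}


def normalise_tops(tops: float, quoted_format: str, target_format: str = "FP16") -> float:
    """Rescale a TOPS figure to another format.

    Simplifying assumption: ops/s scale inversely with operand width.
    Real hardware does not have to behave this way.
    """
    return tops * BITS[quoted_format] / BITS[target_format]


# Hypothetical reading: the M3 figure is FP16 and the A17 Pro figure is INT8.
m3_fp16 = normalise_tops(18, "FP16")
a17_fp16 = normalise_tops(35, "INT8")

print(f"M3 as FP16-equivalent TOPS: {m3_fp16}")
print(f"A17 Pro as FP16-equivalent TOPS: {a17_fp16}")
```

Under that hypothetical reading the two figures come out nearly equal, which is why the headline numbers alone cannot settle whether the hardware differs.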

leman · Oct 30, 2023

It’s the same core. I know this is a rumors forum, but let’s not go crazy.

senttoschool said:
M3 NPU is only 18 TOPS vs 35 TOPS on A17 Pro

Depends on how one counts. I’m sure it’s the same NPU, maybe marketing department did an uupsie.

senttoschool said:
M3 has "Dynamic Cache" in hardware but A17 Pro does not

Yes it does. They mentioned better support for complex shaders on A17. It’s the same thing.

senttoschool said:
~~No mentions of wider CPU cores like A17 Pro.~~

Yes they did.

senttoschool said:
M3's P core is "only 30% faster" than M1. M1 Max GB6 = 2375 * 1.3 = 3087.5. A17 Pro is ~2950. M3's P core is much slower than expected if it's based on A17 Pro.

Yes, M3 is likely clocked at 3.9 GHz. That’s lower than expected, sure, and refutes my hypothesis of A17 cores having better frequency scaling. It’s still the same core though.

senttoschool · Oct 30, 2023

leman said:
Depends on how one counts. I’m sure it’s the same NPU, maybe marketing department did an uupsie.

You really think so? What evidence do you have?

There was no mention of the NPU in the video as far as I remember. I'm not going to believe that Apple erroneously forgot to mention the NPU in the video and put the wrong TOPS figure on its marketing website. The 18 TOPS figure is still live on the website.

leman said:
Yes they did.

No need to quote that and mention it again. @Gnattu already mentioned it and I crossed it out long ago.

What's going on @leman? Working OT on M3 speculation lately? I notice that you're much more defensive lately.

leman · Oct 31, 2023

senttoschool said:
You really think so? What evidence do you have?

Well, look, it’s like this. Either Apple has backported the M2 ANE to 3nm and slightly overclocked it, or they use the already existing new 3nm ANE found in the A17. Which do you think is more likely?

Besides, the quoted throughput for the M3 ANE is almost exactly half of the A17 ANE. This makes me think that they are using different metrics (it would be an odd thing to do, but nothing that hasn’t happened before). Maybe one is INT8 and another FP16. Or FP16 and FP32.

The conclusive evidence will be provided by comparing the die shots. I’m not trained in this so I can’t do this. Hopefully more knowledgeable people will chime in.

senttoschool said:
I notice that you're much more defensive lately.

I love speculating like any other person here (and you know it), but lately there has been too much arbitrary conjecture for my taste. I think speculation should be grounded in reality to some degree. Like, why would one expect a 30% perf improvement at iso power if TSMC was very clear that it will be 15% at most?

That said, I am a bit disappointed with the M3 announcement. I was hoping for higher clocks and a more substantial redesign (finally SVE support, for example). My theory about better scaling of the 3N cores at higher frequencies is also pretty much out of the window. And I’m not happy at all about the bean-counter min-maxing approach they took with the M3 family. I think they should have taken that $20-30 hit per system and delivered a more consistent lineup (and I say it as an Apple shareholder). In short, I think there are plenty of things to criticize with this announcement (as well as plenty of things to be excited about). We don’t have to make stuff up.

senttoschool · Oct 31, 2023

leman said:
Well, look, it’s like this. Either Apple has backported the M2 ANE to 3nm and slightly overclocked it, or they use the already existing new 3nm ANE found in the A17. Which do you think is more likely?

Besides, the quoted throughput for the M3 ANE is almost exactly half of the A17 ANE. This makes me think that they are using different metrics (it would be an odd thing to do, but nothing that hasn’t happened before). Maybe one is INT8 and another FP16. Or FP16 and FP32.

I suppose they could have deemed the NPU as not important this generation and in order to increase yields, decided to chop off half of it.

senttoschool · Oct 31, 2023

leman said:
Yes it does. They mentioned better support for complex shaders on A17. It’s the same thing.

https://twitter.com/x/status/1719148004333265032

This guy disagrees.

leman · Oct 31, 2023

senttoschool said:
https://twitter.com/x/status/1719148004333265032

This guy disagrees.

It's a free world.

senttoschool · Oct 31, 2023

leman said:
It's a free world.

At least one very qualified person is saying it's a different GPU arch.

So now we have 3 pieces of evidence:

1. Unexpectedly much slower CPU ST if it's the same arch as A17 Pro
2. Different GPU features
3. Half the NPU of A17 Pro

So you still think it's "nonsense" for at least bringing this topic up?

Occam's razor suggests that Apple's M series has deviated from its A series. Maybe not all of it, but certainly a large chunk of it, it seems.

MayaUser · Oct 31, 2023

senttoschool said:
At least one very qualified person is saying it's a different GPU arch.

So now we have 3 pieces of evidence:

1. Unexpectedly much slower CPU ST if it's the same arch as A17 Pro
2. Different GPU features
3. Half the NPU of A17 Pro

So you still think it's "nonsense" for at least bringing this topic up?

Occam's razor suggests that Apple's M series has deviated from its A series. Maybe not all of it, but certainly a large chunk of it, it seems.

So if they are not based on A17, then on what?

leman · Oct 31, 2023

senttoschool said:
At least one very qualified person is saying it's a different GPU arch.

I don’t know who this person is. I looked through their twitter history and they are talking about industry trends and market prognosis. It doesn’t seem to me like they are a GPU engineer or programmer.

Don’t get me wrong, maybe they know something I don’t. But I’d rather hear it from the horse’s mouth. When introducing A17, Apple said they have redesigned the GPU to have better efficiency with complex workloads. This is exactly what Dynamic Cache is about. So following Occam’s Razor, it makes sense for me to assume that they are the same thing.

As to the rest.

senttoschool said:
1. Unexpectedly much slower CPU ST if it's the same arch as A17 Pro

I am not sure how you came to this conclusion? We don’t know the clocks nor the benchmark scores for these cores. Physical layout looks identical to die shots of A17. I’d at least wait for some GB results to make these claims.

senttoschool said:
2. Different GPU features

See above.

senttoschool said:
3. Half the NPU of A17 Pro

That’s the funny one. We had people identify the ANE in the M3 series. It looks bigger than the A17 component and has a larger block of cache attached to it. It also looks very different from the ANE in M2. I am therefore inclined to assume that Apple was reporting different numbers (e.g. INT8 vs FP16) and that the M3 ANE even has larger buffers to support larger models.

altaic · Oct 31, 2023

leman said:
I don’t know who this person is. I looked through their twitter history and they are talking about industry trends and market prognosis. It doesn’t seem to me like they are a GPU engineer or programmer.

Don’t get me wrong, maybe they know something I don’t. But I’d rather hear it from the horse’s mouth. When introducing A17, Apple said they have redesigned the GPU to have better efficiency with complex workloads. This is exactly what Dynamic Cache is about. So following Occam’s Razor, it makes sense for me to assume that they are the same thing.

The individual GPU cores look identical to me, however the Dynamic Cache functionality could reside as part of the other GPU support logic or perhaps the SLC. So, it's possible it's only part of the M3 family.

I think this is the sort of thing people will have to confirm in software, or alternatively keep an eye out for an update of the Metal feature set tables.

Macintosh IIcx · Oct 31, 2023

altaic said:
The individual GPU cores look identical to me, however the Dynamic Cache functionality could reside as part of the other GPU support logic or perhaps the SLC. So, it's possible it's only part of the M3 family.

I think this is the sort of thing people will have to confirm in software, or alternatively keep an eye out for an update of the Metal feature set tables.

I don’t know, I think it is just a marketing thing that they reserved that for the M3 presentation as it probably is more useful for pro apps than gaming on an iPhone. We have to remember that distinction between general public iPhone users and pro app users on the Mac – it requires different marketing even if the arch is the same.

leman · Oct 31, 2023

altaic said:
The individual GPU cores look identical to me, however the Dynamic Cache functionality could reside as part of the other GPU support logic or perhaps the SLC. So, it's possible it's only part of the M3 family.

If this Dynamic Caching does what I think it does it will be part of the GPU core/shader scheduler functionality, as it manages resource allocation inside the GPU cores, presumably even as the shader is executed.

altaic said:
I think this is the sort of thing people will have to confirm in software, or alternatively keep an eye out for an update of the Metal feature set tables.

I think testing for this in software might be tricky, as we don't know what other changes have occurred (like bigger register files). With Dynamic Caching one should see better shader occupancy on complex workloads, this should be visible in the Metal profiler.

sunny5 · Nov 1, 2023

Apple M3 chips appears to be a hybrid of the A17 Pro, A16 Bionic designs

The new M3 chips are here and there’s a lot to digest. As Apple doesn’t make a point of revealing the design origins of its chips, it can be an interesting exercise to put the pieces of the puzzle together.

www.notebookcheck.net

The CPU already seems fishy; it might be A16-based.

senttoschool · Nov 2, 2023

MayaUser said:
So if they are not based on A17, then on what?

Hybrid? Its own design? Forked design? Even if it's based on the A17 Pro, it certainly looks like the scaling is not the same as for M1 and M2, given the ST & NPU speeds.

leman · Nov 2, 2023

I did some tests and can confirm that A17 Pro has Dynamic Caching.

I wrote a shader that uses a lot of local variables on a dynamic conditional path that is never taken. On my M1 Max this has reduced the maximal number of threads per GPU core from 1024 to 448, indicating that the system has allocated the register space for the worst case (and as a consequence, fewer threads can fit into the GPU core memory simultaneously). I also observed a 20% reduction in performance vs. an equivalent shader without the expensive conditional path.

On the A17 Pro, there is no difference: it's always 1024 threads, and the performance stays the same whether the expensive path is disabled in code or not. So the register memory is allocated lazily, on demand. This is very impressive work from Apple.
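
The difference between the two behaviours can be illustrated with a toy occupancy model in Python. This is only a sketch under stated assumptions, not Apple's actual mechanism: the register file size, the per-path register footprints, and the SIMD-group granularity are all hypothetical values picked so that the static case reproduces the 1024 to 448 drop described above.

```python
# All constants are hypothetical, chosen only to reproduce the 1024 -> 448 observation.
SIMD_WIDTH = 32                    # assumed scheduling granularity (threads per SIMD group)
MAX_THREADS_PER_CORE = 1024        # thread cap observed in the test above
REGISTER_FILE_BYTES = 208 * 1024   # assumed per-core register file size

# Assumed per-thread register footprint of each code path in the test shader.
PATH_BYTES = {"cheap": 104, "expensive": 472}


def resident_threads(bytes_per_thread: int) -> int:
    """Threads that fit in the register file, rounded down to whole SIMD groups."""
    fit = REGISTER_FILE_BYTES // bytes_per_thread
    fit -= fit % SIMD_WIDTH
    return min(MAX_THREADS_PER_CORE, fit)


def static_allocation(paths: dict) -> int:
    """Reserve registers for the worst-case path, even if it is never taken."""
    return resident_threads(max(paths.values()))


def lazy_allocation(paths: dict, taken: str) -> int:
    """Reserve registers only for the path that actually executes."""
    return resident_threads(paths[taken])


print("static (worst case reserved):", static_allocation(PATH_BYTES))
print("lazy (cheap path taken):", lazy_allocation(PATH_BYTES, "cheap"))
```

With these made-up numbers, static allocation fits 448 threads while lazy allocation stays at the 1024 cap, which mirrors the M1 Max versus A17 Pro results in the post.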

Hope this can put the speculation to rest. A17 and M3 use the same GPU cores.

leman · Nov 2, 2023

senttoschool said:
Hybrid? Its own design? Forked design? Even if it's based on the A17 Pro, it certainly looks like the scaling is not the same as for M1 and M2, given the ST & NPU speeds.

The only question mark is the NPU, but on the die shots it looks bigger than the one in either M2, A16, or A17, so it's likely more capable.

GPU is the same and has the same features (see my post above). CPU shows the same scaling, looks the same on the die shot, and improvements mentioned by Apple are consistent with improvements for A17. I don't think there is much room to speculate that these cores are different.

komuh · Nov 2, 2023

leman said:
Besides, the quoted throughput for the M3 ANE is almost exactly half of the A17 ANE. This makes me think that they are using different metrics (it would be an odd thing to do, but nothing that hasn’t happened before). Maybe one is INT8 and another FP16. Or FP16 and FP32.

The ANE does not support FP32, so you are probably right that it would be INT8 vs FP16/BF16 results. Or maybe they made small changes from the A-series and moved to a more desktop-oriented model with a smaller NPU, with the die space used for extra GPU performance? (Who knows.)

Or did they trade theoretical FLOPs for extra cache/memory? Making use of even half of the NPU power is hard right now.

prime17569 · Nov 3, 2023

I said this in another thread, but I'll repeat it here.

Continuing the trend that started with the first Mac Studio, all M3 Macs have a 15 in their model identifier (e.g. Mac15,x), which matches the iPhone15,x notation used for the iPhone 14 Pro and regular iPhone 15 models that have the A16.

This, and the fact that the M3 series chips didn't get the updated 35 TOPS neural engine from A17 Pro, suggests that the M3 series chips are based on the core designs that were originally intended to go into the A16 before TSMC's N3B node was delayed and A16 had to be backported to 4nm.

The M3's chip ID also supports this. Generally, for a given A chip, the corresponding base M chip has a chip ID number that is increased by two. For example:
A14: T8101 -> M1: T8103
A15: T8110 -> M2: T8112
A16: T8120 -> M3: T8122

A17 is T8130, so a base M-series chip based on it would be expected to be T8132. However, M3 is T8122, suggesting that it is based (at least in part) on A16, which is T8120.
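
The "+2" pattern above can be written as a small Python sketch. The chip IDs are the ones listed in this post; the offset rule is the pattern being proposed here, not anything Apple has documented, and the helper names are made up for illustration.

```python
# A-series chip IDs as listed in the post above.
A_SERIES = {"A14": "T8101", "A15": "T8110", "A16": "T8120", "A17": "T8130"}


def expected_base_m_id(a_id: str, offset: int = 2) -> str:
    """Apply the observed 'A-series ID plus two' pattern to get a base M-series ID."""
    return f"T{int(a_id[1:]) + offset}"


def matching_a_chips(m_id: str) -> list:
    """Return the A-series chips whose expected base M-series ID equals m_id."""
    return [name for name, a_id in A_SERIES.items() if expected_base_m_id(a_id) == m_id]


print("Expected ID for an A17-based M chip:", expected_base_m_id(A_SERIES["A17"]))
print("A chips matching M3 (T8122):", matching_a_chips("T8122"))
```

Under this pattern T8122 pairs with A16, while an A17-based chip would be expected to carry T8132.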

senttoschool · Nov 3, 2023

prime17569 said:
I said this in another thread, but I'll repeat it here.

The M3's chip ID also supports this. Generally, for a given A chip, the corresponding base M chip has a chip ID number that is increased by two. For example:
A14: T8101 -> M1: T8103
A15: T8110 -> M2: T8112
A16: T8120 -> M3: T8122

A17 is T8130, so a base M-series chip based on it would be expected to be T8132. However, M3 is T8122, suggesting that it is based (at least in part) on A16, which is T8120.

That would explain the less than stellar ST and NPU.

leman · Nov 3, 2023

prime17569 said:
The M3's chip ID also supports this. Generally, for a given A chip, the corresponding base M chip has a chip ID number that is increased by two. For example:
A14: T8101 -> M1: T8103
A15: T8110 -> M2: T8112
A16: T8120 -> M3: T8122

A17 is T8130, so a base M-series chip based on it would be expected to be T8132. However, M3 is T8122, suggesting that it is based (at least in part) on A16, which is T8120.

Or maybe they are just incrementing ID in order...

At any rate, I hope that people like Dougall Johnson will get their hands on an M3 machine soon enough to clear this up.
