
mr_roboto

macrumors 6502a
Sep 30, 2020
856
1,866
M4 is not a full fledged new generation product. While Apple is free to use whatever naming scheme they want, the reality is that M4 does not use a new architecture. It uses a slightly modified (to boost yield and reduce cost) tech process. All semi companies but Apple simply refused to use N3B because it was a dud. Apple was in a pickle because they had to stick with the annual iPhone release schedule. But they knew the process was bad, so, while still working on the N3B based chips they probably started re-spinning the layout for N3E. The small time gap between M3 and M4 is not an indication of some sort of acceleration of the development/release cycle.
Clock speed depends on many factors, not just architecture. Apple explains what's new in M4 here and new architecture is not listed. Unless Apple is so modest as to not mention the new CPU architecture (while boasting about the new display engine), it's not a new architecture.
The sentence you highlighted in your link ("M4 builds on the GPU architecture of M3...") concerns the GPU, not the CPU. Just above it is a sentence where they promise a big CPU performance boost.

Also, Apple marketing is not exactly where I go for reliable technical information about Apple's CPU microarchitecture. It's not exactly the kind of thing they put a lot of effort into. We'll know a lot more once Apple updates their CPU Optimization Guide - a developer oriented doc - with all the details on M4.


I think it's funny that you claim M4 isn't a full new generation product, yet at the same time you acknowledge clock speed depends on many factors. One of those factors is... microarchitecture! Clock speed isn't exclusively a product of process node.

Also funny is that we do have reasonably good evidence that CPU microarchitecture did change. If M4's CPU was just M3 ported to N3E plus the end result of AMX morphing into SME, you would expect all non-ML single core test results to land at about 4.40 GHz / 4.05 GHz = 1.086 = 108.6% in this comparison:


What we actually see: some are clustered around 108%, others very much aren't, and some of the ones which aren't are definitely not tests you'd expect SME to be used in. (For example, Navigation and HTML5 Browser.) There's also some which fail to reach the 108% bar. This is exactly the kind of spread you expect to see with a CPU uarch update - some things fared better than others.
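The clock-scaling argument above can be sketched in a few lines. This is only an illustration: the two clock speeds come from the post, while the subtest names, the ratios, and the 2% tolerance are made-up placeholders, not measured Geekbench results.

```python
def expected_clock_scaling(new_ghz: float, old_ghz: float) -> float:
    """Speedup you would expect if only the clock speed changed."""
    return new_ghz / old_ghz


def classify_subtests(ratios: dict, expected: float, tolerance: float = 0.02) -> dict:
    """Label each subtest ratio relative to pure clock scaling.

    `tolerance` is an arbitrary band chosen for this sketch.
    """
    labels = {}
    for name, ratio in ratios.items():
        if ratio > expected + tolerance:
            labels[name] = "above clock scaling"
        elif ratio < expected - tolerance:
            labels[name] = "below clock scaling"
        else:
            labels[name] = "matches clock scaling"
    return labels


# Clock speeds quoted in the post: 4.40 GHz (M4) vs 4.05 GHz (M3).
expected = expected_clock_scaling(4.40, 4.05)

# Placeholder ratios (new score / old score); NOT real measurements.
example_ratios = {"subtest_a": 1.09, "subtest_b": 1.25, "subtest_c": 1.02}
labels = classify_subtests(example_ratios, expected)

for name, label in labels.items():
    print(f"{name}: {example_ratios[name]:.2f}x -> {label}")
```

If every non-ML subtest landed in the "matches clock scaling" band, a straight port would be a plausible explanation; a spread across all three labels is what points to a microarchitecture change.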

In general when I see your posts I know exactly what to expect: all Apple's grapes are sour, nothing they do is any good. You've got one note to play, and you play it a lot!
 

falainber

macrumors 68040
Mar 16, 2016
3,539
4,136
Wild West
The sentence you highlighted in your link ("M4 builds on the GPU architecture of M3...") concerns the GPU, not the CPU. Just above it is a sentence where they promise a big CPU performance boost.

Also, Apple marketing is not exactly where I go for reliable technical information about Apple's CPU microarchitecture. It's not exactly the kind of thing they put a lot of effort into. We'll know a lot more once Apple updates their CPU Optimization Guide - a developer oriented doc - with all the details on M4.


I think it's funny that you claim M4 isn't a full new generation product, yet at the same time you acknowledge clock speed depends on many factors. One of those factors is... microarchitecture! Clock speed isn't exclusively a product of process node.

Also funny is that we do have reasonably good evidence that CPU microarchitecture did change. If M4's CPU was just M3 ported to N3E plus the end result of AMX morphing into SME, you would expect all non-ML single core test results to land at about 4.40 GHz / 4.05 GHz = 1.086 = 108.6% in this comparison:


What we actually see: some are clustered around 108%, others very much aren't, and some of the ones which aren't are definitely not tests you'd expect SME to be used in. (For example, Navigation and HTML5 Browser.) There's also some which fail to reach the 108% bar. This is exactly the kind of spread you expect to see with a CPU uarch update - some things fared better than others.

In general when I see your posts I know exactly what to expect: all Apple's grapes are sour, nothing they do is any good. You've got one note to play, and you play it a lot!
I did not highlight this sentence, Google search did. I just copied the link. But I was talking about the entire article and, in it, Apple did not claim a new CPU architecture. It's funny that Apple did not claim it but Apple fans do. CPU architectures are not developed in 6 months. A uarch update may improve performance, but it does not constitute a new architecture.

Edit: as far as the clocks are concerned, just check, say, the Intel CPU lineup. They have dozens of processor models with different clocks and the same architecture. The original claim that a slight clock increase indicates a new architecture is technically naive.
 

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
I think it's funny that you claim M4 isn't a full new generation product, yet at the same time you acknowledge clock speed depends on many factors. One of those factors is... microarchitecture! Clock speed isn't exclusively a product of process node.
I think what the internet is missing is that, ultimately, perf/ghz does not matter. It's always perf/watt that is important and Apple just increased perf/watt in a massive way (assuming no drastic increase in power).

I assume that Apple did tweak the design to increase clock speeds without using more power. Otherwise, N3E might be a godsend node and we should all be buying TSMC stock.
 

name99

macrumors 68020
Jun 21, 2004
2,410
2,317
No. Does anyone?
I don't know how to respond to this. This indicates a level of cluelessness that's truly scary.
Do you have any idea what SME is? Do you know what it entails in terms of changing the CPU?
Would you also not consider the addition of, say, AVX to be a change in architecture?
Let's put it differently, what WOULD satisfy you as being "a change in architecture"?
 

falainber

macrumors 68040
Mar 16, 2016
3,539
4,136
Wild West
I don't know how to respond to this. This indicates a level of cluelessness that's truly scary.
Do you have any idea what SME is? Do you know what it entails in terms of changing the CPU?
Would you also not consider the addition of, say, AVX to be a change in architecture?
Let's put it differently, what WOULD satisfy you as being "a change in architecture"?
SME is an extension to architecture. It is fairly common for CPUs from the same architecture generation to support or omit some extensions. For example, Intel Xeons will have AVX-512 and their desktop siblings won't, but they will still share the same core architecture generation.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,674
SME is an extension to architecture. It is fairly common for CPUs from the same architecture generation to support or omit some extensions. For example, Intel Xeons will have AVX-512 and their desktop siblings won't, but they will still share the same core architecture generation.

So you are suggesting that M3 CPU supported all these things, but for some reason Apple just decided to pretend like they don’t exist, in addition to artificially reducing IPC? That’s a rather cumbersome hypothesis.


At any rate, I think you might be mixing up the terms “architecture” as in platform (e.g. x86, ARM) and architecture as in CPU design (also called microarchitecture). M4 supports new CPU instructions and has a different performance behavior across the board from M3, which already qualifies as a new architecture. If you don’t consider it a new architecture, then you also should be prepared to argue that Zen4 or Alder Lake are not new architectures.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
If you don’t consider it a new architecture, then you also should be prepared to argue that Zen4 or Alder Lake are not new architectures.
Many consider a new microarchitecture to be a redesign, not an upgrade. Thus, many consider Zen 4 an upgrade of Zen 3, but Zen 5 a new microarchitecture.

Zen5_678x452.jpg


Do you know what it entails in terms of changing the CPU?
Has Apple changed the decoder to adopt SME?

Apple-M4-chip-new-CPU-240507_big.jpg.large.jpg
 

smalm

macrumors newbie
I see a similar concept. They have building blocks for the different parts of an SoC, in several states of functionality. When they have to freeze the design, it could be that a more advanced block would have been available just a little later. Bad luck. But you have to freeze at some point; you can't wait forever.
And then comes the time where you have to shoot the engineer to get the project out of the door...
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
Many consider a new microarchitecture to be a redesign, not an upgrade. Thus, many consider Zen 4 an upgrade of Zen 3, but Zen 5 a new microarchitecture.

Zen5_678x452.jpg



Has Apple changed the decoder to adopt SME?

Apple-M4-chip-new-CPU-240507_big.jpg.large.jpg
May I know how you came to the conclusion that the AMD slide shows a new design, while the Apple slide does not?

Both say about the same thing to me ... like wider decode?

So AMD gets a pass when they say so, but Apple doesn't?
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
May I know how you came to the conclusion that the AMD slide shows a new design, while the Apple slide does not?
Honestly, I don't know if M4 can be considered a new microarchitecture. What I have tried to convey, perhaps poorly, is that AMD considers Zen 5 a new microarchitecture, but not Zen 4. I am under the impression that Zen 1, Zen 5 and maybe Zen 3 can be considered new microarchitectures, while Zen 2, Zen 3+ and Zen 4 are upgrades of the previous version.
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
Honestly, I don't know if M4 can be considered a new microarchitecture. What I have tried to convey, perhaps poorly, is that AMD considers Zen 5 a new microarchitecture, but not Zen 4. I am under the impression that Zen 1, Zen 5 and maybe Zen 3 can be considered new microarchitectures, while Zen 2, Zen 3+ and Zen 4 are upgrades of the previous version.
Well, if you ask me, it doesn't really matter if it is a new architecture or not. At the end of the day, it is how much performance the CPU architect and designer can squeeze out of their designs that counts.
 

altaic

macrumors 6502a
Jan 26, 2004
711
484
Honestly, I don't know if M4 can be considered a new microarchitecture. What I have tried to convey, perhaps poorly, is that AMD considers Zen 5 a new microarchitecture, but not Zen 4. I am under the impression that Zen 1, Zen 5 and maybe Zen 3 can be considered new microarchitectures, while Zen 2, Zen 3+ and Zen 4 are upgrades of the previous version.
Well, if you ask me, it doesn't really matter if it is a new architecture or not. At the end of the day, it is how much performance the CPU architect and designer can squeeze out of their designs that counts.
So it’s agreed that no one knows what “new” means, and no more bike shedding 👏

Anyway, back to the topic, if Apple’s claims about the M4 operating at 1/2 the power of M3 are true, and if the geekbench scores are legit (where the M4 is much better than the M3 Pro), that’s ****ing amazing. That means that the fabled “double Ultra” is feasible in the Studio. Plus, with LPDDR5X, they could go up to 10700 at similar power, which would be a marked increase in memory throughput. Better start bitching about how you’re being obsoleted before it’s cool!
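To put a rough number on the memory throughput point, peak bandwidth is just transfer rate times bus width. A minimal sketch, assuming a 128-bit bus and a current rate of 7500 MT/s (both are assumptions used for illustration, not figures from this thread):

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal) for a given memory bus."""
    bytes_per_transfer = bus_width_bits / 8
    # MT/s * bytes per transfer = MB/s; divide by 1000 to get GB/s.
    return transfer_rate_mt_s * bytes_per_transfer / 1000


BUS_WIDTH_BITS = 128     # assumption: base-chip memory bus width
CURRENT_RATE = 7500      # assumption: current LPDDR5X transfer rate in MT/s
FASTER_RATE = 10700      # the "10700" figure mentioned in the post

current = peak_bandwidth_gb_s(CURRENT_RATE, BUS_WIDTH_BITS)
faster = peak_bandwidth_gb_s(FASTER_RATE, BUS_WIDTH_BITS)

print(f"current: {current:.1f} GB/s, faster: {faster:.1f} GB/s, "
      f"gain: {faster / current:.2f}x")
```

The same function scales to wider buses; only `BUS_WIDTH_BITS` changes for the bigger chips.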
 

Dulcimer

macrumors 6502a
Nov 20, 2012
967
1,148
So it’s agreed that no one knows what “new” means, and no more bike shedding 👏

Anyway, back to the topic, if Apple’s claims about the M4 operating at 1/2 the power of M3 are true, and if the geekbench scores are legit (where the M4 is much better than the M3 Pro), that’s ****ing amazing. That means that the fabled “double Ultra” is feasible in the Studio. Plus, with LPDDR5X, they could go up to 10700 at similar power, which would be a marked increase in memory throughput. Better start bitching about how you’re being obsoleted before it’s cool!
The half-power comparison was to M2, not M3. And let’s be real, that claim is likely specific to certain workloads taking advantage of new arch features.
 

altaic

macrumors 6502a
Jan 26, 2004
711
484
The half-power comparison was to M2, not M3. And let’s be real, that claim is likely specific to certain workloads taking advantage of new arch features.
Half power compared to M2 is more impressive. Not sure what you’re getting at 🙃
 

MrGunny94

macrumors 65816
Dec 3, 2016
1,148
675
Malaga, Spain
Honestly, I think we need to wait for this to reach the Mac to be sure when it comes to Battery Life. I do think the chip is amazing and more capable in the M4 Iteration.

I really loved the fact that they decided to push forward with more E Cores and keep improving them, that's where I wanted them to take the base and Pro chips.
 

thenewperson

macrumors 6502a
Mar 27, 2011
992
912
Honestly, I think we need to wait for this to reach the Mac to be sure when it comes to Battery Life. I do think the chip is amazing and more capable in the M4 Iteration.

I really loved the fact that they decided to push forward with more E Cores and keep improving them, that's where I wanted them to take the base and Pro chips.

They did move the Pro to 6E already so it’s good to see the base chip get this too. What I’m curious about is if the A18 gets this upgrade as well.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,674
Main QUESTION is whether NEON is now SVE, and if so whether it's 128b SVE or 256b SVE.

Since GB6 supports SVE, I think we would have noticed if SIMD had been extended to 256-bit. As you say, the big question is whether M4 supports the regular (non-streaming-mode) SVE at all. Streaming mode solves the HPC problem in a way that AVX512 failed to solve, but Apple might choose not to implement non-streaming SVE at all, deeming Neon to be enough as the base low-latency SIMD ISA. Would be great to get masks though.

I'm not so sure.

If you look at SME (especially the latest SME2.1 stuff, eg
https://reviews.llvm.org/D137571 )
so much of it, in hindsight, seems motivated by AMX functionality. For example LUTI2 and LUTI4 seem to match AMX lookup table stuff [since AMX1, apparently to support quantized weights] along with the strided 2 and 4 vector loads that were added to M3 AMX.

That is true, and the LUT instructions were what I had in mind when I wrote my post. Do you know if these new instructions cover all the functionality that was described as part of AMX? Also, if my memory serves me right, there were some specialized addressing modes in AMX that I don't remember seeing in SVE. Then again, I have a lot of difficulty navigating ARM's documentation, so I probably missed a lot of things.

P.S. I just had another look and I don't see an equivalent for the generating genlut instruction (https://github.com/corsix/amx/blob/main/genlut.md)
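For anyone who wants to check what a given machine actually reports, here is a rough sketch that parses the output of `sysctl hw.optional.arm` on macOS. The exact key names for SME or SVE are not something I can vouch for, so the sample text below uses made-up keys, and the helper names are hypothetical.

```python
import subprocess


def parse_sysctl_flags(text: str) -> dict[str, bool]:
    """Turn 'key: 0/1' lines from sysctl output into a dict of booleans."""
    flags = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'key: value'
        value = value.strip()
        if value in ("0", "1"):
            flags[key.strip()] = value == "1"
    return flags


def read_arm_feature_flags() -> dict[str, bool]:
    """Run sysctl on the local machine (macOS on Apple Silicon only)."""
    result = subprocess.run(
        ["sysctl", "hw.optional.arm"],
        capture_output=True, text=True, check=True,
    )
    return parse_sysctl_flags(result.stdout)


# Made-up sample text; the key names are illustrative, not real output.
sample = "hw.optional.arm.FEAT_EXAMPLE_A: 1\nhw.optional.arm.FEAT_EXAMPLE_B: 0\n"
sample_flags = parse_sysctl_flags(sample)
print(sample_flags)
```

On a real machine you would call `read_arm_feature_flags()` and look for whichever keys mention SME or SVE.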
 

MRMSFC

macrumors 6502
Jul 6, 2023
371
381
I don't know how to respond to this. This indicates a level of cluelessness that's truly scary.
Do you have any idea what SME is? Do you know what it entails in terms of changing the CPU?
Would you also not consider the addition of, say, AVX to be a change in architecture?
Let's put it differently, what WOULD satisfy you as being "a change in architecture"?
Is this going to be a Ship of Theseus argument?

I don’t know the criteria for a “new” architecture.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,674
Is this going to be a Ship of Theseus argument?

I don’t know the criteria for a “new” architecture.

I think this is precisely the point. Characterizing something as a "new architecture" is entirely subjective. Frankly, my feeling is that some say M4 is not a new architecture just because it has been released so soon after M3, which is hardly a good argument. In terms of features and performance, M3->M4 is roughly comparable to Rocket Lake->Alder Lake, so there's that.

Personally, I do not believe asking whether the architecture is "new" is a productive line of inquiry, because it leads pretty much nowhere. It's much more interesting to ask "what is new" and "what has changed".
 

mr_roboto

macrumors 6502a
Sep 30, 2020
856
1,866
Half power compared to M2 is more impressive. Not sure what you’re getting at 🙃
I believe Apple's claim was half power at the same performance as M2. This is likely a comparison between a M2 core running at its highest performance state, and a M4 core running at a reduced clock speed chosen to approximate the performance of the full-speed M2.

I expect M4 P cores in their highest performance state to still use about the same amount of power as Apple's P cores always do - somewhere around 5 to 6 watts. It seems to be the target they aim for.
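The difference between "half the power at the same performance" and "half the power at peak" can be sketched as a lookup over operating points. All numbers below are invented purely to illustrate the comparison; they are not measurements of any Apple chip, and the names are hypothetical.

```python
from typing import NamedTuple, Optional


class OperatingPoint(NamedTuple):
    clock_ghz: float
    performance: float  # arbitrary score units
    power_w: float


def lowest_power_at(points: list, target_performance: float) -> Optional[OperatingPoint]:
    """Cheapest operating point that still reaches the target performance."""
    candidates = [p for p in points if p.performance >= target_performance]
    return min(candidates, key=lambda p: p.power_w) if candidates else None


# Invented numbers: the older core at its highest performance state.
old_peak = OperatingPoint(clock_ghz=3.5, performance=100.0, power_w=5.0)

# Invented numbers: the newer core at several clock speeds.
new_points = [
    OperatingPoint(clock_ghz=2.8, performance=90.0, power_w=2.0),
    OperatingPoint(clock_ghz=3.2, performance=101.0, power_w=2.5),
    OperatingPoint(clock_ghz=4.4, performance=130.0, power_w=5.5),
]

match = lowest_power_at(new_points, old_peak.performance)
iso_performance_ratio = match.power_w / old_peak.power_w
peak_ratio = new_points[-1].power_w / old_peak.power_w

print(f"iso-performance power ratio: {iso_performance_ratio:.2f}")
print(f"peak-vs-peak power ratio:    {peak_ratio:.2f}")
```

With these made-up points the newer core matches the old peak at half the power while its own peak draws slightly more, which is the shape of the claim described above.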
 

altaic

macrumors 6502a
Jan 26, 2004
711
484
I believe Apple's claim was half power at the same performance as M2. This is likely a comparison between a M2 core running at its highest performance state, and a M4 core running at a reduced clock speed chosen to approximate the performance of the full-speed M2.

I expect M4 P cores in their highest performance state to still use about the same amount of power as Apple's P cores always do - somewhere around 5 to 6 watts. It seems to be the target they aim for.
I think you’re right about that. Still an impressive improvement, though.
 

MrGunny94

macrumors 65816
Dec 3, 2016
1,148
675
Malaga, Spain
They did move the Pro to 6E already so it’s good to see the base chip get this too. What I’m curious about is if the A18 gets this upgrade as well.
Most likely everything from the M Pro down will get it, because those chips go into portable, on-the-go devices.

I'm curious about the approach they will take with the M4 Pro. I do hope they go this route, and based on this chip alone I think it's safe to say they will (same goes for M3 Pro).
 

name99

macrumors 68020
Jun 21, 2004
2,410
2,317
Most likely everything from the M Pro down will get it, because those chips go into portable, on-the-go devices.

I'm curious about the approach they will take with the M4 Pro. I do hope they go this route, and based on this chip alone I think it's safe to say they will (same goes for M3 Pro).
Apple are engaged in on-going work (that moves a little more each year) to split the OS up into more and more pieces that can run independently on separate cores. Obviously this is a goal that every OS vendor strives for in the age of multi-core; Apple's nothing special in this respect, just the techniques they will use will be optimal for the structure of Darwin.
There have been academic OSs in the past (like Barrelfish, from ETH Zurich and Microsoft Research) that have pushed this idea, but moving a large commercial OS in this direction is obviously harder!


I've mentioned before that part of how Apple run faster is to run experiments in parallel. IMHO the M3 6E cluster was such an experiment – put it in a chip where it can't cause any harm, and see just how well it can get used (both by the OS and by lightweight threads in apps). Presumably the experiment was a big success, enough so that we see it as the new norm (and perhaps also justifying moving to 6E cores for M4 Max?)


Open questions then include
- does 6E make sense for an iPhone? I guess we'll see soon! Maybe it does?

- does going up to 8 E-cores now make sense? (There are two issues here. First, with 8 E-cores present, is there enough work for them? Second, is it still feasible to have them all share a single set of cluster resources: the L2 itself, the L2 TLB and page walkers, and AMX/SME? If those resources start to be overloaded, maybe better to dial it back to 4E+4E for the M4 Pro and slowly, over the next few years, work our way back to 6E+6E in four years or so?)

- does a dedicated OS-only E cluster make sense? The idea here is that we devote an E cluster (maybe only two E-cores, maybe no AMX/SME needed, and small L2) to running the most security critical elements of the OS and NOTHING ELSE. The idea is that if we have these cores isolated to this extent malicious apps won't be able to [or at least will have to work even harder to find some scheme] either modify the OS or eavesdrop on what it's doing. This will also allow us to make the other cores more aggressive in terms of things like variable timing and speculation without having to worry about this endless stream of micro-architectural security issues (Spectre, GoFetch and the rest of them). If you want to do crypto or anything involving passwords, call into the OS which will shunt the work to a security core, and given this fact, who CARES that an app can, with immense effort, sometimes read a few bytes from the memory range of some other app?
 