
Analog Kid

macrumors G3
Mar 4, 2003
9,360
12,603
If all Macs have similar base performance for a given Mx family, IMHO, it makes it easier to tune macOS to work better with the SoC. Having YoY improvement is already enough of a variable to contend with when it comes to software development. Having more variables across a single SoC family will make it worse, so I would think making it simpler for software developers (both internal and external) is a core strategy for Apple as well.

Easier to tune how?
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
Easier to tune how?
I suppose you have not taken a look at OS code? Maybe spend some time browsing through, say, the Linux source code and drivers, from memory management to the video drivers. They have lots of branches to support different CPUs / GPUs. This code is notoriously difficult to debug due to the nature of OS code, namely that it is interrupt driven.
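To make the "lots of branches" point concrete, here is a minimal sketch of per-chip dispatch. The chip names and the two copy routines are hypothetical and invented for illustration; they are not taken from Linux or macOS.

```python
# Hypothetical sketch: one code path per chip variant, selected at runtime.
# Chip names and routines are made up for illustration.

def copy_generic(buf: bytes) -> bytes:
    """Baseline path that must work on every supported chip."""
    return bytes(buf)


def copy_wide(buf: bytes) -> bytes:
    """'Tuned' path for chips assumed to have more memory bandwidth."""
    # Same result as the generic path; a real kernel routine would differ
    # in how it gets there, which is where extra bugs can hide.
    return bytes(memoryview(buf))


# Every entry here is one more combination to test and debug.
DISPATCH = {
    "chip_base": copy_generic,
    "chip_pro": copy_wide,
}


def select_copy(chip: str):
    """Pick the routine for a chip, falling back to the generic path."""
    return DISPATCH.get(chip, copy_generic)
```

Each extra entry in the table has to produce the same result as the baseline, so the test matrix grows with every variant that behaves differently.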

If an engineer still needs to optimise the speed of their code because M1 is slower compared to M1 Pro, for example, it would introduce bugs into the code tree. This is just a simplistic example.

Writing software is hard. Writing good software is even harder.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
generally true, M2 is modest update to M1. We need to wait for armv9 architecture change that is already in progress. But that requires some time. M2 is buying that time.

What does armv9 have to do with any of this?
 
  • Like
Reactions: altaic

ikir

macrumors 68020
Sep 26, 2007
2,176
2,366
These aren't constant. They are up to. In short, in certain mathematical operations they can reach up to 18-20% and up to 35% GPU performance per clock cycle. Across the board you're more likely to say, 9% IPC improvement over M1.
In all the real-life scenarios and benchmarks I've seen, the M2 is much faster; in real-life gaming and CrossOver API translation, for example, the M2 is 2x faster than the M1.
 

UBS28

macrumors 68030
Oct 2, 2012
2,893
2,340
Of course anytime someone makes an ill advised snarky complaint about price they always seem to neglect that comparable windows machines seem to cost more, why is that? Maybe SSD, CPU, display quality, build quality don’t matter? I mean this is way more money than a cheap Chromebook or a cheap low power windows craptop

What? Windows machines are way cheaper.

I can buy a 12th gen Intel 14-core CPU with an RTX 3060 for the price of a 14” Base M1 Pro.

And this PC laptop is faster than a 16” M1 Max MacBook Pro for a fraction of the price.

And with Bitcoin crashing, the RTX 3080 should become a lot cheaper soon, so you will be able to spec out PC laptops with this GPU for cheap too.
 

Juraj22

macrumors regular
Jun 29, 2020
179
208
What does armv9 have to do with any of this?
Apple uses the ARM ISA to build its CPUs. ARMv9 is the next iteration, and it brings vector instructions (and more). Those vector instructions will be the driving force to increase the performance of creative apps significantly. Arm has had NEON for a long time, but it is not even close to the AVX512 Intel has. So this will close the gap.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
Apple uses the ARM ISA to build its CPUs. ARMv9 is the next iteration, and it brings vector instructions (and more). Those vector instructions will be the driving force to increase the performance of creative apps significantly. Arm has had NEON for a long time, but it is not even close to the AVX512 Intel has. So this will close the gap.

"Vector instructions" (by which I assume you mean SVE/SVE2) are optional extensions to ARMv8, so Apple doesn't need to implement ARMv9 to get them. I also don't share your optimism about increasing the performance of creative apps. First, it all depends on the implementation. SVE does not mandate a minimal SIMD width, so having SVE does not automatically mean higher performance. Besides, Apple already has a wide SIMD execution backend capable of executing up to 512 bits' worth of SIMD operations per cycle. So they are already on par with mainstream implementations of AVX512/AVX2; Intel is still faster overall because it runs higher clocks. Sure, with SVE Apple could increase the width of the SIMD units to 256 bits or higher, but that comes at the expense of flexibility and power consumption, so it's far from clear that this is the optimal way for them.

And finally, I doubt that creative software relies that much on CPU vector processing. Apple currently offers three vector processing backends: the CPU (via NEON), the AMX coprocessor (via an undocumented extension accessible from Apple's Accelerate library) and the GPU. Most applications that need high vector processing throughput will use Accelerate or the GPU. NEON is good for applications that need to do small to modest amounts of data-parallel processing with low latency.
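The "512 bits per cycle, but Intel clocks higher" comparison above can be put into rough numbers. This is a back-of-the-envelope sketch: the unit counts and clock speeds are illustrative assumptions, not measurements of any specific Apple or Intel part.

```python
def peak_simd_bits_per_second(units: int, width_bits: int, clock_hz: float) -> float:
    """Upper bound on SIMD bits processed per second.

    Ignores memory bandwidth, scheduling and thermal limits.
    """
    return units * width_bits * clock_hz


# Illustrative assumptions only: unit counts and clocks are not measurements.
many_narrow = {"units": 4, "width_bits": 128, "clock_hz": 3.2e9}
few_wide = {"units": 2, "width_bits": 256, "clock_hz": 5.0e9}

# Both hypothetical designs retire the same number of SIMD bits per cycle.
bits_per_cycle_narrow = many_narrow["units"] * many_narrow["width_bits"]
bits_per_cycle_wide = few_wide["units"] * few_wide["width_bits"]

# With equal per-cycle width, the peak difference is just the clock ratio.
ratio = peak_simd_bits_per_second(**few_wide) / peak_simd_bits_per_second(**many_narrow)
```

Under these assumptions the per-cycle width is identical, so any peak advantage comes from clock speed alone. Wider vectors would only help if the total width of the execution backend grew with them.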

Don't get me wrong, as someone who has been programming using low-level SIMD for two decades I am pumped for SVE/SVE2 and I can't wait to get my hands on it. But it's not going to be a magical performance panacea and ARMv9 won't do anything for everyday general-purpose performance.
 

altaic

macrumors 6502a
Jan 26, 2004
713
484
Apple uses the ARM ISA to build its CPUs. ARMv9 is the next iteration, and it brings vector instructions (and more). Those vector instructions will be the driving force to increase the performance of creative apps significantly. Arm has had NEON for a long time, but it is not even close to the AVX512 Intel has. So this will close the gap.
Apple Silicon has had such instructions for a long time. It’s called AMX: https://gist.github.com/dougallj/7a75a3be1ec69ca550e7c36dc75e0d6f
 

Juraj22

macrumors regular
Jun 29, 2020
179
208
Apple Silicon has had such instructions for a long time. It’s called AMX: https://gist.github.com/dougallj/7a75a3be1ec69ca550e7c36dc75e0d6f
Yeah, the only issue is that it is only available via Accelerate.framework. So it is not something a compiler would use, or something you can reach through a SIMD extension. For us, only NEON is available (or Accelerate.framework).

ARMv9 will bring this from private space to public space. Many apps will see huge boost in performance.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
What? Windows machines are way cheaper.

I can buy a 12th gen Intel 14-core CPU with an RTX 3060 for the price of a 14” Base M1 Pro.

Not with the same build quality or display or battery life or connectivity. If you only care about performance, sure, you can buy an equivalent Windows PC for less.

If you care about the overall package, Apple Silicon offers an excellent value proposition, especially on the low end. An M1/M2 machine will outperform a much more expensive premium Windows laptop.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
ARMv9 will bring this from private space to public space.

No it won't. SVE gives you access to CPU-internal SIMD units, not the AMX coprocessor. Again, you are assuming that SVE will automatically come with a vastly larger SIMD backend, but that is not at all given.
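As a rough illustration of why SVE by itself does not imply a wider backend: SVE code is written without a fixed vector length, and the hardware decides how many lanes each step covers (the architecture allows 128 to 2048 bits). The sketch below only mimics that idea in plain Python; `vla_add` and its `lanes` parameter are made up for illustration and are not an SVE API.

```python
def vla_add(a, b, lanes):
    """Add two equal-length lists in chunks of `lanes` elements.

    Mimics a vector-length-agnostic loop: the same code runs for any lane
    count, and the final chunk may be partial.
    """
    out = []
    steps = 0
    for i in range(0, len(a), lanes):
        # Slicing past the end plays the role of a predicate that masks
        # off the unused lanes in the last iteration.
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        steps += 1
    return out, steps


data = list(range(10))
ones = [1] * 10

# 4 lanes ~ 128-bit vectors of 32-bit elements; 16 lanes ~ 512-bit vectors.
narrow_result, narrow_steps = vla_add(data, ones, lanes=4)
wide_result, wide_steps = vla_add(data, ones, lanes=16)
```

The results are identical for both lane counts; only the step count changes. An implementation that picks 128-bit vectors would run the same SVE binary correctly and be no faster than NEON at the same width.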
 

altaic

macrumors 6502a
Jan 26, 2004
713
484
Yeah, the only issue is that it is only available via Accelerate.framework. So it is not something a compiler would use, or something you can reach through a SIMD extension. For us, only NEON is available (or Accelerate.framework).
It’s available in the hardware. ARMv9 as you’re referring to it is a hardware spec, which you’re conflating with software. Libraries, compilers, and whatever you mean by “extension” are part of the compilation tool chain. Maybe buy the dougallj guy a coffee and it’ll turn up in LLVM sooner. Or wait for Apple to do it.
 

DMG35

Contributor
May 27, 2021
2,526
8,164
Next year, when the M3 is a 20% increase over the M2, is that going to be the M2.1?

Apple isn't going to be able to make the gigantic leaps every year that some of you think they are going to make. You aren't meant to upgrade your computer every year, and thinking there are going to be these crazy advancements in their chips every single year isn't reality.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
It’s available in the hardware. ARMv9 as you’re referring to it is a hardware spec, which you’re conflating with software. Libraries, compilers, and whatever you mean by “extension” are part of the compilation tool chain. Maybe buy the dougallj guy a coffee and it’ll turn up in LLVM sooner. Or wait for Apple to do it.

It's a private implementation detail. Apple hides it for a reason: the hardware is constantly evolving and there is no guarantee that the AMX ISA is compatible between different Apple Silicon products. Sure, you can reverse-engineer the opcodes and roll your own little library that uses it directly, but it might (and will) break when the next model is released.

Maybe in the future there won't be any AMX and Apple will go full on SVE and SME. Who knows.
 

mr_roboto

macrumors 6502a
Sep 30, 2020
856
1,866
No it won't. SVE gives you access to CPU-internal SIMD units, not the AMX coprocessor. Again, you are assuming that SVE will automatically come with a vastly larger SIMD backend, but that is not at all given.
And to re-emphasize something you already said, Apple could have shipped SVE in M1 if they wanted to. It's an optional feature in Arm v8, and continues to be an optional feature in Arm v9.

Apple has chosen not to implement it so far, which suggests they may never do so. NEON is a reasonably good SIMD ISA, and as you say, Apple already provides lots of NEON compute throughput.

Arm has had NEON for a long time, but it is not even close to the AVX512 Intel has. So this will close the gap.
Thanks to Intel's long history of blunders deploying AVX512, it's a little hard to count it as an already realized advantage for Intel. The vast majority of applications don't bother because it's not available in most of the processors those apps are going to run on.

The latest debacle is that even though Alder Lake's P cores support AVX512, Intel had to disable it because Alder Lake's E cores don't support AVX512, and as Intel found out, having too much of a difference in ISA feature support between cores in the same chip really is not good. Whoops, mass consumer adoption delayed for yet another year (at least).
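A small sketch of why mismatched ISA support between core types is a problem: if the scheduler may move a thread to any core, the thread can only safely use instructions that every core understands. The feature sets below are simplified assumptions for illustration, not a complete description of Alder Lake.

```python
# Simplified, assumed feature sets for two core types on one chip.
P_CORE = frozenset({"sse4", "avx2", "avx512"})
E_CORE = frozenset({"sse4", "avx2"})


def safe_features(core_types):
    """Features a thread may use if it can be migrated to any listed core.

    Expects at least one feature set; the result is their intersection.
    """
    return frozenset.intersection(*core_types)


mixed_chip = safe_features([P_CORE, E_CORE])
p_only_chip = safe_features([P_CORE])
```

On the mixed chip, `avx512` drops out of the safe set even though half the cores implement it. A thread that used it anyway would fault as soon as it landed on a core without it.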
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
And to re-emphasize something you already said, Apple could have shipped SVE in M1 if they wanted to. It's an optional feature in Arm v8, and continues to be an optional feature in Arm v9.

Oh, it's optional? I thought it was part of the core, but I looked it up and you are right! That makes ARMv9 even less interesting.
 
  • Like
Reactions: Unregistered 4U

Unregistered 4U

macrumors G4
Jul 22, 2002
10,610
8,628
Why do you keep repeating this silly straw man? Have I said anything that suggests I’m in favor of this idea?
I’m not saying you’re in favor of the idea, just that there are a LOT of people that expect from Apple the same thing they’ve expected from AMD/Intel for years. If Apple HAD introduced a low end system with a single threaded performance that’s, say, half of the M1 Ultra, no one would have been alarmed by that, that’s how computers have operated for quite some time now. It’s this lack of a clearly less performant (single threaded wise) option that makes folks point at Apple and think “something’s fishy”.

Perhaps if Apple were in a position where they were providing a range of chip solutions for external vendors, there would be a focus on releasing products that require ever more sophisticated power and cooling requirements… and there would be a market of support vendors (with their tubing and thermal paste and special alloys) to go along with it. Unfortunately, more expensive power and cooling requirements usually yield a more expensive (to produce) product in the end. And, since the company using the chips is the same one that’s making them, they will likely always err on the side of the product being cheaper/easier to produce.
 

Analog Kid

macrumors G3
Mar 4, 2003
9,360
12,603
I suppose you have not taken a look at OS code? Maybe spend some time browsing through, say, the Linux source code and drivers, from memory management to the video drivers. They have lots of branches to support different CPUs / GPUs. This code is notoriously difficult to debug due to the nature of OS code, namely that it is interrupt driven.

If an engineer still needs to optimise the speed of their code because M1 is slower compared to M1 Pro, for example, it would introduce bugs into the code tree. This is just a simplistic example.
You describe it as though anything is stable. Core speed changes not only need to be handled from generation to generation, but from moment to moment as things like thermal throttling, power management, system loading, interrupt priorities and locks and a variety of other factors affect the core performance. There are already massive differences in memory bandwidth between the variants. I don’t think this is an excuse to purposely limit system performance.

Writing software is hard. Writing good software is even harder.
If it wasn’t hard, anyone could do it.
 

Analog Kid

macrumors G3
Mar 4, 2003
9,360
12,603
It’s this lack of a clearly less performant (single threaded wise) option that makes folks point at Apple and think “something’s fishy”.

No, it’s the lack of a more performant option that makes folks think ”something’s fishy”. Nothing I’m arguing about has anything to do with Intel, AMD, marketing or public perceptions— just physics.
 

ArkSingularity

macrumors 6502a
Mar 5, 2022
928
1,130
Apple already does have "less performant" chips in the market. They just use older chips rather than making new ones that are purposefully slower (lower-tier iPads do this frequently, as does the 2020 M1 Air).
 
  • Like
Reactions: Unregistered 4U

Unregistered 4U

macrumors G4
Jul 22, 2002
10,610
8,628
No, it’s the lack of a more performant option that makes folks think ”something’s fishy”. Nothing I’m arguing about has anything to do with Intel, AMD, marketing or public perceptions— just physics.
Well, the argument itself is, essentially “AMD’s and Intel’s work like this, why doesn’t Apple’s?”

And the answer to that, we know, is that they restrict the performance of their lower end parts.
Maybe Apple’s doesn’t work that way because ALL of the M-series chips fit within the top 25 on this chart? :)

I’ve never looked at this chart before now, but the LOWEST end M1 is only 618 points away from the highest single thread performance currently on the chart. That looks even more like Apple targeted a ‘pinnacle’ level single threaded performance and simply didn’t do what AMD/Intel did with the lower end. Could an Ultra with more juice perform better? I’d tend to think so, but, again, I’m also trying to avoid the “think like AMD/Intel” mindset, and I know that thinking is from what I’ve seen in the past with Intel processors (including a demo where they’ve SERIOUSLY overclocked a chip and are keeping it cool with liquid nitrogen being dosed manually).

I’m sure some enterprising person with a YouTube channel and a need for more subscribers will grab an Ultra and find a way to test this theory. Then, post a video, complete with a thumbnail using multicolor Impact font and a “surprised look” portrait to let us know what they find. :D
 

Analog Kid

macrumors G3
Mar 4, 2003
9,360
12,603
Well, the argument itself is, essentially “AMD’s and Intel’s work like this, why doesn’t Apple’s?”
No it’s not. The argument is essentially, “physics works like this, why doesn’t Apple?”. Fewer constraints means more possibilities. Desktops are at a different point on the power performance curve. The Studio thermal management makes that intent clear, I just haven’t seen any indication the current implementation delivers on that intent.

I’m also trying to avoid the “think like AMD/Intel” mindset, and I know that thinking is from what I’ve seen in the past with Intel processors (including a demo where they’ve SERIOUSLY overclocked a chip and are keeping it cool with liquid nitrogen being dosed manually).
But you’re still using AMD/Intel as a guidepost— they have nothing to do with the topic we’re discussing, but keep coming up.

As I said earlier, I’m not sure the AS cores can simply be arbitrarily upclocked and cooled. It may require a different core design. Clock rate is also only one path to increased performance.
 

Unregistered 4U

macrumors G4
Jul 22, 2002
10,610
8,628
No it’s not. The argument is essentially, “physics works like this, why doesn’t Apple?”. Fewer constraints means more possibilities. Desktops are at a different point on the power performance curve. The Studio thermal management makes that intent clear, I just haven’t seen any indication the current implementation delivers on that intent.
Going by the physics, I, personally, have never overclocked an M1 system successfully. I’ve also never overclocked an M1 system and seen it fail (and it may not be feasible either way). Understanding that, I realize that all I know about “more power = better performance” is based on what I’ve read related to AMD and Intel. Which is why when I ask SHOULD putting more energy in get better performance out, I recognize that the only reason why I’d say “yes” is based on information I have that may not physically apply to a structure like Apple Silicon. It could be that you’re right and that Apple Silicon processors are optimized for the clock/voltage they run at and anything higher or lower would just yield worse performance. Considering that the lowest performing Apple Silicon is within 600 points or so of the current single threaded champion, that wouldn’t be an insane assumption. They designed for an impressive power/performance apex and, with the M2, are designing for one a little bit higher.

But you’re still using AMD/Intel as a guidepost— they have nothing to do with the topic we’re discussing, but keep coming up.
Yes, that’s what I said I’m doing. “Pour more power into chip means better performance” is, I know, Intel thinking. It’s AMD thinking and I recognize that. And, I know I’m not the only one thinking that “pour more power into the chip means better performance”. :)
 

ArkSingularity

macrumors 6502a
Mar 5, 2022
928
1,130
No it’s not. The argument is essentially, “physics works like this, why doesn’t Apple?”. Fewer constraints means more possibilities. Desktops are at a different point on the power performance curve. The Studio thermal management makes that intent clear, I just haven’t seen any indication the current implementation delivers on that intent.
I imagine the chip design itself probably limits how much Apple can scale their frequencies up. CPUs get around non-zero transition times for transistor state changes by grouping CPU logic into pipeline stages and using synchros to keep track of the chip's current electrical state, so there is really only so much you can raise a given design's clock speed before you need to go back and redesign the chip. It's hard to say what these general limits would be on Apple's chips, but they've always generally prioritized IPC over clock speed, so we will see.
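The link between pipeline stages and clock speed can be sketched with a toy model: the cycle time has to cover the slowest stage plus a fixed latch overhead. The model assumes perfectly balanced stages, and the delay figures are invented for illustration; they do not describe any real chip.

```python
def max_clock_ghz(total_logic_ns: float, stages: int, latch_overhead_ns: float) -> float:
    """Toy upper bound on clock speed for a balanced pipeline.

    cycle time = (total logic delay / number of stages) + latch overhead
    """
    cycle_ns = total_logic_ns / stages + latch_overhead_ns
    return 1.0 / cycle_ns  # 1 / ns = GHz


# Invented numbers: 5 ns of total logic, 0.05 ns of overhead per stage.
clock_10_stages = max_clock_ghz(5.0, 10, 0.05)
clock_20_stages = max_clock_ghz(5.0, 20, 0.05)
```

In this model, doubling the stage count raises the clock by less than 2x because the latch overhead does not shrink. For a fixed stage count the ceiling is set by the design itself, which is why going much faster means redesigning the pipeline.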

Intel did (rather infamously) learn this lesson the hard way on the Pentium 4 when they tried to race to 3 GHz+ on comparatively ancient fabs. They achieved it (and it was great for marketing, I guess), but the chip was an absolute frying pan with a horrible design and abysmal IPC. They had to design the chip with a whopping 31 pipeline stages to do it, and branch misprediction had such a hefty penalty that it made the IPC worse than that of the Pentium 3s it replaced. Even speculative execution was botched: Intel's strategy for handling "not ready yet" instructions was quite literally to simply execute them anyway in a loop until they were ready, occupying processor resources and wasting power.
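How a deep pipeline hurts IPC can be sketched with the textbook formula: effective CPI = base CPI + branch fraction × misprediction rate × flush penalty. The workload numbers below are invented, and the flush penalty is assumed to be roughly equal to the pipeline depth, which is a simplification.

```python
def effective_ipc(base_ipc: float, branch_fraction: float,
                  mispredict_rate: float, flush_penalty_cycles: float) -> float:
    """Instructions per cycle after accounting for branch misprediction stalls."""
    cycles_per_instruction = (
        1.0 / base_ipc
        + branch_fraction * mispredict_rate * flush_penalty_cycles
    )
    return 1.0 / cycles_per_instruction


# Invented workload: 20% branches, 5% of them mispredicted, base IPC of 2.
short_pipeline_ipc = effective_ipc(2.0, 0.20, 0.05, flush_penalty_cycles=10)
deep_pipeline_ipc = effective_ipc(2.0, 0.20, 0.05, flush_penalty_cycles=31)
```

With the same predictor and workload, the deeper pipeline loses a noticeable share of its IPC. Since performance is roughly IPC times clock, the higher clock has to make up for that loss before there is any net gain.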

Intel has since figured out how to get much higher clock speeds on shorter pipelines, but we've got the benefit of better fabs nowadays. We will have to see how far Apple takes their chips, but I imagine it will be a while before they break past 4 GHz or so. We might see them release downclocked versions in some devices, but so far, Apple has mainly just used older versions of their CPUs instead. It remains to be seen what they do in the next few years, I suppose.
 

Analog Kid

macrumors G3
Mar 4, 2003
9,360
12,603
I imagine the chip design itself probably limits how much Apple can scale their frequencies up. CPUs get around non-zero transition times for transistor state changes by grouping CPU logic into pipeline stages and using synchros to keep track of the chip's current electrical state, so there is really only so much you can raise a given design's clock speed before you need to go back and redesign the chip. It's hard to say what these general limits would be on Apple's chips, but they've always generally prioritized IPC over clock speed, so we will see.


As I said earlier, I’m not sure the AS cores can simply be arbitrarily upclocked and cooled. It may require a different core design. Clock rate is also only one path to increased performance.
The entry point to this discussion though wasn’t whether Apple can increase the single thread performance, but whether they should.
 
  • Like
Reactions: ArkSingularity