
cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
Hi Cmaier!

I found the issue - I typed decoder but meant 8 wide decode block - Apple Silicon cores appear to have both giant caches and be 8 wide. Anandtech has written about it as did a developer. Seeing as Anandtech is the brainchild of Anand Lal Shimpi of the Apple Silicon team I think they have a good knowledge source.


Ok, this is what it says:

Other contemporary designs such as AMD’s Zen(1 through 3) and Intel’s µarch’s, x86 CPUs today still only feature a 4-wide decoder designs (Intel is 1+4) that is seemingly limited from going wider at this point in time due to the ISA’s inherent variable instruction length nature, making designing decoders that are able to deal with aspect of the architecture more difficult compared to the ARM ISA’s fixed-length instructions.

I don’t disagree with that, in principle - it says “seemingly” and “at this point in time.” So it’s not claiming there is some inherent unsolvable problem. And someone earlier claimed that AMD said it couldn’t go wider, and I don’t think AMD ever said that.

I think it all comes back to my point - going wider would have diminishing returns, because the added pipelines wouldn’t be filled often enough to make the added hardware worth it.
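
To make the diminishing-returns argument concrete, here is a toy model, not a CPU simulator. It assumes every instruction takes one cycle, that the core can see the whole program at once, and that the only limit besides dependencies is the issue width. The function name `cycles_to_run` and the workload are made up for illustration.

```python
def cycles_to_run(deps: list[list[int]], width: int) -> int:
    """Cycles needed when at most `width` ready instructions issue per cycle.

    deps[i] lists the earlier instructions whose results instruction i needs.
    Every instruction is assumed to take exactly one cycle (a simplification),
    and `width` is assumed to be at least 1.
    """
    done: set[int] = set()               # instructions finished in earlier cycles
    remaining = list(range(len(deps)))
    cycles = 0
    while remaining:
        cycles += 1
        issued = []
        for i in remaining:
            if len(issued) == width:
                break                    # all issue slots used this cycle
            if all(d in done for d in deps[i]):
                issued.append(i)         # every producer finished earlier
        done.update(issued)
        remaining = [i for i in remaining if i not in done]
    return cycles


# Hypothetical workload: 4 independent dependency chains, 6 instructions each,
# interleaved so that instruction i depends on instruction i - 4.
workload = [[i - 4] if i >= 4 else [] for i in range(24)]

for width in (1, 2, 4, 8):
    print(width, cycles_to_run(workload, width))   # 24, 12, 6, 6 cycles
```

In this sketch, going from 4-wide to 8-wide buys nothing because the code only ever has four independent instructions available. Real cores and real code are far messier, but the shape of the curve is the point.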
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
I think it all comes back to my point - going wider would have diminishing returns, because the added pipelines wouldn’t be filled often enough to make the added hardware worth it.

This makes perfect sense. And this line of reasoning transcends ISA - it applies equally to x86, ARM and anything else. There is a reason that even high-performance ARM designs like X1 max out at 5 decoders (which is roughly equivalent to what’s found in x86 land).

But Apple has somehow managed to crack the problem. Maybe it’s their absolutely humongous reorder buffers (they can keep hundreds of loads and stores in flight), or maybe their branch predictors are that much better, but somehow they can have a 50% wider backend than anyone else and still keep it well utilized.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
Think of it this way - how many incoming ISA instructions do you think an x86 CPU even knows about at once? How many can it look at simultaneously to figure out if there’s an instruction in the queue that does not have any dependencies on prior instructions? Not a lot.

352 micro-ops on Tiger Lake, 256 on Zen 3, over 600 micro-ops on M1. So about 150-200 instructions for x86 designs... I wouldn't say that this window is exceedingly small. It's not like the compiler has an infinite optimization window either (function body limits etc.).
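
A rough sketch of why the size of that window matters, under loud assumptions: registers are perfectly renamed (so only true read-after-write dependencies count), and nothing in the window has finished yet. `Instr`, `ready_in_window` and the six-instruction program are hypothetical, not taken from any real core.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str                 # register written
    srcs: tuple[str, ...]     # registers read


def ready_in_window(program: list[Instr], window: int) -> list[int]:
    """Indices among the first `window` instructions that could issue right now.

    An instruction is ready if none of its sources is written by an earlier
    instruction in the same window (that result is still in flight).
    """
    pending: set[str] = set()
    ready = []
    for i, ins in enumerate(program[:window]):
        if not any(s in pending for s in ins.srcs):
            ready.append(i)
        pending.add(ins.dest)  # its result is not available this cycle
    return ready


# Hypothetical program: three independent loads, two of them followed by
# dependent arithmetic.
program = [
    Instr("r1", ("r0",)),   # 0: r1 = load [r0]
    Instr("r2", ("r1",)),   # 1: r2 = r1 + 1
    Instr("r3", ("r2",)),   # 2: r3 = r2 * 2
    Instr("r5", ("r4",)),   # 3: r5 = load [r4]
    Instr("r6", ("r5",)),   # 4: r6 = r5 + 1
    Instr("r8", ("r7",)),   # 5: r8 = load [r7]
]

for window in (2, 4, 6):
    print(window, ready_in_window(program, window))
```

With a window of 2 only instruction 0 can go; with 4 the core also finds instruction 3; with 6 it finds instruction 5 as well. A bigger window exposes more independent work, which is the whole argument about reorder buffer sizes.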

But I digress, you obviously have much more knowledge in this domain so I assume that you are right. I'm a software guy, not a hardware guy. From the algorithmic side of things, I have difficulty understanding why the more "tight" nature of x86 would put it at a disadvantage... but I suppose it's just me.

Where I do 100% follow you is on things like rearranging branches and "heavy" dependencies (e.g. loads that are likely to miss the cache, instructions that are known to have long running times, etc.) at compile time to assist out-of-order execution. I just think that overdoing this might end up hurting performance on a more sophisticated CPU that can potentially eliminate some loads and fuse operations.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
The passage makes a lot of sense. Not sure if you ever looked into decoders of a fixed-length instruction set architecture like ARM. The instructions are all 4 bytes wide and 4-byte aligned, making it possible to fetch a large memory block per cycle and feed it directly into parallel decoders. This is just not possible with x64, as you correctly pointed out, because you need to figure out where each instruction starts and ends - which is largely sequential in nature.
Not sure why you claim the decode has nothing to do with parallelism either - you should know better despite having mostly worked on the backend side.

Variable-length decoding can be parallelized as well. You might want to look at SIMD algorithms for things like UTF-8 validation or JSON parsing. They exist and work very well. The basic principle is that you have routines that detect bit patterns (ones that encode how long a sequence is); these routines work in parallel and their output can be combined in order to quickly and efficiently detect sequence boundaries. Fixed-function hardware of this kind can be made very fast too (think sorting networks). It's just that for something as messed up as x86 it takes a lot of transistor budget and presumably a lot of power. If you compare the binary encoding of x86 and ARM AArch64, the former is made out of nightmares while the latter is a beautiful, elegant butterfly. Still, x86 CPUs have largely fixed their issues by employing optimization techniques like uop caches etc.
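
A minimal sketch of that principle for UTF-8, written in plain Python rather than real SIMD code. Each byte is classified by looking only at its own top bits, so every byte could be handled by a separate lane, and the per-byte results are then combined into a list of sequence starts. The names `classify` and `sequence_starts` are illustrative, and this does no validation beyond spotting lead bytes.

```python
def classify(byte: int) -> int:
    """Sequence length announced by a lead byte, 0 for a continuation byte.

    Only this byte's top bits are inspected, so all bytes of a block can be
    classified independently of each other.
    """
    if byte & 0x80 == 0x00:
        return 1      # 0xxxxxxx: single-byte sequence
    if byte & 0xC0 == 0x80:
        return 0      # 10xxxxxx: continuation byte
    if byte & 0xE0 == 0xC0:
        return 2      # 110xxxxx: starts a 2-byte sequence
    if byte & 0xF0 == 0xE0:
        return 3      # 1110xxxx: starts a 3-byte sequence
    if byte & 0xF8 == 0xF0:
        return 4      # 11110xxx: starts a 4-byte sequence
    return -1         # 11111xxx: never valid in UTF-8


def sequence_starts(data: bytes) -> list[int]:
    """Offsets at which a new encoded sequence begins."""
    lengths = [classify(b) for b in data]          # independent per byte
    return [i for i, n in enumerate(lengths) if n > 0]


text = "aé€😀".encode("utf-8")    # 1-, 2-, 3- and 4-byte sequences
print(sequence_starts(text))      # [0, 1, 3, 6]
```

UTF-8 is the friendly case, because a continuation byte can be recognized on its own. An x86 byte carries no such marker, which is one reason the hardware version costs so much more.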

Probably the real reason why x86 doesn't go wider is what @cmaier points out: it is very difficult to feed a wide execution backend and you quickly run into diminishing returns. As I wrote above, Apple is the only one who seems to have solved this problem in practical terms: there is no other company shipping a commercial out-of-order CPU with such a wide design as far as I am aware.

P.S. Fun little detail: Apple GPUs use variable-length instruction encoding, with instructions ranging from two to 12 bytes... it seems to be easy to decode though...
 

jeremiah256

macrumors 65816
Aug 2, 2008
1,444
1,169
Southern California
Story today about Apple not making an iMessage client for android phones -- because it would hurt Apple. That's pretty close to taking a monopoly power too far.

What monopoly power? Including iMessage, I've got four messaging apps on my iPhone.

Facebook's WhatsApp and Facebook Messenger have more active users than iMessage. And WeChat is growing and may overtake iMessage soon. Exclusivity does not mean monopoly, especially when there are multiple, more popular, options.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
Story today about Apple not making an iMessage client for android phones -- because it would hurt Apple. That's pretty close to taking a monopoly power too far.

That is very strange logic. Messages is not a monopoly; it is one service out of many. And of course a company uses its services in order to lock in/keep its clients. Complaining that Apple doesn't open Messages to other platforms is like complaining that KFC does not hand out their recipe to Burger King or that Coca-Cola does not allow PepsiCo to produce Coke.

Now, if Messages were the only messaging service available on the iPhone, this argument would have much more merit. But as it is now, it's akin to arguing that a company is not allowed to offer exclusive services to lock in their brand benefits.

Edit: after thinking about it for a couple of minutes, I do agree that the fact that Messages is the default app for SMS and also the default rich messaging service could under some circumstances be interpreted as abuse of the company's position. There were relevant rulings regarding default browsers... Frankly, I think that Apple should make SMS services available to third parties and allow the default messaging app to be changed by the user. That should remove any concerns over unfair practices.
 

jeremiah256

macrumors 65816
Aug 2, 2008
1,444
1,169
Southern California
That is very strange logic. Messages is not a monopoly; it is one service out of many. And of course a company uses its services in order to lock in/keep its clients. Complaining that Apple doesn't open Messages to other platforms is like complaining that KFC does not hand out their recipe to Burger King or that Coca-Cola does not allow PepsiCo to produce Coke.

Now, if Messages were the only messaging service available on the iPhone, this argument would have much more merit. But as it is now, it's akin to arguing that a company is not allowed to offer exclusive services to lock in their brand benefits.
The way people are acting whenever there's a new chicken sandwich, maybe KFC and Popeyes do need to open source those recipes.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
My Ryzen Windows laptop is faster than MacBook Air M1. No joke
What's so "haha" I don't get it. I tested out from boot times to opening chrome and other apps. Windows is faster. I have a feeling that people here don't use Windows daily and are biased.

Thank you, that's great to know! I should probably move from my Mac to a Ryzen-based Windows machine, it sounds like it's perfect for my job of "professionally booting the computer up and opening Chrome repeatedly 1000 times per day". Why, it sounds like I could probably get around 30-50% more boot cycles into my workday, dramatically increasing my revenue! Good bye sad, miserable life in poverty, good morning caviar and champagne!
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
There were relevant rulings regarding default browsers... Frankly, I think that Apple should make SMS services available to the third party and allow the default messaging app to be changed by the user. That should remove any concerns over unfair practices.
If you are referring to the Microsoft case with Internet Explorer, I remember that having Internet Explorer as a default install did not by itself constitute anti-trust behaviour. It was Microsoft's tactic of preventing (threatening?) OEMs from pre-installing Netscape as an alternative browser, at the risk of not getting Windows licenses (or something to that effect), that got them into trouble.

I suppose Apple could have APIs to allow sending of SMS, but that may open up DoS attack vectors for malware.

I'm not sure if Android has such capabilities. I suspect not tho.
 

iHorseHead

macrumors 68000
Jan 1, 2021
1,594
2,003
Thank you, that's great to know! I should probably move from my Mac to a Ryzen-based Windows machine, it sounds like it's perfect for my job of "professionally booting the computer up and opening Chrome repeatedly 1000 times per day". Why, it sounds like I could probably get around 30-50% more boot cycles into my workday, dramatically increasing my revenue! Good bye sad, miserable life in poverty, good morning caviar and champagne!
You don't have to show me the attitude. Also Unity works better on my Windows laptop and has better export times. I didn't bring it up because Chrome is the only M1 app I use to keep things fair. Too early to say anything about Unity.
 

EntropyQ3

macrumors 6502a
Mar 20, 2009
718
824
cmaier, I really appreciate you sharing your from-the-trenches experience. Thanks.
I would like to ask you something that is really difficult to evaluate from my armchair perspective - I have often heard that the sheer volume of the x86 ISA, the "accumulated cruft", would make designing new x86 cores require more work/time/expense/debugging than designing, say, a pure 64-bit ARMv8 core.
It sounds plausible, but - by how much? Enough that it significantly affects decision-to-product cycle time, or can it be compensated by hiring more people? Does it have any specific consequences you’d like to mention (apart from the more formal consequences of dependencies you’ve already discussed)?
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
You don't have to show me the attitude. Also Unity works better on my Windows laptop and has better export times. I didn't bring it up because Chrome is the only M1 app I use to keep things fair. Too early to say anything about Unity.

Not surprised about Unity, it still lacks native tools from what I know. As to “attitude”, I reserve every right to poke fun at you if your definition of “performance” revolves around boot times. And I also totally believe that Mac apps launch marginally slower; the OS is also doing much more work here (sandboxing, cryptographic validation, certificate verification...).

On a more serious note, yes, mobile Ryzen will often be faster on short-running workloads that utilize multiple cores. It has twice as many cores as M1 and uses almost three times as much power during its PL2 window. Not to mention that performance-oriented Ryzen laptops set PL1 to 25 watts. M1 is pretty much capped at 15W CPU except some micro benchmarks that can push it a bit more.
 

iHorseHead

macrumors 68000
Jan 1, 2021
1,594
2,003
Not surprised about Unity, it still lacks native tools from what I know. As to “attitude”, I reserve every right to poke fun at you if your definition of “performance” revolves around boot times. And I also totally believe that Mac apps launch marginally slower; the OS is also doing much more work here (sandboxing, cryptographic validation, certificate verification...).

On a more serious note, yes, mobile Ryzen will often be faster on short-running workloads that utilize multiple cores. It has twice as many cores as M1 and uses almost three times as much power during its PL2 window. Not to mention that performance-oriented Ryzen laptops set PL1 to 25 watts. M1 is pretty much capped at 15W CPU except some micro benchmarks that can push it a bit more.
I just told it how it is. My definition of performance doesn't revolve around boot times. Windows does verifications too and it has certificates. I just use Unity for my day-to-day work and on Windows it's faster. I'm still waiting for the M1 version.
My point was Ryzen is pretty good and shouldn't be underestimated. A lot better than Intel PCs I really have to use at work.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
My point was Ryzen is pretty good and shouldn't be underestimated. A lot better than Intel PCs I really have to use at work.

Nobody is dismissing Ryzen. Zen3 has finally caught up with Intel and AMD’s execution so far is very promising. It’s just that it’s no match for Apple CPUs in terms of efficiency.
 

iHorseHead

macrumors 68000
Jan 1, 2021
1,594
2,003
Nobody is dismissing Ryzen. Zen3 has finally caught up with Intel and AMD’s execution so far is very promising. It’s just that it’s no match for Apple CPUs in terms of efficiency.
I highly doubt it.
 

Tenkaykev

macrumors 6502
Jun 29, 2020
386
431
You are comparing a quad-core CPU that tops out at 15 watts with an 8-core CPU that has a long-term sustained limit of 35W and usually operates at a power level of 50-60 watts... so unless the Ryzen is beating the M1 by a factor of 3 across the board, I am not sure what your point is.
I thought the same, to me it came across as a ringing endorsement of the M1 MacBook?
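
For reference, the arithmetic behind that "factor of 3" can be spelled out in a couple of lines. This is only a back-of-the-envelope sketch using the wattages quoted above; `break_even_speedup` is a made-up helper name, and package power is a crude stand-in for real efficiency measurements.

```python
def break_even_speedup(power_a_watts: float, power_b_watts: float) -> float:
    """Speedup chip B needs over chip A just to match A's performance per watt."""
    return power_b_watts / power_a_watts


m1_watts = 15.0                        # figure quoted in the thread
for ryzen_watts in (35.0, 50.0, 60.0): # sustained limit and typical range quoted
    factor = break_even_speedup(m1_watts, ryzen_watts)
    print(f"{ryzen_watts:.0f} W vs {m1_watts:.0f} W: needs {factor:.2f}x the performance")
```

At the quoted 50-60 W the Ryzen would need roughly 3.3x to 4x the performance to break even on efficiency, and about 2.3x at the 35 W sustained limit.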
 

Icelus

macrumors 6502
Nov 3, 2018
422
578
You are comparing a quad-core CPU that tops out at 15 watts with an 8-core CPU that has a long-term sustained limit of 35W and usually operates at a power level of 50-60 watts... so unless the Ryzen is beating the M1 by a factor of 3 across the board, I am not sure what your point is.
And SMT2, so 16 threads. Windows also has (had?) a prefetcher to preload (parts of) commonly used applications in order to reduce loading times.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
And SMT2, so 16 threads. Windows also has (had?) a prefetcher to preload (parts of) commonly used applications in order to reduce loading times.

SMT2 as well as efficiency cores are implementation details, so I don’t think they should be held against a CPU. It’s Apple’s choice to forgo SMT in their designs; we shouldn’t penalize other CPUs because they implement it.
 

bobcomer

macrumors 601
May 18, 2015
4,949
3,699
Frankly, I think that Apple should make SMS services available to the third party and allow the default messaging app to be changed by the user. That should remove any concerns over unfair practices.
I agree, but it's not really something I care much about. As long as there is SMS, I'm good. That's basically how it works on Android.

I only posted that story because of what Apple said. SMS is SMS to me, no matter what app I'm using to send texts. I don't go in for the flashy iMessage capabilities.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
I thought it was odd that Apple said that....

You funny guy :) No but really, why would they willingly give away one of the best exclusive features of their platform to a competitor? It makes no business sense.
 

UBS28

macrumors 68030
Oct 2, 2012
2,893
2,340
Don't get me wrong, I love my M1 laptop; it's the most valued thing I own. But I love it when these CPU makers fight... it will only help us consumers get a better product.

AMD is the one that is going to beat M1, because they can easily switch to 5nm too once TSMC has enough capacity to produce those chips. And they have much greater software compatibility.

The fact that you could buy an AMD "gaming PC" that was faster than the most expensive 28-core Mac Pro (Intel) made it quite clear that AMD has been the go-to solution for a long time already.

I honestly wish Apple had gone to AMD, to ensure maximum software compatibility (so there would also be no reason to kill 32-bit, as AMD supports it). I'm confident that some of my pro equipment will never work again on a Mac, as the guys behind the product look to only support Windows now after Apple killed 32-bit.
 

BigMcGuire

Cancelled
Jan 10, 2012
9,832
14,032
I built an AMD screamer last year (3600X with 64GB of 3200 RAM and two 1TB NVMe drives). That was the fastest I've ever experienced Windows. I put a massive Cooler Master heatsink on the CPU and it was able to hit factory overclock non-stop for hours. Screaming fast system.

The M1 is impressive. I've had some time to use it and it is the fastest Mac experience I've ever had. I played a long Starcraft II game and noticed that the fans never turned on (or if they did, I didn't notice) - but the case got warm to the touch. It was able to run Starcraft II comfortably on medium on my 4k monitor. What I didn't like is that the battery temp got to almost 40C. But that's Mac - they prefer to cook things instead of having the sound of airflow.

If I were to build a PC, I'd go AMD, no question - it has been that way for a while.

As someone who uses Windows every day... I guess, according to some on this thread here, I have the authority to say the M1 is really fast and impressive. I can read 10+ Safari tabs and sip 1.26-2.2 watts. I can read for hours and consume only a few percentage points of battery - all the while the laptop being ice cold metal on my lap. Absolutely love it. This has replaced my iPad for now.

Love this chat about CPUs. I'm very impressed with the M1. I don't understand the need for people to defend their OS as if it was religion. These are tools to get jobs done and/or personal preferences for relaxation/life?
 