
leman

macrumors Core
Oct 14, 2008
19,521
19,678
Do we have a clear picture of how good utilisation of Firestorm cores tends to be with 1 thread and how much could potentially be left on the table for something like SMT? I don't imagine it would be much but do we have any concrete evidence for it?

Maybe @name99 has some data?

The article is right though. If synthetic benchmarks test single thread rather than single-core performance, it is not a fair comparison as the M1 doesn’t support HT. Basically 1 thread = 1 core on M1. While 1 core = 2 threads on Intel.

Single-core performance is single-threaded performance. Anything else and you are entering nonsense land. By that logic IBM makes the fastest CPU cores, but when you actually try running something on them you'd get the performance of a wet noodle.

You want to know how fast a CPU can run stuff, not obfuscate test results by mixing in arbitrary hardware details. A single M1 core will be faster than a single Tiger Lake core no matter how many threads you run on it (can be one, two or one thousand).
 

Gnattu

macrumors 65816
Sep 18, 2020
1,107
1,671
The logical core stuff reminds me of the time of the A10 SoC, the first Apple SoC with a "big-little" design. One efficiency core and one performance core are grouped together and exposed to the OS as a single core. This is something like a "reverse SMT" that exposes two CPU cores as one logical core. A big limitation of such a design is that only one core type, either performance or efficiency, can be active in the pair but not both, so the benchmark MT score is no faster than a dual-core chip's. I cannot say those benchmarks are unfair because they cannot activate more cores using MT workloads: there is no way to do that.
 

MauiPa

macrumors 68040
Apr 18, 2018
3,438
5,084
I’ll say this in simple language that even you can understand (shout-out to old 60s documentaries; yah, they actually said that). The only reason for hyper-threading at all is to help fill wait states from inefficiencies in the x86 instruction set. Why hold the processor at idle when it is waiting for complex instructions to be finished, when you could have the processor run another thread to fill in the waits? You could also overcome this waiting problem with more efficient instructions that wait less - the Arm or RISC approach. Who cares if it takes 1 instruction or 20 instructions to complete a task, you only care which approach finishes the task quicker, AKA more efficiently.

Hyper-threading is a reasonable approach, but so is reducing the complexities of instruction sets. One requires multiple threads to work, the other is just inherently more efficient.

Finally, let’s not forget that for single-core performance Intel is also using that look-ahead scheme (which has proven to be a security vulnerability) to increase efficiency. You can only get so much.

It will be interesting to see if x86 chips hold on, or everyone migrates to a simpler, more efficient model. Lots of ARM development out there. Of course Qualcomm will probably require you to pay license fees on toasters if you want to use their SoCs, but I would expect their offerings to be substantial nonetheless.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
The logical core stuff reminds me of the time of the A10 SoC, the first Apple SoC with a "big-little" design. One efficiency core and one performance core are grouped together and exposed to the OS as a single core. This is something like a "reverse SMT" that exposes two CPU cores as one logical core. A big limitation of such a design is that only one core type, either performance or efficiency, can be active in the pair but not both, so the benchmark MT score is no faster than a dual-core chip's. I cannot say those benchmarks are unfair because they cannot activate more cores using MT workloads: there is no way to do that.

Exactly. You have a test workload, you run it, you measure the result — that's what you get. Reframing it in the context "but this CPU can theoretically run X threads with improved efficiency so running one thread is not representative" is at best opportunism and at worst blatant manipulation. Want to talk about performance running multiple threads? Run multiple threads and measure the results! Coming up with some sort of hypothetical "core" performance instead (whatever that might be) is not useful in the least.
 
  • Like
Reactions: throAU

leman

macrumors Core
Oct 14, 2008
19,521
19,678
I’ll say this in simple language that even you can understand (shout-out to old 60s documentaries; yah, they actually said that). The only reason for hyper-threading at all is to help fill wait states from inefficiencies in the x86 instruction set. Why hold the processor at idle when it is waiting for complex instructions to be finished, when you could have the processor run another thread to fill in the waits? You could also overcome this waiting problem with more efficient instructions that wait less - the Arm or RISC approach. Who cares if it takes 1 instruction or 20 instructions to complete a task, you only care which approach finishes the task quicker, AKA more efficiently.

It's hardly this simple. Power ISA is pretty much RISC — and yet Power10 has 8-way SMT! SMT is a design option, plain and simple.
 

crazy dave

macrumors 65816
Sep 9, 2010
1,453
1,229
Oh this article again … everyone else has already covered the salient technical points, but I’ll just add that the Anandtech writers Ian and Andrei tried to correct this guy on Twitter and he just … didn’t get it.

Personally it was really eye opening how many “tech journalists” repeated it and gave it credence all based on not understanding simple terminology.
 
  • Like
Reactions: JMacHack

eicca

Suspended
Oct 23, 2014
1,773
3,604
I have concluded benchmarks don’t really mean much. My 2020 work MacBook Air has benchmarks nearly double my old 2011 MacBook Pro, but the Air is far and away the slowest computer I use (and it only has one third-party app on it, which is Firefox). No idea why. But it ain’t benchmarks.

Another example: my Mac Pro has an even lower single core benchmark than my 2011 MacBook Pro, but single thread tasks are still somehow worlds faster.

The only real way to judge a computer is actual usage cases.

EDIT: Failed to specify, my 2020 MBA is the i5 model.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
I have concluded benchmarks don’t really mean much.

Of course benchmarks matter. But one needs to understand how to interpret them and whether they will apply to a specific use case. If you are a regular home/office user, the only benchmark that is relevant to you is how quickly the system responds to your action and that's not really measurable in the first place.
 

jonblatho

macrumors 68030
Jan 20, 2014
2,529
6,241
Oklahoma
I have concluded benchmarks don’t really mean much. My 2020 work MacBook Air has benchmarks nearly double my old 2011 MacBook Pro, but the Air is far and away the slowest computer I use (and it only has one third-party app on it, which is Firefox). No idea why. But it ain’t benchmarks.

Another example: my Mac Pro has an even lower single core benchmark than my 2011 MacBook Pro, but single thread tasks are still somehow worlds faster.

The only real way to judge a computer is actual usage cases.
Assuming that this is an M1 MacBook Air and you experience that slowness in Firefox…not that Firefox is known for stellar performance/efficiency, but are you sure you’re using the Apple silicon version? Browsers can tend to struggle in Rosetta 2 translation.
 

futbalguy

macrumors 6502
May 16, 2007
285
63
I have concluded benchmarks don’t really mean much. My 2020 work MacBook Air has benchmarks nearly double my old 2011 MacBook Pro, but the Air is far and away the slowest computer I use (and it only has one third-party app on it, which is Firefox). No idea why. But it ain’t benchmarks.

Another example: my Mac Pro has an even lower single core benchmark than my 2011 MacBook Pro, but single thread tasks are still somehow worlds faster.

The only real way to judge a computer is actual usage cases.
Your M1 MacBook Air should crush the 2011 MacBook Pro. Check that you are running an M1 native app. Another possibility is the MacBook Air may have less memory and could be using swap space on the SSD, which is much slower. The only other thing I can think of is that the GPU on the MacBook Pro is better than the MacBook Air's, but 2011 is so old I don’t think that should be the case.
 

dgdosen

macrumors 68030
Dec 13, 2003
2,817
1,463
Seattle
Your M1 MacBook Air should crush the 2011 MacBook Pro. Check that you are running an M1 native app. Another possibility is the MacBook Air may have less memory and could be using swap space on the SSD, which is much slower. The only other thing I can think of is that the GPU on the MacBook Pro is better than the MacBook Air's, but 2011 is so old I don’t think that should be the case.
Unless it's an early 2020 Intel version... I think those are particularly thermally constrained.
 

eicca

Suspended
Oct 23, 2014
1,773
3,604
Assuming that this is an M1 MacBook Air and you experience that slowness in Firefox…not that Firefox is known for stellar performance/efficiency, but are you sure you’re using the Apple silicon version? Browsers can tend to struggle in Rosetta 2 translation.

Your M1 MacBook Air should crush the 2011 MacBook Pro. Check that you are running an M1 native app. Another possibility is the MacBook Air may have less memory and could be using swap space on the SSD, which is much slower. The only other thing I can think of is that the GPU on the MacBook Pro is better than the MacBook Air's, but 2011 is so old I don’t think that should be the case.

I failed to specify that my 2020 MBA is the i5 model. It still benchmarks double my 2011, but man, that Air is just molasses. I'm trying to talk our IT guy into upgrading me to an M1.
 

eicca

Suspended
Oct 23, 2014
1,773
3,604
Unless it's an early 2020 Intel version... I think those are particularly thermally constrained.
You know, I wonder if that's the thing. I almost never hear the fans on my MBA but it's catastrophically slow. Whereas my 2011 MBP spins up the fans pretty quick but still stays much faster.
 

tonyz123456

macrumors member
Apr 4, 2017
79
56
I randomly came across this thread while browsing but am curious - what mainstream apps or games today are still single-threaded where this matters? I don't think I've had a single-core CPU computer in a very, very long time - maybe 10+ years - so aren't single-threaded benchmarks pointless, since most software worth paying for has supported multi-core for years?

Whether it's optimized for the M1 chip or not is a separate topic.
 

Spindel

macrumors 6502a
Oct 5, 2020
521
655
I’ll say this in simple language that even you can understand (shout-out to old 60s documentaries; yah, they actually said that). The only reason for hyper-threading at all is to help fill wait states from inefficiencies in the x86 instruction set. Why hold the processor at idle when it is waiting for complex instructions to be finished, when you could have the processor run another thread to fill in the waits? You could also overcome this waiting problem with more efficient instructions that wait less - the Arm or RISC approach. Who cares if it takes 1 instruction or 20 instructions to complete a task, you only care which approach finishes the task quicker, AKA more efficiently.
While I agree with you to a large degree, it’s not only CISC CPUs that have inefficiencies.

For example, the Power architecture has up to 8 threads per core because, even though the instruction set is reduced, it has one hell of a lot of, for example, multiplication units that cannot be kept filled all the time. Thus it has HT/SMT.
 

jonblatho

macrumors 68030
Jan 20, 2014
2,529
6,241
Oklahoma
I failed to specify that my 2020 MBA is the i5 model. It still benchmarks double my 2011, but man, that Air is just molasses. I'm trying to talk our IT guy into upgrading me to an M1.
Ah yes, I forgot about the early 2020 Intel refresh. Yeah, there are some pretty serious thermal constraints on that so it’s probably just the 2015–2020 pattern of Apple asking too much of the CPUs and corresponding cooling they put into their machines, especially notebooks.
 

name99

macrumors 68020
Jun 21, 2004
2,410
2,322
Maybe @name99 has some data?



Single-core performance is single-threaded performance. Anything else and you are entering nonsense land. By that logic IBM makes the fastest CPU cores, but when you actually try running something on them you'd get the performance of a wet noodle.

You want to know how fast a CPU can run stuff, not obfuscate test results by mixing in arbitrary hardware details. A single M1 core will be faster than a single Tiger Lake core no matter how many threads you run on it (can be one, two or one thousand).

What's the question?

I'm not interested in tribal idiocy.
You want to compare the performance of a SINGLE-THREADED M1 against a SINGLE-THREADED x86, well, look at the GB5 or AnandTech SPEC numbers.
You want to compare the multi-threaded performance of a particular SoC (M1 Pro 6 core or whatever) against a particular x86 SoC (Tiger Lake i7-1185G7 or whatever), again GB5 and AnandTech SPEC numbers give the results (spoiler alert -- more cores give more throughput! -- and cost more! -- and use more energy!)

But when you want to start playing games where you say "I will insist that my unit of computation is whatever makes my team look best" that's where I lose patience. Why is the appropriate unit of comparison the "x86 hyperthreaded core" and not, for example, "the M1 P-cluster"?
If you're going to play that sort of game, more sensible targets are:
- performance per dollar or
- performance per watt.

I write, and explain, for people who want to understand. Not for people who are ONLY interested in dick-measuring.
I'd urge you to do the same. It's vastly more interesting figuring out how an M1 L1D cache works compared to a recent Intel cache, than wasting time trying to explain things to people who have zero interest in understanding.
 

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
I randomly came across this thread while browsing but am curious - what mainstream apps or games today are still single-threaded where this matters? I don't think I've had a single-core CPU computer in a very, very long time - maybe 10+ years - so aren't single-threaded benchmarks pointless, since most software worth paying for has supported multi-core for years?

Whether it's optimized for the M1 chip or not is a separate topic.
No. Many problems are simply not parallelizable. Even in multi-threaded apps, single thread performance matters.
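To put a rough number on that, here is a minimal sketch using Amdahl's law. Nothing here comes from the thread: the function name is made up for illustration, and the 90% parallel fraction is an assumed figure, not a measurement of any real app.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Best-case speedup from Amdahl's law.

    parallel_fraction is the share of the work that can be spread
    across cores (an assumed input, not something measured here).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part always runs at single-thread speed;
    # only the parallel part shrinks as cores are added.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for cores in (1, 2, 8, 64, 1024):
        print(f"{cores:>5} cores: {amdahl_speedup(0.9, cores):.2f}x")
```

With an assumed 90% parallel fraction the speedup can never exceed 10x, however many cores you add, because the remaining 10% still runs at single-thread speed.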
 
  • Like
Reactions: ADGrant and throAU

casperes1996

macrumors 604
Jan 26, 2014
7,599
5,770
Horsens, Denmark
I randomly came across this thread while browsing but am curious - what mainstream apps or games today are still single-threaded where this matters? I don't think I've had a single-core CPU computer in a very, very long time - maybe 10+ years - so aren't single-threaded benchmarks pointless, since most software worth paying for has supported multi-core for years?

Whether it's optimized for the M1 chip or not is a separate topic.

Yes and no. Let's take an extreme example to illustrate the point. Same principles apply in more realistic examples but we'll make it extreme to really illustrate it.

Let's say you want to play a YouTube video. Normally a lot of this will be offloaded to dedicated hardware and real world video codecs don't really parallelise exactly this way but let's say as an example that we can make 1 thread for every frame that needs to be decoded in the video. It's a short video so let's say there are 600. At 30FPS that's 20 seconds.
We have a pretty awesome beast of a multi-core machine with 2 million cores, 600 of which go to work on this task at the same time. Awesome, the whole video should be decoded in no time. But actually, each core is super slow and takes about 12 minutes to decode one frame. Because you had that many cores, all frames will be ready after 12 minutes and you can watch the whole video. But it also takes 12 minutes to get just one frame ready, so you wait 12 minutes before a 20-second video even starts. And yet you have super good multithreaded benchmarks with your 2 million cores!
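The arithmetic of that made-up machine can be sketched in a few lines. This is only an illustration: `decode_times` is a hypothetical helper, it assumes one frame per core at a time with no dependencies between frames, and the fast single core at 0.02 seconds per frame is an assumed comparison point, not a real chip.

```python
import math


def decode_times(frames: int, cores: int, seconds_per_frame: float) -> tuple[float, float]:
    """Return (seconds until the first frame is ready,
    seconds until all frames are ready)."""
    # Each core decodes one frame at a time, so the work proceeds in rounds.
    rounds = math.ceil(frames / cores)
    return seconds_per_frame, rounds * seconds_per_frame


if __name__ == "__main__":
    # The slow many-core machine from the example: 12 minutes per frame.
    print(decode_times(frames=600, cores=2_000_000, seconds_per_frame=12 * 60))
    # A single, much faster core (assumed figure).
    print(decode_times(frames=600, cores=1, seconds_per_frame=0.02))
```

The many-core machine needs 720 seconds before even the first frame is ready. The single fast core has its first frame almost immediately and finishes all 600 in about 12 seconds, less than the 20 seconds the video takes to play.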
The essence of the problem is that even if things are logically parallelised, it still matters how fast each individual task can be finished. Furthermore, some tasks cannot be performed in parallel, so a program may be multi-threaded where possible, but it isn't possible everywhere. Some operations are logically dependent on the results of prior operations. For example, say you run a program that automates a task by checking for new emails and then grouping all your new emails into two piles depending on who they were from. The process of grouping emails logically depends on fetching new emails, so you have a sequence. We must first, as a single-threaded task, fetch new emails. Once we have the emails, however, we can check which group to throw them in in parallel, inspecting each email as a unique task or whatever subdivision makes sense.
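A minimal sketch of that email example, just to show the shape of the dependency. Everything here is hypothetical: `fetch_new_emails` is a stub returning canned data instead of talking to a mail server, and the sender addresses are invented. In CPython, threads will not speed up CPU-bound work, so read this as a picture of the structure rather than a performance demo.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_new_emails() -> list[dict]:
    """Serial step (hypothetical stub): nothing can be sorted until this returns."""
    return [
        {"sender": "boss@example.com", "subject": "Q3 report"},
        {"sender": "shop@example.com", "subject": "Your order"},
        {"sender": "boss@example.com", "subject": "Meeting moved"},
    ]


def pick_pile(email: dict, vip_senders: set[str]) -> str:
    """Independent per-email step: needs only the one email it is given."""
    return "vip" if email["sender"] in vip_senders else "other"


def sort_inbox(vip_senders: set[str]) -> dict[str, list[str]]:
    emails = fetch_new_emails()  # must finish before any sorting can start
    with ThreadPoolExecutor() as pool:
        # Each email can be inspected on its own; map() keeps input order.
        labels = list(pool.map(lambda e: pick_pile(e, vip_senders), emails))
    piles: dict[str, list[str]] = {"vip": [], "other": []}
    for email, label in zip(emails, labels):
        piles[label].append(email["subject"])
    return piles


if __name__ == "__main__":
    print(sort_inbox({"boss@example.com"}))
```

However many workers the pool has, the whole thing cannot finish sooner than the fetch step takes on a single thread.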

Did that clear it up?
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
What's the question?

The question was this:

Do we have a clear picture of how good utilisation of Firestorm cores tends to be with 1 thread and how much could potentially be left on the table for something like SMT? I don't imagine it would be much but do we have any concrete evidence for it?

As to the rest, I completely agree with you. The main reason why I even bother replying to this kind of nonsense is to try to stop a flow of misinformation. Even if it's a futile effort.

Anyway, I can't wait to get my 16" M1 and finally do some proper GPU programming :)
 

casperes1996

macrumors 604
Jan 26, 2014
7,599
5,770
Horsens, Denmark
As to the rest, I completely agree with you. The main reason why I even bother replying to this kind of nonsense is to try to stop a flow of misinformation. Even if it's a futile effort.
I assume this relates to the thread in general and not my comment? :p

Anyway, I can't wait to get my 16" M1 and finally do some proper GPU programming :)

What kind of GPU programming do you do? I know a little Discord community that mostly focuses on Metal, though it's predominantly graphics, not so much GPGPU, but I can recommend the 2etime Discord. It's mostly intended as a learning environment for newcomers to Metal but there are also more advanced users on there :), including someone from Apple's Xcode GPU Debugger team
 

Analog Kid

macrumors G3
Mar 4, 2003
9,360
12,603
But for multi-core chips, per core performance may be viewed as a proxy for performance per [silicon] area metric.
Just spitballing here, but wouldn’t a better proxy be taking overall performance and dividing by silicon area?

In general, I’m a fan of measuring something rather than measuring not-that-thing and calling it the thing.
 