A simple case that demonstrates this is single-core processors.
O.k. fair enough.
Now just find me a single-core, dual-thread processor.....
..... and tell me how it is relevant today.
Do we have a clear picture of how good utilisation of Firestorm cores tends to be with 1 thread and how much could potentially be left on the table for something like SMT? I don't imagine it would be much but do we have any concrete evidence for it?
The article is right though. If synthetic benchmarks test single-thread rather than single-core performance, it is not a fair comparison, as the M1 doesn’t support HT. Basically 1 thread = 1 core on M1, while 1 core = 2 threads on Intel.
The logical core stuff reminds me of the days of the A10 SoC, the first Apple SoC with a "big-little" design. One efficiency core and one performance core were grouped together and exposed to the OS as a single core. This is something like a "reverse SMT" that exposes two CPU cores as one logical core. A big limitation of such a design is that only one core type, either performance or efficiency, can be active in the pair at a time, not both, so the benchmark MT score is no faster than a dual-core chip's. I cannot call those benchmarks unfair for failing to activate more cores under MT workloads, because there is no way to do that.
I’ll say this in simple language that even you can understand (shout-out to old 60s documentaries - yeah, they actually said that). The only reason for hyper-threading at all is to help fill wait states from inefficiencies in the x86 instruction set. Why hold the processor at idle while it is waiting for complex instructions to finish, when you could have the processor run another thread to fill in the waits? You could also overcome this waiting problem with more efficient instructions that wait less - the Arm or RISC approach. Who cares if it takes 1 instruction or 20 instructions to complete a task? You only care which approach finishes the task quicker, AKA more efficiently.
It's hardly this simple. Power ISA is pretty much RISC — and yet Power10 has 8-way SMT! SMT is a design option, plain and simple.
I have concluded benchmarks don’t really mean much. My 2020 work MacBook Air has benchmarks nearly double my old 2011 MacBook Pro, but the Air is far and away the slowest computer I use (and it only has one third-party app on it, which is Firefox). No idea why. But it ain’t benchmarks.
Another example: my Mac Pro has an even lower single core benchmark than my 2011 MacBook Pro, but single thread tasks are still somehow worlds faster.
The only real way to judge a computer is actual usage cases.
Assuming that this is an M1 MacBook Air and you experience that slowness in Firefox…not that Firefox is known for stellar performance/efficiency, but are you sure you’re using the Apple silicon version? Browsers can tend to struggle in Rosetta 2 translation.
Your M1 MacBook Air should crush the 2011 MacBook Pro. Check that you are running an M1 native app. Another possibility is that the MacBook Air has less memory and is using swap space on the SSD, which is much slower. The only other thing I can think of is that the GPU on the MacBook Pro is better than the MacBook Air's, but 2011 is so old I don’t think that should be the case.
Unless it's an early 2020 Intel version... I think those are particularly thermally constrained.
You know, I wonder if that's the thing. I almost never hear the fans on my MBA but it's catastrophically slow. Whereas my 2011 MBP spins up the fans pretty quick but still stays much faster.
While I agree with you to a large degree, it’s not only CISC CPUs that have inefficiencies.
I failed to specify my 2020 MBA is the i5 model. Which still benchmarks double my 2011, but man that Air is just molasses. I'm trying to talk our IT guy into upgrading me to an M1.
Ah yes, I forgot about the early 2020 Intel refresh. Yeah, there are some pretty serious thermal constraints on that so it’s probably just the 2015–2020 pattern of Apple asking too much of the CPUs and corresponding cooling they put into their machines, especially notebooks.
Maybe @name99 has some data?
Single-core performance is single-threaded performance. Anything else and you are entering nonsense land. By that logic IBM makes the fastest CPU cores, but when you actually try running something on them you'd get the performance of a wet noodle.
You want to know how fast a CPU can run stuff, not obfuscate test results by mixing in arbitrary hardware details. A single M1 core will be faster than a single Tiger Lake core no matter how many threads you run on it (can be one, two or one thousand).
I randomly came across this thread while browsing but am curious - what mainstream apps or games today are still single threaded where this matters? I don't think I've had a single core CPU computer in a very, very long time - maybe 10+ years - so aren't single-threaded benchmarks pointless, since most software worth paying for has supported multi-core for years?
Whether it's optimized for the M1 chip or not is a separate topic.
No. Many problems are simply not parallelizable. Even in multi-threaded apps, single thread performance matters.
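To put a rough number on the "not parallelizable" point: Amdahl's law (not mentioned in the thread, brought in here only as an illustration) says that if a fraction p of the work can be spread over n cores, the best possible speedup is 1 / ((1 - p) + p / n). A minimal sketch follows. The function name and the 50% / 90% fractions are made-up illustrative values, not measurements of any real app or chip.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of the work scales with core count.

    parallel_fraction: share of the runtime that can be split across cores (0..1).
    cores: number of cores the parallel part is spread over.
    Both inputs are hypothetical; real apps also pay synchronisation overhead,
    which this idealised bound ignores.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at single-thread speed regardless of core count.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for p in (0.5, 0.9):
        for n in (2, 8, 64):
            print(f"parallel fraction {p:.0%}, {n:>2} cores: {amdahl_speedup(p, n):.2f}x")
```

If half the work is serial, the bound stays below 2x no matter how many cores you add, so the remaining time is set entirely by how fast one thread runs.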
What's the question?
Do we have a clear picture of how good utilisation of Firestorm cores tends to be with 1 thread and how much could potentially be left on the table for something like SMT? I don't imagine it would be much but do we have any concrete evidence for it?
As to the rest, I completely agree with you. The main reason why I even bother replying to this kind of nonsense is to try to stop a flow of misinformation. Even if it's a futile effort.
Anyway, I can't wait to get my 16" M1 and finally do some proper GPU programming
I assume this relates to the thread in general and not my comment?
Is single threaded code performance important or not?
But for multi-core chips, per core performance may be viewed as a proxy for performance per [silicon] area metric.
Just spitballing here, but wouldn’t a better proxy be taking overall performance and dividing by silicon area?