Anybody notice a diff between hyperthreading and no hyperthreading on the M1 based macs?

setharsh · Nov 1, 2021

Wanted to see what other devs think about having fewer total threads available. I'm coming from a strong Windows and Azure background and was curious to see that Macs with M1 don't have hyperthreading. I know it's not the same as having more dedicated cores, but I'm curious whether Apple actually thinks it's a good idea for M2 and beyond.

bill-p · Nov 1, 2021

Hyperthreading will help with heavy multithreaded workloads. But the problem is that... it will decrease efficiency with lighter workloads due to extra scheduling overhead.

I think Apple made a wise decision to not include hyperthreading in any of their processors thus far, and instead prioritized efficiency. My MacBook's battery life is very happy with this decision.

For the upcoming iMac and Mac Pro, maybe Apple will implement some form of hyperthreading since those machines won't have to worry about efficiency? We'll see.

leman · Nov 1, 2021

So far Apple has demonstrated - and very convincingly - that they don’t need hyperthreading. They can easily outperform anything in their power bracket, and even match much larger CPUs, hyperthreading or not. Another thing is that increasing hardware threads has limited scalability for consumer hardware. Not all tasks can be trivially implemented to take advantage of a large number of hardware threads.

setharsh · Nov 1, 2021

bill-p said:
Hyperthreading will help with heavy multithreaded workloads. But the problem is that... it will decrease efficiency with lighter workloads due to extra scheduling overhead.

I think Apple made a wise decision to not include hyperthreading in any of their processors thus far, and instead prioritized efficiency. My MacBook's battery life is very happy with this decision.

For the upcoming iMac and Mac Pro, maybe Apple will implement some form of hyperthreading since those machines won't have to worry about efficiency? We'll see.

Interesting. I come from a microservices implementation background, and sometimes -- at least on Windows -- having more threads (read: v-cores) and delegating the actual work to the processor (aka hyperthreading) actually helps me in my day-to-day job. More threads are more threads, and having an actual hardware implementation for arbitration between # cores and # threads is better than software arbitration; I've seen real-world benefits. E.g., concurrent GC, more max HTTP connections, more worker threads in my service, more background tasks.

I know I won't get to use a Mac for my work, but I still want to play around with the latest and greatest bin-A M1 Maxed Macs.

crazy dave · Nov 1, 2021

bill-p said:
Hyperthreading will help with heavy multithreaded workloads. But the problem is that... it will decrease efficiency with lighter workloads due to extra scheduling overhead.

I think Apple made a wise decision to not include hyperthreading in any of their processors thus far, and instead prioritized efficiency. My MacBook's battery life is very happy with this decision.

For the upcoming iMac and Mac Pro, maybe Apple will implement some form of hyperthreading since those machines won't have to worry about efficiency? We'll see.

They will not. The presence of hyperthreading in and of itself doesn’t change single-thread performance or efficiency. That overhead is minimal. What’s happening is that current x86 designs often have more core resources than they can actually use in a single thread. Thus you can overlay a second thread to get, in an ideal workload, roughly a 20% speed boost over running the two threads sequentially - and this for next to no extra energy, since with a single thread you are already running the core at full tilt. Note, though, that hyperthreading means that an individual thread will complete more slowly than if you hadn’t put a second thread on the core. That’s a negative. Overall, the fact that hyperthreading is effective on x86 is a consequence of the core design.

Apple’s core design is fundamentally different and attempts to feed more of its extra-wide but lower-clocked core with work from a single thread. The consequence of this is more efficient and faster processing of those threads, but adding hyperthreading would probably gain you little to nothing.

Hyperthreading also has other consequences good and bad. So far, Apple’s core design seems to negate its need.
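
To put rough numbers on the throughput-versus-latency trade-off described above, here is a minimal sketch. The 20% figure is the ideal-case boost mentioned in the post, not a measurement; the function name `smt_tradeoff` and the unit job time are hypothetical, and the model assumes both threads share the core evenly and finish together.

```python
def smt_tradeoff(job_time: float, smt_boost: float = 0.20) -> dict:
    """Toy model: two identical jobs on one core, with and without SMT.

    job_time  -- time one job takes when it has the core to itself
    smt_boost -- assumed throughput gain from running both jobs together
                 (0.20 is the ideal-case figure from the post, an assumption)
    """
    # Without SMT the two jobs run back to back.
    sequential_total = 2 * job_time
    # With SMT the same total work completes (1 + smt_boost) times faster.
    smt_total = sequential_total / (1 + smt_boost)
    return {
        "sequential_total": sequential_total,
        "smt_total": smt_total,
        # Both jobs finish together at smt_total, so each one takes longer
        # than it would have taken alone on the core.
        "per_job_slowdown": smt_total / job_time,
    }


result = smt_tradeoff(job_time=1.0)
print(result)
```

Under these assumptions the pair of jobs finishes in about 1.67 time units instead of 2, but each individual job takes about 1.67 units instead of the 1 unit it would need alone on the core.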

altaic · Nov 1, 2021

crazy dave said:
They will not. The presence of hyperthreading in and of itself doesn’t change single-thread performance or efficiency. That overhead is minimal. What’s happening is that current x86 designs often have more core resources than they can actually use in a single thread. Thus you can overlay a second thread to get, in an ideal workload, roughly a 20% speed boost over running the two threads sequentially - and this for next to no extra energy, since with a single thread you are already running the core at full tilt. Note, though, that hyperthreading means that an individual thread will complete more slowly than if you hadn’t put a second thread on the core. That’s a negative. Overall, the fact that hyperthreading is effective on x86 is a consequence of the core design.

Apple’s core design is fundamentally different and attempts to feed more of its extra-wide but lower-clocked core with work from a single thread. The consequence of this is more efficient and faster processing of those threads, but adding hyperthreading would probably gain you little to nothing.

Hyperthreading also has other consequences good and bad. So far, Apple’s core design seems to negate its need.

To expand on this, the M1 has a much wider instruction decoder (8-wide vs 4-wide for x86) paired with a huge reorder buffer (~630 instructions vs 352 for x86) to achieve very high instruction-level parallelism. I think Apple didn't bother to invest in hyperthreading-style simultaneous multithreading because the parallelism was already so good. For more info, I recommend this AnandTech article (particularly page 2).

NB, that's just for the M1, and the x86 numbers are cherry-picked to the benefit of the x86. It's also possible that the decoder or re-order buffer have been improved even more in the M1 Pro/Max.

Edit: According to this comment, the x86 numbers may actually be a bit better: 5 wide decoder. I don't have time to verify, but either way, the M1 has great single core performance at lower clocks hence a lot of parallelism and less bottlenecking.

Krevnik · Nov 1, 2021

crazy dave said:
Note, though, that hyperthreading means that an individual thread will complete more slowly than if you hadn’t put a second thread on the core. That’s a negative.

And thread schedulers take this into account, prioritizing spreading the load across the physical cores before doubling up. So people are probably not using Hyperthreading as much as they think.
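
The "spread first, double up later" policy can be sketched as a toy placement function. This is only an illustration of the idea described in the post, not the behavior of any real OS scheduler; the names `place_threads` and `cores_sharing` are hypothetical.

```python
def place_threads(n_threads: int, n_cores: int, smt_ways: int = 2) -> list:
    """Toy policy: give every physical core one thread before doubling up.

    Returns one list of thread IDs per physical core. Threads beyond the
    number of hardware threads would wait in a run queue; that part is
    left out of this sketch.
    """
    cores = [[] for _ in range(n_cores)]
    capacity = n_cores * smt_ways
    for tid in range(min(n_threads, capacity)):
        # Round-robin over physical cores, so a core only receives a second
        # thread once every core already has one.
        cores[tid % n_cores].append(tid)
    return cores


def cores_sharing(placement: list) -> int:
    """Number of physical cores running more than one thread."""
    return sum(1 for core in placement if len(core) > 1)


print(place_threads(4, 4), cores_sharing(place_threads(4, 4)))
print(place_threads(6, 4), cores_sharing(place_threads(6, 4)))
```

With four threads on a 4c/8t machine no core is shared in this model; sharing only begins with the fifth runnable thread.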

crazy dave · Nov 1, 2021

Krevnik said:
And thread schedulers take this into account, prioritizing spreading the load across the physical cores before doubling up. So people are probably not using Hyperthreading as much as they think.

True.

Edit: I had another comment about the nuances of thread schedulers and user control of them, but ehhh, it's not really important or germane, and it's true that, as you and @leman pointed out, hyperthreading is most relevant when the CPU is saturated with threads.

Spindel · Nov 2, 2021

setharsh said:
Interesting. I come from a microservices implementation background, and sometimes -- at least on Windows -- having more threads (read: v-cores) and delegating the actual work to the processor (aka hyperthreading) actually helps me in my day-to-day job. More threads are more threads, and having an actual hardware implementation for arbitration between # cores and # threads is better than software arbitration; I've seen real-world benefits. E.g., concurrent GC, more max HTTP connections, more worker threads in my service, more background tasks.

At a previous job I ran simulation software (that was kind of crappily written from a performance standpoint), and with that software, turning off HT decreased the simulation time by about 10%, both when running single-threaded simulations and multi-threaded simulations.

This might very well be because the Windows task scheduler is crap. But in any case, anecdotally, HT might be as much of a burden as a help (setting the perf/W discussion aside).

So with HT YMMV very much depending on what tasks you need your computer to do.

quarkysg · Nov 2, 2021

Spindel said:
At a previous job I ran simulation software (that was kind of crappily written from a performance standpoint), and with that software, turning off HT decreased the simulation time by about 10%, both when running single-threaded simulations and multi-threaded simulations.

This might very well be because the Windows task scheduler is crap. But in any case, anecdotally, HT might be as much of a burden as a help (setting the perf/W discussion aside).

So with HT YMMV very much depending on what tasks you need your computer to do.

I would think that for 2 threads that do different computations (e.g. one doing integer and the other doing FP computations), HT will come out ahead. If both threads use the same ALUs on the same core, it'll be a disadvantage. It'll be extremely challenging for the OS scheduler to know in advance what the threads it is scheduling are doing. I guess that's why (from what I read) Windows schedules to the actual cores first before scheduling to the HT 'cores'.

Spindel · Nov 2, 2021

quarkysg said:
I would think that for 2 threads that do different computations (e.g. one doing integer and the other doing FP computations), HT will come out ahead. If both threads use the same ALUs on the same core, it'll be a disadvantage. It'll be extremely challenging for the OS scheduler to know in advance what the threads it is scheduling are doing. I guess that's why (from what I read) Windows schedules to the actual cores first before scheduling to the HT 'cores'.

Yeah, I understand this. The thing is, with some software you get a performance hit even when running it single-threaded with HT on. This was on a 4c/8t machine.

quarkysg · Nov 2, 2021

Spindel said:
Yeah, I understand this. The thing is, with some software you get a performance hit even when running it single-threaded with HT on. This was on a 4c/8t machine.

Well, the OS also has processes running, so I guess one of them invariably finds its way onto the same core that your code is running on.

joema2 · Nov 2, 2021

setharsh said:
Wanted to see what other devs think about having fewer total threads available. I'm coming from a strong Windows and Azure background and was curious to see that Macs with M1 don't have hyperthreading...

If anyone on an x86 Mac wants to inspect how much hyperthreading helps their workload, they can run the 3rd-party utility CPUSetter. It allows disabling hyperthreading. I've used it several times and it seems safe, but that's only my experience: https://www.whatroute.net/cpusetter.html#download

If that shows hyperthreading helps your specific workload (say) 17%, that's one data point. OTOH the real question is how Intel's Alder Lake CPUs will compare to M1 and subsequent Apple Silicon CPUs. Alder Lake has a wider front end and better IPC, yet is hyperthreaded (on performance cores only).
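
One way to turn such a comparison into a single number is to time the same workload with hyperthreading on and off and compute the relative change. A minimal sketch follows; the function name `ht_gain_percent` and the sample timings are hypothetical and are not measurements from any machine.

```python
import os


def ht_gain_percent(seconds_ht_off: float, seconds_ht_on: float) -> float:
    """Percentage throughput gain from enabling HT on a fixed workload.

    Positive means HT helped; negative means the run was faster with HT off.
    """
    if seconds_ht_off <= 0 or seconds_ht_on <= 0:
        raise ValueError("timings must be positive")
    return (seconds_ht_off / seconds_ht_on - 1.0) * 100.0


# Logical CPU count as seen by the OS (includes HT siblings where present).
print("logical CPUs:", os.cpu_count())

# Hypothetical timings in seconds, for illustration only.
print(round(ht_gain_percent(117.0, 100.0), 1))  # run was faster with HT on
print(round(ht_gain_percent(90.0, 100.0), 1))   # run was faster with HT off
```

Averaging several runs per setting before computing the percentage would make the figure less sensitive to background activity.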

jdb8167 · Nov 2, 2021

joema2 said:
OTOH the real question is how Intel's Alder Lake CPUs will compare to M1 and subsequent Apple Silicon CPUs. Alder Lake has a wider front end and better IPC, yet is hyperthreaded (on performance cores only).

Alder Lake has a wider front end and better IPC than Tiger Lake, not compared to Apple’s M1. Alder Lake will try to compensate for the M1’s higher IPC with much higher clock speeds and hyperthreading.

crazy dave · Nov 2, 2021

Spindel said:
At a previous job I ran simulation software (that was kind of crappily written from a performance standpoint), and with that software, turning off HT decreased the simulation time by about 10%, both when running single-threaded simulations and multi-threaded simulations.

This might very well be because the Windows task scheduler is crap. But in any case, anecdotally, HT might be as much of a burden as a help (setting the perf/W discussion aside).

So with HT YMMV very much depending on what tasks you need your computer to do.

Interesting, when was this? What generation of chips? Because in the modern tests I’ve seen, HT on or off made no difference to single-threaded workloads. For multithreaded workloads, I’ll grant you things could get a little wonky in some very specific scenarios (although again I believe the core was saturated - the Windows scheduler *should* not be making use of HT/SMT2 until it is). E.g.:

Investigating Performance of Multi-Threading on Zen 3 and AMD Ryzen 5000 (www.anandtech.com)

Of course the above is a test of AMD Zen 3, but I don’t think modern Intel is much different, if I remember right.

m1maverick · Nov 2, 2021

Spindel said:
Yeah, I understand this. The thing is, with some software you get a performance hit even when running it single-threaded with HT on. This was on a 4c/8t machine.

I think that was applicable to the first implementation of hyperthreading. I believe later implementations made this a non-issue (though I wouldn't be surprised to see an outlier or two).

m1maverick · Nov 2, 2021

Krevnik said:
And thread schedulers take this into account, prioritizing spreading the load across the physical cores before doubling up. So people are probably not using Hyperthreading as much as they think.

If they're using workloads which are heavily thread dependent they are. The high level concept of HT is that it allows a core to execute code when it would otherwise be stalled while waiting for the dependencies of the initial thread to be fulfilled (for example thread A makes a memory request and therefore must stall while that memory request is fulfilled, therefore thread B can be executed during that time period).

All else being equal I would rather have cores over HT enabled cores but that requires more resources.
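
The stall-filling idea above can be illustrated with a toy cycle-level model. This is a sketch under strong simplifying assumptions (at most one thread executes per cycle, and compute bursts and memory stalls have fixed lengths); the function name and parameters are hypothetical and it does not model any real CPU.

```python
def core_utilization(n_hw_threads: int, compute: int, stall: int,
                     cycles: int = 10_000) -> float:
    """Fraction of cycles in which the core does useful work.

    Each hardware thread repeats: `compute` busy cycles, then `stall` cycles
    waiting on memory. The core executes at most one thread per cycle, which
    is a deliberate simplification of how real SMT cores share resources.
    """
    work_left = [compute] * n_hw_threads   # cycles left in the current burst
    stall_left = [0] * n_hw_threads        # cycles left on a memory request
    busy = 0
    for _ in range(cycles):
        ran = False
        for t in range(n_hw_threads):
            if stall_left[t] > 0:
                stall_left[t] -= 1         # memory request still in flight
            elif not ran:
                ran = True                 # this thread gets the core this cycle
                work_left[t] -= 1
                if work_left[t] == 0:      # burst done: issue a memory request
                    stall_left[t] = stall
                    work_left[t] = compute
        busy += 1 if ran else 0
    return busy / cycles


print(core_utilization(1, compute=2, stall=2))
print(core_utilization(2, compute=2, stall=2))
```

In this model a second hardware thread raises utilization only because the first one spends time stalled; with `stall=0` a single thread already keeps the core fully busy.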

setharsh · Nov 2, 2021

joema2 said:
If anyone on an x86 Mac wants to inspect how much hyperthreading helps their workload, they can run the 3rd-party utility CPUSetter. It allows disabling hyperthreading. I've used it several times and it seems safe, but that's only my experience: https://www.whatroute.net/cpusetter.html#download

If that shows hyperthreading helps your specific workload (say) 17%, that's one data point. OTOH the real question is how Intel's Alder Lake CPUs will compare to M1 and subsequent Apple Silicon CPUs. Alder Lake has a wider front end and better IPC, yet is hyperthreaded (on performance cores only).

Thank you! I have access to an Intel Mac and will def compare this when I get a newer M1 Pro based laptop. When writing microservices, I try to be agnostic of generational changes in Intel procs until they become standard or we have really specific scenarios. For me, sometimes, having extra threads (note that not all SW threads run at 100% all the time) really helps. So that’s why I look at v-cores, and that’s essentially what every cloud provider sells as compute units.

Background tasks in microservices, cron jobs, GC cleanup for .NET, and async/await are some things that come to mind for these “extra” threads.

Spindel · Nov 2, 2021

crazy dave said:
Interesting, when was this? What generation of chips? Because in the modern tests I’ve seen, HT on or off made no difference to single-threaded workloads. For multithreaded workloads, I’ll grant you things could get a little wonky in some very specific scenarios (although again I believe the core was saturated - the Windows scheduler *should* not be making use of HT/SMT2 until it is). E.g.:

Investigating Performance of Multi-Threading on Zen 3 and AMD Ryzen 5000 (www.anandtech.com)

Of course the above is a test of AMD Zen 3, but I don’t think modern Intel is much different, if I remember right.

Well, I switched jobs in mid-2019, so it was back then; I'd had that work computer for about a year at that point. I don't remember exactly what CPU it was, but it was some Intel i7.

EDIT: Just for clarification, it is well known among all users of the software that it is, performance-wise, kind of crappy. The math behind it is solid (and verified/certified for my field in my country); it's just that the software is really bad from a performance perspective. I will not mention what software it is, so don't ask.

ultrakyo · Nov 3, 2021

In today’s world processors come with at least 4 cores, so hyperthreading is really unnecessary unless all the threads are long-running and blocking execution.
