
NT1440

macrumors Pentium
May 18, 2008
15,092
22,158
It does not really matter, as long as you interpret the formula I posted above correctly. Given two identical cores where one runs at a higher frequency and voltage, the faster one will never win the efficiency back by finishing sooner: the time you gain back is linear in frequency, while the power rises roughly cubically (because the voltage has to rise along with the frequency). Pdyn = Cdyn * V^2 * f

In other words, you always lose efficiency if you increase the voltage, no matter how you calculate.

Another conclusion from this formula is that power always increases faster than frequency (while performance does not increase faster than frequency), hence: the lower the voltage, the higher the efficiency.
See, I’m not a math guy, but assuming your formula is correct, you’ve answered my question (and then some). Thanks
 

Tagbert

macrumors 603
Jun 22, 2011
6,256
7,281
Seattle
Apple attempted to make it more performant, especially on the iGPU side, to catch up with the competition, but the lack of active cooling negated any benefit. If you're coming from the M1, it's better to wait for the 3 nm M3 with a better-designed MBA.
Untrue.

At any load below its thermal limits, the M2 is significantly faster than the M1. Even at its thermal limits, the M2 is still faster than the M1.

[Attached image: throttling-chart-2.png]
 

mi7chy

macrumors G4
Oct 24, 2014
10,622
11,294
EDIT: I found out from https://www.notebookcheck.net/Our-Test-Criteria.15394.0.html that they use a multimeter. So they are measuring at the wall instead of using powermetrics. That's disappointing.

That's outdated, from 2016. Look at some of the more recent MacBook reviews, where they specifically mention powermetrics.

https://www.notebookcheck.net/Apple...dia-Laptop-for-Content-Creators.579013.0.html

"The M1 Pro is also very efficient when we have a look at the internal consumption (via powermetrics) and there is no throttling or turbo peaks at the start of the benchmarks."
 

Xiao_Xi

macrumors 68000
Original poster
Oct 27, 2021
1,627
1,101
Not sure if I understand the question. The M2 happens to run at a higher frequency and voltage when not thermally restricted, hence it will be less efficient, but that has little to do with architectural efficiency (remember the formula in my last post).

I thought Apple's ARM SoCs were more efficient than Intel/AMD x86 CPUs because ARM SoCs take about the same time as x86 CPUs to finish the same task at lower frequency. Mx can do that because Apple has chosen to use wider/shorter pipelines, faster memory access, better branch predictors...

I wonder where cmaier is when you need him most.
 
  • Like
Reactions: Argoduck

leman

macrumors Core
Oct 14, 2008
19,521
19,677
I thought Apple's ARM SoCs were more efficient than Intel/AMD x86 CPUs because ARM SoCs take about the same time as x86 CPUs to finish the same task at lower frequency. Mx can do that because Apple has chosen to use wider/shorter pipelines, faster memory access, better branch predictors...

Yes, and this is still true. The M2 is still much more efficient than x86 CPUs at similar performance.
 
  • Like
Reactions: Tagbert

Gerdi

macrumors 6502
Apr 25, 2020
449
301
I thought Apple's ARM SoCs were more efficient than Intel/AMD x86 CPUs because ARM SoCs take about the same time as x86 CPUs to finish the same task at lower frequency. Mx can do that because Apple has chosen to use wider/shorter pipelines, faster memory access, better branch predictors...

I wonder where cmaier is when you need him most.

That is still the case. However, the way the YouTubers measure efficiency does not tell you much about architectural efficiency. As I tried to explain, Cdyn (the so-called switching capacitance) is the only architecture-dependent parameter that impacts efficiency, while V and f can be freely chosen from the frequency-voltage curve. In addition, V is quadratic in the equation, so it has a big impact.
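
To see how much the voltage term dominates, here is a minimal sketch (the Cdyn value and the frequency/voltage pairs are made up, not measurements of any real core) that plugs a few operating points into Pdyn = Cdyn * V^2 * f and reports work per joule:

```python
# Toy illustration of Pdyn = Cdyn * V^2 * f with invented numbers.
# Cdyn and the (frequency, voltage) pairs are hypothetical, not data
# for any real core.

CDYN = 1e-9            # switching capacitance in farads (assumed constant)
WORK_PER_CYCLE = 1.0   # arbitrary units of work retired per cycle

operating_points = [
    (2.0e9, 0.70),  # 2.0 GHz at 0.70 V
    (3.0e9, 0.90),  # 3.0 GHz at 0.90 V
    (3.5e9, 1.05),  # 3.5 GHz at 1.05 V
]

for f, v in operating_points:
    power = CDYN * v ** 2 * f           # dynamic power in watts
    performance = WORK_PER_CYCLE * f    # work per second
    efficiency = performance / power    # work per joule = (work/cycle) / (Cdyn * V^2)
    print(f"{f / 1e9:.1f} GHz @ {v:.2f} V: {power:.2f} W, {efficiency:.2e} work/J")
```

Frequency cancels out of the work-per-joule figure, so the only thing separating the three operating points is V^2, exactly as the formula says.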
 
  • Like
Reactions: Argoduck

Sydde

macrumors 68030
Aug 17, 2009
2,563
7,061
IOKWARDI
I thought Apple's ARM SoCs were more efficient than Intel/AMD x86 CPUs because ARM SoCs take about the same time as x86 CPUs to finish the same task at lower frequency. Mx can do that because Apple has chosen to use wider/shorter pipelines, faster memory access, better branch predictors...

Mainly, x86 has to have an elaborate instruction stream parser that has to look at a bunch of bytes and figure out which ones go together to form each instruction; it then has to convert each instruction into its constituent μops to be fed into the pipe.

ARMv8+ does away with both of these issues: instructions are concise, so they very rarely need to be split into μops, since that is basically what they already are; and since all instructions are 32 bits long, the pipeline can just grab 16 bytes and already know it has 4 opcodes there, just about ready to be stuffed right into the pipe as is.

In other words, x86 starts with a disadvantage.
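
A toy sketch of that difference (purely illustrative: the byte values and the length rule below are invented, not real x86 or Arm encodings). With a fixed 32-bit encoding the instruction boundaries are known before looking at a single byte, while a variable-length encoding has to work out each instruction's length before it knows where the next one starts:

```python
# Illustrative only: fixed-width vs variable-length instruction splitting.
# The bytes and the length rule are made up, not real x86 or Arm encodings.

def split_fixed_width(fetch_bytes, width=4):
    """Fixed 32-bit instructions: boundaries are known up front,
    so all slots can be carved out independently (i.e. in parallel)."""
    return [fetch_bytes[i:i + width] for i in range(0, len(fetch_bytes), width)]

def split_variable_length(fetch_bytes, length_of):
    """Variable-length instructions: each boundary depends on decoding the
    previous instruction, so the scan is inherently serial."""
    out, i = [], 0
    while i < len(fetch_bytes):
        n = length_of(fetch_bytes, i)   # must inspect bytes to learn the length
        out.append(fetch_bytes[i:i + n])
        i += n
    return out

fetch_group = bytes(range(16))

# 16 fetched bytes -> exactly 4 fixed-width instructions, no inspection needed.
print(split_fixed_width(fetch_group))

# A made-up length rule standing in for an x86-style prefix/opcode parse.
fake_length = lambda buf, i: 1 + (buf[i] % 3)
print(split_variable_length(fetch_group, fake_length))
```

The fixed-width split is what lets a wide front end carve a fetch group into instruction slots cheaply; the serial length scan is the extra work the paragraph above is describing.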

I wonder where cmaier is when you need him most.

He got banned, AIUI, for... reasons.
 
  • Like
Reactions: Xiao_Xi

leman

macrumors Core
Oct 14, 2008
19,521
19,677
Mainly, x86 has to have an elaborate instruction stream parser that has to look at a bunch of bytes and figure out which ones go together to form each instruction; it then has to convert each instruction into its constituent μops to be fed into the pipe.

Offtopic, but it made me wonder why Apple is using VLE for their custom GPU ISA. Of course, that's an in-order architecture, so decoding probably isn't a bottleneck in the first place and can be done efficiently, and code density matters a lot.
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,664
OBX
Offtopic, but it made me wonder why Apple is using VLA for their custom GPU ISA. Of course, that's an in-order architecture, so decoding probably isn't a bottleneck in the first place and can be done efficiently, and code density matters a lot.
I tried searching, but my search-fu isn't as strong as I'd like. What does VLA stand for?
 
  • Like
Reactions: Xiao_Xi

Sydde

macrumors 68030
Aug 17, 2009
2,563
7,061
IOKWARDI
What did he do this time? 🤣

His side of the story is something like "I looked under the bridge and became angry that it had been allowed to reach such a state". If you were to ask, say, weaselbot, you would probably get a completely different narrative, about snowflakes and hurt fee-fees and how I should be reprimanded for even mentioning it.
 

tomO2013

macrumors member
Feb 11, 2020
67
102
Canada
That is still the case. However, the way the YouTubers measure efficiency does not tell you much about architectural efficiency. As I tried to explain, Cdyn (the so-called switching capacitance) is the only architecture-dependent parameter that impacts efficiency, while V and f can be freely chosen from the frequency-voltage curve. In addition, V is quadratic in the equation, so it has a big impact.
I love math :)

Would you mind quoting the source of the formula you’re using, and perhaps explaining how it accounts for architectural differences (e.g. path layout, as opposed to switching capacitance in isolation)? I’m thinking of fundamental differences such as the M1 having no hardware-accelerated logic for ProRes or ProRes RAW, or the differences AnandTech identified in the newer Blizzard efficiency cores, which complete work faster and use less energy to do so relative to their predecessors.

I apologize if I have misunderstood your point on the formula; really I’m just looking for clarity. I find myself questioning not necessarily the formula itself (which looks to determine efficiency as a measure of die size and capacitance) but its relevance or applicability, in isolation, when comparing two fundamentally different pieces of silicon :)

Surely we cannot say that the M1 decodes ProRes RAW more efficiently than the M2 at the same power envelope, so we should avoid hyperbole in the description and state the scenarios where a claim holds true (or not).

Please can you help me understand better :).

Many thanks,

Tom.
 

Xiao_Xi

macrumors 68000
Original poster
Oct 27, 2021
1,627
1,101
I thought AMD used VLIW, but that appears to be a GCN/CDNA thing. Not 100% sure how RDNA2 would be classified.
Wikipedia calls GCN "RISC SIMD".
TeraScale is a VLIW SIMD architecture, while Tesla is a RISC SIMD architecture, similar to TeraScale's successor Graphics Core Next.

More info about AMD's GPU:
 
  • Like
Reactions: diamond.g

Gerdi

macrumors 6502
Apr 25, 2020
449
301
I love math :)

Would you mind quoting the source of the formula you’re using, and perhaps explaining how it accounts for architectural differences (e.g. path layout, as opposed to switching capacitance in isolation)? I’m thinking of fundamental differences such as the M1 having no hardware-accelerated logic for ProRes or ProRes RAW, or the differences AnandTech identified in the newer Blizzard efficiency cores, which complete work faster and use less energy to do so relative to their predecessors.

I apologize if I have misunderstood your point on the formula; really I’m just looking for clarity. I find myself questioning not necessarily the formula itself (which looks to determine efficiency as a measure of die size and capacitance) but its relevance or applicability, in isolation, when comparing two fundamentally different pieces of silicon :)

Surely we cannot say that the M1 decodes ProRes RAW more efficiently than the M2 at the same power envelope, so we should avoid hyperbole in the description and state the scenarios where a claim holds true (or not).

Please can you help me understand better :).

Many thanks,

Tom.

The formula is just basic physics; you might find it in schoolbooks. You can also convince yourself of the units: 1 W = 1 F * 1 V^2 / 1 s.
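
Spelling that unit check out (nothing chip-specific, just the SI definitions of the farad, coulomb and joule):

F * V^2 / s = (C/V) * V^2 / s = C * V / s = J / s = W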

I am using it to demonstrate a common fallacy, which the OP fell for. Let's be more precise and define efficiency:

Efficiency = Performance / Power = ((work/cycle) * f) / Pdyn = ((work/cycle) * f) / (Cdyn * V^2 * f) = (work/cycle) / (Cdyn * V^2)

As expected, since f is linear in power as well as in performance, it cancels out. Still, a higher voltage is required in order to reach higher frequencies.
Looking at the formula, only (work/cycle)/Cdyn is architecture-dependent and constant per architecture. Beyond that, efficiency decreases with the square of the voltage for any architecture.

My point was that you cannot conclude from the presented data, which shows the M2 being less efficient than the M1, that this has anything to do with architecture; it has everything to do with voltage. If you look at an iso-voltage scenario, you will see that the M2 is architecturally more efficient than the M1 (i.e. (work/cycle)/Cdyn is higher for the M2 than for the M1).

Corollary: if you want to compare the efficiency of two different architectures, make sure you compare them at an iso-voltage point, not at an iso-performance point (as many YouTubers or people here in the forum would suggest).
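
To make the fallacy concrete, here is a minimal sketch with invented numbers (the work/cycle, Cdyn, frequency and voltage values are hypothetical placeholders, not M1 or M2 data):

```python
# Hypothetical illustration of the fallacy, under the model
# Pdyn = Cdyn * V^2 * f  and  Efficiency = (work/cycle) / (Cdyn * V^2).
# Every number below is invented; none of this is M1/M2 measurement data.

def perf_per_watt(work_per_cycle, cdyn, volts):
    # Frequency cancels out of performance/power.
    return work_per_cycle / (cdyn * volts ** 2)

# "Chip B" is assumed architecturally better (more work per cycle, same Cdyn),
# but it ships at a higher default frequency and voltage than "chip A".
chip_a = {"work_per_cycle": 1.0, "cdyn": 1.0, "default_f": 3.2e9, "default_v": 0.85}
chip_b = {"work_per_cycle": 1.1, "cdyn": 1.0, "default_f": 3.5e9, "default_v": 0.95}

for name, c in (("A", chip_a), ("B", chip_b)):
    performance = c["work_per_cycle"] * c["default_f"]
    efficiency = perf_per_watt(c["work_per_cycle"], c["cdyn"], c["default_v"])
    print(f"{name} at its default point: perf={performance:.3g}, perf/W={efficiency:.3g}")

# The same two chips compared iso-voltage: B's architectural advantage shows.
for name, c in (("A", chip_a), ("B", chip_b)):
    print(f"{name} at 0.85 V: perf/W={perf_per_watt(c['work_per_cycle'], c['cdyn'], 0.85):.3g}")
```

With these invented numbers, B is both faster and architecturally more efficient, yet reading off perf/W at the two default operating points ranks it below A, which is exactly the fallacy described above.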
 

Sydde

macrumors 68030
Aug 17, 2009
2,563
7,061
IOKWARDI
As expected, since f is linear in power as well as in performance, it cancels out. Still, a higher voltage is required in order to reach higher frequencies.

Say what? The trajectory of IC design has been the opposite of that. At higher frequencies, the cycle goes up to and across the peak more smoothly when the difference between 0 and the peak is smaller. '70s/'80s machines used +5 V as the peak, operating in the roughly 1~300 MHz range; once we started getting close to the GHz mark, voltages started to drop, in part because gating was being designed to respond properly at lower differentials. So, please stop blorting nonsense.
 
