Ok, here is a confusing graphic that links to a rather badly written page, which I believe is the origin of the claim I was referencing. The number I heard seems to come from farther down that page, where they say that when the decoders kick in (that is, when the μop cache cannot be used), they add something like 4% to the power draw. That figure does not line up with the graphic itself.
My interpretation of the graphic is:
- "uncore" refers to the power draw of the processor support logic (this seems perhaps a bit low)
- "cores" is the total draw of each of the cores (probably P-cores)
- "execution units" is part of core draw
- "instruction decoders" is likewise part of core draw
- the caches (L1, L2, L3) are not part of either the core or uncore draw
The graphic would suggest that the decoders account for around 18% of the core power draw (averaging the ~8% shown for FP workloads with the ~20% shown for integer, weighted toward integer on the assumption that integer code gets much more use than FP most of the time).
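For what it's worth, here is the back-of-envelope version of that average as a quick Python sketch. The 5:1 integer-to-FP weighting is my own guess, not something from the source; the 20% and 8% figures are just read off the graphic:

```python
# Rough weighted average of decoder power share, per the graphic.
# The 5:1 integer-to-FP workload mix is a guess, not from the source.
int_share = 0.20    # ~20% of core power goes to decoders on integer code
fp_share = 0.08     # ~8% on floating-point code
int_weight = 5 / 6  # hypothetical 5:1 integer:FP mix

avg = int_weight * int_share + (1 - int_weight) * fp_share
print(f"weighted decoder share: {avg:.0%}")  # -> weighted decoder share: 18%
```

A different workload mix obviously moves the answer: equal weighting gives 14%, and an all-integer workload gives the full 20%.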
All of which, having looked at this mess, leaves me curious as to what the reality actually is. Clearly the small number I heard tossed about is not very accurate, and this source is a long way from reliable. I apologize for repeating casual hearsay.
It does seem unlikely that an x86 core actually spends a sixth of its power decoding instructions, but maybe it really does. Decoding ARM instructions surely cannot cost more than a fraction of 1% of a core's power, though the elaborate trip to one of the execution queues is kind of expensive.