
leman

macrumors Core
Oct 14, 2008
19,520
19,671
Geekbench is garbage though compared to real workloads.

Geekbench is a decent cross-platform estimator of amortized burst CPU performance across a series of different workloads. SPEC is the same but for sustained performance and a bit more serious. Blender is a good estimator of production 3D renderer performance on x86 CPUs, and that's about it.

But hey, we all know that there is only one true benchmark: Stockfisch!
 

mi7chy

macrumors G4
Oct 24, 2014
10,619
11,293
I'm very happy to be alive to see Intel being the underdog for the 3rd time in the past 20 years:

[attached chart]

Intel 13th gen is a laughing stock. Isn't it?

At what power consumption? If it's still 10nm (Intel 7) then it's worrying, but if Intel outsources to TSMC then they could potentially take the crown from AMD.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Geekbench is a decent cross-platform estimator of amortized burst CPU performance across a series of different workloads. SPEC is the same but for sustained performance and a bit more serious.
The benchmark might be cross-platform, but Anandtech's methodology might not be. Who benchmarks a CPU on WSL instead of Windows or Linux?
 

pshufd

macrumors G4
Oct 24, 2013
10,146
14,572
New Hampshire
At what power consumption? If it's still 10nm (Intel 7) then it's worrying, but if Intel outsources to TSMC then they could potentially take the crown from AMD.

Intel was supposed to go to 3nm but delayed it, and I think that Apple is going to take the capacity.

It was discussed here a while back.
 

leman

macrumors Core
Oct 14, 2008
19,520
19,671
The benchmark might be cross-platform, but Anandtech's methodology might not be. Who benchmarks a CPU on WSL instead of Windows or Linux?

The tooling is probably developed for Unix, so targeting Linux makes sense. And again, for CPU work it should make no difference whether it runs under WSL or directly on Windows.
 

mr_roboto

macrumors 6502a
Sep 30, 2020
856
1,866
I think it needs to be said explicitly for all the CB23 pushers: Cinebench uses hand-optimized x86 SIMD code. Instead of actually rewriting that for Arm, Cinebench's Arm port relies on a library which autotranslates every x86 SIMD instruction to an equivalent sequence of NEON SIMD instructions.

This is a very quick and dirty way to stand up a port with okay performance. It is extremely far from being a true native port that is well optimized for Arm. If the situation were reversed, you'd be screaming to high heavens that x86 CPUs were being treated unfairly in the comparison - and you'd be right!

Stop using CB23 for cross-platform comparisons between x86 and Apple Silicon. It's simply pointless. Unless you like trolling, I guess.
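For the curious, here is a rough sketch of what that kind of translation shim looks like, written in the style of the open-source sse2neon header (illustrative code of mine, not Cinebench's actual source). A simple SSE intrinsic maps one-to-one onto NEON, but an instruction with no NEON counterpart, such as MOVMSKPS, has to be emulated by a whole sequence, which is where the quick-and-dirty overhead creeps in:

```c
// Illustrative only -- a shim in the style of sse2neon, not Cinebench's code.
#include <arm_neon.h>
#include <stdint.h>

typedef float32x4_t __m128;   // the x86 vector type is simply aliased to NEON's

// Easy case: one SSE intrinsic becomes one NEON instruction.
static inline __m128 _mm_add_ps(__m128 a, __m128 b) {
    return vaddq_f32(a, b);
}

// Harder case: MOVMSKPS (pack the four sign bits into an int) has no single
// NEON equivalent, so the shim emits a short instruction sequence instead.
static inline int _mm_movemask_ps(__m128 a) {
    uint32x4_t sign = vshrq_n_u32(vreinterpretq_u32_f32(a), 31);  // each lane: 0 or 1
    static const int32_t lane_shift[4] = {0, 1, 2, 3};
    sign = vshlq_u32(sign, vld1q_s32(lane_shift));                // move lane i to bit i
    return (int)vaddvq_u32(sign);                                 // horizontal add packs the mask
}
```

Hand-tuned NEON would instead be structured around what the Arm ISA is good at, rather than reproducing x86 semantics instruction by instruction.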
 

ahurst

macrumors 6502
Oct 12, 2021
410
815
Geekbench is a decent cross-platform estimator of amortized burst CPU performance across a series of different workloads. SPEC is the same but for sustained performance and a bit more serious. Blender is a good estimator of production 3D renderer performance on x86 CPUs, and that's about it.

But hey, we all know that there is only one true benchmark: Stockfisch!
Wasn't there a Nuvia white paper or something showing that GeekBench 5 ST/MT scores correlated very strongly (> 90%) with the equivalent SPEC benchmarks on the same CPUs?

Not that SPEC is representative of everyone's workloads, but it's the closest thing there is to a true "industry standard" for cross-platform/architecture performance comparisons (and has been for over 2 decades).

EDIT: Found it, was even stronger than I remembered! An R-squared of > 0.99 is just plain nuts.
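For anyone rusty on the statistic, R² here is the usual coefficient of determination of the linear fit between the two sets of scores:

$$ R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} $$

where y_i is a CPU's SPEC score, ŷ_i is the value predicted from its GB5 score by the fitted line, and ȳ is the mean SPEC score. R² > 0.99 means the linear fit leaves less than 1% of the variance in SPEC scores unexplained.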
 

pshufd

macrumors G4
Oct 24, 2013
10,146
14,572
New Hampshire
Paul's Hardware has a video on Intel's launch but it's just prices and dates. No benchmarks. I assume that we'll get those in a few weeks.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Wasn't there a Nuvia white paper or something showing that GeekBench 5 ST/MT scores correlated very strongly (> 90%) with the equivalent SPEC benchmarks on the same CPUs?

Not that SPEC is representative of everyone's workloads, but it's the closest thing there is to a true "industry standard" for cross-platform/architecture performance comparisons (and has been for over 2 decades).

EDIT: Found it, was even stronger than I remembered! An R-squared of > 0.99 is just plain nuts.
Does this mean that the best cross-platform benchmark is Geekbench 5 when thermal throttling is not an issue?

It is important to note that the observed correlation is not a fundamental property and can break under several scenarios.

One example is thermal effects. Geekbench typically runs quickly (in minutes) and especially so in our testing where the default workload gaps are removed, whereas SPEC CPU typically runs for hours. The net effect of this is that Geekbench 5 may achieve a higher average frequency because it is able to exploit the system’s thermal mass due to its short runtime. However SPEC CPU will be governed by the long term power dissipation capability of the system due to its long run-time.
 

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
Geekbench is garbage though compared to real workloads.

Fortunately, Phoronix already did the comparison so I don't have to.
Except that Geekbench is not garbage and is a far better predictor of general CPU performance than something like Blender.
 
  • Like
Reactions: pshufd

kvic

macrumors 6502a
Sep 10, 2015
516
460
In case someone in the audience is interested in building an AM5/Zen4 machine...

I found that in the first wave of X670/E-based boards, MSI boards have _the_ best PCIe lane layouts. You get three x16 mechanical slots, all connected to the CPU. Possible bifurcation configs: x16/x0/x4 or x8/x8/x4. The latter is plenty for running two top-end GPUs with zero compromise in bus bandwidth, and you still get one PCIe 5.0 x4 slot directly hooked up to the CPU.

Also, you can convert the PCIe 5.0 M.2 socket into a 4th slot (PCIe 5.0 x4), and one PCIe 4.0 M.2 socket into a 5th slot (PCIe 4.0 x4). What a good time to be alive; this kind of HEDT-level connectivity used to cost much more.
 

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
Wasn't there a Nuvia white paper or something showing that GeekBench 5 ST/MT scores correlated very strongly (> 90%) with the equivalent SPEC benchmarks on the same CPUs?

Not that SPEC is representative of everyone's workloads, but it's the closest thing there is to a true "industry standard" for cross-platform/architecture performance comparisons (and has been for over 2 decades).

EDIT: Found it, was even stronger than I remembered! An R-squared of > 0.99 is just plain nuts.
I don't know the extent to which this applies to SPEC 2017, but with the implementation of AVX512 by AMD, GB may no longer be a good cross-platform comparator.

The problem is that GB gives significant weight to the crypto score, and that is significantly accelerated by AVX512. AMD fully implements this, while of course AS doesn't (and I don't think they have anything equivalent yet). With Intel it's more complicated. Someone please correct me if I'm wrong but, IIUC, Intel implemented this in Rocket Lake (gen 11), but then removed it in Alder Lake (perhaps so that those needing it—presumably typically commercial customers—would be forced to buy Xeons, akin to how NVIDIA disabled FP64 on their consumer GPUs).

Anyways, the point is that AVX-512 has little effect on performance for consumers, since most apps consumers use don't implement it. Thus a benchmark that heavily rewards AVX-512 capability (as it appears GB does) might not be reflective of typical consumer workloads.

If the above is correct then, for typical consumer workloads, GB currently overestimates the performance of AMD-7000 and Xeon vs. AS and Intel-Alder Lake.
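As a rough sanity check on how much the crypto subsection can swing the composite, assuming the subsection weights given in Primate Labs' GB5 documentation (roughly 5% crypto, 65% integer, 30% floating point) and a simple weighted combination (both assumptions worth verifying against their docs):

$$ S_{\text{overall}} \approx 0.05\, S_{\text{crypto}} + 0.65\, S_{\text{int}} + 0.30\, S_{\text{fp}} $$

Under those assumptions, a hypothetical 3x crypto-subsection speedup from AVX-512 on an otherwise identical chip would inflate the composite by roughly 10%, which is in the same ballpark as the cross-platform gaps being argued about here.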

For more details, see the linked articles. From Anandtech:
"For our 3DPM v2.1 testing, we added in the Intel Core i9-11900K (Rocket Lake) to show performance across AVX workloads. Although Intel officially fused off the AVX2/512 extensions on Alder Lake which did cause a little controversy and gave the impression that AVX-512 on consumer platforms was dead. AMD clearly believes the opposite, as it has implemented it so that AVX-512 runs two cycles over a 256-bit wide instruction. The performance of the Ryzen 9 7950X here is phenomenal, although the Core i9-11900K which did indeed feature AVX instruction sets in the silicon, is still better than the Ryzen 5 7600X with AVX workloads."
 
Last edited:

Gerdi

macrumors 6502
Apr 25, 2020
449
301
The problem is that GB gives significant weight to the crypto score, and that is significantly accelerated by AVX512. AMD fully implements this, while of course AS doesn't (and I don't think they have anything equivalent yet). With Intel it's more complicated. Someone please correct me if I'm wrong but, IIUC, Intel implemented this in Rocket Lake (gen 11), but then removed it in Alder Lake (perhaps so that those needing it—presumably typically commercial customers—would be forced to buy Xeons, akin to how NVIDIA disabled FP64 on their consumer GPUs).

I do not see a problem at all. Geekbench is very transparent about the results - in fact I am always looking at the integer and floating point scores individually and you are not forced to take the combined score as reference.
 

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
I do not see a problem at all. Geekbench is very transparent about the results - in fact I am always looking at the integer and floating point scores individually and you are not forced to take the combined score as reference.
The problem is precisely with using the composite score, which is clearly what I was referring to. That's what nearly everyone uses, so that's where the caveat is needed. Of course you could use the individual scores as a workaround for this and, indeed, I was thinking of mentioning that in my post, but I thought it was so obvious that it did not bear mentioning...

I.e., the fact that there's a workaround for the problem with the composite score doesn't mean there's no problem. Rather, the very fact that you need a workaround means there is a problem. Not sure why you're arguing against this.

[This of course assumes the findings I referenced in my earlier post are correct—as I already mentioned in that post.]
 
Last edited:

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
I don't know the extent to which this applies to SPEC 2017, but with the implementation of AVX512 by AMD, GB may no longer be a good cross-platform comparator.

The problem is that GB gives significant weight to the crypto score, and that is significantly accelerated by AVX512. AMD fully implements this, while of course AS doesn't (and I don't think they have anything equivalent yet). With Intel it's more complicated.
I thought the criterion for considering a benchmark a good cross-platform benchmark was whether it was equally optimized for all platforms, not whether it reflects the software situation well. Cinebench is not a good cross-platform benchmark because it is better optimized for x86 than ARM, not because it doesn't reflect the software situation well.

Does GeekBench 5 use SIMD instructions similar to AVX512 on Apple hardware?
 

mi7chy

macrumors G4
Oct 24, 2014
10,619
11,293
While the 7950x is the king of productivity, the 7700x is the king of gaming at $300 less ($399 vs $699). A fat-cache 7700x3D and 7950x3D would be the GOAT.

 

EntropyQ3

macrumors 6502a
Mar 20, 2009
718
824
The problem is precisely with using the composite score, which is clearly what I was referring to. That's what nearly everyone uses, so that's where the caveat is needed.
That is not a problem with GeekBench 5 though, it's a problem with people being lazy.
I honestly feel that the obsession with boiling everything down to a single figure of merit for public consumption is destructive to the human mind and to people's ability to think and analyze in a broader context. The media carries a lot of the blame.
 

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
I thought the criterion for considering a benchmark a good cross-platform benchmark was whether it was equally optimized for all platforms, not whether it reflects the software situation well. Cinebench is not a good cross-platform benchmark because it is better optimized for x86 than ARM, not because it doesn't reflect the software situation well.

Does GeekBench 5 use SIMD instructions similar to AVX512 on Apple hardware?
I don't know. But as I understand it, NEON is the SIMD equivalent for ARMv8. I searched Primate Labs' GB Release Notes for "AVX" and found this:

[screenshot of release-note entries mentioning AVX]


Doing the same search for "NEON" I got no hits.

 

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
That is not a problem with GeekBench 5 though, it's a problem with people being lazy.
I honestly feel that the obsession with boiling everything down to a single figure of merit for public consumption is destructive to the human mind and to people's ability to think and analyze in a broader context. The media carries a lot of the blame.
I disagree. It's nice to have a single coarse-grained value you can use to scan through numerous processors, as long as you maintain an awareness that it is coarse-grained. It's the lazy thinking that doesn't maintain that awareness that I think we want to avoid.

Besides, the only way to be serious about benchmarking is to analyze your own workflow, app-by-app, and develop your own benchmarks that correspond directly to your workflow, and who actually does that? I do*, but I know I'm in the minority, and I'm not going to designate those who don't as lazy.

*For instance, I spent a weekend developing my own Mathematica benchmark, corresponding to the types of calculations I do (see results below comparing my 2019 27" i9 iMac to a Mac Studio). It showed me there's generally not yet enough of a gain (10%–20% faster, except for image processing, where it is substantially slower) to justify the $5200 + $1600 = $6800 I'd need to spend to replace my iMac with an equivalently-equipped (128 GB RAM + 2 TB SSD) Studio plus Studio Display.

[Mathematica benchmark results comparing the 2019 27" i9 iMac to the Mac Studio]
 
Last edited:
  • Like
Reactions: singhs.apps

Gerdi

macrumors 6502
Apr 25, 2020
449
301
I don't know. But as I understand it, NEON is the SIMD equivalent for ARMv8. I searched Primate Labs' GB Release Notes for "AVX" and found this:

[screenshot of release-note entries mentioning AVX]

Doing the same search for "NEON" I got no hits.


That is because NEON crypto support was already in at the release of Geekbench 5.1.0.

In any case, SPEC CPU does allow auto-vectorization for SIMD in the rate benchmarks, but no hand-optimized SIMD of course.
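To make the distinction concrete, here is a minimal sketch of mine (not SPEC source code). The first routine is plain C that the compiler is free to auto-vectorize at -O3, which per the above is all the SIMD a SPEC rate run gets; the second is the kind of hand-written intrinsics code that a benchmark like Cinebench ships:

```c
// Illustrative sketch only, not taken from any benchmark.
#include <stddef.h>
#include <immintrin.h>   // x86 intrinsics, used by the hand-written variant only

// Auto-vectorizable: no SIMD in the source. Any vector instructions in the
// binary come entirely from the compiler (-O3, -march=..., etc.).
void saxpy_plain(float *y, const float *x, float a, size_t n) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

// Hand-optimized: explicit SSE intrinsics written by a person.
// (Assumes n is a multiple of 4 to keep the sketch short.)
void saxpy_sse(float *y, const float *x, float a, size_t n) {
    __m128 va = _mm_set1_ps(a);
    for (size_t i = 0; i < n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);
        __m128 vy = _mm_loadu_ps(y + i);
        _mm_storeu_ps(y + i, _mm_add_ps(_mm_mul_ps(va, vx), vy));
    }
}
```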
 
  • Like
Reactions: theorist9

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
That is because NEON crypto support was already in at the release of Geekbench 5.1.0.
Sorry, not following: those release notes go back to GB 5.0 and don't mention NEON. And the release notes for 5.1.0 don't mention NEON either:


In addition a Google search of the entire primatelabs.com website for NEON doesn't turn up anything (at least that I could see):

site:primatelabs.com neon

[Emoticon was not intentional!]

...other than this from 2013:
[screenshot of a 2013 primatelabs.com result]


Could you please provide a link?

Also, how much acceleration does NEON provide compared to AVX-512?
 
Last edited: