
leman

macrumors Core
Oct 14, 2008
19,520
19,671
Geekbench is garbage though compared to real workloads.

Geekbench is a decent cross-platform estimator of amortized burst CPU performance across a series of different workloads. SPEC is the same but for sustained performance and a bit more serious. Blender is a good estimator of production 3D renderer performance on x86 CPUs, and that's about it.

But hey, we all know that there is only one true benchmark: Stockfisch!
 

mi7chy

macrumors G4
Oct 24, 2014
10,619
11,293
I'm very happy to be alive to see Intel being the underdog for the 3rd time in the past 20 years:

[attached chart]

Intel 13th gen is a laughing stock. Isn't it?

At what power consumption? If it's still 10nm (Intel 7) then it's worrying, but if Intel outsources to TSMC then they could potentially take the crown from AMD.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Geekbench is a decent cross-platform estimator of amortized burst CPU performance across a series of different workloads. SPEC is the same but for sustained performance and a bit more serious.
The benchmark might be cross-platform, but Anandtech's methodology might not be. Who benchmarks a CPU on WSL instead of Windows or Linux?
 

pshufd

macrumors G4
Oct 24, 2013
10,146
14,572
New Hampshire
At what power consumption? If it's still 10nm (Intel 7) then it's worrying, but if Intel outsources to TSMC then they could potentially take the crown from AMD.

Intel was supposed to go to 3nm but delayed it, and I think that Apple is going to take the capacity.

It was discussed here a while back.
 

leman

macrumors Core
Oct 14, 2008
19,520
19,671
The benchmark might be cross-platform, but Anandtech's methodology might not be. Who benchmarks a CPU on WSL instead of Windows or Linux?

The tooling is probably developed for Unix, so targeting Linux makes sense. And again, for CPU work it should make no difference whether it runs under WSL or directly on Windows.
 

mr_roboto

macrumors 6502a
Sep 30, 2020
856
1,866
I think it needs to be said explicitly for all the CB23 pushers: Cinebench uses hand-optimized x86 SIMD code. Instead of actually rewriting that for Arm, Cinebench's Arm port relies on a library which autotranslates every x86 SIMD instruction to an equivalent sequence of NEON SIMD instructions.

This is a very quick and dirty way to stand up a port with okay performance. It is extremely far from being a true native port that is well optimized for Arm. If the situation were reversed, you'd be screaming to high heavens that x86 CPUs were being treated unfairly in the comparison - and you'd be right!

Stop using CB23 for cross-platform comparisons between x86 and Apple Silicon. It's simply pointless. Unless you like trolling, I guess.
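For the curious, here is a rough sketch of what that kind of translation shim looks like, written in the style of the open-source sse2neon header (illustrative code of mine, not Cinebench's actual source). A simple SSE intrinsic maps one-to-one onto NEON, but an instruction with no NEON counterpart, such as MOVMSKPS, has to be emulated by a whole sequence, which is where the quick-and-dirty overhead creeps in:

```c
// Illustrative only -- a shim in the style of sse2neon, not Cinebench's code.
#include <arm_neon.h>
#include <stdint.h>

typedef float32x4_t __m128;   // the x86 vector type is simply aliased to NEON's

// Easy case: one SSE intrinsic becomes one NEON instruction.
static inline __m128 _mm_add_ps(__m128 a, __m128 b) {
    return vaddq_f32(a, b);
}

// Harder case: MOVMSKPS (pack the four sign bits into an int) has no single
// NEON equivalent, so the shim emits a short instruction sequence instead.
static inline int _mm_movemask_ps(__m128 a) {
    uint32x4_t sign = vshrq_n_u32(vreinterpretq_u32_f32(a), 31);  // each lane: 0 or 1
    static const int32_t lane_shift[4] = {0, 1, 2, 3};
    sign = vshlq_u32(sign, vld1q_s32(lane_shift));                // move lane i to bit i
    return (int)vaddvq_u32(sign);                                 // horizontal add packs the mask
}
```

Hand-tuned NEON would instead be structured around what the Arm ISA is good at, rather than reproducing x86 semantics instruction by instruction.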
 

ahurst

macrumors 6502
Oct 12, 2021
410
815
Geekbench is a decent cross-platform estimator of amortized burst CPU performance across a series of different workloads. SPEC is the same but for sustained performance and a bit more serious. Blender is a good estimator of production 3D renderer performance on x86 CPUs, and that's about it.

But hey, we all know that there is only one true benchmark: Stockfisch!
Wasn't there a Nuvia white paper or something showing that GeekBench 5 ST/MT scores correlated very strongly (> 90%) with the equivalent SPEC benchmarks on the same CPUs?

Not that SPEC is representative of everyone's workloads, but it's the closest thing there is to a true "industry standard" for cross-platform/architecture performance comparisons (and has been for over 2 decades).

EDIT: Found it, was even stronger than I remembered! An R-squared of > 0.99 is just plain nuts.
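For anyone rusty on the statistic, R² here is the usual coefficient of determination of the linear fit between the two sets of scores:

$$ R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} $$

where y_i is a CPU's SPEC score, ŷ_i is the value predicted from its GB5 score by the fitted line, and ȳ is the mean SPEC score. R² > 0.99 means the linear fit leaves less than 1% of the variance in SPEC scores unexplained.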
 

pshufd

macrumors G4
Oct 24, 2013
10,146
14,572
New Hampshire
Paul's Hardware has a video on Intel's launch but it's just prices and dates. No benchmarks. I assume that we'll get those in a few weeks.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Wasn't there a Nuvia white paper or something showing that GeekBench 5 ST/MT scores correlated very strongly (> 90%) with the equivalent SPEC benchmarks on the same CPUs?

Not that SPEC is representative of everyone's workloads, but it's the closest thing there is to a true "industry standard" for cross-platform/architecture performance comparisons (and has been for over 2 decades).

EDIT: Found it, was even stronger than I remembered! An R-squared of > 0.99 is just plain nuts.
Does this mean that the best cross-platform benchmark is Geekbench 5 when thermal throttling is not an issue?

It is important to note that the observed correlation is not a fundamental property and can break under several scenarios.

One example is thermal effects. Geekbench typically runs quickly (in minutes) and especially so in our testing where the default workload gaps are removed, whereas SPEC CPU typically runs for hours. The net effect of this is that Geekbench 5 may achieve a higher average frequency because it is able to exploit the system’s thermal mass due to its short runtime. However SPEC CPU will be governed by the long term power dissipation capability of the system due to its long run-time.
 

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
Geekbench is garbage though compared to real workloads.

Fortunately, Phoronix already did the comparison so I don't have to.
Except that Geekbench is not garbage and is a far better predictor of general CPU performance than something like Blender.
 
  • Like
Reactions: pshufd

kvic

macrumors 6502a
Sep 10, 2015
516
460
In case someone in the audience is interested in building an AM5/Zen4 machine...

I found that in the first wave of X670/E-based boards, MSI boards have _the_ best PCIe lane layouts. You get three x16 mechanical slots, all connected to the CPU. Possible bifurcation configs: x16/x0/x4 or x8/x8/x4. The latter is plenty for running two top-end GPUs with zero compromise in bus bandwidth, and you still get one PCIe 5.0 x4 slot directly hooked up to the CPU.

Also, you can convert the PCIe 5.0 M.2 socket into a 4th slot (PCIe 5.0 x4), and one PCIe 4.0 M.2 socket into a 5th slot (PCIe 4.0 x4). What a good time to be alive; this kind of HEDT-level connectivity used to cost much more.
 

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
Wasn't there a Nuvia white paper or something showing that GeekBench 5 ST/MT scores correlated very strongly (> 90%) with the equivalent SPEC benchmarks on the same CPUs?

Not that SPEC is representative of everyone's workloads, but it's the closest thing there is to a true "industry standard" for cross-platform/architecture performance comparisons (and has been for over 2 decades).

EDIT: Found it, was even stronger than I remembered! An R-squared of > 0.99 is just plain nuts.
I don't know the extent to which this applies to SPEC 2017, but with the implementation of AVX512 by AMD, GB may no longer be a good cross-platform comparator.

The problem is that GB gives significant weight to the crypto score, and that is significantly accelerated by AVX512. AMD fully implements this, while of course AS doesn't (and I don't think they have anything equivalent yet). With Intel it's more complicated. Someone please correct me if I'm wrong but, IIUC, Intel implemented this in Rocket Lake (gen 11), but then removed it in Alder Lake (perhaps so that those needing it—presumably typically commercial customers—would be forced to buy Xeons, akin to how NVIDIA disabled FP64 on their consumer GPUs).

Anyways, the point is that AVX-512 has little effect on performance for consumers, since most apps consumers use don't implement it. Thus a benchmark that heavily rewards AVX-512 capability (as it appears GB does) might not be reflective of typical consumer workloads.

If the above is correct then, for typical consumer workloads, GB currently overestimates the performance of AMD-7000 and Xeon vs. AS and Intel-Alder Lake.
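As a rough sanity check on how much the crypto subsection can swing the composite, assuming the subsection weights given in Primate Labs' GB5 documentation (roughly 5% crypto, 65% integer, 30% floating point) and a simple weighted combination (both assumptions worth verifying against their docs):

$$ S_{\text{overall}} \approx 0.05\, S_{\text{crypto}} + 0.65\, S_{\text{int}} + 0.30\, S_{\text{fp}} $$

Under those assumptions, a hypothetical 3x crypto-subsection speedup from AVX-512 on an otherwise identical chip would inflate the composite by roughly 10%, which is in the same ballpark as the cross-platform gaps being argued about here.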

For more details, see the linked articles. From Anandtech:
"For our 3DPM v2.1 testing, we added in the Intel Core i9-11900K (Rocket Lake) to show performance across AVX workloads. Although Intel officially fused off the AVX2/512 extensions on Alder Lake which did cause a little controversy and gave the impression that AVX-512 on consumer platforms was dead. AMD clearly believes the opposite, as it has implemented it so that AVX-512 runs two cycles over a 256-bit wide instruction. The performance of the Ryzen 9 7950X here is phenomenal, although the Core i9-11900K which did indeed feature AVX instruction sets in the silicon, is still better than the Ryzen 5 7600X with AVX workloads."
 
Last edited:

Gerdi

macrumors 6502
Apr 25, 2020
449
301
The problem is that GB gives significant weight to the crypto score, and that is significantly accelerated by AVX512. AMD fully implements this, while of course AS doesn't (and I don't think they have anything equivalent yet). With Intel it's more complicated. Someone please correct me if I'm wrong but, IIUC, Intel implemented this in Rocket Lake (gen 11), but then removed it in Alder Lake (perhaps so that those needing it—presumably typically commercial customers—would be forced to buy Xeons, akin to how NVIDIA disabled FP64 on their consumer GPUs).

I do not see a problem at all. Geekbench is very transparent about the results - in fact I am always looking at the integer and floating point scores individually and you are not forced to take the combined score as reference.
 

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
I do not see a problem at all. Geekbench is very transparent about the results - in fact I am always looking at the integer and floating point scores individually and you are not forced to take the combined score as reference.
The problem is precisely with using the composite score, which is clearly what I was referring to. That's what nearly everyone uses, so that's where the caveat is needed. Of course you could use the individual scores as a workaround for this and, indeed, I was thinking of mentioning that in my post, but I thought it was so obvious that it did not bear mentioning...

I.e., the fact that there's a workaround for the problem with the composite score doesn't mean there's no problem. Rather, the very fact that you need a workaround means there is a problem. Not sure why you're arguing against this.

[This of course assumes the findings I referenced in my earlier post are correct—as I already mentioned in that post.]
 
Last edited:

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
I don't know the extent to which this applies to SPEC 2017, but with the implementation of AVX512 by AMD, GB may no longer be a good cross-platform comparator.

The problem is that GB gives significant weight to the crypto score, and that is significantly accelerated by AVX512. AMD fully implements this, while of course AS doesn't (and I don't think they have anything equivalent yet). With Intel it's more complicated.
I thought the criterion for considering a benchmark a good cross-platform benchmark was whether it was equally optimized for all platforms, not whether it reflects the software situation well. Cinebench is not a good cross-platform benchmark because it is better optimized for x86 than ARM, not because it doesn't reflect the software situation well.

Does GeekBench 5 use SIMD instructions similar to AVX512 on Apple hardware?
 

mi7chy

macrumors G4
Oct 24, 2014
10,619
11,293
While the 7950x is the king of productivity, the 7700x is the king of gaming at $300 less ($399 vs $699). A fat-cache 7700x3D and 7950x3D would be the GOAT.

 

EntropyQ3

macrumors 6502a
Mar 20, 2009
718
824
The problem is precisely with using the composite score, which is clearly what I was referring to. That's what nearly everyone uses, so that's where the caveat is needed.
That is not a problem with GeekBench 5 though, it's a problem with people being lazy.
I honestly feel that the obsession with boiling everything down to a single figure of merit for public consumption is destructive to the human mind and to people's ability to think and analyze in a broader context. The media carries a lot of the blame.
 

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
I thought the criterion for considering a benchmark a good cross-platform benchmark was whether it was equally optimized for all platforms, not whether it reflects the software situation well. Cinebench is not a good cross-platform benchmark because it is better optimized for x86 than ARM, not because it doesn't reflect the software situation well.

Does GeekBench 5 use SIMD instructions similar to AVX512 on Apple hardware?
I don't know. But as I understand it, NEON is the SIMD equivalent for ARMv8. I searched Primate Labs' GB Release Notes for "AVX" and found this:

[screenshot of release-note entries mentioning AVX]


Doing the same search for "NEON" I got no hits.

 

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
That is not a problem with GeekBench 5 though, it's a problem with people being lazy.
I honestly feel that the obsession with boiling everything down to a single figure of merit for public consumption is destructive to the human mind and to people's ability to think and analyze in a broader context. The media carries a lot of the blame.
I disagree. It's nice to have a single coarse-grained value you can use to scan through numerous processors, as long as you maintain an awareness that it is coarse-grained. It's the lazy thinking that doesn't maintain that awareness that I think we want to avoid.

Besides, the only way to be serious about benchmarking is to analyze your own workflow, app-by-app, and develop your own benchmarks that correspond directly to your workflow, and who actually does that? I do*, but I know I'm in the minority, and I'm not going to designate those who don't as lazy.

*For instance, I spent a weekend developing my own Mathematica benchmark, corresponding to the types of calculations I do (see results below comparing my 2019 27" i9 iMac to a Mac Studio). It showed me there's generally not yet enough of a gain (10%–20% faster, except for image processing, where it is substantially slower) to justify the $5200 + $1600 = $6800 I'd need to spend to replace my iMac with an equivalently-equipped (128 GB RAM + 2 TB SSD) Studio plus Studio Display.

[Mathematica benchmark results comparing the 2019 27" i9 iMac to the Mac Studio]
 
Last edited:
  • Like
Reactions: singhs.apps

Gerdi

macrumors 6502
Apr 25, 2020
449
301
I don't know. But as I understand it, NEON is the SIMD equivalent for ARMv8. I searched Primate Labs' GB Release Notes for "AVX" and found this:

[screenshot of release-note entries mentioning AVX]

Doing the same search for "NEON" I got no hits.


That is because NEON crypto support was already in at the release of Geekbench 5.1.0.

In any case, SPEC CPU does allow auto-vectorization for SIMD in the rate benchmarks, but no hand-optimized SIMD of course.
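To make the distinction concrete, here is a minimal sketch of mine (not SPEC source code). The first routine is plain C that the compiler is free to auto-vectorize at -O3, which per the above is all the SIMD a SPEC rate run gets; the second is the kind of hand-written intrinsics code that a benchmark like Cinebench ships:

```c
// Illustrative sketch only, not taken from any benchmark.
#include <stddef.h>
#include <immintrin.h>   // x86 intrinsics, used by the hand-written variant only

// Auto-vectorizable: no SIMD in the source. Any vector instructions in the
// binary come entirely from the compiler (-O3, -march=..., etc.).
void saxpy_plain(float *y, const float *x, float a, size_t n) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

// Hand-optimized: explicit SSE intrinsics written by a person.
// (Assumes n is a multiple of 4 to keep the sketch short.)
void saxpy_sse(float *y, const float *x, float a, size_t n) {
    __m128 va = _mm_set1_ps(a);
    for (size_t i = 0; i < n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);
        __m128 vy = _mm_loadu_ps(y + i);
        _mm_storeu_ps(y + i, _mm_add_ps(_mm_mul_ps(va, vx), vy));
    }
}
```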
 
  • Like
Reactions: theorist9

theorist9

macrumors 68040
May 28, 2015
3,880
3,059
That is because NEON crypto support was already in at the release of Geekbench 5.1.0.
Sorry, not following: those release notes go back to GB 5.0 and don't mention NEON. And the release notes for 5.1.0 don't mention NEON either:


In addition a Google search of the entire primatelabs.com website for NEON doesn't turn up anything (at least that I could see):

site:primatelabs.com neon

[Emoticon was not intentional!]

...other than this from 2013:
[screenshot of a 2013 primatelabs.com result]


Could you please provide a link?

Also, how much acceleration does NEON provide compared to AVX-512?
 
Last edited: