tl;dr - They are not fair comparisons.
I'm not going to go very deep, just deep enough for you to understand why these comparisons are misleading.
1. Cinebench R23
CR23's render engine uses Intel Embree, Intel's library for accelerating ray-tracing computation on the CPU. Embree supports several SIMD instruction sets for the x86 architecture, among them SSE and AVX2. AVX2 is a newer, wider instruction set than SSE, and CR23 leans heavily on AVX, so you can see where this is going. ARM's SIMD instruction set is NEON, and Embree naturally has no native NEON implementation. So for CR23 to run on Apple silicon at all, Embree had to be ported to ARM64, which, thanks to Syoyo Fujita, became possible. Porting means translating the SSE or AVX2 intrinsics into NEON intrinsics, which is a huge pain to do for every application. There is a library (a header, actually) that automates this, but it only covers SSE-to-NEON (sse2neon), not AVX2-to-NEON. Going by the GitHub comments on Apple's pull request against Intel Embree, Apple is working on bringing AVX2-to-NEON support to Apple silicon. Even after that, I'm not sure CR23 will be a fair comparison: if Intel introduces a superior SIMD instruction set, does Apple have to file yet another Embree pull request for a NEON translation? Man, that's PAIN.
2. Geekbench GPU Compute
First of all, I've seen a few comments here saying you can't compare Metal vs. CUDA. Not true: Geekbench is a cross-platform benchmark, and comparing its Metal and CUDA scores is perfectly fine. What is not fair is an OpenCL comparison, since OpenCL is deprecated on macOS. But the real issue is that, for some reason, the GPU compute benchmark on Apple silicon doesn't ramp up GPU frequencies or draw anywhere near the power the GPU consumes under full load. How can that be a fair comparison when the GPU isn't even fully utilized on Apple silicon? This was first noted by Andrei Frumusanu, ex-AnandTech and currently at Nuvia, in a comment on the M1 Pro/M1 Max review.
3. Questions you might have
A. If Geekbench GPU Compute doesn't work as expected on Apple silicon, how can we compare GPU performance against Nvidia or AMD?
I would highly recommend GFXBench 5.0 Aztec Ruins High 1440p Offscreen and 3DMark Wild Life Extreme Unlimited. Both are native to Apple silicon with Metal support and, more importantly, really stress the GPU; since they are offscreen tests, they give a clear picture of performance. Keep in mind, though, that 3DMark is still an iOS app, and I'm not sure whether that carries any penalty versus the native Windows implementation. And no, SPECviewperf v2.0 doesn't support Metal, if you were wondering.
Below are the screencaps from Dave2D's and Ars Technica's Mac Studio reviews:
B. If Apple silicon GPUs are so powerful, why are Blender benchmarks underwhelming compared to Nvidia's?
Two reasons:
-> Blender 3.1 is only the first stable release with Metal support in Cycles, and Blender themselves, in a video going over all the updates, said that more performance optimizations for Metal are yet to come. I would definitely expect Apple silicon GPUs to match the CUDA scores of the latest Nvidia GPUs in Blender benchmarks in the future.
-> But that's only against CUDA. Nvidia would still smoke Apple silicon with OptiX, because Apple has nothing comparable: there are no ray-tracing cores in Apple GPUs for Metal to take advantage of. I'd love to see Apple package RT cores into their GPU designs and optimize Metal to use those cores, or even write a separate accelerated ray-tracing API like OptiX.
C. How can we compare the CPU performance of Apple Silicon against an x86 chip if CR23 is not fair?
As a consumer, I really don't know. Maybe Blender's CPU benchmarks? If you're a professional, you already know the industry-standard suites like SPEC (SPECint, SPECfp, etc.). But I don't think anyone except AnandTech uses these benchmarks, and the real problem is these YouTubers, man. It's painful to watch, and even more painful to read the comments of viewers who take these benchmark results as if they're all that matters when buying a machine.
D. Are there any games out there that would be a fair comparison for measuring GPU performance?
World of Warcraft. It's one of the very few games that is native to Apple silicon and also supports Metal.
4. Final Note
I have reached out to The Verge (Becca, Monica, Nilay, and Chaim) and Ars Technica (Andrew Cunningham) to correct their recent Mac Studio video/article. I didn't get any reply. I also reached out to the Linus and MKBHD guys (Andrew, Adam, and Vinh) with these points for their upcoming reviews. Again, no reply. I don't blame them, though; maybe they haven't seen my messages yet, since I reached out via Twitter DM after all. Hence I wrote this post to bring a little awareness to people who might not know these details. Finally, it is very important to understand that Apple doesn't sell you SoCs; they sell you computers. So choose wisely, without falling for YouTubers or tech publications like The Verge who run these benchmarks without doing any research on the tools they use or on the inaccurate conclusions that can come out of the results.
Cheers!