Geekbench METAL scores

Asgorath · Apr 5, 2017

Yahooligan said:
Gaming GPU vs workstation GPU, the gaming GPUs are great for gaming but don't do as well at computations. Workstation GPUs do better at computations but aren't as good for gaming.

My D700 and my wife's GTX 680 are both in the 3.1-3.5 tflop range, my D700 kills her GTX 680 in computational benchmarks like OpenCL, her GTX 680 wins when doing rendering benchmarks like Valley.

This is an incorrect assessment of the situation. TFLOPs is a measurement of raw computational power. If the RX 470 has more TFLOPs than the D700s, then it has more raw computational horsepower. However, this raw power is very rarely the bottleneck in benchmarks, for both OpenCL and OpenGL/Metal. Many OpenCL benchmarks have been written for or tuned for the AMD architecture, and thus run extremely inefficiently on the NVIDIA architecture (since they are fundamentally different). Most compute code written/tuned for NVIDIA uses CUDA, as it exposes more of the underlying architecture to the application. There are a few OpenCL examples like Oceanwave and a face recognition benchmark that run much faster on NVIDIA than AMD, but again, that's probably because they were written on NVIDIA and thus have an implicit bias for that architecture.

As always, it really just boils down to the applications you want to run. If you care about LuxMark, then buy an AMD card. If you care about DaVinci Resolve, then buy an NVIDIA card.

linuxcooldude · Apr 5, 2017

Asgorath said:
This is an incorrect assessment of the situation. TFLOPs is a measurement of raw computational power. If the RX 470 has more TFLOPs than the D700s, then it has more raw computational horsepower. However, this raw power is very rarely the bottleneck in benchmarks, for both OpenCL and OpenGL/Metal. Many OpenCL benchmarks have been written for or tuned for the AMD architecture, and thus run extremely inefficiently on the NVIDIA architecture (since they are fundamentally different). Most compute code written/tuned for NVIDIA uses CUDA, as it exposes more of the underlying architecture to the application. There are a few OpenCL examples like Oceanwave and a face recognition benchmark that run much faster on NVIDIA than AMD, but again, that's probably because they were written on NVIDIA and thus have an implicit bias for that architecture.

As always, it really just boils down to the applications you want to run. If you care about LuxMark, then buy an AMD card. If you care about DaVinci Resolve, then buy an NVIDIA card.

Both are AMD cards other than the noted differences in models. Geekbench supports Metal/OpenCL/CUDA. But looking up stats for Geekbench 4 a staffer does say it only uses one GPU at a time for the compute benchmark. Interesting....

So should my metal score actually be 107,852 for 2 cards? Haha..lol

H2SO4 · Apr 5, 2017

Yahooligan said:
Gaming GPU vs workstation GPU, the gaming GPUs are great for gaming but don't do as well at computations. Workstation GPUs do better at computations but aren't as good for gaming.

My D700 and my wife's GTX 680 are both in the 3.1-3.5 tflop range, my D700 kills her GTX 680 in computational benchmarks like OpenCL, her GTX 680 wins when doing rendering benchmarks like Valley.

Ok, cheers. Makes sense.

linuxcooldude · Apr 5, 2017

Synchro3 said:
Ok, new Metal Benchmark with GTX Titan X, this time on Sierra, and latest Nvidia web driver:
View attachment 694880

View attachment 694881

Wow, in CUDA it scores 237366?

Yahooligan · Apr 5, 2017

Asgorath said:
This is an incorrect assessment of the situation. TFLOPs is a measurement of raw computational power. If the RX 470 has more TFLOPs than the D700s, then it has more raw computational horsepower. However, this raw power is very rarely the bottleneck in benchmarks, for both OpenCL and OpenGL/Metal. Many OpenCL benchmarks have been written for or tuned for the AMD architecture, and thus run extremely inefficiently on the NVIDIA architecture (since they are fundamentally different). Most compute code written/tuned for NVIDIA uses CUDA, as it exposes more of the underlying architecture to the application. There are a few OpenCL examples like Oceanwave and a face recognition benchmark that run much faster on NVIDIA than AMD, but again, that's probably because they were written on NVIDIA and thus have an implicit bias for that architecture.

As always, it really just boils down to the applications you want to run. If you care about LuxMark, then buy an AMD card. If you care about DaVinci Resolve, then buy an NVIDIA card.

If you go back to what the original discussion was, the complaint was that the RX470 (AMD card) is slower at Metal than the D700 (Also AMD) even though it has more TFLOPS. So, while the AMD vs Nvidia comparison is valid and benchmarks run better on what they've been optimized for, that doesn't explain why a "faster" gaming GPU is slower running the Metal benchmark than a "slower" D700. I still stand by it being gaming vs workstation GPU and what each is optimized for. Workstation/Computational GPUs with more power simply don't do as well gaming and gaming GPUs don't do as well with computations regardless of their TFLOPS.

Just my $0.02...

Synchro3 · Apr 5, 2017

linuxcooldude said:
Wow, in CUDA it scores 237366?

? No, in CUDA it scores 139735: https://browser.geekbench.com/v4/compute/614412

Well, I could install my GTX 980 Ti as second GPU in the Mac Pro to achieve that score, but it is already in my Kaby Lake-PC.

Asgorath · Apr 5, 2017

Yahooligan said:
If you go back to what the original discussion was, the complaint was that the RX470 (AMD card) is slower at Metal than the D700 (Also AMD) even though it has more TFLOPS. So, while the AMD vs Nvidia comparison is valid and benchmarks run better on what they've been optimized for, that doesn't explain why a "faster" gaming GPU is slower running the Metal benchmark than a "slower" D700. I still stand by it being gaming vs workstation GPU and what each is optimized for. Workstation/Computational GPUs with more power simply don't do as well gaming and gaming GPUs don't do as well with computations regardless of their TFLOPS.

Just my $0.02...

I was specifically commenting on this:

"My D700 and my wife's GTX 680 are both in the 3.1-3.5 tflop range, my D700 kills her GTX 680 in computational benchmarks like OpenCL, her GTX 680 wins when doing rendering benchmarks like Valley."

which is an AMD vs NVIDIA comparison. Also, the OpenCL tests might make good use of 2 GPUs and thus the 2 D700s could beat a single RX 470 if the test isn't limited by raw TFLOPs.

Edit: My main point is that we see a lot of posts along the lines of "GPU X has more TFLOPs than GPU Y but GPU Y runs application Z faster, what's up?". The simple answer is that most applications are not limited by raw GPU TFLOPs and the limiting factor is something else. As a result, you should always take the raw TFLOPs numbers with a huge grain of salt.

Yahooligan · Apr 5, 2017

Asgorath said:
I was specifically commenting on this:

"My D700 and my wife's GTX 680 are both in the 3.1-3.5 tflop range, my D700 kills her GTX 680 in computational benchmarks like OpenCL, her GTX 680 wins when doing rendering benchmarks like Valley."

which is an AMD vs NVIDIA comparison. Also, the OpenCL tests might make good use of 2 GPUs and thus the 2 D700s could beat a single RX 470 if the test isn't limited by raw TFLOPs.

Edit: My main point is that we see a lot of posts along the lines of "GPU X has more TFLOPs than GPU Y but GPU Y runs application Z faster, what's up?". The simple answer is that most applications are not limited by raw GPU TFLOPs and the limiting factor is something else. As a result, you should always take the raw TFLOPs numbers with a huge grain of salt.

Emphasis added by my, you are incorrect regarding the Geekbench OpenCL benchmarks as well. These only run on a single GPU at a time. One of my D700 GPUs handily beats my wife's GTX 680. Yes, it's AMD vs Nvidia, but you seem to be commenting on things you're not fully understanding, either. Which is fine. And yes, raw TFLOPS is just a number that doesn't translate well into real-world performance expectations.

My point also stands. Gaming GPUs and computing GPUs are better at different things. My car may have 500HP but my 350HP truck will easily tow more, faster, and for longer periods.

shaunoneil100 · May 4, 2017

Synchro3 said:
? No, in CUDA it scores 139735: https://browser.geekbench.com/v4/compute/614412

Well, I could install my GTX 980 Ti as second GPU in the Mac Pro to achieve that score, but it is already in my Kaby Lake-PC.

View attachment 695022

Just new to this and I was thinking of taking back the new 1080 Ti for 2 X R9 390x due to my high use in FCPX - however this ego boost and virtual forum whipping stick that geek bench just gave me I might just stick with it.

All jokes aside - what are the thoughts on future support on the Nvidia side to push for better OpenCL capability as my work depends on it. Y'all think I should keep the 1080ti or switch to CF 390x's

6700k
32gb ram
msi 1080ti aero oc (can't overclock it in OS X as far as I know - any suggestions?)
250gb 960 evo m.2

AndreeOnline · May 4, 2017

I actually took my 1080 Ti back yesterday. I had 14 days before I couldn't return it anymore and those days were up so...

The 1080 Ti is great hardware in itself. I found performance in Resolve in CUDA mode to be great. I use Maxwell Render 4 with GPU support and saw nice performance there too, but not in all conditions.

FCPX playback wasn't problematic per se, but BruceX could take a minute to export. F1 2016 hung every now and then. Luxmark Luxball worked, but the heavier scenes didn't. Geekbench OpenCL didn't work.

As a test I put my RX 480 in again and tested the F1 2016 benchmark that made the 1080 Ti hang, and it turned out not only didn't the 480 hang, but it also beat the 1080 Ti in performance.

So.. ups and downs. At the end it came down to simply recognising the fact that the drivers aren't completely up to speed yet. They may, or may not, work better in the future. But I decided not to wait and find out and returned the card.

I'll try to wait for Vega and see if that will work. I also think the Radeon Pro Duo looks very interesting with 11.5 TFlops for $995. I could even drop two Pro Duo in the Mac Pro for some sweet 23 TFlops. =)

h9826790 · May 4, 2017

H2SO4 said:
Benchmark suites are odd things. My RX470 has a higher teraflop output than both of your D700s combined yet gets a lower score.

No, the RX470 is NOT stronger than 2x D700.

Also, the driver for D700 is very mature and highly optimised. On the other hard, there is no official support for RX470. You can make it work by kext edit doesn't mean that the driver can release the card's full potential. In fact, MacOS may not even able to use all 32CU.

H2SO4 · May 4, 2017

h9826790 said:
No, the RX470 is NOT stronger than 2x D700.

Also, the driver for D700 is very mature and highly optimised. On the other hard, there is no official support for RX470. You can make it work by kext edit doesn't mean that the driver can release the card's full potential. In fact, MacOS may not even able to use all 32CU.

My bad. I find it funny that they would post individual specs for memory but not for throughput, ie. say 6GB VRAM each but not say 3.5 TFLOPS each.

Dual AMD FirePro

D700

graphics processors

Dual AMD FirePro D700 graphics processors with 6GB of GDDR5 VRAM each

2,048 stream processors
384-bit-wide memory bus
264 GB/s memory bandwidth
3.5 teraflops performance

PowerMike G5 · May 5, 2017

AndreeOnline said:
I actually took my 1080 Ti back yesterday. I had 14 days before I couldn't return it anymore and those days were up so...

The 1080 Ti is great hardware in itself. I found performance in Resolve in CUDA mode to be great. I use Maxwell Render 4 with GPU support and saw nice performance there too, but not in all conditions.

FCPX playback wasn't problematic per se, but BruceX could take a minute to export. F1 2016 hung every now and then. Luxmark Luxball worked, but the heavier scenes didn't. Geekbench OpenCL didn't work.

As a test I put my RX 480 in again and tested the F1 2016 benchmark that made the 1080 Ti hang, and it turned out not only didn't the 480 hang, but it also beat the 1080 Ti in performance.

So.. ups and downs. At the end it came down to simply recognising the fact that the drivers aren't completely up to speed yet. They may, or may not, work better in the future. But I decided not to wait and find out and returned the card.

I'll try to wait for Vega and see if that will work. I also think the Radeon Pro Duo looks very interesting with 11.5 TFlops for $995. I could even drop two Pro Duo in the Mac Pro for some sweet 23 TFlops. =)

Yes, it really does come down to whether your intended usage gets the speed increase from the new hardware.

I also bought the 1080 Ti the other day to see if it would accelerate my mostly 4K Adobe cc workflow, over the current Titan X Maxwell I am using. All the synthetic benchmarks were indeed showing a roughly 70% increase in CUDA and OpenCL performance, via GeekBench, LuxMark and Unigine.

But then I tried out some real-world rendering tests, pertinent to my daily workload.

In Adobe Premiere, I rendered out a DCI 4K ProRes(HQ) 30-sec clip with 4 effects applied: Lumtri Color with 2 LUTs applied, another Lumetri Color with optical mask tracking, a Colortista/Mojo Filter, and NeatVideo noise reduction. The Neatvideo filter itself can be assigned full resources of your hardware, so I applied 11 of my physical CPU cores and 100% of the VRAM and computing from the GPUs to the filter applied to the footage).

This is where I was a bit surprised with the results:

Titan X - CUDA - 05:59
Titan X - OpenCL - 06:04

1080 Ti - CUDA - 05:53
1080 Ti - OpenCL - 05:52

Then I took a 02:15 DCI 4K ProRes (HQ) clip and exported it out in Adobe Media Encoder as a 2K H.264 master at 25mbps.

Titan X - AME - CUDA - 01:29
Titan X - AME - OpenCL - 01:29

1080 Ti - AME - CUDA - 01:30
1080 Ti - AME - OpenCL - 01:29

Suffice to say, though the new GPU hardware itself was more powerful, the results were mostly the same as my older GPU in my workflow.

SoyCapitanSoyCapitan · May 5, 2017

PowerMike G5 said:
Yes, it really does come down to whether your intended usage gets the speed increase from the new hardware.

I also bought the 1080 Ti the other day to see if it would accelerate my mostly 4K Adobe cc workflow, over the current Titan X Maxwell I am using. All the synthetic benchmarks were indeed showing a roughly 70% increase in CUDA and OpenCL performance, via GeekBench, LuxMark and Unigine.

But then I tried out some real-world rendering tests, pertinent to my daily workload.

In Adobe Premiere, I rendered out a DCI 4K ProRes(HQ) 30-sec clip with 4 effects applied: Lumtri Color with 2 LUTs applied, another Lumetri Color with optical mask tracking, a Colortista/Mojo Filter, and NeatVideo noise reduction. The Neatvideo filter itself can be assigned full resources of your hardware, so I applied 11 of my physical CPU cores and 100% of the VRAM and computing from the GPUs to the filter applied to the footage).

This is where I was a bit surprised with the results:

Titan X - CUDA - 05:59
Titan X - OpenCL - 06:04

1080 Ti - CUDA - 05:53
1080 Ti - OpenCL - 05:52

Then I took a 02:15 DCI 4K ProRes (HQ) clip and exported it out in Adobe Media Encoder as a 2K H.264 master at 25mbps.

Titan X - AME - CUDA - 01:29
Titan X - AME - OpenCL - 01:29

1080 Ti - AME - CUDA - 01:30
1080 Ti - AME - OpenCL - 01:29

Suffice to say, though the new GPU hardware itself was more powerful, the results were mostly the same as my older GPU in my workflow.

Did these same tests on this forum more than a year ago. Same AME results on Open CL and CUDA. Then someone informed us that the GPU doesn't encode h.264. Sure enough I turned on software rendering and the result was the same. There's no GPU rendering for some codecs on macOS.

Then I rebooted into Bootcamp and the AME render result was exactly 4x faster than macOS. On software rendering! On the same machine!

PowerMike G5 · May 5, 2017

SoyCapitanSoyCapitan said:
Did these same tests on this forum more than a year ago. Same AME results on Open CL and CUDA. Then someone informed us that the GPU doesn't encode h.264. Sure enough I turned on software rendering and the result was the same. There's no GPU rendering for some codecs on macOS.

Then I rebooted into Bootcamp and the AME render result was exactly 4x faster than macOS. On software rendering! On the same machine!

Yes, it seems like AME is using the GPU in the same way as Premiere Pro mostly. So if someone is rendering out a standalone master clip to, say H264, the GPU is only handling the scaling, if there is any at all. Otherwise, it is mostly CPU in this case (my 12-core Mac Pro was using all cores in this instance).

The GPU looks like it will come into play far more in AME, if exporting from a Premiere timeline that hasn't been rendered out. In that case, the GPU will accelerate any effects/scaling/etc. that have been optimized as using the GPU for such.

So GPU acceleration can still have quite an impact, but it depends on how one works on their machine.

ActionableMango · May 5, 2017

Yahooligan said:
Gaming GPU vs workstation GPU, the gaming GPUs are great for gaming but don't do as well at computations. Workstation GPUs do better at computations but aren't as good for gaming.

My D700 and my wife's GTX 680 are both in the 3.1-3.5 tflop range, my D700 kills her GTX 680 in computational benchmarks like OpenCL, her GTX 680 wins when doing rendering benchmarks like Valley.

The big problem with the theory you've picked (gaming vs workstation) is that the D700 is really just an HD7970. They are the same card, they use the same drivers, and they bench the same. Heck, they even have the exact same ID--AMD didn't bother to give the D700 a different one. There have been many discussions about this in the past. The D700 is not a workstation card except in branding.

ATI did an article where they explained the difference between workstation and gaming cards. Workstation cards are not faster than (or slower than) their equivalent gaming cards. The exception is where there are highly optimized workstation-GPU-only drivers. But these are on Windows only, not OS X, and wouldn't help a D700 there anyway because the card reports itself identically to an HD7970.

Your D700 vs GTX680 comparison is apples and oranges. Differences in benchmarks for that particular pair of cards can be explained many different ways, from having different architectures, to using different drivers, and to which brand the software is optimized for. A 7970 will perform just as well as a D700 against a GTX680 in OpenCL and just as poorly in Valley, so "workstation" vs "gaming" is not the explanation.

whartung · May 5, 2017

RX460 here. I'm having second thoughts with this card - I don't think its as OOB as one might think - it seems to crash some programs after running for a while.

linuxcooldude · May 5, 2017

ActionableMango said:
The big problem with the theory you've picked (gaming vs workstation) is that the D700 is really just an HD7970. They are the same card, they use the same drivers, and they bench the same. Heck, they even have the exact same ID--AMD didn't bother to give the D700 a different one. There have been many discussions about this in the past. The D700 is not a workstation card except in branding.

ATI did an article where they explained the difference between workstation and gaming cards. Workstation cards are not faster than (or slower than) their equivalent gaming cards. The exception is where there are highly optimized workstation-GPU-only drivers. But these are on Windows only, not OS X, and wouldn't help a D700 there anyway because the card reports itself identically to an HD7970.

Your D700 vs GTX680 comparison is apples and oranges. Differences in benchmarks for that particular pair of cards can be explained many different ways, from having different architectures, to using different drivers, and to which brand the software is optimized for. A 7970 will perform just as well as a D700 against a GTX680 in OpenCL and just as poorly in Valley, so "workstation" vs "gaming" is not the explanation.

Again, AMD/Nvidia base a lot of their Quattro/Firepro cards on other Radeon/GTX offerings. Besides differences in drivers/support and perhaps slight chip differences. But I would not call them a workstation card just because of that distinction alone.

mmomega · May 10, 2017

Here's my 10.2.4 results.
1080Ti

Screen Shot 2017-05-10 at 8.11.50 PM.png

AidenShaw · May 10, 2017

mmomega said:
Here's my 10.2.4 results.
1080Ti
View attachment 699269

So CUDA is more than three times faster than Metal.

SAD! ^H^H^H^H^H That's rather disappointing.

mmomega · May 10, 2017

AidenShaw said:
So CUDA is more than three times faster than Metal.

SAD! ^H^H^H^H^H That's rather disappointing.

This was my thought. This is on the same machine. Run One test, turn around run the 2nd.

linuxcooldude · May 11, 2017

AidenShaw said:
So CUDA is more than three times faster than Metal.

SAD! ^H^H^H^H^H That's rather disappointing.

No surprise there. Nvidia is not optimized in Metal as much as CUDA. Might change with the new Mac Pro.

koyoot · May 11, 2017

AidenShaw said:
So CUDA is more than three times faster than Metal.

SAD! ^H^H^H^H^H That's rather disappointing.

Ahem. There is M395X from iMac in there. And it still scores higher in OpenCL than GTX 1080 Ti, in Metal

.

So it appears its not matter of Metal, but matter of Nvidia rubbish Metal/OpenCL drivers.

P.S. R9 395X has 3.7 TFLOPs of compute power. GTX 1080 Ti - 11.5 TFLOPs.

So CUDA performance actually does not reflect the difference in performance that should be apparent between both GPUs.

But this is MacOS.

flowrider · Jul 27, 2017

Synchro3 said:
Yes. Only OpenCL benchmark does not work.

It does now with the release today of Geekbench 4.1.1. GeekBench 4.1.1 now works with 10.12.6 and Nvidia Web Drivers on the OpenCL test.

Lou

Asgorath · Jul 27, 2017

Oh look, it was an application bug that was preventing it from running on NVIDIA:

http://geekbench.com/blog/2017/07/geekbench-411/

Fixes an issue that could prevent the Compute Benchmark from running on recent NVIDIA GPUs under macOS.

Geekbench METAL scores

macrumors 68000

macrumors 68020

macrumors 603

macrumors 68020

macrumors 6502a

macrumors 68000

macrumors 68000

macrumors 6502a

macrumors newbie

Attachments

macrumors 6502a

macrumors P6

macrumors 603

macrumors 6502a

Suspended

macrumors 6502a

macrumors G3

macrumors newbie

Attachments

macrumors 68020

macrumors demi-god

macrumors P6

macrumors demi-god

macrumors 68020

macrumors 603

macrumors 604

macrumors 68000

Our Staff