
diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
There are two different wattage variants of the 3080 Laptop GPU in the results: ~80W and ~155W.
If you sort by GPU instead of by device, the extras go away and you get the "best score" for each card from its "best API". Which is odd, because the 6900 XT tops the list when sorted by device, but the 3090 tops it when sorted by GPU.
 

crazy dave

macrumors 65816
Sep 9, 2010
1,453
1,229
The layer likely translates OpenCL kernels directly into Metal IR, and the OpenCL API is trivially converted into Metal API calls. It should be fairly efficient overall. I have no idea how much overhead there is because of API mismatches…

Yeah that’s what I figured
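For illustration, here is a minimal sketch of what that kind of 1:1 mapping could look like for a trivial vector-add kernel. This is not the actual translation layer's code, just an assumption of how OpenCL C and its enqueue calls might map onto MSL and the Metal compute API:

import Metal

// Hypothetical Metal-side equivalent of a trivial OpenCL vector-add kernel.
// OpenCL C and MSL are close enough that simple kernels translate almost 1:1.
let source = """
#include <metal_stdlib>
using namespace metal;
kernel void vadd(device const float *a [[buffer(0)]],
                 device const float *b [[buffer(1)]],
                 device float *c       [[buffer(2)]],
                 uint gid [[thread_position_in_grid]]) {
    c[gid] = a[gid] + b[gid];
}
"""

let device = MTLCreateSystemDefaultDevice()!
let library = try! device.makeLibrary(source: source, options: nil)
let pipeline = try! device.makeComputePipelineState(function: library.makeFunction(name: "vadd")!)
let queue = device.makeCommandQueue()!

// Buffers standing in for clCreateBuffer allocations (1024 floats each).
let n = 1024
let bufA = device.makeBuffer(length: n * MemoryLayout<Float>.stride)!
let bufB = device.makeBuffer(length: n * MemoryLayout<Float>.stride)!
let bufC = device.makeBuffer(length: n * MemoryLayout<Float>.stride)!

// Roughly what a clEnqueueNDRangeKernel call would become on the Metal side.
let cmd = queue.makeCommandBuffer()!
let enc = cmd.makeComputeCommandEncoder()!
enc.setComputePipelineState(pipeline)
enc.setBuffer(bufA, offset: 0, index: 0)
enc.setBuffer(bufB, offset: 0, index: 1)
enc.setBuffer(bufC, offset: 0, index: 2)
enc.dispatchThreads(MTLSize(width: n, height: 1, depth: 1),
                    threadsPerThreadgroup: MTLSize(width: 64, height: 1, depth: 1))
enc.endEncoding()
cmd.commit()
cmd.waitUntilCompleted()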
 

OriginalBaki

macrumors member
Oct 12, 2021
65
66
I worked as an engine/graphics programmer in the games industry for 8 years, and the main reason for it is that 99% of devs are in it for passion, not money.

I was earning half what I could make outside of games, and the main reason I put up with it so long was that I could work on cutting edge, interesting stuff and get paid to do it. Lots of people in the industry work on their own projects for free too just because they love doing it. I guarantee that almost every game that currently supports ray tracing started with the programmers begging the business people to let them do it and not the other way around.

It's the same with the artists. They want to make mind-blowing high quality art because they love making mind-blowing high quality art. If you asked them to spend all day quickly knocking up ugly low resolution art most of them would quit.
Also, the main money makers are console games, and all the new consoles support ray tracing; the PC games are ports.
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
Okay, I am not sure how I feel about this. I ran a quick test on my 6900 XT and got vastly different scores depending on the API chosen.
(attached screenshot: 6900xt.jpg)
I wouldn't say the 3080 Laptop scores are suspect, but they could be missing results (if you sort, it appears the Vulkan and DX12 tests were not run).
 

vladi

macrumors 65816
Jan 30, 2010
1,008
617
That sounds like a Fusion problem that can be fixed.

That's how Fusion has behaved since the days of eyeon, the original developer. It needs as much RAM as you can give it if you want real-time or near-real-time playback. Of course it all depends on the complexity of the comp, but add some particles, some 3D rendered to 2D, and post processes such as motion blur, aberration, and shake, and my personal PC with 128GB of RAM has over 75% of it eaten up by the app. More and more tools are GPU accelerated now, and the app defaults to GPU even though it says Auto in the options. Maybe hard-switching tools to CPU in Preferences will help the RAM issue.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,679
Okay, I am not sure how I feel about this. I ran a quick test on my 6900 XT and got vastly different scores depending on the API chosen.
I wouldn't say the 3080 Laptop scores are suspect, but they could be missing results (if you sort, it appears the Vulkan and DX12 tests were not run).

API overhead and implementation details have a non-trivial impact at these high framerates. That's also why GFXBench is not very good: it's simply not demanding enough. But it can be used to approximate things to a limited degree. It's best to compare the best score to the best score.
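For what it's worth, "best score to best score" is straightforward to express: group results by device and take the maximum across whatever APIs actually ran. A throwaway sketch with made-up numbers (this is not the GFXBench browser's actual logic):

// Hypothetical benchmark entries; devices and scores are placeholders, not real results.
struct Entry { let device: String; let api: String; let fps: Double }

let entries = [
    Entry(device: "6900 XT",         api: "OpenGL", fps: 400),
    Entry(device: "6900 XT",         api: "Vulkan", fps: 0),    // failed / not run
    Entry(device: "RTX 3080 Laptop", api: "DX12",   fps: 380),
    Entry(device: "RTX 3080 Laptop", api: "OpenGL", fps: 300),
]

// "Best score by best API": one number per card, the max over the APIs that ran.
let bestPerDevice = Dictionary(grouping: entries, by: { $0.device })
    .mapValues { $0.map(\.fps).max() ?? 0 }

for (device, fps) in bestPerDevice.sorted(by: { $0.value > $1.value }) {
    print(device, fps)
}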
 
  • Like
Reactions: diamond.g

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
API overhead and implementation details have a non-trivial impact at these high framerates. That's also why GFXBench is not very good: it's simply not demanding enough. But it can be used to approximate things to a limited degree. It's best to compare the best score to the best score.
Which is also ridiculous, as there is a vanilla M1 with an 800 FPS score.
 

crazy dave

macrumors 65816
Sep 9, 2010
1,453
1,229
Okay, I am not sure how I feel about this. I ran a quick test on my 6900 XT and got vastly different scores depending on the API chosen.
I wouldn't say the 3080 Laptop scores are suspect, but they could be missing results (if you sort, it appears the Vulkan and DX12 tests were not run).

I'm not sure I understand the last comment about the tests "appearing not to have been run", but the first part is expected.

GPU benchmarks like these test: how optimally the benchmark was coded in the API; how optimally the driver for the API was written for the hardware; and the underlying hardware.

Creating tests for just the last part is almost impossible. So you just accept that there's going to be more variation across different tests and more variables underlying the result. And, ultimately, when you are using the GPU all three are indeed what you care about: how well the program was coded for the API, how well the API runs on your GPU, and how good the hardware is at that specific task.
 

JimmyjamesEU

Suspended
Jun 28, 2018
397
426
Returning to Geekbench Metal scores for a moment: the A14 gets just under 9000 with its 4-core GPU, and the M1 gets 21000 with 8 cores. That seems like great scaling. Why wouldn't the Max get around 80000? What am I missing?
 
Last edited:
  • Like
Reactions: zoltm

ElfinHilon

macrumors regular
May 18, 2012
142
48
Returning to Geekbench Metal scores for a moment: the A14 gets just under 9000 with its 4-core GPU, and the M1 gets 21000 with 8 cores. That seems like great scaling. Why wouldn't the Max get around 80000? What am I missing?
See my post directly above this. I genuinely suspect we are looking at the 24-core GPU option and Geekbench is reporting it incorrectly.
 

ElfinHilon

macrumors regular
May 18, 2012
142
48
That does make sense to me.
Even further, we know that the 5600M is supposed to be about on par with the 16-core GPU. I guess we will have to wait and see what happens with the benchmarks. This is only one benchmark, so I wouldn't put too much weight in it, but the more I look into the figures directly provided by Apple, the more I think this is the 24-core GPU just being mislabeled in Geekbench.

EDIT: Alternatively, we could actually be looking at the 32-core GPU, and it just doesn't scale well for the M1 Max due to the deprecated graphics API being used here. That's also entirely possible. However, if we are indeed looking at the 24-core here, oh boy, this is gonna be a wild ride.
 
Last edited:

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
I'm not sure I understand the last comment about the tests "appearing not to have been run", but the first part is expected.

GPU benchmarks like these test: how optimally the benchmark was coded in the API; how optimally the driver for the API was written for the hardware; and the underlying hardware.

Creating tests for just the last part is almost impossible. So you just accept that there's going to be more variation across different tests and more variables underlying the result. And, ultimately, when you are using the GPU all three are indeed what you care about: how well the program was coded for the API, how well the API runs on your GPU, and how good the hardware is at that specific task.
I say that because on the score browser the 3080 Laptop GPU is missing Vulkan/DX12 scores, just like the 6900 XT is. They show up as failed/not supported, which isn't true (clearly I ran the test).
 
  • Like
Reactions: crazy dave

leman

macrumors Core
Oct 14, 2008
19,521
19,679
Returning to Geekbench Metal scores for a moment: the A14 gets just under 9000 with its 4-core GPU, and the M1 gets 21000 with 8 cores. That seems like great scaling. Why wouldn't the Max get around 80000? What am I missing?

The A14 only does half-rate FP32; the M1 has full-rate FP32 enabled.
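Rough back-of-the-envelope numbers for what full vs. half FP32 rate means in peak throughput. The ALU count, the clock, and the assumption that the A14 GPU runs at the same clock as the M1 are approximate, publicly reported figures, not official specs:

// Peak FP32 throughput ≈ cores × ALUs per core × 2 ops (FMA) × clock × FP32 rate.
// All numbers are approximate/publicly reported, not official Apple specs.
let alusPerCore = 128.0
let clockGHz = 1.278

func tflops(cores: Double, fp32Rate: Double) -> Double {
    cores * alusPerCore * 2.0 * clockGHz * fp32Rate / 1000.0
}

print(tflops(cores: 4,  fp32Rate: 0.5))   // A14 at half-rate FP32: ~0.65 TFLOPS
print(tflops(cores: 8,  fp32Rate: 1.0))   // M1:                    ~2.6 TFLOPS
print(tflops(cores: 32, fp32Rate: 1.0))   // M1 Max (32-core):      ~10.4 TFLOPS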
 
  • Like
Reactions: crazy dave

JimmyjamesEU

Suspended
Jun 28, 2018
397
426
Even further, we know that the 5600M is supposed to be about on par with the 16-core GPU. I guess we will have to wait and see what happens with the benchmarks. This is only one benchmark, so I wouldn't put too much weight in it, but the more I look into the figures directly provided by Apple, the more I think this is the 24-core GPU just being mislabeled in Geekbench.

EDIT: Alternatively, we could actually be looking at the 32-core GPU, and it just doesn't scale well for the M1 Max due to the deprecated graphics API being used here. That's also entirely possible. However, if we are indeed looking at the 24-core here, oh boy, this is gonna be a wild ride.
It's curious. The M1 score is 19000; divided by 8, that's roughly 2400 per core. The 16-core yields 38000, again around 2400 per core. One would think the 32-core would score around 76000 in OpenCL, so 60000 must be the 24-core.
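Spelled out, the linear extrapolation from the M1's per-core OpenCL score looks like this (scores are the approximate Geekbench figures quoted in this thread, and the assumption of perfectly linear scaling with core count is just that, an assumption the real results clearly don't follow):

// Naive linear extrapolation from the M1's OpenCL score.
let m1Score = 19_000.0
let perCore = m1Score / 8.0                     // ≈ 2,375 per GPU core

for cores in [16, 24, 32] {
    print("\(cores)-core expected: \(Int(perCore * Double(cores)))")
}
// 16-core ≈ 38,000 (matches), 24-core ≈ 57,000, 32-core ≈ 76,000 —
// which is why a ~60,000 result looks more like a 24-core part than a 32-core one.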
 

jeanlain

macrumors 68020
Mar 14, 2009
2,463
958
See my post directly above this. I genuinely suspect we are looking at the 24-core GPU option and Geekbench is reporting it incorrectly.
That may be the case if the M1 Max were about 1.5x faster than the M1 Pro across the various subtests. But the M1 Max is twice as fast in some, and not even faster in others.
The plot thickens.

EDIT: Some have suggested that the M1 Max could be hitting a power/thermal limit in that test, if it was run on battery or if some macOS setting limits power/fan speed.
 
Last edited:

JimmyjamesEU

Suspended
Jun 28, 2018
397
426
Even further, we know that the 5600M is supposed to be about on par with the 16-core GPU. I guess we will have to wait and see what happens with the benchmarks. This is only one benchmark, so I wouldn't put too much weight in it, but the more I look into the figures directly provided by Apple, the more I think this is the 24-core GPU just being mislabeled in Geekbench.

EDIT: Alternatively, we could actually be looking at the 32-core GPU, and it just doesn't scale well for the M1 Max due to the deprecated graphics API being used here. That's also entirely possible. However, if we are indeed looking at the 24-core here, oh boy, this is gonna be a wild ride.
Why would it scale from 8 to 16 but not to 32?
 

ElfinHilon

macrumors regular
May 18, 2012
142
48
It's curious. The M1 score is 19000; divided by 8, that's roughly 2400 per core. The 16-core yields 38000, again around 2400 per core. One would think the 32-core would score around 76000 in OpenCL, so 60000 must be the 24-core.
Yeah, that's what I am personally leaning towards. I'm confused as to why the 32-core would see such terrible scaling relative to the rest.
That may be the case if the M1 Max were about 1.5x faster than the M1 Pro across the various subtests. But the M1 Max is twice as fast in some, and not even faster in others.
The plot thickens.
Yep, good point. I'm honestly thinking we are looking at the 24-core here. It's possible this is also just extremely poor scaling for OpenCL or something.
Why would it scale from 8 to 16 but not to 32?
My guess would be something to do with the different core complexes, or something along those lines. I'm not an engineer, so I'm just making educated guesses here. All I know is that OpenCL doesn't do too well on Macs.
 