There is something in the way the M1 Max and Ultra GPU scaling works that seems to significantly hold back the GPU on those chips. If you look at the Blender Open Data benchmark, you can see that the M2 Max can already match the M1 Ultra in Blender rendering, and that M2 Pro -> M2 Max scales more linearly than M1 Pro -> M1 Max did.

It's not just that, but M2 Pro and Max deliver much better performance per GPU core in Blender. When I ran the Blender benchmark on my M1 Max the GPU utilisation was relatively low. This suggests that there are some stalls preventing the GPU from flexing its muscles. I would guess this has been fixed for M2 Pro/Max in some way (also see my previous post)

P.S. What's interesting is that the base M2 does not show any improvements in Blender beyond what can be explained with clock increases but M2 Pro is already as fast as M1 Max. So it must have to do with the M2 Pro/Max design itself.
 
Benchmarks indicate an improvement for M2 over M1, which is usual between generations. I tend to rely on real daily workloads; when you work with different scenes, or more RAM-dependent scenes, benchmarks may not reflect that correctly. It is a fact that M2 GPU cores are much faster than M1 GPU cores. Also, somehow M1 Ultra GPU cores are slower than M1 Max GPU cores. I guess Max 1.3200 vs. Ultra 900, but I don't remember the exact numbers right now. Anyway, it looks like there are really nice gains for the M2 Ultra over the M1 Ultra. Hope to see real-world numbers soon.
 
Is Apple allowing developers to make their own BLAS/TLAS? That is something the other APIs black-box (as far as I can tell). I wonder if Apple will do/allow dynamic BLAS builds for "infinite LOD" (or to drop LOD models, I guess).
Looks like Apple is splitting things into primitive acceleration structures containing actual geometry and multi-level instance acceleration structures which can contain links to primitive accel structures or even other instance structures (perhaps even recursive?)

Their entire RT API looks awesome at the moment, and maybe they are waiting until the dust settles and the API is mature before they spend money making hardware.
 
They also have dynamic and static accel structures, which I am not sure other APIs even bother with.
 
The Blender benchmark database shows that M2 Pro/Max have substantially improved rendering performance over the M1 series. The M2 Max is as fast as or faster than the M1 Ultra here.

So something has changed, maybe the way they schedule work on GPU cores, or maybe how synchronisation is done. There are some recently published patents that describe a new interconnect design for Apple GPUs as well as a new work distribution system; we might be seeing the effects of that.
If the Octane X benchmark from the WWDC presentation is to be believed, I would go as far as to speculate that the M1 Max/Ultra GPU might have a hardware bug of sorts. Really looking forward to seeing real-world 3D rendering benchmarks on the M2 Ultra, because something always seemed a bit off with the M1 Ultra. Time will soon tell.
 
I think they’re the same thing; they just recommend separating the dynamic parts of your scene into their own branch so you only rebuild the dynamic part of the accel structure per frame.
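
To make the "separate dynamic branch" idea concrete, here is a toy Python model of it. This is only a sketch: `PrimitiveAS`, `InstanceAS` and the `builds` counter are hypothetical names made up for illustration, not Metal API types, and "build" here just counts how often each node would be rebuilt.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class PrimitiveAS:
    """Hypothetical leaf structure holding actual geometry (just a triangle count here)."""
    name: str
    triangles: int
    builds: int = 0

    def build(self):
        self.builds += 1


@dataclass
class InstanceAS:
    """Hypothetical node that links to primitive structures or other instance structures."""
    name: str
    children: List[Union["InstanceAS", PrimitiveAS]] = field(default_factory=list)
    builds: int = 0

    def build(self, recursive=True):
        # A recursive build also rebuilds everything underneath this node.
        if recursive:
            for child in self.children:
                child.build()
        self.builds += 1


static_geo = PrimitiveAS("terrain", 1_000_000)
dynamic_geo = PrimitiveAS("character", 50_000)
static_branch = InstanceAS("static", [static_geo])
dynamic_branch = InstanceAS("dynamic", [dynamic_geo])
scene = InstanceAS("scene", [static_branch, dynamic_branch])

scene.build()  # initial build of the whole hierarchy
for frame in range(3):
    dynamic_branch.build()        # only the dynamic branch is rebuilt each frame
    scene.build(recursive=False)  # top level is refreshed without touching static parts

print(static_geo.builds, dynamic_geo.builds, scene.builds)
```

In this model the large static geometry is built once, while the small dynamic branch and the top-level node are rebuilt every frame, which is the saving the recommendation is after.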
 
To add to my previous post, I think the Fabric interconnect subsystem was either not working optimally due to a hardware bug or there was some inherent hardware limitation in the first generation that Apple has fixed now.
 
That’s my impression as well. Also, the improved interconnect seems to be available only on the M2 Pro/Max (which strictly speaking share the same floorplan anyway).
 
I just realised how useless Geekbench Compute is for comparing the Metal GPU scores of Apple's SoCs. It doesn't indicate how many GPU cores the SoC has.


Screenshot 2023-06-10 at 10.43.12.png


If this is accurate.... WOW.
 
How can the M2 Ultra already be on the chart, given that a search for "M2 ultra" only returns one Metal test result?
That's a very good question.

However it does look like that score is 220k, and the one at the top of this chart is 280k.

220k / 280k = ~0.79
60 cores / 76 cores = ~0.79

So I guess it's plausible that the 220k score is for the 60 core Ultra and 280k is for the 76 core.

Also if that's true then the scaling is excellent.
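
For anyone who wants to check the arithmetic above, a quick Python sketch. The scores are the approximate values read off the chart, and the mapping of 220k to the 60-core part and 280k to the 76-core part is the guess from this post, not a confirmed fact.

```python
# Approximate figures read off the Geekbench chart, as quoted above.
score_low, score_high = 220_000, 280_000
cores_low, cores_high = 60, 76  # assumed mapping of scores to GPU core counts

score_ratio = score_low / score_high
core_ratio = cores_low / cores_high

# Score per core for each configuration; near-equal values imply near-linear scaling.
per_core_low = score_low / cores_low
per_core_high = score_high / cores_high

print(f"score ratio: {score_ratio:.3f}")
print(f"core ratio:  {core_ratio:.3f}")
print(f"per-core:    {per_core_low:.0f} vs {per_core_high:.0f}")
```

The two ratios agree to within about half a percent, which is why the 60-core/76-core reading looks plausible.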
 
It's believable in the sense that the M2 Max gets around half that.

Apple definitely fixed whatever scaling issues they had with the M1 Ultra. Also, the M2 Max 38-core GPU rivaling the M1 Ultra 48-core GPU says a lot.

The M2 Max with 30-core GPU gets around 120K, so that validates the M2 Ultra 60-core scores somewhat.
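
As a rough check on that, here is a small Python sketch that extrapolates from the 30-core M2 Max figure. The 120K, 220K and 280K values are the approximate numbers mentioned in this thread, and `scaling_efficiency` is just a helper name made up for the example.

```python
def scaling_efficiency(base_score, base_cores, big_score, big_cores):
    """Observed score of the bigger GPU divided by the score linear scaling would predict."""
    expected = base_score * big_cores / base_cores
    return big_score / expected


# M2 Max 30-core (~120K) as the baseline; Ultra scores as read off the chart.
eff_60 = scaling_efficiency(120_000, 30, 220_000, 60)
eff_76 = scaling_efficiency(120_000, 30, 280_000, 76)

print(f"60-core Ultra: {eff_60:.1%} of linear scaling")
print(f"76-core Ultra: {eff_76:.1%} of linear scaling")
```

With these rough inputs both Ultra configurations land a little above 90% of perfectly linear scaling from the M2 Max.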
 
*runs off and configures a Mac Studio M2 Ultra with 76 cores*

*sees the price and cries*
 
Could be faked for the purpose of "first post"... I'd wait until next week for more samples to show up.
I assumed the same, but the chart shown is supposed to be the average of at least 5 different tests. I think it would be much harder to cheat. It is weird, however.
 
And even if they were fake results, they should be returned by the search. What results is the chart based on?
 
I believe if you purchase a copy of GB, you have the option to keep scores secret. It would be the first time in history reviewers have bought a copy, however!
 