Another test: Blender 3.2 alpha, calculating a fluid bake. M1 Pro: 2:49 min. 8-core iMac 2020: 4:56 min. I like the M1 Pro.
 
This video is a good comparison of the M1 Ultra 64-core vs 48-core...

It made me decide to cancel my 64-core order, as the gains from the 48-core to the 64-core don't seem to be worth it.

Gonna spend that extra £1000 on software instead, as that will improve my work more than slightly quicker render times.
 
This video is a good comparison of the M1 Ultra 64-core vs 48-core...

It made me decide to cancel my 64-core order, as the gains from the 48-core to the 64-core don't seem to be worth it.

Gonna spend that extra £1000 on software instead, as that will improve my work more than slightly quicker render times.
It's really disappointing to see them throttling GPU speed while its thermal design has plenty of room. This is not a mobile device, Apple.
 
It's really disappointing to see them throttling GPU speed while its thermal design has plenty of room.
We don't know if Apple is throttling GPU speed. It could be some hardware or software issue that prevents the GPU cores from being saturated.
 
Sorry for the slight tangent on the topic at hand, but have you guys seen Lentil for Arnold?
It's a very interesting camera toolkit for Arnold Render.
It works by "taking each sample and bidirectionally re-distributing its energy over the image plane. Resulting in smooth bokeh and motion blur even at low AA samples." It can also simulate light paths through physical, real-world lenses, reproducing their unique bokeh signatures.

Available for macOS and.... it's free!

https://www.lentil.xyz/index.html

 
I doubt this'll be the end of the story. I suspect Apple will release an update that'll improve speeds. Something just ain't right here.
Everything is right. There might be some headroom, sure, but not as much as YouTubers think. Turning up the heat doesn't mean a linear performance gain. And there's the question of component design quality; maybe the components can't handle that much heat.
 
Everything is right. There might be some headroom, sure, but not as much as YouTubers think. Turning up the heat doesn't mean a linear performance gain. And there's the question of component design quality; maybe the components can't handle that much heat.

The heat is negligible compared to what's standard in the industry. It's about getting the performance one expects from the hardware. Apple Silicon's design is about horizontal, i.e. linear, scaling after all.

My personal suspicion is that the hardware is simply too powerful and the software used to test it so far is unable to utilize it properly. Of course, there might also be some bugs on Apple's side.
 
My personal suspicion is that the hardware is simply too powerful and the software used to test it so far is unable to utilize it properly. Of course, there might also be some bugs on Apple's side.
I'm not sure what "too powerful" means.
There has to be something wrong on the Apple side since other GPUs scale better (e.g., Geekbench, GFXBench at 1080p...).
 
My personal suspicion is that the hardware is simply too powerful and the software used to test it so far is unable to utilize it properly. Of course, there might also be some bugs on Apple's side.
Apple could end this uncertainty by releasing the software it used for the benchmarks. But I'm afraid the Ultra drivers are starting to look like Universal Control: not ready on day one.
 
The heat is negligible compared to what's standard in the industry. It's about getting the performance one expects from the hardware. Apple Silicon's design is about horizontal, i.e. linear, scaling after all.

My personal suspicion is that the hardware is simply too powerful and the software used to test it so far is unable to utilize it properly. Of course, there might also be some bugs on Apple's side.
My suspicion is that Apple focused on memory-heavy workloads to force the 3090 to swap from system memory.

I have no doubt that in specific workloads the Ultra can match the 3090, but when so much more software is optimized for OptiX and CUDA than for Metal, Apple's claims just look like hogwash.
 
I'm not sure what "too powerful" means.
There has to be something wrong on the Apple side since other GPUs scale better (e.g., Geekbench, GFXBench at 1080p...).

The Geekbench scaling problem was already explained — the workload is very short, and the Apple GPU does not enter its high-energy mode when doing such light work. The motivation for this was also already discussed: Apple needs to be very energy efficient when doing low-key GPU work and performant when doing demanding work. Their solution is to manage the GPU performance level based on the amount of outstanding work. If a task can be done in under 5-10 ms, it is considered low priority and the GPU stays in a lower performance mode. This is reasonable, since these times are below what a human user can perceive. If a task needs longer to run, the GPU will enter the high-performance mode. This strategy will likely cover most usage scenarios correctly, with the notable exception of real-time systems that do short GPU processing on regularly incoming data but with significant (10 ms+) pauses between the data packages. BTW, very similar strategies are used for CPUs.
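
A toy sketch of that heuristic in Python (purely illustrative; the 10 ms threshold and the mode names are assumptions taken from the description above, not Apple's actual implementation):

Code:
# Toy model of the duration-based GPU performance heuristic described above.
# Threshold and mode names are assumptions, not Apple's code.
LOW_PRIORITY_THRESHOLD_MS = 10.0  # work shorter than this is imperceptible to the user

def choose_gpu_mode(outstanding_work_ms: float) -> str:
    """Pick a performance mode from the estimated duration of outstanding GPU work."""
    if outstanding_work_ms < LOW_PRIORITY_THRESHOLD_MS:
        # Short bursts finish before a human could notice, so stay energy efficient.
        return "low-power"
    # Sustained work (a render, a long benchmark run) ramps the GPU up.
    return "high-performance"

print(choose_gpu_mode(5.0))    # Geekbench-style micro-workload -> low-power
print(choose_gpu_mode(500.0))  # sustained render -> high-performance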

I suppose this could be "fixed" by introducing priorities for GPU work queues, with "high priority" queues telling the system that it needs maximal performance from the start, but quite honestly, I am not sure Apple would be willing to introduce such functionality, because from a rational point of view nothing is broken. If you are submitting a single GPU work package, it doesn't really matter whether it's done in 10 ms or 1 ms; you won't notice the difference anyway. Not so with 10 s versus 1 s. In the end, the only software that "suffers" from this behaviour is benchmarks...

Regarding GFXBench... this is much trickier. I am not sure one can discuss these benchmarks in any meaningful terms, simply because the tests are woefully inadequate for the hardware being tested. All of these GPUs produce hundreds of frames per second, which is far more than any real-world software or game will need or do (you are limited by the screen refresh rate after all). The difference between 500 fps and 1000 fps is just 1 ms per frame, which can very well come down to details of the internal system implementation.

Frankly, I am surprised that I have not seen anyone using modern demanding games to test the GPU performance. There are many options, and yet everyone is still using the ancient Rise of the Tomb Raider for some reason... Metro Exodus, BG3, Total War: Three Kingdoms — these are all demanding games that have fairly well optimised macOS versions and could be used for reasonable cross-platform comparison.
 
Frankly, I am surprised that I have not seen anyone using modern demanding games to test the GPU performance. There are many options, and yet everyone is still using the ancient Rise of the Tomb Raider for some reason... Metro Exodus, BG3, Total War: Three Kingdoms —
Here's Metro Exodus:
I don't think it performs that well on the M1 Ultra (it can go below 60 fps at 1080p). But this is a Vulkan game running through Rosetta, so perhaps not the best demonstration of the GPU's capabilities. At least the Tomb Raider games are written in Metal.
As for BG3, it's still in Beta, right? I don't expect many people to use it as a benchmark tool.
So that leaves us with a single demanding Mac game (though not that graphically impressive, IMO) that is native to Apple Silicon, which is that Total War game. There is no other game, AFAIK. This is ridiculous.
 
Here's Metro Exodus:
I don't think it performs that well on the M1 Ultra (it can go below 60 fps at 1080p). But this is a Vulkan game running through Rosetta, so perhaps not the best demonstration of the GPU's capabilities. At least the Tomb Raider games are written in Metal.
As for BG3, it's still in Beta, right? I don't expect many people to use it as a benchmark tool.
So that leaves us with a single demanding Mac game (though not that graphically impressive, IMO) that is native to Apple Silicon, which is that Total War game. There is no other game, AFAIK. This is ridiculous.

Total War: Three Kingdoms is not M1 native, but it uses Metal. Total War: Warhammer 3 (not yet released for Mac) will be M1 native, though. I think Rosetta-based games are fair game (pun intended) for benchmarking; we have what we have, so why shouldn't we use it? But yeah, currently the only available native game is BG3.
 
Ha, I confused it with Total War: Rome Remastered.
Feral seems to be focusing on console games, so I don't expect many AAA games for Apple Silicon in the foreseeable future. Apparently, no one cares. Apple certainly doesn't.

As for BG3:
But how do you compare performance with PC GPUs? The ARM version of the game can't show the frame counter, and there isn't even an integrated benchmark tool in this game, AFAIK.
 
The comparisons I would like to see are complex, demanding 3D scenes, say something that takes the Ultra a couple of hours to render.
The closest benchmark could be the rendering of the Moana Island scene.

The scores from rendering the Moana Island scene with Redshift are very promising.

2 x 1080 Ti = 77m
2 x 2080 Ti = 34m:17s
1 x 3090 = 21m:45s
2 x 3090 = 12m:44s
M1 Max (64GB) = 28m:27s
 
Some numbers, because everyone likes numbers. This is a test render of the ALab scene on the base Mac Studio (10-core CPU, 24-core GPU, 32 GB RAM) vs an iMac Pro (10-core, 64 GB), rendered in Houdini/Karma (CPU):

6m 55s Mac Studio (Native)
7m 25s iMac Pro
11m 47s Mac Studio (Rosetta)

So there's a significant improvement in render times for Apple Silicon. The native result is surprisingly close to the iMac Pro; going by Geekbench it should be around 1.3x faster, so there's probably some optimisation still to come.

That said, I'm returning the Mac Studio. While performance is where I thought it would be (far better than the iMac for sims, and much more responsive), the GPU is really slow, unusably so (I think it was around 5 fps in the ALab scene vs 30+ on the iMac). Upgrading to the Ultra with the 48-core GPU would presumably give me a whole 10 fps :/
 
The closest benchmark could be the rendering of the Moana Island scene.
Still not as fast as the 3090, contrary to Apple's claim.
The results shown during the keynote will remain a mystery.
 
Still not as fast as the 3090, contrary to Apple's claim.
Did Apple really state that it was as fast as or faster than the 3090, or rather that at a relative performance of 200, the M1 Ultra uses about 115 watts and the 3090 about 330 watts? The comparative claims are probably more about energy efficiency than raw performance.

That said, in certain tests it was shown that it really can keep up with the 3090 and is even a smidge faster, which is impressive given that the Mac Studio is small while the 3090 is huge and power-hungry.
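
Just to put the chart figures quoted above in perspective, a rough back-of-the-envelope calculation (using only the post's numbers, not independent measurements):

Code:
# Rough performance-per-watt comparison using the figures quoted above:
# relative performance ~200 at ~115 W (M1 Ultra) vs ~330 W (RTX 3090).
relative_perf = 200
ultra_watts, rtx3090_watts = 115, 330

ultra_ppw = relative_perf / ultra_watts      # ~1.74 perf/W
rtx3090_ppw = relative_perf / rtx3090_watts  # ~0.61 perf/W
print(f"M1 Ultra : {ultra_ppw:.2f} perf/W")
print(f"RTX 3090 : {rtx3090_ppw:.2f} perf/W")
print(f"Ratio    : {ultra_ppw / rtx3090_ppw:.1f}x")  # ~2.9x more efficient at that point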
 
Sorry for the slight tangent on the topic at hand, but have you guys seen Lentil for Arnold?
It's a very interesting camera toolkit for Arnold Render.
It works by "taking each sample and bidirectionally re-distributing its energy over the image plane. Resulting in smooth bokeh and motion blur even at low AA samples." It can also simulate light paths through physical, real-world lenses, reproducing their unique bokeh signatures.

Available for macOS and.... it's free!

https://www.lentil.xyz/index.html

This is interesting. Thanks for sharing. With the M1 Ultra being good at CPU rendering and not so great at GPU rendering, maybe looking into Arnold rendering is a good idea?
 
The comparisons I would like to see are complex, demanding 3D scenes, say something that takes the Ultra a couple of hours to render.

Ain't no one got time for that. But yeah, I think most benchmark scenes are under 30 min; I guess it'd be useful for checking sustained performance and making sure everything spins up.

In theory you could trivially set up a benchmark from the ALab scene; it's USD-based and supports usd_shade, so any renderer that supports USD should be able to open it and render it. Plus, it has around 14 GB of textures, which would make a reasonable test.
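
For anyone curious, a minimal sketch of what opening the scene looks like with the USD Python bindings (assuming the usd-core pip package and a locally downloaded copy of the ALab scene; the file path below is a placeholder):

Code:
# Quick sanity check on the ALab USD scene before using it as a benchmark.
# Assumes `pip install usd-core` and a local copy of the scene; the path is a placeholder.
from pxr import Usd

stage = Usd.Stage.Open("/path/to/alab/entry.usda")  # placeholder path

mesh_count = shader_count = 0
for prim in stage.Traverse():
    type_name = prim.GetTypeName()
    if type_name == "Mesh":
        mesh_count += 1
    elif type_name == "Shader":  # the usd_shade material networks show up as Shader prims
        shader_count += 1

print(f"meshes: {mesh_count}, shaders: {shader_count}")
# Any renderer with USD/Hydra support should be able to open this same stage,
# which is what makes it attractive as a cross-renderer benchmark.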

At the moment, USD support in DCC apps is a bit all over the shop (Blender's is pretty bad, Maya's is getting there, not sure about the others).

Is Nvidia's Path Tracing relevant or a marketing gimmick?

From a cursory look, it was kinda cool, but it seems to rely heavily on denoising the hell out of everything. Isn't it real-time because the number of samples it uses is so coarse (much like the early ray-tracing hype from Nvidia)? Not sure if I'm missing why this is such a revolutionary thing...

Should probably actually watch the presentation before commenting more though :)
 