
Andropov

macrumors 6502a
May 3, 2012
746
990
Spain
Yep, not optimized yet. In other GPU-based tasks and benchmarks, the M1 Max reaches 3070m and, in some rare cases, even 3080m levels of performance.

GPU rendering has been almost exclusively developed for Nvidia GPUs for the past 5+ years so it’s not surprising that Apple Silicon (and AMD for that matter) are falling behind and need extra development to show their true performance.

If Apple was already getting twice the level of rendering performance before the official Blender 3.1 release then we should be seeing something closer to 3070m performance once the optimizations are built in and especially if the neural network is used for de-noising.

I don't know how much render time is spent de-noising the image, but unless it's a big percentage of the total frame rendering time, I think there are other things that are going to speed it more.

From the Apple engineer's post on the Blender forum, I take it that while it's now running using Metal, there are significant architecture-dependent optimizations still to be made. He mentions that it's already somewhat optimized because it's not copying data back and forth, thanks to the Unified Memory Architecture, but that comes for 'free' with the Metal implementation. There are other (structural) changes that should improve GPU performance further, without needing to offload work to additional hardware.

For example, since the post explicitly mentions that the rendering path is still closely tied to the Nvidia/CUDA model and not optimized for Apple's architecture, maybe they haven't started trying to reorder and merge the rendering passes that only need access to tile memory. The TBDR architecture of Apple's GPUs means that some rendering passes that had to be separate on IMR GPUs can now be merged into a single pass. That greatly reduces pipeline change overhead and provides much faster memory access, especially if the intermediate data fits in 'memoryless' render targets, which would need to be copied to VRAM on IMR GPUs but can reside only in tile memory on TBDR GPUs. That, IMHO, could be a huge performance boost, but it takes a lot of time and deep knowledge of the whole rendering process.
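To put a rough number on the memory-traffic argument, here is a toy back-of-the-envelope model in Python. It is only a sketch under loud assumptions: the 1920x1080 frame, the 16 bytes per pixel, the three passes, and the function name `intermediate_traffic_bytes` are all made up for illustration, and it counts only one store plus one load of a full-size intermediate target per hand-off between passes. It says nothing about how Cycles actually schedules its work.

```python
def intermediate_traffic_bytes(width, height, bytes_per_pixel, passes, merged):
    """Bytes moved to and from device memory for intermediate render targets.

    Toy model: with separate passes, every hand-off between two passes stores
    a full-size target to device memory and loads it back. With merged passes
    the intermediates are assumed to stay in tile memory ('memoryless').
    """
    if merged:
        return 0
    target_size = width * height * bytes_per_pixel
    handoffs = passes - 1
    return 2 * handoffs * target_size  # one store + one load per hand-off


# Hypothetical frame: 1920x1080, 16 bytes per pixel, 3 passes.
separate = intermediate_traffic_bytes(1920, 1080, 16, passes=3, merged=False)
merged = intermediate_traffic_bytes(1920, 1080, 16, passes=3, merged=True)

print(f"separate passes: {separate / 2**20:.1f} MiB of intermediate traffic per frame")
print(f"merged passes:   {merged / 2**20:.1f} MiB of intermediate traffic per frame")
```

The point is only the shape of the saving: traffic for intermediates grows with the number of separate passes and disappears when they can be merged and kept in tile memory.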
 

iPadified

macrumors 68020
Apr 25, 2017
2,014
2,257
For my short renders, de-noising takes 3-5 seconds irrespective of the render time. See post above. Expecting optimised AMD or M1 Metal rendering already is unrealistic. Cycles was built with CUDA in mind, so it's no wonder it performs better on NVIDIA cards. Looking forward to 3.2.

Comparing GPUs is nearly impossible as the performance is so strongly linked to the software implementation.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,675
Comparing GPUs is nearly impossible as the performance is so strongly linked to the software implementation.

Yep. And that, at the end of the day, is why common GPU APIs just don't work. What is the utility of a common API if different vendors require different algorithmic approaches anyway?
 

mi7chy

macrumors G4
Oct 24, 2014
10,622
11,294
5 months isn't really enough, and even the Apple developer admitted that they need to optimize it continuously. Besides, a lot of 3D software has been optimized over several years, you know.

If you know it takes several years then why do you keep asking why it's not optimized? Just wait a few years. Might actually be more than a few years considering render times have gotten slower on M1 over time.

MBA M1 Blender BMW
2m48s 12/2021 3.1 alpha
2m59s 3/2022 3.1 release

Meanwhile, 3060 is still consistently >10x faster than MBA M1, 3x faster than M1 Max 32GPU and expected to be 2x faster than M1 Ultra 64GPU.

70W mobile 3060
16s 12/2021 3.0 release
16s 3/2022 3.1 release
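
For anyone who wants to check the ratios, here is a small Python sketch that turns the timings quoted in this post into seconds and computes the slowdown and the speedup. The helper name `to_seconds` is just for illustration, and the only inputs are the numbers posted above; nothing is measured here.

```python
import re


def to_seconds(stamp: str) -> int:
    """Convert a timing such as '2m48s' or '16s' into whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", stamp)
    if not stamp or match is None:
        raise ValueError(f"unrecognised timing: {stamp!r}")
    minutes, seconds = (int(part) if part else 0 for part in match.groups())
    return minutes * 60 + seconds


mba_alpha = to_seconds("2m48s")    # MBA M1, 3.1 alpha
mba_release = to_seconds("2m59s")  # MBA M1, 3.1 release
rtx3060 = to_seconds("16s")        # 70W mobile 3060, 3.1 release

slowdown_pct = (mba_release - mba_alpha) / mba_alpha * 100
speedup_3060 = mba_release / rtx3060

print(f"MBA M1, alpha to release: {slowdown_pct:.1f}% slower")
print(f"3060 vs MBA M1 on 3.1 release: {speedup_3060:.1f}x faster")
```

On these figures the alpha-to-release regression is about 6.5%, and the 3060 comes out roughly 11.2x faster than the MBA M1.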
 

iPadified

macrumors 68020
Apr 25, 2017
2,014
2,257
I downloaded the BMW scene from blender.org. It doesn't seem that the BMW scene supports Metal, does it? So are you only running on the CPU on your M1, or a non-Metal implementation on the GPU?

In my private test file, the alpha and the 3.1 release gave very similar results using Metal on the M1 Pro. Metal was significantly faster when using both CPU and GPU on the M1 Pro.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
Rendering the BMW on my M1 Max takes about 50 seconds, and the GPU utilisation is about 31%.

Looking forward to them getting that closer to 100%.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
M1 Pro GPU+CPU Metal: 1:24 (silent, of course)

iMac 8-core i7 (10700K) and 5700 GPU
CPU only: loud fan, 2:50
GPU only: silent(!), 0:54
GPU+CPU: loud fan, 0:50

Looks like the 5700 is rather competitive compared to the M1 Pro.
According to Blender benchmarks:
5700 XT - 750
M1 Max (GPU) - 700
M1 Pro (GPU) - 360
M1 (GPU) - 200
M1 Max - 190
M1 Pro - 175
M1 - 115

 

sunny5

macrumors 68000
Jun 11, 2021
1,838
1,706
If you know it takes several years then why do you keep asking why it's not optimized? Just wait a few years. Might actually be more than a few years considering render times have gotten slower on M1 over time.

MBA M1 Blender BMW
2m48s 12/2021 3.1 alpha
2m59s 3/2022 3.1 release

Meanwhile, 3060 is still consistently >10x faster than MBA M1, 3x faster than M1 Max 32GPU and expected to be 2x faster than M1 Ultra 64GPU.

70W mobile 3060
16s 12/2021 3.0 release
16s 3/2022 3.1 release
And you claimed that they already spent 5 months. "so how much longer does it need to be optimized?" Now you are saying that it will take several years? How ironic.
 

iPadified

macrumors 68020
Apr 25, 2017
2,014
2,257
BMW does support GPU since with just CPU it's even slower at 5m58s.
Now it works. I could not load the scene directly by double-clicking; the Metal tab was not visible in that case. Loading it from within Blender worked better.
59s on 5700
2:43 CPU 10700K

At last the GPU is worth something in Blender.
 

JimmyjamesEU

Suspended
Jun 28, 2018
397
426
If you know it takes several years then why do you keep asking why it's not optimized? Just wait a few years. Might actually be more than a few years considering render times have gotten slower on M1 over time.

MBA M1 Blender BMW
2m48s 12/2021 3.1 alpha
2m59s 3/2022 3.1 release

Meanwhile, 3060 is still consistently >10x faster than MBA M1, 3x faster than M1 Max 32GPU and expected to be 2x faster than M1 Ultra 64GPU.

70W mobile 3060
16s 12/2021 3.0 release
16s 3/2022 3.1 release
Wow, sorry to see how little improvement there has been for the 3060 over the course of a year. Worrying times for Nvidia. Truly, x86 and a dGPU separated by an anemic PCI bus is a legacy platform.
 

mi7chy

macrumors G4
Oct 24, 2014
10,622
11,294
Rendering the BMW on my M1 Max takes about 50 seconds

Is that M1 Max 32GPU? Previously recorded result was 43 seconds so if that's the case it's gotten slower from alpha to release. Wonder what changed to drop performance.
 

JimmyjamesEU

Suspended
Jun 28, 2018
397
426
Is that M1 Max 32GPU? Previously recorded result was 43 seconds so if that's the case it's gotten slower from alpha to release. Wonder what changed to drop performance.
Too much development probably. They never have spent 5 months doing it. 3 1/2 is the optimum amount of dev work on something like Blender.
 

sirio76

macrumors 6502a
Mar 28, 2013
578
416
If you know it takes several years then why do you keep asking why it's not optimized? Just wait a few years. Might actually be more than a few years considering render times have gotten slower on M1 over time.

MBA M1 Blender BMW
2m48s 12/2021 3.1 alpha
2m59s 3/2022 3.1 release

Meanwhile, 3060 is still consistently >10x faster than MBA M1, 3x faster than M1 Max 32GPU and expected to be 2x faster than M1 Ultra 64GPU.

70W mobile 3060
16s 12/2021 3.0 release
16s 3/2022 3.1 release
You keep posting this stuff as if Blender were the benchmark and the only renderer available; as a matter of fact, it is the slowest GPU engine and the least optimized for M1. In Redshift, for example, the Ultra should be near a 3080, but silent, with tons more memory, and using a fraction of the power.
 

sunny5

macrumors 68000
Jun 11, 2021
1,838
1,706
You keep posting this stuff as if Blender were the benchmark and the only renderer available; as a matter of fact, it is the slowest GPU engine and the least optimized for M1. In Redshift, for example, the Ultra should be near a 3080, but silent, with tons more memory, and using a fraction of the power.
Do you have any Redshift results with the M1 Max?
 

terminator-jq

macrumors 6502a
Nov 25, 2012
719
1,513
And you claimed that they already spent 5 months. "so how much longer does it need to be optimized?" Now you are saying that it will take several years? How ironic.
You gotta realize Nvidia basically owned the entire GPU rendering market from the start…

So for the last 5+ years, basically all GPU rendering engines were developed exclusively for Nvidia cards. AMD didn’t really push into GPU rendering until the release of their ProRender engine, and even then most GPU engines were still being developed only for CUDA. Having Apple directly help with development should speed things up, but it’s still going to take longer than just 5 months to get results closer to what the M1 Max is really capable of.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
Is that M1 Max 32GPU? Previously recorded result was 43 seconds so if that's the case it's gotten slower from alpha to release. Wonder what changed to drop performance.
Yeah, it's the 32 core. It's to be expected - the first pass is going to be just getting stuff to work, then the next pass is going to be getting things to be correct and have all the expected features, and yes that will probably increase render times (which is where we are now), and then the final pass will be optimizing while still maintaining correctness.

It's the sane and logical thing to do. Nobody wants a fast, incorrect renderer that tries to get more "correct" over time. They want a correct renderer that gets faster over time while maintaining its correctness, and thus usability, from day 1.
 

iPadified

macrumors 68020
Apr 25, 2017
2,014
2,257
That's actually good for a nearly 3 year old GPU. For comparison, $579 AMD RX6800 takes 30s on BMW so 2x faster.
It is surprisingly good for an iMac. If I were doing 3D full time, I would go PC because of the software and NVIDIA. For general-purpose use and the odd 3D visualisation, the M1 Pro and the 2020 iMac are good enough, and the OS is much more pleasant. My workflow is typically CAD (Fusion 360) and visualisation of structures using Blender (Modo's payment structure was too cumbersome, though it was so easy to get nice results with it).

I am happy that Apple takes 3D somewhat seriously. I am surprised they had completely missed this side of creativity and got stuck with video, photography and music. Looking forward to hardware RT in MX at some point.

If Intel/AMD had had cheaper, more core-rich CPUs 10 years ago, NVIDIA would have had more trouble entering the market for compute-intensive tasks.
 

jujoje

macrumors regular
May 17, 2009
247
288
Going to ask a stupid question, so totally expecting a stupid answer, but I gave Blender 3.2 a go on macOS 12.3 with my AMD Vega 64, tried the ALab scene, and discovered two things:

1. USD kinda works in Blender 3.2, but they haven't implemented the shaders (USD shade), so everything rendered as grey shaded. Small steps, but getting there.

2. Performance was laughably bad; I turned on Cycles -> Metal -> GPU, and I can render the scene with full textures and lights faster on CPU in Houdini than I can render what is effectively an AO pass in Cycles.

Is Cycles' Metal support limited to more modern GPUs (the Vega 64 is definitely a bit crusty these days)? There was some talk at one stage about which GPUs were going to be supported, but I haven't found a definitive answer...
 

jujoje

macrumors regular
May 17, 2009
247
288
I am happy that Apple takes 3D somewhat seriously. I am surprised they had completely missed the side of creativity and got stuck with video, photography and music. Looking forward to hardware RT in MX at some point.

It seems a pretty big omission on their part; I guess the current push seems to be hoping to remedy this in time for AR to become a thing (you've got to create the assets somewhere).
 

Pressure

macrumors 603
May 30, 2006
5,180
1,544
Denmark
You keep posting this stuff as if Blender were the benchmark and the only renderer available; as a matter of fact, it is the slowest GPU engine and the least optimized for M1. In Redshift, for example, the Ultra should be near a 3080, but silent, with tons more memory, and using a fraction of the power.
And remember that the default benchmark uses 128x128 blocks, which isn’t optimal for M1.

The developer Panos Zompolos mentions it here.
Please note that the benchmark is using 128x128 blocks which are not ideal to the M1 chips (due to their pretty high latencies). So the benchmark (as it stands today) might not show you particularly good scaling. The same is true for the M1 Pro/Max - but to a lesser extent.

As a stopgap solution I might look at adding a blocksize parameter to the benchmark. Or force it to 256 or larger if apple silicon is detected.
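
To see why the block size matters, here is a quick Python sketch that counts how many blocks a frame is split into at different block sizes. The 1920x1080 frame and the name `block_count` are assumptions for illustration (the benchmark's actual resolution isn't stated in the quote), and this models only the number of blocks, not Redshift's scheduling or the chip's latency.

```python
import math


def block_count(width: int, height: int, block_size: int) -> int:
    """Number of square blocks needed to cover a width x height frame."""
    return math.ceil(width / block_size) * math.ceil(height / block_size)


frame = (1920, 1080)  # hypothetical resolution, not taken from the benchmark

for size in (128, 256, 512):
    print(f"{size}x{size} blocks: {block_count(*frame, size)} per frame")
```

For this assumed frame, going from 128 to 256 drops the count from 135 blocks to 40, so whatever fixed latency is paid per block is paid far fewer times.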

The scores from rendering the Moana Island scene with Redshift are very promising.

2 x 1080 Ti = 77m 00s
2 x 2080 Ti = 34m 17s
1 x 3090 = 21m 45s
2 x 3090 = 12m 44s
M1 Max (64GB) = 28m 27s
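
For a rough sense of scale, this Python sketch converts the Moana times listed above into speeds relative to the M1 Max. The numbers are copied from the list; the names `moana_times` and `total_seconds` are only for illustration, and a single scene in a single renderer is of course not a general ranking.

```python
# Render times from the list above, as (minutes, seconds).
moana_times = {
    "2 x 1080 Ti": (77, 0),
    "2 x 2080 Ti": (34, 17),
    "1 x 3090": (21, 45),
    "2 x 3090": (12, 44),
    "M1 Max (64GB)": (28, 27),
}


def total_seconds(minutes: int, seconds: int) -> int:
    return minutes * 60 + seconds


baseline = total_seconds(*moana_times["M1 Max (64GB)"])

# Values above 1.0 mean the setup finished the scene faster than the M1 Max.
relative_speed = {
    name: baseline / total_seconds(*time) for name, time in moana_times.items()
}

for name, speed in relative_speed.items():
    print(f"{name:14s} {speed:.2f}x the M1 Max's speed")
```

On these times a single 3090 is about 1.31x as fast as the M1 Max, while the pair of 2080 Tis lands at about 0.83x.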
 