
Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
I wonder if Apple will continue optimizing for M1 SoCs since the M2 is out. Will M2 optimizations work on M1 units?
I believe that all SoCs, not just those in the M2 family, will take advantage of the optimizations. As an example, this optimization shows the potential of MetalRT:
This patch optimises subsurface intersection queries on MetalRT. [...] On M1 Max, with MetalRT selected, this can give a render speedup of 15-20% for scenes like Monster which make heavy use of subsurface scattering.

By the way, it looks like Apple has gone back to Cycles optimizations after finishing the Metal backend for viewport.
 

l0stl0rd

macrumors 6502
Jul 25, 2009
483
420
I believe that all SoCs, not just those in the M2 family, will take advantage of the optimizations. As an example, this optimization shows the potential of MetalRT:


By the way, it looks like Apple has gone back to Cycles optimizations after finishing the Metal backend for viewport.
The Metal viewport is actually still not complete, but it is usable.

All patches so far make no distinction between M1 and M2, except this one: https://developer.blender.org/rB08b3426df9e5b5dd3c7cc042197bea3ea2398e75
 

vinegarshots

macrumors 6502a
Sep 24, 2018
982
1,349
Looks like a good speed bump for M2 Max in Blender:


So about 5-6X slower than an Nvidia 4090 now. Not actually that horrible if you just work on still-image renders. Maybe if they can jam 5 M2 Max chips inside the new Mac Pro (or something like 200 cores), they might be able to match performance with a 4090. 😀
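A rough way to sanity-check that "5 chips / ~200 cores" guess, assuming the 5-6x gap holds and multi-chip scaling were perfectly linear (a loud assumption; real scaling is sublinear):

```python
import math

# Hypothetical relative render throughput, normalized to one M2 Max.
# The 5.5x figure is just the midpoint of the "5-6x slower" estimate
# above, not a measured benchmark.
m2_max = 1.0
rtx_4090 = 5.5

# Chips needed under ideal linear scaling.
chips_needed = math.ceil(rtx_4090 / m2_max)
print(chips_needed)  # 6

# Equivalent GPU-core count at 38 cores per full M2 Max.
print(chips_needed * 38)  # 228
```

So the "5 chips or ~200 cores" figure is in the right ballpark even before accounting for scaling losses.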
 
  • Like
Reactions: innerproduct

innerproduct

macrumors regular
Jun 21, 2021
222
353
So about 5-6X slower than an Nvidia 4090 now. Not actually that horrible if you just work on still-image renders. Maybe if they can jam 5 M2 Max chips inside the new Mac Pro (or something like 200 cores), they might be able to match performance with a 4090. 😀
Yes, this is nice for a semi-slim laptop. But I agree, we need something with about 512 Apple GPU cores to compete with what's on PC. A 7950X consumer-grade rig with a good PSU can host two 4090s, and a big Threadripper machine might host up to 4 dual-slot blower cards (at a cost). Now, Apple maybe doesn't have to be that extreme to be usable, but to me it seems the new MP needs that quad M2 Max at least.
 
  • Like
Reactions: Lone Deranger

aytan

macrumors regular
Dec 20, 2022
161
110
I believe that all SoCs, not just those in the M2 family, will take advantage of the optimizations. As an example, this optimization shows the potential of MetalRT:


By the way, it looks like Apple has gone back to Cycles optimizations after finishing the Metal backend for viewport.
As expected, they will go for Cycles optimizations. On the other hand, I believe (looks like it is only me) that Blender is not mature enough for long-term projects right now. Apple has spent a long time building up the whole system in a better way; with the Metal UI optimizations the first step is behind us, and before that, UI performance was just about a nightmare. They also had to optimize Blender's memory usage on AS Macs. It is very suspicious right now: large scenes or dense structures consume too much of the unified memory.
Blender optimizations will earn Apple a few points. In general for 3D, Apple somehow needs to catch up on RT; it has to find a way to keep the current power efficiency while adding many more (and faster) GPU cores at a lower cost, very soon.
From what we saw in those YT videos, the M2 Max GPU cores show improvements in a single optimized piece of software, but at over 100 degrees on the board; all I know is that I don't like working long hours with a machine like that. The price you pay is too high for a 20-30% improvement on the GPU side, which comes back to you as heat and an overall frequency drop. Maybe it will be better on the next-generation Mac Studio or a future AS Mac Pro.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
So about 5-6X slower than an Nvidia 4090 now. Not actually that horrible if you just work on still-image renders. Maybe if they can jam 5 M2 Max chips inside the new Mac Pro (or something like 200 cores), they might be able to match performance with a 4090. 😀

Once Apple has hardware RT (hopefully with M3) the difference might be closer than one thinks.
 
  • Like
Reactions: LymeChips

leman

macrumors Core
Oct 14, 2008
19,521
19,678
but at over 100 degrees on the board; all I know is that I don't like working long hours with a machine like that. The price you pay is too high for a 20-30% improvement on the GPU side, which comes back to you as heat and an overall frequency drop. Maybe it will be better on the next-generation Mac Studio or a future AS Mac Pro.

What frequency dropdown? Temperature doesn’t matter, what’s important is that this hardware produces relatively little heat and the cooling system can keep it at max performance pretty much forever.
 

l0stl0rd

macrumors 6502
Jul 25, 2009
483
420
I will share my test for you guys.

Tests done on the 16" M2 Max 30-core (on battery).
I did run every render twice.

Monster under the bed (the only one I checked on the release version).
3.41 = 1 min 38 sec
3.5 = 1 min 19 sec

Tree Creature 15 fps

Party Tug from compiled shaders (material preview).
1st run 11 sec
2nd run 5 sec

Party Tug from solid view 28 sec

Amy animation playback
Material Preview 24fps (left the frame limit on).
Eevee render 20 fps

Classroom 49 sec

BMW (just for fun) 22 sec

Lone Monk 6 min 40 sec

Also, the max temp I saw on the 30-core was 95°C.
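As a quick sanity check, the Monster times above line up with the 15-20% speedup quoted from the MetalRT patch earlier in the thread:

```python
# Check the 3.41 -> 3.5 gain on the Monster scene, using the times above.
t_341 = 98  # 1 min 38 sec on Blender 3.41
t_35 = 79   # 1 min 19 sec on Blender 3.5

speedup = t_341 / t_35
reduction_pct = 100 * (t_341 - t_35) / t_341
print(f"{speedup:.2f}x faster, {reduction_pct:.1f}% shorter render")
# 1.24x faster, 19.4% shorter render
```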
 

aytan

macrumors regular
Dec 20, 2022
161
110
What frequency dropdown? Temperature doesn’t matter, what’s important is that this hardware produces relatively little heat and the cooling system can keep it at max performance pretty much forever.
From what I saw in the mentioned video, if I'm not wrong, there was a frequency drop on the CPU side and eventually an overall performance drop. That is all I was pointing out. Also, I am very happy with these improvements and look forward to the M2 Ultra, or whatever Apple releases beyond the Ultra. The M1 Ultra only just reaches a kind of 'warm' state during render sessions; I assume that if GPU frequencies go higher, the M2 Studio Max/Ultra could run hotter.
 

aytan

macrumors regular
Dec 20, 2022
161
110
I will share my test for you guys.

Tests done on the 16" M2 Max 30-core (on battery).
I did run every render twice.

Monster under the bed (the only one I checked on the release version).
3.41 = 1 min 38 sec
3.5 = 1 min 19 sec

Tree Creature 15 fps

Party Tug from compiled shaders (material preview).
1st run 11 sec
2nd run 5 sec

Party Tug from solid view 28 sec

Amy animation playback
Material Preview 24fps (left the frame limit on).
Eevee render 20 fps

Classroom 49 sec

BMW (just for fun) 22 sec

Lone Monk 6 min 40 sec

Also, the max temp I saw on the 30-core was 95°C.
Looks great, already faster than the 48-core M1 Ultra.
 

sirio76

macrumors 6502a
Mar 28, 2013
578
416
From what I saw in the mentioned video, if I'm not wrong, there was a frequency drop on the CPU side and eventually an overall performance drop. That is all I was pointing out. Also, I am very happy with these improvements and look forward to the M2 Ultra, or whatever Apple releases beyond the Ultra. The M1 Ultra only just reaches a kind of 'warm' state during render sessions; I assume that if GPU frequencies go higher, the M2 Studio Max/Ultra could run hotter.
You don’t want to rely on that video channel, just to get an idea of the misinformation: https://forums.macrumors.com/thread...-channel.2378896/?post=31923135#post-31923135
 

aytan

macrumors regular
Dec 20, 2022
161
110
You don’t want to rely on that video channel, just to get an idea of the misinformation: https://forums.macrumors.com/thread...-channel.2378896/?post=31923135#post-31923135
I agree about YT and misinformation; I was just describing what I saw on the screen, not what he said in the video.
Right now we are using a bunch of M1s, 2013 Mac Pros, iMacs, some midrange PCs, etc. Personally I have a few Macs, Mac Studios, and a few PCs. Only a couple of them work fine under video/VFX/color-grading/3D workflows: the 16" M1 MBP (which sometimes gets really hot with DaVinci in a multi-screen setup) and, with some doubt, three of the Studios.
No one would be happier than me if the M2 Macs perform beyond expectations. I really want them to perform better than the M1, and it seems they do.
I know I should try one myself :) maybe we will purchase a few of them this week, but I tend to wait at least a few more weeks, until March.
 
  • Like
Reactions: sirio76

aeronatis

macrumors regular
Sep 9, 2015
198
152
So about 5-6X slower than an Nvidia 4090 now. Not actually that horrible if you just work on still-image renders. Maybe if they can jam 5 M2 Max chips inside the new Mac Pro (or something like 200 cores), they might be able to match performance with a 4090. 😀

Even an RTX 3060 Laptop GPU is faster than the M2 Max in Blender, simply because the software makes use of CUDA/OptiX, whereas it has no special support for the Apple Neural Engine or the unified memory architecture yet.

Just by further software optimizations alone, the situation could be much better, not to mention how it could be if or when Apple adds RT cores inside Apple Silicon chips.

Besides, it never has to match RTX 4090 Desktop GPU. Being comparable to RTX 4070/4080 while only consuming 50 watts at peak would be a real game changer as people will finally have true mobile workstations then. I agree Apple needs more work on the desktop chips though.
 
  • Like
Reactions: Mojo1019

leman

macrumors Core
Oct 14, 2008
19,521
19,678
Even an RTX 3060 Laptop GPU is faster than M2 Max in Blender, simply because of CUDA/OptiX utilization of the software whereas it has no special compatibility for Apple Neural Engine and unified memory architecture yet.

The M2 Max is more or less the same as an RTX 3070 mobile in Blender using compute only (CUDA, Metal), and obviously much slower when Nvidia's hardware RT is used (OptiX). The Neural Engine has nothing to do with any of it. The M2 family performs as well as Nvidia relative to the raw compute capability of the GPU, so I'd think the software optimizations are already reasonably mature. Maybe Apple can get another 10% out of it, who knows.

But next step for them has to be hardware RT, that’s non-negotiable.
 
  • Like
Reactions: aytan

jmho

macrumors 6502a
Jun 11, 2021
502
996
Could Apple use the neural engine for a denoiser?
Yes, I was just about to post that Nvidia cards use OptiX for their denoising. If Apple can come up with a Core ML denoiser, that would free up the GPU for more rendering.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
If Apple can come up with a Core ML denoiser, that would free up the GPU for more rendering.
Could a neural engine-based denoiser have more impact on rendering times than hardware-based ray tracing?
 

aytan

macrumors regular
Dec 20, 2022
161
110
Could a neural engine-based denoiser have more impact on rendering times than hardware-based ray tracing?
The denoiser and OptiX are different pieces, I guess. As far as I know, you can use them together: for still frames, any denoiser works more or less; for animation, OptiX accelerates the whole rendering process, and you can add the OptiX denoiser on top for extra time savings, depending on which software or render engine you use. If you use Nvidia hardware with RT, OptiX can be the default render option. However, using any denoiser on animation sequences does not work for all scenes and carries its own risks. Most of the time it does not work as desired, and it behaves differently depending on which software you use.
 
  • Like
Reactions: Xiao_Xi

aytan

macrumors regular
Dec 20, 2022
161
110
For now you can use the Intel Open Image Denoise denoiser with M1/M2 Macs. It works well but adds extra time to your single-frame renders; there is no hardware acceleration for the denoiser itself.
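For reference, a minimal sketch of enabling Open Image Denoise from Blender's Python console (assumes Blender 3.x with Cycles; this is render-settings configuration, and property names may differ between versions):

```python
# Run inside Blender's Python console; bpy is only available there.
import bpy

scene = bpy.context.scene
scene.cycles.use_denoising = True            # denoise the final render
scene.cycles.denoiser = 'OPENIMAGEDENOISE'   # Intel OIDN, CPU-based on Apple Silicon
scene.cycles.use_preview_denoising = True    # also denoise the viewport preview
```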
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
Could a neural engine-based denoiser have more impact on rendering times than hardware-based ray tracing?
No. Hardware RT will be huge.

AI denoisers are mostly incredible for viewport work, because you can generate a very good estimate of your final frame from just a tiny handful of samples that would otherwise look incredibly noisy. They are less useful for final renders, as aytan mentions.
 
  • Like
Reactions: aytan and Xiao_Xi

jmho

macrumors 6502a
Jun 11, 2021
502
996
Denoiser and Optix are different pieces I guess.
Yeah, the OptiX denoiser is mostly a separate piece of technology from the main OptiX ray-tracing engine. The denoiser doesn't use the RT cores at all and just runs on the tensor cores (which are Nvidia's version of the Neural Engine).
 
  • Like
Reactions: aytan and Xiao_Xi