A few days ago a new version of Open MoonRay was released, but most of the improvements seem to focus on Nvidia GPUs.

On the upside, it's now pretty trivial to get the Houdini version built and installed. On the downside, it doesn't appear to be using the GPU at all (whereas 1.6.0.0 was).
 
I’d say that is expected. Spec-wise these GPUs are very similar. In fact, I’d expect that Apple would be faster due to more efficient shader execution and larger caches.
It is a good start for Epic. They have a ways to go to catch up to Blender's performance though.
 
Not sure it's been mentioned here, but 3Delight has also been updated to run on Apple Silicon (apparently a while back, but I only just found out). Was always a pretty nice renderer, particularly for volumes, and you can get a free version that uses 12 cores for testing.
 
One day Pixar will update Renderman for Apple Silicon, but not today :D To my mind that's the main holdout on the rendering front, other than Karma xPU (GPU).
 
One day Pixar will update Renderman for Apple Silicon, but not today :D To my mind that's the main holdout on the rendering front, other than Karma xPU (GPU).
With plenty of M4 MacBook Pros and Mini Pros now in artists’ studios, early 2025 would be a good time for GPU rendering in Arnold and Enscape to enter the chat.
 
With plenty of M4 MacBook Pros and Mini Pros now in artists’ studios, early 2025 would be a good time for GPU rendering in Arnold and Enscape to enter the chat.

[Attached screenshot, 2024-12-26]
 
Enscape doesn’t do GPU raytracing on M4 hardware yet.

Hardware ray-tracing features: "Apple's HWRT implementation is not yet supported by Enscape." Currently not supported.

Here in (almost) 2025 we’re looking for full GPU hardware support.

Designers in our office use RTX acceleration in the Windows version. (Same with Arnold, natch.)
 
First 5090 Blender score appears vs 4090. Hopefully, 5090 retains most of it at ~50% power limit.

[Attached benchmark screenshot]
 
There are also numbers for 4.3. About 30% faster than the 4090. Would have expected more with the node shrink + 125W more power. Anyway, at least a tentative M4 Ultra will not look quite as bad as it could have 😂
 
There are also numbers for 4.3. About 30% faster than the 4090. Would have expected more with the node shrink + 125W more power. Anyway, at least a tentative M4 Ultra will not look quite as bad as it could have 😂
30% more cores = 30% higher score in this case.
 
There are also numbers for 4.3. About 30% faster than the 4090. Would have expected more with the node shrink + 125W more power. Anyway, at least a tentative M4 Ultra will not look quite as bad as it could have 😂
There is no node shrink (like there was 30 to 40 series).
 
There is no node shrink (like there was 30 to 40 series).
There was a very small one (Ada was on N5/N4; Blackwell 2 is on N4P). But yeah, nothing like the shift from consumer Ampere on Samsung 8nm (the A100 was TSMC N7, apparently) to Ada on TSMC N5, so your point stands. Still, that there have been seemingly no FP32 architecture improvements in the latest generation is a little disappointing. GDDR7 is nice though. And of course FP4 and the ML/AI stuff for those who use it.
 
3.6 LTS for stability and add-on compatibility vs bleeding edge.
You should try to follow the advice of the Cycles developers. In the near future, Cycles will start advising people to use the latest version of Blender to test the latest hardware.
  • Make it somehow clearer for people who do benchmarks that LTS is not the best way to depict performance on newly released hardware.
    • Add a warning in the benchmark launcher?
    • Add a warning in the Blender LTS itself when a device from the future is detected?
 
30% more cores = 30% higher score in this case.

I am surprised that the massively increased memory bandwidth does not have a larger effect. It is hard to believe these GPUs are compute-limited. Maybe the new GDDR7 offers no advantage for the memory access patterns used in Blender?
 
I am surprised that the massively increased memory bandwidth does not have a larger effect. It is hard to believe these GPUs are compute-limited. Maybe the new GDDR7 offers no advantage for the memory access patterns used in Blender?
It's about a 36-40% improvement, and while there are 33% more cores, core clocks were reduced, so TFLOPS is actually only about 27% higher. Meanwhile, memory bandwidth is about 80% higher. So it's fair to say that some of the improvement in the Blender score would indeed appear to come from the memory improvements.

Its GB5 CUDA score went up about 27%, exactly what one would expect. While some of GB5 may also be memory bound, in general my memory is that GB5 (and maybe 6 too, I can't remember) has difficulty filling the larger GPUs with enough work to saturate them, so the score going up linearly with TFLOPS is a decent result. The GB6 Vulkan and OpenCL score uplift had a much wider range and is probably more reflective of the state of those compute APIs than of the raw compute prowess of the device.
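For anyone who wants to sanity-check the ~27% TFLOPS / ~80% bandwidth figures above, here's a back-of-envelope sketch. The spec numbers (16384 vs 21760 CUDA cores, ~2.52 vs ~2.41 GHz boost, 1008 vs 1792 GB/s) are my recollection of the published spec sheets, so treat them as assumptions:

```python
# Rough check of the uplift figures quoted above.
# Spec numbers are assumed from public 4090/5090 spec sheets.
cores_4090, clock_4090, bw_4090 = 16384, 2.52, 1008  # cores, boost GHz, GB/s
cores_5090, clock_5090, bw_5090 = 21760, 2.41, 1792

def fp32_tflops(cores, ghz):
    # 2 FP32 ops per core per cycle (one FMA)
    return 2 * cores * ghz / 1000

tflops_uplift = fp32_tflops(cores_5090, clock_5090) / fp32_tflops(cores_4090, clock_4090) - 1
print(f"core count uplift:  {cores_5090 / cores_4090 - 1:.0%}")  # ~33%
print(f"FP32 TFLOPS uplift: {tflops_uplift:.0%}")                # ~27%
print(f"bandwidth uplift:   {bw_5090 / bw_4090 - 1:.0%}")        # ~78%
```

So the ~27% GB5 CUDA uplift tracks raw TFLOPS almost exactly, while the Blender uplift sits between the TFLOPS and bandwidth lines.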
 
It's about a 36-40% improvement, and while there are 33% more cores, core clocks were reduced, so TFLOPS is actually only about 27% higher. Meanwhile, memory bandwidth is about 80% higher. So it's fair to say that some of the improvement in the Blender score would indeed appear to come from the memory improvements.

Its GB5 CUDA score went up about 27%, exactly what one would expect. While some of GB5 may also be memory bound, in general my memory is that GB5 (and maybe 6 too, I can't remember) has difficulty filling the larger GPUs with enough work to saturate them, so the score going up linearly with TFLOPS is a decent result. The GB6 Vulkan and OpenCL score uplift had a much wider range and is probably more reflective of the state of those compute APIs than of the raw compute prowess of the device.
I always thought there was a bigger difference between CUDA and OpenCL on GB 5. Judging by the scores though, it seems to be around 10%.
OpenCL = 490170
CUDA = 542157
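A quick check of the gap from those two scores:

```python
# Gap between the two GB5 scores quoted above
opencl, cuda = 490170, 542157
gap = cuda / opencl - 1
print(f"CUDA is {gap:.1%} faster than OpenCL here")  # 10.6%
```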
 
I always thought there was a bigger difference between CUDA and OpenCL on GB 5. Judging by the scores though, it seems to be around 10%.
OpenCL = 490170
CUDA = 542157
It depends. Here are two 4090 scores that are more like 18% apart:



It's possible that OpenCL in GB5 improved on the 5090 by much more than OpenCL in GB6 did. It's also possibly just random variation across runs. 🤷‍♂️ TBH I've always found Geekbench results interesting but not always easy to compare across APIs. They're supposed to be comparable, but ... I've always found too many oddities.
 