
vinegarshots

macrumors 6502a
Sep 24, 2018
982
1,349
At first glance this seems pretty impressive for the Mac.

On my 4090 in Wanderer, Playback is around 26FPS and Render Time is about 1.7 seconds.

That's a little confusing to me, though, as I'm not sure how the Mac is almost twice as fast in viewport playback, but almost 2/3 slower in render time (since it is using the same engine). Something seems a little weird there.

Ok, I'm an idiot. I didn't realize that Wanderer is set to a 24fps render by default, which limits the realtime playback.

So, with the render FPS set higher, new results on the 4090:
Realtime Playback: Over 120fps
Render: 1.7 Seconds.

I'm actually not sure how high the realtime playback can go on my system, because the FPS is now capped by my monitor refresh rate (120hz). It appears that it has plenty of headroom to go faster.

So, the Mac Metal viewport performance is better, but still nowhere near an Nvidia card on OpenGL.
 
  • Like
Reactions: Xiao_Xi

Pressure

macrumors 603
May 30, 2006
5,182
1,544
Denmark
Ok, I'm an idiot. I didn't realize that Wanderer is set to a 24fps render by default, which limits the realtime playback.

So, with the render FPS set higher, new results on the 4090:
Realtime Playback: Over 120fps
Render: 1.7 Seconds.

I'm actually not sure how high the realtime playback can go on my system, because the FPS is now capped by my monitor refresh rate (120hz). It appears that it has plenty of headroom to go faster.

So, the Mac Metal viewport performance is better, but still nowhere near an Nvidia card on OpenGL.
You are correct that it is slower than the GeForce RTX 4090 (as expected). You need to put the performance into some kind of perspective though.

The GeForce RTX 4090 is a much larger GPU (76.3B transistors) with way more resources (16,384 CUDA cores) that pushes 82.58 TFlops, is built on the latest 4nm process, and has a much higher power draw (TDP of 450 watts). It only features 24GB of RAM, though. The graphics card alone is larger than the entire Mac Studio chassis that houses the M1 Ultra SoC.

For comparison, the M1 Ultra features two 32-core GPUs across its two dies for a total of 8,192 shader cores clocked at a lowly 1.3GHz, for a total of 21 TFlops of performance and much lower power draw. It shares up to 128GB of RAM with the system thanks to the unified memory architecture of Apple Silicon.
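Those TFlops figures follow from a simple formula: shader count × clock × 2 FP32 ops per cycle (one fused multiply-add). A quick sanity check of the numbers quoted above, treating the clocks as published boost/typical figures rather than measurements:

```python
def peak_tflops(shaders: int, clock_ghz: float) -> float:
    """Theoretical FP32 peak: shaders x clock x 2 ops (FMA) per cycle."""
    return shaders * clock_ghz * 2 / 1000  # GFLOPs -> TFLOPs

# GeForce RTX 4090: 16,384 CUDA cores at ~2.52 GHz boost
rtx_4090 = peak_tflops(16384, 2.52)

# M1 Ultra: 64 GPU cores x 128 ALUs each = 8,192 shaders at ~1.3 GHz
m1_ultra = peak_tflops(8192, 1.3)

print(f"RTX 4090: {rtx_4090:.2f} TFlops")   # ~82.58
print(f"M1 Ultra: {m1_ultra:.1f} TFlops")   # ~21.3
print(f"Raw ratio: {rtx_4090 / m1_ultra:.1f}x")
```

On paper, then, the 4090 has close to 4x the raw FP32 throughput, so a multiple-fold viewport gap is roughly what the raw specs predict.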
 
  • Like
Reactions: stevemiller

sirio76

macrumors 6502a
Mar 28, 2013
578
416
Viewport rendering and conventional GPU rendering are very different things. Also, on a traditional PC the viewport data must first be computed by the CPU (that's why fast single-core speed is very important in this regard), then displayed by the GPU in the viewport. On AS all of this happens on the same SoC, while on the PC the data needs to travel across multiple devices; that's why a much smaller, more efficient device can get decent performance compared to a 4090.
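As a loose sketch of that split (all function names here are invented for illustration, not Blender internals): each frame, modifiers/deformers are evaluated on the CPU, typically single-threaded, before the result can even be handed to the GPU, so single-core speed can gate the viewport regardless of GPU power.

```python
import math

def evaluate_deformers(vertices, frame):
    """Hypothetical CPU-side step: runs single-threaded every frame,
    before the mesh can be uploaded to the GPU for display."""
    deformed = []
    for x, y, z in vertices:
        # e.g. a wave deformer driven by the current frame number
        deformed.append((x, y + 0.25 * math.sin(x + frame * 0.1), z))
    return deformed

def draw_frame(vertices, frame):
    mesh = evaluate_deformers(vertices, frame)  # CPU bottleneck lives here
    # upload_to_gpu(mesh); rasterize(...)       # the GPU part is often not the limit
    return mesh

frame0 = draw_frame([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], frame=0)
```

If `evaluate_deformers` takes longer than a frame interval, no GPU upgrade will raise the playback rate.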
 
  • Like
Reactions: aytan and Xiao_Xi

Bodhitree

macrumors 68020
Apr 5, 2021
2,085
2,216
Netherlands

leman

macrumors Core
Oct 14, 2008
19,521
19,677
Viewport rendering and conventional GPU rendering are very different things. Also, on a traditional PC the viewport data must first be computed by the CPU (that's why fast single-core speed is very important in this regard), then displayed by the GPU in the viewport. On AS all of this happens on the same SoC, while on the PC the data needs to travel across multiple devices; that's why a much smaller, more efficient device can get decent performance compared to a 4090.

Can you explain this in more detail? I thought that the Blender viewport was just a 3D view into the scene that uses the GPU rasterisation pipeline to produce the image (just like a computer game would)? What do you mean when you say that the viewport data must be computed by the CPU?
 

sirio76

macrumors 6502a
Mar 28, 2013
578
416
Can you explain this in more detail? I thought that the Blender viewport was just a 3D view into the scene that uses the GPU rasterisation pipeline to produce the image (just like a computer game would)? What do you mean when you say that the viewport data must be computed by the CPU?

On a traditional PC (or Mac), the data are not simply loaded from disk or memory and displayed in the viewport by the GPU. Before anything is displayed, the CPU needs to compute a lot of things, for example generators, deformers, etc., and unfortunately much of this is single-threaded, so if your viewport is slow, quite often the bottleneck is the CPU rather than the GPU.
 

vinegarshots

macrumors 6502a
Sep 24, 2018
982
1,349
On a traditional PC (or Mac), the data are not simply loaded from disk or memory and displayed in the viewport by the GPU. Before anything is displayed, the CPU needs to compute a lot of things, for example generators, deformers, etc., and unfortunately much of this is single-threaded, so if your viewport is slow, quite often the bottleneck is the CPU rather than the GPU.

The thing that’s not really made clear in those Mac performance numbers is that the demo scene (Wanderer) is using the Eevee engine only. In Blender, there are multiple “rendered” viewport modes available. Eevee is essentially the game-engine renderer. There is also a Cycles (ray traced) viewport as well.

We all know that Apple Silicon can’t compete with PC on ray tracing performance, so it makes sense that they haven’t bothered publishing any performance data on the Cycles viewport.

But seeing that the PC is at least 3X faster in basic raster-engine (Eevee) performance is pretty shocking to me, honestly.
 

sirio76

macrumors 6502a
Mar 28, 2013
578
416
To me what's shocking is comparing a huge, noisy, triple-slot GPU that consumes 450W to an integrated GPU that will consume about 1/10 of that power ;)
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
Can Blender's Eevee be considered a good benchmark for testing the rasterization capabilities of Apple's GPU?
 

Standard

macrumors 6502
Jul 8, 2008
296
59
Canada
Interestingly enough, Mudbox 2024 now has AS support. This is great, as it is still amazing software. I wonder if they did this for future consolidation into Maya?

 

sirio76

macrumors 6502a
Mar 28, 2013
578
416
Can Blender's Eevee be considered a good benchmark for testing the rasterization capabilities of Apple's GPU?
Take your average scene, test it in the software you use, and check whether the hardware is fast enough to get the job done comfortably.
Since user needs vary a lot, that's the only benchmark you should consider a "good benchmark": in the end the only purpose of our systems is to generate income, and if they can do that, that's all that matters.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
On a traditional PC (or Mac), the data are not simply loaded from disk or memory and displayed in the viewport by the GPU. Before anything is displayed, the CPU needs to compute a lot of things, for example generators, deformers, etc., and unfortunately much of this is single-threaded, so if your viewport is slow, quite often the bottleneck is the CPU rather than the GPU.

Thanks! I don't see much evidence that the CPU is the bottleneck in these particular examples though. There is a healthy speedup using Metal (on the same CPU), and another user reported the same scene running much faster on a fast Nvidia GPU.



But seeing that the PC is at least 3X faster in basic raster-engine (Eevee) performance is pretty shocking to me, honestly.

Well, you are using the 4090, a GPU with twice as many shading/raster clusters and a 60-70% higher clock, so to be fair, it shouldn't really be any more shocking than the fact that a truck can haul more load than a minibus. I think it would be much more interesting to compare GPUs with similar nominal performance, such as the M2 Max and the RTX 3060 (desktop) or the RTX 3060 (mobile).
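Putting rough numbers on that comparison (the ~2x cluster count and 60-70% clock advantage are taken from the post above as assumptions, not measurements):

```python
# Naive scaling estimate: raster throughput ~ clusters x clock
cluster_ratio = 2.0                            # 4090 has ~2x the shading/raster clusters
clock_ratio_low, clock_ratio_high = 1.6, 1.7   # ~60-70% higher clock

expected_low = cluster_ratio * clock_ratio_low
expected_high = cluster_ratio * clock_ratio_high
print(f"Expected raster gap: {expected_low:.1f}x to {expected_high:.1f}x")
```

That naive product lands at roughly 3.2-3.4x, which is in line with the ~3x Eevee difference reported earlier in the thread.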

To me what's shocking is comparing a huge, noisy, triple-slot GPU that consumes 450W to an integrated GPU that will consume about 1/10 of that power ;)

Depends on your comparison criteria. If you are a professional 3D artist who uses a desktop workstation, chances are you care about performance more than about power consumption or the size of the tower. The Studio might have its merits, but they will hardly be relevant to you.

Can Blender's Eevee be considered a good benchmark for testing the rasterization capabilities of Apple's GPU?

Probably not. There is usually a lot of complexity behind this kind of software that makes it difficult to understand what exactly one is measuring. For a measure of rasterisation capabilities I would still look at games and gaming benchmarks.
 
  • Like
Reactions: Xiao_Xi

leman

macrumors Core
Oct 14, 2008
19,521
19,677
We all know that Apple Silicon can’t compete with PC on ray tracing performance, so it makes sense that they haven’t bothered publishing any performance data on the Cycles viewport.

I thought that Cycles was the production renderer of Blender? Is there also a Cycles viewport?
 

innerproduct

macrumors regular
Jun 21, 2021
222
353
Yes. Same Cycles renderer but usually with different settings, like most renderers these days. Some call it IPR, interactive preview rendering. Eevee can also be used for previews or finals with different levels of quality. Eevee was basically a game-tech viewport using OpenGL; now Apple has ported it to Metal for better performance. Eevee Next will have more hybrid functionality, like ray tracing some effects. It will not require RT hardware, however.
 

dmccloud

macrumors 68040
Sep 7, 2009
3,142
1,899
Anchorage, AK
What's so confusing? Those are the best GPU options on PC and Mac lines.

It's also not a direct comparison. Even if you're looking at the M1 Ultra or the M2 Max (38-core GPU), the graphics capabilities are simply not the same as an RTX 4090 or 4080. Also, some of those 40xx GPUs are actually quad-slot monsters now, simply because of how much cooling has to be added to the card to make it work under extended load.
 

mi7chy

macrumors G4
Oct 24, 2014
10,623
11,295
Not, by any means, an off-the-shelf build though...

This is a customized enhancement of the previous build mentioned in the video, which is off the shelf except for some aesthetics. It doesn't change the performance of the stock RTX A2000 dGPU on the Blender benchmark relative to the M1 Ultra 64-core GPU. The customized build just adds about +24% performance on top by uncapping the power limit; or wait for the newly announced RTX 4000 SFF to ship, which is ~2x faster in the same dimensions at a stock 70W.

[attached Blender benchmark screenshot]
 

iPadified

macrumors 68020
Apr 25, 2017
2,014
2,257
This is a customized enhancement of the previous build mentioned in the video, which is off the shelf except for some aesthetics. It doesn't change the performance of the stock RTX A2000 dGPU on the Blender benchmark relative to the M1 Ultra 64-core GPU. The customized build just adds about +24% performance on top by uncapping the power limit; or wait for the newly announced RTX 4000 SFF to ship, which is ~2x faster in the same dimensions at a stock 70W.

[attached benchmark screenshot]
Noise levels?
 

mi7chy

macrumors G4
Oct 24, 2014
10,623
11,295
Noise levels?

Ask the source on YouTube or Discord, but looking at the build list, the Noctua NH-L9i active cooler in the fan-cooled (Josh) build is rated at 23.6 dB, and the water-cooled (Eric) build with the Alphacool NexXxoS XT45 radiator looks to be about 30 dB; so not the "huge, noisy, triple-slot GPU that consumes 450W" BS. For comparison, the Mac Studio is about 25 dB at idle.

Air cooled (Josh) build video briefly demos noise level under gaming load.

 

mi7chy

macrumors G4
Oct 24, 2014
10,623
11,295
Sure...

[attached Blender benchmark screenshots]


If the 4090 takes 15 hours to complete a short 3-minute animated rendering, then how many days would the M1 Ultra take? A week?

[attached render-time screenshot]
 