
sirio76

macrumors 6502a
Mar 28, 2013
578
416
Just use Google.
It's more correct to say that all those programs also have GPU engines, but they were born as CPU renderers and that's still their main use by a large margin. GPU engines are not used in production for heavy tasks, maybe just for look development; everything else, including the final render, is done on the CPU. It's not just the RAM limitation: CPU engines are more stable and easier to code and debug.
 
  • Like
Reactions: Boil

Boil

macrumors 68040
Oct 23, 2018
3,477
3,173
Stargate Command
M1 Max performance is understandable. It does kinda put a damper on the hype for the next Apple Silicon Mac Pro though unless it somehow gets RT hardware.

I really want to be able to go full Mac only, but it looks like it might still be a few years away.

With the M1 Max reveal, I was wanting a maxed out M1 Max in a Mac mini...

But it is first gen, with no (known) raytracing...

So I await a compromise, see sig below...!

Maybe by the time the M3-series of SoCs drops, raytracing will be in the mix & the GPU power will be highly increased; the perfect time to move up to a M3 Max Duo Mac Pro Cube...?!?

Looking forward to diving into Blender next year; I am sure the performance will be better than when I was running EIAS on a PowerTower Pro with 64MB of RAM & a 4MB GPU...! ;^p
 

jujoje

macrumors regular
May 17, 2009
247
288
You mean Phenomenon...?!? ;^p

Well, that's a blast from the past. Funny to see much the same hopes with regard to Apple's pro apps back then as there are now. Hopefully this time Apple will deliver and we won't be looking back at this thread in a decade... If that ever became a thing it would have my money.

This seems to imply that any serious rendering should be GPU based, which is simply false; as a matter of fact, any high-end Hollywood blockbuster is still rendered on the CPU (RenderMan, Arnold, V-Ray or other proprietary engines). Even for my boutique studio GPU rendering isn't enough because of the limitations. Of course there are many use cases that can benefit from a fast GPU, but rendering isn't necessarily one of them. On the other hand, CPU rendering seems quite fast for a laptop; from my tests on real scenes (not simple benchmarks with a bunch of polygons) it looks like a 12-core desktop.

It's more correct to say that all those programs also have GPU engines, but they were born as CPU renderers and that's still their main use by a large margin. GPU engines are not used in production for heavy tasks, maybe just for look development; everything else, including the final render, is done on the CPU. It's not just the RAM limitation: CPU engines are more stable and easier to code and debug.

GPU renderers definitely have their place; as you say, lookdev is definitely one, as are mograph, advertising, product shots etc. where turnaround is tight and scene complexity is manageable. RenderMan, Karma and Arnold are all pushing xPU for lookdev rather than final frames (although RenderMan gets impressively close results between CPU and GPU).

The drawbacks of GPU renderers are that they don't really support the complexity required for more custom shading (things like point clouds and custom geometry) or the flexibility required for photorealism. They also tend to fall over pretty quickly once you start adding volumes.

From a studio perspective, driver updates may well affect how a scene renders or break things entirely (nice one, Nvidia). Speaking of whom, at the moment you pretty much have vendor lock-in to Nvidia if you want to get the fastest turnaround time.

GPU rendering has been something like the year of the Linux desktop: always in the near future, but never quite arriving. It definitely has its place, but I feel it is somewhat over-egged as the future of rendering.

Do you have any link that talks about it?

As sirio76 said, any film that has any CG in it will be rendered using a CPU farm. Pretty much pick any Marvel production. I honestly can't think of a blockbuster film that primarily used GPUs for final frames.
 

jujoje

macrumors regular
May 17, 2009
247
288
Brad Peebler and Stuart Ferguson at least. A couple of years ago now though.

Didn't realise that Brad Peebler had joined Apple; that definitely gives me a bit more confidence in Apple's ability to internally develop usable 3D tools. Good to know he's still kicking around the industry either way :)

USD itself is a mess. Pixar makes great tech… but overcomplicates the workflow (looking at RenderMan). It can't seem to think beyond the studio/pipeline paradigm.

Hopefully RenderMan will get a bit simpler now they're dropping all the REYES architecture. While the look RenderMan produces is really nice (it handles light pretty much better than any of the other renderers imho), it still feels unnecessarily complex, fiddly and full of Pixar jargon.

Speaking of Pixar jargon, USD is very artist unfriendly, full of baffling terms and concepts. Every time I think I've got it nailed, it does something unexpected.

From a studio view it's really good; it makes managing complexity across departments and software much easier. It's great for mid-large studios, but the conceptual overhead for smaller studios or freelancers is a pretty high barrier to entry. I only really started to get an understanding of what it was after a year of fiddling around with it. I still regularly break it though :p
 
  • Like
Reactions: singhs.apps

singhs.apps

macrumors 6502a
Oct 27, 2016
660
400
I have a feeling... and I could be wrong about this… but isn't all pre-processing done on the CPU? Could Apple's SoC approach with unified memory have its GPU cores seen as slow per 'core', but with orders of magnitude more 'rendering' cores (no reading/writing between system memory and VRAM)?

If so, considering 3090 GPUs perform around 1.5x-2x faster than a 64-core Threadripper in renders (the 4090 potentially even more so, at 3-4x), can we still see it as a win if the quad M1 Max SoC operates in the 3090 range, which would still be 150-200% faster than the top-of-the-line CPU? With none of the quirkiness of the GPU nor the VRAM limitations?
 
Last edited:

jujoje

macrumors regular
May 17, 2009
247
288
That's an interesting point.

At least with Karma xPU there's a significant, slightly annoying hitch while it preps the data to be sent to the different devices. If you just render with Karma CPU then the time to first pixel is significantly faster, as it skips the prep stage, and it updates much faster to new geo or materials.

It looks like we already see some benefits to not swapping data with Redshift in the Moana benchmark scene:

2x 2080ti = 34m:17s
Single 3090 = 21m:45s
2x 3090 = 12m:44s
Apple M1 Max 64GB = 28m:27s

(From: here)

I guess in part this is going to depend on how much Apple can persuade developers to optimise for their SoC; not shuffling data about could make it an order of magnitude faster for complex scenes. The performance profile would be pretty interesting though. For low-complexity scenes it would be merely OK, but potentially significantly faster than the competition as complexity increases. Also, time to first pixel would be super fast...
 

jujoje

macrumors regular
May 17, 2009
247
288
There'd also be an increasing gulf between the results given by the benchmarks most YouTubers use (smaller scenes that take 5-10 min to render) and the heavier scenes that something like the quad Mac Pro would be designed to handle for those buying it for work. The hot takes and internet arguments would be epic :D

Really hope one of the main companies (most likely Redshift, I guess) goes all in on optimising for unified memory; as you say, the speed of the GPU with none of the quirks could be quite something.
 

singhs.apps

macrumors 6502a
Oct 27, 2016
660
400
Yeah. I think this SoC would be pretty nifty and would perform better than the sum of its parts. Maybe Windows systems might end up taking this route too, one of these days.

Besides, Houdini itself could benefit for sims, pushing what it can to the GPU to speed things up, and even sim + render workflows would benefit greatly. Maybe even their PDG system.

Maya’s Bifrost is multithreaded. That too can take advantage of this approach. Debugging might be easier.

That Redshift benchmark is telling. If the M1 Max were 8x slower than a 3080, the numbers should have been much poorer in that test. So yeah, the extra onboard RAM + SSD cache might be speeding things up quite a bit in translation, which is where the bottleneck of the 3090's VRAM comes into play. Too much talking back and forth between the GPU and the CPU.

The quad SoC may not be a 4090… but it could well match 2x 64-core Threadrippers. And that's massive for CG/VFX.
 
Last edited:

singhs.apps

macrumors 6502a
Oct 27, 2016
660
400
Didn't realise that Brad Peebler had joined Apple; that definitely gives me a bit more confidence in Apple's ability to internally develop usable 3D tools. Good to know he's still kicking around the industry either way :)



Hopefully RenderMan will get a bit simpler now they're dropping all the REYES architecture. While the look RenderMan produces is really nice (it handles light pretty much better than any of the other renderers imho), it still feels unnecessarily complex, fiddly and full of Pixar jargon.

Speaking of Pixar jargon, USD is very artist unfriendly, full of baffling terms and concepts. Every time I think I've got it nailed, it does something unexpected.

From a studio view it's really good; it makes managing complexity across departments and software much easier. It's great for mid-large studios, but the conceptual overhead for smaller studios or freelancers is a pretty high barrier to entry. I only really started to get an understanding of what it was after a year of fiddling around with it. I still regularly break it though :p
That RenderMan thing was an industry thing. It was Arnold's approach that forced them and the likes of V-Ray to lay off the hyper-fiddling approach, and now almost every renderer follows that mantra (and Karma too XD).
 

jujoje

macrumors regular
May 17, 2009
247
288
That RenderMan thing was an industry thing. It was Arnold's approach that forced them and the likes of V-Ray to lay off the hyper-fiddling approach, and now almost every renderer follows that mantra (and Karma too XD).

I remember the Arnold team making the argument that artist time is the expensive thing and artists shouldn't be wasting time fiddling with render settings; just get a big farm and let it sort it out :)

Definitely changed the renderers for the better; I still have nightmares about Mental ray for Maya's render globals from around that time.
 
  • Like
Reactions: singhs.apps

jmho

macrumors 6502a
Jun 11, 2021
502
996
I went to WWDC in 2018 and they had a talk by one of the artists from Pixar who had worked on Coco, where the backdrop to most scenes was a gigantic wall of thousands of houses, each one covered in hundreds of tiny coloured decorative lights.

I remember her saying that each individual frame had something like 100,000 lights in it, and any system that tried to cleverly divide the scene by working out which objects were affected by which lights ended up being more computationally expensive than just "brute forcing" everything. That makes sense, because with path tracing each ray will ignore every light it doesn't hit, so having 100,000 lights isn't actually as expensive as it sounds.
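Roughly what that looks like in a toy path tracer (my own sketch and naming, nothing from Pixar's actual pipeline): with next-event estimation you pick a single light at random per bounce and weight by the pick probability, so the cost per sample tracks the bounce count rather than the 100,000 lights.

```python
import random

def sample_direct_lighting(lights, shading_point):
    # Pick ONE light out of the whole list per bounce -- O(1) whether the
    # scene has 10 lights or 100,000 -- and divide by the pick probability
    # so the estimate of the full sum over all lights stays unbiased.
    light = random.choice(lights)
    pdf = 1.0 / len(lights)
    return light_contribution(light, shading_point) / pdf

def light_contribution(light, shading_point):
    # Placeholder: a real tracer would fire one shadow ray at this light and
    # evaluate BSDF * emission; occluded lights contribute 0 and cost nothing more.
    return 0.0

if __name__ == "__main__":
    wall_of_houses = [{"id": i} for i in range(100_000)]
    print(sample_direct_lighting(wall_of_houses, shading_point=None))
```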

But obviously this means that you're just rendering a massive multiple-terabyte scene in one go, which is going to be impossible to do with GPUs.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
The drawbacks of GPU renderers are that they don't really support the complexity required for more custom shading (things like point clouds and custom geometry) or the flexibility required for photorealism. They also tend to fall over pretty quickly once you start adding volumes.

GPU rendering has been something like the year of the Linux desktop: always in the near future, but never quite arriving. It definitely has its place, but I feel it is somewhat over-egged as the future of rendering.
What makes GPU rendering so challenging?
 

sirio76

macrumors 6502a
Mar 28, 2013
578
416
Besides the memory limitations for large scenes, coding for it is more difficult, it's harder to debug, and it's more crash-prone due to driver issues. The more complex the scene is, the slower the GPU becomes. That's for rendering of course; for other tasks GPU computing can easily be several times faster than the CPU no matter what you throw at it, but it's a mistake to think that's the case for everything.
 
  • Like
Reactions: Xiao_Xi

iPadified

macrumors 68020
Apr 25, 2017
2,014
2,257
Well, that's a blast from the past. Funny to see much the same hopes with regard to Apple's pro apps back then as there are now. Hopefully this time Apple will deliver and we won't be looking back at this thread in a decade... If that ever became a thing it would have my money.





GPU renderers definitely have their place; as you say, lookdev is definitely one, as are mograph, advertising, product shots etc. where turnaround is tight and scene complexity is manageable. RenderMan, Karma and Arnold are all pushing xPU for lookdev rather than final frames (although RenderMan gets impressively close results between CPU and GPU).

The drawbacks of GPU renderers are that they don't really support the complexity required for more custom shading (things like point clouds and custom geometry) or the flexibility required for photorealism. They also tend to fall over pretty quickly once you start adding volumes.

From a studio perspective, driver updates may well affect how a scene renders or break things entirely (nice one, Nvidia). Speaking of whom, at the moment you pretty much have vendor lock-in to Nvidia if you want to get the fastest turnaround time.

GPU rendering has been something like the year of the Linux desktop: always in the near future, but never quite arriving. It definitely has its place, but I feel it is somewhat over-egged as the future of rendering.



As sirio76 said, any film that has any CG in it will be rendered using a CPU farm. Pretty much pick any Marvel production. I honestly can't think of a blockbuster film that primarily used GPUs for final frames.
I believe many went to GPU rendering because it was cheap compared to Intel Xeons and render farms, not because it produced better renders. I do wonder about the metrics of GPU vs CPU:

Does GPU use less energy to render a frame compared to CPU?
Is the footprint of the GPU chip smaller for a given performance compared to CPU?

Note I mean normal compute cores for general-purpose use, not the specialised ray tracing cores, which of course are more energy efficient but should not be considered a GPU.
 

singhs.apps

macrumors 6502a
Oct 27, 2016
660
400
I believe many went to GPU rendering because it was cheap compared to Intel Xeons and render farms, not because it produced better renders. I do wonder about the metrics of GPU vs CPU:

Does GPU use less energy to render a frame compared to CPU?
Is the footprint of the GPU chip smaller for a given performance compared to CPU?

Note I mean normal compute cores for general-purpose use, not the specialised ray tracing cores, which of course are more energy efficient but should not be considered a GPU.
Quality of render is largely a function of the renderer, not the hardware. Speed can be the domain of both.

Regarding energy: currently, a top-of-the-line Threadripper consumes around 280W under all-core load while a 3090 consumes around 360-370W, yet the 3090 can only best the Threadripper by a thin margin (based on tests of V-Ray's CUDA port to CPUs).
But price gouging notwithstanding, you can get almost three 3090s for the price of a Threadripper WX… and get 3x the rendering speed (assuming your render job can fit on the 3090) with 4x the power consumption.
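Quick sanity check on that arithmetic, treating the figures above as rough assumptions rather than measured benchmarks:

```python
cpu_watts = 280.0            # top-of-the-line Threadripper, all-core load (rough figure)
gpu_watts = 365.0            # single RTX 3090 (midpoint of the 360-370W quoted above)
gpus_for_same_price = 3      # "almost three 3090s for the price of a Threadripper WX"

total_gpu_watts = gpus_for_same_price * gpu_watts
print(f"3x 3090 board power: {total_gpu_watts:.0f}W "
      f"(~{total_gpu_watts / cpu_watts:.1f}x the Threadripper)")
# If each 3090 roughly matches one Threadripper on a render, the same spend
# buys ~3x the speed for ~3.9x the power: better per dollar, worse per watt.
```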

Dual/quad Threadripper systems don't exist, and if you create a render farm, you negate any power savings while multiplying the upfront hardware cost many times over.

A Threadripper is a safer bet as far as renderers go, but you can't scale up unless budget isn't a concern.

The footprint is smaller for GPUs if you want to scale up, considering each Threadripper will mean a complete unit (you can use server-grade multi-CPU systems, but the cost will balloon real fast).

So there isn’t really a clear-cut answer, but generally, GPUs are more cost-effective and faster than CPUs.

One way that I can see Apple closing the gap, and this depends on how serious Apple is about 3D, is creating its own hardware ray tracing cores, or full-blown ray tracing accelerators (like the dedicated encoders seen in the M1 Pro), thus minimizing the load on its SoCs.
 
Last edited:

Boil

macrumors 68040
Oct 23, 2018
3,477
3,173
Stargate Command
One way that I can see Apple closing the gap, and this depends on how serious Apple is about 3D, is creating its own hardware ray tracing cores, or full-blown ray tracing accelerators (like the dedicated encoders seen in the M1 Pro), thus minimizing the load on its SoCs.

5U rackmount chassis with X amount of blades, each blade is a Mn Max Quadra with attendant ray tracing accelerators...?

Hybrid rendering leveraging CPU, GPU, NPU (Neural Engine), & RTA (Ray Tracing Accelerator) cores...?
 

singhs.apps

macrumors 6502a
Oct 27, 2016
660
400
Not sure about Hybrid rendering. It's largely a way to make use of existing resources... if the GPU + RTX cores speed things up by 25%, why not use them and free up the CPU?

The RAM supplies the same data to the CPU and the GPU in Apple's SoCs, reducing the VRAM bottleneck, not to mention allowing the GPU to exploit a much larger pool of memory than traditional dedicated ones, again reducing the need to have CPUs in the mix.
Debugging should also be easier for renderers.

But I was thinking of how Apple might leverage RT cores for its AR/VR solution if, eventually, it wants those to be independent devices with their own OS.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
The memory limit is essentially the only real hard limitation of path tracing. If you can fit a scene in memory, then path tracing it is actually fairly simple (and embarrassingly parallel). If you can't fit it in memory then you have massive problems because there are no good algorithms that will let you split up a scene and render half of it on one CPU / GPU and half on another and then somehow merge the result back together.
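A toy illustration of the "embarrassingly parallel" part (a generic Python sketch, not any renderer's actual code): once every worker has the whole scene in memory, each pixel can be rendered independently and the results simply concatenate, with no communication between workers.

```python
from multiprocessing import Pool
import random

WIDTH, HEIGHT, SAMPLES = 64, 64, 8

def render_pixel(xy):
    # Placeholder shading: a real tracer would fire SAMPLES camera rays into
    # the fully loaded scene here. The key point is that no pixel needs data
    # from any other pixel's computation.
    x, y = xy
    random.seed(x * HEIGHT + y)
    return sum(random.random() for _ in range(SAMPLES)) / SAMPLES

if __name__ == "__main__":
    pixels = [(x, y) for y in range(HEIGHT) for x in range(WIDTH)]
    with Pool() as pool:                      # scatter pixels across cores
        image = pool.map(render_pixel, pixels)
    print(len(image), "pixels rendered with zero inter-worker communication")
```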

Light is incredibly unpredictable and if you're rendering a city and your light ray bounces say 16 times, that ray could end up travelling hundreds of kilometres to every corner of your scene in a single sample, so you need to have everything in memory all the time.

Say you split a city down the middle and give part A to GPU A and part B to GPU B, what does GPU A do when a reflection bounces off towards part B? You either have to send the ray to GPU B and then wait for GPU B to return the ray back to GPU A (slooooow), or you try to make some kind of light cache, which would likely end up being larger than the scene anyway and probably look worse. The other option is that GPU A ditches scene A, loads scene B from disk, traces the reflection, and then ditches scene B and re-loads scene A.

All of these options are so prohibitively slow that they're not worth doing, which means that the final option left is to say that parts A and B cannot reflect / cast shadows / light each other. This is going to put an incredible amount of work onto the artists as they have to find ways to fake things and try to make things look somewhat correct.

That's why GPUs are amazing for small scenes, but for people who are pushing the boundaries (and as Blinn's law states, as computers get faster, rendering times stay the same because people just render more stuff :p) the limit is always going to be memory. If you have enough memory, you can just throw cores at the problem. If you don't (and GPUs generally don't) then you're out of luck, no matter how fast the GPU or how many cores it has.

If the Apple Silicon Mac Pro ends up having a terabyte or more of RAM that is accessible by the GPU, then that could be a game changer at the high end, even without RT hardware.
 

Boil

macrumors 68040
Oct 23, 2018
3,477
3,173
Stargate Command
If the Apple Silicon Mac Pro ends up having a terabyte or more of RAM that is accessible by the GPU, then that could be a game changer at the high end, even without RT hardware.

LPDDR5X would allow up to 64GB chips, so 1TB RAM (with 2TB/s UMA bandwidth) on a Mn Max Quadra Mac Pro Cube should be possible...?

But it would also be quite expensive...! ;^p
 

sirio76

macrumors 6502a
Mar 28, 2013
578
416
The memory limit is essentially the only real hard limitation of path tracing. If you can fit a scene in memory, then path tracing it is actually fairly simple (and embarrassingly parallel). If you can't fit it in memory then you have massive problems because there are no good algorithms that will let you split up a scene and render half of it on one CPU / GPU and half on another and then somehow merge the result back together.

Light is incredibly unpredictable and if you're rendering a city and your light ray bounces say 16 times, that ray could end up travelling hundreds of kilometres to every corner of your scene in a single sample, so you need to have everything in memory all the time.

Say you split a city down the middle and give part A to GPU A and part B to GPU B, what does GPU A do when a reflection bounces off towards part B? You either have to send the ray to GPU B and then wait for GPU B to return the ray back to GPU A (slooooow), or you try to make some kind of light cache, which would likely end up being larger than the scene anyway and probably look worse. The other option is that GPU A ditches scene A, loads scene B from disk, traces the reflection, and then ditches scene B and re-loads scene A.

All of these options are so prohibitively slow that they're not worth doing, which means that the final option left is to say that parts A and B cannot reflect / cast shadows / light each other. This is going to put an incredible amount of work onto the artists as they have to find ways to fake things and try to make things look somewhat correct.

That's why GPUs are amazing for small scenes, but for people who are pushing the boundaries (and as Blinn's law states, as computers get faster, rendering times stay the same because people just render more stuff :p) the limit is always going to be memory. If you have enough memory, you can just throw cores at the problem. If you don't (and GPUs generally don't) then you're out of luck, no matter how fast the GPU or how many cores it has.

If the Apple Silicon Mac Pro ends up having a terabyte or more of RAM that is accessible by the GPU, then that could be a game changer at the high end, even without RT hardware.
Things are not exactly like this :)
About path tracing, well… thank God modern engines use every sort of trickery to avoid having the light emitted by a light bulb travel kilometres away; that would result in immense computing times. Once a ray reaches a certain threshold and no longer contributes significantly to the scene, it is cut away.
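One common form of that trickery is Russian roulette termination; a minimal generic sketch (my own thresholds and naming, not any specific engine's implementation):

```python
import random

MIN_THROUGHPUT = 0.01   # hypothetical cutoff below which a path becomes a candidate to die

def continue_path(throughput):
    # Dim paths are killed probabilistically; survivors are boosted by the
    # survival probability so the image stays unbiased on average.
    if throughput >= MIN_THROUGHPUT:
        return True, throughput
    survive_prob = max(throughput / MIN_THROUGHPUT, 0.05)
    if random.random() < survive_prob:
        return True, throughput / survive_prob
    return False, 0.0                          # ray is cut away, travels no further

if __name__ == "__main__":
    alive, throughput, bounces = True, 1.0, 0
    while alive and bounces < 64:
        throughput *= 0.5                      # pretend every bounce halves the energy
        alive, throughput = continue_path(throughput)
        bounces += 1
    print(f"path terminated after {bounces} bounces")
```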
About memory usage, only progressive rendering needs to keep all the assets loaded in memory all the time. V-Ray's bucket sampler, for example, can split the frame into smaller parts and assign each part to a CPU thread or a GPU; this saves RAM since not all assets need to be in memory constantly and they can be loaded and unloaded on demand depending on what the bucket is rendering. Incidentally, bucket rendering not only consumes less RAM, it is also a bit faster than progressive rendering.
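And a toy version of the bucket idea (an illustration only, not V-Ray's actual scheduler): split the frame into tiles and, per bucket, keep resident only the assets that bucket can actually see.

```python
def buckets(width, height, tile=64):
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield (x, y, tile, tile)

def bucket_sees(bucket, asset):
    return True  # placeholder: a real renderer tests the asset's bounds against the bucket

def render_bucket(bucket, scene_index):
    # Hypothetical on-demand loading: page in only the geometry this tile needs,
    # trace it, then release it before moving on to the next bucket.
    resident = [asset for asset in scene_index if bucket_sees(bucket, asset)]
    return f"bucket {bucket}: {len(resident)} assets resident"

if __name__ == "__main__":
    scene_index = [f"asset_{i}" for i in range(8)]   # lightweight index, not full geometry
    for b in buckets(128, 128):
        print(render_bucket(b, scene_index))
```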
 
  • Like
Reactions: singhs.apps

singhs.apps

macrumors 6502a
Oct 27, 2016
660
400
How much memory does a large scene need?
Depends on many factors.

But there are clever ways you can create a sense of large scale, some brute-force ones too: Epic's excellent Nanite + Lumen, clever instancing, clever de-duplication (Isotropix Clarisse), parallax OSL, etc.

The goal is a real-time (always has been) illusion of reality for CG. Ray tracing is currently the best option to mimic light behaviour, which is where the dumb but hugely numerous GPU cores can brute-force the rays, whether camera-based or bidirectional tracing.

So cheating is built into CG… but it can get difficult if you aren't 'rendering' a scene for a fixed, controlled POV (that is, games, VR etc.).
 
Last edited:
  • Like
Reactions: Xiao_Xi