
leman

macrumors Core
Oct 14, 2008
19,522
19,679
CUDA cores are slower than RT cores, so people don't use them.

I think you misunderstand what the "CUDA core" is and what the "RT core" is. A "CUDA core" is the GPU compute shader core, the hardware capable of executing shader programs (be it for graphical or general-purpose computation tasks). An "RT core" is an auxiliary hardware unit that accelerates the task of finding which triangle is hit by a ray. But "RT cores" cannot run programs; all they can do is take a ray and a list of triangles and say "oh, this triangle is hit", but they do it really fast, much faster than if you wrote a program for it using the general-purpose shader cores. You still need a program that generates the rays and decides what to do with the hit information (e.g. shade the pixel based on the light bounces etc.). So an RT-accelerated GPU program runs on "CUDA cores" and uses "RT cores" to make raytracing fast.
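
To make that concrete, here is a rough sketch (purely illustrative, not taken from any real renderer) of what that "program written for the general-purpose shader cores" could look like: a plain CUDA kernel that brute-forces a Möller–Trumbore ray/triangle test over a flat list of triangles. This per-ray, per-triangle arithmetic is the kind of work an RT core performs in fixed-function hardware instead, against a BVH rather than a flat list, and without tying up the shader cores while it does so.

```cpp
// Minimal Möller–Trumbore ray/triangle test written as a plain CUDA kernel.
// This is the kind of work an RT core performs in fixed-function hardware;
// doing it like this occupies the general-purpose "CUDA cores" instead.
#include <cuda_runtime.h>

struct Ray { float3 o, d; };          // origin, direction
struct Tri { float3 v0, v1, v2; };    // triangle vertices

__device__ float3 sub(float3 a, float3 b) { return make_float3(a.x-b.x, a.y-b.y, a.z-b.z); }
__device__ float3 cross3(float3 a, float3 b) {
    return make_float3(a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x);
}
__device__ float dot3(float3 a, float3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// For each ray, brute-force test every triangle and record the closest hit distance.
__global__ void intersectKernel(const Ray* rays, int nRays,
                                const Tri* tris, int nTris,
                                float* hitT)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nRays) return;

    Ray r = rays[i];
    float closest = 1e30f;

    for (int t = 0; t < nTris; ++t) {
        Tri tri = tris[t];
        float3 e1 = sub(tri.v1, tri.v0);
        float3 e2 = sub(tri.v2, tri.v0);
        float3 p  = cross3(r.d, e2);
        float det = dot3(e1, p);
        if (fabsf(det) < 1e-8f) continue;           // ray parallel to triangle
        float inv = 1.0f / det;
        float3 s  = sub(r.o, tri.v0);
        float u   = dot3(s, p) * inv;
        if (u < 0.0f || u > 1.0f) continue;
        float3 q  = cross3(s, e1);
        float v   = dot3(r.d, q) * inv;
        if (v < 0.0f || u + v > 1.0f) continue;
        float dist = dot3(e2, q) * inv;
        if (dist > 0.0f && dist < closest) closest = dist;
    }
    hitT[i] = closest;   // the "which triangle is hit, and how far" answer (1e30f = miss)
}
```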


What is the point of the optix renderer if the cuda renderer also uses the RT hardware?

It's been a while since I last worked with CUDA and I never touched OptiX, but from what I understand CUDA itself does not have any API to access the RT hardware, while OptiX is basically a framework that has access to the RT hardware and uses CUDA to specify what to do with the RT results. It's a little bit confusing, but one (a bit naive and not entirely correct) way is to consider OptiX as CUDA with hardware raytracing.
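
For what it's worth, here is a loose sketch of what OptiX device code looks like, going from memory of the OptiX 7 SDK samples (the Params struct, the payload layout and the output buffer are made up for illustration, so treat it as a sketch rather than working sample code). The ray generation and hit programs are ordinary CUDA code, and the optixTrace() call in the middle is where the RT hardware does the BVH traversal and triangle intersection:

```cpp
// Loose sketch of OptiX 7 device code. "Params" and the payload usage are
// illustrative assumptions; see the OptiX SDK samples for real usage.
#include <optix.h>

struct Params {
    OptixTraversableHandle handle;  // acceleration structure built on the host
    float*                 output;  // one float per launched ray
    unsigned int           width;
};

extern "C" __constant__ Params params;

extern "C" __global__ void __raygen__rg()
{
    const uint3 idx = optixGetLaunchIndex();

    // Generate a ray however you like -- this part runs on the shader ("CUDA") cores.
    float3 origin    = make_float3(0.0f, 0.0f, -1.0f);
    float3 direction = make_float3(0.0f, 0.0f,  1.0f);

    unsigned int p0 = 0;  // payload: 1 if we hit something
    optixTrace(params.handle, origin, direction,
               0.0f, 1e16f, 0.0f,              // tmin, tmax, ray time
               OptixVisibilityMask(255),
               OPTIX_RAY_FLAG_NONE,
               0, 1, 0,                        // SBT offset, SBT stride, miss index
               p0);                            // <- hardware traversal/intersection happens here

    // Decide what to do with the hit information -- again ordinary CUDA code.
    params.output[idx.y * params.width + idx.x] = p0 ? 1.0f : 0.0f;
}

extern "C" __global__ void __closesthit__ch() { optixSetPayload_0(1u); }
extern "C" __global__ void __miss__ms()       { optixSetPayload_0(0u); }
```

Seen that way, "CUDA with hardware raytracing bolted on" is not a bad mental model.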
 
  • Like
Reactions: Xiao_Xi

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
It looks like there are two pipelines, one for rasterization and one for ray tracing.
[Attached image: slide showing the rasterization and ray tracing pipelines]

Are CUDA cores used for rasterization and RTX cores for ray tracing?

 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
Are CUDA cores used for rasterization and RTX cores for ray tracing?

No. In the graphs you posted, the white nodes represent programmable stages, the grey rhombi represent fixed-function stages, and the partially grey-filled nodes represent fixed-function, partially configurable stages*. CUDA cores are what run the programmable stages (shading and ray generation), while the RT cores run the "traversal & intersection" stage. Rasterisation is a fixed-function hardware stage that receives primitive data and generates pixel data to be shaded.

*I don't necessarily agree with that kind of representation but it's ok as a simplification.
 
  • Like
Reactions: Xiao_Xi and dmr727

dmr727

macrumors G4
Dec 29, 2007
10,677
5,872
NYC
It looks like there are two pipelines, one for rasterization and one for ray tracing.
[Attached image: slide showing the rasterization and ray tracing pipelines]
Are CUDA cores used for rasterization and RTX cores for ray tracing?


If you look at the video where he talks about this slide (thanks for posting it btw, it was interesting to me!) - only the middle of the Ray Tracing pipeline is handled by RTX (the green box):

[Attached screenshot: the ray tracing pipeline slide with the RTX-accelerated middle stage highlighted in green]
 
  • Like
Reactions: Xiao_Xi

sirio76

macrumors 6502a
Mar 28, 2013
578
416
What is the point of the optix renderer if the cuda renderer also uses the RT hardware?
V-Ray, for example, has two GPU modes, CUDA and RTX. The CUDA engine will use only the traditional GPU cores; the RTX engine will use both the standard cores and the RT cores to accelerate some parts of the render.
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
V-Ray, for example, has two GPU modes, CUDA and RTX. The CUDA engine will use only the traditional GPU cores; the RTX engine will use both the standard cores and the RT cores to accelerate some parts of the render.
Why wouldn't they just have one renderer/API like Apple?
 

tomO2013

macrumors member
Feb 11, 2020
67
102
Canada
Just for ****s and giggles… here is the PowerVR Photon approach to ray tracing. I'm truly excited to see this IP in the wild in working Apple Silicon hardware.


I like their overview of CXT and Photon here…


I could see a future M2 Ultra/M3 Ultra including their Level 4 (L4) ray tracing solution - it would allow Apple to compete with Nvidia on workloads that benefit from RT cores, but to do so at massively lower power consumption.

I have to say that I'm somewhat disappointed by the approach taken with the 4090. Personally I'd have taken a cheaper card with a 40-50% performance improvement over the 3090, but at significantly lower power usage and lower heat/noise under load. Noise output from my workstations is a big thing for me. Even when PC gaming - without getting into LN or custom liquid cooling solutions - I hate when graphics cards get loud and noisy under load.
 

innerproduct

macrumors regular
Jun 21, 2021
222
353
CUDA is the general API for doing compute work on Nvidia GPUs. If I remember correctly, you cannot access the RT cores directly from CUDA. OptiX is a specific ray tracing framework that is higher level and easy for developers to use. All GPU-based Nvidia ray tracers used to use CUDA, but that required a deep understanding of things like building and traversing BVHs. With OptiX, that is taken care of in hardware-optimal ways for Nvidia's different chips. So, all in all, in order to make full use of Nvidia's latest architectures you need to use OptiX. There is absolutely no sane reason to use plain CUDA anymore for most raytracing; it would be like writing your own video codecs instead of using the built-in hardware that is designed for that specific task. In a way MetalRT is similar to OptiX, but there is just no hardware backend at the moment.
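
To illustrate the "deep understanding of things like BVHs" part: without OptiX, a renderer has to build and traverse its own acceleration structure, something along the lines of the simplified loop below (the node layout, the stack size and the triangle-test helper are assumptions for illustration; real production kernels are far more elaborate and tuned per architecture). This whole loop is roughly what the RT cores, driven through OptiX, replace:

```cpp
// Hypothetical, heavily simplified BVH traversal in plain CUDA -- the kind of code
// a renderer has to maintain itself if it bypasses OptiX and the RT cores.
#include <cuda_runtime.h>

struct BvhNode {
    float3 boundsMin, boundsMax;  // axis-aligned bounding box of this node
    int    left;                  // index of left child, or -1 if this is a leaf
    int    firstTri, triCount;    // triangle range when this is a leaf
};

// Standard "slab" test: does the ray hit this box closer than tMax?
__device__ bool hitAabb(float3 o, float3 inv, float3 bmin, float3 bmax, float tMax)
{
    float t0 = (bmin.x - o.x) * inv.x, t1 = (bmax.x - o.x) * inv.x;
    float tNear = fminf(t0, t1), tFar = fmaxf(t0, t1);
    t0 = (bmin.y - o.y) * inv.y; t1 = (bmax.y - o.y) * inv.y;
    tNear = fmaxf(tNear, fminf(t0, t1)); tFar = fminf(tFar, fmaxf(t0, t1));
    t0 = (bmin.z - o.z) * inv.z; t1 = (bmax.z - o.z) * inv.z;
    tNear = fmaxf(tNear, fminf(t0, t1)); tFar = fminf(tFar, fmaxf(t0, t1));
    return tNear <= tFar && tFar >= 0.0f && tNear <= tMax;
}

// Assumed helper: a ray/triangle test like the Möller–Trumbore sketch earlier
// in the thread, looking the triangle up by index.
__device__ bool hitTriangle(float3 o, float3 d, int triIndex, float* t);

// Walk the BVH with an explicit stack and return the distance to the closest hit.
__device__ float traverse(const BvhNode* nodes, float3 origin, float3 dir)
{
    float3 invDir = make_float3(1.0f / dir.x, 1.0f / dir.y, 1.0f / dir.z);
    float closest = 1e30f;

    int stack[64];                        // explicit traversal stack
    int sp = 0;
    stack[sp++] = 0;                      // start at the root node

    while (sp > 0) {
        BvhNode node = nodes[stack[--sp]];
        if (!hitAabb(origin, invDir, node.boundsMin, node.boundsMax, closest))
            continue;                     // prune this whole subtree

        if (node.left < 0) {              // leaf: test its triangles
            for (int i = 0; i < node.triCount; ++i) {
                float t;
                if (hitTriangle(origin, dir, node.firstTri + i, &t) && t < closest)
                    closest = t;
            }
        } else {                          // inner node: visit both children
            stack[sp++] = node.left;
            stack[sp++] = node.left + 1;  // assumes children are stored adjacently
        }
    }
    return closest;                       // 1e30f means "no hit"
}
```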
 
  • Like
Reactions: Xiao_Xi

mi7chy

macrumors G4
Oct 24, 2014
10,625
11,298
I have to say that I'm somewhat disappointed by the approach taken with the 4090. Personally I'd have taken a cheaper card with a 40-50% performance improvement over the 3090, but at significantly lower power usage and lower heat/noise under load. Noise output from my workstations is a big thing for me. Even when PC gaming - without getting into LN or custom liquid cooling solutions - I hate when graphics cards get loud and noisy under load.

Not rocket science. Just scale the 4090 power consumption to 60% to achieve 85% of the performance, or to 50% power for 75% of the performance, with silent operation.

[Attached chart: RTX 4090 performance scaling vs. power limit]
 

tomO2013

macrumors member
Feb 11, 2020
67
102
Canada
Not rocket science. Just scale the 4090 power consumption to 60% to achieve 85% of the performance, or to 50% power for 75% of the performance, with silent operation.

[Attached chart: RTX 4090 performance scaling vs. power limit]


For most users that is not an option, or, read differently, it's not an option for those who work for big companies and will get their Nvidia 4090 as part of a pre-configured developer/creator workstation from Dell, HP, Lenovo, etc…

From a business perspective - not a tinkerer/hobbyist who builds his/her/their own PC - do you think that most people / IT departments / graphics departments allow their staff to tinker with power profiles on their GPUs, or do you think there is an expectation that they run the hardware they buy from such vendors in a stock profile so as not to impact the vendor-provided warranty?

The 'rocket science' that you present is simply not pragmatic for most large IT or graphics design organizations, where the rubber meets the road. Unfortunately, as is often the case, a choice of hardware spec is distilled to a conversation along the lines of…
“Here are the corporately provided and approved hardware configurations that we can offer staff. We have a contract negotiated with our favorite vendor (insert Dell, HP or Lenovo). Please pick option A, B or C and we'll get your hardware ordered.”
“Well, I was watching JayzTwoCents/Linus Tech Tips/<insert review site> and we can get a 4090, do a custom build, and underclock/overclock it so that it will be less noisy.”
“Let me interrupt you… pick either A, B or C. You'll be using it stock. Hardware is managed to corporate standards by the IT operations team. Have a nice day!”
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
For most users that is not an option, or, read differently, it's not an option for those who work for big companies and will get their Nvidia 4090 as part of a pre-configured developer/creator workstation from Dell, HP, Lenovo, etc…

From a business perspective - not a tinkerer/hobbyist who builds his/her/their own PC - do you think that most people / IT departments / graphics departments allow their staff to tinker with power profiles on their GPUs, or do you think there is an expectation that they run the hardware they buy from such vendors in a stock profile so as not to impact the vendor-provided warranty?

The 'rocket science' that you present is simply not pragmatic for most large IT or graphics design organizations, where the rubber meets the road. Unfortunately, as is often the case, a choice of hardware spec is distilled to a conversation along the lines of…
“Here are the corporately provided and approved hardware configurations that we can offer staff. We have a contract negotiated with our favorite vendor (insert Dell, HP or Lenovo). Please pick option A, B or C and we'll get your hardware ordered.”
“Well, I was watching JayzTwoCents/Linus Tech Tips/<insert review site> and we can get a 4090, do a custom build, and underclock/overclock it so that it will be less noisy.”
“Let me interrupt you… pick either A, B or C. You'll be using it stock. Hardware is managed to corporate standards by the IT operations team. Have a nice day!”
Those places should be buying the "Quadro" cards anyways...
 
  • Like
Reactions: iPadified

tomO2013

macrumors member
Feb 11, 2020
67
102
Canada
Those places should be buying the "Quadro" cards anyways...
Actually, many places and game studios, including Xbox game studios, use consumer cards during the development process :) It's really not uncommon.

Also, the 4090 is pitched by Nvidia as a content creation card taking over from the Titan, not just a gaming card.

Puget Systems, for example, touts the 4090's advantages over a 3090 for workstation loads (rightly so).

In any case, I stand by my point. Most folks who get their mitts on a 4090 or Quadro derivative in a business situation won't be touching the card to underclock/overclock it beyond the manufacturer's default configuration. Those who follow the add-in-board home-builder scene are, for the most part, in the minority.
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
Actually, many places and game studios, including Xbox game studios, use consumer cards during the development process :) It's really not uncommon.

Also, the 4090 is pitched by Nvidia as a content creation card taking over from the Titan, not just a gaming card.

Puget Systems, for example, touts the 4090's advantages over a 3090 for workstation loads (rightly so).

In any case, I stand by my point. Most folks who get their mitts on a 4090 or Quadro derivative in a business situation won't be touching the card to underclock/overclock it beyond the manufacturer's default configuration. Those who follow the add-in-board home-builder scene are, for the most part, in the minority.
Yeah. I was just thinking that the workstation cards pull less power than the consumer cards. It will be interesting to see how Dell, HP, and so on handle these 4090s this go-around.
 
  • Like
Reactions: tomO2013

mi7chy

macrumors G4
Oct 24, 2014
10,625
11,298
For most users that is not an option, or, read differently, it's not an option for those who work for big companies and will get their Nvidia 4090 as part of a pre-configured developer/creator workstation from Dell, HP, Lenovo, etc…

You wanted a "cheaper card", so personal use was implied. Employee time is money to companies, so $1600 is cheap, but any decent company has a centralized render/compute farm anyhow.
 
  • Like
Reactions: iPadified

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
Is GPU-accelerated MNEE coming to macOS in Blender?
 

altaic

Suspended
Jan 26, 2004
712
484
Is GPU-accelerated MNEE coming to macOS in Blender?
The first of the Metal 3 commits, cool! 🚀 🎉

Edit: Never mind. Premature celly, my bad.
 
Last edited:

galad

macrumors 6502a
Apr 22, 2022
611
492
Not really, it's just that Apple improved their shader compiler on macOS 13 and fixed some bugs; that commit doesn't use any of the new Metal 3 features.
 

altaic

Suspended
Jan 26, 2004
712
484
Not really, it's just that Apple improved their shader compiler on macOS 13 and fixed some bugs; that commit doesn't use any of the new Metal 3 features.
Aw, you’re right. I should have looked at the diffs.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
Metal 3 only works on Ventura, which isn't out yet, so it'll probably be a while before they start putting Metal 3 features into Blender.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
This is getting very interesting. There is an add-on to use Stable Diffusion in Blender.

Is there something similar for other software?
 
  • Like
Reactions: jujoje

altaic

Suspended
Jan 26, 2004
712
484
Metal 3 only works on Ventura which isn't out yet, so it'll probably be a while before they start putting Metal 3 features in Blender.
That commit was specifically for macOS >= 13, i.e. Ventura, hence my presumption that it had to do with Metal 3.
 