Have the patents ImgTech holds on tile-based deferred renderers expired?

Don't think so, but prior to this Apple had not been licensing anything substantial from ImgTech for several years and was in dispute with them. This was seen as a big expansion of Apple's interest in ImgTech's IP rather than any kind of settlement. Nobody really knows, and both sides are being coy about exactly why Apple penned a new deal, with ImgTech saying only that "It's an expanded license, a wide range of IP, for multiple years. I'm sure you can speculate. Everyone is speculating. But we can't say."


If that's accurate then it sounds like it's about more than just TBDR, and the biggest new thing ImgTech has developed is ray tracing. So still speculation, but not unfounded.
 

"Unfortunately the KeyShot 11 release will not support GPU rendering on Mac Operating Systems.
One of the many reasons is due to Apple GPUs not possessing RayTracing cores, a key feature in KeyShot's programming.

While I know this is not the answer you were looking for, I want to assure you that things are happening behind the scenes to better support KeyShot on Mac OS.

Med Venlig Hilsen / Best Regards,

Erik Williams
Customer Support Specialist"


It seems like we will have to wait for Apple Silicon to get RT cores before some programs like KeyShot get GPU support. Not sure about a native CPU-only KeyShot release though.

Wonder if Toolbag 4 is waiting for something similar before bringing over native support.
 
It seems like we will have to wait for Apple Silicon to get RT cores before some programs like KeyShot get GPU support. Not sure about a native CPU-only KeyShot release though.

That does sound like a weird, marketing-formulated excuse. The presence or absence of RT cores only affects performance. From the feature and programmability standpoint, the M1 might as well contain RT cores…

P.S. I had a quick look and KeyShot does support GPUs like the Nvidia P1000 that do not have RT cores either. So yeah, just an excuse.
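
To illustrate (a rough sketch of my own, assuming an Apple Silicon Mac on macOS 11 or later): Metal gates its ray tracing API on software support, not on dedicated RT hardware, so the checks below pass on an M1, and acceleration structures build the same way either way; only traversal speed differs.

```swift
import Metal

// Minimal sketch: query ray tracing support and size a tiny acceleration structure.
// On an M1 these report support even though intersection runs on the regular
// shader cores rather than dedicated RT hardware.
guard let device = MTLCreateSystemDefaultDevice() else { fatalError("No Metal device") }

print("supportsRaytracing:      \(device.supportsRaytracing)")
print("supportsFamily(.apple7): \(device.supportsFamily(.apple7))")

// One triangle is enough to size a primitive acceleration structure.
let vertices: [Float] = [0, 0, 0,  1, 0, 0,  0, 1, 0]
let vertexBuffer = device.makeBuffer(bytes: vertices,
                                     length: vertices.count * MemoryLayout<Float>.stride,
                                     options: [])!

let geometry = MTLAccelerationStructureTriangleGeometryDescriptor()
geometry.vertexBuffer = vertexBuffer
geometry.vertexStride = MemoryLayout<Float>.stride * 3
geometry.triangleCount = 1

let accelDescriptor = MTLPrimitiveAccelerationStructureDescriptor()
accelDescriptor.geometryDescriptors = [geometry]

// Sizing (and building) works identically with or without RT cores;
// only how fast rays traverse the result differs.
let sizes = device.accelerationStructureSizes(descriptor: accelDescriptor)
print("Acceleration structure size: \(sizes.accelerationStructureSize) bytes")
```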
 
That does sound like a weird, marketing-formulated excuse. The presence or absence of RT cores only affects performance. From the feature and programmability standpoint, the M1 might as well contain RT cores…

P.S. I had a quick look and KeyShot does support GPUs like the Nvidia P1000 that do not have RT cores either. So yeah, just an excuse.
Yeah, I thought it was an excuse too. Think I mentioned that in the other 3D thread. Still, I think it basically means we have to wait until Apple Silicon has RT cores for them to consider adding GPU acceleration.

I tried the demo earlier today and running CPU-only via Rosetta made my M1 struggle hard. At least in Marmoset Toolbag 4 I can move around and use the program. It even supports ray tracing.
 

MetalRT closely follows the Optix implementation, and in some cases (notably handling of transforms) it makes sense to extend Optix special-casing to MetalRT. For these generalisations we now have KERNEL_GPU_RAYTRACING instead of KERNEL_OPTIX.

  • Metal (T92212): Would like to move quickly to get a fully working version into the code base. Currently only 4 test cases are failing. We would also like to get the Metal RT patch in. Looking into using run-time compilation to specialise the rendering pipeline, which had a 5%-10% performance impact; however, compilation can take a long time, negating the benefit. We are wondering if Blender is set up to build and test Metal builds. Finally, the Metal implementation currently does not have an lgamma implementation that is usable for licensing reasons, and lgamma is currently used by some closures in Cycles, so one needs to be found that can be used. D13263, D13241, D13243, D13236. (Michael)
  • Blender can easily support Metal builds, as it already runs on an M1 laptop, so it should be able to support testing Metal as well. (Brecht)
 
Interesting topic, but why not just add a chip just for ray tracing, like the RTX series, then?
 
Interesting topic, but why not just add a chip just for ray tracing, like the RTX series, then?
Do you mean adding ray tracing cores to the GPU like RTX, or bundling ray tracing in its own accelerator on the SoC, à la the NPU?
 
Either or both.
The how and when is what we're discussing. Apple is obviously setting up for the addition of such capabilities with software APIs, but the reason they haven't shipped anything yet is that Apple didn't have (as far as we know) any hardware ray tracing IP to do so. It's likely that gaining access to such IP is what their recent deal with ImgTech was for. If Apple's recent moves with Blender that @hefeglass linked to are anything to go by, we're likely to see hardware-accelerated ray tracing in Macs sooner rather than later.
 
Interesting topic, but why not just add a chip just for ray tracing, like the RTX series, then?

It's not a straightforward thing to do. But as others have said, Apple is most certainly working on it, and they have recently bought IP from IMG. Metal is RT-ready.

If it is separate silicon (like Nvidia's), I would figure the desktops/notebooks will see it first, as the phones don't really have the space to spare.

RT hardware has to be integrated into the GPU; I just don't see how it can be a separate coprocessor, as the communication overhead would make everything incredibly slow. Bounding box acceleration is trivial; the tricky part is optimizing memory access. The papers I've seen trying to figure out how Nvidia does it seem to suggest that "RT cores" are closely integrated with the texture units and their main job is to batch memory accesses to improve cache coherency.
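
To put the "trivial" part in concrete terms, here is a ray vs. axis-aligned bounding box slab test, the basic check done at every BVH node. This is my own illustration, not how any particular GPU implements it; the point is that the arithmetic is cheap, while keeping millions of such tests fed with coherent memory accesses is the hard part.

```swift
// Ray/AABB slab test: the cheap arithmetic at the core of BVH traversal.
func rayIntersectsAABB(origin: SIMD3<Float>,
                       inverseDirection: SIMD3<Float>,   // 1 / ray direction, per component
                       boxMin: SIMD3<Float>,
                       boxMax: SIMD3<Float>) -> Bool {
    // Parametric distances to the near and far planes of each slab.
    let t0 = (boxMin - origin) * inverseDirection
    let t1 = (boxMax - origin) * inverseDirection
    let tNear = pointwiseMin(t0, t1).max()
    let tFar  = pointwiseMax(t0, t1).min()
    // Hit if the ray enters all slabs before it exits any of them.
    return tNear <= tFar && tFar >= 0
}

// Example: a ray from the origin along (1, 1, 1) hits the box spanning [2, 3]^3.
let hit = rayIntersectsAABB(origin: .zero,
                            inverseDirection: SIMD3<Float>(repeating: 1),
                            boxMin: SIMD3<Float>(repeating: 2),
                            boxMax: SIMD3<Float>(repeating: 3))
print(hit) // true
```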
 
I could make arguments for any device level.
When could Apple be prepared to ship ray tracing hardware if it uses Imagination IP? This fall? Next year?
Which devices could Apple choose to have ray tracing hardware? Only computers? iPad? iPhone?
Could Apple ship ray tracing hardware only in Pro devices?

RT hardware has to be integrated into the GPU; I just don't see how it can be a separate coprocessor, as the communication overhead would make everything incredibly slow.
The blocks of Imagination's GPU with ray tracing:
[Image: Imagination-GPU.png (block diagram of Imagination's CXT GPU)]


The main components of the GPU include:
• Unified Shading Cluster (USC) – the compute heart of the GPU, a multi-threaded programmable SIMT processor which can simultaneously process pixel data, geometry data, compute data as well as 2D/copy housekeeping tasks. More USCs equates to higher compute performance for the GPU configuration.
• Texture Processing Unit (TPU) – handles texture addressing, sampling and filtering in highly optimised logic. More texture units equate to higher visual complexity, greater refresh rates and increased display resolution support.
• Raster/Geometry Block – a collection of fixed-function units enabling post- and pre-processing of data before/after processing by the USC including culling, clipping, tiling, compression, decompression, iteration, etc.
• Top-level (CXT RT3) – including L3 cache, AXI bus interfaces and firmware processor
• Ray Acceleration Cluster (RAC) – a new dedicated block for efficient handling of all ray tracing processing stages.

Source: Imagination's white paper "Rays your game" https://imaginationtech.com/resources/rays-your-game/
 
And: what is its main use case? Games? If so, this may be interesting by itself.
For Apple? Rendering apps for quick previews (since, from my understanding, everyone does the final render on the CPU). I doubt they will tout games as the best use of it.
 
RT hardware has to be integrated into the GPU; I just don't see how it can be a separate coprocessor, as the communication overhead would make everything incredibly slow. Bounding box acceleration is trivial; the tricky part is optimizing memory access. The papers I've seen trying to figure out how Nvidia does it seem to suggest that "RT cores" are closely integrated with the texture units and their main job is to batch memory accesses to improve cache coherency.
Not a separate coprocessor. A dedicated block in the GPU itself. AMD doesn't have a dedicated ray accelerator; they "reuse" the TMU. Nvidia doesn't "reuse" or dual-purpose any of its ray acceleration hardware.
 
Blocks of the Imagination's GPU with ray tracing.

I would advise care when looking at block diagrams like that. It's just a logical partitioning; the hardware does not have to look like that.

For Apple? Rendering apps for quick previews (since, from my understanding, everyone does the final render on the CPU). I doubt they will tout games as the best use of it.

Metal RT is kind of geared to do production rendering acceleration on the GPU. And regarding gaming - sure, why not? Apple already has tutorials on how to do hybrid RT shadows or ambient occlusion in games.
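
For anyone curious what that hybrid approach looks like on the host side, here is a rough sketch of my own (not Apple's sample code; the kernel, texture indices and buffer index are invented): rasterize the G-buffer as usual, then run a compute pass that traces one shadow ray per pixel against the scene's acceleration structure.

```swift
import Metal

// Hypothetical shadow-ray pass: `shadowPipeline` wraps an MSL kernel that tests
// occlusion with an intersector against the bound acceleration structure and
// writes 0 (shadowed) or 1 (lit) into `shadowMask`.
func encodeShadowPass(commandBuffer: MTLCommandBuffer,
                      shadowPipeline: MTLComputePipelineState,
                      accelerationStructure: MTLAccelerationStructure,
                      gBufferPositions: MTLTexture,
                      shadowMask: MTLTexture) {
    guard let encoder = commandBuffer.makeComputeCommandEncoder() else { return }
    encoder.setComputePipelineState(shadowPipeline)
    encoder.setTexture(gBufferPositions, index: 0)   // world-space positions from the raster pass
    encoder.setTexture(shadowMask, index: 1)         // output shadow mask
    encoder.setAccelerationStructure(accelerationStructure, bufferIndex: 0)
    encoder.useResource(accelerationStructure, usage: .read)

    // One thread per pixel of the shadow mask.
    let threadsPerGroup = MTLSize(width: 8, height: 8, depth: 1)
    let grid = MTLSize(width: shadowMask.width, height: shadowMask.height, depth: 1)
    encoder.dispatchThreads(grid, threadsPerThreadgroup: threadsPerGroup)
    encoder.endEncoding()
}
```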

Not a separate coprocessor. A dedicated block in the GPU itself. AMD doesn't have a dedicated ray accelerator; they "reuse" the TMU. Nvidia doesn't "reuse" or dual-purpose any of its ray acceleration hardware.

And how do you know that? Because Nvidia gives it a different name? We don't have any internal documentation on how the RT cores work. Why would they be a separate block? Making them part of the TMU makes a lot of sense, because that's the part that has to deal with memory locality. And locality of access is the biggest issue for GPU-driven RT.

P.S. This is the paper I am referring to: http://ceur-ws.org/Vol-2485/paper3.pdf
 
I would advise care when looking at block diagrams like that. It's just a logical partitioning; the hardware does not have to look like that.



Metal RT is kind of geared to do production rendering acceleration on the GPU. And regarding gaming - sure, why not? Apple already has tutorials on how to do hybrid RT shadows or ambient occlusion in games.
They do, and so far no one seems to have taken them up on the offer (the one game they touted during last year's WWDC [Metro Exodus] didn't even bother to add those effects in the basic version of the engine).
And how do you know that? Because Nvidia gives it a different name? We don't have any internal documentation on how the RT cores work. Why would they be a separate block? Making them part of the TMU makes a lot of sense, because that's the part that has to deal with memory locality. And locality of access is the biggest issue for GPU-driven RT.

P.S. This is the paper I am referring to: http://ceur-ws.org/Vol-2485/paper3.pdf
Because Nvidia says they are a separate block/core? 🤷‍♂️
Page 10 https://images.nvidia.com/aem-dam/en-zz/Solutions/geforce/ampere/pdf/NVIDIA-ampere-GA102-GPU-Architecture-Whitepaper-V1.pdf said:
Each SM in GA10x GPUs contain 128 CUDA Cores, four third-generation Tensor Cores, a 256 KB Register File, four Texture Units, one second-generation Ray Tracing Core, and 128 KB of L1/Shared Memory, which can be configured for differing capacities depending on the needs of the compute or graphics workloads.
Page 12 has Figure 3, which shows the separate texture units (4 per SM) and the RT cores below them.
 
Granted, what Nvidia seems to provide isn't 100% the same as the ISA papers AMD provides for RDNA, but from what I can tell the AMD ISA paper doesn't really talk about how their Ray Accelerators work, from a code perspective, either.
 
It certainly seems like Apple's relationship with Imagination has turned out to be a fairly smart bet, as I do not see the Adreno or Mali architectures from the likes of Qualcomm or ARM having these capabilities. I heard Samsung are trying to adapt some of AMD's graphics technology for use in low-power devices, but that doesn't seem to be a simple process.

What jumped out at me from the Photon pages on Imagination’s website were the specs: 7.2 GRays total, 9 TFlops (32-bit) performance, aimed at a mobile processor. That’s no small amount compared to the M1’s current totals. It also seems like this is a whole new graphics technology with 50% greater geometry and shading throughput compared to previous generations.
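
For rough scale (my own back-of-the-envelope, not a figure from Imagination's page): Apple rates the 8-core M1 GPU at about 2.6 TFLOPS FP32, so 9 TFLOPS would be roughly 3.5x that, before even counting the dedicated ray hardware.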
 
It certainly seems like Apple's relationship with Imagination has turned out to be a fairly smart bet, as I do not see the Adreno or Mali architectures from the likes of Qualcomm or ARM having these capabilities. I heard Samsung are trying to adapt some of AMD's graphics technology for use in low-power devices, but that doesn't seem to be a simple process.

What jumped out at me from the Photon pages on Imagination’s website were the specs: 7.2 GRays total, 9 TFlops (32-bit) performance, aimed at a mobile processor. That’s no small amount compared to the M1’s current totals. It also seems like this is a whole new graphics technology with 50% greater geometry and shading throughput compared to previous generations.
The crazy thing to me is that no one else but Apple is using IMGTec's stuff. For it to be as good as it is with no one else using it is weird.
 