
diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
Never said it is a wrong one. Seems to be working quite well for them.
I guess Apple has to choose how to spend its transistor budget carefully. Fully custom RT cores like nvidia's are probably too expensive for such a small audience (or usage). AMD's solution (repurposing a TMU) is probably more Apple's speed since it doesn't require any additional hardware, but as has been shown, performance isn't its strong suit.

Is there an in-between option? Seems like what Imagination has done is closer to nvidia than AMD. It would be nice to see some working silicon with it, though, so we get a better idea of what is going on.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
My guess is that Apple already knows their biggest problem now is software. 16 months later we still have a whole ton of software that is still using Rosetta and is only just beginning to support Metal.

Porting to any kind of Apple RT hardware is going to require 3 stages:

Step 1: First off, convert or get rid of all your old hand-crafted x86 SIMD code that is preventing you from making an Apple Silicon binary. (trivial to 3 months' work)*
Step 2: Port to the main Metal API. (6 months to a year's work)*
Step 3: Implement the ray tracing portion of the Metal API. (2 months work)*

* Obviously all these estimates may vary wildly depending on any number of factors; this is more to show the relative scale of the tasks.

The job of the M1 Ultra is to create the customer demand to push 3d software vendors into implementing steps 1 and 2.

Once there are numerous apps on step 2 (or even step 3, since the RT API is available) then it makes sense for Apple to release RT cores.

At that point getting the software ready will take studios 2 months and not a year and a half, and Apple will be able to go to someone like Redshift and say "Hey, we're releasing RT cores soon, can we loan you some engineers for 2 months so we can put you in the keynote?"

If Apple had dropped RT cores with the M1 Ultra, developers would go "Oh, it's going to take us 16 months to get our software ready for it? Never mind, let's just tell everyone to keep using Windows."
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
Ah yeah, ImgTech is basically going the nvidia route (separate hardware block) @leman, which would probably be good for shrinking transistor usage on SoCs where it isn't needed (like the mobile ones).

[Attached images: gpu-CXT-RT3-Block-Diagram.jpg, gpu-Photon-RAC-Block-Diagram.jpg]

 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
16 months later we still have a whole ton of software that is still using Rosetta and is only just beginning to support Metal.
I naively believe that the situation will improve when an affordable CI/CD service for Apple Silicon exists.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
I naively believe that the situation will improve when an affordable CI/CD service for Apple Silicon exists.
There are already loads of these available and they're very easy to set up and use. I've used both CircleCI and GitHub Actions and they do just fine. Plus Xcode Cloud is probably coming out soon, which will make things even easier.

Anyone who doesn't have an Apple Silicon build at this point likely just has a bunch of code that literally can't be compiled for ARM64 - whether that's a 3rd-party library full of SIMD math stuff or hand-crafted assembly. The only solution there is going to be to rewrite that code in a form that can be compiled for ARM.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
There are already loads of these available and they're very easy to set up and use. I've used both CircleCI and GitHub Actions and they do just fine.
The GitHub Actions documentation doesn't mention that they offer Apple Silicon runners for CI/CD. Can you share a link?

I'm sure Julia programmers would be very happy to have a CI/CD for Apple Silicon.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
Fair enough, you can build for Apple Silicon from an Intel Mac, but you are correct that the machines themselves might not be Apple Silicon, which would prevent you from running tests on Apple Silicon if that was important for some reason.

(Although you could always just build for iOS and test on a simulator)
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
I guess Apple has to choose how to spend its transistor budget carefully. Fully custom RT cores like nvidia's are probably too expensive for such a small audience (or usage). AMD's solution (repurposing a TMU) is probably more Apple's speed since it doesn't require any additional hardware, but as has been shown, performance isn't its strong suit.

To my knowledge we don’t have much technical information about how any of these hardware RT units work. AMD only seems to have fixed-function intersection hardware exposed via specialized instructions. Some researchers have speculated that Nvidia’s RT cores are part of the texture units and that their main job is reordering memory access.

This is all about IP. Nvidia certainly has the best RT IP that is shipping. I am sure that Apple is working on their own solution, one that’s likely to be more flexible than Nvidia’s.
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
To my knowledge we don’t have much technical information about how any of these hardware RT units work. AMD only seems to have fixed-function intersection hardware exposed via specialized instructions. Some researchers have speculated that Nvidia’s RT cores are part of the texture units and that their main job is reordering memory access.

This is all about IP. Nvidia certainly has the best RT IP that is shipping. I am sure that Apple is working on their own solution, one that’s likely to be more flexible than Nvidia’s.
For not having "dedicated blocks" like nvidia, AMD's solution is still able to match Turing in speed (in a lot of cases; I have seen some where it falls behind), so that isn't horrible, right?
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
For not having "dedicated blocks" like nvidia, AMD's solution is still able to match Turing in speed (in a lot of cases; I have seen some where it falls behind)
For what workload? Gaming? That would mean AMD is a generation behind.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
For not having "dedicated blocks" like nvidia, AMD's solution is still able to match Turing in speed (in a lot of cases; I have seen some where it falls behind), so that isn't horrible, right?

RDNA2 is much faster and has fast cache, so it's hardly a "compliment". AMD's current solution is a stopgap, I am sure they are working on something more advanced. RT is here to stay and we will see more sophisticated hardware. I think Apple is in a good spot here, and their software framework is already very solid.
 

Krevnik

macrumors 601
Sep 8, 2003
4,101
1,312
Fair enough, you can build for Apple Silicon from an Intel Mac, but you are correct that the machines themselves might not be Apple Silicon which I guess would potentially prevent you from running tests on Apple Silicon if that was important for some reason?

(Although you could always just build for iOS and test on a simulator)

I’m not sure how running in an iOS simulator helps for Apple Silicon. The iOS simulator uses the host CPU architecture, so it’s x64 on Intel systems. So you still need an ARM host if you are trying to test NEON SIMD or some other ARM-specific behavior.

That said, I agree with the general sentiment that it shouldn’t be a requirement to run the CI on the target architecture. At least in many/most cases.

Anyone who doesn't have an Apple Silicon build at this point likely just has a bunch of code that literally can't be compiled for ARM64 - whether that's a 3rd party library full of SIMD math stuff or hand-crafted assembly. The only solution there is going to be to re-write that code in a language that can be compiled to ARM.

Or they have a 3rd party dependency that’s binary only and hasn’t been updated in years.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
I’m not sure how running in an iOS simulator helps for Apple Silicon. The iOS simulator uses the host CPU architecture, so it’s x64 on Intel systems. So you still need an ARM host if you are trying to test NEON SIMD or some other ARM-specific behavior.

That said, I agree with the general sentiment that it shouldn’t be a requirement to run the CI on the target architecture. At least in many/most cases.

Or they have a 3rd party dependency that’s binary only and hasn’t been updated in years.
Good point. I've been using an M1 from day 1, so I completely forgot that Intel simulators run x86_64.

I will admit that I don't really care about the underlying architecture that much / at all. I think for regular graphics programming most people have been fine letting the compiler do its thing for the past 10 years or so.

I get that that's not the case in research / scientific computing though.

There is a reason Blender was able to compile to Apple Silicon almost immediately (being almost entirely written in C/C++), while it looks like Julia is going to be stuck not supporting Apple Silicon properly for another few years.

That said I'm not sure how useful RT cores would be for scientific computing anyway.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
There is a reason Blender was able to compile to Apple Silicon almost immediately (being almost entirely written in C/C++), while it looks like Julia is going to be stuck not supporting Apple Silicon properly for another few years.
Yes, money to buy M1 minis. If you can test your code on hardware, you fix your bugs faster.

If online CI/CD services offered Apple Silicon, more developers would have cheap access to the new Apple hardware.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
I doubt Blender has (or has ever needed) Apple Silicon CI/CD. I also doubt anyone is going to offer Apple Silicon CI/CD any time soon because nobody really needs it (except Julia, I guess), and basically everyone is just running macOS VMs on a giant Linux machine.

I guess Julia's only solution is, as you say, to buy M1 minis and do it themselves.
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
For what workload? Gaming? That would mean AMD is a generation behind.
Path tracing. Yes, AMD is a generation behind, considering RDNA 1 didn’t have any RT hardware.
RDNA2 is much faster and has fast cache, so it's hardly a "compliment". AMD's current solution is a stopgap, I am sure they are working on something more advanced. RT is here to stay and we will see more sophisticated hardware. I think Apple is in a good spot here, and their software framework is already very solid.
For rasterization I agree RDNA2 is faster, but that means nothing on the RT side of things (clearly). RDNA 3 is supposed to finally move on from GFX10 (RDNA/RDNA2) to GFX11, so in a way RDNA2’s RT is slapped on as a stopgap, as you say. Unless they decide they can make it work better. Guess we will see in a few months.
 