
diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
Never said it is a wrong one. Seems to be working quite well for them.
I guess Apple has to choose how to spend its transistor budget carefully. Fully custom RT cores like nvidia's are probably too expensive for such a small audience (or usage). AMD's solution (repurposing a TMU) is probably more Apple's speed since it doesn't require any additional hardware, but as has been shown, performance isn't its strong suit.

Is there an in-between option? Seems like what Imagination has done is closer to nvidia than AMD. It would be nice to see some working silicon with it, though, so we get a better idea of what is going on.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
My guess is that Apple already knows their biggest problem now is software. 16 months later we still have a whole ton of software that is still using Rosetta and is only just beginning to support Metal.

Porting to any kind of Apple RT hardware is going to require 3 stages:

Step 1: First off, convert or get rid of all your old hand-crafted x86 SIMD code that is preventing you from making an Apple Silicon binary. (trivial to 3 months' work)*
Step 2: Port to the main Metal API. (6 months to a year's work)*
Step 3: Implement the ray tracing portion of the Metal API. (2 months work)*

* Obviously all these estimates may vary wildly depending on any number of factors; this is more to show the relative scale of the tasks.

The job of the M1 Ultra is to create the customer demand to push 3d software vendors into implementing steps 1 and 2.

Once there are numerous apps on step 2 (or even step 3, since the RT API is available) then it makes sense for Apple to release RT cores.

At that point getting the software ready will take studios 2 months and not a year and a half, and Apple will be able to go to someone like Redshift and say "Hey, we're releasing RT cores soon, can we loan you some engineers for 2 months so we can put you in the keynote?"

If Apple had dropped RT cores with the M1 Ultra, developers would go "Oh, it's going to take us 16 months to get our software ready for it? Never mind, let's just tell everyone to keep using Windows."
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
Ah yeah, ImgTech is basically going the nvidia route (separate hardware block) @leman, which would probably be good for shrinking transistor usage on SoCs where it isn't needed (like the mobile ones).

[Attached images: gpu-CXT-RT3-Block-Diagram.jpg, gpu-Photon-RAC-Block-Diagram.jpg]

 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
16 months later we still have a whole ton of software that is still using Rosetta and is only just beginning to support Metal.
I naively believe that the situation will improve when an affordable CI/CD service for Apple Silicon exists.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
I naively believe that the situation will improve when an affordable CI/CD service for Apple Silicon exists.
There are already loads of these available and they're very easy to set up and use. I've used both CircleCI and GitHub Actions and they do just fine. Plus Xcode Cloud is probably coming out soon, which will make things even easier.

Anyone who doesn't have an Apple Silicon build at this point likely just has a bunch of code that literally can't be compiled for ARM64 - whether that's a 3rd-party library full of SIMD math stuff or hand-crafted assembly. The only solution there is going to be to rewrite that code in a form that can be compiled for ARM.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
There are already loads of these available and they're very easy to set up and use. I've used both CircleCI and GitHub Actions and they do just fine.
The GitHub Actions documentation doesn't mention that they offer Apple Silicon runners for CI/CD. Can you share a link?

I'm sure Julia programmers would be very happy to have a CI/CD for Apple Silicon.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
Fair enough, you can build for Apple Silicon from an Intel Mac, but you are correct that the machines themselves might not be Apple Silicon, which would prevent you from running tests on Apple Silicon if that was important for some reason.

(Although you could always just build for iOS and test on a simulator)
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
I guess Apple has to choose how to spend its transistor budget carefully. Fully custom RT cores like nvidia's are probably too expensive for such a small audience (or usage). AMD's solution (repurposing a TMU) is probably more Apple's speed since it doesn't require any additional hardware, but as has been shown, performance isn't its strong suit.

To my knowledge we don’t have much technical information about how any of these hardware RT units work. AMD only seems to have fixed-function intersection hardware exposed via specialized instructions. Some researchers have speculated that Nvidia’s RT cores are part of the texture units and that their main job is reordering memory access.

This is all about IP. Nvidia certainly has the best RT IP that is shipping. I am sure that Apple is working on their own solution, one that’s likely to be more flexible than Nvidia’s.
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
To my knowledge we don’t have much technical information about how any of these hardware RT units work. AMD only seems to have fixed-function intersection hardware exposed via specialized instructions. Some researchers have speculated that Nvidia’s RT cores are part of the texture units and that their main job is reordering memory access.

This is all about IP. Nvidia certainly has the best RT IP that is shipping. I am sure that Apple is working on their own solution, one that’s likely to be more flexible than Nvidia’s.
For not having "dedicated blocks" like nvidia, AMD's solution is still able to match Turing in speed (in a lot of cases; I have seen some where it falls behind), so that isn't horrible, right?
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
For not having "dedicated blocks" like nvidia, AMD's solution is still able to match Turing in speed (in a lot of cases; I have seen some where it falls behind)
For what workload? Gaming? That would mean AMD is a generation behind.
 

leman

macrumors Core
Oct 14, 2008
19,521
19,677
For not having "dedicated blocks" like nvidia, AMD's solution is still able to match Turing in speed (in a lot of cases; I have seen some where it falls behind), so that isn't horrible, right?

RDNA2 is much faster and has fast cache, so it's hardly a "compliment". AMD's current solution is a stopgap, I am sure they are working on something more advanced. RT is here to stay and we will see more sophisticated hardware. I think Apple is in a good spot here, and their software framework is already very solid.
 

Krevnik

macrumors 601
Sep 8, 2003
4,101
1,312
Fair enough, you can build for Apple Silicon from an Intel Mac, but you are correct that the machines themselves might not be Apple Silicon which I guess would potentially prevent you from running tests on Apple Silicon if that was important for some reason?

(Although you could always just build for iOS and test on a simulator)

I’m not sure how running in an iOS simulator helps for Apple Silicon. The iOS simulator uses the host CPU architecture, so it’s x64 on Intel systems. So you still need an ARM host if you are trying to test NEON SIMD or some other ARM-specific behavior.

That said, I agree with the general sentiment that it shouldn’t be a requirement to run the CI on the target architecture. At least in many/most cases.

Anyone who doesn't have an Apple Silicon build at this point likely just has a bunch of code that literally can't be compiled for ARM64 - whether that's a 3rd party library full of SIMD math stuff or hand-crafted assembly. The only solution there is going to be to re-write that code in a language that can be compiled to ARM.

Or they have a 3rd party dependency that’s binary only and hasn’t been updated in years.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
I’m not sure how running in an iOS simulator helps for Apple Silicon. The iOS simulator uses the host CPU architecture, so it’s x64 on Intel systems. So you still need an ARM host if you are trying to test NEON SIMD or some other ARM-specific behavior.

That said, I agree with the general sentiment that it shouldn’t be a requirement to run the CI on the target architecture. At least in many/most cases.

Or they have a 3rd party dependency that’s binary only and hasn’t been updated in years.
Good point. I've been using an M1 from day 1, so I completely forgot that Intel simulators run x86_64.

I will admit that I don't really care about the underlying architecture that much / at all. I think for regular graphics programming most people have been fine letting the compiler do its thing for the past 10 years or so.

I get that that's not the case in research / scientific computing though.

There is a reason Blender was able to compile to Apple Silicon almost immediately (being almost entirely written in C/C++), while it looks like Julia is going to be stuck not supporting Apple Silicon properly for another few years.

That said I'm not sure how useful RT cores would be for scientific computing anyway.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,628
1,101
There is a reason Blender was able to compile to Apple Silicon almost immediately (being almost entirely written in C/C++), while it looks like Julia is going to be stuck not supporting Apple Silicon properly for another few years.
Yes, money to buy M1 minis. If you can test your code on hardware, you fix your bugs faster.

If online CI/CD services offered Apple Silicon, more developers would have cheap access to the new Apple hardware.
 

jmho

macrumors 6502a
Jun 11, 2021
502
996
I doubt Blender has (or has ever needed) Apple Silicon CI/CD. I also doubt anyone is going to offer Apple Silicon CI/CD any time soon because nobody really needs it (except Julia, I guess), and basically everyone is just running macOS VMs on a giant Linux machine.

I guess Julia's only solution is, as you say, to buy M1 minis and do it themselves.
 

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
For what workload? Gaming? That would mean AMD is a generation behind.
Path tracing. Yes, AMD is a generation behind, considering RDNA 1 didn’t have any RT hardware.
RDNA2 is much faster and has fast cache, so it's hardly a "compliment". AMD's current solution is a stopgap, I am sure they are working on something more advanced. RT is here to stay and we will see more sophisticated hardware. I think Apple is in a good spot here, and their software framework is already very solid.
For rasterization I agree RDNA2 is faster, but that means nothing on the RT side of things (clearly). RDNA 3 is supposed to finally move on from GFX10 (RDNA/RDNA2) to GFX11, so in a way RDNA2’s RT is slapped on as a stopgap, as you say. Unless they decide they can make it work better. Guess we will see in a few months.
 