leman

macrumors Core
Oct 14, 2008
19,517
19,664
I'm not sure there is much point in comparing different APIs like that.

Of course there is. We care about achievable performance, don’t we? Driver quality and API are parts of the equation.

However, @frustum’s result does surprise me, as M3 Pro is generally comparable to 6700XT in RT-accelerated production rendering. Yes, it’s a different type of workload and does not necessarily translate one to one, but it’s still odd. Is there a chance that your MetalRT backend could be improved?
 

frustum

macrumors member
Jun 16, 2021
32
12
Do we have positive confirmation that MetalRT actually uses Navi2 RT hardware?
When we last tested it, the driver didn't use the HW RT cores of the Radeon 6600. There have been no major AMD driver updates since the introduction of Apple silicon because there is no eGPU support.
 

frustum

macrumors member
Jun 16, 2021
32
12
I'm not sure there is much point in comparing different APIs like that.

The RX 6600 only gets 3,358 under Metal RT on macOS.

The M3 Pro is also 16% faster than the M2 Max.
There is no graphics API other than Metal on macOS/iOS. Any game that is new on macOS, or ported to it, must use the Metal API. We run the same rendering algorithms on all platforms and APIs, so the comparison is fair and everybody can see the actual HW and driver performance.
 

frustum

macrumors member
Jun 16, 2021
32
12
Of course there is. We care about achievable performance, don’t we? Driver quality and API are parts of the equation.

However, @frustum’s result does surprise me, as M3 Pro is generally comparable to 6700XT in RT-accelerated production rendering. Yes, it’s a different type of workload and does not necessarily translate one to one, but it’s still odd. Is there a chance that your MetalRT backend could be improved?
RT performance depends on two factors: how fast the driver can build a TLAS and the actual speed of RT cores. We cannot improve it on the application side. GravityMark uses 3 convergent rays per pixel (one for the viewport and two for shadow).

6700XT is 3 times faster than M1 Pro on RT, but at the cost of 168 W.
3060Ti is 5 times faster with around the same power consumption.
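
To put the three-rays-per-pixel figure in perspective, here is a rough back-of-envelope sketch of the ray rate it implies. Only the 3 rays per pixel (one primary, two shadow) comes from the post above; the 2560x1440 resolution and the frame rates are assumptions picked for illustration, not benchmark settings or results.

```python
# Back-of-envelope ray budget. Only RAYS_PER_PIXEL comes from the post;
# the resolution and frame rates below are illustrative assumptions.
RAYS_PER_PIXEL = 3  # one primary ray plus two shadow rays


def rays_per_frame(width: int, height: int, rays_per_pixel: int = RAYS_PER_PIXEL) -> int:
    """Number of rays traced for one frame at the given resolution."""
    return width * height * rays_per_pixel


def rays_per_second(width: int, height: int, fps: float) -> float:
    """Sustained ray rate needed to hold `fps` frames per second."""
    return rays_per_frame(width, height) * fps


if __name__ == "__main__":
    width, height = 2560, 1440  # assumed resolution, not taken from the benchmark
    for fps in (10, 30, 60):    # hypothetical frame rates
        grays = rays_per_second(width, height, fps) / 1e9
        print(f"{fps:>3} fps -> {grays:.2f} Grays/s")
```

This counts traced rays only. The per-frame TLAS rebuild is a separate cost on top of it and is not modeled here.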

 

leman

macrumors Core
Oct 14, 2008
19,517
19,664
RT performance depends on two factors: how fast the driver can build a TLAS and the actual speed of RT cores. We cannot improve it on the application side. GravityMark uses 3 convergent rays per pixel (one for the viewport and two for shadow).

6700XT is 3 times faster than M1 Pro on RT, but at the cost of 168 W.
3060Ti is 5 times faster with around the same power consumption.


Where exactly is the bottleneck - tree construction or traversal? What does the profiler say?

As we have it right now, pretty much any other popular benchmark out there (compute, raster, production RT) places M3 Pro as 40-50% faster than 6500XT, so your test is the odd one out. I’m not saying that it’s invalid - your use case might be hitting a slow path in the hardware etc - but we do have a discrepancy and it would be interesting to analyze it further.
 

diamond.g

macrumors G4
Mar 20, 2007
11,435
2,659
OBX
Where exactly is the bottleneck - tree construction or traversal? What does the profiler say?

As we have it right now, pretty much any other popular benchmark out there (compute, raster, production RT) places M3 Pro as 40-50% faster than 6500XT, so your test is the odd one out. I’m not saying that it’s invalid - your use case might be hitting a slow path in the hardware etc - but we do have a discrepancy and it would be interesting to analyze it further.
If traversal is hardware accelerated, how would there be a slow path (let's ignore BVH building for the moment)?
 

frustum

macrumors member
Jun 16, 2021
32
12
Where exactly is the bottleneck - tree construction or traversal? What does the profiler say?

As we have it right now, pretty much any other popular benchmark out there (compute, raster, production RT) places M3 Pro as 40-50% faster than 6500XT, so your test is the odd one out. I’m not saying that it’s invalid - your use case might be hitting a slow path in the hardware etc - but we do have a discrepancy and it would be interesting to analyze it further.
This rendering flow is the same for all platforms and APIs:
  • A compute shader generates the asteroid transformations (the same for all rendering modes).
  • A TLAS is built from the resulting transformation buffer.
  • Primary and shadow rays are traced.
If the driver cannot efficiently build TLAS, it's a problem. TLAS is a dynamic structure.
If the HW cannot trace 3 rays per pixel, it is a problem.
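
A minimal sketch of how the three steps above could be timed separately to see which one dominates a frame. The stage functions are hypothetical stand-ins that only sleep; they are not GravityMark or Metal code, and the sleep durations are arbitrary, so they say nothing about where the real bottleneck is. On actual hardware this would need GPU timestamps or a GPU profiler rather than CPU wall-clock time.

```python
import time
from typing import Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[], None]]


def time_stages(stages: List[Stage]) -> Dict[str, float]:
    """Run each stage once, in order, and return its wall-clock time in seconds."""
    timings: Dict[str, float] = {}
    for name, run in stages:
        start = time.perf_counter()
        run()
        timings[name] = time.perf_counter() - start
    return timings


def bottleneck(timings: Dict[str, float]) -> str:
    """Name of the stage that took the longest."""
    return max(timings, key=timings.get)


# Hypothetical stand-ins for the real GPU work; the durations are arbitrary.
def generate_transforms() -> None:
    time.sleep(0.001)


def build_tlas() -> None:
    time.sleep(0.02)


def trace_rays() -> None:
    time.sleep(0.005)


FRAME_STAGES: List[Stage] = [
    ("generate transforms", generate_transforms),
    ("build TLAS", build_tlas),
    ("trace rays", trace_rays),
]

if __name__ == "__main__":
    timings = time_stages(FRAME_STAGES)
    for name, seconds in timings.items():
        print(f"{name:<20} {seconds * 1000:.2f} ms")
    print("slowest stage:", bottleneck(timings))
```

Splitting the frame this way is what separates a driver-side problem (slow TLAS build) from a hardware-side one (slow traversal), which is the question being asked in the thread.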
 

JordanNZ

macrumors 6502a
Apr 29, 2004
779
290
Auckland, New Zealand
This rendering flow is the same for all platforms and APIs:
  • A compute shader generates the asteroid transformations (the same for all rendering modes).
  • A TLAS is built from the resulting transformation buffer.
  • Primary and shadow rays are traced.
If the driver cannot efficiently build TLAS, it's a problem. TLAS is a dynamic structure.
If the HW cannot trace 3 rays per pixel, it is a problem.
Have you profiled using the tools in Xcode? I’m curious what the bottleneck is on the M3 systems.
 

name99

macrumors 68020
Jun 21, 2004
2,407
2,309
M3 Pro RT test:


And, unfortunately, it's behind AMD Radeon 6500 XT:


iPhone 15 Pro Max cannot run RT benchmark in 2K/200K mode (not enough memory):

I have no idea what this tests, but the Apple results are much more "stable".
They don't go as high, but there are also no stretches under 10 fps...

I think we need to know a lot more about what this benchmark actually claims to be doing before having any opinion.
 