Apple Silicon GPU performance

crazy dave · Oct 6, 2024

leman said:
A quick comment on this: if I remember correctly, most int operations execute at half the rate of fp arithmetic on M3. You can, however, execute fp and int operations simultaneously (similar to Nvidia), which can improve performance. Apple also offers a native 24-bit int multiply-accumulate (MAC), which might be faster. Division and modulo operations are very slow, as they are implemented via injected subroutines. Bit operations, on the other hand, are fast (Apple GPUs have native bit sequence extract and popcount instructions).

I am currently rewriting my benchmarking code for Apple GPU and hope to publish a comprehensive overview in the next couple of months.

Looking forward to it!
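
To make the quoted point about slow division/modulo versus fast bit operations concrete, here is a minimal CPU-side C++ sketch of the usual workaround when the divisor is a power of two: a shift replaces the divide and a mask replaces the modulo. It also shows what a bit-sequence extract and a popcount compute. This is not Metal code and not a benchmark; the helper names (div_pow2, mod_pow2, bitfield_extract, popcount32) are hypothetical, and whether the shader compiler already does this rewrite for constant divisors is not something the thread states.

```cpp
#include <cstdint>

// All helpers are illustrative (hypothetical names), unsigned-only,
// and assume shift < 32 and 0 < width < 32 with offset + width <= 32.

// x / 2^shift, expressed as a right shift instead of a division.
constexpr std::uint32_t div_pow2(std::uint32_t x, unsigned shift) {
    return x >> shift;
}

// x % 2^shift, expressed as a mask of the low `shift` bits.
constexpr std::uint32_t mod_pow2(std::uint32_t x, unsigned shift) {
    return x & ((std::uint32_t{1} << shift) - 1u);
}

// Extract `width` bits of x starting at bit `offset` (bit 0 = least significant).
constexpr std::uint32_t bitfield_extract(std::uint32_t x, unsigned offset,
                                         unsigned width) {
    return (x >> offset) & ((std::uint32_t{1} << width) - 1u);
}

// Count set bits by repeatedly clearing the lowest set bit.
constexpr unsigned popcount32(std::uint32_t x) {
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1u;  // clears the lowest set bit
        ++n;
    }
    return n;
}
```

For example, div_pow2(1003, 3) gives 125 and mod_pow2(1003, 3) gives 3, matching 1003 / 8 and 1003 % 8. The identities only hold for unsigned values and power-of-two divisors; a general divisor still needs a real division or a precomputed reciprocal.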

leman · Oct 6, 2024

crazy dave said:
Looking forward to it!

I know, I promised a while ago. Sorry for the delays, these months have been crazy for me.

crazy dave · Oct 7, 2024

leman said:
I know, I promised a while ago. Sorry for the delays, these months have been crazy for me.

No worries, I understand. Excited to see what you put out when you're ready though. Take your time.
