
crazy dave

macrumors 65816
Sep 9, 2010
1,450
1,219
A quick comment on this: if I remember correctly, most int operations on M3 execute at half the rate of FP arithmetic. You can, however, execute FP and int operations simultaneously (similar to Nvidia), which can improve performance. Apple also offers a native 24-bit int MAC (multiply-accumulate), which might be faster. Division and modulo operations are very slow, as they are implemented via injected subroutines. Bit operations, on the other hand, are fast (Apple GPUs have native bit-sequence extract and popcount instructions).
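
To make the division/modulo vs. bit-operation point concrete, here is a rough sketch of the usual rewrite: when the divisor is a power of two, a mask and a shift can stand in for modulo and division. It is written as plain C++ rather than as a GPU kernel, all helper names (`mod_pow2`, `div_pow2`, `extract_bits`, `popcount32`) are made up for illustration, and it says nothing about actual timings on Apple hardware.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; plain C++, not a GPU kernel.

// True when n is a nonzero power of two.
inline bool is_pow2(std::uint32_t n) { return n != 0 && (n & (n - 1)) == 0; }

// x % n for a power-of-two n, using a mask instead of a modulo.
inline std::uint32_t mod_pow2(std::uint32_t x, std::uint32_t n) {
    assert(is_pow2(n));
    return x & (n - 1);
}

// x / (1 << shift), using a shift instead of a division.
inline std::uint32_t div_pow2(std::uint32_t x, unsigned shift) {
    assert(shift < 32);
    return x >> shift;
}

// Extract `count` bits starting at bit `offset` (bit 0 is least significant).
inline std::uint32_t extract_bits(std::uint32_t x, unsigned offset, unsigned count) {
    assert(offset < 32 && count >= 1 && offset + count <= 32);
    // Avoid the undefined 32-bit shift when the full word is requested.
    const std::uint32_t mask = (count == 32) ? 0xFFFFFFFFu : ((1u << count) - 1u);
    return (x >> offset) & mask;
}

// Portable population count: clears the lowest set bit once per iteration.
inline unsigned popcount32(std::uint32_t x) {
    unsigned c = 0;
    while (x) { x &= x - 1; ++c; }
    return c;
}
```

For example, `mod_pow2(37, 8)` and `div_pow2(37, 3)` give the same results as `37 % 8` and `37 / 8`. Whether this kind of rewrite pays off on a given GPU is something to measure rather than assume.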

I am currently rewriting my benchmarking code for Apple GPU and hope to publish a comprehensive overview in the next couple of months.

Looking forward to it! :)
 