The whole 'But wait! One that shows the M1 Ultra trading blows with a 3090!' line was largely tongue in cheek (hence the '!'), given how many random, largely spurious benchmark results we seem to end up discussing, but apparently it didn't read that way. Welcome to the internet, I guess.
In raw computational power, the 3090 should be at least 30-50% faster, possibly more. I assume that the bottleneck in this particular test is communication between the GPU work packages, where the M1 might have an edge thanks to its much larger cache.
I didn't think the 3090 was that much faster (unless it was using OptiX). It's been a while since I looked at any benchmarks though, so I could be way off base.
Would that mean that with larger datasets the performance difference would increase? I'm a bit unsure what you mean, but I'm assuming the larger available memory would mean larger work items and less transferring of data around?
I find it more surprising that a company decided to optimize their software using Metal first instead of CUDA/OptiX.
I was pretty surprised myself; kinda neat though. Tbh I largely posted it because I thought it was interesting, as I haven't seen this kind of thing being optimised for Metal (let alone Metal first).
Gimped by OpenCL. Nice try though.
Probably shouldn't dignify this low-effort troll with a response, but hey ho. There's a compelling case for OpenCL (at least historically). Off the top of my head:
- Vendor and platform agnostic.
- Tends to be more stable. Nvidia semi-regularly seem to release drivers that break things, which makes using them in production fun.
- Works on the GPU and CPU; this enables you to develop things fast locally, submit to the farm and get the same results (ish).
Curious to see how things go as we move away from OpenCL/GL.
No, it doesn't. OpenCL is not equivalent to Metal. In fact, Metal is what replaced OpenCL and OpenGL when Apple deprecated them in macOS. The fair comparison would be Metal to CUDA, not Metal to OpenCL.
Agree that the story would be different with CUDA, but I was curious to see the improvements in performance on the Mac between Metal and OpenCL. I guess that out-of-date OpenCL version was really hindering things. Besides, for my use case OpenCL is a far better comparison, since most of Houdini (outside of Karma GPU) uses OpenCL (I think the only things that use OptiX are the Vellum pressure constraint nodes).
I guess, comparing Apples to Apples (heh), the main takeaway would be that Nvidia offers terrible performance for OpenCL, given that Apple's massively out-of-date version (1.2 from 2013 iirc) gets pretty close to it.