Err, a couple of years ago Apple told folks to throw away their 32-bit code. Apple "spends" developer dollars (other people's money) on a regular basis. They have asked folks to write for three GPU implementations before. It happens on Windows and Linux all the time. Yet the sky will fall if developers have to write more than one optimization pipeline alongside the M-series? Probably not. The ones for Intel and AMD are already there. That's the bigger issue Apple has: changing inertia, not that supporting multiple models was too hard.
You are confusing changes that are necessary (or at least motivated) with changes that are arbitrary. Removing 32-bit support was long overdue and enabled better hardware and software. Some rewriting of professional apps is required to take better advantage of Apple GPUs. And in the case of the Mac Pro, if Apple decides to implement a NUMA architecture, some API adoption will be required as well in order to take advantage of that aspect of the system.
Watch the WWDC session on optimizing image processing apps for Apple Silicon. It would be completely unreasonable for Apple to ask devs to implement these changes now, and then next year to say "oh, by the way, it was all a joke, now you have to change your apps back to work with a dGPU". One of the advantages of Apple Silicon is the streamlined programming model, and Apple would be complete fools if they didn't see it through. Now compare that with "our Mac Pro features multiple compute boards, so if you want to take advantage of all that extra processing power, please make this simple change to your app that will allow it to redistribute threads and compute kernels across multiple boards using the same API you were using before"; that's something else entirely.
A detached compute board doesn't match their security model at all.
The farther apart these compute cards get, the more latency will creep in, and Apple's wide pipelines will hiccup on data swaps.
I really don't see it that way. An explicit NUMA API would task the developer with planning those costs appropriately. Look at how multi-GPU is currently implemented in Metal for an idea.
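For reference, here's a minimal Swift sketch of what that explicit model looks like today: Metal exposes every GPU as a separate MTLDevice, and the app decides where work goes and when data moves between devices. The device-selection policy below is just an illustrative assumption, not anything Apple prescribes.

```swift
import Metal

// Enumerate every GPU Metal can see (macOS only). Each GPU is a separate
// MTLDevice; Metal never splits work across them automatically.
let devices = MTLCopyAllDevices()

for device in devices {
    print(device.name,
          "| unified memory:", device.hasUnifiedMemory,
          "| removable (eGPU):", device.isRemovable,
          "| peer group:", device.peerGroupID)
}

// Illustrative selection policy (an assumption, not an Apple recommendation):
// prefer a non-low-power GPU, fall back to the system default.
let chosen = devices.first { !$0.isLowPower } ?? MTLCreateSystemDefaultDevice()

if let gpu = chosen, let queue = gpu.makeCommandQueue() {
    // Queues, heaps and pipelines are created per device; moving data
    // between devices is an explicit copy the app has to schedule and pay for.
    print("Dispatching to:", gpu.name)
}
```

Nothing here is hidden or automatic, which is exactly why scaling across devices is a cost the developer has to plan for.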
There was a WWDC session on Friday where Apple was crowing about how great it is that their TensorFlow port can see and use a GPU (now, in 2021). Like folks weren't asking for that 2-4 years ago. That's the kind of competitive timeliness you get when Apple goes down the Apple-only hardware rabbit hole.
2-4 years ago they didn't have their own hardware on the Mac side. I hope that things will speed up from here. Besides, the APIs are there; everyone can implement a backend for their favorite ML framework now.
@cmaier, please download hashcat, crack some hashes, and get back to us. We eagerly await your findings.
Yes, yes, please try hashcat and let me know.
@leman, condescending isn't the word I'd use. You should also try hashcat and report your findings here, in public.
I am sorry, what exactly do you want us to try and report? I don't have a bunch of high-end GPUs and a multi-GPU workstation lying around. You are the one making outlandish claims about multi-GPU scaling; why don't you "report"?
640k ought to be enough for anybody!
There is no free lunch; everything is a tradeoff. If Apple can theoretically ship a single-GPU system that offers competitive performance against most multi-GPU workstations on the market, what else do you want?
I think the GPU cores in the M1 are dedicated. They are just located on the same die.
The industry-standard definition of a dedicated GPU is a GPU that is a separate device (that is, it has its own physical memory pool and connects to the rest of the system via an expansion bus). So any PCIe GPU with its own RAM = dedicated; the M1, Intel/AMD APUs, and modern gaming consoles = integrated.
Frankly, this definition is not helpful as it has nothing to do with performance.
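That said, the split is at least easy to query. A minimal sketch (assuming a macOS target with Metal available) that just reports the memory topology of the default GPU:

```swift
import Metal

// Minimal check (macOS + Metal assumed): the integrated/dedicated split is
// really about memory topology, not speed. An M1-class GPU reports a unified
// pool shared with the CPU; a discrete PCIe card with its own VRAM does not.
if let gpu = MTLCreateSystemDefaultDevice() {
    print(gpu.name)
    print("Unified memory:", gpu.hasUnifiedMemory)               // true on Apple Silicon
    print("Recommended working set:", gpu.recommendedMaxWorkingSetSize, "bytes")
}
```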