An addition to my last post with the link about HSA: Fiji and Tonga have a somewhat different architecture than the GPUs that were released before them.
Let's start from the beginning.
http://assets.hardwarezone.com/img/2013/10/radeon-r9-290-core-1.jpg Radeon R9 290X block diagram.
http://media.redgamingtech.com/rgt-website/2015/08/AMD-Radeon-R9-Nano-Fiji-GPU-Block-Hot-Chips.jpg
The differences in the upper part of the block diagrams are apparent. Now, read this post about it:
http://forums.anandtech.com/showpost.php?p=37669975&postcount=14
Here is the thread about it:
http://forums.anandtech.com/showthread.php?t=2444961
Here is what started the thread:
http://ir.amd.com/phoenix.zhtml?c=74093&p=RssLanding&cat=news&id=2083146
AMD has opened up a new world of possibilities for how the GPU can be used.
A bit of a personal note. Everybody who knows me and reads my posts on the Polish Apple forum, myapple.pl, can say that last year I was a gigantic fan of the Maxwell architecture. I was staggered by the benchmarks and by what people said about Maxwell. I was hoping we would see GM204 GPUs in the "Trash can". I was hoping we would see a "trash can-like" Mac Mini, with a quad-core E3 Xeon CPU and a GM107 or something like it. But then Apple decided to ditch Nvidia from the lineup completely, with the iMac 5K. I wondered why, and started digging. I saw that the real-world OpenCL performance of Nvidia GPUs is rubbish, and that the R9 280X is 3 times faster than the GTX 970 in Final Cut Pro X while using OpenCL. Then I started analyzing the architectures of AMD and Nvidia hardware, and the differences became apparent.
Everything that benefits Nvidia comes from software. Drivers, CUDA, even virtualization is done in software. AMD is the other way around: what benefits them is the hardware. Asynchronous compute is the pinnacle here. The problem for AMD was that, before the low-level API world, software was not able to extract all of the capabilities of their hardware. The GPUs would stall in some parts and not be fully utilized. Times have changed in this regard.
Apple is pushing for the HSA Foundation instead of CUDA. Apple is pushing OpenCL as the go-to solution. The benefits of it are clear, as you can see in the WWCF article. And you may also know now why the Mac Pro has a 450W PSU and dual, undervolted, downclocked GPUs that are wide enough. You now know why it is AMD instead of Nvidia: Asynchronous Compute/HWS. It is time to end the arguments here.
And focus on the technical stuff.