If you compare the 3090 to the A6000, they have basically similar performance, but the 3090 uses GDDR6X while the A6000 uses plain GDDR6. The "X" version is faster, but it runs hot and overheats easily. The non-X RAM doesn't have the same problem, even with much simpler coolers.
Modern GPUs live and die by memory bandwidth (this is especially true for Nvidia; AMD has advanced cache technology to help them out). If you have thousands of processing units to feed, you'd better be able to fetch and store data quickly enough to keep them occupied. GDDR is very fast, but it pays for it by running hot. HBM needs less power for the same bandwidth, but it's too expensive for a mainstream gaming card.
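Rough numbers, if you want to feel the scale (a back-of-envelope Python sketch; the ALU count, clock, and bytes-per-FLOP figures are my own ballpark assumptions, not specs for any particular card):

```python
# Back-of-envelope: how much bandwidth does it take to keep a big GPU busy?
# All figures below are rough assumptions for illustration, not vendor specs.

alu_count = 10_000                 # shader ALUs, order of magnitude for a big GPU
clock_hz = 1.7e9                   # ~1.7 GHz boost clock (assumed)
flops = alu_count * clock_hz * 2   # 2 ops per ALU per cycle (fused multiply-add)

bytes_per_flop = 0.1               # assume ~0.1 bytes of DRAM traffic per FLOP
required_bw = flops * bytes_per_flop  # bytes/second

print(f"Compute: {flops / 1e12:.0f} TFLOP/s")
print(f"DRAM bandwidth needed at {bytes_per_flop} B/FLOP: {required_bw / 1e9:.0f} GB/s")
# Even at a tenth of a byte per FLOP you land in the multi-TB/s range,
# which is why caches and compression matter so much.
```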
I only mention this because the RAM Apple is using for the M1's GPU might be limited by heat issues more than anything else.
Nah, Apple uses custom versions of RAM already designed for low-power operation. The M1's RAM draws less than 2 watts under load (and most of the time less than half a watt), which is crazy low. I'm sure their prosumer chips will use similarly efficient RAM tech. Of course, the fact that Apple has huge caches (their latest phone chip has more last-level GPU cache than the entire 3090) plus bandwidth-saving rendering technology helps a lot.
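As a quick sanity check on that 2 W figure (taking ~68 GB/s as an assumed peak bandwidth for the M1's LPDDR4X; these numbers are illustrative, not measurements):

```python
# Sanity check on the "under 2 W" claim, using an assumed ~68 GB/s of
# LPDDR4X bandwidth for the M1 (assumption, not a measured figure).

power_w = 2.0              # upper bound on DRAM power under load (from the post above)
bandwidth_bytes = 68e9     # ~68 GB/s, assumed peak bandwidth
bits_per_second = bandwidth_bytes * 8

energy_per_bit_pj = power_w / bits_per_second * 1e12
print(f"~{energy_per_bit_pj:.1f} pJ per bit transferred")
# A few pJ/bit is very low; desktop GDDR interfaces are generally quoted well above this.
```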
Why don't GPUs use similar RAM, I hear you ask? Because pairing low-power RAM with a very wide memory bus would be crazy expensive and would make for a very large board. One could ask: wait, can't one do something smart to cut those drawbacks down? One can indeed! That's what HBM is.
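To put a rough number on "crazy expensive": here's a hypothetical pin-count comparison for hitting 3090-class bandwidth with low-power RAM vs GDDR6X (the per-pin rates are assumptions on my part):

```python
# Rough pin-count math: what it takes to hit GPU-class bandwidth with
# low-power DRAM on a conventional board. Per-pin rates are assumptions.

target_bw_gbps = 936     # ~GB/s, in the ballpark of a GDDR6X flagship
lpddr5_pin_rate = 6.4    # Gb/s per data pin (LPDDR5-6400, assumed)
gddr6x_pin_rate = 19.5   # Gb/s per data pin (assumed)

def data_pins_needed(target_gb_per_s, pin_rate_gb_per_s):
    """Data pins required to reach a target bandwidth at a given per-pin rate."""
    return target_gb_per_s * 8 / pin_rate_gb_per_s

print(f"LPDDR5-style bus: ~{data_pins_needed(target_bw_gbps, lpddr5_pin_rate):.0f} data pins")
print(f"GDDR6X-style bus: ~{data_pins_needed(target_bw_gbps, gddr6x_pin_rate):.0f} data pins")
# ~1170 slow-but-efficient pins vs ~384 fast ones: routing a thousand-plus
# traces on a normal PCB is where the cost and board size blow up.
# HBM sidesteps this by putting a 1024-bit-wide bus per stack on a silicon
# interposer, so the width lives in silicon instead of on the board.
```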