That said, I would stop at 16 CPU cores. At higher clock speeds those extra cores are a liability that isn't worth the power they burn. You can always get some bang for your buck by increasing the GPU core count, but boosting the CPU core count is a game of diminishing returns.
Let's assume the GPU cores are now clocked higher than what TSMC's 5nm process comfortably allows and consume 1W each. And in taking the Firestorm cores from 3GHz to 3.5GHz, they now consume an egregious 4W each (this might be too generous, but I don't feel like doing the math right now).
64W - 16 Firestorm Cores
64W - 64 Apple GPU Cores
20W - 4x 16GB HBM2 (64GB)
---
148W
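For anyone who wants to play with the numbers, here is a rough sketch of the same tally in Python. The per-unit wattages are just the guesses from this post (4W per Firestorm core, 1W per GPU core, 20W across four HBM2 stacks, so 5W per stack), not measured figures, and the function and constant names are made up for illustration.

```python
# Per-unit power figures are the post's guesses, not measured values.
FIRESTORM_W = 4.0    # assumed watts per Firestorm core at 3.5GHz
GPU_CORE_W = 1.0     # assumed watts per Apple GPU core
HBM2_STACK_W = 5.0   # assumed: 20W spread over 4 stacks of 16GB


def apu_power_budget(cpu_cores, gpu_cores, hbm_stacks):
    """Return a per-component power estimate in watts, plus the total."""
    parts = {
        "cpu": cpu_cores * FIRESTORM_W,
        "gpu": gpu_cores * GPU_CORE_W,
        "hbm": hbm_stacks * HBM2_STACK_W,
    }
    # Sum the three components before adding the "total" key itself.
    parts["total"] = sum(parts.values())
    return parts


budget = apu_power_budget(cpu_cores=16, gpu_cores=64, hbm_stacks=4)
for name, watts in budget.items():
    print(f"{name}: {watts:.0f}W")
```

Changing the core counts or the per-unit constants shows how quickly the total moves; the model is purely linear, so it ignores things like shared fabric, I/O, and leakage.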
The APU might still melt. The CPU cores are hot and very close together, and they are surrounded by GPU cores and HBM stacks that are also hot and very close together. But it will maybe be OK. Its biggest problem is the slew of 5GHz x86 processors that will run circles around it. Unified high-speed memory will help with that a little.
I would think the Mac Pro line-up would want more than 16 cores. The current Mac Pro goes up to 28 cores and has hyperthreading; the iMac Pro goes up to 18 cores and also has hyperthreading.
Many have said the current Apple arm64-based APUs do not have hyperthreading. If that stays the norm, wouldn't a higher P core count be needed to make up for the lack of it?
As far as cooling a 148W APU, two things:
1 - A larger die (Threadripper-sized package), so things are not so 'crowded'
2 - A large heat sink like the one in the current Mac Pro should cool any of the wattages I outlined earlier with no problem