[Edit: I doubt Apple is going to dive deep into threading design as ARM isn't designed for that. So we can probably just assume they will scale core count and think of other ways to speed processes up, like their accelerators.]
Apple’s basic recipe for performance (or at least, part of it) seems to be going really wide. Their designs are ridiculously superscalar, with an estimated 13 execution ports (compare this to 8 ports on Skylake and 10 on Ice Lake) and tons of cache to feed them all. If they can scale the frequency of their CPUs, they will probably outperform Intel and AMD by 10-25% in single-threaded applications.
And as you say, it’s reasonable to assume that they can stack multiple cores together to scale the multi-core performance. Memory access will be an issue, but I suppose they have some sort of solution there, given how well their unified memory SoCs perform in an iPad. I wouldn’t be surprised to see some sort of stacked DRAM design with a very wide bus...
And the whole graphics thing. An iGPU can replace low-to-somewhat-midrange options. But how do you dethrone, say, a Navi flagship? At least Nvidia said late last year that their cards can run in conjunction with ARM.
If their GPU scales, they could easily build something that rivals a 2060-2070 under a 50 W TDP. Of course, this would probably mean using dedicated video memory (something that Apple never did before). Or they could again go for a stacked design with high-bandwidth RAM to feed both the CPU and the GPU (like consoles do). Video RAM is a hack to begin with and Apple’s GPUs need much less bandwidth because of their design... but at this point I’m probably too far in the realm of wishful thinking. I would love to have a fast TBDR GPU in a desktop; that would allow some really neat applications in games.
We do not know about retail Apple Silicon chips yet, but an existing ARM CPU, the 32-core Ampere eMAG, compares nicely with Intel's 28-core Xeon Gold 5120 and AMD's 24-core EPYC 7401P in benchmarks in terms of performance and power efficiency.
The Ampere eMAG was designed for servers, but a workstation recently came out that has it.
Ampere has an 80-core ARM CPU set to hit the market by the end of the year.
Ampere is very good in perf per $, but its raw performance is not too impressive... also, these are server workloads. The Xeon W in a Mac Pro, for example, needs a different performance profile.
Personally, I believe that for professional computing, single-threaded performance is still the way to go. Not all tasks are embarrassingly parallel. You want designs that maximize it while providing a large number of cores for scaling. Incidentally, I think that Apple (and ARM) could have an advantage here with the asymmetric CPU design. Say 4 high-performance cores and 32 high-efficiency ones. Kind of reminds me of the Cell.
Cell failed since it was too complex to program, but these days, its basic design principles could map very well to what the pro market needs.
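
To put a rough number on the "not all tasks are embarrassingly parallel" point, here is a quick sketch using Amdahl's law. This is only a back-of-the-envelope model: the function name and the parallel fractions and core counts below are made up for illustration, all cores are assumed identical (so it does not model a big/little split), and none of it comes from measurements of any real chip.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over a single core when only part of the work can run in parallel.

    parallel_fraction: share of the runtime that parallelizes (0.0 to 1.0).
    cores: number of identical cores (an assumption of this simple model).
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workloads: half parallel, mostly parallel, almost fully parallel.
    for p in (0.5, 0.9, 0.99):
        row = ", ".join(
            f"{n} cores: {amdahl_speedup(p, n):.2f}x" for n in (8, 32)
        )
        print(f"parallel fraction {p}: {row}")
```

For a task that is only half parallel, quadrupling the core count from 8 to 32 barely moves the result, while a faster core speeds up the serial part too. That is the case for keeping a few very fast cores next to many efficient ones.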