I wonder how Apple Silicon compares to AWS Graviton3? M1 Ultra has way more memory bandwidth, but Graviton3 has more cores. Both are manufactured on the TSMC 5 nm process.
Do the Ultra's CPU cores really have way more bandwidth? Apple throttles the bandwidth available to the CPU cores so that the GPU, NPU and ProRes decoders don't get starved of memory throughput.
The M1 Max CPU bandwidth tops out at around 240 GB/s.
www.anandtech.com
The Graviton doesn't have the problem of a GPU sharing the same memory bus, and it tops out at 300 GB/s (with always-on memory encryption enabled).
www.nextplatform.com
The Graviton has 32 PCIe 5.0 lanes to connect any third-party GPU you want to add to the mix. The Ultra has something less than x8 PCIe 4.0.
Graviton 3 runs at about 100 W, so it is lower power than the M1 Ultra too. More CPU cores, more CPU core bandwidth, lower power. Amazon made different design choices and got a different outcome. What they do not have is a good 'single user at a time' SoC.
From the same NextPlatform article:
"...
According to the report in SemiAnalysis, which is presumably based on a presentation given at re:Invent 2021, the 64 cores on the Graviton3 chip are on one chiplet, and two PCI-Express 5.0 controllers have a chiplet each and four DDR5 memory controllers have a chiplet each, for a total of seven chiplets. These are linked together using a 55 micron microbump technology, and the Graviton3 package is soldered directly to a motherboard rather than put into a socket. ..."
There are 8 DDR5 memory channels, so you have to fully populate all the DIMM slots to get max memory throughput. But when you do, you likely end up with a higher memory capacity than the M1 Ultra (if you don't go cheap on DIMM capacity). AWS allows provisioning 1TB instances on one of these processors.
Amazon also has the Nitro DPU, which would likely badly beat any DPU you could remotely try to connect to an M1 Ultra box.
If the M1 Ultra can leverage an app that pulls some computation into the AMX/NPU/GPU cores, then it probably has some traction. But running 1000s of Apache images? Apple doesn't have much. Apple wasn't trying to build something for that space either. But the point is, why should they even bother? Graviton 3 and the next-gen Ampere on 5nm will be deployed widely in 2022. If Apple needed Arm-powered web services, they could just rent or buy an Arm server chip that is already better than what they get with the top-of-the-line M1 offering.
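As a rough sanity check on the 300 GB/s figure and the "populate every channel" point, here is a minimal back-of-the-envelope sketch. It assumes DDR5-4800 DIMMs and a 64-bit (8-byte) data path per channel; neither number comes from the article quoted above, and the function name is just illustrative. It gives a theoretical peak, not a measured result.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    Assumption: each channel moves `bus_bytes` bytes per transfer
    (8 bytes = a 64-bit data path, ECC bits not counted).
    """
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9


# Assumed DDR5-4800 speed grade; 8 channels as stated in the post.
all_channels = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=4800)

# Same DIMMs, but only half of the channels populated.
half_channels = peak_bandwidth_gbs(channels=4, mega_transfers_per_s=4800)

print(f"8 channels: {all_channels:.1f} GB/s")
print(f"4 channels: {half_channels:.1f} GB/s")
```

Under those assumptions the peak lands a little above 300 GB/s with all eight channels in use and drops to half of that with four, which is why leaving slots empty costs throughput as well as capacity.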
That isn't likely to change in 2023, when Apple has M2/M3 and the other folks have also iterated. The x86_64 folks are not in "Rip van Winkle" mode either (Epyc 5/Genoa, Xeon SP Gen 4). (On an easily multithreaded, CPU-focused, scalable workstation workload, the Threadripper Pro 5995WX smokes the M1 Ultra too. It is probably no accident that AMD wasn't 'scared' to release that one the same day Apple showed their cards.)