None of this addresses memory requirements. It speaks to the performance of ARM over x64, which is not the topic of this discussion.

Here you are, right from the other thread:
Pipeline complexity (assuming zero cache misses):
x86: 5 + 9×N
ARM: 4×N
ARM can execute instructions without stalling to wait on condition checks (see the sketch below).
ARM requires a lot less shuffling of data through memory to move it around; with its larger register file, more of the working set stays in registers.
On x86 far more of it has to be staged in memory. Most of the code we had was moving data around; on ARM, not nearly as much.
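To make the condition-check point concrete, here is a minimal C sketch (not from our codebase): for a pattern like this, an AArch64 compiler will typically emit CSEL (conditional select) rather than a branch, so execution does not sit waiting for the condition to resolve.

/* Minimal sketch: a branch-free max.
 * On AArch64 this typically compiles to roughly:
 *     cmp  x0, x1
 *     csel x0, x0, x1, gt
 * i.e. a conditional select rather than a conditional branch. */
#include <stdint.h>

int64_t max64(int64_t a, int64_t b)
{
    return (a > b) ? a : b;
}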
An example:
PCAP traffic required us to write it all to memory as it was coming off the line, then flush it to the NVMe array. On ARM we can push data off the line right into the disk array using significantly less memory; I think we are around 12 GB total vs. roughly 200 GB before.
Inspecting that traffic can be done right from the array, instead of loading large chunks into memory to inspect.
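As a rough sketch of the design difference (hypothetical paths and buffer size, not our production code), the idea is to stream capture data through a small fixed buffer straight onto the array instead of accumulating the whole capture in RAM and flushing at the end:

/* Hypothetical sketch: stream capture data to the NVMe array through a
 * small fixed buffer instead of holding the whole capture in RAM. */
#include <stdio.h>
#include <stdlib.h>

#define CHUNK (4 * 1024 * 1024)   /* 4 MB working buffer */

int main(void)
{
    FILE *in  = fopen("capture.pcap", "rb");      /* stand-in for the capture source */
    FILE *out = fopen("/mnt/nvme/capture.pcap", "wb");
    unsigned char *buf = malloc(CHUNK);
    if (!in || !out || !buf) { perror("setup"); return 1; }

    size_t n;
    while ((n = fread(buf, 1, CHUNK, in)) > 0)    /* read a chunk off the line... */
        fwrite(buf, 1, n, out);                   /* ...and push it straight to the array */

    free(buf);
    fclose(in);
    fclose(out);
    return 0;
}

Inspection then works against the file on the array (e.g. via mmap) rather than pulling large chunks back into RAM.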
I would never argue that more memory is bad. My argument is about how ARM handles memory in the first place, how the code is designed, and how it works at a technical level. A developer in the Apple ecosystem does not get a complete picture of how it works, since Apple is doing a lot of this on their own. If you go down the road of building ARM applications and understanding how they work, it becomes a much clearer picture of why ARM is more efficient.
All of this can easily be verified in the AArch64 white papers.
I have been developing on Graviton for years now.
The M1 Macs have been out for approximately six months now. Are there no comparisons showing whether an 8 GB M1 system has comparable levels of memory paging to a 16 GB x64 system for data sets that require 16 GB of RAM?