Technically "among the first" ARM desktop was Acorn Archimedes, circa 1987.
The more modern ones would be Microsoft Surface RT, circa 2012.
You left off the Galaxy Book S (2018), the Lenovo Yoga C630 (2019), and the Surface Pro X (2020...
There are some ARM machines out there but most are power-hungry servers.
Actually, ARM is used in servers (e.g. Amazon Graviton/Neoverse) and in supercomputers due to its efficiency, because it offers the same (or more, depending on the design) performance for fewer watts. In supercomputers, power consumption and heat dissipation & cooling are as important as, if not more important than, the hardware itself; it's a massive cost to cool all of those machines, so a big part of the budget goes there.
Now, if you meant that server consumption > consumer computer consumption, then yes, that's true indeed.
Yes, definitely, that is the most important topic on which we have no data. Anandtech's tests of the A12 show that power consumption rises rapidly towards the end of the frequency range; extrapolating the curve further suggests that the chip doesn't have much room to grow. Of course, a lot could have changed in two generations and one node shrink. Let's see what the A14 can do.
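(For a bit of intuition on why that curve steepens: dynamic CPU power roughly follows P ≈ C·V²·f, and the supply voltage itself has to rise with frequency near the top of the range, so the last few hundred MHz are disproportionately expensive. A minimal sketch with made-up operating points, not Anandtech's actual measurements:)

```c
/* Illustrative only: dynamic power scales roughly as P = C * V^2 * f,
 * and V has to rise with f near the top of the curve, so power grows
 * much faster than linearly. Operating points below are made up. */
#include <stdio.h>

int main(void) {
    const double freq_ghz[] = {1.0, 2.0, 2.5, 3.0};     /* hypothetical clocks    */
    const double volt[]     = {0.60, 0.75, 0.90, 1.10}; /* hypothetical voltages  */
    const double C = 2.0;                                /* arbitrary scale factor */

    for (int i = 0; i < 4; i++) {
        double watts = C * volt[i] * volt[i] * freq_ghz[i];
        printf("%.1f GHz @ %.2f V -> ~%.2f W\n", freq_ghz[i], volt[i], watts);
    }
    return 0;   /* 3x the clock ends up costing ~10x the power here */
}
```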
Doesn't matter. As old as I am, I'll never be old enough to need one, I'll be dead before then. Given the world and the younger generations, I'm glad I've selected my own use-by date.
Sure, I was just responding to the general idea that ARM efficiency is inherently >> x86 efficiency at higher power scales. As I acknowledged in my subsequent post to Lehman, we don't know what the scaling will be for Apple's design. And as I acknowledged in my earlier post, industry has less experience with ARM ISAs at higher power levels, so there's probably more room for optimization there.

Now notice that the ARM being mentioned is not Cortex but rather a custom core design by Fujitsu.
TOP500 regularly publishes rankings of the 500 fastest supercomputers. In 2013, they added a new list—the Green 500—which ranks the 500 most energy-efficient supercomputers, based on GFlops/watt.
If you look at who's in the top 10 of the most recent Green 500 list (published June 2020), you'll see that it includes machines built with all the major CPU architectures: Intel Xeon, AMD EPYC, IBM Power 9, and Fujitsu ARM (A64FX). This suggests to me that, at least thus far, at the higher power scale, and in this application, no one architecture is particularly more efficient than the others.
Thanks for pointing that out. So it seems we are left with two possibilities:

That performance per watt is not directly related to CPU efficiency. Most if not all of them also have custom accelerators or GPUs that contribute to their performance results.
It says at the top of the link you posted:
The most energy-efficient system on the Green500 is the MN-3, based on a new server from Preferred Networks. It achieved a record 21.1 gigaflops/watt during its 1.62 petaflops performance run. The system derives its superior power efficiency from the MN-Core chip, an accelerator optimized for matrix arithmetic. It is ranked number 395 in the TOP500 list.
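(Just to connect those two figures: 1.62 petaflops at 21.1 gigaflops/watt implies a draw of roughly 77 kW during the run. A quick back-of-the-envelope check using only the numbers quoted above:)

```c
/* Back-of-the-envelope check on the quoted MN-3 figures:
 * implied power = sustained FLOPS / (GFLOPS per watt). */
#include <stdio.h>

int main(void) {
    const double petaflops       = 1.62;   /* quoted Linpack run         */
    const double gflops_per_watt = 21.1;   /* quoted Green500 efficiency */

    double gflops = petaflops * 1.0e6;          /* 1 PFLOPS = 1e6 GFLOPS */
    double watts  = gflops / gflops_per_watt;   /* implied power draw    */

    printf("Implied draw: ~%.1f kW\n", watts / 1000.0);   /* ~76.8 kW */
    return 0;
}
```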
I don't think you are doing justice to Intel and AMD here. Intel's CPUs also contain a fast GPU, a built-in AI accelerator, a wide vector unit, matrix operations, encryption, a video encoder, I/O controllers, an integrated WiFi controller, etc... Apple SoCs might contain more specialized processors and their ML accelerators are much faster, but fundamentally, the principal differences between these systems are minor.
And all these custom systems have little to do with the CPU itself. A system might have the best ML acceleration in the world, but its utility is going to be limited if it struggles with basic tasks. Apple CPUs are custom-designed, sophisticated devices that offer very high performance at very low power draw. This is their key advantage for an average user (be it a home user or a professional).
As you can see from the reply I just posted to Woochoo, thus far ARM ISAs seem to lose a lot of their power efficiency advantage to x86 at the higher end of the power scale. Maybe Apple can do a better job.
Though it should also be noted that the clocks needed to compete with the Intel Core i9's are even higher than those needed to compete with the Intel Xeons.
It's because performance has little to do with the ISA; it's up to the CPU architecture.
It certainly makes sense that some ISAs would be naturally better suited to certain tasks than others. And that could certainly explain the active interest in exploring ISAs. But as to the question at hand—namely whether one ISA is inherently superior [in efficiency] to another (x86 vs ARM)—my take-away at this point (based on the difference between what you wrote, and what's written in those papers) is that this is not yet a settled question in the field.
Interestingly, as you'll probably recall (since you gave me a thumbs-up for it), this is precisely the position I took earlier in this thread (https://forums.macrumors.com/thread...or.2242787/page-3?post=28689244#post-28689244), as well as in my subsequent discussion about this with cmaier (https://forums.macrumors.com/thread...rm.2248115/page-5?post=28733170#post-28733170), who worked on chip design for AMD. cmaier argued strongly that the ARM ISA is inherently more efficient than x86.
Well, cmaier certainly has more expertise; he actually works with the stuff.
My take on this is that it's probably much easier to make a low-power ARM CPU, since the basic building blocks can be made much simpler. But again, once you get into the high-performance domain, you need to design a superscalar CPU with instruction reordering, dependency tracking, register renaming, write coalescing, wide vector units, complex cache hierarchies... and that is an entirely different thing. It is still a mystery to me how Apple manages to reach these performance levels with only 5 watts of actively consumed power...
On another hand, I also believe that ARMv8 is a better ISA design than Intel's monstrosity of x86... It's more symmetrical, more logical, and has less irregularity.
I'm also skeptical that you really need wide vector units. They presumably reduce decoding overhead, but Intel is the only one that went really wide, and they introduce a lot of weirdness in the process.
They still use 128 bit lanes, so shuffles which move data across a 128 bit boundary have different latencies. They also keep incrementally adding instruction versions with memory addresses as the third operand, so that shuffle and arithmetic ops can be issued from a port that normally handles loads.
I have no idea how Intel hasn't driven their compiler team mad. I just imagine some lord of the flies scenario there, partly because it amuses me.
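(To make the 128-bit-lane point concrete, here's a minimal AVX2 sketch. The in-lane shuffle reorders data within each 128-bit half, while the permute moves data across the halves; on the Intel cores I'm aware of, the lane-crossing form has noticeably higher latency, roughly 3 cycles vs 1, though the exact numbers vary by microarchitecture:)

```c
/* Sketch of the 128-bit lane issue in AVX/AVX2 (illustrative, not a benchmark).
 * Build with something like: gcc -mavx2 lanes.c */
#include <immintrin.h>
#include <stdio.h>

int main(void) {
    __m256 v = _mm256_setr_ps(0, 1, 2, 3, 4, 5, 6, 7);

    /* In-lane shuffle: reverses elements within each 128-bit half
     * independently; typically 1-cycle latency. */
    __m256 in_lane = _mm256_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));

    /* Lane-crossing permute: swaps the two 128-bit halves; typically
     * ~3-cycle latency because data crosses the lane boundary. */
    __m256 cross_lane = _mm256_permute2f128_ps(v, v, 0x01);

    float a[8], b[8];
    _mm256_storeu_ps(a, in_lane);
    _mm256_storeu_ps(b, cross_lane);

    printf("in-lane:    %g %g %g %g | %g %g %g %g\n",
           a[0], a[1], a[2], a[3], a[4], a[5], a[6], a[7]);
    printf("cross-lane: %g %g %g %g | %g %g %g %g\n",
           b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7]);
    return 0;
}
```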
Wide units can be beneficial for HPC, if your workloads are inherently SIMD-compatible. Fujitsu also went with 512-bit vector units for their A64FX, for example (which is where most of the chip's raw FLOPS performance comes from).
I completely agree however that wide vector units are a waste on a general-purpose machine, especially when you consider the ISA extension fragmentation. Four separate 128-bit ALUs will almost always outperform a single 512-bit one. Apple currently has three 128-bit ALUs. Personally, I would much prefer if they just added an additional one and implemented SVE/SVE2 to allow the ALUs to be pooled together as needed. Although I can imagine that scheduling will be a challenge at this point.
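(For what it's worth, the appeal of SVE/SVE2 here is that the code is vector-length agnostic: the same loop runs unchanged whether the hardware exposes 128-, 256-, or 512-bit vectors, so a design is free to gang narrower ALUs together behind the scenes. A minimal sketch with the Arm C Language Extensions intrinsics, assuming an SVE-capable toolchain; the function name is just for illustration:)

```c
/* Vector-length-agnostic add using SVE intrinsics (ACLE). The same code
 * works whether the hardware implements 128-, 256-, or 512-bit vectors.
 * Build with something like: gcc -march=armv8-a+sve sve_add.c */
#include <arm_sve.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, just for illustration. */
static void add_arrays(const float *a, const float *b, float *out, int64_t n) {
    for (int64_t i = 0; i < n; i += svcntw()) {  /* step = 32-bit lanes per vector */
        svbool_t pg = svwhilelt_b32(i, n);       /* predicate masks the tail       */
        svfloat32_t va = svld1_f32(pg, a + i);
        svfloat32_t vb = svld1_f32(pg, b + i);
        svst1_f32(pg, out + i, svadd_f32_x(pg, va, vb));
    }
}

int main(void) {
    float a[5] = {1, 2, 3, 4, 5}, b[5] = {10, 20, 30, 40, 50}, out[5];
    add_arrays(a, b, out, 5);
    for (int i = 0; i < 5; i++) printf("%g ", out[i]);   /* 11 22 33 44 55 */
    printf("\n(this machine's SVE vector width: %d bits)\n", (int)(svcntw() * 32));
    return 0;
}
```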
Yeah, it's a mess. Another problem is that using different SIMD extensions has a non-trivial impact on frequency and power consumption. There are also the warmup and transition delays... Intel's wide SIMD is still completely unsuitable for quick, bursty work. It is only worth it if you do thousands of cycles' worth of SIMD operations in a row. If you just need to add a bit of SIMD here and there in your code, AVX2 and above might actually end up slowing you down. I ran into this phenomenon while working on my game engine. That's a tremendous amount of silicon and performance potential wasted.
I know a little bit about processor technology and have read up on the x86 and ARM architectures, but I still do not really understand what makes ARM so superior to Intel or x86 technology in general that has people believing the new ARM Macs will be much better and faster than Intel-based Macs.
I understand that the advantages of ARM are power efficiency and the ability to have many more cores, but isn't Intel still better in raw power in multi-threaded operations?
Will ARM at first be a replacement for Intel's mobile processors, which are arguably already worse in many ways than an A12Z or A13, or will Apple also be able to create a processor that can beat i9 and even Xeon processors?
Can we really expect a "night and day" difference?
By the way - just yesterday, it was announced that the fastest supercomputer in the world is now ARM based. It uses ARM processors made by Fujitsu: https://www.arm.com/company/news/2020/06/powering-the-fastest-supercomputer
Fits perfectly with Apple's announcement.
Performance per watt is huge. They'll be able to produce a computer that's twice as powerful while consuming a fraction of the power of your average Intel processor.
Bigger than performance though is a roadmap that's actually going somewhere. Intel's 7th through 10th generation processors are just retreads of its 6th generation processors from 2016. Apple's Silicon, on the other hand, has advanced tons since the A9, A9X, and A10 processors of 2016.
AWS does claim higher efficiencies for its ARM-based Graviton2 in the server space vs. Xeon designs, but here they're comparing 2.5 GHz Graviton2's with 2.9-3.2 GHz Xeons, and you'd expect higher-clocked chips to be less efficient, so this is not quite an apples-to-apples comparison.
Hi!
Something to remember is that Apple Silicon is only "ARM" in that it uses the ISA (with a lot of Apple add-ons). The microarchitecture is 100% Apple designed, and microarchitecture is a FAR more important determinant of performance. Apple Silicon is already more performant than anything else out there on a per-core basis. This is actually covered in other posts in threads on this site - Apple Silicon cores are much more closely related to Intel Conroe than Cortex or Neoverse. They are big, wide, short pipes with super accurate branch prediction and VERY advanced memory management and cache design.