What makes ARM superior?

dogslobber · Jul 16, 2020

jdb8167 said:
Why? Apple isn't going to market their SoCs. They are going to sell ASi Macs as complete units just like they do for everything they sell today. You have to dig to find out what Intel CPU they are using for different SKUs. When you buy a MacBook Pro, Apple doesn't tell you how many PCIe lanes it has, just that it supports 2-4 Thunderbolt 3 ports. ASi Macs will be the same.

Market segmentation. Every Apple machine falls into an Intel-defined market segment. If Apple doesn’t publish the tech specs then people will reverse engineer the silicon. One company that will do that is Intel.

MikhailT · Jul 16, 2020

Apple Silicon != ARM != CPU. It's not about superiority, that's not what Apple is after. They want total integration between hardware and software and they want to be able to customize the SoC. Intel doesn't do this. AMD may provide semi-custom designs but what about the rest of the SoC stuff? AMD doesn't offer everything that Apple wants.

This image should explain on its own why Apple wants to switch to their own SoC.

It also comes with Apple's custom TBDR GPU, memory controller, custom CPU controller to manage high-performance / high-efficiency CPUs (all at once), ML engines, Secure Enclave, fast storage controller, etc.

Apple doesn't want to wait on their hardware suppliers to do this; they want full vertical integration so that they can write and optimize their software for it as well. They also wouldn't want to share this with others, so they'd own all the patents on whatever they customize.

Kostask · Jul 16, 2020

The ARM CPU in Apple Silicon is only part of the overall SoC. Apple has added a lot of "accelerators", or modules that have dedicated functions: the Machine Learning accelerator, the camera modules (for the front and rear cameras, as well as for Face ID), GPUs, network controllers (managed by one of the efficiency cores), HEVC accelerators, touch screen controllers, and the equivalent of the T1/T2 security chip. To get all this to work properly, it has no doubt been necessary to extend the ARM instruction set merely to be able to talk to those modules. Just look at the keynote, and see how modules surround the ARM CPUs.

Also, in this week's MacBreak Weekly podcast, the panelists were saying that this is only the beginning. Apple will, long term, be moving away from ARM as well, creating their own instruction set to more closely match the OS and software they create. That will not even be a visible change, as it will only entail a compiler change.

leman · Jul 17, 2020

dogslobber said:
Market segmentation. Every Apple machine falls into an Intel-defined market segment. If Apple doesn’t publish the tech specs then people will reverse engineer the silicon. One company that will do that is Intel.

I have literally no idea what you mean.

As to "defining market segments", there is a little story. Some years ago, an Intel customer approached them about designing a custom CPU that would be thinner and more energy efficient than the normal laptop CPU of the time. Laptops using that custom CPU ended up being so successful that they created an entirely new market segment, the ultrabook, and Intel has since adopted this technology into all their mobile CPUs. I assume you can guess who that customer was.

Kjs100 · Jul 17, 2020

Fuzzy Dunlop said:
I get what he's saying regarding ARM and Apple Silicon, but Apple Silicon is such an awkward thing to say. (especially the way he says it) There should be some shorthand term.

Appicon?

dogslobber · Jul 17, 2020

leman said:
I have literally no idea what you mean.

As to "defining market segments", there is a little story. Some years ago, an Intel customer approached them about designing a custom CPU that would be thinner and more energy efficient than the normal laptop CPU of the time. Laptops using that custom CPU ended up being so successful that they created an entirely new market segment, the ultrabook, and Intel has since adopted this technology into all their mobile CPUs. I assume you can guess who that customer was.

You need to try to keep up. I have to down-level a lot of what I say on here to match the knowledge repliers demonstrate. Let me crystallize it for you. Intel markets CPUs in market segments such as mobile and Xeon, often dictated by I/O capability and core count. Every single machine Apple ships will fall into one of those buckets. Apple and people on here might think otherwise but that’s what the market dictates.

leman · Jul 17, 2020

dogslobber said:
You need to try to keep up. I have to down-level a lot of what I say on here to match the knowledge repliers demonstrate. Let me crystallize it for you.

How very generous of you. It must be bliss to possess such sunlike levels of intellect and kindness.

dogslobber said:
Intel markets CPUs in market segments such as mobile and Xeon, often dictated by I/O capability and core count. Every single machine Apple ships will fall into one of those buckets. Apple and people on here might think otherwise but that’s what the market dictates.

You (probably because your insane intellect can’t bother about the details) are confusing the needs of the system builder and the user. The user does not care how many PCIe lanes the system offers. The user cares that they can use their PCIe expansion cards and get good performance. The upcoming Pro machines will offer as much high-speed I/O as Apple deems necessary. Implementing I/O is far from being the insurmountable engineering challenge you seem to suggest.

dugbug · Jul 17, 2020

leman said:
How very generous of you. It must be bliss to possess such sunlike levels of intellect and kindness.

You (probably because your insane intellect can’t bother about the details) are confusing the needs of the system builder and the user. The user does not care how many PCIe lanes the system offers. The user cares that they can use their PCIe expansion cards and get good performance. The upcoming Pro machines will offer as much high-speed I/O as Apple deems necessary. Implementing I/O is far from being the insurmountable engineering challenge you seem to suggest.

I would also add that binning by core count and clock speed is an artifact of the CPU manufacturing process. That creates a lot of the "market segments" being referred to. Much of the time only a subset of the cores on a die will work (as seen in the A12X vs. A12Z), so you sell the chips with 8 working cores at a lower price tier and the chips with all 10 working at a higher one. That is better than scrapping a whole chip because two of its cores failed.

Same with clock speed. A percentage of your ASICs will fail at higher clock speeds, so you clock them lower and bin them accordingly.
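
A rough sketch of that binning logic in C, in case it helps. Everything here is made up for illustration: the `die_t` fields, the tier names, the 8/10 core thresholds, and the speed grades are assumptions, not anyone's real test flow.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical price tiers; names and thresholds are illustrative only. */
typedef enum { BIN_REJECT, BIN_LOW, BIN_HIGH } bin_t;

/* Test results for one die (invented fields for this sketch). */
typedef struct {
    int working_cores;   /* cores that passed test, out of 10 */
    int max_stable_mhz;  /* highest clock that passed validation */
} die_t;

/* Assign a die to a tier by how many of its cores work. */
bin_t bin_by_cores(die_t d) {
    assert(d.working_cores >= 0 && d.working_cores <= 10);
    if (d.working_cores >= 10) return BIN_HIGH;
    if (d.working_cores >= 8)  return BIN_LOW;
    return BIN_REJECT;
}

/* Pick the fastest speed grade the die can sustain; 0 means none. */
int rated_mhz(die_t d) {
    static const int grades[] = { 3000, 2600, 2200 };  /* assumed, descending */
    for (size_t i = 0; i < sizeof grades / sizeof grades[0]; i++) {
        if (d.max_stable_mhz >= grades[i]) return grades[i];
    }
    return 0;
}
```

A die with all 10 cores working that is stable at 3100 MHz lands in the top tier at the 3000 grade; one with 8 working cores that tops out at 2500 MHz is sold in the lower tier at 2200.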

joema2 · Jul 17, 2020

Jmausmuc said:
I know a little bit about processor technology and have read up on x86 and ARM architecture but I still do not really understand what makes ARM so superior to Intel or x86 technology in general that has people believing the new ARM Macs will be much better and faster than Intel based macs...

That is a good question. No computer scientist or CPU architect has ever publicly, convincingly described exactly *why* ARM instruction set CPUs would be superior in performance or power/performance to equally-developed higher-end x86 using similar fabrication technology.

Historically RISC was viewed as superior to traditional CISC for several reasons and that viewpoint was valid -- back then. But then around the mid-1990s with the Pentium Pro, Intel adopted an internal RISC-like architecture. Each conventional x86 instruction was decoded to multiple RISC-like instructions called micro ops. Those were scheduled on various CPU subsystems similar to a RISC machine. This even had the *additional* benefit over RISC of high instruction density, which meant bus bandwidth was lower and instruction/data caches were more efficient.
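
To make the micro-op idea concrete, here is a toy sketch in C. It is not how any real Intel decoder works: the instruction kinds, the micro-op names, and the `decode` function are all invented for illustration, and real hardware splits the work differently. It only shows the general shape, one dense instruction expanding into several simple load/compute/store steps.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction kinds; these are not real x86 encodings. */
typedef enum { ADD_REG_REG, ADD_REG_MEM, ADD_MEM_REG } insn_t;

/* Toy micro-op kinds. */
typedef enum { UOP_LOAD, UOP_ALU, UOP_STORE } uop_t;

/* Write the micro-ops for one instruction into out (room for 3)
   and return how many were produced. */
size_t decode(insn_t insn, uop_t out[3]) {
    size_t n = 0;
    assert(out != NULL);
    switch (insn) {
    case ADD_REG_REG:            /* add r1, r2: pure register work */
        out[n++] = UOP_ALU;
        break;
    case ADD_REG_MEM:            /* add r1, [mem]: load, then add */
        out[n++] = UOP_LOAD;
        out[n++] = UOP_ALU;
        break;
    case ADD_MEM_REG:            /* add [mem], r1: load, add, write back */
        out[n++] = UOP_LOAD;
        out[n++] = UOP_ALU;
        out[n++] = UOP_STORE;
        break;
    }
    return n;
}
```

The density point falls out of this: the memory-destination add is a single instruction in the program, while a load/store machine would need three separate instructions for the same work.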

Following that there was a long period where RISC could not get a sufficient advantage, since x86 had become a covert RISC design. Due to the x86-to-micro-op front end, Intel had an additional design burden but their economies of scale and fabrication expertise evened the difference. For a long time the viewpoint was that given similar fabrication technology and CPU design complexity, there was no major difference in performance or power/performance between true RISC and x86 pseudo-RISC.

This seemed corroborated by the surviving RISC CPUs such as IBM POWER family not being dramatically superior. Even though the RISC instruction set itself might have been simpler, modern CPUs (whether RISC or CISC or pseudo-RISC) *all* now require tremendous complexity to make superscalar "out of order" speculative execution work. In the original RISC concept of the early 1980s, these hyper-complex acceleration methods did not exist. Today the instruction set itself (whether RISC, CISC or in between) has become encased within a fantastically complex surrounding CPU architecture which is required for current performance levels. So even if the RISC instruction set is "simpler", nothing else is.

After the late 1990s, the viewpoint mostly prevailed (even among CPU architects) that the alleged RISC advantage mostly no longer applied. There was also a view that any superscalar design (incl. RISC and CISC) would run out of gas due to the exponential cost of hardware dependency checking of multiple in-flight instructions. Thus for a while Intel pursued VLIW which bypassed that problem but failed for other reasons.

Moving to the 2010s, it's true that ARM, which specialized in the low end, had good power efficiency, but Intel basically argued they could do that if they wanted to. On the high end they argued that if ARM ever scaled up to Xeon levels it would burn just as much power as IBM POWER8/9.

This viewpoint started to erode as Apple's ARM-instruction CPUs began climbing in performance yet maintaining good power/performance. The actual silicon fabrication seemed to have an advantage over Intel but it's unclear if this was related to the x86 instruction set. E.g., AMD is x86 and they had no problems at 7 nanometers, even if you count that as 10 nm due to measuring differences.

While the future is uncertain, we can easily measure what already exists. The Apple A12Z CPU in my 2020 iPad Pro is essentially as fast as the 4.2 GHz 4-core i7-7700K Kaby Lake CPU in my top-spec 2017 iMac 27, yet it burns a tiny fraction of the power. To my knowledge, the exact nature of the architectural features which provide this advantage has not been convincingly described. So the OP question is actually not possible to answer, but we can observe that the development trajectory of Apple Silicon seems pointed in a Xeon direction.

The other major advantage of Apple Silicon is customization. To this day Intel hasn't put Quick Sync hardware acceleration for video transcoding on Xeon, which forces Apple to use T2 or other unwieldy methods. With their own CPUs, Apple can integrate whatever hardware acceleration methods they want, and on their time frame. This is not limited to video but could be for anything they deem worthwhile.

dogslobber · Jul 17, 2020

leman said:
You (probably because your insane intellect can’t bother about the details) are confusing the needs of the system builder and the user. The user does not care how many PCIe lanes the system offers. The user cares that they can use their PCIe expansion cards and get good performance. The upcoming Pro machines will offer as much high-speed I/O as Apple deems necessary. Implementing I/O is far from being the insurmountable engineering challenge you seem to suggest.

You seem pretty defensive, just as you underestimate end users. Users need a comparison metric and that is throughput for I/O. We can read the specs for Intel chips to see the throughput by lanes and PCIe generation. It can be proven under test conditions. Apple’s ARM chips will have similar metrics in the usual marketing propaganda. But that doesn’t prove it, as you never believe a vendor. All that data influences a user’s buying decision. Put simply, they won’t buy a machine perceived as slower than others in the market segment the Apple ARM machine falls into. Perhaps you don’t understand marketing driving engineering requirements well enough to follow what I’m saying.

dmccloud · Jul 18, 2020

dogslobber said:
You seem pretty defensive, just as you underestimate end users. Users need a comparison metric and that is throughput for I/O. We can read the specs for Intel chips to see the throughput by lanes and PCIe generation. It can be proven under test conditions. Apple’s ARM chips will have similar metrics in the usual marketing propaganda. But that doesn’t prove it, as you never believe a vendor. All that data influences a user’s buying decision. Put simply, they won’t buy a machine perceived as slower than others in the market segment the Apple ARM machine falls into. Perhaps you don’t understand marketing driving engineering requirements well enough to follow what I’m saying.

Throughput for I/O? That's the type of spec system designers care about, not the end user. Most end users care about whether the system will run the game or application they plan to use. I sell computers at my day job, and I can count the number of customers who have asked about "throughput" on exactly ZERO fingers. Even when I worked in IT, the questions revolved around compatibility with existing systems and peripherals, not throughput.

leman · Jul 18, 2020

dogslobber said:
You seem pretty defensive, just as you underestimate end users. Users need a comparison metric and that is throughput for I/O. We can read the specs for Intel chips to see the throughput by lanes and PCIe generation. It can be proven under test conditions.

Tell us then - what is the total PCIe throughput of all Thunderbolt 3 ports on a current Intel-based 16” MacBook Pro?

theorist9 · Jul 19, 2020

"What makes ARM superior?" is begging the question. I.e., it's assuming something that hasn't been established.

Indeed, to the extent papers have been presented in professional journals or conference proceedings on this subject, the overall conclusion has been that one ISA isn't inherently superior to another, and what really matters is instead the implementation, i.e., the microarchitecture (and, in particular, how well-optimized the microarchitecture is for the use case).

Let's start with one of the best-known papers on this subject, by Blem et al., and then check through all the papers that cited it:

"We find that ARM and x86 processors are simply engineering design points optimized for different levels of performance, and there is nothing fundamentally more energy efficient in one ISA class or the other. The ISA being RISC or CISC seems irrelevant." [emphasis mine]

FROM: E. Blem, J. Menon and K. Sankaralingam, "Power struggles: Revisiting the RISC vs. CISC debate on contemporary ARM and x86 architectures," 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA), Shenzhen, 2013, pp. 1-12, doi: 10.1109/HPCA.2013.6522302.
[https://ieeexplore.ieee.org/abstract/document/6522302]

I then proceeded to do a quick scan through all 176 articles that had cited this one (https://scholar.google.com/scholar?cites=14820675711934164696&as_sdt=2005&sciodt=0,5&hl=en), to see if any of the citing articles directly addressed this question (and, in particular, to see if any disagreed). I only found three, all of which broadly supported Blem et al.'s conclusion:

1) "Our simulation results suggest that although ARM ISA outperforms RISC-V and X86 ISAs in performance and energy consumption, the differences between ARM and RISC-V are very subtle, while the performance gaps between ARM and X86 are possibly caused by the relatively low hardware configurations used in this paper and could be narrowed or even reversed by more aggressive hardware approaches. Our study confirms that one ISA is not fundamentally more efficient." [emphasis mine]

FROM: M. Ling, X. Xu, Y. Gu and Z. Pan, "Does the ISA Really Matter? A Simulation Based Investigation," 2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM), Victoria, BC, Canada, 2019, pp. 1-6, doi: 10.1109/PACRIM47961.2019.8985059.
[https://ieeexplore.ieee.org/abstract/document/8985059]

2) "The difference in performance and power consumption between the studied processors seems to be determined by the intended application rather than by the choice of ISA. In other words, in modern processors, the way the ISA is implemented, that is, the microarchitecture, plays a more significant role in determining performance and power characteristics than ISA." [emphasis mine]

FROM: Chevtchenko, S. F., and R. F. Vale. "A Comparison of RISC and CISC Architectures." resource 2: 4. [No year given.]
[https://pdfs.semanticscholar.org/8977/18e3387690736f132e812d097dc40379ea2c.pdf]

3) "In this paper, we presented a survey of existing hardware performance benchmark suites that range from evaluation of heterogeneous systems to distributed ML workloads for clusters of servers. From the survey, we selected BigDataBench in order to compare the performance of server-grade ARM and x86 processors for a diverse set of workloads and applications, using real-world datasets that are scalable. We benchmarked a state-of-the-art dual socket Cavium ThunderX CN8890 ARM processor against a dual socket Intel® Xeon® processor E5-2620 v4 x86-64 processor. Initial results demonstrated that ARM generally had slightly worse performance compared to x86 processors for Spark Offline Analytics workloads, and on par or superior performance for Hive workloads. We determined that the ARM server excels over x86 for write heavy workloads. It is worth noting the apparent disk I/O bottleneck of the ARM server when comparing performance results to the x86 server. There are many other BigDataBench workloads that have yet to be tested on ARM, many of which may lead to promising results when provided with larger amounts of disk and network I/O. Moreover, recording the CPU temperatures and power consumptions of these servers may yield even more fruitful results, further promoting the use of ARM in server-grade processing for ML and Big Data applications." [emphasis mine]

FROM: Kmiec S, Wong J, Jacobsen HA. A Comparison of ARM Against x86 for Distributed Machine Learning Workloads. In Technology Conference on Performance Evaluation and Benchmarking 2017 Aug 28 (pp. 164-184). Springer, Cham.
[https://link.springer.com/chapter/10.1007/978-3-319-72401-0_12]

[To be a bit more precise, this last paper, unlike the first three I cited above, is not attempting to tease out the effects of ISA specifically, but is rather an overall comparison of ARM vs. x86 implementations (which includes both ISA and microarchitecture) for ML/big data.]

Joelist · Jul 19, 2020

theorist9 said:
"What makes ARM superior?" is begging the question. I.e., it's assuming something that hasn't been established.

Indeed, to the extent papers have been presented in professional journals or conference proceedings on this subject, the overall conclusion has been that one ISA isn't inherently superior to another, and what really matters is instead the implementation, i.e., the microarchitecture (and, in particular, how well-optimized the microarchitecture is for the use case).

Let's start with one of the best-known papers on this subject, by Blem et al., and then check through all the papers that cited it:

"We find that ARM and x86 processors are simply engineering design points optimized for different levels of performance, and there is nothing fundamentally more energy efficient in one ISA class or the other. The ISA being RISC or CISC seems irrelevant." [emphasis mine]

FROM: E. Blem, J. Menon and K. Sankaralingam, "Power struggles: Revisiting the RISC vs. CISC debate on contemporary ARM and x86 architectures," 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA), Shenzhen, 2013, pp. 1-12, doi: 10.1109/HPCA.2013.6522302.
[https://ieeexplore.ieee.org/abstract/document/6522302]

I then proceeded to do a quick scan through all 176 articles that had cited this one (https://scholar.google.com/scholar?cluster=14820675711934164696&hl=en&as_sdt=0,5&sciodt=0,5), to see if any of the citing articles directly addressed this question (and, in particular, to see if any disagreed). I only found three, all of which broadly supported Blem et al.'s conclusion:

1) "Our simulation results suggest that although ARM ISA outperforms RISC-V and X86 ISAs in performance and energy consumption, the differences between ARM and RISC-V are very subtle, while the performance gaps between ARM and X86 are possibly caused by the relatively low hardware configurations used in this paper and could be narrowed or even reversed by more aggressive hardware approaches. Our study confirms that one ISA is not fundamentally more efficient." [emphasis mine]

FROM: M. Ling, X. Xu, Y. Gu and Z. Pan, "Does the ISA Really Matter? A Simulation Based Investigation," 2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM), Victoria, BC, Canada, 2019, pp. 1-6, doi: 10.1109/PACRIM47961.2019.8985059.
[https://ieeexplore.ieee.org/abstract/document/8985059]

2) "The difference in performance and power consumption between the studied processors seems to be determined by the intended application rather than by the choice of ISA. In other words, in modern processors, the way the ISA is implemented, that is, the microarchitecture, plays a more significant role in determining performance and power characteristics than ISA." [emphasis mine]

FROM: Chevtchenko, S. F., and R. F. Vale. "A Comparison of RISC and CISC Architectures." resource 2: 4. [No year given.]
[https://pdfs.semanticscholar.org/8977/18e3387690736f132e812d097dc40379ea2c.pdf]

3) "In this paper, we presented a survey of existing hardware performance benchmark suites that range from evaluation of heterogeneous systems to distributed ML workloads for clusters of servers. From the survey, we selected BigDataBench in order to compare the performance of server-grade ARM and x86 processors for a diverse set of workloads and applications, using real-world datasets that are scalable. We benchmarked a state-of-the-art dual socket Cavium ThunderX CN8890 ARM processor against a dual socket Intel® Xeon® processor E5-2620 v4 x86-64 processor. Initial results demonstrated that ARM generally had slightly worse performance compared to x86 processors for Spark Offline Analytics workloads, and on par or superior performance for Hive workloads. We determined that the ARM server excels over x86 for write heavy workloads. It is worth noting the apparent disk I/O bottleneck of the ARM server when comparing performance results to the x86 server. There are many other BigDataBench workloads that have yet to be tested on ARM, many of which may lead to promising results when provided with larger amounts of disk and network I/O. Moreover, recording the CPU temperatures and power consumptions of these servers may yield even more fruitful results, further promoting the use of ARM in server-grade processing for ML and Big Data applications." [emphasis mine]

FROM: Kmiec S, Wong J, Jacobsen HA. A Comparison of ARM Against x86 for Distributed Machine Learning Workloads. In Technology Conference on Performance Evaluation and Benchmarking 2017 Aug 28 (pp. 164-184). Springer, Cham.
[https://link.springer.com/chapter/10.1007/978-3-319-72401-0_12]

THANK YOU!

This whole discussion is getting a bit silly. ARM, like x86, is just an ISA. You can design awful CPUs on both (indeed there was a time AMD was notorious for awful CPUs and in the ARM space we've also seen bad SoCs).

I think we need to stop using the phrase "ARM" when speaking of Apple's upcoming change. If you don't like their marketing phrase Apple Silicon then say "A Series" until we know what they are actually calling their Mac SoCs. The microarchitecture is the big thing, and there the Apple version is extremely advanced and very powerful - indeed it is the most powerful microarchitecture known on the ARM ISA (the funny part is Apple poached a lot of Intel design talent from the team that created the Conroe microarchitecture and you can see the influence in the A Series).

cool11 · Jul 20, 2020

ARM in general is not known for such powerful CPUs...
Are our hopes resting on Apple's optimization?
But could it still turn out to be an anemic horse?

Joelist · Jul 20, 2020

Again... ARM is NOT A CPU. All it is is an ISA, just like x86. Really we need to stop referring to Apple's SoCs as ARM because while they use that instruction set everything else is Apple's design and very different from all the other SoCs out there.

Erehy Dobon · Jul 20, 2020

This discussion is a good example of how many cannot see the forest for the trees.

It's Apple's specific implementation of the ARM ISA that differentiates it from others. But that's only the CPU. Much of Apple's differentiation is outside the CPU: their custom graphics silicon, other things like Secure Enclave, the security features in the T2 Security Chip, signal processing, machine learning.

During the WWDC keynote video, the presenters were very careful NOT to call attention to the ARM ISA. They referred to it as Apple Silicon. Some might not love the term but that's what Apple themselves chose to call it because it is a better reflection of the big picture.

This was even more blatant when they discussed graphics, where the ARM ISA is irrelevant. Apple was even more inscrutable when describing their upcoming graphics, again just referring to the entirety as Apple Silicon.

Every time there's a thread that uses the term "ARM Mac" you know there's someone who fundamentally can't see the forest for the trees. While a certain amount of this is understandable from Joe Consumer, even a large portion of the tech media still doesn't get this.

We all know that the day when Apple releases their first Apple Silicon Macs, there will be tons of articles referring to the "new ARM Macs" and tons of discussion threads here with the same phrase.

theorist9 · Jul 20, 2020

theorist9 said:
"What makes ARM superior?" is begging the question. I.e., it's assuming something that hasn't been established.

Indeed, to the extent papers have been presented in professional journals or conference proceedings on this subject, the overall conclusion has been that one ISA isn't inherently superior to another, and what really matters is instead the implementation, i.e., the microarchitecture (and, in particular, how well-optimized the microarchitecture is for the use case).

Expanding on this: While Apple may have had specific design reasons for preferring the ARM ISA over x86, I suspect a large factor in their decision to go with ARM was based on ARM's much more favorable licensing terms. With the ARM license, Apple has enormous latitude, so long as they maintain compatibility with the ARM instruction set.

If Intel were like ARM when it comes to licensing their ISA, Apple might have seriously considered creating its own x86 chips instead of its own ARM chips (back when it was first doing its own custom silicon). But it's not -- licensing x86 from Intel is a headache (and perhaps also much less financially favorable). Consider all the legal battles between Intel and AMD concerning AMD's x86 license: https://jolt.law.harvard.edu/digest/intel-and-the-x86-architecture-a-legal-perspective

Tech198 · Jul 20, 2020

Jmausmuc said:
I know a little bit about processor technology and have read up on x86 and ARM architecture but I still do not really understand what makes ARM so superior to Intel or x86 technology in general that has people believing the new ARM Macs will be much better and faster than Intel based macs.

I understand that the advantages of ARM are power efficiency and the ability to have many more cores but isn’t Intel still better in raw power in multi threaded operations?
Will ARM at first be a replacement for Intel's mobile processors which are arguably already worse in many ways than an A12Z or A13 or will they also be able to create a processor that can beat i9 and even Xeon processors?
Can we really expect a „night and day“ difference?

By the way - just yesterday, it was announced that the fastest supercomputer of the world is now ARM based. It uses ARM processors made by Fujitsu: https://www.arm.com/company/news/2020/06/powering-the-fastest-supercomputer
Fits perfectly to Apple's announcement.

Ya, but they were there first, remember. I guess with less to do, you can ramp up the power, and it can be more efficient...
In the days of everyone using Intel, I kind of figured RISC was old-school; I mean, that's why Apple moved away from it originally in 2005, right?

ARM is a RISC design, and since Apple will design the chips themselves they can control everything a lot better, which *should* put an end to the delays. The disadvantage is less compatibility.

curmudgeonette · Jul 21, 2020

theorist9 said:
I can see this being the case for lighter-weight programs that don't significantly challenge the CPU. But aren't a lot of heavier-weight programs (by which I mean ones that need CPU optimization) hardware-optimized?

Way back in the 68K era, Apple's C compiler had some strange quirks. By the language definition, these two code fragments are identical (false is defined to be 0):

if (someBoolean) {

if (someBoolean != false) {

The compiler would generate more efficient code for the second case!

Even today I expect optimized code to be full of these sorts of idioms. The problem is when you move to a different architecture or compiler. The idioms may no longer generate better code. It is possible that the second case above will generate worse code.
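
For anyone who wants to see the two forms side by side, here is a minimal sketch in modern C. It assumes C99's `<stdbool.h>`, which that 68K-era compiler would not have had, and the function names are invented for the example. It only demonstrates that the two forms mean the same thing; which one compiles to tighter code depends entirely on the compiler and target.

```c
#include <assert.h>
#include <stdbool.h>

/* First form: test the value directly. */
int plain_test(bool someBoolean) {
    if (someBoolean) {
        return 1;
    }
    return 0;
}

/* Second form: compare against false explicitly. */
int explicit_test(bool someBoolean) {
    if (someBoolean != false) {
        return 1;
    }
    return 0;
}

/* The language says both forms must agree for every input. */
void check_equivalence(void) {
    assert(plain_test(false) == explicit_test(false));
    assert(plain_test(true) == explicit_test(true));
}
```

To compare what a given compiler actually emits, build both functions with assembly output enabled and diff the two bodies.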

However, there isn't just one x86 architecture for which to optimize. There's Sandy Bridge. There's Haswell. There's Skylake. There are M-series *lakes. There are Xeon *lakes. Unless you are selling a program for only one specifically configured system**, you can't really optimize it in ways that won't run well on other chips.

** On one specific 68K variant (never used by Apple), I found that unrolling a memory to memory copy loop to 4, 7, 10, 13, etc. moves was most efficient. Uneducated logic would say that 8 would be a good choice: It is a power of two, facilitating easy handling of remainders. 4 is too few, i.e. the loop overhead remains too high. 16 is too many, i.e. too much code space for any gain.
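
For reference, this is roughly what the "obvious" unroll-by-8 version looks like in C. It is a sketch only: the function name and the use of 32-bit words are my assumptions, and it shows the power-of-two choice, not the 4/7/10/13 pattern that turned out best on that chip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy n 32-bit words with the loop body unrolled to 8 moves.
   With a power-of-two factor the block count and remainder are
   a shift and a mask instead of a divide. */
void copy_words_unrolled8(uint32_t *dst, const uint32_t *src, size_t n) {
    size_t blocks = n >> 3;  /* n / 8 */
    size_t rest   = n & 7;   /* n % 8: leftover words */

    assert(n == 0 || (dst != NULL && src != NULL));

    while (blocks--) {
        dst[0] = src[0]; dst[1] = src[1]; dst[2] = src[2]; dst[3] = src[3];
        dst[4] = src[4]; dst[5] = src[5]; dst[6] = src[6]; dst[7] = src[7];
        dst += 8;
        src += 8;
    }
    while (rest--) {         /* handle the remainder one word at a time */
        *dst++ = *src++;
    }
}
```

An unroll factor like 7 or 10 would need a real division for the block count and remainder, which is part of why 8 looks attractive on paper. The only way to know which factor wins on a given chip is to time it.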

KPOM · Jul 21, 2020

theorist9 said:
Expanding on this: While Apple may have had specific design reasons for preferring the ARM ISA over x86, I suspect a large factor in their decision to go with ARM was based on ARM's much more favorable licensing terms. With the ARM license, Apple has enormous latitude, so long as they maintain compatibility with the ARM instruction set.

If Intel were like ARM when it comes to licensing their ISA, Apple might have seriously considered creating its own x86 chips instead of its own ARM chips (back when it was first doing its own custom silicon). But it's not -- licensing x86 from Intel is a headache (and perhaps also much less financially favorable). Consider all the legal battles between Intel and AMD concerning AMD's x86 license: https://jolt.law.harvard.edu/digest/intel-and-the-x86-architecture-a-legal-perspective

Yes, the Intel/AMD legal battles, combined with Intel’s Itanium debacle led to the current “mutually assured destruction” in which neither Intel nor AMD can license the ISA or sell themselves to another party without terminating the cross-licensing arrangements they have with each other. Intel has patents on the underlying architecture, but AMD created the original 64-bit ISA add-ons while Intel was promoting an incompatible 64-bit ISA as a successor (hoping that emulation would work for 32-bit support). When that didn’t work, Intel had to cross-license AMD’s technology.

littlepud · Jul 22, 2020

ARM isn't in-and-of-itself superior. The move to Apple Silicon system-on-a-chip simply allows Apple to provide on-die / on-package ASICs to support whatever new features they want to introduce.

FireFish · Aug 5, 2020

Jmausmuc said:
I know a little bit about processor technology and have read up on x86 and ARM architecture, but I still do not really understand what makes ARM so superior to Intel or x86 technology in general that it has people believing the new ARM Macs will be much better and faster than Intel-based Macs.

I understand that the advantages of ARM are power efficiency and the ability to have many more cores, but isn’t Intel still better in raw power in multi-threaded operations?
Will ARM at first be a replacement for Intel's mobile processors, which are arguably already worse in many ways than an A12Z or A13, or will they also be able to create a processor that can beat i9 and even Xeon processors?
Can we really expect a „night and day“ difference?

By the way - just yesterday, it was announced that the fastest supercomputer in the world is now ARM based. It uses ARM processors made by Fujitsu: https://www.arm.com/company/news/2020/06/powering-the-fastest-supercomputer
Fits perfectly with Apple's announcement.

That’s a silly question. What makes it superior? The fact that it is fully engineered by Apple. Case closed. Benchmarks may say otherwise, and who cares about the decades of actual experience the chipset company holds. We know by now that this ain’t Apple culture. Experience? Years of industry dominance? Yawn. F that. We are the children of Jobs.

Kostask · Aug 6, 2020

FireFish said:
That’s a silly question. What makes it superior? The fact that it is fully engineered by Apple. Case closed. Benchmarks may say otherwise, and who cares about the decades of actual experience the chipset company holds. We know by now that this ain’t Apple culture. Experience? Years of industry dominance? Yawn. F that. We are the children of Jobs.

Benchmarks may say otherwise, but the top dog Core i9 in the 16" MBP sure does great when it is thermally throttling, doesn't it? See how quickly it slows down when editing video? No chance that Apple, with almost a decade of designing its own chips into small, power- and cooling-constrained devices, may know a little something. It's not like AMD is any threat to Intel either, considering that they are now selling about 80% of the desktop CPUs, and are making significant inroads into the server CPU space. I mean, the years of industry dominance must mean that their inability to move their process forward for the last 4-5 years isn't a hot mess moving towards becoming a train wreck, is it? No, none of that matters, as long as you can slap an Intel Inside label on the case.

Meanwhile, in the real world, what makes it (AS SoC) superior is that it will run at top speed without overheating. What makes it superior is that it will have an iGPU that can actually be useful, as opposed to a sad necessity just so that something can appear on a display. What makes it superior is that you can have lots of real cores, instead of the illusion that is hyperthreading (and its associates Meltdown and Spectre). What makes it superior is that custom modules/blocks/accelerators can be added at will, in the quantity and specific function that Apple wants to put in. What makes it superior is that it isn't using a process technology that hasn't moved forward for 4-5 years, or depending on how things go, longer. What makes it superior is that management has made a commitment to move forward every year, and to make the best products it can, instead of trying to impress the stock market by starving the R&D budget. That is what makes it superior. It isn't the fundamentals like architecture or design, it's the execution.

cool11 · Sep 19, 2020

Superior or...inferior?

I think this thought is in most of our heads. A fear...hidden or not...
Many people in the computer business still wonder how this 'thing' with ARM could ever be better in computing power than Intel.
Maybe it is called 'Apple Silicon', and no doubt Apple will do the best it can,
but still, it is an ARM CPU, no matter what the optimizations are.

Many friends of mine advise me to go for an Intel MBP while I still can,
because an ARM MBP, at least in the first generations/years,
will be a big downgrade in computing power.
