
crazy dave

macrumors 65816
Sep 9, 2010
1,454
1,230
Actual emulation is harder than what Rosetta is doing. Rosetta basically takes a 64-bit binary that would run on an Intel Mac with Big Sur and transforms it into a native ARM binary that runs on an M1 Mac. It doesn't even bother dealing with 32-bit binaries compiled in 2007 for Leopard or Windows XP/Vista. An emulator would have to emulate hardware/OS behavior and features not available on the host platform, which often reduces the performance to that of interpreted code.

We’ve seen with Wine and Rosetta that games can still be played. Further, if the argument over whether ARM can succeed against x86 hinges on backwards compatibility with x86 games, then the performance of Rosetta-like translation layers is indeed germane.
 

Sydde

macrumors 68030
Aug 17, 2009
2,563
7,061
IOKWARDI
Seems like now we’re past that, but I have to think that if it were me defining a new architecture, I would install a ton of registers, and eat the context-switch penalty.
Ever look at the 68000 series? Man, those CPUs (based on my old 68020 manual) had some massively ugly exception stack frames. Now exceptions, on ARM, are handled primarily in SPSRs, so there is almost no memory overhead – the downside being that there is no vectoring, so exceptions have to figure out what they are by using SPSRs, which is probably not that big a deal. Really, with the big register set, stack overrun exploits should never happen, since most calls use registers instead of pushes – I would go so far as to separate the call return stack from the automatic variable stack so that those kinds of things would be wholly impossible.

And, of course, if you have a lot of cores, the context switching activity should happen quite a lot less often, at load. I confess, the expanded ARMv8-64 register set made me a little uncomfortable at first, but, as you say, there really does not seem to be a significant downside to more registers. Register windowing might be a small gain, but is it worth it?
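A minimal sketch of that separate-return-stack idea, as a toy software model (nothing like how any shipping ARM core actually lays out its stacks): return addresses live in their own array, so overrunning a local buffer can never redirect a return.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 64
static unsigned return_stack[MAX_DEPTH];   /* return addresses only          */
static int      return_top = 0;

static char data_stack[4096];              /* locals and buffers only        */
static int  data_top = 0;

/* a "call": remember where to return on the return stack, and give the callee
 * a frame for its locals on the data stack */
static void toy_call(unsigned return_site, int frame_bytes)
{
    return_stack[return_top++] = return_site;
    data_top += frame_bytes;
}

/* a "return": pop the callee's frame and read the return site from a stack
 * that no local buffer can ever reach */
static unsigned toy_return(int frame_bytes)
{
    data_top -= frame_bytes;
    return return_stack[--return_top];
}

int main(void)
{
    toy_call(0x1234, 64);
    /* wildly overrun the callee's 64-byte frame... */
    memset(&data_stack[data_top - 64], 'A', 256);
    /* ...and the return site is still intact */
    printf("returning to 0x%x\n", toy_return(64));
    return 0;
}
```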
 

JouniS

macrumors 6502a
Nov 22, 2020
638
399
We’ve seen with Wine and Rosetta that games can still be played. Further, if the argument over whether ARM can succeed against x86 hinges on backwards compatibility with x86 games, then the performance of Rosetta-like translation layers is indeed germane.
I've seen the reports. Some games run well, while others have poor performance or fail to run at all. Automatic translation can work well with software that was written with portability in mind. But sometimes there is plenty of low-level code that makes a lot of assumptions about the environment, and then you have no choice but to actually emulate it.
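To make the translation-versus-emulation contrast concrete, here is a toy sketch in C. The two-opcode "guest ISA" and the host_op table are invented purely for illustration; a real translator like Rosetta emits actual AArch64 machine code rather than an array of function pointers.

```c
#include <stdio.h>

enum { OP_ADD, OP_MUL };                       /* made-up guest opcodes */
struct guest_insn { int op, src; };

/* Emulator: pays the decode/dispatch cost on every execution of every insn. */
static long emulate(const struct guest_insn *code, int n, long acc)
{
    for (int i = 0; i < n; i++) {
        switch (code[i].op) {
        case OP_ADD: acc += code[i].src; break;
        case OP_MUL: acc *= code[i].src; break;
        }
    }
    return acc;
}

/* "Translator": walk the guest code once and emit host operations; afterwards
 * execution runs the emitted ops directly, with no per-instruction decoding. */
struct host_op { long (*fn)(long, long); long arg; };
static long host_add(long a, long b) { return a + b; }
static long host_mul(long a, long b) { return a * b; }

static int translate(const struct guest_insn *code, int n, struct host_op *out)
{
    for (int i = 0; i < n; i++) {
        out[i].fn  = (code[i].op == OP_ADD) ? host_add : host_mul;
        out[i].arg = code[i].src;
    }
    return n;
}

int main(void)
{
    struct guest_insn prog[] = { {OP_ADD, 3}, {OP_MUL, 4}, {OP_ADD, 5} };
    struct host_op translated[3];
    int n = translate(prog, 3, translated);

    long acc = 0;
    for (int i = 0; i < n; i++)                 /* run the translated program */
        acc = translated[i].fn(acc, translated[i].arg);

    printf("emulated: %ld, translated: %ld\n", emulate(prog, 3, 0), acc);
    return 0;
}
```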
 

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
Ever look at the 68000 series? Man, those CPUs (based on my old 68020 manual) had some massively ugly exception stack frames. Now exceptions, on ARM, are handled primarily in SPSRs, so there is almost no memory overhead – the downside being that there is no vectoring, so exceptions have to figure out what they are by using SPSRs, which is probably not that big a deal. Really, with the big register set, stack overrun exploits should never happen, since most calls use registers instead of pushes – I would go so far as to separate the call return stack from the automatic variable stack so that those kinds of things would be wholly impossible.

And, of course, if you have a lot of cores, the context switching activity should happen quite a lot less often, at load. I confess, the expanded ARMv8-64 register set made me a little uncomfortable at first, but, as you say, there really does not seem to be a significant downside to more registers. Register windowing might be a small gain, but is it worth it?

Yeah, I’ve either designed, coded assembly for, wire wrapped onto proto boards, or otherwise dealt with pretty much every processor from the 6811 and onward :) At this point they all sort of blur together :)
 

pasamio

macrumors 6502
Jan 22, 2020
356
297
For phones, yes, but for desktops and laptops, most of the world runs on x64 Windows, and that's not going to change anytime soon. In 30 years maybe it'll be a minority, but I bet you'll still be able to buy it.

All this "Intel is finished" crud is ridiculous. People don't run platforms, only us geeks do -- they run software, and they buy whatever machine runs their software.

Apple has demonstrated how you can effectively run x86-64 apps on ARM through some processor improvements and a recompilation process, and is aiming to get off x86 entirely within the next year or so. Microsoft already has an ARM port, has been working on improving its own emulation stack, and already has a group in the server space looking at ARM chips. Chromebooks already do ARM, rounding out the third major laptop operating system. Intel won't disappear, but they have tough competition from ARM as well as AMD.

I think your last sentence is actually where Intel is in trouble. People buy whatever lets them do their task within their budget. Apple is supposed to be done with its migration by the end of next year, and it's generating a bunch of positive press. If there is a new Windows laptop that runs most of the software they use today, has amazing battery life, good performance, and is potentially cheaper as well? Then they'll likely buy the "ARM" one.
 

Flight Plan

macrumors 6502a
May 26, 2014
883
824
Southeastern US
I get what the OP is feeling here: that competition is good. And it is. Wherever there is no competition, whether from the lack of a competitive market or from government squashing it (public utilities, anybody?), innovation is discouraged. What's left is products or services that are unreliable, expensive, or just plain not relevant.

Examples can be found in a number of industries now: Taxi services have to compete with people who have a car and want to drive other people around for a fee. And just for an hour or two because we have to pick up the kids later from soccer and violin practice. Taxi services, particularly in the large cities, were noncompetitive, expensive, and full of fraud.

Another example is the US Postal Service. Now with UPS, FedEx, and other shippers, the USPS has had to get off their high horse and consider Sunday service. And Amazon, the biggest worldwide shipper now, has insourced that service. And mostly my stuff comes FAST. With USPS, I have to go to the mailbox. Or to my neighbors, since there's a super high rate of misdeliveries with the USPS. But with Amazon, my potholders, vacuum filters, and trinket deliveries always end up on MY porch (at least until my porch pirate decides to risk getting shot in my neighborhood and tries to steal them).

Even in the aerospace, space satellite, and defense industries, we have new players tipping over the applecart of the 100 year old companies.

This is all good.

But!

Just saying that the M1 needs a competing chip? Well on the surface that sounds great, but it's also missing a big component of the discussion here. Intel's "product" is the chip, yes. But Apple's product is not just "a chip". For Apple, it is the combination of the hardware package (of which the chip is only one part), combined with the software.

Apple is selling an EXPERIENCE, whereas everybody else is selling a widget.

Apple saw this with the original Motorola Mac, and wanted to IMPROVE THE PRODUCT (not just the chip) by improving the chip. Hence the move from IBM to Intel. But then AGAIN, the chip technology for the Mac was found to be lagging. This time, however, instead of moving to another chip supplier, Apple decided to source from a proven, reliable, and innovating chip provider: Apple itself.

So Apple, if you think about it, is only partially competing with Intel. It's more like Apple is competing with Microsoft, Windows, AND Intel as a group, while simultaneously playing hands on the side against AMD and Linux.

The "product" for Apple is the whole computing experience; not just one hardware part. Intel is only one cog in a much larger wheel.
 

bobcomer

macrumors 601
May 18, 2015
4,949
3,699
I think your last sentence is actually where Intel is in trouble. People buy whatever lets them do their task within their budget. Apple is supposed to be done with its migration by the end of next year, and it's generating a bunch of positive press. If there is a new Windows laptop that runs most of the software they use today, has amazing battery life, good performance, and is potentially cheaper as well? Then they'll likely buy the "ARM" one.
Microsoft's WOA really doesn't run all Windows x86/64 software. They went a different path than Rosetta, and it shows in the speed and ability to run x86. At least that's where things are now, so we don't know if there will ever be a fully compatible Windows laptop running ARM that has crazy battery life and great performance, and runs all their software.

In any case, I just can't help but think a lot of people are counting their chickens before they're even laid, much less hatched. :)
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
That's just wrong - of course, it depends on what you mean by "scanning". In a general computation model you calculate your output from the input - there is no notion of scanning. First of all, a parallel machine model or uniform circuit has essentially parallel access to the input - there is no notion of going sequentially through your input. Of course, the algorithm (or function) itself might have dependencies, which might force you to sequentialize some computations - and that is precisely what we are talking about here.

Coming back to what you are trying to say above: there are some theorems about lower bounds if we remove the notion of scanning. One of them states that a function which needs all of its inputs to calculate the output cannot be in NC^0 and must be at least in NC^1 (informally, NC^x means O(log^x n) depth on a circuit with bounded fan-in using standard gates). As reading, I suggest looking up the classes NC, AC, and TC - but I guess Nick's Class (NC) is the most relevant for this discussion.
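For reference, the textbook definition behind that informal statement (standard complexity-theory notation, not taken from the post itself):

$$\mathrm{NC}^k = \left\{\, L \;\middle|\; L \text{ is decided by uniform Boolean circuits of polynomial size, depth } O(\log^k n), \text{ and fan-in} \le 2 \,\right\}, \qquad \mathrm{NC} = \bigcup_{k \ge 0} \mathrm{NC}^k .$$

A concrete instance of the lower-bound theorem: n-bit parity depends on every input bit, so it cannot be in NC^0 (each output of a constant-depth, bounded-fan-in circuit can only see a constant number of inputs), while a balanced XOR tree computes it in O(log n) depth, placing it in NC^1.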

And I would agree with you 100%... if we were talking about a different problem, such as counting the number of instructions in a program. But this is not about identifying instructions in isolation, it is about executing them. And the execution has to be done sequentially, as mandated by ISA semantics. Furthermore, execution is done on a real machine, with a constant (not polynomial) number of execution units, and on real programs, where dependencies are complex and unpredictable. Execution is fundamentally an O(n) problem and that is what limits you in the end. You can't just split an arbitrary program listing into multiple chunks, execute them in parallel and merge the results, and even if you could, you can't "magically" grow more processors if you see that the program you are getting is a bit longer.

My point is that while a fixed-length encoding allows you to determine the location of i-th instruction in constant time, that fact alone does not mean much. You still need to schedule all instructions in [0, i) before you can schedule the i-th one. So the problem is not "how quickly can I detect all instructions" but "how quickly can I detect incoming instructions to keep the rest of the machine busy". In practical terms, it boils down to examining the next k bytes and extracting up to m instructions out of them (where k and m are constants). An ARM machine will be trivially able to do it in O(1). An x86 machine will be able to do it in... O(1), just with a higher constant factor (that will be some sort of function of k and m). For an instruction sequence of an arbitrary length, both will end up being O(n).
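A rough sketch of that "next k bytes, up to m instructions" step, with the serial dependence spelled out. The insn_len() function below is a made-up stand-in, not real x86 length decoding (which has to look at prefixes, ModRM, SIB, and so on):

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width (ARM-like): the i-th instruction in the window starts at 4*i.
 * Each of the m parallel decoders can compute its own start address without
 * looking at any other instruction. */
static size_t fixed_width_starts(size_t window_bytes, size_t starts[], size_t max_insns)
{
    size_t n = 0;
    for (size_t off = 0; off + 4 <= window_bytes && n < max_insns; off += 4)
        starts[n++] = off;
    return n;
}

/* Variable-length (x86-like): where instruction i+1 starts depends on how long
 * instruction i turned out to be, so boundary finding is a serial chain (or a
 * pile of speculative decodes at every byte offset, which is roughly what wide
 * x86 front ends spend extra transistors on). */
static size_t insn_len(const uint8_t *p)
{
    return 1 + (p[0] & 0x7);   /* made-up 1..8 byte lengths, NOT real x86 */
}

static size_t variable_length_starts(const uint8_t *window, size_t window_bytes,
                                     size_t starts[], size_t max_insns)
{
    size_t n = 0, off = 0;
    while (off < window_bytes && n < max_insns) {
        starts[n++] = off;
        off += insn_len(window + off);   /* must finish insn i to find insn i+1 */
    }
    return n;
}

int main(void)
{
    uint8_t window[16] = { 0x03, 0x00, 0x00, 0x00, 0x02 };  /* arbitrary bytes */
    size_t starts[8];
    printf("fixed-width insns found: %zu\n",
           fixed_width_starts(sizeof window, starts, 8));
    printf("variable-length insns found: %zu\n",
           variable_length_starts(window, sizeof window, starts, 8));
    return 0;
}
```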

In the end, I don't think we are in disagreement; it's just that you seem to be focusing on the problem of instruction detection while I view it strictly in the practical context of program execution. A simple confusion. I believe we both agree that x86's variable instruction length adds considerable circuit complexity and has massive scalability issues, inherently limiting the benefits of going very wide. ARM does not have this limitation. Both are limited by the complexity of orchestrating multiple execution units and by the ILP in real-world code. And as you point out, ARM has some advantage here as well, with its larger register set and overall smarter ISA design that reduces the need for temporary registers. As someone once pointed out, AArch64 appears to be designed with out-of-order execution in mind.


That having been said, I truly believe that if ARM decided to make their cores the same size as Firestorm, they would achieve very similar IPC - contrary to the x64 competition. In some cases ARM is very explicit; they say, in effect, "if we had increased feature xyz by this amount, we would have gained uvw amount of performance - so we are not doing this". The X1 is the first ARM core where ARM is somewhat deviating from these considerations.

Oh, there is no doubt. If anything, Apple has demonstrated that high IPC is possible, so there should be no reason why someone else cannot achieve the same or even better results. Looking forward to seeing what ARM, Qualcomm and friends will deliver in the coming years. I personally welcome the reign of efficient, smart CPUs to replace the steam-powered x86 chips :)
 

Sydde

macrumors 68030
Aug 17, 2009
2,563
7,061
IOKWARDI
… execution has to be done sequentially, as mandated by ISA semantics …
This is absolutely incorrect.

Take the simple expression ab + cd. The computer can calculate ab and cd simultaneously and then add the two products. Similarly, if a target address requires a multistep calculation, that calculation can be performed at the same time as the value to be stored there. Large fractions of a simple linear program can be parallelized without compromising the effective sequential progress of the code, and compilers have become very good at arranging the instruction stream for optimal concurrent dispatch.

ARM has a tiny advantage over x86 here. The math format of x86 is either d = s <op> d or s = s <op> d meaning one of the source arguments is also the destination, whereas ARM's math format is d = s <op> a where the destination may or may not be one of the arguments. This means that dependencies can be somewhat reduced if source values appear in multiple parts of a computation. This is not a huge advantage, but a small advantage spread over a large amount of code can add up to something non-small.
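A small illustration of both points, with the would-be machine code in comments. The x86-64 and AArch64 sequences are schematic, not the exact output of any compiler:

```c
#include <stdio.h>

/* ab + cd: the two multiplies have no dependence on each other, so an
 * out-of-order core can issue them in the same cycle; only the final add has
 * to wait for both. */
static int ab_plus_cd(int a, int b, int c, int d)
{
    int t1 = a * b;   /* independent ... */
    int t2 = c * d;   /* ... of this one */
    return t1 + t2;   /* depends on both */
}

/* Keeping a source value alive across a destructive (two-operand) op:
 *
 *   want:  x = a + b;  y = a + c;          (a is reused)
 *
 *   two-operand (x86-64 style)      three-operand (AArch64 style)
 *       mov  x, a                       add  x, a, b
 *       add  x, b                       add  y, a, c
 *       mov  y, a
 *       add  y, c
 *
 * The extra movs are cheap (often eliminated at rename), but they still cost
 * decode slots and code bytes. */

int main(void)
{
    printf("%d\n", ab_plus_cd(2, 3, 4, 5));   /* prints 26 */
    return 0;
}
```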
 

Icelus

macrumors 6502
Nov 3, 2018
422
579
ARM has a tiny advantage over x86 here. The math format of x86 is either d = s <op> d or s = s <op> d meaning one of the source arguments is also the destination, whereas ARM's math format is d = s <op> a where the destination may or may not be one of the arguments. This means that dependencies can be somewhat reduced if source values appear in multiple parts of a computation. This is not a huge advantage, but a small advantage spread over a large amount of code can add up to something non-small.
This isn't the case for AVX and BMI1/2 instructions as they provide a three-operand instruction format.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
This is absolutely incorrect.

Take the simple expression ab + cd. The computer can calculate ab and cd simultaneously and then add the two products. Similarly, if a target address requires a multistep calculation, that calculation can be performed at the same time as the value to be stored there. Large fractions of a simple linear program can be parallelized without compromising the effective sequential progress of the code, and compilers have become very good at arranging the instruction stream for optimal concurrent dispatch.

Yes, of course, but this does not change my argument. Instructions can absolutely be executed in parallel; superscalar out-of-order execution is in fact the foundation of the performance of modern CPUs anyway. What I mean is that instruction scheduling is inherently sequential: you cannot start execution of instruction i+1 before having examined instruction i. The system must figure out the argument dependencies, allocate/rename registers, etc. before the instruction can be submitted to the execution unit. And while modern CPUs can do these things for multiple instructions simultaneously, this is a constant-factor reduction, not an asymptotic one.

The context of the statement you quote was the discussion with @Gerdi about parallelization and asymptotic complexity. It's all getting quite complicated and I find it difficult to express myself with 100% clarity, so I suppose that some of the things I wrote are easily misunderstood.

ARM has a tiny advantage over x86 here. The math format of x86 is either d = s <op> d or s = s <op> d meaning one of the source arguments is also the destination, whereas ARM's math format is d = s <op> a where the destination may or may not be one of the arguments. This means that dependencies can be somewhat reduced if source values appear in multiple parts of a computation. This is not a huge advantage, but a small advantage spread over a large amount of code can add up to something non-small.

This also ties nicely into @Gerdi's mention of the larger available register namespace on ARM. An ISA like x86 not only has to juggle the accumulator-type instructions you mention (they did move to a three-argument form for AVX, but that loses the often-mentioned "advantage" of shorter instruction encoding), it's also fundamentally register starved. These things absolutely make a difference in real-world code.
 

Botts85

macrumors regular
Feb 9, 2007
229
175
I know a lot of people share your mindset. I am different. I don't care what Intel does. I care about what Apple does regardless of Intel or anyone else. I buy Apple products because Apple has been giving me what I think are good reasons to do so.

I get your point, though. Competition can be good for the consumer.
Competition is good. A strong Intel is good for the market.

If Apple has a serious challenge choosing the highest performing chip between Apple Silicon, Intel, and AMD, that means there are plenty of high-performing chips to choose from.

That said, architectural differences limit how realistic it is for Apple to switch silicon again, but Apple can't rest on its laurels with Apple Silicon if AMD/Intel/Nvidia are improving rapidly.

A strong Intel also gives Apple other options for things like non-SoC graphics processors.
 

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
This is absolutely incorrect.

Take the simple expression ab + cd. The computer can calculate ab and cd simultaneously and then add the two products. Similarly, if a target address requires a multistep calculation, that calculation can be performed at the same time as the value to be stored there. Large fractions of a simple linear program can be parallelized without compromising the effective sequential progress of the code, and compilers have become very good at arranging the instruction stream for optimal concurrent dispatch.

ARM has a tiny advantage over x86 here. The math format of x86 is either d = s <op> d or s = s <op> d meaning one of the source arguments is also the destination, whereas ARM's math format is d = s <op> a where the destination may or may not be one of the arguments. This means that dependencies can be somewhat reduced if source values appear in multiple parts of a computation. This is not a huge advantage, but a small advantage spread over a large amount of code can add up to something non-small.

Not really sure I understand the second big paragraph. If I have d=s <op> d, the dependency on d doesn’t matter; the scheduler will be looking for any prior instruction that affects d, not the instruction being scheduled. So I’m not seeing how forcing (or not forcing) one source to be the same location as the destination matters.

In other words:

(1) d = d + a
(2) a = b + a
(3) d = d + c

Instruction 3 can’t issue until after (1), but can issue before (2). But if instruction (3) were, instead, “e = d + c,” the same would be true.
 

Sydde

macrumors 68030
Aug 17, 2009
2,563
7,061
IOKWARDI
Not really sure I understand the second big paragraph. If I have d=s <op> d, the dependency on d doesn’t matter; the scheduler will be looking for any prior instruction that affects d, not the instruction being scheduled. So I’m not seeing how forcing (or not forcing) one source to be the same location as the destination matters.

In other words:

(1) d = d + a
(2) a = b + a
(3) d = d + c

Instruction 3 can’t issue until after (1), but can issue before (2). But if instruction (3) were, instead, “e = d + c,” the same would be true.
Well, what I mean is that if you need to use the original value of d later on, you have to move (or reload) it, but with 3-register math, you almost never have to do a register-to-register move (unless, on ARM, it involves R30 being set aside briefly).

Granted, x86 supports direct memory math, and that might be a bit of a time saver in some cases. How much time it might save, though, is not really clear. To do something like add [EBX+0x0C], EAX would result in the dispatcher issuing 3 μops, which works out to exactly the same amount of work as load-add-store on RISC – less code but a little more overhead for the decoder/dispatcher. Inherent result-to-memory operations seem to me almost vestigial in modern computing and are probably a burr that holds back Intel from doing weakly-ordered memory semantics.
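Roughly what that memory-destination add breaks down into, sketched in C with the schematic instruction sequences in comments (the register choices and the exact μop split are assumptions and vary by core):

```c
#include <stdio.h>

/* One x86 read-modify-write instruction vs. the load/add/store it gets
 * cracked into on a RISC-style machine. */
void add_to_field(int *obj, int val)
{
    /* x86-64 (one instruction, cracked into ~3 μops: load, add, store):
     *     add dword ptr [rdi + 0x0C], esi
     *
     * load/store-style equivalent (AArch64-ish, three instructions):
     *     ldr  w8, [x0, #12]
     *     add  w8, w8, w1
     *     str  w8, [x0, #12]
     */
    obj[3] += val;            /* offset 0x0C = 3 * sizeof(int) */
}

int main(void)
{
    int obj[4] = { 0, 0, 0, 40 };
    add_to_field(obj, 2);
    printf("%d\n", obj[3]);   /* prints 42 */
    return 0;
}
```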
 

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
Well, what I mean is that if you need to use the original value of d later on, you have to move (or reload) it, but with 3-register math, you almost never have to do a register-to-register move (unless, on ARM, it involves R30 being set aside briefly).

Granted, x86 supports direct memory math, and that might be a bit of a time saver in some cases. How much time it might save, though, is not really clear. To do something like add [EBX+0x0C], EAX would result in the dispatcher issuing 3 μops, which works out to exactly the same amount of work as load-add-store on RISC – less code but a little more overhead for the decoder/dispatcher. Inherent result-to-memory operations seem to me almost vestigial in modern computing and are probably a burr that holds back Intel from doing weakly-ordered memory semantics.

Ah, I see. Of course, register-to-register transfers can sometimes be done in zero time, by updating the pointers in the register renamer, as part of the issue stage.

And, of course, another problem with direct-to-memory operations is that you would often be better off keeping the result in a register until you’re really done with it, but x86 has too few registers and bad register flexibility.
 

Gerdi

macrumors 6502
Apr 25, 2020
449
301
Not really sure I understand the second big paragraph. If I have d=s <op> d, the dependency on d doesn’t matter; the scheduler will be looking for any prior instruction that affects d, not the instruction being scheduled. So I’m not seeing how forcing (or not forcing) one source to be the same location as the destination matters.

In other words:

(1) d = d + a
(2) a = b + a
(3) d = d + c

Instruction 3 can’t issue until after (1), but can issue before (2). But if instruction (3) were, instead, “e = d + c,” the same would be true.

You did correctly point out that for (3) register renaming would not help, because there is a true dependence (1)<(3).
More interesting is instruction (2) here, as the register renamer could remove the anti-dependence (1)<(2), which extends the valid schedules :)
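A toy register renamer run over exactly those three instructions, to show the anti-dependence on a disappearing (and why the zero-cost register-to-register moves mentioned above are just a map update). Purely illustrative; real renamers also manage free lists, flags, checkpoints, and so on:

```c
#include <stdio.h>

enum { A, B, C, D, NREGS };
static const char *names[NREGS] = { "a", "b", "c", "d" };

static int map[NREGS];      /* architectural register -> physical register */
static int next_phys = 0;

/* rename one "dst = src1 + src2" instruction: sources read the current
 * mapping, the destination is given a brand-new physical register */
static void rename_op(int dst, int src1, int src2)
{
    int p1 = map[src1], p2 = map[src2];
    map[dst] = next_phys++;
    printf("p%-2d = p%-2d + p%-2d   (%s = %s + %s)\n",
           map[dst], p1, p2, names[dst], names[src1], names[src2]);
}

int main(void)
{
    for (int r = 0; r < NREGS; r++) map[r] = next_phys++;   /* initial mapping */

    rename_op(D, D, A);   /* (1) d = d + a */
    rename_op(A, B, A);   /* (2) a = b + a  -- writes a NEW physreg, so it no
                                 longer has to wait for (1) to finish reading a */
    rename_op(D, D, C);   /* (3) d = d + c  -- still truly depends on (1)       */
    return 0;
}
```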
 

skaertus

macrumors 601
Feb 23, 2009
4,252
1,409
Brazil
Really confused by this forum but I may move on because this forum is proving more and more not to be an "Apple Enthusiast" site. People say Apple needs competition in order to innovate (strange since they are the minority), then Apple innovates on overdrive and puts out something amazing such as the M1. Apple is currently killing it but for some people it's still not enough. SMH. Now the OP wants Intel to push Apple back down to force Apple to do even better. It never seems people want Apple to actually win. Intel has kept Mac users with overheating Macs, short battery life and noisy fans trying to keep the super hot Intel processor cooled. I think Apple has proven that they don't need competition in terms of processors. Obviously Apple is trying to make the best computers for Mac customers so it makes no sense for Intel to push Apple when Intel is the one who's been failing us for years.

Also, what's the point of a processor war between Apple, Intel and AMD? The only two that should compete are AMD and Intel, since they do x86 processors. Apple serves the Macintosh community, so it makes no sense for it to be part of a processor war. Either buy a Mac or buy Windows, but if people are going to buy based on processors alone, then it doesn't sound like they do anything on their computers beyond web surfing. I have a lot of Mac-exclusive software, and I prefer the Mac UI on software made for both Mac and Windows, so I wouldn't care if Intel made a faster processor; I wouldn't give a Mac up for a faster Windows machine. At the end of the day it's about running my software.
Of course it matters. A person may buy PCs or Macs, but they are ultimately buying computers.

It is important that Intel, or AMD, or Qualcomm, or all of them, are worthy competitors in the processor arena. If Apple does not feel compelled to, it will not put so much effort and resources in making its processors even better.

Plus, there is pricing. You can buy a $999 MacBook Air today because there are PC laptops being sold for a fraction of this price. And there are PC laptops which are nearly equivalent to Macs which are sold for roughly the same price. Now, imagine how expensive Macs would become if there were no PC equivalents to compete with. If Apple feels that a Mac is twice as fast as a PC, it may feel compelled to charge twice the price.

Competition is important. Apple is only pushing the boundaries of innovation to show that it can provide better products than the competitors and therefore charge more. If competitors stand still, then Apple can either stop innovating or charge more for its products, or do both.
 

scoobysnax

macrumors regular
Apr 2, 2016
153
147
I know the architecture is different, but it would be nice if the M1s could support Boot Camp. It's not like I boot into it often, but it's certainly nice to be able to when needed. I also love the native dual-monitor support, which is sorely missed on the M1 MacBooks.
 

Gerdi

macrumors 6502
Apr 25, 2020
449
301
Microsoft's WOA really doesn't run all Windows x86/64 software. They went a different path than Rosetta, and it shows in the speed and ability to run x86. At least that's where things are now, so we don't know if there will ever be a fully compatible Windows laptop running ARM that has crazy battery life and great performance, and runs all their software.

That's not really true at all. Rosetta has essentially the very same limitation as WOA x64 emulation - namely, kernel mode code cannot be emulated. Performance is also very comparable. So if you look at the apps that are not running - these are mostly games with kernel mode anti-cheat drivers, or utilities like disk utilities or virus scanners, which require a low level kernel driver. And then some x86 software runs out of memory due to the 2 GByte process space limitation - which does not happen under x64 emulation.
I encountered very few applications that did not run due to shortcomings of the emulation itself. And while you are technically correct that not all apps are running, you somehow make it sound as if there were a larger gap.
 

Maconplasma

Cancelled
Sep 15, 2020
2,489
2,215
Of course it matters. A person may buy PCs or Macs, but they are ultimately buying computers.
If a person uses Macs, a faster processor in a PC is not going to entice them into buying a Windows machine. If it does, then they were not using their Mac for anything important that involves Mac software. People like that generally just web surf, and any computer is fine with them.
It is important that Intel, or AMD, or Qualcomm, or all of them, are worthy competitors in the processor arena. If Apple does not feel compelled to, it will not put so much effort and resources in making its processors even better.
Apple is not the one that needs the competition in the processor space. Intel is. You sound like Apple is the one that needs a light under the butt. No. Intel does.
Plus, there is pricing. You can buy a $999 MacBook Air today because there are PC laptops being sold for a fraction of this price.
And if price was all that mattered to people then Apple's computer line would've been dropped many years ago.
And there are PC laptops which are nearly equivalent to Macs which are sold for roughly the same price.
So? This isn't a discussion about which Windows PC to buy. Definitely sounds like you're in the Windows camp because you're doing an awful lot of defending of it.
Now, imagine how expensive Macs would become if there were no PC equivalents to compete with.
And vice versa. If there were no Macs and the world lived only on Windows machines there would be no competition so you're making a non-point here.
If Apple feels that a Mac is twice as fast as a PC, it may feel compelled to charge twice the price.
The M1 Macs ARE twice as fast (and more, depending on the task), with twice the battery life and more than 90% lower heat dissipation than PCs in their class. The M1 MacBook Air and MacBook Pro are the same price as their older Intel-based counterparts. It's 100% nonsense that Apple would charge twice the price. They haven't, right? Of course not. What a ridiculous thing to say.
If competitors stand still, then Apple can either stop innovating or charge more for its products, or do both.
Well, Intel has been standing still for decades, and with the introduction of the M1 and the great success it has been so far, I don't see any price hikes from Apple. Stop making stuff up just because it makes sense to you, because everything you wrote is nonsensical and nothing but some crazy conspiracy theory. What I love about this forum is how people continually throw jabs at Apple stating that they need the competition. NO. Intel needs the competition. Windows machines need the competition. Macs are in the minority and Microsoft has the monopoly on the world of computing. Sounds like you're perfectly fine with Microsoft maintaining that monopoly, because you sure are doing a great deal of defending of Windows machines.....on a Mac forum. SMH.
 

LeeW

macrumors 601
Feb 5, 2017
4,342
9,446
Over here
I hope Intel comes back with something as well. I mean, we are all consumers; options are a good thing.
 

bobcomer

macrumors 601
May 18, 2015
4,949
3,699
That's not really true at all. Rosetta has essentially the very same limitation as WOA x64 emulation - namely, kernel mode code cannot be emulated.
That's true, but I was speaking more about Rosetta and translation, rather than emulation. It doesn't work on everything, but translation works really well when it does work. WOA doesn't do translation. (It does do some library-based stuff, but it's a pale comparison to what Rosetta does. Not to mention that there's hardware there to support Rosetta.)

Performance is also very comparable.
I haven't found it to be even close performance-wise.

Performance is also very comparable. So if you look at the apps that are not running - these are mostly games with kernel mode anti-cheat drivers, or utilities like disk utilities or virus scanners, which require a low level kernel driver.
I don't do games, I don't usually run any kernel-mode anti-cheat drivers or disk utilities, and I use Defender for AV, but WOA doesn't run everything I need anyway. And across all the builds, sometimes it works better than others, and sometimes worse.

I encountered very few applications that did not run due to shortcomings of the emulation itself. And while you are technically correct that not all apps are running, you somehow make it sound as if there were a larger gap.
All I can go by is my own experience. The biggest problem for me is that there's no licensing available for it. I can work around the apps I can't use, but I can't change the licensing situation. Will it ever change? Who knows, but I wouldn't bet money on it.
 