
Confused-User

macrumors 6502a
Oct 14, 2014
850
983
Hardly. A context switch is around 5 microseconds. If you run each thread for 1ms, you have spent one second running the threads and 0.5ms on context switching (let’s round this up to 1ms because I feel generous). That’s 1% loss. If you run each thread for 10ms, that’s 0.1% loss. And so on.
I believe you're off by an order of magnitude... though the actual numbers make your argument stronger, so that's ok. :)
It's 0.1% and 0.01%.
 

Confused-User

macrumors 6502a
Oct 14, 2014
850
983
I believe you're off by an order of magnitude... though the actual numbers make your argument stronger, so that's ok. :)
It's 0.1% and 0.01%.
@leman OK this is pretty funny - I just noticed this now...

Your actual bottom-line numbers were right, even though you did have an order-of-magnitude error. You wrote "A context switch is around 5 microseconds. If you run each thread for 1ms, you have spent one second running the threads and 0.5ms on context switching" - and that's where the error was. If you run each thread for 1ms, that's 1000 switches, which means 5ms of context switching, not 0.5ms, and if you round up, then yes, you do have the 1% loss you stated.
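The arithmetic above is easy to sanity-check in a few lines (assuming, as in the thread, a 5 µs context-switch cost):

```python
SWITCH_US = 5.0  # assumed cost of one context switch, as stated in the thread

def overhead_pct(quantum_ms: float) -> float:
    """Percent of time lost to switching when each thread runs for quantum_ms.

    Each quantum of useful work pays for one context switch, so the loss is
    simply the switch cost divided by the quantum length."""
    quantum_us = quantum_ms * 1000.0
    return 100.0 * SWITCH_US / quantum_us

print(overhead_pct(1))   # 1 ms quanta -> 0.5% lost (the ~1% figure after rounding up)
print(overhead_pct(10))  # 10 ms quanta -> 0.05% lost
```

This reproduces the corrected numbers: 0.5% loss at 1 ms quanta and 0.05% at 10 ms.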

(And just to be clear, this does not in any way invalidate his, roboto's, or my point.)
 

Icelus

macrumors 6502
Nov 3, 2018
421
574

Working
  • Touchpad
  • Keyboard (only post suspend&resume, i2c-hid patch WIP)
  • eDP, with brightness control
  • NVME
  • USB Type-C ports in USB2/USB3 (one orientation)
  • WiFi
  • GPU/aDSP/cDSP firmware loading (requires binaries from Windows)
  • Lid switch
  • Sleep/suspend, nothing visibly broken on resume
Not working
  • Speakers (WIP, pin guessing, x4 WSA8845)
  • Microphones (WIP, pin guessing)
  • Fingerprint Reader (WIP, USB MP with ptn3222)
  • USB as DP/USB3 (WIP, PS8830 based)
  • Camera
  • Battery Info
Should be working, but cannot be tested due to lack of hw
  • Higher res OLED, higher res IPS panels
  • Touchscreen
 

crazy dave

macrumors 65816
Sep 9, 2010
1,450
1,219
Tom's Hardware just published a short article with a Snapdragon X Elite die shot.

The article title is pretty amusing, considering the comparison to the M4 within the article!
Comparing the die shots of the Snapdragon Elite with the M4 and the AMD Strix Point:



So one thing I didn't realize: I thought the AMD chip was on regular N4, which I believe is what the Qualcomm Snapdragon SoC is fabbed on, but in fact the AMD chip is fabbed on the slightly newer N4P, which has 6% more performance than N4 but should have the same transistor density, which is what we care about here.

However, here are some interesting numbers (all numbers for the AMD chip, and the SLC for the M4, are my estimates based on pixel-area ratios relative to total die area):

all sizes in mm^2    CPU core             All CPUs + L2     L3                  CPU + L2 + L3   Die
Snapdragon Elite     2.55                 48.7 (36MB)       5.09 (6MB)          53.79           169.6
AMD Strix Point      3.18 Z5 / 1.97 Z5c   42.6 (8MB)        15.86 (16MB + 8MB)  58.5            225.6
Apple M4*            3.00 P / 0.82 E      27 (16MB + 4MB)   5.86 (8MB?)         32.86           165.9
*Unclear if the AMX (SME) coprocessor is being counted here; I don't think it is, so the M4 numbers might be off. Maybe someone who knows how to read a die shot can find it and confirm. Also, it'd be great if someone could dig up an annotated die shot of an M2 or, even better, an M2 Pro, as one manufactured on N5P would be the most apples-to-apples (pun intended) comparison point.
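The pixel-ratio estimate described above is just proportional scaling; a minimal sketch (the pixel counts here are made-up placeholders, not measurements from the actual die shots):

```python
def component_area_mm2(component_px: float, die_px: float, die_area_mm2: float) -> float:
    """Estimate a block's physical area by scaling its pixel area
    by the chip's known total die area."""
    return die_area_mm2 * (component_px / die_px)

# Hypothetical numbers: a block covering 2% of the die's pixels
# on a 225.6 mm^2 die works out to about 4.5 mm^2.
print(component_area_mm2(20_000, 1_000_000, 225.6))  # -> 4.512
```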

Right off the bat, this is why I don't consider comparisons of the multicore performance of the Strix Point or the Elite to the base M-series "fair". We already knew this just from core count and structure alone, but we can see that the Elite and Strix Point CPU areas are massive compared to the Apple M4's. Another thing that stands out is that the Apple/Qualcomm SLC ("L3") seems fundamentally different from the AMD Strix Point L3, which appears to function much more like the L2 of the Elite/M4 (i.e. the L3 of the AMD chip is per CPU cluster rather than a last-level cache for the SoC). Thus I would actually consider the appropriate size comparison to be as follows:

all sizes in mm^2    "CPU Size"
Snapdragon Elite     48.7
AMD Strix Point      58.5
Apple M4*            27

Further, the L1 caches for the various CPUs are not broken out above. So here are their relative sizes in KiB (I only have data for the M3; unclear if the M4's are the same or bigger):

L1 cache per core (instruction + data)
Snapdragon Elite     192+96 KiB
AMD Strix Point      32+48 KiB
Apple M3             192+128 KiB P / 128+64 KiB E

In other words, a much larger portion of the Elite and Apple ARM cores is L1 cache compared to the Strix Point. Cache is less sensitive to die shrinks, so the logic in those cores takes even less area than the raw core sizes above suggest - and those already make it appear that the Zen 5 core, and especially the Zen 5c core, are only beginning to match the Apple M4 in size despite being on a less dense node. That said, the M4 is clearly a beefy ARM CPU; no doubt its extra, extra wide architecture is playing a role here.

Differences in vector design likewise play a role in core size. I believe the Elite has 4x128b FP units, and I can't remember whether the M4 has 4 or 6 such 128b NEON units. Strix Point cores have 256b-wide vector units, but with certain features that let them "double-pump" AVX-512 instructions, making them larger and more complex than normal 256b AVX units. I believe there are 4 such vector units (unsure if the "c" cores have fewer).

Comparing the Elite and the Strix Point: the Strix Point CPU is about 20% bigger (and the die is about 33% bigger overall) despite the Elite having slightly larger L2 and L3 caches. Smaller and manufactured on a slightly older node, the Elite should be significantly cheaper to make than the Strix Point, and the smaller CPU is part of the reason why. How does performance fare between the two? Next post, which also compares them (unfairly, in multithreaded) to Lunar Lake.
 
Last edited:
  • Like
Reactions: senttoschool

crazy dave

macrumors 65816
Sep 9, 2010
1,450
1,219
A revisualization of Notebookcheck's Cinebench R24 performance and efficiency data.

Screenshot 2024-09-25 at 6.34.34 AM.png


Details: These are results from only one benchmark, Cinebench, which has gone from being one of Apple's worst-performing benchmarks in R23 to one of its best in R24. As I am interested in getting as close as possible to the efficiency of the chip itself, the power measurements above subtract idle power, which NotebookCheck does not do when calculating the efficiency of the device. With the release of Lunar Lake, an N3B chip, I've added the M3, Apple's corollary to Lunar Lake, and estimated the M3 Pro's efficiency based on its power usage in R23 and the base M3's power/performance in R23/R24 (NotebookCheck did not have R24 power data for the M3 Pro). I suspect this slightly overestimates the M3 Pro's efficiency, but not by enough to matter given the gulf between it and every other chip. The M3 result is from the Air; given that Cinebench is an endurance benchmark, its score and power usage will both likely be higher in the 14" MacBook Pro. I also had the Snapdragon X1E-84-100 in the chart but removed it since it was clutter and didn't add much. The Ultra 7 258V is one of the upper-level Lunar Lake chips, but not the top bin; the 288V might improve efficiency/performance somewhat by having better silicon, but the effect will be small relative to the patterns we see.
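The idle-power correction described above amounts to the following (the numbers are illustrative placeholders, not NotebookCheck's measurements):

```python
def chip_efficiency(score: float, load_w: float, idle_w: float) -> float:
    """Points per watt with the device's idle draw subtracted, approximating
    the efficiency of the chip rather than of the whole laptop."""
    return score / (load_w - idle_w)

# Illustrative: 1000 pts at 40 W measured under load, 5 W at idle.
print(chip_efficiency(1000, 40, 5))  # -> ~28.57 pts/W
print(1000 / 40)                     # -> 25.0 pts/W without the correction
```

Subtracting idle power always raises the computed efficiency, and it raises it more for devices with high idle draw, which is why device-level and chip-level rankings can differ.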

So what do we see? First off, Lunar Lake has great single-core performance and efficiency ... for an x86 chip - helped perhaps by being on a slightly better node, N3B, than the M2 Pro (N5P), HX 370 (N4), and Snapdragon (N4). Despite this advantage, the Snapdragons on N4 and the M2 Pro on N5P are still superior in ST performance and efficiency. The Ultra 7 288V might increase performance enough to match or slightly beat the Elite 84 (not pictured), but at the cost of even more power. That said, the efficiency and performance improvements here are enough to make x86 potentially competitive with this first generation of Snapdragons - at least enough that, given the compatibility issues, Intel can claim wins over Qualcomm and begin to lessen its appeal.

However, this has come at the cost of MT performance. The prevailing narrative is that without SMT2/HT, Intel struggles to compete against AMD and Qualcomm. To a certain extent that's true, but with only 4 P-cores and 4 E-cores in a design optimized for low-power settings, it was never going to compete anyway. The review mentions it gets great battery life and decent performance on "everyday" tasks, in stark contrast to the full-tilt performance represented by Cinebench R24 - and after all, this is for thin-and-lights. The closest non-Apple analog to Lunar Lake is the 8c/8T Snapdragon Plus 42, whose ST performance is a little lower than the 258V's, but with much, much greater efficiency, and whose MT performance and efficiency are much better than the Lunar Lake chip's. However, the Snapdragon Plus 42 has a significantly cut-down GPU, which was already the weakest part of the processor. I'm not saying it can't make for a compelling product, especially if priced well, but given the compatibility issues it's a tougher sell for Qualcomm than it would've been last year. As for AMD, there is no current analog to Lunar Lake in AMD's lineup. Sure, a down-clocked HX 370 gets fantastic performance/efficiency at 18W ... but that's to be expected from a 12c/24T design, which would frankly be cramped inside thin-and-lights - it's not really meant for that kind of device. It's an Mx Pro-level chip at heart and should be compared to the upcoming Intel Arrow Lake mobile processors. AMD's smaller Kraken Point, with a more comparable CPU, is supposedly coming out next year, but it is again rumored to cut the GPU, and according to the NotebookCheck review the Intel iGPU in Lunar Lake is already competitive with, if not better than, the AMD iGPU in the larger Strix Point. It's fascinating how AMD and Qualcomm both designed more workload-oriented, CPU-heavy chips while Intel basically designed Lunar Lake to be like the base M3: more well-rounded.

But that brings us to the M3, and the comparisons here are pretty ugly for all of its competitors. Again, Apple tends to do very well in CB R24, so we shouldn't extrapolate from this one benchmark that it will be quite this superior to AMD, Intel, and Qualcomm in every benchmark. With that caveat aside ... damn. The ST performance and efficiency are out of this world and simply blow the other N3B chip, the Lunar Lake 258V, away, with both a large performance gap and an even larger efficiency gap, nearly 3x. Even the M2 Pro and the Snapdragons are just not that close to it. Sure, in MT a down-clocked Strix Point can match the base M3's performance profile at 18W, but that is a massive CPU by comparison, and the comparable Apple chip to the HX 370, the M3 Pro, is leagues better than anything else in this chart, including the HX 370. I have to admit: while Apple adopted the 6 E-core design for the base M4, if the M4 Pro doesn't get its own bespoke SoC design and is instead a chop of the Max, then, depending on how Apple structures the upcoming M4 Max/Pro, it'll be a shame to see Apple lose a product at this performance/efficiency point. The M3 Pro is rather unique. Also, its 6+6 design really highlights how much the E-cores (and P-cores) improved moving from the M2 to the M3, especially in this workload.

Meanwhile the two chips of comparable CPU design to the base M3, the Plus 42 and the 258V, are simply not a match for it in MT, requiring double the power or more to match its performance, or otherwise offering significantly reduced performance at the same power level. Intel claimed to match/beat the M3 in a variety of MT tasks in its marketing material, but aside from specially compiled SPEC benchmarks, you can see how much power it takes to actually do that. Basically, Apple can offer a high level of performance (for the form factor) in a fanless design, and its competitors, including the N3B Lunar Lake, simply cannot. Like Lunar Lake, Apple also went for a balanced design here, pairing its CPUs with powerful-for-their-class iGPUs (though obviously some of these chips, especially the Strix Point, can also be paired with mobile dGPUs). There is a point to be made about the base MacBook Pro 14", which is quite expensive, has a fan, and is still the base M3 with a low base memory/storage option - but even so, as we can see above, the base Apple chip is not without its merits at that price point/form factor. Oh, and ... this is the performance/efficiency gulf between the newest generation of AMD, Intel, and Qualcomm processors and the M3 ... with the M4 Macs about to come out. 😬

Comparing the Snapdragon and the HX 370: the higher-end Elite chips (e.g. the Elite 80) should be on the same multicore performance/W curve as the HX 370 whilst being 20% smaller on a slightly older node (same density though). Single thread is a similar story but greatly exaggerated: the Oryon core is again roughly 20% smaller than the Zen 5 core but with much greater ST performance and efficiency. This represents a manufacturing and perf/W advantage for ARM chips relative to x86. That said, is it a big enough advantage to overcome compatibility issues? Maybe, maybe not. Further, we don't yet know how Arrow Lake and Intel's new cores compare in size. Compatibility issues may go down over time, but these advantages Qualcomm enjoys may also shrink.

Another thing occurred to me looking at this chart comparing the Qualcomm Elite to the M2 Pro: I once estimated that the Elite was missing 20% of its multicore performance in CB R24, based on how 12 M2 P-cores should behave. However, here we see that at the same ST CB R24 performance the Elite 80 is ... 20% less efficient than the M2 Pro's P-core. If that's true at lower clocks as well (i.e. in a multithreaded scenario), then that alone could explain the discrepancy. That's a big if, but it's plausible.
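The back-of-the-envelope reasoning here is that in an all-core workload the chip runs at a fixed package power budget, so throughput is roughly efficiency times power; a 20% efficiency deficit then maps directly onto a 20% score deficit. A toy model (all numbers hypothetical):

```python
def mt_score(power_budget_w: float, pts_per_joule: float, seconds: float = 60.0) -> float:
    """At a fixed package power budget, the multicore score scales with
    how many benchmark points each joule of energy buys."""
    return power_budget_w * seconds * pts_per_joule

base  = mt_score(30, 1.0)  # hypothetical reference chip at a 30 W budget
worse = mt_score(30, 0.8)  # same budget, cores 20% less efficient
print(worse / base)        # -> 0.8, i.e. 20% of the MT score "missing"
```

This is only a first-order model (it ignores clock/voltage curves), but it shows why an iso-power efficiency gap shows up one-for-one in a power-limited MT result.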

References:

 
Last edited:

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
it'd be great if someone could dig up an annotated die shot of an M2 or, even better, an M2 Pro, as one manufactured on N5P would be the most apples-to-apples (pun intended) comparison point
P-core:
P-core.png

E-core:
E-core.png

GPU:
gpu.png

 
  • Like
Reactions: crazy dave

crazy dave

macrumors 65816
Sep 9, 2010
1,450
1,219
P-core:
View attachment 2430433
E-core:
View attachment 2430432
GPU:
View attachment 2430434
Thanks! Based on this, the M2 Pro's P+E CPU complex would've been roughly the same size as the Snapdragon Elite's CPU, albeit with 4 smaller E-cores, 2 P-AMX units, 1 E-AMX unit, and the 8 P-cores being slightly bigger. Plus a ~6% density advantage for the Elite from being on N4 vs N5P, as I believe N5P has the same density as N5.
 
Last edited:

ewitte

macrumors member
Jan 3, 2024
45
25
Even if the CPU's performance is decent, the RAM speed and GPU are massively slower. It's just normal dual-channel RAM with a slightly higher clock.
 

nonns

macrumors regular
Sep 10, 2008
135
90
View attachment 2301539
View attachment 2301538

Maybe Apple Silicon is in danger?
They're comparing it to the M2, not the M3 or M4. Power consumption is important too. The gap will close, but it's going to take time, and there will still be many reasons to go with Apple.
 
  • Like
Reactions: genexx

genexx

macrumors regular
Nov 11, 2022
221
124
Well, most software does not run, or runs only under emulation, on Win11 ARM - at least on the Snapdragon X Elite.

Still no real Linux, only talk, and that despite the huge ARM Linux ecosystem.

Have you seen their dev kits shipping with Windows Home? Are you freakin' kidding me?

They burn much more energy and are mostly loud.

I have Windows 11 for ARM Pro 24H2 in a VM via UTM.

Their marketing works well, but there is no new Jesus on the ARM planet.
I am still very happy with my macOS MBA M2 of two years, plus fully working ARM Linux on UTM.

In a VM I can get around Win11 ARM's lack of compatibility, because it only has to do the minimal parts I need it for.

I was planning to maybe buy one, but... not usable today, and not even by the end of next year, as it seems.
Better to get an M4.

What I like are the specs, the price, and the 4K webcam - but what would I use it for? A surf station with some video calls, then switch to the real PC/Mac? NOPE.

I still like my Lenovo 14" AMD with dual boot: Win11 Pro with several VMs and WSL / Zorin, and 32GB/2TB.
Tamed the fan with a DIY cooler upgrade and it is very silent; the fan barely switches on.

Bought a Cannondale Tesoro X1 instead - better for my health anyway.
 
Last edited:
  • Like
Reactions: schnaps

genexx

macrumors regular
Nov 11, 2022
221
124
You should check the software you use is supported before buying. Here, you’re welcome.
This list is nowhere near accurate or complete, as I've noticed myself: I've run Windows 11 for ARM 24H2 Pro in a UTM VM on my MBA M2 since mid-2022 and have a lot of ARM practice.

Having used Windows on ARM for two years, I know development there is very slow, and for a reason. Using an X Elite-based laptop only as a surf station does not justify a purchase. That is the simple reason sales do not meet expectations; what do you expect if a car does not drive?

The M4 from the iPad leaves it in the dust anyway.
AMD's Ryzen AI 9 is close.
Intel is working hard too.

I could imagine they overplayed their marketing and will lose in the long run.

Since the original company, staffed with ex-Apple personnel, was developing for servers, I still do not get what Qualcomm messed up so badly that installing Unix/Linux is this complicated.
There is so much ARM Linux out there...

My MBA M2 has been doing a really good job since 2022 and I feel no need to upgrade.
If I did more extensive 4K video editing or 3D, there are plenty of x86 "NASA PCs" within my reach.

There is a licensing lawsuit, too:

It is somehow funny to see them all trying to catch the goal Apple has set.

😂🤣😅😇
 
Last edited:
  • Like
Reactions: schnaps

komuh

macrumors regular
May 13, 2023
126
113
This list is nowhere near accurate or complete, as I've noticed myself: I've run Windows 11 for ARM 24H2 Pro in a UTM VM on my MBA M2 since mid-2022 and have a lot of ARM practice.

Having used Windows on ARM for two years, I know development there is very slow, and for a reason. Using an X Elite-based laptop only as a surf station does not justify a purchase. That is the simple reason sales do not meet expectations; what do you expect if a car does not drive?

The M4 from the iPad leaves it in the dust anyway.
AMD's Ryzen AI 9 is close.
Intel is working hard too.

I could imagine they overplayed their marketing and will lose in the long run.

Since the original company, staffed with ex-Apple personnel, was developing for servers, I still do not get what Qualcomm messed up so badly that installing Unix/Linux is this complicated.
There is so much ARM Linux out there...

My MBA M2 has been doing a really good job since 2022 and I feel no need to upgrade.
If I did more extensive 4K video editing or 3D, there are plenty of x86 "NASA PCs" within my reach.

There is a licensing lawsuit, too:
To be honest, the new Intel laptops are probably the best stuff I've seen in laptops in a while, outside of the M series.

My sister needed Windows for work and likes to play games from time to time, so I bought her a Zenbook with a Core Ultra 7 258V, 32 GB of RAM, and a 1 TB SSD (srsly, can we get back to normal names, AMD and Intel? just stop). It got around 13-14 h of battery before she started to charge it (she tries to stay in the 20-80% range, so it's probably ~19 h from 100% to 0 with the 120Hz OLED). It was also ~100 USD cheaper than an M3 Air with 16 GB, a 512 GB SSD, and a 60Hz screen here in the EU - not even comparing it to M3 Max or Pro prices.

And how the freak is it so snappy? My Windows 11 machine with an RTX 3090 and a 12700K isn't even close to that level - not even saying anything about macOS, where I can't get 120Hz animations for switching desktops, and a lot of stuff is still locked to 30/60Hz animations that you can't do anything about without turning SIP off. (It's already been 4 years since the first 120Hz displays on MacBooks - just fix it guys, please; maybe get a Tim Cook check and hire a few extra software engineers to polish your OS for the modern era.)

If I didn't already have a MacBook I would probably buy one for myself too, as I rarely need to use any Apple software outside of work, and I have a Mac Ultra for that. Maybe the new AMD AI is also as good; if yes, we are probably close to parity between Windows and Apple laptops.
 
  • Like
Reactions: schnaps and genexx

genexx

macrumors regular
Nov 11, 2022
221
124
I would buy a Windows-based laptop for my daughters if they needed a new one as well.

But for me, working on Linux servers all day, there are many advantages to using a Mac (MBA M2 on a 165Hz 32" display in clamshell, btw), besides the fact that I've just been used to it since 1992, starting with DTP.

(The minimal VB6 SQL stuff has been running on Win11 ARM via UTM on my MBA M2 since 2022.)

If I wanted, I could do it on my daughters' NASA PCs, or on my Lenovo AMD laptop, or on the Z490 Hackintosh with Win11 and Proxmox - but it is snappy and very fast in the VM, so...

The ARM Mac makes sense because my countless pieces of software run on it, plus it is a Unix system; the new x86 competition from Intel and AMD makes sense as well, because the software runs.

Good that they have to compete.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Canonical is working on Ubuntu support for Snapdragon X Elite laptops. They have managed to run it on a Lenovo Thinkpad T14s.

I find it unbelievable that Linux support on Arm-based laptops is on a case-by-case basis.
 

Xiao_Xi

macrumors 68000
Oct 27, 2021
1,627
1,101
Do you know what the technical reasons behind it are?
It seems that x86-based and Arm-based laptops use different mechanisms to tell Linux what hardware they have: x86-based laptops use ACPI, while Arm-based laptops use Devicetree. For some reason that escapes me, that seems to mean Arm-based laptops require a device-specific Linux kernel rather than a generic one like x86-based laptops use.
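On a running Linux machine you can see which description mechanism the firmware handed the kernel; a quick sketch (these are the standard sysfs/procfs locations, but treat this as a heuristic, not an exhaustive check):

```python
from pathlib import Path

def firmware_description() -> str:
    """Guess whether this machine described its hardware to the kernel
    via Devicetree or ACPI, based on standard sysfs paths."""
    if Path("/sys/firmware/devicetree/base").exists():
        return "devicetree"   # typical for Arm boards and laptops
    if Path("/sys/firmware/acpi/tables").exists():
        return "acpi"         # typical for x86 (and SystemReady Arm) machines
    return "unknown"

print(firmware_description())
```

Note that some Arm machines boot with ACPI too (that is what Arm's SystemReady program pushes for), which is exactly the case-by-case situation being discussed.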
 

jdb8167

macrumors 601
Nov 17, 2008
4,859
4,599
It seems that x86-based and Arm-based laptops use different mechanisms to tell Linux what hardware they have: x86-based laptops use ACPI, while Arm-based laptops use Devicetree. For some reason that escapes me, that seems to mean Arm-based laptops require a device-specific Linux kernel rather than a generic one like x86-based laptops use.
Device trees are common with embedded devices using Linux. Since most of those require building from source, it isn’t such a big deal. For example, Toradex boards come with specific builds for each generation of Linux supported.

This is a problem for pre-built distributions and isn't solved yet, so people are using the embedded-style solutions already supported by the Linux kernel.
 

mfram

Contributor
Jan 23, 2010
1,355
404
San Diego, CA USA
My opinion is the whole situation is a difference between PC culture and the culture of creating embedded devices. PC culture created a system where PCs had to be backward compatible as much as possible and had configuration standards so that many companies could create PCs. Mostly they just had to be compatible with Windows, since that was the dominant O.S. in the industry.

Embedded devices, by definition, were generally limited in storage for both code and data, so they couldn't afford crazy amounts of code and data to remain backwards compatible with previous generations of software. Plus, each device generally had its own proprietary operating system; the hardware and software engineers would work together and create both at once.

Devicetree in Linux was a way to help standardize the method of telling the O.S. about the configuration of each piece of hardware. But DT is created manually by developers for each piece of hardware. It wasn't designed like ACPI, where Windows could automatically detect the configuration from tables in BIOS/UEFI. Eventually the situation will probably get better out of demand for simplicity, given the upcoming popularity of ARM-based PCs. But it's a transition that's going to take a little time.
 
Last edited:
  • Like
Reactions: Xiao_Xi and jdb8167

crazy dave

macrumors 65816
Sep 9, 2010
1,450
1,219
My opinion is the whole situation is a difference in the PC culture vs. the culture in creating embedded devices. PC culture created a system where PCs had to be backward compatible as much as possible and to have configuration standards so that many companies could create PCs. Mostly they just had to be compatible with Windows since that was the dominant O.S. in the industry. Embedded devices, by definition, were generally limited in storage for both code and data. So they couldn't afford crazy amounts of code and data to remain backwards compatible with previous generations of software. Plus, each device generally had its own proprietary operating system. The hardware and software engineers would work together and create both at once. Devicetree in Linux was a way to help standardize the method of telling the O.S. about the configuration of each piece of hardware. But DT is created manually by developers for each piece of hardware. It wasn't designed like ACPI where Windows could automatically detect the configuration by tables in BIOS/UEFI. Eventually the situation will probably get better out of demand for simplicity given the upcoming popularity of ARM-based PCs. But it's a transition that's going to take a little time.
One of the primary problems is that it is a Linux culture problem as well. The best kernel maintainers are overworked and doing it in their spare time; the worst are toxic and impossible to work with.

ARM has been trying for years and years to get SystemReady (UEFI + TPM + other things) adopted as a standard in the Linux-ARM world - precisely to address @Xiao_Xi's observation that, while Devicetree may be simple and easy to work with, redoing the Linux kernel for every device is nuts, and that for all their faults, these standards being universal in the x86 world means the tools are standard too, especially for server/cluster IT. But UEFI and ACPI are large, complex programs with a huge attack surface and potential for bugs, and many kernel maintainers seem to resent that Linux had to support UEFI on x86, because they essentially had no choice. This has led to ARM trying to wrangle chipmakers into getting SystemReady certified, only for a lot of them to respond, "why? it's not even upstream in the kernel."

This, by the way, is the account of frustrated ARM engineers; the maintainers in question may have a different viewpoint. But given the drama around not just this, but Asahi Linux, Rust, and hell, just the use of the archaic mailing list as a development tool, there is a clear problem in the Linux kernel development world and a lot of burnout. Again, I should stress that not every Linux maintainer is like this, but enough are that it is a cultural problem (never mind the larger cultural problems in free software development as a whole).
 

Confused-User

macrumors 6502a
Oct 14, 2014
850
983
Amazing that they decided to go nuclear. What is their endgame here? It does them no favors to boot QC out of the market and destroy the first real chance of ARM making real headway with Windows. Do they think they can win with their own cores? Nobody's even tried so far.
 