
crazy dave

macrumors 65816
Sep 9, 2010
1,453
1,229
Better in what respect?
I mean, why does Alder Lake need 2x the power to beat AMD? Why is its 11th gen so bad compared with Zen 3?

While it's hard to know for sure without the same designs being taped out on different fabrication nodes, third-party analyses estimate that Intel's 10nm power and density are indeed roughly equivalent to TSMC's 7nm (maybe better in some respects, a little worse in others depending on the design), and the Intel 7 node (10nm ESF, or 10nm++ in the old system) should therefore be better. This is what I've been referring to as the good news/bad news for Intel. Their fabrication nodes aren't as far behind as people commonly think, but their microarchitecture (uarch) is - especially the P-core uarch.

The reason the Intel E-cores exist on Alder Lake silicon is that the P-cores take up so much die area and run so hot that Intel has trouble matching AMD in core count and MT (multithreaded) performance. Thus the E-cores act as another layer of almost-SMT/HT (hyper-threading), rather than what we see in the ARM space, where the E-cores are true efficiency cores meant to be genuinely low power and take care of background tasks. As I've written before, if we were to borrow the terminology from ARM, this is huge.Medium rather than big.Little (and while ARM themselves have transitioned to effectively big.Medium.Little, Arm's "big" core is often probably as small as Intel's "medium", if not smaller ;)).

Zen 3 was a remarkable piece of engineering. AMD didn't even change fabrication nodes, and came up with a new design that kept the best parts of Zen 2 and improved everything else. Supposedly it even took AMD by surprise when they first realized just how big an improvement it really was - on their internal roadmaps they hadn't expected quite this level of IPC (instructions per clock) increase going from Zen 2 to Zen 3. Suffice it to say, there's a reason Zen 3 won the AnandTech Gold award.

It should be noted that the 2x power consumption is a bit of an exaggeration. As Andrei from AnandTech wrote on Twitter, ADL only actually breaches the 200W mark in AVX2 (vector) workloads, where it does also manage to extend its performance lead over AMD. Otherwise it stays in the same wattage regime as AMD and may beat it in perf/W in many "everyday" MT workloads (even in AVX2 loads).


You can also downclock the ADL i9 chip so that it only uses about 150W total with only minor drops in performance (around 15% or so) - past that point the performance drop becomes roughly linear with power. This is the "elbow" of the power curve. Intel likes to push their cores past it by default to claim performance wins, but it really does burn a lot of extra energy for every extra bit of performance you squeeze out of the core (AMD does this too, just less extremely).
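To put rough numbers on that elbow, here's a back-of-the-envelope sketch in Swift. The 241W figure is the i9-12900K's rated PL2 boost limit, and the ~15% drop at ~150W is just the rough figure from above, so treat both as illustrative assumptions rather than measurements:

```swift
import Foundation

// Back-of-the-envelope perf/W comparison: stock vs. power-limited ADL i9.
// 241W is the published PL2 boost limit; the ~15% performance drop at ~150W
// is the rough figure quoted above, not a measurement.

struct OperatingPoint {
    let label: String
    let watts: Double
    let relativePerformance: Double   // stock configuration = 1.0
}

let stock   = OperatingPoint(label: "stock (241W PL2)",      watts: 241, relativePerformance: 1.00)
let limited = OperatingPoint(label: "power-limited (~150W)", watts: 150, relativePerformance: 0.85)

for point in [stock, limited] {
    let perfPerWatt = point.relativePerformance / point.watts
    print(String(format: "%@: %.4f relative perf per watt", point.label, perfPerWatt))
}

// The 150W point delivers ~85% of the performance for ~62% of the power,
// i.e. roughly a 1.37x better perf/W index than running the chip flat out.
```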

Finally, part of Andrei's tweets above alludes to one of the few problems with the Zen 3 chips: while the main parts of the chip, like the cores themselves, are manufactured on TSMC 7nm, for any non-monolithic design the IO die is not (the IO die carries things like the fabric for core-to-core and core-to-memory communication, plus PCIe). It's manufactured on GloFo's 12nm or 14nm process, I think. Space-wise it probably doesn't make much difference, as you can only shrink the stuff in an IO die so much, but with respect to power? It's blamed for why the AMD chips are not even more power efficient than they already are. Why did AMD do this? Probably because it was cheaper, aaaannnnd because they are contractually obligated to buy a certain number of wafers from GloFo since the split.

TL;DR conclusion: Intel enjoys a huge fabrication advantage over the IO die and probably a smaller one over the cores themselves, but AMD is still competitive because their core and chiplet designs are that much better. If Intel and AMD both keep to their roadmaps next year, they'll both be on new nodes with new designs (and there are rumors that the Zen 4 IO die will be on TSMC).

======

As for why 11th gen Intel was so bad - well, Tiger Lake mobile actually wasn't that bad, but Rocket Lake desktop ... yikes. This comes down to Intel's manufacturing woes. Intel tied a lot of their core designs to specific fabrication processes, believing they would always keep pushing fabrication forward. When forward movement on those processes stalled, their entire strategy fell apart. Tiger Lake was what Intel was supposed to put out years ago. Even worse, it's believed that the initial Intel 10nm fabrication nodes weren't suitable for the high power needed for desktop chips. Intel couldn't pump out desktop processors on 10nm and had to rely on 14nm with some ungodly number of pluses. Rocket Lake was the last of those chips and was particularly bad because it backported the 10nm Ice Lake design to 14nm. The backport was such a disaster that frankly I'm shocked they actually launched it rather than simply cutting prices on Comet Lake and waiting six months for Alder Lake. The only explanation I can think of is that they felt they needed something out there, and since Alder Lake's design was a huge risk, they wanted something in the wild in case ADL fell flat on its face. Some tech journalists have argued that this experience was good for Intel: if they ever need to backport a design again, they'll have learned from it, and future chip designs will be more fabrication-node agnostic to make it easier. Maybe that's true, but yeah, Rocket Lake was bad. Tiger Lake was just late rather than intrinsically bad, but even so, TGL still highlights how inefficient Intel's uarch designs really are.
 
Last edited:

magbarn

macrumors 68040
Oct 25, 2008
3,018
2,387
Very informative post. Intel needed the 14nm beatdown to teach them a lesson about complacency. Their 14nm era reminded me of GM in the '80s-'90s, when they put out seriously crappy vehicles for a while.
 
  • Like
Reactions: JMacHack

deconstruct60

macrumors G5
Mar 10, 2009
12,493
4,053
Well, moving from LPDDR5 to DDR5 doesn't really carry a performance or latency penalty... power consumption may go up by around 20-30%, but that would be a non-issue on a desktop platform like the Mac Pro.


That is largely grounded in the presumption that Apple is designing a largely custom SoC for the Mac Pro from the outside (the chassis) in, as opposed to looking at what they have (the M1 Max) and designing outward with cost-effective changes.

"It's a 1400W chassis with massive cooling, so throw Perf/Watt out the window and go 'buck wild'" - that has consistently not been Apple's explicitly stated primary design objective. Even in this latest M1 Pro/Max presentation it was Perf/Watt first, with the rest flowing from that.

The notion that Apple is out to build an AMD Epyc Zen 3/4, Intel Xeon SP Gen 4, and Ampere Altra 80/128 "killer" SoC is probably mis-setting expectations.

The double-edged sword is your assertion that the penalty and/or latency is not substantially different. If Apple can get approximately the same performance at 20% fewer watts, then the Perf/Watt index is higher. If Perf/Watt is the number one priority goal, then they would far more likely take the higher-index path.



DDR5 is pretty much the only option that can go above 1TB of memory capacity,

Apple sacrificed "max capacity" with the M1. (The top-end Intel Mini is still around a year later, and will probably still be around months into 2022, so well over a year.)

There was no "no sacrifices, go for ultra max" policy on max RAM capacity for the 2009-2012 Mac Pro (just four DIMM slots per CPU). The MP 2013: again, just four DIMM slots per CPU. The Mac Pro 2019 is more of an outlier than the norm.

If they are chopping the overall system size in "half" (chopped down in one dimension) or to a "quarter" (chopped down in two dimensions), then max DIMM slots would probably get the axe. (The backside of the board gets smaller just like the front side does if you chop it in half, or even more so into a quarter of the current size.) They're going to have issues getting to >1TB of RAM if there aren't enough DIMM slots, even if some are still present in smaller numbers.



and wouldn't require significant modification of the Jade SoC in terms of memory IO. That's the reason I think there is a pretty good chance the next Mac Pro (or whatever it ends up being) may end up using DDR5.

It would require substantive changes in terms of the memory subsystem implementation. Something like:

4 * 8 * 16 = 512 = 4 * 4 * 32

presents as an "easy" redistribution of a factor of 512, but the actual implementation has other issues. The bandwidth/throughput balance of the whole system for the GPU cores has requirements that don't really show up in that arithmetic either.
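To make the arithmetic concrete, here's a small Swift sketch of one way to read that factoring - 4 packages x 8 channels x 16-bit LPDDR5 versus 4 controllers x 4 sub-channels x 32-bit DDR5 - and the peak bandwidth each arrangement implies. The groupings and transfer rates are my own illustrative assumptions, not a statement of what Apple actually ships:

```swift
// A reading of the "4 * 8 * 16 = 512 = 4 * 4 * 32" line above, plus the peak
// bandwidth each arrangement implies. The channel groupings and transfer rates
// are assumptions for illustration only.

struct MemoryConfig {
    let label: String
    let groups: Int            // packages or controller groups
    let channelsPerGroup: Int
    let bitsPerChannel: Int
    let megatransfersPerSec: Double

    var busWidthBits: Int { groups * channelsPerGroup * bitsPerChannel }
    var peakGBps: Double { Double(busWidthBits) / 8.0 * megatransfersPerSec / 1000.0 }
}

let configs = [
    // 4 packages x 8 channels x 16-bit, LPDDR5-6400 (an M1 Max-style layout).
    MemoryConfig(label: "LPDDR5 (4 * 8 * 16)", groups: 4, channelsPerGroup: 8,
                 bitsPerChannel: 16, megatransfersPerSec: 6400),
    // 4 controllers x 4 sub-channels x 32-bit, DDR5-4800 (hypothetical DIMM layout).
    MemoryConfig(label: "DDR5   (4 * 4 * 32)", groups: 4, channelsPerGroup: 4,
                 bitsPerChannel: 32, megatransfersPerSec: 4800),
]

for c in configs {
    print("\(c.label): \(c.busWidthBits)-bit bus, ~\(c.peakGBps) GB/s peak")
}

// Same 512-bit total width, but the DDR5 arrangement lands at a lower peak
// bandwidth at these speeds, which is part of the GPU bandwidth-balance issue
// mentioned above.
```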

Technically it's not a moonshot project, but is Apple going to spend the effort? [It is already going to be quite a substantial effort to couple dies such that they get minimal NUMA effects without paying a high Perf/Watt cost. More likely that effort has higher priority. Apple probably doesn't want the expense of a substantively different GPU driver stack.]
 

throAU

macrumors G3
Feb 13, 2012
9,204
7,356
Perth, Western Australia
The power of the new Intel Alder Lake CPU is amazing. Here is a benchmark from the i7-12700K - priced at about $470 per unit...
Faster than any M1 Max or Pro. Faster than a Mac Pro with a 16-core Xeon.

Thing is, the M1 (Pro/Max) can do that while rendering out a bunch of video, doing AI work in the background, or crunching through GPU compute at the same time. For extended periods.

Alder Lake? Boost is a bit of a joke before thermal throttling kicks in (even a liquid cooler will heat-soak eventually - ask me how I know; it will also absorb heat from any air-cooled GPU in the enclosure), and it simply doesn't have the additional processors onboard.

I find it hilarious that Intel fans are claiming victory on a test that doesn't even use half of the M1 Pro/Max's dedicated components, consumes more than 4-5x the power (without including a competent GPU), and requires a liquid cooler to dump all that heat into the room.

 
Last edited:

JMacHack

Suspended
Mar 16, 2017
1,965
2,424
Very informative post. Intel needed the 14nm beatdown to teach them a lesson about complacency. Their 14nm era reminded me of GM in the '80s-'90s, when they put out seriously crappy vehicles for a while.
I’d say that period of crappy vehicles was more 1974-2008 (excluding trucks)
 

JMacHack

Suspended
Mar 16, 2017
1,965
2,424
Pretty much all of them left as I understand it.
I know Intel had an internal conflict between salaried "blue tag" employees and contract "green tag" employees a couple of years back. They were replacing regular "blue tag" workers with cheaper "green tag" workers, and the blue tag guys started treating the green tags like **** because they saw them as a threat.

I’d guess more than a few people jumped ship at that point.
 
  • Like
Reactions: throAU

satcomer

Suspended
Feb 19, 2008
9,115
1,977
The Finger Lakes Region
Right now we're arguing over statements fed to the media by marketing departments saying "trust me, it's this good"! I won't trust Intel on thermals until they can actually beat AMD again - until then they're just blowing smoke again!
 

McDaveH

macrumors member
Dec 14, 2017
30
15
Don't worry about them, they still sell way more PC processors than everyone else, and by a good margin. AMD is the only real threat in that area, and we users don't really need to care about that; x86 will live on. It'll be decades before x86 becomes hard to get, even if there's a major switchover. Same for Windows and Microsoft.

Worry about it when it becomes a problem, and that won't be anytime soon.

And besides, there really isn't anything that special about ARM-based processors; lower power is the only advantage they have, and there's no guarantee they won't be surpassed too. We won't have a real revolution in computing until we give up the current digital processors for something else.
It depends. With so many PC shipments going to corporate/business buyers, all their procurement/IT departments have to decide is whether Windows 11 on ARM's all-day battery life is better for running the small portfolio of (largely web) apps their workforce needs. When (not if) this happens, Intel will find things difficult.
 

McDaveH

macrumors member
Dec 14, 2017
30
15
That's something so many missed. Next step, a ray tracer (please). I once saw a video (can't find it) where a 10W chip performed better at ray tracing than a monster NVIDIA card because the former was dedicated to the task. I believe that was 5-7 years ago...
Yes, but only accessible via the Metal ray tracing API; this will drive Apple-specific software adoption and allow Apple to evolve the parts under the hood. I've been looking out for RT enhancements in the G13/G13X GPU core, but there seem to be none yet. Last year Will Usher (an Intel engineer on Embree/oneAPI) recompiled his RTChameleon project to use MPSRayIntersector (not sure about acceleration structures) but found only a modest gain of 60-80% (from memory) over CPU performance, and given that the CPU path is sub-optimal too, it's difficult to know where engineering stops and politics starts.
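For context on what that MPSRayIntersector path involves, here's a minimal sketch - one triangle, one ray, placeholder geometry of my own, nothing to do with Will Usher's actual code - of dispatching an intersection query through MetalPerformanceShaders. It assumes unified memory so the default buffer storage is CPU-readable:

```swift
import Metal
import MetalPerformanceShaders

// Minimal MPSRayIntersector sketch: build an acceleration structure over a
// single triangle and intersect a single ray with it. Geometry and ray are
// placeholders for illustration; assumes unified memory (Apple Silicon) so
// the default shared storage mode is CPU-readable.

guard let device = MTLCreateSystemDefaultDevice(),
      let queue = device.makeCommandQueue() else { fatalError("No Metal device") }

// One triangle in the z = 1 plane (three vertices, tightly packed floats).
let vertices: [Float] = [-1, -1, 1,
                          1, -1, 1,
                          0,  1, 1]
let vertexBuffer = device.makeBuffer(bytes: vertices,
                                     length: vertices.count * MemoryLayout<Float>.stride,
                                     options: [])!

let accel = MPSTriangleAccelerationStructure(device: device)
accel.vertexBuffer = vertexBuffer
accel.vertexStride = 3 * MemoryLayout<Float>.stride
accel.triangleCount = 1
accel.rebuild()                                   // build the BVH

let intersector = MPSRayIntersector(device: device)
intersector.rayDataType = .originMinDistanceDirectionMaxDistance
intersector.intersectionDataType = .distance

// One ray from the origin along +z, packed as
// { origin.xyz, minDistance, direction.xyz, maxDistance } = eight floats.
let ray: [Float] = [0, 0, 0, 0,
                    0, 0, 1, 100]
let rayBuffer = device.makeBuffer(bytes: ray,
                                  length: ray.count * MemoryLayout<Float>.stride,
                                  options: [])!
let hitBuffer = device.makeBuffer(length: 16, options: [])!

let commandBuffer = queue.makeCommandBuffer()!
intersector.encodeIntersection(commandBuffer: commandBuffer,
                               intersectionType: .nearest,
                               rayBuffer: rayBuffer,
                               rayBufferOffset: 0,
                               intersectionBuffer: hitBuffer,
                               intersectionBufferOffset: 0,
                               rayCount: 1,
                               accelerationStructure: accel)
commandBuffer.commit()
commandBuffer.waitUntilCompleted()

// For the .distance data type the first float is the hit distance;
// a negative value means the ray missed.
let distance = hitBuffer.contents().load(as: Float.self)
print("hit distance:", distance)                  // expect 1.0 for this setup
```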
 

McDaveH

macrumors member
Dec 14, 2017
30
15
I have been quietly sitting back and listening (reading) to lots of different really great opinions and viewpoints, and it's really great to hear all of the different perspectives here. :)

If I may, I’d like to add a viewpoint that I hope contributes to the friendly discussion that is taking place :)

The conversation around the "lack of optimized software" for M1 I think warrants further discussion - particularly around what that actually means.

There is definitely room to grow for some of the larger apps like Adobe, AutoCAD, Maxon Cinema 4D, etc. However, there is also a wealth of apps today that have at the very least been ported from x86 under Rosetta to native ARM64, and in the best cases have been fully optimized for Apple Silicon to use the accelerators and co-processors that only become available when you go through Apple's API and compiler stack. I think that last level of optimization is an important distinction: ARM64-optimized versus Apple Silicon-optimized.
Of course, not all problems lend themselves to Apple's co-processors and accelerators.
Yet when you consider all the background jobs taking place while running even menial tasks on Windows or macOS, farming off instructions to dedicated, fast, highly energy-efficient co-processors also has the side benefit of freeing up the ALU/FPU for other, more traditional workloads. Everybody wins!

A case in point is utilization of the AMX co-processor in Apple Silicon. Apple's own native APIs compile to instructions targeting co-processors such as the AMX matrix co-processor, which I believe is still not publicly acknowledged by Apple (likely for ARM licensing reasons) but nevertheless is a matrix unit that speeds up matrix workloads (as distinct from the Neural Engine that Apple does advertise).
Here is a really lovely essay from Erik Engheim that presents this in a vastly superior way to any way that I can regurgitate here!
Link to M1, Co-Processors and Accelerators discussion

Apple's SoC architecture and overall "own the entire widget" approach to design uniquely positions them to take a path towards long-term scalable performance beyond the traditional "throw more wattage, increase clock speed, use longer pipelines, shrink to a smaller die, throw more cores" approach.

Intel, on the other hand (at least from a business perspective), doesn't have the same "ease" with which a similar SoC approach could be taken towards performance, because adding co-processors and accelerators to your silicon design means you also need tighter integration and industry alignment (the latter in particular should not be underestimated from a business perspective). Because Intel and AMD need to partner with large vendors such as Microsoft, they need to ensure partnership, agreement and alignment with their silicon vision so that the entire ecosystem (dev tools, operating system, right down to the silicon) is aware of these co-processors/accelerators and can take advantage of them. This takes time. Getting alignment in a single organization is challenging; to do so across companies is incredibly difficult. In that sense, one could argue this is more a business "people" problem than a technical problem.

Apple can not only drive more efficient (and more powerful per watt) designs from their approach, but can unilaterally dictate the rollout timeframes (at least for native 1st and 3rd party software) for solutions. They still need to convince 3rd party developers to develop for what is effectively a niche Mac platform as judged by market share. However, at least today, when a developer builds for Mac they are also in a position to port to iPad or iPhone, where Apple commands a sizeable market share and in turn a larger revenue stream worth pursuing.

That being said, Apple is making big efforts to contribute to open source projects in order to drive Apple Silicon optimization where possible.
Regarding Cinebench and Maxon Cinema 4D, I fully expect to see significant performance improvements as and when Cinema 4D optimizes more for the Apple Silicon stack.
The numbers we are seeing today are IMHO a worst-case scenario for the M1, M1 Pro and M1 Max - and yet we are comparing a laptop chip (very favourably on raw performance) with the absolute latest and greatest desktop/workstation-class Core i9 chip.

Finally, I'm not sure if anybody checked out Apple's videos on ray tracing and ray tracing acceleration during WWDC this year - but there is some nice documentation on how to accelerate ray tracing on Apple Silicon: https://developer.apple.com/documentation/metal/accelerating_ray_tracing_using_metal/
Obviously the level of precision may not be sufficient for some of the fine folks here, where a fallback to more traditional CPU core execution would be required. Nevertheless, Apple had a lovely demo during WWDC on accelerating ray tracing and how to optimize for a TBDR versus an immediate-mode renderer. Again, I expect to see optimization improvements in 3rd party renderers over time :)

Thanks for humouring a long diatribe! Hope everybody is having a really great Sunday and enjoying their MacBooks and Alder Lakes.
You touched on a few pertinent points there, and developments on GitHub are interesting, as the Embree-AArch64 project has just been folded into Embree itself. Embree optimisation is significant as it dictates Cinebench and Blender Cycles performance, which is being used to judge CPU performance.

It's difficult to see the politics behind this; as you put it, ARM64-optimised isn't ASi-optimised, so it could be an accommodate-to-control manoeuvre. A dedicated macOS fork of Embree would need to be established to make use of the Accelerate framework and access the AMX2 units, because as it stands Cinebench isn't using the full vector processing capability of the M1s and so isn't truly representative of the SoC's performance.
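For anyone curious what "going through Accelerate" looks like from the developer side, here's a minimal sketch of a single-precision matrix multiply through Accelerate's BLAS. The matrix size is arbitrary, and the AMX routing is the widely reported behaviour on Apple Silicon rather than anything Apple documents:

```swift
import Accelerate

// Minimal sketch: multiply two n x n single-precision matrices through
// Accelerate's BLAS (cblas_sgemm). On Apple Silicon this path is widely
// reported to be dispatched to the AMX units rather than the regular
// NEON pipelines. Matrix size chosen arbitrarily for illustration.

let n = 1024
let a = [Float](repeating: 1.0, count: n * n)
let b = [Float](repeating: 2.0, count: n * n)
var c = [Float](repeating: 0.0, count: n * n)

// C = 1.0 * A * B + 0.0 * C, row-major layout.
cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
            Int32(n), Int32(n), Int32(n),
            1.0, a, Int32(n),
            b, Int32(n),
            0.0, &c, Int32(n))

print("c[0] =", c[0])   // expect 2048.0 (1.0 * 2.0 summed over 1024 terms)
```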
 

bobcomer

macrumors 601
May 18, 2015
4,949
3,699
It depends. With so many PC shipments going to corporate/business buyers, all their procurement/IT departments have to decide is whether Windows 11 on ARM's all-day battery life is better for running the small portfolio of (largely web) apps their workforce needs. When (not if) this happens, Intel will find things difficult.
I'm one of those corporate IT guys, and for most PCs in that setting, all-day battery life isn't an issue. As for web apps, we have none, as fat clients work ever so much better for productivity.
 

McDaveH

macrumors member
Dec 14, 2017
30
15
I'm one of those corporate IT guys, and for most PCs in that setting, all-day battery life isn't an issue. As for web apps, we have none, as fat clients work ever so much better for productivity.
Personally I agree with you on native apps, but the SaaS market is real and huge, and flexible working and the notebooks required to deliver it have dominated PC shipments due to the global events of the last 18 months. I think your perspective isn't reflective of enterprise clients as a whole.
 

jinnyman

macrumors 6502a
Sep 2, 2011
762
671
Lincolnshire, IL
I'm eager to see how AS in the desktop sector fares compared to what x64 can offer.
As for the mobile sector, I don't see anything on the x64 side that can compete with the performance/power efficiency of AS for the foreseeable future.
 

bobcomer

macrumors 601
May 18, 2015
4,949
3,699
Personally I agree with you on native apps, but the SaaS market is real and huge, and flexible working and the notebooks required to deliver it have dominated PC shipments due to the global events of the last 18 months. I think your perspective isn't reflective of enterprise clients as a whole.
There's no way I could claim I speak for all enterprise shops - we all have different needs, just like normal people - but I can say with extreme confidence that we are not unique in computer usage, and there are a bunch of us that do the same types of things, especially older businesses that have built up a sizable code base.
 

JMacHack

Suspended
Mar 16, 2017
1,965
2,424
It depends. With so many PC shipments going to corporate/business buyers, all their procurement/IT departments have to decide is whether Windows 11 on ARM's all-day battery life is better for running the small portfolio of (largely web) apps their workforce needs. When (not if) this happens, Intel will find things difficult.
At one point I may have been bullish on the change away from x86. But seeing as how “far ahead” Apple is over other ARM chips and how aggressive Intel and AMD are becoming over being “the fastest” I’m not as sure.

Still I don’t think we have the full picture yet. It’s just a year into Apple’s own switchover, and they’re ahead of the curve. If there’s gonna be a switch from x86 dominance (which would be required for Intel to face any real trouble), it would take years I feel.

I’m not good at predicting things, but looking at stuff currently, other chipmakers would have to put serious effort into ARM over x86 to win over PC users and corporate buyers. As it currently stands, I can only see that capability coming from NVidia, who don’t own an x86 license. The other powers that be have difficulty meeting A-series in performance and efficiency, and if they scaled up their chips there’s no guarantee we’d see the same efficiency as the M-series.

There's no way I could claim I speak for all enterprise shops - we all have different needs, just like normal people - but I can say with extreme confidence that we are not unique in computer usage, and there are a bunch of us that do the same types of things, especially older businesses that have built up a sizable code base.
This is true, it’s what keeps COBOL devs employed.

Still, I think it’s wrong to dismiss any possibilities right now. Everything has gone haywire for the past couple years and that’s a situation ripe for change. Even if current businesses don’t want to change what works, there’s no guarantee that another business without the legacy baggage can’t come along and take over.

That’s abstract of course, I don’t know how it would pertain to measuring wool fibers in your case.
 

bobcomer

macrumors 601
May 18, 2015
4,949
3,699
This is true, it’s what keeps COBOL devs employed.

Still, I think it’s wrong to dismiss any possibilities right now. Everything has gone haywire for the past couple years and that’s a situation ripe for change. Even if current businesses don’t want to change what works, there’s no guarantee that another business without the legacy baggage can’t come along and take over.
I don't disagree; it's the time frame people around here talk about that I disagree with. Where they say years, I *know* that's definitely not realistic. Get into decades, yes, that's possible, but nothing less than that.

Apple's habit of breaking things will only make it take longer.

That’s abstract of course, I don’t know how it would pertain to measuring wool fibers in your case.
Measuring wool fibers isn't anything I do, but I do support the computers that allow them to do that, and it's even more proprietary.
 

JMacHack

Suspended
Mar 16, 2017
1,965
2,424
I don't disagree; it's the time frame people around here talk about that I disagree with. Where they say years, I *know* that's definitely not realistic. Get into decades, yes, that's possible, but nothing less than that.

Apple's habit of breaking things will only make it take longer.
Just to be contrarian, I’d like to say that Apple is not unique in that position. I know some of my dev friends complain about the “move fast, break things” attitude many software companies have adopted.

And I know I complain about useless new features that break **** when Adobe updates their software.

That aside, I agree. It's easy to get lost in the hype when the CPU competition has been reignited and interesting things are happening again. Rome wasn't built in a day.
Measuring wool fibers isn't anything I do, but I do support the computers that allow them to do that, and it's even more proprietary.
Oh I see, is it the peripherals that are proprietary, the software or are these embedded machines, or a combination of those?
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,521
19,679
At one point I may have been bullish on the change away from x86. But seeing as how “far ahead” Apple is over other ARM chips and how aggressive Intel and AMD are becoming over being “the fastest” I’m not as sure.

Have you seen Amazon's Graviton3? ARM is not holding still — they are rapidly iterating and approaching the execution width of Apple designs. Their latest cores do not prioritize die area over everything else — X2 and friends are out for blood.

Everything is developing very very quickly. I think within 2 years basic ARM designs will be competitive performance-wise with high-end Intel, but with a significantly lower power consumption.
 
  • Like
Reactions: throAU and jdb8167

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
Have you seen Amazon's Graviton3? ARM is not holding still — they are rapidly iterating and approaching the execution width of Apple designs. Their latest cores do not prioritize die area over everything else — X2 and friends are out for blood.

Everything is developing very very quickly. I think within 2 years basic ARM designs will be competitive performance-wise with high-end Intel, but with a significantly lower power consumption.

Absolutely. There's nothing to suggest that the flat improvement slope for x86 will suddenly launch upwards, or that the slope of ARM improvement, which has been constant for 10 years, will suddenly flatten.
 
  • Like
Reactions: throAU