
Bug-Creator

macrumors 68000
May 30, 2011
1,785
4,717
Germany
Don't see why only 2 "cool" cores would mean anything, as these are really only there to run the OS while idling (or run background tasks when the heavy lifting is done elsewhere).

And I can tell you the latest version of macOS does idle quite well on my 12" rMB with only 2 not-so-"cool" cores, which I'd guess are on par with a "slow & cool" M1 core ... for the first 10 seconds.....
 

Falhófnir

macrumors 603
Aug 19, 2017
6,146
7,001
Sounds very unusual and unlikely to me, but the description is different enough that I have a hard time believing this could be an M1 variant.

What really stands out are the 2 efficiency cores. I don't think that means there are simply fewer. I think that means totally redesigned efficiency cores, so that only 2 are needed. All of which speaks to an entirely new chip design.

It would certainly be unprecedented, but sounds like it might be true.
The problem I see there is that it significantly increases the R&D cost... it's not impossible, but I think they'd need a very good reason to spend money creating more powerful efficiency cores when 4 of the ones they've already got will do fine... unless this is a next-gen chip, and the 15th-gen efficiency cores are so good they only need 2 (i.e. they're 2x as fast as Icestorm)?

Edit: sorry, I think I misread/misunderstood your post the first time; I think you're saying largely the same thing! :)
 

Bug-Creator

macrumors 68000
May 30, 2011
1,785
4,717
Germany
unless this is a next gen chip, and the 15th gen efficiency cores are so good they only need 2

Or.....

- they couldn't make the die bigger for some reason
- they couldn't connect more cores to the "infinity matrix" (CPU, GPU and Neural Engine all end there)

Neither would be an issue for the A15 or the base M2 (if it is the new design), as those are smaller and have a lower overall core count; cutting efficiency cores would also have the least negative impact on a "Pro" chip (compared to nixing a GPU or fast CPU core).
 

vadimyuryev

macrumors member
Oct 3, 2017
65
209
I can't believe how many people think that the M1X is going into the redesigned MacBook Air.
This is laughable at this point.
 

cmaier

Suspended
Original poster
Jul 25, 2007
25,405
33,474
California
Don't see why only 2 "cool" cores would mean anything, as these are really only there to run the OS while idling (or run background tasks when the heavy lifting is done elsewhere).

And I can tell you the latest version of macOS does idle quite well on my 12" rMB with only 2 not-so-"cool" cores, which I'd guess are on par with a "slow & cool" M1 core ... for the first 10 seconds.....

No, that's not what those cores are "only there" for.
 

omenatarhuri

macrumors 6502a
Feb 9, 2010
994
1,027
32 GPU cores (assuming same per-core performance as M1) should outperform the RTX 3060, at least in the graphics department... Since it's also likely to be a new GPU architecture, we'd probably see somewhere around 10-15 TFLOPS compute power.
Is there something credible out there to support this? Genuinely curious. I'd love to see Apple put out such awesome GPU power, but I have doubts they can run circles around Nvidia the way they have around Intel.

What would be their fundamental design change that would allow such a step up?
 

CWallace

macrumors G5
Aug 17, 2007
12,528
11,543
Seattle, WA
I think two efficiency cores will be enough to handle the general background tasks and low-power needs, allowing more die space to be used for the additional performance cores.

As for the GPU, there are the rumors of the "Lifuka" GPU so this could be an "external" GPU to keep the die size down on the main SoC by removing the on-die GPU cores as found on the M1. This GPU could still be on-package (like the DRAM of the M1) so it would benefit from physical proximity and high-bandwidth connections. We could also see a mix of on-die and on-package GPU cores like we have now with the Intel iGPU and AMD dGPUs.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
Is there something credible out there to support this? Genuinely curious. I'd love to see Apple put out such awesome GPU power, but I have doubts if they can run circles around Nvidia as they have around Intel.

You mean other than benchmarks and GPU specs? Not much. My post was based on available graphical benchmarks, in particular the new 3DMark Wild Life, which runs natively on both M1 machines and Windows PCs. The M1 scores around 5000 points in that benchmark; a desktop RTX 3060 scores 18000. Quadruple the GPU cores, assume linear performance scaling (as GPUs usually manage, especially when they are this power efficient), and that's what you get.
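The quadruple-and-scale estimate above is just proportional arithmetic; a minimal sketch in Python using the scores quoted in this post, with linear scaling as the stated (optimistic) assumption:

```python
# Naive linear-scaling estimate from the 3DMark Wild Life scores quoted above.
# Assumption: score grows proportionally with GPU core count.

def scaled_score(base_score: float, base_cores: int, target_cores: int) -> float:
    """Project a benchmark score to a larger core count, assuming linear scaling."""
    return base_score * target_cores / base_cores

m1_score = 5000          # M1, 8 GPU cores
rtx_3060_score = 18000   # desktop RTX 3060

estimate = scaled_score(m1_score, base_cores=8, target_cores=32)
print(estimate)  # 20000.0 -- ahead of the 3060 under this assumption
```

Real scaling would be somewhat below linear (shared bandwidth, power limits), which is why this is an upper-bound estimate rather than a prediction.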

What would be their fundamental design change that would allow such a step up?

No design change needed; you just use more GPU cores. Graphics (and compute) is a parallelizable task: you can shade more triangles / process more compute threads in parallel by using more compute cores (this is an oversimplification, but it gives you an idea).

A single M1-based G13 GPU core consists of 4 32-wide FP32 ALUs, making it 128 "compute units" per GPU core. Apple's shader units are capable of a single floating-point or integer operation per clock. The Apple GPU in the M1 runs at approximately 1.26 GHz, which gives it a peak compute throughput of 2.6 TFLOPS (1024 compute units * 1.26 GHz * 2 — the multiplier is for fused multiply-add, a single operation that counts as two).

To compare this with Nvidia Ampere: its "compute unit" (CUDA core) can do either an FP32+INT32 or two FP32 operations per clock. Take an RTX 3060, which has somewhere around 1920 units (it was never really clear to me), with a boost frequency of 1.77 GHz, and you get a peak throughput of 1920 * 2 (two FP32 per clock) * 1.77 * 2 (again for fused multiply-add) ~ 13.5 TFLOPS, which is the number you can see in Nvidia's marketing material.

Of course, these numbers should be taken with a big grain of salt, as they represent theoretical maximums that will never be reached under normal conditions. I did reach the declared 2.6 TFLOPS on the Apple GPU using a specially written compute shader that did nothing but a long chain of multiply-adds, but that's not what real-world code looks like.

One thing about Apple GPUs is that they are just incredibly power efficient. The M1 (total power consumption including memory ~ 12 watts) needs just 4.6 watts per TFLOP, while the RTX 3060 needs 13.5 watts (this number will be similar for all Ampere GPUs). So a hypothetical 32-core Apple GPU using G13 technology would be a 10 TFLOPS part that draws just around 40-45 watts.
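The peak-throughput and watts-per-TFLOP arithmetic above can be reproduced directly. A sketch using only the figures quoted in this post; the 170 W board power for the RTX 3060 is my added assumption, not from the post:

```python
def peak_tflops(fp32_units: int, clock_ghz: float) -> float:
    """Peak FP32 throughput; the factor of 2 counts a fused multiply-add as two ops."""
    return fp32_units * clock_ghz * 2 / 1000.0

# M1 GPU: 8 cores * 128 ALUs = 1024 units at ~1.26 GHz
m1 = peak_tflops(1024, 1.26)        # ~2.58 TFLOPS

# RTX 3060 per the figures above: ~1920 CUDA cores doing two FP32 ops
# per clock, at a 1.77 GHz boost clock
rtx = peak_tflops(1920 * 2, 1.77)   # ~13.6 TFLOPS

# Perf-per-watt: 12 W total for the M1 (from the post);
# 170 W board power assumed for the 3060
m1_w_per_tflop = 12 / m1      # ~4.7 W/TFLOP (the post rounds to 4.6)
rtx_w_per_tflop = 170 / rtx   # ~12.5 W/TFLOP
```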
 

jeanlain

macrumors 68020
Mar 14, 2009
2,463
958
The rest of the rumor makes perfect sense. Up to 64GB RAM suggests quad-channel LPDDR5. Option of 16 or 32 core GPU suggests a separate GPU die.
Can this be reconciled with the unified memory architecture?
And will there be enough memory bandwidth for the GPU?
 

Hexley

Suspended
Jun 10, 2009
1,641
505
In terms of economies of scale and supply chain, would it make any sense to make specialized chips specifically for mid-range to high-end Macs that probably sell fewer than 4.5 million units per year?

Would it not make more sense to have multiple M1 SoCs in one Mac instead? In terms of logic board space, Macs have plenty compared to, say, an iPad Pro.
 

CWallace

macrumors G5
Aug 17, 2007
12,528
11,543
Seattle, WA
Can this be reconciled with the unified memory architecture?
And will there be enough memory bandwidth for the GPU?

If the GPU is on-package like the DRAM, then it should be able to use UMA, and since the memory bandwidth is sufficient for 8 GPU cores, I would think it could be made sufficient for more.


In terms of economies of scale and supply chain, would it make any sense to make specialized chips specifically for mid-range to high-end Macs that probably sell fewer than 4.5 million units per year?

Apple Silicon is designed to be scalable, and really the only "issue" is that a higher-core-count SoC will have a larger die, so you will not be able to make as many per wafer. Production costs will be a bit higher, but you're still running them through the same fab on the same process, so you still benefit from scale-driven cost reductions.
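The larger-die / fewer-dies-per-wafer trade-off can be illustrated with the standard dies-per-wafer approximation. The die areas below are hypothetical round numbers for illustration, not actual Apple figures:

```python
import math

def dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Classic approximation: wafer area over die area, minus an edge-loss term."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

# Hypothetical: a ~120 mm^2 SoC vs a doubled ~240 mm^2 higher-core-count SoC
small = dies_per_wafer(120)   # ~528 candidate dies per 300 mm wafer
large = dies_per_wafer(240)   # ~251 -- less than half as many
```

Because edge loss hits big dies harder, doubling the area more than halves the dies per wafer, so per-die cost rises faster than area does.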

From a design standpoint, multi-chip packages are more complicated so I expect a 32-core "M1X" would have better performance than four 8-core M1s lashed together and the production costs would certainly be lower.
 

cmaier

Suspended
Original poster
Jul 25, 2007
25,405
33,474
California
Can this be reconciled with the unified memory architecture?
And will there be enough memory bandwidth for the GPU?

It could all work with unified memory architecture. The latency may be higher just due to distance, but some of that can be negated with increased bandwidth and/or bigger system cache.
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
Can this be reconciled with the unified memory architecture?
And will there be enough memory bandwidth for the GPU?

I don't see any problems in principle here (but then again, I am not a chip designer). As I understand it, even with separate chips the topology essentially stays the same: you have multiple processors connected via a high-speed fabric to a common cache/memory controller (those last two components are what constitute unified memory). With an SoC, all these processors and caches are inside a single chip, and the high-speed fabric is inside that chip as well. In a system-in-a-package design, the processors can sit on separate dies, with the high-speed fabric in the package substrate connecting them all together.

This is essentially what AMD has been doing for a while: they have multiple CPU chips (chiplets) with 8 cores per chip, all connected via a shared I/O chip that hosts the cache and memory controllers. This is also the architecture of the upcoming Nvidia supercomputer chip, as far as I understand. I think Apple will use something similar, because it just makes sense.

Bandwidth etc. stays the same; the unstacked design might be slightly worse in terms of latency and power efficiency, but it's not a big deal, and it's cheaper and more flexible.

In terms of economies of scale and supply chain, would it make any sense to make specialized chips specifically for mid-range to high-end Macs that probably sell fewer than 4.5 million units per year?

That's why I am betting on a system-in-a-package design that would allow Apple to manufacture smaller chips (with higher yields) and combine them into the final system.
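The smaller-chips-yield-better intuition can be sketched with a simple Poisson defect model; the defect density and die areas below are illustrative assumptions, not foundry data:

```python
import math

def zero_defect_yield(die_area_cm2: float, defects_per_cm2: float = 0.1) -> float:
    """Poisson yield model: probability a die of the given area has no defects."""
    return math.exp(-defects_per_cm2 * die_area_cm2)

# One monolithic 400 mm^2 SoC vs 100 mm^2 chiplets
mono = zero_defect_yield(4.0)     # ~0.67: a third of the big dies get scrapped
chiplet = zero_defect_yield(1.0)  # ~0.90: the small dies mostly survive

# Known-good chiplets are tested and combined after the fact,
# so each defect wastes far less silicon.
```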

Or maybe it will just be a single SoC with some cores disabled to cover for manufacturing defects. I can also see that happening.
Would it not make more sense to have multiple M1 SoCs in one Mac instead? In terms of logic board space, Macs have plenty compared to, say, an iPad Pro.

Because multiprocessor systems don't scale nearly as well as you think. They have their uses in servers, but using multiple M1 chips like that would basically throw away all the advantages of the Apple Silicon platform (unified memory, high performance, low latency, high power efficiency).
 

Lemon Olive

Suspended
Nov 30, 2020
1,208
1,324
So
- M1 with 8-core CPU and GPU
- M2 with 10-core CPU and 16/32-core GPU
- M1X with 8 faster CPU cores and a 10-core GPU

I think we'll only have these 3 this year
I guess it depends on how you read the report. I originally thought the above as well, because I thought the report said the MacBook Air chip was simply adding a 10-core GPU variant. But if the 8 CPU cores are faster as well, that doesn't sound like it can still be the M1, unless they are just clocking it higher, which I doubt.

It will be interesting to see. It is starting to sound more like none of the new chips are M1 variants, unless the report about the MacBook Air is misleading.
 

JMacHack

Suspended
Mar 16, 2017
1,965
2,424
The benchmarks I looked at are about 2.3x the M1 (Passmark), but whatever, benchmarks are benchmarks and only indicators.
https://www.notebookcheck.net/Apple...single-thread-performance-chart.531181.0.html
Single core is equal or superior here.
Passmark has the 11900K about 9k higher, so maybe a third faster in that one benchmark.
That's with double the logical cores, 100 extra watts, and 2.0 extra gigahertz.
Somehow I think this is beatable for Apple.
 

dgdosen

macrumors 68030
Dec 13, 2003
2,817
1,463
Seattle
It could all work with unified memory architecture. The latency may be higher just due to distance, but some of that can be negated with increased bandwidth and/or bigger system cache.
How big will these die sizes be? Or will Apple start doing their own stacking?
 

el-John-o

macrumors 68000
Nov 29, 2010
1,590
768
Missouri
I'm betting the two efficiency cores have a lot more to do with OS integration than with anything about the cores themselves.

Didn't Apple say that the OS offloads certain tasks onto the efficiency cores all the time? In that case, there could be an advantage to continuing to offload those tasks onto cores that generate less heat and use less energy, giving more electrical and thermal headroom to the high-performance cores, which might, in theory, produce more heat and use more energy to perform those exact same tasks.
 

MalcolmH

macrumors member
Aug 8, 2020
41
14
So now the rumor is a 10-core chip with only two efficiency cores. If so, the efficiency cores must be quite different from the M1 efficiency cores. I continue to stand by my (not at all informed by any wine, I swear) claim that these will be M2 and not M1X.

Or the rumor could be wrong, since a 4:1 ratio seems a little odd.
If true, then the efficiency cores must be much more powerful this time around, I'd think. Maybe double-wide issue or something.
Now that Apple controls the entire hardware and software stack, we could see some aggressive tuning of how the efficiency cores are used, in addition to hardware changes. This article is interesting:

How MacOS uses efficiency cores
 

leman

macrumors Core
Oct 14, 2008
19,522
19,679
Why don't Apple use hyperthreading?

Why would they? Hyperthreading is a technique to improve utilization of the execution backend by sharing execution resources between multiple threads. Given Apple's unprecedentedly large out-of-order execution window and already high backend utilization, they simply don't need hyperthreading to get the most out of their CPUs. Besides, hyperthreading complicates chip design and doesn't really work well with Apple's goal of optimizing performance per watt.
 