
EntropyQ3

macrumors 6502a
Mar 20, 2009
718
824
Yeah, it’s hard to reduce to a single figure. Way back in the ’90s, when I was doing my PhD, I think I used Harvard Graphics to create these sorts of graphs, using one “metric”:

View attachment 1784622 View attachment 1784621

Thousands of simulations, varying all sorts of parameters, running all sorts of benchmarks, etc. No better way to get your head around these issues than to actually sit down with a blank sheet of paper and try to design a system. I had no idea how all sorts of design choices interacted until I actually had to solve the problem.

Awesome!
Skimmed a bit, and realised that it required a warm day in the shade with some beer and a cigar, and in paper form. I promise not to pester you with questions.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,521
19,677
Is Apple going to go with LPDDR5 for the M2/M1X or will they switch to GDDR6X (sweet sweet bandwidth, cheaper than HBM2)?

I doubt we will see GDDR6. It’s too hot, latency is too high, it’s just not the RAM for the job. My bet is on LPDDR and I sure hope it’s LPDDR5 this time.
 
  • Like
Reactions: jdb8167

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
I doubt we will see GDDR6. It’s too hot, latency is too high, it’s just not the RAM for the job. My bet is on LPDDR and I sure hope it’s LPDDR5 this time.
With its giant cache, is Apple Silicon really that latency sensitive?
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,521
19,677
With its giant cache, is Apple Silicon really that latency sensitive?

No idea. But the CPU has limited means of hiding latency (unlike the GPU), so significantly increasing the latency is probably not the best thing…
 

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
Awesome!
Skimmed a bit, and realised that it required a warm day in the shade with some beer and a cigar, and in paper form. I promise not to pester you with questions.
Hah! Nothing too exciting in it - this was for a very simple CPU, back before we had to worry about multiple cores, GPUs sharing memory, etc. But it does show that even for a very simple machine there are lots of things to think about.
 

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
No idea. But the CPU has limited means of hiding latency (unlike the GPU), so significantly increasing the latency is probably not the best thing…

Well, except of course that a giant cache means you don’t pay the price of that latency very often. It turns out you can afford a pretty high latency if you have a high enough cache hit rate (and trace simulations will show you that, for example, doubling the latency of RAM when you have a 90+% cache hit rate has little effect on the overall average cycles-per-instruction).

Of course, if your workload is one that is super sensitive to latency, and has a highly stochastic memory access pattern, then all of that theory does you no good.
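To put rough numbers on that point, here is the standard average-memory-access-time arithmetic as a short sketch. All cycle counts are made up for illustration, not measurements of any real chip:

```python
# Average memory access time (AMAT) for a single cache level:
#   AMAT = hit_rate * hit_time + (1 - hit_rate) * miss_penalty
# All cycle counts below are illustrative, not measurements.

def amat(hit_rate, hit_cycles, miss_penalty_cycles):
    """Average memory access time, in cycles."""
    return hit_rate * hit_cycles + (1 - hit_rate) * miss_penalty_cycles

base = amat(0.95, 4, 100)     # 0.95*4 + 0.05*100 = 8.8 cycles
doubled = amat(0.95, 4, 200)  # 0.95*4 + 0.05*200 = 13.8 cycles

# Doubling RAM latency raises the average by ~1.6x, not 2x -- and since
# only a fraction of instructions touch memory at all, the effect on
# overall cycles-per-instruction is smaller still.
print(base, doubled)
```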
 

dogslobber

macrumors 601
Oct 19, 2014
4,670
7,809
Apple Campus, Cupertino CA
There will be a new chip which can move mountains. The fact Apple hasn't updated the 16" MBP and still sells the Intel 27" iMac means they are segmenting those markets in anticipation of a new chip. Or not. They may simply be bolting two of these M1 chips together with a fast backplane technology to go multi-chip. Nothing says these machines need to have only one M1 chip.
 

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
There will be a new chip which can move mountains. The fact Apple hasn't updated the 16" MBP and still sells the Intel 27" iMac means they are segmenting those markets in anticipation of a new chip. Or not. They may simply be bolting two of these M1 chips together with a fast backplane technology to go multi-chip. Nothing says these machines need to have only one M1 chip.
Yeah, they’re not doing that.
 
  • Like
Reactions: AutisticGuy

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
Well, except of course that a giant cache means you don’t pay the price of that latency very often. It turns out you can afford a pretty high latency if you have a high enough cache hit rate (and trace simulations will show you that, for example, doubling the latency of RAM when you have a 90+% cache hit rate has little effect on the overall average cycles-per-instruction).

Of course, if your workload is one that is super sensitive to latency, and has a highly stochastic memory access pattern, then all of that theory does you no good.
Are there any current apps that are actually latency sensitive on Apple Silicon? And would upping the clock rate of GDDR x help any?
 

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
Are there any current apps that are actually latency sensitive on Apple Silicon? And would upping the clock rate of GDDR x help any?

I don’t know. You’d have to have a pretty random memory access pattern, or one that aliases with the cache memory replacement algorithm in such a way that the cache has the wrong addresses a lot. The whole goal of cache design is to avoid that. But there are always outlier cases.
 
  • Like
Reactions: jdb8167

leman

macrumors Core
Original poster
Oct 14, 2008
19,521
19,677
And would upping the clock rate of GDDR x help any?

The simple reason why you won’t see GDDR6 in a Mac laptop is because GDDR6 uses insane amounts of power and lacks any of the advanced power-saving features LPDDR has. It’s simply not going to happen. They are not going to ship a MacBook Pro with sub 5 hours battery life.
 

Fomalhaut

macrumors 68000
Oct 6, 2020
1,993
1,724
M1X or maybe some other letter or naming scheme only Apple knows. Zero chance it’s M2.
Why do you think so? If the next Apple Silicon for Macs (presumably already in pre-production) is using a next-generation microarchitecture with common elements to the A15, then it would make sense to name it in such a way that demonstrates an incremental improvement. As A14->A15, so M1->M2.

Naming it M1X or M1 Pro or similar would imply that the next AS SoCs are a variant of the M1. This may be the case, but it seems about as likely that the next Macs will use a second-generation core architecture.

@cmaier has real-world experience with designing and releasing CPUs, and is of the opinion that the next release could realistically be a second-generation microarchitecture. It's by no means guaranteed, but I'm optimistic.
 
  • Like
Reactions: Realityck

EntropyQ3

macrumors 6502a
Mar 20, 2009
718
824
Are there any current apps that are actually latency sensitive on Apple Silicon? And would upping the clock rate of GDDR x help any?
The difficulty from a user perspective is - how would you know? The app does its thing and normally nothing outwardly demonstrates its access patterns. You need to use analytics tools when you write the code to determine these things.
I’ll say this though - when I was active writing scientific code (you have a problem and need to solve it, but typically don’t need to care about “the user experience”), I was effectively always limited by memory somehow: my more intelligent/creative code by latency, my more brute-force code by bandwidth, and speeding things up when needed always involved trying to optimize data access patterns/flow. Rarely if ever was I constrained by ALU resources; keeping the beasts fed was the problem.
(This may also be a reason you always see the same old benchmark applications used for CPU benchmarking….but that’s its own discussion.)
Which is the background to my desire to go beyond CPU core analysis for assessments of SoCs.
 
Last edited:

JouniS

macrumors 6502a
Nov 22, 2020
638
399
Are there any current apps that are actually latency sensitive on Apple Silicon? And would upping the clock rate of GDDR x help any?
As a rule of thumb, anything that requires a lot of memory is sensitive to memory latency. If caching helps on one level of the memory hierarchy, it probably helps on multiple levels. Then your data might as well reside on disk, and caches will probably handle the rest.
 

cmaier

Suspended
Jul 25, 2007
25,405
33,474
California
As a rule of thumb, anything that requires a lot of memory is sensitive to memory latency. If caching helps on one level of the memory hierarchy, it probably helps on multiple levels. Then your data might as well reside on disk, and caches will probably handle the rest.

Since most caches use a Least Recently Used cache replacement policy, it all comes down to what your memory access pattern is. If you are constantly trying to read memory addresses that you haven’t read in a long time, then you end up with a lot of costly memory accesses.
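That failure mode is easy to demonstrate with a toy LRU cache (a hypothetical sketch, not how any real hardware cache is organized): cyclically scanning just one more key than the cache holds makes every access miss.

```python
from collections import OrderedDict

class LRUCache:
    """Toy cache that evicts the least recently used key when full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()
        self.hits = self.misses = 0

    def access(self, key):
        if key in self.data:
            self.hits += 1
            self.data.move_to_end(key)         # mark as most recently used
        else:
            self.misses += 1
            if len(self.data) >= self.capacity:
                self.data.popitem(last=False)  # evict the LRU entry
            self.data[key] = True

# Pathological pattern: cyclically scanning capacity+1 distinct keys
# defeats LRU -- each key is evicted just before it is needed again.
cache = LRUCache(4)
for _ in range(10):
    for key in range(5):   # 5 keys, capacity 4
        cache.access(key)
print(cache.hits, cache.misses)  # -> 0 50
```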
 

thenewperson

macrumors 6502a
Mar 27, 2011
992
912
It will most likely be LPDDR5
Theoretically. But as @Kung gu wrote above, LPDDR is the most likely option. It's cheaper, more flexible and supports advanced power management.
True, it's what I expect as well. I was just wondering what they'd do for bandwidth. The previous highs were HBM2 at 400 GB/s in the MBP16 and 512 GB/s in the iMac Pro. They'd probably go for quad-channel LPDDR5 to get to 64GB and ~200 GB/s, right? I'm just wondering if they'd be okay with half the max bandwidth they used to have.
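As a back-of-envelope check on those figures (the bus widths and LPDDR5/LPDDR5X speed grades below are standard JEDEC grades used as assumptions, not anything Apple has announced):

```python
# Peak theoretical bandwidth: bus width in bytes * transfer rate.
# Speed grades are standard LPDDR5 (6400 MT/s) and LPDDR5X (8533 MT/s);
# the bus widths are assumptions for the sake of the arithmetic.

def peak_bw_gbps(bus_width_bits, mt_per_s):
    """Peak theoretical bandwidth in GB/s."""
    return bus_width_bits / 8 * mt_per_s / 1000

print(peak_bw_gbps(256, 6400))  # 256-bit LPDDR5-6400   -> 204.8 GB/s
print(peak_bw_gbps(512, 8533))  # 512-bit LPDDR5X-8533  -> ~546 GB/s
```

The second line matches the 546 GB/s figure quoted for Grace below, which suggests that number assumes a 512-bit (8 x 64-bit, or 32 x 16-bit) LPDDR5X interface.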
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,521
19,677
True, it's what I expect as well. I was just wondering what they'd do for bandwidth. The previous highs were HBM2 at 400 GB/s in the MBP16 and 512 GB/s in the iMac Pro. They'd probably go for quad-channel LPDDR5 to get to 64GB and ~200 GB/s, right? I'm just wondering if they'd be okay with half the max bandwidth they used to have.

It is true that on paper 200GB/s looks significantly lower than the HBM2 bandwidth used in high-end Mac GPUs, but Apple Silicon is simply less reliant on bandwidth. Apple uses large caches (even the M1 has a 16MB LLC, whereas the 5700XT has only a 4MB GPU L2 cache), compute data compression and TBDR technology to optimize memory access and bandwidth utilization. The M1 shows that they don't need ridiculous RAM bandwidth to achieve respectable results, and I am sure that the same will be true for prosumer chips, which will likely come with even more cache and advanced technology.
 
  • Like
Reactions: senttoschool

senttoschool

macrumors 68030
Nov 2, 2017
2,626
5,482
It is true that on paper 200GB/s looks significantly lower than the HBM2 bandwidth used in high-end Mac GPUs, but Apple Silicon is simply less reliant on bandwidth. Apple uses large caches (even the M1 has a 16MB LLC, whereas the 5700XT has only a 4MB GPU L2 cache), compute data compression and TBDR technology to optimize memory access and bandwidth utilization. The M1 shows that they don't need ridiculous RAM bandwidth to achieve respectable results, and I am sure that the same will be true for prosumer chips, which will likely come with even more cache and advanced technology.
This person says that Nvidia's Grace has 8 channels of LPDDR5X, which adds up to 546GB/s of bandwidth. I could see Apple going this route for Mac Pros but reducing the number of channels to 4 for MacBook Pros. Not sure how memory channels affect power usage. Eight channels of memory must use a lot of power, right?

 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
This person says that Nvidia's Grace has 8 channels of LPDDR5X, which adds up to 546GB/s of bandwidth. I could see Apple going this route for Mac Pros but reducing the number of channels to 4 for MacBook Pros. Not sure how memory channels affect power usage. Eight channels of memory must use a lot of power, right?
The 2019 Mac Pro already sports a 6-channel ECC DDR4 memory bus, so going to 8 channels is definitely possible for Apple.
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,521
19,677
This person says that Nvidia's Grace has 8 channels of LPDDR5X which adds up to 546GB/s of bandwidth. I could see Apple going this route for Mac Pros but reduce the number of channels to 4 for Macbook Pros. Not sure how memory channels affect power usage. Eight channels of memory must use a lot of power, right?

More memory channels simply mean more independent RAM chips. I assume that RAM power usage scales linearly. The M1's RAM is remarkably efficient, usually drawing only around 0.2 watts in everyday tasks. For gaming benchmarks it's around 0.7 watts. In benchmarks that really hammer the memory subsystem it can get up to 1.5 watts.

There is good reason to assume that doubling the memory channels will simply double the energy consumption, so instead of RAM drawing 0.2 watts you'd have RAM drawing 0.4 watts on average. Not a big deal on a machine with a larger battery.
 
  • Like
Reactions: EntropyQ3

diamond.g

macrumors G4
Mar 20, 2007
11,438
2,665
OBX
Oh and maybe Apple will support PCIe Gen 5 with its onboard controller in the M2? Or do we think that they will stick with Gen 3? If they stick with 3, will they update their storage controller to match current NVMe speeds on Gen 4?
 

leman

macrumors Core
Original poster
Oct 14, 2008
19,521
19,677
Oh and maybe Apple will support PCIe Gen 5 with its onboard controller in the M2? Or do we think that they will stick with Gen 3? If they stick with 3, will they update their storage controller to match current NVMe speeds on Gen 4?

Who cares, really? Apple does not use any PCIe devices, so it’s mostly about Thunderbolt, which is limited anyway. Their SSDs use a custom communication channel that’s hooked directly into the SoC, and they can make it as fast or as slow as they want. They are not limited by PCIe in this area.
 