
Citizen45

macrumors member
Original poster
Apr 9, 2022
49
48
With the 18GB version of the MacBook Pro containing 3x6GB of RAM, what is the impact of that in relation to dual channel?

Does that mean that one of the RAM modules will be running at a lower speed than the other two?

Or does Apple have some way around this?
 

leman

macrumors Core
Oct 14, 2008
19,517
19,664
There are three RAM modules, each with a 64-bit memory bus, for a total 192-bit bus. As to the number of channels, it depends on how you count; 3, 6, or 12 are all defensible answers.
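To make the counting concrete, here is a tiny sketch. The 16-bit/32-bit groupings are illustrative assumptions about how LPDDR sub-channels are bundled, not something Apple publishes:

```swift
// Counting "channels" on a 3 x 64-bit LPDDR configuration.
let packages = 3
let bitsPerPackage = 64
let totalBusBits = packages * bitsPerPackage      // 192-bit bus

print(totalBusBits / 64)   // 3  - counted per 64-bit package
print(totalBusBits / 32)   // 6  - counted as 32-bit channels
print(totalBusBits / 16)   // 12 - counted as 16-bit LPDDR sub-channels
```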
 
  • Like
Reactions: Basic75 and altaic

altaic

macrumors 6502a
Jan 26, 2004
711
484
Edit: whoops, wires got crossed and I thought the OP was talking about the Max-lite rather than the Pro 🤦
 

RokinAmerica

macrumors regular
Jul 18, 2022
206
385
Triple channel has been a thing for a long time; the first Core i7s (circa 2008–2009) supported it.
It was pretty much limited to a very small set of motherboards.

Does anyone have a source for triple channel being used on Macs?
 

danwells

macrumors 6502a
Apr 4, 2015
783
617
Yes, Apple... This is in their press downloads (I write for a photography site).
[Image: Apple-M3-chip-series-unified-memory-architecture-M3-Pro-231030.jpg — Apple's M3-series unified memory architecture diagram for the M3 Pro]
 
  • Like
Reactions: Basic75

RokinAmerica

macrumors regular
Jul 18, 2022
206
385
Where does that show it being triple data rate? 3 sticks do not make/equal TDR. If they are using it, why are they not marketing it? I will continue to look, but I have not yet found anything saying they use TDR Ram.

My confusion is this. If they are using it, why not market it? If they are not, why the odd sizes of Ram?
 

picpicmac

macrumors 65816
Aug 10, 2023
1,239
1,833
I will continue to look
Those are LPDDR5 packages which are mounted right next to the M3 chip. I posted a link the other day to the Samsung splash page for their LPDDR5 chips. Apple has used Micron and Samsung LPDDR in their products.

Go read up on LPDDR. There is a specified standard for data per pin.
 

RokinAmerica

macrumors regular
Jul 18, 2022
206
385
Will do and thanks. This is all part of the SoC package?

Excuse my questions; I am old-school Windows, so my frame of reference comes from there. I am reading up on LPDDR now, and again, thanks.
 

fakestrawberryflavor

macrumors 6502
May 24, 2021
423
569
With Apple Silicon's memory architecture, they aren't limited to just 'single or dual' channel anymore. There are many more memory buses, as they are also feeding a bandwidth-hungry GPU.
 
  • Like
Reactions: Citizen45

leman

macrumors Core
Oct 14, 2008
19,517
19,664
Where does that show it being triple data rate? 3 sticks do not make/equal TDR. If they are using it, why are they not marketing it? I will continue to look, but I have not yet found anything saying they use TDR Ram.

My confusion is this. If they are using it, why not market it? If they are not, why the odd sizes of Ram?

It’s in their technical data. 150GB/s with LPDDR5X corresponds to a 192-bit memory bus. That’s 6 LPDDR5 memory channels. What else do you want them to market? The Max series even has a 512-bit bus (16 channels).

Don’t forget that these are heterogeneous computing machines that have to feed wide processing clusters besides the CPU. The traditional PC 128-bit architecture is not sufficient here. Apple Silicon is a hybrid between a gaming console and a (cheap) supercomputer in this regard.
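For anyone who wants to check the arithmetic, here is a quick sketch. The 6400 MT/s transfer rate is an assumption about the LPDDR5-class parts involved, not an Apple-published figure:

```swift
// Back-of-the-envelope check of the marketed bandwidth figures.
let transfersPerSecond = 6.4e9                      // assumed 6400 MT/s per data pin
let m3ProBusBits = 192.0                            // 3 x 64-bit packages
let m3MaxBusBits = 512.0                            // 8 x 64-bit packages

print(m3ProBusBits * transfersPerSecond / 8 / 1e9)  // 153.6 GB/s, marketed as ~150 GB/s
print(m3MaxBusBits * transfersPerSecond / 8 / 1e9)  // 409.6 GB/s, marketed as 400 GB/s
```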
 

danwells

macrumors 6502a
Apr 4, 2015
783
617
Not triple data rate - three CHANNELS.

There are three separate things that affect the speed of memory. One is the clock speed of the memory (how many clock pulses it gets per second). The second is whether it's DDR (pretty much everything is nowadays) - DDR memory transfers data on both the rising and falling edges of the clock pulse. QDR memory (rare outside of a few specialized parts) transfers FOUR times per pulse. The third is the number of channels. Since modern processors are 64-bit, each memory channel usually is as well. That's not ALWAYS true, because a processor can use one channel width internally and another to talk to the outside world. Intel's old 8088 was an example of this (internally it was a 16-bit chip, but it used 8-bit buses).

To get the overall memory speed (expressed in bits per second), you multiply all of these together: clock (Hz) * bits per transfer (which is channel width (64) * number of channels) * 2 for DDR. Simplified by multiplying out the constants, it's 128 * channels * clock. That's a VERY large number, since clock speed is already in the billions. For ease of handling, we divide it by 8 billion (or by 8589934592, which is 2^33, if you prefer binary gigabytes) to get a number in gigabytes per second instead of bits per second. We generally round that number to something like 100 GB/second (MacBook Air) or 800 GB/second (Mac Studio Ultra).
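A minimal sketch of that formula as code; the 64-bit channel width and the x2 DDR factor are the assumptions from the paragraph above:

```swift
// The bandwidth formula from the paragraph above as a small helper.
func memoryBandwidthGBps(clockHz: Double, channels: Int,
                         bitsPerChannel: Int = 64, transfersPerClock: Int = 2) -> Double {
    let bitsPerSecond = clockHz * Double(channels * bitsPerChannel * transfersPerClock)
    return bitsPerSecond / 8 / 1e9          // 8 bits per byte, decimal gigabytes
}

// Example: 16 channels at a 3.2 GHz memory clock.
print(memoryBandwidthGBps(clockHz: 3.2e9, channels: 16))   // ~819 GB/s, near the ~800 GB/s quoted for the Ultra
```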

As Leman says, Apple Silicon has something in common with supercomputer architectures (with very wide multichannel memory buses and lots of things having access to the RAM) in this regard (as do gaming consoles and GPUs). Old-time PC architecture is the odd one out here. These wide buses with multiple types of cores connected came from supercomputing, and have been adopted by Macs, game consoles, GPUs, phones (essentially everything EXCEPT x86 CPUs).
 
Last edited:

streetfunk

macrumors member
Feb 9, 2023
82
41
Not triple data rate - three CHANNELS.
I'm interested in understanding how these numbers relate to the CPU cores actually in use, especially for single-core tasks.

Let's say an application strictly uses only one core, but uses that core to nearly its full extent, and let's say there is no particularly heavy graphics/GPU load. How much does the new RAM and bandwidth situation matter then?

I read here, I think from @leman, that one CPU core alone can only generate about a 50 M/bit load (sorry if I'm phrasing that wrong; the "50" is what I remember, not the unit).

I'm not tech-savvy enough to really discuss this. My interest is the M2 Pro vs. the M3 Pro, or the M3 Studio situation, with a single-core CPU workload in the foreground. As I understand it, these specific numbers only play a role at high or very high GPU loads. Is that correct?

Or in other words: whichever M3 I choose, I'm safe with respect to my high-load single-core task, right?
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
I'm interested in understanding how these numbers relate to the CPU cores actually in use, especially for single-core tasks.

Let's say an application strictly uses only one core, but uses that core to nearly its full extent, and let's say there is no particularly heavy graphics/GPU load. How much does the new RAM and bandwidth situation matter then?

I read here, I think from @leman, that one CPU core alone can only generate about a 50 M/bit load (sorry if I'm phrasing that wrong; the "50" is what I remember, not the unit).

I'm not tech-savvy enough to really discuss this. My interest is the M2 Pro vs. the M3 Pro, or the M3 Studio situation, with a single-core CPU workload in the foreground. As I understand it, these specific numbers only play a role at high or very high GPU loads. Is that correct?

Or in other words: whichever M3 I choose, I'm safe with respect to my high-load single-core task, right?
From memory, Anandtech tested that a single M1 CPU core is already capable of saturating more than 60GB/s of bandwidth.

Though, IIRC, the M1 Max CPU clusters did not go higher than 200 GB/s.

M2 and M3 should go higher.
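For context, measurements like that generally come down to timing a large copy from one thread. A minimal sketch of the idea (not Anandtech's actual methodology; buffer size and iteration count are arbitrary, and a real benchmark is far more careful):

```swift
// Single-thread copy-bandwidth sketch.
import Foundation

let bytes = 1 << 30                               // 1 GiB, much larger than the SLC
let src = UnsafeMutableRawPointer.allocate(byteCount: bytes, alignment: 16384)
let dst = UnsafeMutableRawPointer.allocate(byteCount: bytes, alignment: 16384)
memset(src, 1, bytes)                             // touch pages so they are actually mapped
memset(dst, 0, bytes)

let iterations = 10
let start = DispatchTime.now()
for _ in 0..<iterations {
    memcpy(dst, src, bytes)                       // one read + one write of `bytes` per pass
}
let elapsed = Double(DispatchTime.now().uptimeNanoseconds - start.uptimeNanoseconds) / 1e9

let gbPerSecond = Double(2 * bytes * iterations) / elapsed / 1e9
print(String(format: "single-core copy bandwidth: %.1f GB/s", gbPerSecond))

src.deallocate()
dst.deallocate()
```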
 
  • Like
Reactions: streetfunk

danwells

macrumors 6502a
Apr 4, 2015
783
617
I don't know the details, but it's the GPU more than the CPU that uses high memory bandwidth. I looked up a couple of numbers for memory bandwidth.

AMD Threadripper Pro 7995 WX: 332.8 GB/s (a 96-core, eight memory channel monster released in October)

Intel Sapphire Rapids Xeon 8490H: 300 GB/s ($17,000 60-core server processor from this year, eight channels)

Intel Core i9 14900K: 96 GB/s (one of Intel's top-end desktop processors)

Apple M3 Max MacBook Pro: 400 GB/s

Apple M2 Ultra Mac Studio: 800 GB/s

So far, it looks like Apple has the memory bandwidth of the beefiest workstations around (those are $10,000 and $17,000 CPUs with 350 watt TDPs, capable of using even more power than that) crammed into a laptop, with twice that in an innocent-looking desktop workstation. A really top-end desktop CPU, before you get into server chips with a ton of channels, has the memory bandwidth of a MacBook Air.

The catch is that big GPUs can go even higher:

nVidia GeForce RTX 4080 Mobile GPU: 432 GB/s (RTX 4090 mobile: 576 GB/s)

AMD Radeon 7900 XTX desktop GPU: 960 GB/s

nVidia GeForce RTX 4090 desktop GPU: 1008 GB/s

Another wild card is how having the CPU and GPU accessing the same memory affects performance. That could cut both ways. On the one hand, data transfer between CPU and GPU through unified memory is MUCH faster than over PCIe. Even PCIe 4.0 x16 (used by modern, high-end GPUs) is about 32 GB/s. A MacBook Air can do three times better than that, and a Mac Studio can do roughly 25 times better.
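Rough arithmetic behind those ratios; the 16 GT/s per-lane rate and 128b/130b encoding are from the PCIe 4.0 spec, and the 100 and 800 GB/s figures are the marketed unified-memory numbers for Air-class and Ultra-class chips:

```swift
// PCIe 4.0 x16 throughput per direction vs. unified memory bandwidth.
let lanes = 16.0
let gigatransfersPerLane = 16.0
let pcie4x16GBps = lanes * gigatransfersPerLane * (128.0 / 130.0) / 8   // ≈ 31.5 GB/s

print(pcie4x16GBps)
print(100.0 / pcie4x16GBps)   // MacBook Air-class unified memory: ~3x PCIe 4.0 x16
print(800.0 / pcie4x16GBps)   // Ultra-class Mac Studio: ~25x
```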

On the other hand, I could see contention between CPU and GPU for the memory bandwidth as an issue?

Supercomputers like to use shared memory pools to transfer data, and they're not designed to be slow, so I tend to think Apple's Unified Memory Architecture may have more benefits than downsides from a performance perspective. The clear downside is that you can't upgrade the darn RAM (yes, that really IS an architectural limitation, not just Apple making money)...
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
On the other hand, I could see contention between CPU and GPU for the memory bandwidth as an issue?
I would think that Apple has likely sized their SoC's SLC sufficiently large, based on the most common workflows, to make this a non-issue.

Of course, if the workload is random enough that it walks all over the memory space and causes the caches to thrash, UMA or not, it will be bad.
 
  • Like
Reactions: streetfunk

leman

macrumors Core
Oct 14, 2008
19,517
19,664
i´m interested to understand these numbers vs. the relation of the CPU cores beeing in use.
Especially vs. single Core related tasks.

lets say on a application that strictly uses only one core, but that one to quasi full extent,
and lets say there is no special high graphics/GPU load, how much is it playing a role that we have now a different RAM and bandwith situation ?

This is a great question! It is important to understand that the CPU cores are not connected to memory directly, but instead have to go through multiple data highways and distribution stations (caches). Another crucial thing is that Apple Silicon organizes CPU cores in clusters that share some infrastructure (in particular, memory data links and caches).

In the M1/M2 family, the data link between a single CPU core and the RAM can sustain approximately 100GB/s. This is much higher than what most other CPUs can do. So on the base chip, you can saturate the entire RAM connection with just one core, which is great for tasks that copy memory a lot. When you start working with multiple cores, though, the situation gets more complicated. You see, the data links between individual cores and the RAM are not independent but partially shared between cores. When you measure bandwidth using two cores, you only get 190GB/s instead of 200; three cores get a bit more, but still under 200, etc. Anandtech measured close to 200GB/s total bandwidth using only P-cores and 224GB/s combining P- and E-cores. Overall, it seems like there are hard limits per CPU cluster. I would speculate that a performance core cluster (consisting of four cores on M1/M2) tops out at 100GB/s and the efficiency core cluster provides another 25GB/s. So topping out the cores will max out these data links.
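A toy model of that speculation (the per-core and per-cluster caps are the guesses above for M1/M2-class chips, not Apple specifications, and it ignores the partial sharing that makes two cores measure ~190 rather than 200):

```swift
// Toy model of speculated per-cluster bandwidth limits on an M1 Max-class chip.
struct Cluster { let cores: Int; let capGBps: Double }

let perCoreGBps = 100.0                                   // single-core link, as above
let m1MaxClusters = [Cluster(cores: 4, capGBps: 100),     // P-cluster 0
                     Cluster(cores: 4, capGBps: 100),     // P-cluster 1
                     Cluster(cores: 2, capGBps: 25)]      // E-cluster

// Aggregate CPU-side bandwidth when active[i] cores are busy in cluster i.
func aggregateGBps(_ clusters: [Cluster], active: [Int]) -> Double {
    zip(clusters, active).reduce(0.0) { total, pair in
        total + min(Double(pair.1) * perCoreGBps, pair.0.capGBps)
    }
}

print(aggregateGBps(m1MaxClusters, active: [1, 0, 0]))    // 100.0 - one P-core
print(aggregateGBps(m1MaxClusters, active: [4, 4, 0]))    // 200.0 - P-cores only
print(aggregateGBps(m1MaxClusters, active: [4, 4, 2]))    // 225.0 - close to the ~224 GB/s measured
```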

In the M3 family Apple has changed some things. Performance core clusters are now six cores wide rather than four. We don’t know if the cluster still uses the same 100GB/s data link or whether it has been increased. These things would need to be measured.

At any rate, there is little doubt that the M3 Pro will have lower aggregate CPU memory bandwidth than the M2 Pro, simply because the M2 Pro could access 200GB/s and the new chip only has 150GB/s. On the Max, it depends on whether Apple has increased the per-cluster data channel; if they did, the Max could see higher CPU bandwidth. We don’t know yet.

Now, will this matter in practice? Probably not. It’s easy enough to construct an artificial example that will use a lot of RAM bandwidth, but finding a real use case is trickier. After all, the CPU is not only copying data; it also has to do something useful with the data. So far, we see the M3 Pro outperforming the M2 Pro on every benchmark despite having lower bandwidth. And at the end of the day, the M3 Pro still has 3-4x higher single-core RAM bandwidth than any other CPU on the market.
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
Now, will this matter in practice? Probably not. It’s easy enough to construct an artificial example that will use a lot of RAM bandwidth, but finding a real use case is trickier. After all, the CPU is not only copying data; it also has to do something useful with the data. So far, we see the M3 Pro outperforming the M2 Pro on every benchmark despite having lower bandwidth. And at the end of the day, the M3 Pro still has 3-4x higher single-core RAM bandwidth than any other CPU on the market.
Couldn't have said it better myself.

I think Apple's CPU architects, designers, and engineers know what they are doing.
 

streetfunk

macrumors member
Feb 9, 2023
82
41
Another crucial thing is that Apple Silicon organizes CPU cores in clusters that share some infrastructure (in particular, memory data links and caches).
This is the point. It's hard to grasp from a non-tech person's perspective. Plus, the M3 is now different again.

Performance core clusters are now six cores wide rather than four. ....
.....
We don’t know if the cluster still uses the same 100GB/s data link or whether it has been increased. ....
.....
These things would need to be measured.
(This was originally one paragraph; I just added spacing to set each part apart and put them in bold.)
Very good! I see... =>

At any rate, there is little doubt that the M3 Pro will have lower aggregate CPU memory bandwidth than the M2 Pro, simply because the M2 Pro could access 200GB/s and the new chip only has 150GB/s. On the Max, it depends on whether Apple has increased the per-cluster data channel; if they did, the Max could see higher CPU bandwidth. We don’t know yet.
OK, so we don't know yet!
And even you folks would need more info on the table.

The question here is whether the M3 Max would be a win over the M3 Pro purely on memory bandwidth for single-core-bound applications (which is the funny one). So I see that this is not known yet.

it seems like there are hard limits per CPU cluster. I would speculate that a performance core cluster (consisting of four cores on M1/M2) tops out at 100GB/s
These are the relationships and numbers I'd like to know more about.

In theory, with six P-cores per cluster now, wouldn't that require a bigger data channel to match the efficiency of the M2 cluster architecture?


Repeating the same quote:

On the Max, it depends on whether Apple has increased the per-cluster data channel; if they did, the Max could see higher CPU bandwidth. We don’t know yet.
We need to know this ;)

the per-cluster data channel
This seems to be the magic term here.

When you folks talk about the whole SoC and its parts, is this a specific block, or is it more of a conglomerate of things?

Thank you very much, @leman, and also @danwells and @quarkysg!
 

Spanky Deluxe

macrumors demi-god
Mar 17, 2005
5,285
1,789
London, UK
It was pretty much limited to a very small set of motherboards.

Does anyone have a source for triple channel being used on Macs?
The last three generations of Mac Pro before the trash can Mac Pro, and the trash can itself, all used triple-channel RAM (some generations were entirely triple-channel; some, like the trash can, I think had quad-channel versions too). Intel kind of used to have two concurrent processor lines, one more consumer-grade that was dual channel and one more professional/prosumer that was triple channel. I had 96GB of RAM in my Mac Pro 4,1 upgraded to 5,1, and it operated in triple channel. That was one of the reasons it could still keep up relatively well (along with other upgrades) until I replaced it with my Mac Studio.
 
  • Like
Reactions: RokinAmerica

CWallace

macrumors G5
Aug 17, 2007
12,525
11,542
Seattle, WA
My confusion is this. If they are using it, why not market it? If they are not, why the odd sizes of Ram?

Because to the vast majority of their user-base, it doesn't matter and they would not understand it even if Apple did market it to them, so it would be counter-productive to do so.

I am not even sure app developers need to know about it since Apple abstracts so much of the hardware, but if there is a need for someone outside of the Apple engineering team to know, there is probably a technical document about it available somewhere.
 

danwells

macrumors 6502a
Apr 4, 2015
783
617
Triple CHANNEL (triple data rate does not exist) is actually towards the low end of Apple's channel count.

Single channel: iPhone(?), non-M-series iPads(?) (also lower-end PC chips)

Dual Channel: M1, M2, M3 (Airs, lowest-end MacBook Pros, iMac, some Mac Minis - also most average PCs, including most gaming rigs - a few gaming rigs about five to ten years ago used Xeon-derived quad-channel chips and boards).

Triple Channel: M3 Pro. The situation with triple-channel RAM in old Mac Pros (~2010) is confusing. Some of them really WERE triple-channel, and are fastest with three or six DIMMs (one or two per channel), even though they have four or eight slots. Adding RAM to the extra slot puts them in dual-channel mode (somewhat slower). Others (including the trash cans) are actually quad-channel, although Apple continued to ship three or six DIMMs (dropping them to triple-channel). In that case, adding an identical DIMM to the extra slot speeds them up. Some other older Xeon machines are triple channel.

Quad Channel: M1 Pro, M2 Pro (MacBook Pros, top-end Mac Mini, also High-End Desktop PCs like Threadripper and Xeon-W)

Six-channel: M3 Max (binned version) (MacBook Pros. There may be another six-channel chip out there somewhere, but I can't think of what? Some weird Xeon or Epyc???). The other way to six channels is TWO old triple-channel Xeons (some of them were dual-processor capable)

Eight-channel: M1 Max, M2 Max, M3 Max (fully enabled) (MacBook Pros, Mac Studio. Eight-channel RAM is common on servers and also shows up on workstations using server-derived chips). Sometimes also seen in dual-processor machines using quad-channel chips.

Sixteen-channel: M1 Ultra, M2 Ultra, presumably M3 Ultra (Mac Studio, Mac Pro. I'm otherwise ONLY aware of sixteen-channel RAM in dual-processor servers and dual-processor workstations with a lot of server components. The Ultras are at least arguably dual processors in one package (and BY FAR the most affordable dual processor workstations in existence - most dual processor, sixteen channel Intel or AMD workstations start around $20,000)...).

These are the channel counts I can think of Apple using, and also the common channel counts across the industry. Twelve-channel may crop up on the M3 Ultra if a binned version exists, and is otherwise a rare (but perhaps not unheard of) configuration stemming from two six-channel chips. The only other configuration I am aware of in current use is 8n-channel, where n is the number of processors in the system. Sixteen-channel is the smallest of these, but 32-, 48-, and 64-channel configurations exist in 4-, 6-, and 8-processor servers. Supercomputers may technically be 8n for n > 8...
 