32GB M1 RAM can equal 32GB of RAM + 28GB VRAM

rezwits · Dec 13, 2021

Eric40962005 said:
The RTX A6000 is in a different galaxy than this M1 thing, sorry man.

Not completely, I mean the M1s have ML Cores and ECC? But maybe the GPU implementation isn't up to snuff or close i.e. "Floating Point Math." I would be interested in seeing what kind of CAD programs evolve too but whatever.

IDK, cause honestly I just got ideas from the M1/Pro/Max and what they are capable of... I didn't really comprehend the sickness of 32+ GB of GPU VRAM...

I'm learning and reading...

altho, IDK, I am throwing 24GB of "data" around like a pizza, well more like I am "kneading the dough"

mi7chy · Dec 14, 2021

Some overzealous fanboys used to claim 8GB = 16GB but reality is shared memory contention will cause lag. We should hold them accountable for misleading and wasting people's money.

cmaier · Dec 14, 2021

mi7chy said:
Some overzealous fanboys used to claim 8GB = 16GB but reality is shared memory contention will cause lag. We should hold them accountable for misleading and wasting people's money.

8GB != 16GB, but UMA doesn’t necessarily cause a shared memory contention issue, either, so long as you have enough memory for your working set. People should just buy the memory they need and ignore the strange theories that keep popping up on here.

Jorbanead · Dec 14, 2021

mi7chy said:
Some overzealous fanboys used to claim 8GB = 16GB but reality is shared memory contention will cause lag. We should hold them accountable for misleading and wasting people's money.

This is your favorite talking point at parties, isn’t it?

avkills · Dec 16, 2021

Since I am laughing hard right now, I am going to throw a wrench into his thinking... I have a W6800X Duo card in my Mac Pro with 64GB of total VRAM (32GB per GPU), but for GPU 3D rendering we can only load up to 32GB, since all the object/texture data needs to be loaded into the VRAM of both GPUs for the rendering to work right.

I actually have more VRAM than system RAM (going to upgrade after the holidays). System runs just fine; benchmarks for Octane are pretty much the same as people with tons of system RAM.

And yes, the NVIDIA A6000 card has 48GB of VRAM and is currently the best overall GPU on the market for super high-end 3D work (not gaming), but the W6xxxx series for the Mac Pro are no slouches either.

Unified memory is the future, although both AMD and NVIDIA are working on clever plans to allow the CPU direct access to the GPU memory.

Really, the only time GPU memory needs to be loaded to system RAM and back again is if the CPU needs to do calculations on something that is in GPU RAM; this is kind of where unified RAM kicks ass.

Ok back to the real world of whatever RAM you have is what you have....

MacCheetah3 · Dec 16, 2021

avkills said:
I actually have more VRAM than system RAM (going to upgrade after the holidays). System runs just fine; benchmarks for Octane are pretty much the same as people with tons of system RAM.

Interesting… I have wondered whether Otoy’s recommendations were about being extremely cautious (i.e., prepared for anything).

Otoy said:
A good rule of thumb is that you should have three to four times more RAM than GPU memory for the system to operate efficiently. The reason for the range is that not every scene requires the same amount of RAM usage, and you also want some extra RAM for other processes you’re running on your system.

Otoy said:
For example, If you have one RTX 3090 with 24GB of VRAM, you will already want three to four times the amount of system RAM in order to fully fill it (this varies depending on the scene and other RAM usage in your system). This means you’re looking at a minimum of 72GB of system RAM to keep everything happy. If you are running two 3090s that are connected via NVLink, you will actually want a minimum of 144GB of RAM, which is actually more than many consumer-end motherboards will support.

https://help.otoy.com/hc/en-us/articles/360054367272-Hardware-Guide-for-OctaneRender
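The arithmetic behind the quoted rule of thumb can be sketched in a few lines of Python. This is only an illustration of the "three to four times VRAM" guideline as quoted above; the function name and parameters are made up for the example and are not from Otoy.

```python
def recommended_system_ram_gb(vram_gb_per_gpu, gpu_count=1, multiplier=3):
    """System RAM (GB) suggested by an 'N times total VRAM' rule of thumb.

    Hypothetical helper: it just multiplies total VRAM by the multiplier.
    """
    return vram_gb_per_gpu * gpu_count * multiplier


# One RTX 3090 with 24GB of VRAM: low and high end of the 3x-4x range.
single_low = recommended_system_ram_gb(24)
single_high = recommended_system_ram_gb(24, multiplier=4)

# Two 3090s connected via NVLink: total VRAM doubles, so does the minimum.
dual_low = recommended_system_ram_gb(24, gpu_count=2)

print(f"One 24GB GPU: {single_low}-{single_high}GB of system RAM")
print(f"Two 24GB GPUs: at least {dual_low}GB of system RAM")
```

The 72GB and 144GB minimums in the quote fall out of the 3x multiplier; as the quote itself says, the real requirement varies with the scene.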

avkills · Dec 16, 2021

MacCheetah3 said:
Interesting… I have wondered whether Otoy’s recommendations were about being extremely cautious (i.e., prepared for anything).

https://help.otoy.com/hc/en-us/articles/360054367272-Hardware-Guide-for-OctaneRender

Well, of course what OTOY says is more or less correct, but heavy 3D scenes are not my day-to-day, and in fact right now I do not even use Octane to render anything, though it is something I am looking at. I am going to try to go to 192GB of RAM using 6 sticks, just in case I want to go to 384GB later.

darngooddesign · Dec 16, 2021

8GB feels like 16GB because there is very little performance penalty to SSD swapping beyond your 8GB. So while you used to need 16GB to avoid lag, that isn't as much the case now.

avkills · Dec 16, 2021

darngooddesign said:
8GB feels like 16GB because there is very little performance penalty to SSD swapping beyond your 8GB. So while you used to need 16GB to avoid lag, that isn't as much the case now.

How is that different from any other modern computer system/OS that uses the hard drive for swap space? Faster hard drive = less noticeable performance hit when using swap. Swap performance has nothing to do with whether the system uses unified memory.

darngooddesign · Dec 16, 2021

avkills said:
How is that different from any other modern computer system/OS that uses the hard drive for swap space? Faster hard drive = less noticeable performance hit when using swap. Swap performance has nothing to do with whether the system uses unified memory.

It's not different, except that the M1's memory bandwidth and SSD read/write speeds are a lot faster due to the RAM and drive integration. Unified just refers to the GPU using the same pool of memory, but that's not what we're talking about.

leman · Dec 16, 2021

avkills said:
How is that different from any other modern computer system/OS that uses the hard drive for swap space? Faster hard drive = less noticeable performance hit when using swap. Swap performance has nothing to do with whether the system uses unified memory.

Well, technically most x86 systems use unified memory, so that's indeed not the reason. Also, I don't think that anyone has ever quantified whether (and if so, by how much) swap on M1 machines is indeed faster. Note also that faster SSDs won't really help; we need low latency here, not high bandwidth (and it's not like M1 SSDs have ultra-low latency). Anecdotally, however, there are two things worth mentioning:

1. Apple Silicon uses 16K memory pages, which ought to make swapping a bit more efficient (data transfer batching)
2. There is this very interesting paper from a couple of years ago: https://web.cs.unlv.edu/jisooy/paper/yang_pmbench.pdf where the authors demonstrate that the limiting factor of swap performance is actually the OS. It is possible that Apple has reimplemented the pager to use more efficient latency-optimized algorithms.
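To put a rough number on the batching point in item 1, here is a small Python sketch that counts how many page-sized transfers it takes to move the same amount of data with 16K pages versus 4K pages. The 4K figure is used only as a comparison point (it is the common default page size on x86), and the 1GiB working set is an arbitrary, hypothetical value. It says nothing about actual swap latency.

```python
KIB = 1024
GIB = 1024 ** 3


def pages_needed(num_bytes, page_size):
    """Number of pages required to hold num_bytes, rounded up."""
    return -(-num_bytes // page_size)  # ceiling division with integers


working_set = 1 * GIB  # hypothetical amount of data that has to be swapped

for page_size in (4 * KIB, 16 * KIB):
    count = pages_needed(working_set, page_size)
    print(f"{page_size // KIB}K pages: {count} page transfers")
```

Same data, a quarter of the page transfers with the larger page size, which is the sense in which bigger pages batch the work.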

avkills · Dec 16, 2021

darngooddesign said:
It's not different, except that the M1's memory bandwidth and SSD read/write speeds are a lot faster due to the RAM and drive integration. Unified just refers to the GPU using the same pool of memory, but that's not what we're talking about.

Well, of course: with a faster memory bus and a faster SSD, swap performance is going to be better.

And yes the memory bandwidth of the M1 Max is very impressive. SSD speed is just a function of newer/faster tech.

avkills · Dec 16, 2021

leman said:
Well, technically most x86 systems use unified memory, so that's indeed not the reason. Also, I don't think that anyone has ever quantified whether (and if so, by how much) swap on M1 machines is indeed faster. Note also that faster SSDs won't really help; we need low latency here, not high bandwidth (and it's not like M1 SSDs have ultra-low latency). Anecdotally, however, there are two things worth mentioning:

1. Apple Silicon uses 16K memory pages, which ought to make swapping a bit more efficient (data transfer batching)
2. There is this very interesting paper from a couple of years ago: https://web.cs.unlv.edu/jisooy/paper/yang_pmbench.pdf where the authors demonstrate that the limiting factor of swap performance is actually the OS. It is possible that Apple has reimplemented the pager to use more efficient latency-optimized algorithms.

Good points.

leman · Dec 16, 2021

darngooddesign said:
It's not different, except that the M1's memory bandwidth and SSD read/write speeds are a lot faster due to the RAM and drive integration.

Sure, but it's not that much faster than other premium laptops. And most importantly, bandwidth won't help you with swapping. Bandwidth is important when you want to transfer a lot of data quickly. But here we are talking about transferring 16KB pages! Even if you have 1000 page swap requests per second (at which point you already have a massive problem), that's still "only" 16MB per second, a mere trifle for any SSD.

Much more important for a seamless user experience is the latency: if the memory needs to be swapped in, it had better happen before the user notices that something is amiss. That's the real challenge and the real problem: how to keep the response time under this magical human perception threshold. And there are a lot of things that one can do here, from sleight of hand (e.g. delayed animation transitions that give the system some precious milliseconds to do its stuff) to improved kernel algorithms and data structures to prioritizing swap data transfers and other fancy things.
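The back-of-the-envelope bandwidth figure above can be checked with a short Python sketch. The 1000 requests per second and 16KB page size come from the post; the SSD throughput used for comparison is a purely hypothetical round number, not a measurement of any real drive.

```python
PAGE_SIZE_BYTES = 16 * 1024  # 16KB pages, as in the post


def swap_traffic_mb_per_s(page_requests_per_s, page_size=PAGE_SIZE_BYTES):
    """Data moved per second by swapping, in MB (10**6 bytes)."""
    return page_requests_per_s * page_size / 1e6


traffic = swap_traffic_mb_per_s(1000)

# Hypothetical sustained SSD throughput, chosen only for illustration.
assumed_ssd_mb_per_s = 2000
share = traffic / assumed_ssd_mb_per_s

print(f"{traffic:.1f} MB/s of swap traffic")
print(f"{share:.2%} of the assumed SSD throughput")
```

Even at a page request rate that already signals a serious memory shortage, the traffic is a small fraction of what the assumed drive can sustain, which is why the argument turns to latency rather than bandwidth.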

Analog Kid · Dec 16, 2021

leman said:
Sure, but it's not that much faster than other premium laptops. And most importantly, bandwidth won't help you with swapping. Bandwidth is important when you want to transfer a lot of data quickly. But here we are talking about transferring 16KB pages! Even if you have 1000 page swap requests per second (at which point you already have a massive problem), that's still "only" 16MB per second, a mere trifle for any SSD.

Much more important for a seamless user experience is the latency: if the memory needs to be swapped in, it had better happen before the user notices that something is amiss. That's the real challenge and the real problem: how to keep the response time under this magical human perception threshold. And there are a lot of things that one can do here, from sleight of hand (e.g. delayed animation transitions that give the system some precious milliseconds to do its stuff) to improved kernel algorithms and data structures to prioritizing swap data transfers and other fancy things.

I feel like we're starting to talk about two different issues here: swap for multitasking versus swap for very large datasets. If I'm changing focus from Xcode to Safari, then swap just needs to be fast enough that I don't notice the context switch, and SSD is fine. If I'm concerned about whether my 16GB of unified RAM feels like 32GB of unified RAM while doing heavy GPU processing, then presumably I'm doing a massive finite element model or an excruciatingly detailed render. In that case the latencies of going to swap are enormous compared to extra RAM, and you've bought the wrong hardware.

Fomalhaut · Dec 17, 2021

rezwits said:
Now let's say you put two machines side by side:

One a PC using that SILLY NVIDIA card for $10,000 at NewEgg, with 32 GB of RAM

One a M1 Pro with 32 GB of RAM

You would have:

n.b. I used the SILLY NVIDIA card for reference of having a 32GB GPU card; I think the most a Titan has is 12GB. I don't want to get into the reason why PC+NVIDIA gaming doesn't/can't use more at this point, but a workstation can if you're working with large graphical data sets that a normal PC gamer just couldn't even use... so maybe some other day

On the M1 Macs if the GPU cores use 28GB of your RAM then you only have 4GB left for other things....

Yes, unified memory does reduce copying between GPU memory and system RAM, but it doesn't magically create more memory as your "calculation" appears to show. Your GPU uses the same RAM - hence "unified".

I really don't know what you're trying to say...think about how the memory space is used by CPU and GPU. It is shared, so you can't add up the usage by GPU and CPU - it's just a single pool.
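The single-pool point can be written out as a tiny Python sketch. The function names are invented for the example, and the model is deliberately simplistic (it ignores the OS, caches and everything else that uses memory); it only shows that GPU usage comes out of the same 32GB on a unified system, while a discrete card has its own separate pool.

```python
def unified_left_for_cpu_gb(total_gb, gpu_use_gb):
    """Memory left for everything else when CPU and GPU share one pool."""
    if gpu_use_gb > total_gb:
        raise ValueError("GPU cannot use more than the shared pool holds")
    return total_gb - gpu_use_gb


def discrete_left_for_cpu_gb(system_ram_gb, vram_gb, gpu_use_gb):
    """With separate VRAM, GPU usage does not reduce system RAM."""
    if gpu_use_gb > vram_gb:
        raise ValueError("GPU cannot use more than its own VRAM")
    return system_ram_gb


# M1 with 32GB of unified memory, GPU using 28GB of it.
print(unified_left_for_cpu_gb(32, 28))

# PC with 32GB of system RAM plus a hypothetical 32GB card, GPU using 28GB.
print(discrete_left_for_cpu_gb(32, 32, 28))
```

The first case leaves 4GB for everything else, and the second leaves the full 32GB of system RAM, so the two setups can't be added up the way the original "calculation" does.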
