If this is representative of your workload, then you are VERY far from needing 32GB RAM. And if it isn't representative, then I don't get why you posted it.
Someone upthread asked me to check my Activity Monitor. So I did. I also checked it while I was rendering and there didn’t seem to be much difference.
But is it really the case that AS's memory latency is lower than that of high-end PC systems?
What baseless speculation? I have seen the GPU of my M1 Max pull 21GB of RAM. So you need more RAM with M1 chips as the GPU will eat into the RAM. There is no dedicated VRAM.
Looking at that, 8GB would be more than enough LOL.
It’s not. In fact, it’s often higher. LPDDR already has higher latency than desktop DDR, and the way Apple structures its memory hierarchy adds even more latency. This is one of the reasons why they use a large SLC. The SLC doesn’t really seem to be that much faster than RAM, but it helps with latency.
And overall, the RAM approach of Apple Silicon is more similar to a traditional GPU than to a CPU. Which makes sense.
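If anyone wants to sanity-check the latency claim rather than argue about it, a pointer-chasing loop is the usual way to measure it: every load depends on the previous one, so once the working set blows past the caches the time per step is dominated by memory latency, not bandwidth. A minimal sketch in Swift; the buffer size and step count are arbitrary illustration values.

```swift
import Foundation

// Minimal pointer-chase sketch: dependent loads defeat prefetching and
// out-of-order overlap, so ns/step approximates memory latency once the
// working set is much larger than the caches. (Sizes are illustrative.)
let elementCount = 64 * 1024 * 1024 / MemoryLayout<Int>.size   // ~64 MB of Ints
var next = Array(0..<elementCount)
next.shuffle()   // random permutation; short cycles are possible but fine for a sketch

var index = 0
let steps = 10_000_000
let start = DispatchTime.now().uptimeNanoseconds
for _ in 0..<steps {
    index = next[index]
}
let elapsed = DispatchTime.now().uptimeNanoseconds - start
print("~\(Double(elapsed) / Double(steps)) ns per dependent load (ended at \(index))")
```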
That’s your argument? “I saw X once so I conclude a completely arbitrary Y?” If your workflow required 21GB for the GPU, how much do you think a traditional GPU system will need?
What don't you understand about traditional GPUs having dedicated VRAM on the card itself, so that it doesn't eat into the RAM? Some GPUs even come with 24GB of VRAM shipped on the card itself.
M1 chips don't have dedicated VRAM for the GPU to use, so the GPU will eat into your RAM, and thus you need more RAM with M1 chips.
Your RAM will disappear when the M1 Max GPU is using 21GB of it, while if you had a dedicated GPU you'd still have plenty of RAM available.
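If you want an actual number instead of eyeballing Activity Monitor, Metal will tell you how much of the shared pool the GPU side has currently claimed. A minimal sketch, assuming a Metal-capable device; the figures you get obviously depend on your own workload.

```swift
import Metal

// Sketch: report how much memory the GPU has allocated out of the shared pool,
// and the working-set budget Metal recommends for this device.
if let device = MTLCreateSystemDefaultDevice() {
    let usedGB = Double(device.currentAllocatedSize) / 1_073_741_824
    let budgetGB = Double(device.recommendedMaxWorkingSetSize) / 1_073_741_824
    print("GPU allocations: \(usedGB) GB of a ~\(budgetGB) GB working-set budget")
}
```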
Caches are implemented using static RAM while normal memory is dynamic RAM. SRAM is a lot faster to access compared to DRAM, as IIRC there's no refresh required for SRAM.
Sorry, I should have been more clear. When I wrote "doesn't seem to be faster" I was talking about bandwidth.
Yeah ... the SLC is running at most at CPU speed, which is 3.2 - 3.5 GHz, and only transfers 1 cache line per cycle. LPDDR5 now runs at 3.2 GHz and higher with 2 transfers per cycle. SRAM still wins at random read tho. heh heh ...
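Putting rough numbers on that (assuming a 64-byte cache line per SLC transfer and something like LPDDR5-6400 on a 512-bit bus for the M1 Max; both figures are assumptions for illustration):

```swift
// Back-of-the-envelope peak bandwidth; all inputs are assumed round numbers.
let slcClockGHz = 3.5                        // SLC assumed to track CPU clock
let cacheLineBytes = 64.0                    // one cache line per cycle (assumed)
let slcGBps = slcClockGHz * cacheLineBytes   // ≈ 224 GB/s per port

let lpddr5TransfersPerSec = 6.4e9            // 3.2 GHz clock × 2 transfers per cycle
let busWidthBytes = 512.0 / 8.0              // assumed 512-bit memory bus (M1 Max)
let dramGBps = lpddr5TransfersPerSec * busWidthBytes / 1e9   // ≈ 410 GB/s

print("SLC ≈ \(slcGBps) GB/s per port, LPDDR5 ≈ \(dramGBps) GB/s")
```

So on raw streaming bandwidth the DRAM side really can keep up; the benefit of the SLC is mostly the lower latency, as said above.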
Since the entire thread is about how much RAM you need, it would likely help quite a bit if you showed a situation where you are actually tasking the machine to the maximum of your workload, and not one where you have just booted it, as that tells no-one how you use the machine.
You appear to have a very simplistic view of the GPU hardware and driver stack. What you write might sound logical to you, but unfortunately, that's not how software and drivers usually work.
Seems like you don't understand it. VRAM and RAM are separate from each other, and dedicated GPUs cannot use RAM as VRAM unless you mess around in the BIOS to allocate RAM as VRAM, which is generally not recommended.
Since graphical data is stored in the VRAM, you have more RAM available than with M1 chips.
I've been doing graphics and GPU programming for around 20 years. Your attempts to explain things to me are cute, but I'd rather recommend you focus your energy on learning how the real world works.
So what is your version of how graphical data is used with dedicated GPUs? Does it go to VRAM or to RAM?
I can load textures and the limit is how much VRAM I have on the dedicated GPU, not the RAM that I have.
What you are missing is the fact that a copy of the data will often be kept in system RAM, either by the application, by the driver, or both. The purpose of the GPU RAM is not to relieve the pressure on system RAM, but to allow the GPU to do its thing with good performance. GPU drivers and the OS will often do things you might not expect. You need to measure things like RAM usage to know for sure.
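To make that concrete, here is a hedged sketch of the usual upload path on a discrete GPU, written with Metal for brevity: the texture bytes live in the app's own array, get staged through a CPU-visible buffer, and are then blitted into a private (VRAM-resident) texture. At least one copy sits in system RAM for as long as the app or driver keeps it around; sizes and the pixel format are illustration values.

```swift
import Metal

// Sketch of the classic "staging copy" upload pattern on a discrete GPU.
guard let device = MTLCreateSystemDefaultDevice(),
      let queue = device.makeCommandQueue() else { fatalError("No Metal device") }

// 1. The application's own copy of the texture lives in ordinary system RAM.
let width = 2048, height = 2048
let pixels = [UInt8](repeating: 128, count: width * height * 4)   // RGBA8 data

// 2. A CPU-visible staging buffer: another system-RAM copy, owned by the driver.
let staging = device.makeBuffer(bytes: pixels, length: pixels.count,
                                options: .storageModeShared)!

// 3. The destination texture is private, i.e. resident in GPU memory (VRAM on a dGPU).
let desc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm,
                                                    width: width, height: height,
                                                    mipmapped: false)
desc.storageMode = .private
let texture = device.makeTexture(descriptor: desc)!

// 4. Blit the staging buffer into the private texture.
let cmd = queue.makeCommandBuffer()!
let blit = cmd.makeBlitCommandEncoder()!
blit.copy(from: staging, sourceOffset: 0,
          sourceBytesPerRow: width * 4, sourceBytesPerImage: pixels.count,
          sourceSize: MTLSize(width: width, height: height, depth: 1),
          to: texture, destinationSlice: 0, destinationLevel: 0,
          destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
blit.endEncoding()
cmd.commit()
// `pixels` and `staging` could be released here, but many engines keep them
// around for streaming, device-loss recovery, or mipmap regeneration.
```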
But with M1 chips, 100% of the graphical data is in RAM. With a dedicated GPU, it is not 100% that goes to RAM, as most of it will be in VRAM.
The exact percentage doesn't matter, as it is lower than 100%.
With Unified Memory, as Apple has it, the RAM is there on the System-on-a-Chip and all processing/co-processing/sub-processing components can access that data at once without having to move it to and from anywhere.
The RAM being tightly coupled to the SoC is not a prerequisite for UMA. The Amiga had UMA in 1985 and the RAM chips were very separate.
Some of the things in VRAM are likely to only be stored in VRAM like the frame buffer and depth buffer. Other things in VRAM are probably duplicated in system RAM like textures and geometry data. In graphically intense applications like many games the latter significantly outweighs the former. So just when you most need VRAM size to be in addition to system RAM size is when you least get this effect. Not sure where modern compositing window managers with GPU accelerated GUIs fall on this scale.
You're trying to compare systems with a dedicated GPU to systems with iGPUs, which is the very definition of an "apples to oranges" comparison. In systems with iGPUs (which is the point of comparison here), x86 systems reserve part of the system RAM for graphics, and have to copy data to both "partitions" so that the CPU and GPU can work on the data in question. Because those systems then have to reconcile the changes made on both copies of the data, that introduces a hit to both performance and latency (because data is now traveling back and forth between RAM and the CPU multiple times). With Apple Silicon, both the CPU and GPU can access the same data simultaneously without needing to a) partition RAM, b) create multiple copies of data, and c) reconcile changes made by both the CPU and GPU before executing said code.
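A hedged sketch of what that looks like in Metal on a unified-memory machine: one `.storageModeShared` allocation that the CPU fills and a GPU compute pass reads directly, with no staging buffer or upload step in between. The `scaleKernel` function name is hypothetical, just standing in for whatever GPU work you'd run.

```swift
import Metal

// Sketch: a single shared allocation touched by both CPU and GPU on a
// unified-memory system. No second copy, no upload step.
guard let device = MTLCreateSystemDefaultDevice() else { fatalError("No Metal device") }
print("Unified memory:", device.hasUnifiedMemory)   // true on Apple Silicon

let count = 1_000_000
let buffer = device.makeBuffer(length: count * MemoryLayout<Float>.stride,
                               options: .storageModeShared)!

// The CPU writes straight into the memory the GPU will read.
let values = buffer.contents().bindMemory(to: Float.self, capacity: count)
for i in 0..<count { values[i] = Float(i) }

// A compute pass binds the very same buffer; "scaleKernel" is a hypothetical
// kernel in the app's default library, shown only to illustrate the data path.
if let library = device.makeDefaultLibrary(),
   let function = library.makeFunction(name: "scaleKernel"),
   let pipeline = try? device.makeComputePipelineState(function: function),
   let queue = device.makeCommandQueue(),
   let cmd = queue.makeCommandBuffer(),
   let encoder = cmd.makeComputeCommandEncoder() {
    encoder.setComputePipelineState(pipeline)
    encoder.setBuffer(buffer, offset: 0, index: 0)
    encoder.dispatchThreads(MTLSize(width: count, height: 1, depth: 1),
                            threadsPerThreadgroup: MTLSize(width: 64, height: 1, depth: 1))
    encoder.endEncoding()
    cmd.commit()
    cmd.waitUntilCompleted()
    // The CPU reads the result through the same pointer, again without a copy.
    print("values[0] after GPU pass:", values[0])
}
```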
In modern systems, UMA is not even remotely the same setup as those Commodore and Amiga systems used.
An Amiga 500 would have CPU and "GPU" assets freely mixed in chip-RAM, one unified address space where the processor and co-processors could access and manipulate everything. How is that "not even remotely the same setup"?