If this is representative of your workload, then you are VERY far from needing 32GB RAM. And if it isn't representative, then I don't get why you posted it.
Someone upthread asked me to check my Activity Monitor. So I did. I also checked it while I was rendering and there didn’t seem to be much difference.
But is it really the case that AS's memory latency is lower than that of high-end PC systems?
What baseless speculation? I have seen the GPU of my M1 Max pull 21GB of RAM. So you need more RAM with M1 chips as the GPU will eat into the RAM. There is no dedicated VRAM.
Looking at that, 8GB would be more than enough LOL.
It’s not. In fact, it’s often higher. LPDDR already has higher latency than desktop DDR, and the way Apple structures its memory hierarchy adds even more latency. This is one of the reasons why they use a large SLC. The SLC doesn’t really seem to be that much faster than RAM, but it helps with latency.
And overall, the RAM approach of Apple Silicon is more similar to a traditional GPU than to a CPU. Which makes sense.
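If anyone wants to sanity-check the latency claim rather than argue about it, a pointer-chasing loop is the usual way to measure it: every load depends on the previous one, so once the working set blows past the caches the time per step is dominated by memory latency, not bandwidth. A minimal sketch in Swift; the buffer size and step count are arbitrary illustration values.

```swift
import Foundation

// Minimal pointer-chase sketch: dependent loads defeat prefetching and
// out-of-order overlap, so ns/step approximates memory latency once the
// working set is much larger than the caches. (Sizes are illustrative.)
let elementCount = 64 * 1024 * 1024 / MemoryLayout<Int>.size   // ~64 MB of Ints
var next = Array(0..<elementCount)
next.shuffle()   // random permutation; short cycles are possible but fine for a sketch

var index = 0
let steps = 10_000_000
let start = DispatchTime.now().uptimeNanoseconds
for _ in 0..<steps {
    index = next[index]
}
let elapsed = DispatchTime.now().uptimeNanoseconds - start
print("~\(Double(elapsed) / Double(steps)) ns per dependent load (ended at \(index))")
```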
That’s your argument? “I saw X once so I conclude a completely arbitrary Y?” If your workflow required 21GB for the GPU, how much do you think a traditional GPU system will need?
What don't you understand about traditional GPUs having dedicated VRAM on the card itself, so that it doesn't eat into the RAM? Some GPUs even come with 24GB of VRAM shipped on the card itself.
M1 chips don't have dedicated VRAM for the GPU to use, so the GPU will eat into your RAM, and thus you need more RAM with M1 chips.
Your RAM will disappear when the M1 Max GPU is using 21GB of it, while if you had a dedicated GPU you'd still have plenty of RAM available.
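If you want an actual number instead of eyeballing Activity Monitor, Metal will tell you how much of the shared pool the GPU side has currently claimed. A minimal sketch, assuming a Metal-capable device; the figures you get obviously depend on your own workload.

```swift
import Metal

// Sketch: report how much memory the GPU has allocated out of the shared pool,
// and the working-set budget Metal recommends for this device.
if let device = MTLCreateSystemDefaultDevice() {
    let usedGB = Double(device.currentAllocatedSize) / 1_073_741_824
    let budgetGB = Double(device.recommendedMaxWorkingSetSize) / 1_073_741_824
    print("GPU allocations: \(usedGB) GB of a ~\(budgetGB) GB working-set budget")
}
```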
Caches are implemented using static RAM while normal memory is dynamic RAM. SRAM is a lot faster to access compared to DRAM, as IIRC there's no refresh required for SRAM.
Sorry, I should have been more clear. When I wrote "doesn't seem to be faster" I was talking about bandwidth.
Yeah ... the SLC is running at most at CPU speed, which is 3.2 - 3.5 GHz, and only transfers 1 cache line per cycle. LPDDR5 now runs at 3.2 GHz and higher with 2 transfers per cycle. SRAM still wins at random read tho. heh heh ...
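Putting rough numbers on that (assuming a 64-byte cache line per SLC transfer and something like LPDDR5-6400 on a 512-bit bus for the M1 Max; both figures are assumptions for illustration):

```swift
// Back-of-the-envelope peak bandwidth; all inputs are assumed round numbers.
let slcClockGHz = 3.5                        // SLC assumed to track CPU clock
let cacheLineBytes = 64.0                    // one cache line per cycle (assumed)
let slcGBps = slcClockGHz * cacheLineBytes   // ≈ 224 GB/s per port

let lpddr5TransfersPerSec = 6.4e9            // 3.2 GHz clock × 2 transfers per cycle
let busWidthBytes = 512.0 / 8.0              // assumed 512-bit memory bus (M1 Max)
let dramGBps = lpddr5TransfersPerSec * busWidthBytes / 1e9   // ≈ 410 GB/s

print("SLC ≈ \(slcGBps) GB/s per port, LPDDR5 ≈ \(dramGBps) GB/s")
```

So on raw streaming bandwidth the DRAM side really can keep up; the benefit of the SLC is mostly the lower latency, as said above.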
Since the entire thread is about how much RAM you need, it would likely help quite a bit if you showed a situation where you are actually tasking the machine to the maximum of your workload, and not one where you have just booted it, as that tells no-one how you use the machine.
You appear to have a very simplistic view of the GPU hardware and driver stack. What you write might sound logical to you, but unfortunately, that's not how software and drivers usually work.
Seems like you don't understand it. VRAM and RAM are separate from each other, and dedicated GPUs cannot use RAM as VRAM unless you mess around in the BIOS to allocate RAM as VRAM, which is generally not recommended.
Since graphical data is stored in the VRAM, you have more RAM available than with M1 chips.
I've been doing graphics and GPU programming for around 20 years. Your attempts to explain things to me are cute, but I'd rather recommend you focus your energy on learning how the real world works.
So what is your version of how graphical data is used with dedicated GPUs? Does it go to VRAM or to RAM?
I can load textures and the limit is how much VRAM I have on the dedicated GPU, not the RAM that I have.
What you are missing is the fact that a copy of the data will often be kept in system RAM, either by the application, by the driver, or both. The purpose of the GPU RAM is not to relieve the pressure on system RAM, but to allow the GPU to do its thing with good performance. GPU drivers and the OS will often do things you might not expect. You need to measure things like RAM usage to know for sure.
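To make that concrete, here is a hedged sketch of the usual upload path on a discrete GPU, written with Metal for brevity: the texture bytes live in the app's own array, get staged through a CPU-visible buffer, and are then blitted into a private (VRAM-resident) texture. At least one copy sits in system RAM for as long as the app or driver keeps it around; sizes and the pixel format are illustration values.

```swift
import Metal

// Sketch of the classic "staging copy" upload pattern on a discrete GPU.
guard let device = MTLCreateSystemDefaultDevice(),
      let queue = device.makeCommandQueue() else { fatalError("No Metal device") }

// 1. The application's own copy of the texture lives in ordinary system RAM.
let width = 2048, height = 2048
let pixels = [UInt8](repeating: 128, count: width * height * 4)   // RGBA8 data

// 2. A CPU-visible staging buffer: another system-RAM copy, owned by the driver.
let staging = device.makeBuffer(bytes: pixels, length: pixels.count,
                                options: .storageModeShared)!

// 3. The destination texture is private, i.e. resident in GPU memory (VRAM on a dGPU).
let desc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm,
                                                    width: width, height: height,
                                                    mipmapped: false)
desc.storageMode = .private
let texture = device.makeTexture(descriptor: desc)!

// 4. Blit the staging buffer into the private texture.
let cmd = queue.makeCommandBuffer()!
let blit = cmd.makeBlitCommandEncoder()!
blit.copy(from: staging, sourceOffset: 0,
          sourceBytesPerRow: width * 4, sourceBytesPerImage: pixels.count,
          sourceSize: MTLSize(width: width, height: height, depth: 1),
          to: texture, destinationSlice: 0, destinationLevel: 0,
          destinationOrigin: MTLOrigin(x: 0, y: 0, z: 0))
blit.endEncoding()
cmd.commit()
// `pixels` and `staging` could be released here, but many engines keep them
// around for streaming, device-loss recovery, or mipmap regeneration.
```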
But with M1 chips, 100% of the graphical data is in RAM. With a dedicated GPU, it is not 100% that goes to RAM, as most of it will be in VRAM.
The exact percentage doesn't matter, as it is lower than 100%.
With Unified Memory, as Apple has it, the RAM is there on the System-on-a-Chip and all processing/co-processing/sub-processing components can access that data at once without having to move it to and from anywhere.
The RAM being tightly coupled to the SoC is not a prerequisite for UMA. The Amiga had UMA in 1985 and the RAM chips were very separate.
Some of the things in VRAM are likely to only be stored in VRAM like the frame buffer and depth buffer. Other things in VRAM are probably duplicated in system RAM like textures and geometry data. In graphically intense applications like many games the latter significantly outweighs the former. So just when you most need VRAM size to be in addition to system RAM size is when you least get this effect. Not sure where modern compositing window managers with GPU accelerated GUIs fall on this scale.
You're trying to compare systems with a dedicated GPU to systems with iGPUs, which is the very definition of an "apples to oranges" comparison. In systems with iGPUs (which is the point of comparison here), x86 systems reserve part of the system RAM for graphics, and have to copy data to both "partitions" so that the CPU and GPU can work on the data in question. Because those systems then have to reconcile the changes made on both copies of the data, that introduces a hit to both performance and latency (because data is now traveling back and forth between RAM and the CPU multiple times). With Apple Silicon, both the CPU and GPU can access the same data simultaneously without needing to a) partition RAM, b) create multiple copies of data, and c) reconcile changes made by both the CPU and GPU before executing said code.
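A hedged sketch of what that looks like in Metal on a unified-memory machine: one `.storageModeShared` allocation that the CPU fills and a GPU compute pass reads directly, with no staging buffer or upload step in between. The `scaleKernel` function name is hypothetical, just standing in for whatever GPU work you'd run.

```swift
import Metal

// Sketch: a single shared allocation touched by both CPU and GPU on a
// unified-memory system. No second copy, no upload step.
guard let device = MTLCreateSystemDefaultDevice() else { fatalError("No Metal device") }
print("Unified memory:", device.hasUnifiedMemory)   // true on Apple Silicon

let count = 1_000_000
let buffer = device.makeBuffer(length: count * MemoryLayout<Float>.stride,
                               options: .storageModeShared)!

// The CPU writes straight into the memory the GPU will read.
let values = buffer.contents().bindMemory(to: Float.self, capacity: count)
for i in 0..<count { values[i] = Float(i) }

// A compute pass binds the very same buffer; "scaleKernel" is a hypothetical
// kernel in the app's default library, shown only to illustrate the data path.
if let library = device.makeDefaultLibrary(),
   let function = library.makeFunction(name: "scaleKernel"),
   let pipeline = try? device.makeComputePipelineState(function: function),
   let queue = device.makeCommandQueue(),
   let cmd = queue.makeCommandBuffer(),
   let encoder = cmd.makeComputeCommandEncoder() {
    encoder.setComputePipelineState(pipeline)
    encoder.setBuffer(buffer, offset: 0, index: 0)
    encoder.dispatchThreads(MTLSize(width: count, height: 1, depth: 1),
                            threadsPerThreadgroup: MTLSize(width: 64, height: 1, depth: 1))
    encoder.endEncoding()
    cmd.commit()
    cmd.waitUntilCompleted()
    // The CPU reads the result through the same pointer, again without a copy.
    print("values[0] after GPU pass:", values[0])
}
```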
In modern systems, UMA is not even remotely the same setup as those Commodore and Amiga systems used.
An Amiga 500 would have CPU and "GPU" assets freely mixed in chip-RAM, one unified address space where the processor and co-processors could access and manipulate everything. How is that "not even remotely the same setup"?