Hi,
How much RAM does the M1 take from the memory pool to run the GPU?
I want to get it for music production and I'd be happy with 16 gigs, but I'm unsure how much memory the GPU uses, especially with an external display.
The M1 GPU does not have RAM permanently allocated to it the way the Intel machines do, where the GPU would have perhaps 1.5GB set aside that the CPU therefore couldn't use. If you're not using the GPU that hard, then you'll have most of the RAM available for the CPU.
How do you know this?
m1 soc unified memory architecture - Google search
How is that supposed to support the claim you are making? What I am asking is - do you have any factual evidence, or a reference to a source presenting such factual evidence, that GPU memory allocation works differently on Intel and Apple GPUs? Both use a unified memory architecture with a last-level cache shared between the CPU and GPU. The M1 definitely reserves some memory for GPU use, although I am unsure how much.
Wow, you must have read through those articles quickly. Oh, wait. You didn't read them, did you?
Building everything into one chip gives the system a unified memory architecture.
This means that the GPU and CPU are working over the same memory. Graphics resources, such as textures, images and geometry data, can be shared between the CPU and GPU efficiently, with no overhead, as there's no need to copy data across a PCIe bus.
I am quite sure I read most of the relevant ones. I have also been doing technical analysis and low-level benchmarking of the M1 GPU since I got my unit in December. That's also the reason I am asking - maybe there is some new technical information out there that I am not aware of yet.
Intel desktop and laptop chips do not use UMA; they use Shared Memory. On Intel, the GPU has to be allocated a partitioned chunk of RAM before it can be used. The CPU cannot access the partition allocated to the GPU and vice versa, so you still need to copy data between the partitions.
The whole difference between unified memory and shared memory is the lack of partitioning. The GPU and CPU can access the same block of memory. Compared to Intel, it provides two specific benefits (see the sketch after the list):
1. Without having to pre-allocate RAM to be used as video memory, you don’t have to deal with specific limits to video memory (the 1.5GB mentioned earlier), and can more easily balance between CPU and GPU demand.
2. Not needing to copy buffers means some measurable RAM savings, less pressure on memory bandwidth, and a bit of a latency boost.
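To make the "no pre-allocated video memory" point concrete, here is a minimal Metal sketch in Swift, assuming macOS and any Metal-capable GPU: a buffer created with storageModeShared lives in ordinary system RAM that both the CPU and GPU address directly, so nothing is copied into a carved-out VRAM partition. This only illustrates the zero-copy idea at the API level, not how any particular driver manages pages underneath.

```swift
import Metal

// Sketch: one shared allocation, visible to both CPU and GPU, no copy step.
guard let device = MTLCreateSystemDefaultDevice() else {
    fatalError("No Metal-capable GPU found")
}

let count = 1024
let buffer = device.makeBuffer(length: count * MemoryLayout<Float>.stride,
                               options: .storageModeShared)!

// The CPU writes straight into the same pages the GPU will later read.
let values = buffer.contents().bindMemory(to: Float.self, capacity: count)
for i in 0..<count {
    values[i] = Float(i)
}

// From here the buffer can be bound to a compute or render command encoder
// as-is; there is no blit into a separate VRAM partition first.
```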
Unless anybody reports back with memory pressure readings from a 16GB M1 machine running something close to your intended workload, it's anybody's guess. "Music production" is a how-long-is-a-piece-of-string question to start with (RAM usage depends entirely on what sort of virtual instruments and plug-ins you're using). Some music apps have quite elaborate UIs, and running a couple of high-res displays will need more RAM allocated to video - especially if you're using scaled modes, where everything is rendered to an internal buffer and then downsampled.
Odds are, an M1 will not only do your job, but do it faster, because it is all-round more efficient and what you lose on the roundabouts you gain on the swings - and there have been plenty of YouTube demos showing it running a shedload of Logic Pro tracks and instruments.
However, the safe assumption is that if your workflow actually needed more than 16GB on Intel, then it will at least benefit from more than 16GB on Apple Silicon, and it would be best to wait for the higher-end Apple Silicon systems to come out. Even if an M1 can currently outperform a high-end Intel iMac or 16" MBP, in six months' time it's going to be getting sand kicked in its face - this strange hiatus where the entry-level Macs apparently outperform the more expensive ones won't last for long. Apple can't afford for it to go on, or it's going to hit higher-end Mac sales.
That said, you need to be sure that you really do need all the RAM on your Intel Mac in the first place (look at Memory Pressure).
Video RAM-wise, the M1 is almost certain to be better than a MacBook or Mini with Intel integrated graphics. Versus an iMac with a discrete GPU and 8GB+ of VRAM, it is harder to tell.
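For anyone who wants a number rather than a guess, Metal will report how much of the unified pool the GPU is advised to use, and how much a given process has currently allocated on it. A rough sketch in Swift (these are macOS-only, advisory figures that vary by machine; they are not a fixed partition):

```swift
import Metal

// Sketch: query the GPU's advisory memory limits on a unified-memory Mac.
if let device = MTLCreateSystemDefaultDevice() {
    // Upper bound Metal suggests for this GPU's working set (in bytes).
    let workingSetGB = Double(device.recommendedMaxWorkingSetSize) / 1_073_741_824
    // Bytes this process has currently allocated on the device.
    let allocatedMB = Double(device.currentAllocatedSize) / 1_048_576
    print("\(device.name): recommended max working set \(workingSetGB) GB")
    print("Currently allocated by this process: \(allocatedMB) MB")
}
```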
(Also, you need to carefully check whether all the plug-ins, drivers, etc. you need are compatible with Big Sur yet, let alone the M1...)
Telling someone to effectively "go Google" isn't particularly helpful when the Internet is swimming with bogus information and unfounded speculation. Everything I've seen from Apple has been extremely vague, more marketing than technical info, and boiled down to "Unified Memory is faster because data doesn't have to be copied between devices" which says nothing about how RAM is allocated. All you get with a Google search is lots of tech sites speculating on the same limited Apple data. The possibility that the equivalent VRAM would be allocated "on demand" is a very plausible speculation - but unless someone can point to the Apple document that details that, it is speculation.
Reality seems to be that Unified Memory is more efficient - but how much more efficient is hard to test, and hard to isolate from the other performance gains of the M1 (...which might look much less impressive when higher-end Apple Silicon Macs appear).
Lots of the YouTube stuff seems to come from people who don't understand the difference between "Memory Used" and "Memory Pressure" or "Swap used" and swap rate - or are looking for a RAM-related speedup on workflows that don't strain the RAM on an Intel system...
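If you want to watch pressure rather than raw usage while running your own workload, the system exposes memory-pressure notifications directly. A small sketch in Swift, assuming you just want to log the warning/critical transitions (these reflect roughly the same pressure levels Activity Monitor visualizes):

```swift
import Dispatch
import Foundation

// Sketch: log system memory-pressure transitions while a workload runs.
let source = DispatchSource.makeMemoryPressureSource(eventMask: [.warning, .critical],
                                                     queue: .main)
source.setEventHandler {
    let event = source.data
    if event.contains(.critical) {
        print("Memory pressure: CRITICAL")
    } else if event.contains(.warning) {
        print("Memory pressure: warning")
    }
}
source.resume()

// Keep the process alive so events can arrive while you exercise your apps.
RunLoop.main.run()
```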
Intel documentation disagrees with you.
As far as I recall, Intel has been using unified memory since at least Sandy Bridge, maybe earlier. There might be some restrictions; I'm not 100% sure.
The M1 processor’s memory is a single pool that’s accessible by any portion of the processor. If the system needs more memory for graphics, it can allocate that. If it needs more memory for the Neural Engine, likewise. Even better, because all the aspects of the processor can access all of the system memory, there’s no performance hit when the graphics cores need to access something that was previously being accessed by a processor core. On other systems, the data has to be copied from one portion of memory to another—but on the M1, it’s just instantly accessible.
One thing you have to realize is that Intel's use of the term "UMA" is misleading. For Intel's purposes, they just renamed Intel HD to UMA, but made no changes to the underlying architecture.
Think this could help: https://developer.apple.com/videos/play/wwdc2020/10632/
This explains the method of rendering common to mobile, consoles, and low-power GPUs, and it's why you need less memory than with dGPUs. Tile-based rendering has been a way to handle 3D in smaller tiles and scan out the changes, etc. It's been done for years on consoles as well as newer tablets (iPad).
Given that, it isn't cut and dried how much memory you use for textures and rendering. It's different from how dGPUs work with framebuffers.
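If you want to check which kind of GPU you're actually dealing with, Metal exposes GPU family queries. A tiny sketch in Swift (assumes macOS 10.15 or later for supportsFamily; the family names are simply the ones Apple defines, apple7 being the M1 generation):

```swift
import Metal

// Sketch: distinguish an Apple tile-based GPU from a traditional iGPU/dGPU.
if let device = MTLCreateSystemDefaultDevice() {
    if device.supportsFamily(.apple7) {
        print("\(device.name): Apple family 7 (M1-class, tile-based deferred renderer)")
    } else if device.supportsFamily(.apple1) {
        print("\(device.name): earlier Apple tile-based GPU")
    } else {
        print("\(device.name): immediate-mode GPU (e.g. Intel or AMD)")
    }
}
```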
Let’s dig into the last point, the on-chip memory. With the M1, this is also part of the SoC. The memory in the M1 is what is described as a ‘unified memory architecture’ (UMA) that allows the CPU, GPU, and other cores to exchange information between one another, and with unified memory, the CPU and GPU can access memory simultaneously rather than copying data between one area and another. Erik continues…
“For a long time, budget computer systems have had the CPU and GPU integrated into the same chip (same silicon die). In the past saying ‘integrated graphics’ was essentially the same as saying ‘slow graphics’. These were slow for several reasons:
Separate areas of this memory got reserved for the CPU and GPU. If the CPU had a chunk of data it wanted the GPU to use, it couldn’t say “here have some of my memory.” No, the CPU had to explicitly copy the whole chunk of data over the memory area controlled by the GPU.”
"access the same data without copying it between multiple pools of memory"
I'll be honest, leman and mi7chy (despite the brash attitude) do point to good sources showing that Intel does at least support zero-copy and dynamic partition sizing.
Even if it's not true UMA as you claim, it looks like, from the docs I've read through so far, it's at least able to dedicate memory pages for zero-copy use, which I expect involves some interesting tricks to make the same RAM pages visible to both sides.
The question I have, which I'm hoping the docs will answer once I get more time, is how those pages are handled in more detail, and what sort of integration the OS has to do to take best advantage of this. But even with that answer, if the OS APIs have to signal to the GPU how to manage the pages, how much optimization has Apple done there?
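For what it's worth, the page-level sharing being described is visible at the API level on both Intel and Apple silicon Macs: Metal can wrap an existing page-aligned CPU allocation as a GPU buffer without copying it. A minimal sketch of that mechanism in Swift (this only shows the API contract; it says nothing about what the driver does underneath, which is exactly the open question here):

```swift
import Metal
import Foundation

// Sketch: wrap a page-aligned host allocation as a zero-copy Metal buffer.
let device = MTLCreateSystemDefaultDevice()!

let pageSize = Int(getpagesize())
let length = pageSize                      // must be a multiple of the page size

var memory: UnsafeMutableRawPointer?
posix_memalign(&memory, pageSize, length)  // page-aligned CPU-side allocation

let buffer = device.makeBuffer(bytesNoCopy: memory!,
                               length: length,
                               options: .storageModeShared,
                               deallocator: { pointer, _ in free(pointer) })

// The GPU reads the very same physical pages the CPU wrote; no blit, and no
// second copy inside a carved-out VRAM partition.
print("Zero-copy buffer length: \(buffer?.length ?? 0) bytes")
```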
Welp, my understanding of Intel’s GPU architecture is proven to be out of date. Indeed, there shouldn’t be huge differences in that case.
My understanding was that there was still some fixed partitioning going on, but it looks like Google’s dredging up old articles on this, which led me down the wrong path.
My understanding after re-skimming the video (it's been a few months since I last watched it) is that this doesn't necessarily impact the amount of video memory needed all that much, but rather the pressure placed on memory bandwidth.
I still need X MB for a texture of a given size, and the same for the frame buffer, in either design. However, TBDR reduces how often you need to reach out to (V)RAM, especially in situations where you need to make multiple passes. It *might* reduce intermediate buffers a little, but that assumes intermediate buffers are a noticeable contribution compared to the other buffers in use. My understanding was that the back buffer itself was used as the intermediate buffer, so I am a bit skeptical that there are big gains to be had there. Draw-to-texture seems to be common these days, so there might be more than I expect, assuming these scenarios can all be done at the tile level rather than via a texture.
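One concrete place where TBDR can save actual allocation, rather than just bandwidth, is memoryless attachments: on Apple-family GPUs, an attachment that is produced and consumed within a single render pass (a depth buffer, say) can live purely in on-chip tile memory and never get a backing allocation in unified RAM. A sketch of the idea in Swift (the 1920x1080 size and depth format are just placeholders):

```swift
import Metal

// Sketch: a depth attachment that exists only in tile memory on Apple GPUs.
let device = MTLCreateSystemDefaultDevice()!

let depthDesc = MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .depth32Float,
                                                         width: 1920,
                                                         height: 1080,
                                                         mipmapped: false)
depthDesc.usage = .renderTarget
depthDesc.storageMode = .memoryless        // valid only on Apple-family GPUs

let depthTexture = device.makeTexture(descriptor: depthDesc)

let passDesc = MTLRenderPassDescriptor()
passDesc.depthAttachment.texture = depthTexture
passDesc.depthAttachment.loadAction = .clear
passDesc.depthAttachment.storeAction = .dontCare   // never written back to RAM
```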
For sure out of date. I'm even more horrified by the fact they will now sell (to OEMs) discrete versions of the Iris Xe. The architecture is spreading out more and more. Such a mess.
So true, I can't believe they are even making discrete cards of these as well for 11th Gen. The Iris Xe is nothing more than a rebranded Intel UHD iGPU (which itself was rebranded from Intel HD). It's like they're trying to present a Chevelle as a brand new car just by repainting the exterior...
https://software.intel.com/sites/de...tel-Processor-Graphics-Gen11_R1new.pdf#page19
Seems to describe the Gen 11 iGPU only? Does it also apply to earlier generations' iGPUs?