M1 Owners: Is 16GB the same as Intel or has it better memory management?

NJRonbo · Mar 22, 2023

pshufd said:
I have the base Studio and regretted not getting 64, but, in actual use, I don't need it.

You absolutely don't!

I was going to get the base model with 32GB. I felt that I couldn't live with myself if I didn't push for 64GB.

I have to tell you, I run my Studio with close to 40 utility apps running at startup and multiple tabs open on my browser.

On my Intel Machine, it would push 64GB. As I mentioned above, I am only pushing 30GB of memory right now under the same circumstances.

pshufd · Mar 22, 2023

NJRonbo said:
You absolutely don't!

I was going to get the base model with 32GB. I felt that I couldn't live with myself if I didn't push for 64GB.

I have to tell you, I run my Studio with close to 40 utility apps running at startup and multiple tabs open on my browser.

On my Intel Machine, it would push 64GB. As I mentioned above, I am only pushing 30GB of memory right now under the same circumstances.

I have a base Studio and an M1 Pro MacBook Pro with 32 GB. I figured that if I needed more RAM, I could just run some programs on the MacBook Pro. It turns out that I can run all of my stuff on either system which is nice because the Studio is hooked up to 4 monitors and I like it that I don't have to plug and unplug. I can just toss my MacBook Pro in my backpack and my environment is the same between the two systems.

NJRonbo · Mar 22, 2023

pshufd said:
I have a base Studio and an M1 Pro MacBook Pro with 32 GB. I figured that if I needed more RAM, I could just run some programs on the MacBook Pro. It turns out that I can run all of my stuff on either system which is nice because the Studio is hooked up to 4 monitors and I like it that I don't have to plug and unplug. I can just toss my MacBook Pro in my backpack and my environment is the same between the two systems.

I think Apple revolutionized the computer market when they moved away from Intel and created their own silicon chips.

These computers are simply amazing. More powerful than anything most of us could use them for. The memory management on these machines is remarkable.

I have been in awe of the Studio. I know there is an Ultimate model, but these standard ones outshine anything I have owned previously.

pshufd · Mar 22, 2023

NJRonbo said:
I think Apple revolutionized the computer market when they moved away from Intel and created their own silicon chips.

These computers are simply amazing. More powerful than anything most of us could use them for. The memory management on these machines is remarkable.

I have been in awe of the Studio. I know there is an Ultimate model, but these standard ones outshine anything I have owned previously.

I have 6.57 GB compressed RAM so there's definitely something there. I'm pretty sure that Intel Macs can compress RAM as well but the algorithms might be different if the compression performance is different between the two architectures. I'd guess that Apple Silicon has dedicated silicon to compression and decompression.

I do know that 16 isn't enough as I bought an M1 mini with 16 GB of RAM and it wasn't enough.

NJRonbo · Mar 22, 2023

pshufd said:
I have 6.57 GB compressed RAM so there's definitely something there. I'm pretty sure that Intel Macs can compress RAM as well but the algorithms might be different if the compression performance is different between the two architectures. I'd guess that Apple Silicon has dedicated silicon to compression and decompression.

I do know that 16 isn't enough as I bought an M1 mini with 16 GB of RAM and it wasn't enough.

Yeah, bought an M1 Mini 16GB ram when it first came out and returned it two days later. It choked.

name99 · Mar 22, 2023

pshufd said:
I have 6.57 GB compressed RAM so there's definitely something there. I'm pretty sure that Intel Macs can compress RAM as well but the algorithms might be different if the compression performance is different between the two architectures. I'd guess that Apple Silicon has dedicated silicon to compression and decompression.

I do know that 16 isn't enough as I bought an M1 mini with 16 GB of RAM and it wasn't enough.

The algorithm is probably similar on Intel and ARM. You can see the evolution of the algorithms in the Darwin source code. There've been about three generations of successive algorithms.

However
- Apple added a custom instruction to their chips to handle part of the compression (for the most recent algorithm).

- Apple has a dedicated LZ engine per cluster to perform page compression/decompression. Under normal circumstances this is apparently not used because it's slower (ie higher latency) than SW decode, but it is used when memory pressure becomes extreme, and may help with the machine not feeling like it has ground to a halt under these conditions.

In terms of amount of memory, no-one can know this but you. In MY case
- I have 32GB of RAM on my main (Intel) machine, but under normal circumstances no more than about 16GB is really being used. HOWEVER
- I also have 8GB of VRAM, and that's by far the largest pain point. Every so often I open enough Safari tabs+windows that this overflows and has to swap, and OMG you can IMMEDIATELY tell when that has happened.

So point is
(a) in one sense you need to add in the VRAM of your current machine to see your "working set", ie in my case if I got 16GB for my next machine I'd probably be unhappy; 24GB is probably what I need to also cover the VRAM (ie backing store for windows).
(b) on the other hand, no more 8GB limit, so I can open as many tabs/windows and more as on Intel without hitting that immediate cliff at 8GB.

The other issue of course is how long do you expect to use the machine before upgrading?
In my case, given everything I've told you, when M3 comes out I'll probably get a model with 32GB --24 GB would be "good enough", while the extra 8GB gives me padding for my workload to grow for five or seven years before the next upgrade.

MrGunny94 · Mar 22, 2023

With the transition of Apple Silicon I stopped having VMs since I cannot have fully working x86 VMs and that's fine. 16GB is more than enough for my coding and web apps open.

Maximum I have ever seen is 60% memory pressure and that's with Rosetta 2 aps opened.

dmccloud · Mar 22, 2023

pshufd said:
I have 6.57 GB compressed RAM so there's definitely something there. I'm pretty sure that Intel Macs can compress RAM as well but the algorithms might be different if the compression performance is different between the two architectures. I'd guess that Apple Silicon has dedicated silicon to compression and decompression.

I do know that 16 isn't enough as I bought an M1 mini with 16 GB of RAM and it wasn't enough.

Hard to verify this given just how little we know about Apple Silicon in terms of specific programming details, but I wouldn't be surprised if compression and decompression are handled by the ANE cores rather than either the CPU or GPU cores.

Sydde · Mar 22, 2023

dmccloud said:
Hard to verify this given just how little we know about Apple Silicon in terms of specific programming details, but I wouldn't be surprised if compression and decompression are handled by the ANE cores rather than either the CPU or GPU cores.

I doubt even that: I suspect they have hardwire logic circuits that handle the compression/decompression directly, without any kind of software coding at all. I mean, why not, if energy efficiency is what you want? It would be a system management function, so they just have a SPSR interface to the compression logic, the supervisor writes some registers and goes on about its business while the operation pours through the logic far, far more efficiently than would code. It is what I would do.

leman · Mar 23, 2023

dmccloud said:
Hard to verify this given just how little we know about Apple Silicon in terms of specific programming details, but I wouldn't be surprised if compression and decompression are handled by the ANE cores rather than either the CPU or GPU cores.

Why ANE specifically? Convolution is entirely different task from compression.

pshufd · Mar 23, 2023

name99 said:
The algorithm is probably similar on Intel and ARM. You can see the evolution of the algorithms in the Darwin source code. There've been about three generations of successive algorithms.

However
- Apple added a custom instruction to their chips to handle part of the compression (for the most recent algorithm).

- Apple has a dedicated LZ engine per cluster to perform page compression/decompression. Under normal circumstances this is apparently not used because it's slower (ie higher latency) than SW decode, but it is used when memory pressure becomes extreme, and may help with the machine not feeling like it has ground to a halt under these conditions.

In terms of amount of memory, no-one can know this but you. In MY case
- I have 32GB of RAM on my main (Intel) machine, but under normal circumstances no more than about 16GB is really being used. HOWEVER
- I also have 8GB of VRAM, and that's by far the largest pain point. Every so often I open enough Safari tabs+windows that this overflows and has to swap, and OMG you can IMMEDIATELY tell when that has happened.

So point is
(a) in one sense you need to add in the VRAM of your current machine to see your "working set", ie in my case if I got 16GB for my next machine I'd probably be unhappy; 24GB is probably what I need to also cover the VRAM (ie backing store for windows).
(b) on the other hand, no more 8GB limit, so I can open as many tabs/windows and more as on Intel without hitting that immediate cliff at 8GB.

The other issue of course is how long do you expect to use the machine before upgrading?
In my case, given everything I've told you, when M3 comes out I'll probably get a model with 32GB --24 GB would be "good enough", while the extra 8GB gives me padding for my workload to grow for five or seven years before the next upgrade.

I never even tried to run a Window virtual machine on my M1 mini.

I don't think that I need to upgrade for quite some time. I still have the M1 mini with 16 GB of RAM and could just run the mini and the Studio cooperatively if I really needed more RAM. I could also run the Studio with my Windows desktop cooperatively - it has 128 GB of RAM. I would just have to go back to screwing up the keyboard shortcuts switching back and forth between macOS and Windows. I have about 350 GB of RAM between all of my Windows and Mac systems and can mix and match as needed but it is nice running everything on the Studio. I don't see a need to upgrade for a long time unless my programs require more or I run additional applications which need it.

mr_roboto · Mar 23, 2023

dmccloud said:
Hard to verify this given just how little we know about Apple Silicon in terms of specific programming details, but I wouldn't be surprised if compression and decompression are handled by the ANE cores rather than either the CPU or GPU cores.

Sydde said:
I doubt even that: I suspect they have hardwire logic circuits that handle the compression/decompression directly, without any kind of software coding at all. I mean, why not, if energy efficiency is what you want? It would be a system management function, so they just have a SPSR interface to the compression logic, the supervisor writes some registers and goes on about its business while the operation pours through the logic far, far more efficiently than would code. It is what I would do.

None of the above. It's a pair of custom Arm instructions, wkdmc and wkdmd. They're named like that because Apple uses the WKdm compression algorithm; the 'c/d' on the end indicates compress or decompress.

The source of uncompressed data for wkdmc is always 1 VM page, and wkdmd always expands to exactly 1 VM page.

That said, if you go looking at XNU (kernel) source code releases, what you'll find is a software implementation of WKdm. I don't know whether this is something like them deciding to redact all custom instructions from public source releases, or they just aren't using the instructions even in shipping kernels (there could be a downside to actually using these instructions and they decided they weren't worthwhile after all).

name99 · Mar 24, 2023

mr_roboto said:
None of the above. It's a pair of custom Arm instructions, wkdmc and wkdmd. They're named like that because Apple uses the WKdm compression algorithm; the 'c/d' on the end indicates compress or decompress.

The source of uncompressed data for wkdmc is always 1 VM page, and wkdmd always expands to exactly 1 VM page.

That said, if you go looking at XNU (kernel) source code releases, what you'll find is a software implementation of WKdm. I don't know whether this is something like them deciding to redact all custom instructions from public source releases, or they just aren't using the instructions even in shipping kernels (there could be a downside to actually using these instructions and they decided they weren't worthwhile after all).

Apple uses multiple algorithms. WKDM used to be used, now LZ4 is used.
The switch seems to be at metacompressor() in
https://github.com/apple/darwin-xnu/blob/main/osfmk/vm/vm_compressor_algorithms.c
I don't care enough to figure what this is doing.
There is an Apple patent for a way to scanning, categorizing, and recording pages so that the VM system has some idea of what algorithm is optimal for a particular page.

You can see the assembly for the WKDM here

darwin-xnu/osfmk/arm64/WKdmCompress_16k.s at main · apple/darwin-xnu

Legacy mirror of Darwin Kernel. Replaced by https://github.com/apple-oss-distributions/xnu - apple/darwin-xnu

github.com

and for the LZ4 here

darwin-xnu/osfmk/arm64/lz4_decode_arm64.s at main · apple/darwin-xnu

Legacy mirror of Darwin Kernel. Replaced by https://github.com/apple-oss-distributions/xnu - apple/darwin-xnu

github.com

I have never heard of wkdmc/wkdmd instructions. They may be not exactly instructions but millicode or OS calls that expand into a WKCMD engine associated with a cluster. I'm aware of such an engine that, to my eyes, looked like an LZ engine, but maybe it can handle both?

The one ARM instruction I know of that's for this sort of task is one that accelerates the handling of LZ-FSE, specifically the FSE entropy part. (But, contrary to what I said above where I misrembered, I don't think that, ie LZ-FSE, is used for page compression; the extra compression is nice but not worth the extra time relative to basic LZ4.)

Sydde · Mar 24, 2023

leman said:
Why ANE specifically? Convolution is entirely different task from compression.

That is my first instinct as well. Compression seems like a stritcly deterministic process, especially if you want it to be lossless. However, there appears to be a meaningful correation between ML and compression, which could make a neural engine a good choice for that. But I think there would have to be some specialized logic in there if you want it fast and reliable.

leman · Mar 25, 2023

Sydde said:
That is my first instinct as well. Compression seems like a stritcly deterministic process, especially if you want it to be lossless. However, there appears to be a meaningful correation between ML and compression, which could make a neural engine a good choice for that. But I think there would have to be some specialized logic in there if you want it fast and reliable.

That's fascinating, thanks! Wouldn't this type of compression require training a new model for every to be compressed dataset however? Probably not the most efficient use of time and energy when compressing RAM pages...

Zest28 · Mar 25, 2023

NJRonbo said:
Yeah, bought an M1 Mini 16GB ram when it first came out and returned it two days later. It choked.

I have the same **** with my M2 MacBook Air with 16GB RAM. It chockes sometimes.

My 16” M1 Max MacBook Pro with 32GB RAM keeps on going.

But since I bought the M2 MacBook Air as an iPad replacement, I can live with it.

mr_roboto · Mar 25, 2023

name99 said:
I have never heard of wkdmc/wkdmd instructions. They may be not exactly instructions but millicode or OS calls that expand into a WKCMD engine associated with a cluster. I'm aware of such an engine that, to my eyes, looked like an LZ engine, but maybe it can handle both?

The place I heard about these is Asahi Linux docs:

GitHub - AsahiLinux/docs: Asahi Linux documentation

Asahi Linux documentation. Contribute to AsahiLinux/docs development by creating an account on GitHub.

github.com

JouniS · Mar 25, 2023

Sydde said:
That is my first instinct as well. Compression seems like a stritcly deterministic process, especially if you want it to be lossless. However, there appears to be a meaningful correation between ML and compression, which could make a neural engine a good choice for that. But I think there would have to be some specialized logic in there if you want it fast and reliable.

Information and probability are two sides of the same coin. Anything that deals with probabilities can be used for data compression, and sometimes even with good results. That said, data compression usually deals with problems that have known efficient solutions. Brute-forcing such problems with a big slow machine learning model tends to lead to solutions that are worse and/or slower.

There is a family of naive data compression algorithms that could benefit from machine learning. They take the data as a sequence of symbols (often bytes), predict the next symbol based on the symbols seen so far, and encode the actual next symbol according to the predicted probability distribution. These algorithms achieve good compression with many kinds of data, but they are very slow, because they have to do a lot of work with each symbol.

Practical compression algorithms transform the data into a shorter sequence of symbols before encoding it. For example, a Lempel–Ziv based algorithm could transform the data into a sequence of triples (n, i, s) meaning "take n symbols starting from i positions before and append symbol s". When optimized for speed, they may not even bother with complicated models, predictions, and encodings. Instead, they may choose one fixed model for n, another for i, and a third for s. In some cases, those models may even be hard-coded in the compressor.

satcomer · Mar 25, 2023

Yes memory is around the chip on M1-2 Macs for faster access and has be boughten with as much 16 or above you can enfold! Also the only things that really hit memory are intel Only applications seem to take up massive memory!
So yes if you want to make money doing audio, videos or picture taking so you compile much faster with at least 16G RAM in M1-2 for better management and as M1-2 stay away from Intel only applications if you can and memory will aways be in green!

driven01 · Mar 25, 2023

NJRonbo said:
Yeah, bought an M1 Mini 16GB ram when it first came out and returned it two days later. It choked.

What were you doing?

NJRonbo · Mar 26, 2023

driven01 said:
What were you doing?

My situation is rather unique. I have upwards of 30-40 utilities that open at login and remain in the top menu bar. Additionally, I run Safari with many pinned tabs that remain in memory.

So, without even doing a single thing, my Mac is using close to 16GB

name99 · Mar 26, 2023

mr_roboto said:
The place I heard about these is Asahi Linux docs:

GitHub - AsahiLinux/docs: Asahi Linux documentation

Asahi Linux documentation. Contribute to AsahiLinux/docs development by creating an account on GitHub.

github.com

Thanks!

MacInMotion · Jan 3, 2024

For all those people saying "RAM is RAM"

CheesePuff said:
RAM is RAM, and it will not behave differently on the M1 vs Intel in how RAM is managed. However, since the M1 uses Unified Memory, it doesn't need to allocate 1.5 GB right off the top solely to the iGPU, and when it comes to swap on the storage drive its faster then the read/write of the Flash memory of the models it replaced

and "Data is Data"

casperes1996 said:
Bear with me as I try and illustrate something with code.

Code:

typedef struct Point { UInt32 x; UInt32 y; } Point; int main(int argc, char** argv) { Point *p = malloc(sizeof(Point)*100); printf("How much space is heap allocated?"); return 0; }

It does not matter if you compile and run that on x86, ARM or any other architecture, it will always be a minimum of 6.4Kb allocated. Two 32 bit unsigned integers in a Point, and 100 points. There is no way for the CPU architecture to change the laws of physics on that.

I would like to point out that a lot of what gets stored in memory is software, a.k.a. code, a.k.a. machine language instructions. And on this point, the binary code compiled for Intel CPUs (amd64) is completely different from the binary code compiled for Apple's M-series chips (arm64).

The Intel CPUs are what are known as Complex Instruction Set Computers (CISC), while the M-series CPUs are what is known as Reduced Instruction Set Computers (RISC). CISC CPUs came first, RISC CPUs came later. When RISC computers were first introduced, the defining trade-off was:

CISC CPUs could take a single "complex" instruction, but it might take multiple clock cycles to execute.
RISC CPUs only implemented "simple" instructions, but each ran in a single clock cycle.

As a result, the first RISC programs, when compiled down to machine language (which are called "binaries"), were much bigger (took up more memory, sometimes as much as 4 times the memory) than the same source code compiled for a CISC CPU. So it is completely reasonable to ask if M-series software will use more memory than Intel-based software. I certainly expected that to be the case.

As it turns out, for a lot of reasons, RISC programs got shorter (binaries got smaller) and CISC programs got longer (binaries got bigger) as both architectures (and the compilers supporting them) have matured over the years. Now, the real-world examples of Apple software I have looked at that ship dual-architecture binaries (one for Intel, one for M-series) all have smaller M-series binaries than Intel binaries (around 10% smaller).

So we can conclude that Apple M-series computers actually use less memory than Intel-based computers, at least when they are running native code (as opposed to emulating Intel code), because the software binaries are the only thing that changes between the two architectures. On the flip side, due to the overhead of emulation, we can expect that Apple M-series computers use more memory than Intel-based computers when running Intel code under emulation, because in that case the OS needs both the Intel code and extra memory for the Rosetta translated M-series code, plus Rosetta code itself. How much more memory I don't know, but I've seen well researched estimates of 60% more memory for the code itself. Again, the data usage will be the same. However, it is very rare for me to find applications that are bigger than 100 MB of code, so compared to everything else, any extra memory usage from Rosetta is likely to be negligible.

Regarding GPU memory usage, that will definitely eat into your memory. On Intel Macs, the graphics cards have built-in dedicated memory. Mine has 8 GiB, which I think is the max for Intel MacBooks, and it routinely runs out of memory the way I use my machine (lots and lots of open browser windows, each with multiple tabs). That 8 GiB is now going to come out of the total system memory rather than being an add-on. For me, running several external displays, I expect the GPU to use up considerably more than 8 GiB on an M-series Mac, but for me that is a good thing, because swapping memory off a video card is brutally slow. For a lot of people, the GPU might only use up less than 1 GiB, but it will use up something.

On the whole, though, I conclude that if you are on an Intel Mac and add your current system RAM size to your current video card VRAM size and get that amount of memory, you will likely have the same or better performance on the new machine. The main exception would be if you are running compute-intensive Intel software that is not available in M-series native format.

theorist9 · Jan 3, 2024

armoured said:
Yes, migration assistant will work over ethernet cables (need a dongle on the M1 macs), wirelessly, or from a disk.

Reports are pretty positive generally that migration assistant works well over ethernet/wifi in the most recent macos/M1 macs. But note, usually ethernet/wifi are a bit slower,....

I'll just add that the speed difference is more than a bit. I've tested Migration Assistant between a 2019 iMac and an M1 MacBook Pro using wireless, an ethernet cable (with an adapter), and a 40 Gbps Cable Matters certified USB4 cable (connecting their Thunderbolt ports). I didn't record the figures, but I recall ethernet is faster than wireless, and Thunderbolt is multiples faster than ethernet.

Unless both machines have 10 Gbps ethernet, ethernet's max theoretical transfer rate will be 1 Gbps.

leman · Jan 3, 2024

MacInMotion said:
As it turns out, for a lot of reasons, RISC programs got shorter (binaries got smaller) and CISC programs got longer (binaries got bigger) as both architectures (and the compilers supporting them) have matured over the years. Now, the real-world examples of Apple software I have looked at that ship dual-architecture binaries (one for Intel, one for M-series) all have smaller M-series binaries than Intel binaries (around 10% smaller).

It is true that ARM64 often produces denser code than x86-64 (depending on application type and compilation settings), but code is tiny in relation to data. I just had a look at a bunch of applications, the difference between Intel and ARM code was between zero and few megabytes — completely negligible. Probably the biggest binary I have on my system is Baldur's Gates 3 (a demanding computer game that occupies around 100GB on disk), but even there the executable code size is 100MB — a small fraction of total memory usage. The Intel code segment is 7MB bigger than the ARM one. In summary, this will have no meaningful impact on memory consumption.

In addition, CPU technology has evolved a lot in the last 30 years. Original CISC and RISC CPU designs have as much relevance to today's technology as ancient chariots to modern cars. Some basic design philosophy has survived of course: contemporary "RISC" is designed with efficient execution in hardware in mind, while contemporary "CISC" is mostly a mixed bag that cares about compatibility with old code (in fact, Intel's current CPUs only real "CISC" feature is that operands can come from either registers or memory, so if one wants to argue that x86 is CISC the entire RISC/CISC distinction is trivially reduced to load-store vs. reg/mem design ).

MacInMotion said:
Regarding GPU memory usage, that will definitely eat into your memory. On Intel Macs, the graphics cards have built-in dedicated memory. Mine has 8 GiB, which I think is the max for Intel MacBooks, and it routinely runs out of memory the way I use my machine (lots and lots of open browser windows, each with multiple tabs). That 8 GiB is now going to come out of the total system memory rather than being an add-on. For me, running several external displays, I expect the GPU to use up considerably more than 8 GiB on an M-series Mac, but for me that is a good thing, because swapping memory off a video card is brutally slow. For a lot of people, the GPU might only use up less than 1 GiB, but it will use up something.

This is a popular argument but far from being obvious in practice. On many dGPU systems the application/driver/system will mirror a decent portion of GPU memory in system memory, for various reasons. So you can't conclude that the dGPU's separate memory pool gives you more effective memory. The specific memory usage needs to be measured in every case.

Regarding multiple external displays: you don't need a lot of memory to support a display on its own. Even a 5K monitor with full colors only needs around 60MB per frame. Of course, more displays usually means more visible and active apps, which is where the main memory cost comes from. Again, this is something that has to be measured for every use case.

M1 Owners: Is 16GB the same as Intel or has it better memory management?

macrumors 68040

macrumors G4

macrumors 68040

macrumors G4

macrumors 68040

macrumors 68030

macrumors 65816

macrumors 68040

macrumors 68030

macrumors Core

macrumors G4

macrumors 6502a

macrumors 68030

macrumors 68030

macrumors Core

macrumors 68030

macrumors 6502a

macrumors 6502a

Suspended

macrumors newbie

macrumors 68040

macrumors 68030

macrumors member

macrumors 601

macrumors Core

Our Staff