
project_2501

macrumors 6502a
Original poster
Jul 1, 2017
676
792
I have been surprised that Apple is still starting at 8GB RAM for the M1 MacBook Air and Pro laptops.

Software and data are just much bigger - the OS itself has grown, the apps we use have grown, the data files we use have grown.

My question is - are people not talking about this because there is something about the M1 version of macOS that is much smaller than the Intel version? Do third-party apps themselves compile to smaller binaries on M1?

I have also noticed that only 8GB M1 MacBook Airs are appearing in the Apple Refurbished store - are people dissatisfied? Is the use of Rosetta pushing an already constrained RAM over its limits?

Have the SSD wear problems been due to an over-reliance on swap-to-disk because 8GB is just too small for a 2020 laptop?

I am fairly technically literate so please do explain in as much detail as you need.
 

delsoul

macrumors 6502
Mar 7, 2014
459
717
People said the same thing about those SSD wear problems 10+ years ago - that the drives were going to die in no time - and here they are, still almost all working perfectly fine 10 years later. Nothing wrong with the new M1s. Some people will buy something and find anything to complain about (especially when there's nothing there to complain about) instead of enjoying it.
 

CheesePuff

macrumors 65816
Sep 3, 2008
1,456
1,578
Southwest Florida, USA
The size of the application binary has zero effect on memory usage; in fact, universal binaries are larger.

RAM is less of an issue on these devices because of the unified memory and faster storage, which allows memory to be swapped to storage more quickly.

macOS and applications still use RAM in exactly the same way on Apple Silicon as on Intel.

The refurb store generally always has more of the basic configurations of any device, as they're the most sold ones.
 

chabig

macrumors G4
Sep 6, 2002
11,450
9,321
Software and data are just much bigger - the OS itself has grown, the apps we use have grown, the data files we use have grown.
I don't really think that's true. The most data dense files are video files, and they haven't changed much. Besides that, the operating system manages memory using modern techniques that don't require everything to be resident in RAM at once.
My question is - are people not talking about this
I see that you haven't searched the forums. There are numerous threads on this.
I have also noticed that only 8GB M1 MacBook Airs are appearing in the Apple Refurbished store
That means nothing. There will be 16GB refurbished machines when they have them to sell.
Is the use of Rosetta pushing an already constrained RAM over its limits?
Rosetta doesn't require additional RAM, as it's not an emulation environment. Rosetta translates Intel code to M1 code, which then runs natively. It's done when an app is installed or first run and then the new code is saved.
 

project_2501

macrumors 6502a
Original poster
Jul 1, 2017
676
792
The size of the application binary has zero effect on memory usage; in fact, universal binaries are larger.

RAM is less of an issue on these devices because of the unified memory and faster storage, which allows memory to be swapped to storage more quickly.

macOS and applications still use RAM in exactly the same way on Apple Silicon as on Intel.

The refurb store generally always has more of the basic configurations of any device, as they're the most sold ones.

Hi CheesePuff - thanks for replying. Can I get clarification on two points?

1. If application binaries are bigger, they will consume more RAM. Does macOS load the binaries and then 'free()' the segments that are not for the current architecture?

2. How does unified memory reduce the need for memory size? If I load a large image into Affinity Photo or load data into a Python numpy array, and it needed X memory on Intel, it will still need X memory on ARM M1, surely? Is your point that 8GB RAM means we're using swap more but don't notice much because it is faster SSD storage?
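As a back-of-the-envelope illustration of point 2 (the image dimensions here are made up), the data's footprint is the same regardless of the ISA:

```python
from array import array

# A C double (float64) is 8 bytes on both Intel and Apple Silicon;
# the data itself does not shrink when the architecture changes.
pixels = 4000 * 3000                    # hypothetical 12-megapixel image, one channel
bytes_per_sample = array('d').itemsize  # 8 on both ISAs
footprint_mb = pixels * bytes_per_sample / 1024**2
print(footprint_mb)                     # about 91.6 MB either way
```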

Again - thanks for your patience.
 

project_2501

macrumors 6502a
Original poster
Jul 1, 2017
676
792
I don't really think that's true. The most data dense files are video files, and they haven't changed much. Besides that, the operating system manages memory using modern techniques that don't require everything to be resident in RAM at once.

I see that you haven't searched the forums. There are numerous threads on this.

That means nothing. There will be 16GB refurbished machines when they have them to sell.

Rosetta doesn't require additional RAM, as it's not an emulation environment. Rosetta translates Intel code to M1 code, which then runs natively. It's done when an app is installed or first run and then the new code is saved.
Thanks - I didn't know Rosetta was not an emulator but a translator that recompiles (transpiles) the code once.
 

Mac... nificent

macrumors 6502a
Nov 20, 2012
943
498
Rosetta translates Intel code to M1 code, which then runs natively. It's done when an app is installed or first run and then the new code is saved.
So I'm wondering if when a program starts running natively, does this old code get erased?
 

project_2501

macrumors 6502a
Original poster
Jul 1, 2017
676
792
So I'm wondering if when a program starts running natively, does this old code get erased?
I hope not. Updates to Rosetta can lead to new transcoded binaries which can run faster. If the old code is deleted, you can't rerun Rosetta.

Also storage is cheap, so no problem keeping the old code. Oh wait... Apple storage isn't cheap ;)
 

casperes1996

macrumors 604
Jan 26, 2014
7,599
5,770
Horsens, Denmark
1. If application binaries are bigger, they will consume more RAM. Does MacOS load the binaries and then 'free()' the segments that are not for the current architecture?

It doesn't load the whole binary. It loads only the relevant portion of it. Modern executables are not "raw"; They have a header. In macOS/Darwin this is a Mach-O executable header, in Linux it will typically be an ELF header. The header will store information about the executable, like at what offsets from the start of the file certain portions of it are located, like executable code vs data segments. This also extends to universal binaries, so the OS only needs to load the relevant portion of the binary.

Which also means that the file size of a universal binary ≠ its memory footprint. For single architecture binaries that more or less is the case but with some caveats that are different to each program.
ARM binaries are actually typically slightly larger than x86 binaries though it's a rather minuscule difference.
That said, sometimes the ARM binary will come out smaller. Here's an example from Final Cut

[Screenshot: lipo output showing the x86_64 and arm64 slice sizes of the Final Cut Pro binary]


That's in bytes. As you can see, both binaries are around 1.7MB. And there's only ≈88k between them. I think Final Cut is a fairly big example as well.

To get this kind of information yourself, run
lipo -detailed_info <path-to-binary>
Note it must be the path to the binary itself, not the .app bundle.

The lipo command can also be used to create a non-universal binary, splitting out individual architectures, if the additional MBs annoy you; you can read about its usage through its man page:

man lipo.
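For the curious, the fat-header layout that makes this slicing possible can be sketched in a few lines of Python. The cputype constants below are the real Mach-O values, but the offsets and sizes are invented for illustration:

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # magic number at the start of a universal (fat) binary

# Build a synthetic two-architecture fat header. Each fat_arch entry is
# five big-endian uint32s: cputype, cpusubtype, file offset, size, alignment.
CPU_TYPE_X86_64 = 0x01000007
CPU_TYPE_ARM64 = 0x0100000C
header = struct.pack(">II", FAT_MAGIC, 2)
header += struct.pack(">5I", CPU_TYPE_X86_64, 3, 0x4000, 1_800_000, 14)
header += struct.pack(">5I", CPU_TYPE_ARM64, 0, 0x200000, 1_700_000, 14)

# Parse it the way a loader would: read the header, then jump straight to
# the slice for the current architecture, never touching the other one.
magic, nfat = struct.unpack_from(">II", header, 0)
slices = {}
for i in range(nfat):
    cputype, cpusub, offset, size, align = struct.unpack_from(">5I", header, 8 + 20 * i)
    slices[cputype] = (offset, size)

print(slices[CPU_TYPE_ARM64])  # only this (offset, size) range gets mapped
```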

One of my best guesses about the M1's perceived need for less memory to not slow down as much compared to its x86 brethren is not (entirely) about disk speed - that is of course a factor - but about the chip's insanely wide OOO (out-of-order) buffer. If the chip knows some 500 instructions in advance that it will need specific data from memory, memory perhaps being swapped to disk, it can request it far in advance of actually needing it. If memory serves, Apple's M1 reorder queue is about 3x larger than that of Intel's Ice Lake chips. Keep in mind that this is not an ISA difference. It's not about ARM vs x86; this is specific to each CPU micro-architecture.
Whether it's the reason, a contributing factor minor or major or whatever else may be going on I can't state confidently and don't have an M1 to write micro-benchmarks on. But it's my best guess at the moment. That and very intelligent swapping that knows which data is more required when; Perhaps even using the Neural Engine for said predictions, but that's unfounded speculation.
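A toy latency model (numbers invented; nothing like a real simulator) shows why a deeper out-of-order window hides memory latency better:

```python
# Toy model: n independent loads, each costing 1 issue cycle plus
# MEM_LATENCY cycles of memory latency. With an out-of-order window of W
# instructions, up to W loads can be in flight at once, overlapping
# their latencies instead of serializing them.
MEM_LATENCY = 100

def total_cycles(n_loads: int, window: int) -> int:
    in_flight = min(window, n_loads)
    batches = -(-n_loads // in_flight)  # ceil division
    return batches * MEM_LATENCY + n_loads

narrow = total_cycles(1000, 1)    # ~serialized: every load waits out the full latency
wide = total_cycles(1000, 500)    # deep window: latencies almost fully overlap
print(narrow, wide)               # 101000 vs 1200 cycles
```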
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
but about the chip's insanely wide OOO buffer (out of order). Thus if the chip knows some-500 instructions in advance it will need specific data from memory, memory perhaps being swapped to disk, it can request it far in advance of actually needing it. If memory serves, Apple's M1 OOO reorder queue is about 3x larger than Intel's Ice Lake chip. - Keep in mind that this is not an ISA difference. It's not about ARM vs x86; This is specific to each CPU micro-architecture.
The M1's Firestorm cores have 192KB of L1 instruction cache, while the Icestorm cores have 128KB of L1 instruction cache. These are the ones feeding the CPU pipeline. When an instruction is read from memory, the memory controller will typically fill the cache with subsequent instructions from memory. How deep the controller will pre-fetch more instructions is not publicly known though, I don't think.

From my understanding the OoO re-order buffers are used for in-flight instructions, to determine which of those in flight can be completed first, as some instructions take more clock cycles to complete.

Anyway, I don't think macOS has that kind of intelligence built in to manage swap based on the instruction stream, though. Most OSes do that when there's a page fault during memory access.
 

casperes1996

macrumors 604
Jan 26, 2014
7,599
5,770
Horsens, Denmark
The M1's Firestorm cores have 192KB of L1 instruction cache, while the Icestorm cores have 128KB of L1 instruction cache. These are the ones feeding the CPU pipeline. When an instruction is read from memory, the memory controller will typically fill the cache with subsequent instructions from memory. How deep the controller will pre-fetch more instructions is not publicly known though, I don't think.

From my understanding the OoO re-order buffers are used for in-flight instructions, to determine which of those in flight can be completed first, as some instructions take more clock cycles to complete.

Anyway, I don't think macOS has that kind of intelligence built in to manage swap based on the instruction stream, though. Most OSes do that when there's a page fault during memory access.
Anandtech, in their review:

One aspect of recent Apple designs which we were never really able to answer concretely is how deep their out-of-order execution capabilities are. The last official resource we had on the matter was a 192 figure for the ROB (Re-order Buffer) inside of the 2013 Cyclone design. Thanks again to Veedrac’s implementation of a test that appears to expose this part of the µarch, we can seemingly confirm that Firestorm’s ROB is in the 630 instruction range deep, which had been an upgrade from last year’s A13 Lightning core which is measured in at 560 instructions. It’s not clear as to whether this is actually a traditional ROB as in other architectures, but the test at least exposes microarchitectural limitations which are tied to the ROB and behaves and exposes correct figures on other designs in the industry. An out-of-order window is the amount of instructions that a core can have “parked”, waiting for execution in, well, out of order sequence, whilst the core is trying to fetch and execute the dependencies of each instruction.





Addendum:
The concept of out-of-order reordering is orthogonal to that of page faults. The OS will treat it exactly the same as normal: a page fault occurs, an interrupt is triggered, the CPU looks up the OS's page fault handler through its interrupt descriptor table, jumps to it, and the handler deals with it. All of that is exactly the same as usual. The difference is purely inside the CPU itself.

If I tell you to
make dinner, eat dinner, turn on the dishwasher from yesterday, then empty it and put in your newly dirtied dishes.

You might look at that and go "Hey, I'll just start the dishwasher at the start, so I don't have to wait as long".
Regardless of whether something is in memory or will trigger a page fault, the CPU can do the same with memory operations - they involve a wait. And modern operating systems can do disk I/O asynchronously too. So the page fault will trigger sooner and the request be sent to disk sooner, while the CPU deals with all the other tasks it knows how to handle. This is done mostly transparently to software.

This is being done as a matter of fact. - What may or may not be happening, which I suspect is happening, is that the CPU is also using the fact its MMU is loaded with page tables to intelligently know if a memory instruction will or won't trigger a PFE, and weigh that into out of order reordering.
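The demand-paging path described above can even be observed from user space; here's a minimal Python sketch (standard library only) where the kernel services the faults transparently:

```python
import mmap
import os
import tempfile

# Write a file spanning several pages, then map it without reading it in.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"A" * 65536)
    path = f.name

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Nothing has been copied yet; the first touch of each page raises a
    # page fault that the kernel services by reading from disk, then the
    # process resumes as if nothing happened.
    first, last = mm[0:1], mm[65535:65536]
    mm.close()
os.unlink(path)
print(first, last)
```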

PS. I'm developing an OS for my bachelor project :)
 

casperes1996

macrumors 604
Jan 26, 2014
7,599
5,770
Horsens, Denmark
From my understanding the OoO re-order buffers are used for in-flight instructions to determine which of those in-flight can be completed first, as some instructions takes longer clock cycles to complete.

Didn't see this part; Didn't mean to come off lecturing - you know what you're talking about clearly :) - But yeah instructions already loaded up can also affect the loading of memory in the manner I described
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
So that leaves me puzzled as to small 8GB RAM for 2021 laptops.

Short answer: because this is absolutely sufficient and is the industry standard for consumer-level premium laptops in 2021. Dell, Lenovo etc. all sell laptops more expensive than the M1 machines that still have 8GB RAM.

I think you are overestimating the memory needs in 2021. App memory usage is actually going down slightly as devs are adopting Swift and new image compression technologies become prevalent.

I have also noticed that only 8GB M1 MacBook Airs are appearing in the Apple Refurbished store - are people dissatisfied? Is the use of Rosetta pushing an already constrained RAM over its limits?

Or how about a much simpler and a more likely explanation: Apple sells many more 8GB machines than 16GB ones, so it only makes sense that the former make up the bulk of the returns.

Have the SSD wear problems been due to an over-reliance on swap-to-disk because 8GB is just too small for a 2020 laptop?

Nah, that's probably just a bug in macOS and will get fixed soon.
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
But yeah instructions already loaded up can also affect the loading of memory in the manner I described
I did a course on OS design many many years ago, so my understanding will be extremely outdated by now :p Dabbled in a little of Linux driver development, mainly to hack my wireless routers running OpenWRT.

Anyway, I'm not sure if the M1 SoCs expose the L1 cache or re-order buffer metrics of their CPUs, though. Both µarch implementations would likely be internal to the SoC itself. I would think that even if you could read the values out, the interrupt handling would probably kill the performance of the OS when it tries to determine whether to start swapping between RAM and SSD.
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
I think you are overestimating the memory needs in 2021. App memory usage is actually going down slightly as devs are adopting Swift and new image compression technologies become prevalent.
This probably stems from folks trying to use the current crop of M1 Macs for more demanding tasks such as video editing, since they are so performant.

IMHO, the current crop of M1 Macs are meant to be used as general-purpose computing devices, for things like media consumption and typical office and home use. Using Safari/Firefox/Chrome, Pages/Word, Numbers/Excel and Keynote/PowerPoint does not require many resources. 8GB should be able to last the lifetime of the current crop of M1 Macs.

It only becomes a problem when newer versions of the software need newer hardware, and that should not be due to insufficient RAM.
 

ca$hman

macrumors member
Jan 4, 2021
49
21
Thanks @casperes1996 and excellent reply.

You've explained why the apps aren't significantly different.

So that leaves me puzzled as to small 8GB RAM for 2021 laptops.
You have the option to buy a 16GB model.

For a lot of people - probably most who buy an MBP or MBA - 8GB is more than sufficient for their daily tasks. Only when you do heavy video or audio editing, or maybe for some coding purposes, do you need that higher-end model - but yes, then you probably are a heavy user. I guess that is the case with many products: a version for the masses and upgrades for the special needs.

And this 8GB version for the masses is really okay. I can confirm this as an 8GB user, even while I do lots of photo editing and sometimes video editing.

Bottom line, if you consider yourself a heavy user, why go for the base 8GB model?
 

leman

macrumors Core
Oct 14, 2008
19,521
19,678
One of my best guesses about the M1's perceived need for less memory to not slow down as much compared to its x86 brethren is not about disk speed (entirely; It of course is a factor), but about the chip's insanely wide OOO buffer (out of order). Thus if the chip knows some-500 instructions in advance it will need specific data from memory, memory perhaps being swapped to disk, it can request it far in advance of actually needing it. If memory serves, Apple's M1 OOO reorder queue is about 3x larger than Intel's Ice Lake chip. - Keep in mind that this is not an ISA difference. It's not about ARM vs x86; This is specific to each CPU micro-architecture.

Ugh, I think you might be reaching a bit far with this one :) Large execution buffers paired with the wide backend are the main reason why Apple CPUs consume so little power while delivering such high performance, and exceptional levels of memory parallelism is what truly gets the performance home, but I don't see much relation here to the RAM size.

In the end, I think that Apple Silicon does so well with less RAM because of a) ultra-low-latency swapping, b) 16KB pages as opposed to 4KB pages (which again allows faster swapping), c) intelligent RAM prefetch and d) additional tricks such as likely hardware memory compression (maybe even baked into the controller) and some other stuff. Apple spent 10 years optimizing both the hardware and the software for RAM economy, so it's not surprising that they do well here.
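On the page-size point, a quick back-of-the-envelope calculation (the working-set size is invented) of why bigger pages mean fewer faults for the same amount of swapped data:

```python
# Paging out the same working set with 16 KB pages takes a quarter as
# many page faults and I/O requests as with 4 KB pages. (Real kernels
# batch pages together, so this is only a first-order illustration.)
working_set = 2 * 1024**3            # hypothetical 2 GiB to be paged out

faults_4k = working_set // (4 * 1024)
faults_16k = working_set // (16 * 1024)
print(faults_4k, faults_16k)   # 524288 vs 131072
```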
 

casperes1996

macrumors 604
Jan 26, 2014
7,599
5,770
Horsens, Denmark
Anyway, I'm not sure if the M1 SoCs exposes the L1 cache or the re-order buffers metrics of it's CPU tho. Both u-arch implementation would likely be within the SoC itself. I would think that even if you could read the values out, the interrupts handling would probably kill the performance of the OS when it tries to determine whether to start swapping to/from SSDs from/to RAM.
To my knowledge it does not. Aside from just poking around to investigate it, I'm not sure why you would want to read that, though? What I was thinking of would happen entirely within the chip itself and not require inspection of it. But all of that is somewhat speculative regardless. The bottom line is that the experience reported by most people is that the M1 stays responsive even when low on memory, to a greater extent than other Macs. That's not equivalent to things taking up more or less space; that can vary for a number of reasons but is likely not all that different - it just seems to suffer less under high memory pressure by most people's account.
In the end, I think that Apple Silicon does so well with less RAM because of a) ultra-low-latency swapping, b) 16KB pages as opposed to 4KB pages (which again allows faster swapping), c) intelligent RAM prefetch and d) additional tricks such as likely hardware memory compression (maybe even baked into the controller) and some other stuff. Apple spent 10 years optimizing both the hardware and the software for RAM economy, so it's not surprising that they do well here.
And a very good point. Arguably it's a disadvantage of x86 that you're stuck with either 4K or the "large page size" of 2MB, which is just impractically big.
 

casperes1996

macrumors 604
Jan 26, 2014
7,599
5,770
Horsens, Denmark
I did a course on OS design many many years ago, so my understanding will be extremely outdated by now :p Dabbled in a little of Linux driver development, mainly to hack my wireless routers running OpenWRT.
Also; Things change, sure - but a lot stays the same too. The fundamentals aren't all that different, and I'm sure your knowledge is still very relevant :)
 

Spindel

macrumors 6502a
Oct 5, 2020
521
655
I did a course on OS design many many years ago, so my understanding will be extremely outdated by now :p Dabbled in a little of Linux driver development, mainly to hack my wireless routers running OpenWRT.

Anyway, I'm not sure if the M1 SoCs exposes the L1 cache or the re-order buffers metrics of it's CPU tho. Both u-arch implementation would likely be within the SoC itself. I would think that even if you could read the values out, the interrupts handling would probably kill the performance of the OS when it tries to determine whether to start swapping to/from SSDs from/to RAM.
Sorry for OT post.

But I have to thank you for your work on the DIR-868L builds!!!! (Mine is still running a now-old version as a wireless AP.)
 

quarkysg

macrumors 65816
Oct 12, 2019
1,247
841
But I have to thank you for your work on DIR-868L builds!!!! (mine is still running on an now old version as an wireless AP)
Glad that I managed to make older routers useful for a few more years. I'm still using a couple of DIR-868Ls myself :)

I have to apologize too for being OT :p
 