
theorist9

macrumors 68040
May 28, 2015
3,880
3,059
This is a popular argument but far from being obvious in practice. On many dGPU systems the application/driver/system will mirror a decent portion of GPU memory in system memory, for various reasons. So you can't conclude that the dGPU's separate memory pool gives you more effective memory. The specific memory usage needs to be measured in every case.

Regarding multiple external displays: you don't need a lot of memory to support a display on its own. Even a 5K monitor with full colors only needs around 60MB per frame. Of course, more displays usually means more visible and active apps, which is where the main memory cost comes from. Again, this is something that has to be measured for every use case.
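(A quick sanity check of that ~60MB figure: a minimal calculation, assuming a plain 4-byte-per-pixel framebuffer; a wide-color format would roughly double it.)

#include <stdio.h>

int main(void)
{
    /* 5K panel, 8-bit RGBA: 4 bytes per pixel */
    const double w = 5120, h = 2880, bytes_per_pixel = 4;
    printf("5K frame: %.1f MB\n", w * h * bytes_per_pixel / (1024 * 1024));
    /* prints about 56 MB, in line with the ~60MB quoted above */
    return 0;
}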
I'm curious—for systems with separate RAM and VRAM, are there particular classes of items that take up a significant amount of VRAM, yet that would not be mirrored in RAM?

For instance are the textures and shaders for a video game copied from RAM to VRAM, after which they can be deleted from RAM? If so, for those particular items, the dGPU's separate memory pool would act as additional memory.
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
I'm curious—for systems with separate RAM and VRAM, are there particular classes of items that take up a significant amount of VRAM, yet that would not be mirrored in RAM?

For instance are the textures and shaders for a video game copied from RAM to VRAM, after which they can be deleted from RAM? If so, for those particular items, the dGPU's separate memory pool would act as additional memory.

I don't know. One would need to ask driver engineers how these things are handled. A big question for me is how drivers work in the context of multitasking. If you have different processes uploading textures, will you run out of memory? Or will the driver offload some texture data to system RAM to make space? Shaders/pipeline states are even trickier. Do drivers permanently store all shader pipelines an application creates in GPU RAM (sounds like a huge waste)? Or do they upload the required pipelines when needed?

I can imagine that on a modern API (like VK or DX12), where you handle memory pools explicitly, it's your responsibility to manage the available space and juggle data if needed. But even there it's not clear to me how this works with multiple applications. If an app reserves some memory, does it mean other apps will get less memory to play with, or will the driver move things around in the background (if it's the latter, then the entire story with manual memory management is just a bad joke)? Note that even with all the focus on manual management, memory allocations performed by shader program creation are completely hidden from the user.
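For what it's worth, Vulkan has at least a partial answer to the "how much can I actually use right now" question in the VK_EXT_memory_budget extension: the driver reports a per-heap budget that already accounts for what the OS and other processes are doing. A minimal sketch, assuming a valid VkPhysicalDevice and that the extension is supported:

#include <stdio.h>
#include <vulkan/vulkan.h>

void print_memory_budgets(VkPhysicalDevice gpu)
{
    /* Chain the budget struct into the standard memory-properties query. */
    VkPhysicalDeviceMemoryBudgetPropertiesEXT budget = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_BUDGET_PROPERTIES_EXT
    };
    VkPhysicalDeviceMemoryProperties2 props = {
        .sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_MEMORY_PROPERTIES_2,
        .pNext = &budget
    };
    vkGetPhysicalDeviceMemoryProperties2(gpu, &props);

    for (uint32_t i = 0; i < props.memoryProperties.memoryHeapCount; ++i) {
        /* heapBudget: what this process may use right now; heapUsage: what it already uses. */
        printf("heap %u: budget %llu MB, usage %llu MB\n", i,
               (unsigned long long)(budget.heapBudget[i] >> 20),
               (unsigned long long)(budget.heapUsage[i] >> 20));
    }
}

The budget moves as other applications allocate and release memory, which is the driver's way of admitting that "manual" management still happens inside a moving envelope.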
 
Last edited:

pipo2

macrumors newbie
Jan 24, 2023
24
9
In addition, CPU technology has evolved a lot in the last 30 years. Original CISC and RISC CPU designs have as much relevance to today's technology as ancient chariots to modern cars. Some basic design philosophy has survived of course: contemporary "RISC" is designed with efficient execution in hardware in mind, while contemporary "CISC" is mostly a mixed bag that cares about compatibility with old code (in fact, the only real "CISC" feature of Intel's current CPUs is that operands can come from either registers or memory, so if one wants to argue that x86 is CISC, the entire RISC/CISC distinction is trivially reduced to load-store vs. reg/mem design).
Not so sure, but are you on a crusade wrt the usage of CISC and RISC? I recall you have been doing this before.

As a programmer I (and others) still use the CISC, RISC and MISC terminology. Obviously it's not saying anything about the underlying hardware (anymore). It's software, the ISA.
We're talking about what is presented to us wrt number of registers and instruction set. Of course we have to deal with the "load-store vs. reg/mem design". Old hat. At least/last we can do something with the hardware presented to us. And I'm afraid it is not always so trivial. Regardless we soldier on...

BTW like it or not, ARM is still using RISC to describe itself:
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
Not so sure, but are you on a crusade wrt the usage of CISC and RISC? I recall you have been doing this before.

I am. I just don't think these notions are helpful. They just promote confusion and half-truths. In fact, this is why I oppose mindless use of labeling altogether.

As a programmer I (and others) still use the CISC, RISC and MISC terminology. Obviously it's not saying anything about the underlying hardware (anymore). It's software, the ISA.
We're talking about what is presented to us wrt number of registers and instruction set. Of course we have to deal with the "load-store vs. reg/mem design". Old hat. At least/last we can do something with the hardware presented to us. And I'm afraid it is not always so trivial. Regardless we soldier on...

If the load-store vs reg/mem is the crucial distinction point, why not talk about this instead of using non-transparent labels like RISC and CISC? There is nothing "reduced" about the ARM instruction set, nor is the x86-64 particularly "complex" in comparison (messy and convoluted, maybe). Modern ARM certainly has more addressing modes and instructions than x86. And there is a huge difference in complexity between ARM and core RISC-V, for example.

I mean, if we look at ARM64 vs. x86-64 ISA specifically. Both have registers, stack, condition flags, combined FP/packed SIMD state, and both offer pretty much the same set of arithmetic and logical operations. The notable differences are load-store vs. reg/mem design, fixed-width vs. variable-width instructions, more addressing modes in ARM (pre/post-increment, index register sign extension control — don't remember if x86 has that too), more instructions that do multiple things in ARM (two register load/store, combined ALU+shift), SIMD ISA design. I don't think that these differences can be meaningfully conveyed with RISC/CISC terminology...
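To make the load-store vs. reg/mem and combined ALU+shift points concrete, here is a small C function with indicative codegen in the comments (roughly what a typical -O2 compile might produce; exact registers and scheduling will differ):

long add_scaled(const long *table, long index, long bias)
{
    return bias + table[index];
    /* x86-64 can fold the load and the scaled index into the ALU op:
     *     add  rdx, qword ptr [rdi + rsi*8]   ; reg/mem operand
     *     mov  rax, rdx
     * AArch64 keeps loads and arithmetic separate, but folds the shift:
     *     ldr  x8, [x0, x1, lsl #3]           ; load with scaled register offset
     *     add  x0, x2, x8                     ; register-register add
     */
}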
 

casperes1996

macrumors 604
Jan 26, 2014
7,597
5,769
Horsens, Denmark
Not so sure, but are you on a crusade wrt the usage of CISC and RISC? I recall you have been doing this before.

As a programmer I (and others) still use the CISC, RISC and MISC terminology. Obviously it's not saying anything about the underlying hardware (anymore). It's software, the ISA.
We're talking about what is presented to us wrt number of registers and instruction set. Of course we have to deal with the "load-store vs. reg/mem design". Old hat. At least/last we can do something with the hardware presented to us. And I'm afraid it is not always so trivial. Regardless we soldier on...

BTW like it or not, ARM is still using RISC to describe itself:
I'd say Leman is right here. The number of general purpose registers available is not standardised among chips claiming to be either RISC or CISC. AArch64 IIRC has 32 GPRs. x64 has 16. 68K has 8 and 8 address registers. And what about vector registers like AVX, NEON, SVE, etc.? There's no point grouping the ISAs together in CISC or RISC camps when you need to consider each chip individually anyway. Each different ISA also has vastly different instructions available and much to Leman's point the only thing that really seems to determine whether something categorises as CISC or RISC is whether you perform load/store or can do register-memory operations directly. With things like SVE2 it's hardly like AArch64 is that reduced after all; Still quite advanced and numerous instructions in there.
 
  • Like
Reactions: leman

leman

macrumors Core
Oct 14, 2008
19,516
19,664
With things like SVE2 it's hardly like AArch64 is that reduced after all; Still quite advanced and numerous instructions in there.

ARM nowadays even has dedicated memcpy/memset instructions. That had some people take up pitchforks "because it's not RISC", which couldn't be further from the truth IMO. Copying is a fairly tricky operation if you want it done right and requires an expert-level understanding of hardware. Well-tuned memcpy/memset implementations are large, complex assembly subroutines, which might not reach the best possible performance on new hardware. If one thinks about this rationally, it is quite insane that such a crucially important task is commonly done in software!
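To put that in code terms: nothing changes on the C side, the difference is purely in how the call gets lowered (the FEAT_MOPS mnemonics are from the ARMv8.8-A extension; how a given compiler or libc actually uses them is an assumption here):

#include <string.h>

void copy_block(void *dst, const void *src, size_t n)
{
    /* On pre-MOPS hardware this ends up in a large, hand-tuned library
     * routine full of size checks, alignment handling and SIMD loops.
     * With ARMv8.8-A FEAT_MOPS the same call can be lowered to a short
     * prologue/main/epilogue sequence (CPYP, CPYM, CPYE), and the CPU
     * itself picks the copy strategy that suits the implementation. */
    memcpy(dst, src, n);
}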
 

MrGunny94

macrumors 65816
Dec 3, 2016
1,148
675
Malaga, Spain
I do agree that they should go up to 12/16GB on base models depending on the Chip with the M3.

I'm already happy they did increase the base model RAM to 18GB on the Pro models (by design on the chip and memory)

Even with 16GB of shared memory across CPU/GPU, when I'm hooked up to dual 4K monitors and have everything open, I'm at 60-70% memory usage.

It's quite crazy if you think about it: the whole point of the 'Pro' models is for people to use them as professional tools, so I am definitely disappointed that, moving to Apple Silicon, they didn't give at least the M 'Pro' line a 32GB base.
 

pipo2

macrumors newbie
Jan 24, 2023
24
9
I'd say Leman is right here. The number of general purpose registers available is not standardised among chips claiming to be either RISC or CISC. AArch64 IIRC has 32 GPRs. x64 has 16. 68K has 8 and 8 address registers. And what about vector registers like AVX, NEON, SVE, etc.? There's no point grouping the ISAs together in CISC or RISC camps when you need to consider each chip individually anyway. Each different ISA also has vastly different instructions available and much to Leman's point the only thing that really seems to determine whether something categorises as CISC or RISC is whether you perform load/store or can do register-memory operations directly. With things like SVE2 it's hardly like AArch64 is that reduced after all; Still quite advanced and numerous instructions in there.
"The number of general purpose registers available is not standardised among chips claiming to be either RISC or CISC." Sure.
IMHO getting upset about CISC and RISC is not worth it. How often are those terms used? In my world hardly. And a long time ago, when moving from 68k to PowerPC, there was some bewilderment wrt a reduced instruction set, just counting the PPC instructions ;-)
So I consider CISC and RISC as empty words (not abbreviations), sometimes useful to convey something in context. I would certainly not condemn people for using them. We differ, no problem.
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
"The number of general purpose registers available is not standardised among chips claiming to be either RISC or CISC." Sure.
IMHO getting upset about CISC and RISC is not worth it. How often are those terms used? In my world hardly. And a long time ago, when moving from 68k to PowerPC, there was some bewilderment wrt a reduced instruction set, just counting the PPC instructions ;-)
So I consider CISC and RISC as empty words (not abbreviations), sometimes useful to convey something in context. I would certainly not condemn people for using them. We differ, no problem.

I don't think our opinions differ that much. And I fully agree with you that these terms can sometimes be useful in a technical discussion, as long as all the interlocutors understand the nuances. But I also think that these notions are potentially dangerous in a casual non-technical discussion for wider audiences (taking @MacInMotion's post for example), because they obfuscate the reality. Instead of discussing what is actually going on (which might be interesting and educational for a hobbyist curious about these things), labels perpetuate unhealthy myths and overzealous generalizations (like "RISC is low-power, CISC is high-power" or "integrated is slow, dedicated is fast"). Labels are easy, and they tend to get repeated a lot, which makes them seem "right". And the end effect is that people stop at the labels and don't bother learning the actual interesting effects hiding behind them.
 
Last edited:

casperes1996

macrumors 604
Jan 26, 2014
7,597
5,769
Horsens, Denmark
"The number of general purpose registers available is not standardised among chips claiming to be either RISC or CISC." Sure.
IMHO getting upset about CISC and RISC is not worth it. How often are those terms used? In my world hardly. And a long time ago, when moving from 68k to PowerPC, there was some bewilderment wrt a reduced instruction set, just counting the PPC instructions ;-)
So I consider CISC and RISC as empty words (not abbreviations), sometimes useful to convey something in context. I would certainly not condemn people for using them. We differ, no problem.
If they are empty words, their usage is pointless and better replaced with words that get to the real point. I don't condemn anyone for using the terms, but I would prefer their meaning to be, well, meaningful, when used.
 

pipo2

macrumors newbie
Jan 24, 2023
24
9
If they are empty words, their usage is pointless and better replaced with words that get to the real point. I don't condemn anyone for using the terms, but I would prefer their meaning to be, well, meaningful, when used.
Me too! :)
But this is English: depending on context, a word can have different meanings. A well-known example is "noise". Without context, it's rather empty; no idea what is meant.
 

theluggage

macrumors G3
Jul 29, 2011
8,009
8,443
There is nothing "reduced" about the ARM instruction set, nor is the x86-64 particularly "complex" in comparison (messy and convoluted, maybe). Modern ARM certainly has more addressing modes and instructions than x86. And there is a huge difference in complexity between ARM and core RISC-V, for example.
RISC doesn't simply mean "fewer instructions", it also means individual instructions which execute faster - ideally in a single clock cycle - and allow for more efficient pipelining and finer-grained code with more opportunities for optimisation by the compiler.

See https://en.wikipedia.org/wiki/Reduced_instruction_set_computer#Instruction_set_philosophy (and follow some of the citations), including this one which directly addresses the "fewer instructions" misconception. (The Wiki article also gives a better acronym: RISC=Relegate Interesting Stuff to the Compiler).

ARM - even the old 24 bit version - can be spun as having a ridiculous number of instructions if you consider that every instruction can be made conditional and have a variety of shift/rotate options applied - but all of that is hardwired to happen without additional clock cycles - and handling simple conditionals without needing a jump avoids trashing the pipeline every time.
 
  • Like
Reactions: Chuckeee

leman

macrumors Core
Oct 14, 2008
19,516
19,664
RISC doesn't simply mean "fewer instructions", it also means individual instructions which execute faster - ideally in a single clock cycle - and allow for more efficient pipelining and finer-grained code with more opportunities for optimisation by the compiler.

This hasn’t been the case for many years. ARM instructions on current CPUs do not execute any faster than x86 instructions (if anything, it’s a property of the implementation and not the ISA). And ARM has its fair share of instructions that execute in multiple steps. For example, shift+add is one instruction in ARM, but pretty much every modern implementation (including Apple) executes it in two steps. And ARM designs until very recently even used micro-ops, just like x86 CPUs. So I really don’t see how this applies to the current situation.

I suspect the story is that x86 did offer a few “high-level” instructions at some point, to simplify programming in assembler. These instructions were never much used and were made obsolete a long time ago.

See https://en.wikipedia.org/wiki/Reduced_instruction_set_computer#Instruction_set_philosophy (and follow some of the citations), including this one which directly addresses the "fewer instructions" misconception. (The Wiki article also gives a better acronym: RISC=Relegate Interesting Stuff to the Compiler).

I don’t really see how this is the case with modern implementations. As I mentioned above, modern ARM even has dedicated memory copy instructions (exactly the case the article is arguing against), simply because a CPU can do a better job copying data than software.

ARM - even the old 24 bit version - can be spun as having ridiculous number of instructions if you consider that every instruction can be made conditional and have a variety of shift/rotate options applied - but all of that is hardwired to happen without additional clock cycles - and handling simple conditionals without needing a jump avoids trashing the pipeline every time.

Modern ARM (v8 and later) has dropped predicated instructions, because they made it more difficult to develop high-performance CPUs. Only a small selection of conditional moves remain, which have also been a standard feature on x86 for many years.
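A tiny illustration of the conditional-select point; the assembly comments are indicative codegen, not exact compiler output:

long clamp_to_zero(long x)
{
    return x < 0 ? 0 : x;
    /* AArch64 (no branch, no general predication needed):
     *     cmp  x0, #0
     *     csel x0, xzr, x0, lt
     * x86-64 has had the equivalent conditional move for decades:
     *     xor  eax, eax
     *     test rdi, rdi
     *     cmovns rax, rdi
     */
}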

What is undoubtedly true is that ARMv8 is a more modern, streamlined, symmetric design than x86, simply because it’s much more recent. ARM64 is a clean-slate design - which industry rumors suggest was developed in tight cooperation with Apple, the main goal being enabling very high performance CPU cores. On the other hand, x86 still carries around legacy baggage from ages past. But Intel has announced a new instruction encoding (APX) which they hope will give them parity with ARM.
 

mr_roboto

macrumors 6502a
Sep 30, 2020
856
1,866
If the load-store vs reg/mem is the crucial distinction point, why not talk about this instead of using non-transparent labels like RISC and CISC? There is nothing "reduced" about the ARM instruction set, nor is the x86-64 particularly "complex" in comparison (messy and convoluted, maybe). Modern ARM certainly has more addressing modes and instructions than x86. And there is a huge difference in complexity between ARM and core RISC-V, for example.
The problem is that the original acronym was misleading, and as a result, what the general public thinks it means is very different from how it's used by the small community of people who get to design ISAs.

RISC wasn't ever merely about reducing instruction count. Perhaps the most important innovation of the RISC movement was bringing much more analytical rigor to ISA design. For example, at one time lots of people believed that future ISAs should close the "semantic gap" between CPU instructions and high level languages by making the former much more like the latter. This wasn't based on much beyond feeling it was the right thing to do. RISC rejected this in favor of a more scientific, data-driven approach.

As for the "reduced" theme, that wasn't primarily about counting up instructions, even though that has been a very popular interpretation suggested by the acronym. The important thing to reduce is the number of implementation pain points. This makes it easier and cheaper to design high performance implementations, and reduces their gate count and power draw. By itself, instruction count isn't a great predictor of implementation complexity.

In the CPU design community, RISC is also used as a shorthand for ISAs which are recognizably in the family tree of the original 1980s RISC ISAs: load/store, usually 32 general purpose registers, usually 32-bit fixed size instruction word, limited yet sufficient addressing modes, and several other things.

32-bit Arm was an outlier among those early RISCs; you could make an argument that it shouldn't have been lumped in with the rest. However, modern arm64 is a very orthodox RISC ISA. Ignore the high instruction count, that's not important. Most arm64 instructions are just variations on a theme, and none look very complex to implement.

Here's an example. I've frequently seen people cite arm64's "Javascript instruction" as evidence that it's not really a RISC. When you look this instruction up, it's a variant of floating point to integer conversion with a specialized rounding mode. For reasons I won't go into here, this variant is extremely important to Javascript performance.

The extra gate count required by this instruction is almost nothing: it's a low-impact extra mode for execution resources which have to exist anyway for other FP-to-int conversion instructions. The payoff is high, thanks to how important JS is in today's world. So arm64's ISA architects decided it was worth it to burn a single opcode (one of the most precious ISA resources, from their perspective) on it.
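For the curious, a rough C model of what that instruction (FJCVTZS) buys a JavaScript engine: ECMAScript's ToInt32 wrap-around semantics, which otherwise take a small multi-instruction dance on every conversion. This is a simplified sketch, not the architectural definition:

#include <math.h>
#include <stdint.h>

/* ECMAScript ToInt32: truncate toward zero, wrap modulo 2^32, reinterpret
 * as signed. Plain C casts don't give you this (out-of-range conversion is
 * undefined), which is why a dedicated instruction helps JS engines. */
int32_t js_to_int32(double x)
{
    if (!isfinite(x))
        return 0;                         /* NaN and +/-infinity map to 0 */
    double t = trunc(x);                  /* round toward zero */
    double m = fmod(t, 4294967296.0);     /* wrap modulo 2^32 (keeps sign) */
    if (m < 0)
        m += 4294967296.0;                /* bring into [0, 2^32) */
    return (int32_t)(uint32_t)m;          /* two's-complement reinterpret */
}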

arm64 is full of things like that. They clearly did a ton of homework trying to figure out places where they could offer high-leverage, low-cost variants of common operations. It doesn't mean that the resulting ISA isn't RISC, as it's still an extremely regular and simple ISA design.

I'd say Leman is right here. The number of general purpose registers available is not standardised among chips claiming to be either RISC or CISC. AArch64 IIRC has 32 GPRs. x64 has 16. 68K has 8 and 8 address registers. And what about vector registers like AVX, NEON, SVE, etc.? There's no point grouping the ISAs together in CISC or RISC camps when you need to consider each chip individually anyway. Each different ISA also has vastly different instructions available and much to Leman's point the only thing that really seems to determine whether something categorises as CISC or RISC is whether you perform load/store or can do register-memory operations directly. With things like SVE2 it's hardly like AArch64 is that reduced after all; Still quite advanced and numerous instructions in there.
With what I've written above, do you see that the important thing is not how many instructions or even whether they are "advanced"? Keep in mind that some things which seem 'advanced' from the software perspective are dead easy when designing gates, and some things which seem trivial are a giant pain in the butt.

This hasn’t been the case for many years. ARM instructions on current CPUs do not execute any faster than x86 instructions (if anything, it’s a property of the implementation and not the ISA).
There's two factors at work here.

One is that while x86 can and should be classified as a CISC ISA, reality is messier than a pure binary one-or-the-other kind of thing. x86 was one of the RISC-iest of the CISC ISAs. You noted that x86 doesn't have tons of addressing modes, and addressing modes are one of the key metrics which can make an ISA more or less "CISCy". Just like everything else, one mustn't get hung up on the number of modes; it's really about implementation complexity. Do any of the modes make life really difficult for hardware designers? Mostly by accident, x86 avoided some of the common addressing mode pitfalls other pre-RISC ISAs fell into, and that was very important to x86 managing to survive the 1980s.

And ARM has its fair share of instructions that execute in multiple steps. For example, shift+add is one instruction in ARM, but pretty much every modern implementation (including Apple) executes it in two steps. And ARM designs until very recently even used micro-ops, just like x86 CPUs. So I really don’t see how this applies to the current situation.
I don't think there's any significant use of microcode in mainstream high performance 64-bit Arm cores. Maybe in those which still have support for AArch32, but cores like Apple's (where AArch32 is a distant and unsupported memory), not so much.

More importantly, you have to look at all the outcomes, not just clock speed. For example, consider Zen 4 vs M1, as that's as close to the same process node as we can compare. The Zen 4 core is much larger than Apple's Firestorm core. Zen 4 scales to higher frequencies, but Apple's core delivers profoundly better perf/Hz and perf/W. If ISA doesn't matter at all, one would expect such differences to be far less pronounced.

I suspect the story is that x86 did offer a few “high-level” instructions at some point, to simplify programming in assembler. These instructions were never much used and have been made obsolete long time ago n
No, x86 was never particularly high-level.

The 8086 was the successor of Intel's 8080 and 8085. The biggest new feature was support for a 20-bit (1MB) address space, up from 16-bit (64KB). 8086 wasn't binary compatible with the 8085, but was intentionally mostly assembly language source compatible, as that was an important selling feature in many of the markets 8080 and 8085 had sold into.

Because the 8080 and 8085 were designed in the early 1970s, there just wasn't the transistor budget to do anything fancy. 8086 wasn't much more than that because when that project kicked off, Intel already had a team working on their extremely ambitious all-new 32-bit architecture of the future, iAPX 432. 432 was a "close the semantic gap" design: it had HLL features (capabilities, objects, garbage collection) baked into the ISA and microcode. 8086 was just a side project to keep existing 8085 customers loyal to Intel while the 432 team finished their work.

But the 432 was a dismal failure. Extremely late, incredibly slow, and ironically, its advanced ISA features made it extremely difficult to port existing operating systems and applications. It was a complete disaster, far worse than Itanium.

Concurrent with the 432 beginning to fail, x86 received the windfall of IBM selecting it for the IBM PC, and the PC's success meant x86 got allocated resources for some upgrades. After some false steps in the 286, Intel came up with some decent ideas for cleaning up the ugliest aspects of the 8086 ISA in the 386, and perhaps even more importantly, didn't succumb to the temptation to add too much.

If any of this had gone a little bit differently, we wouldn't have x86 as we know it today. For example, if 8086 had been regarded as the important project, it might have gotten the resources to be more ambitious, and that might have resulted in the inclusion of base ISA features too difficult to paper over in the long term. Designing microprocessor ISA features for ease of pipelined, superscalar, and out-of-order implementation was not something on anyone's mind in the 1970s; it really was a weird historical accident that x86 managed to avoid problems common to its contemporaries.
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
The problem is that the original acronym was misleading, and as a result, what the general public thinks it means is very different from how it's used by the small community of people who get to design ISAs. […]



Thanks for the great summary, very insightful!

RISC wasn't ever merely about reducing instruction count. Perhaps the most important innovation of the RISC movement was bringing much more analytical rigor to ISA design. For example, at one time lots of people believed that future ISAs should close the "semantic gap" between CPU instructions and high level languages by making the former much more like the latter. This wasn't based on much beyond feeling it was the right thing to do. RISC rejected this in favor of a more scientific, data-driven approach.

Yes, this is a great point. And I think it illustrates well why the commonly used RISC/CISC discourse is misleading. All the “high level” ISAs have pretty much died out. The examples you discuss also apply to x86, with the caveat that Intel is bogged down by decades of legacy.


Here's an example. I've frequently seen people cite arm64's "Javascript instruction" as evidence that it's not really a RISC.
What’s your opinion on memset/memcpy instructions?


I don't think there's any significant use of microcode in mainstream high performance 64-bit Arm cores. Maybe in those which still have support for AArch32, but cores like Apple's (where AArch32 is a distant and unsupported memory), not so much.

Microcode - probably not, but micro-ops, yes. I also wonder whether microcode is used much in modern x86 designs; it’s mostly about implementing legacy stuff, right?

The Zen 4 core is much larger than Apple's Firestorm core.

Unless I am misremembering, they should be comparable (especially if you take the cache size into account)? Around 3.5-4 mm²?


No, x86 was never particularly high-level.

I meant instructions like ENTER/LEAVE etc.
 

casperes1996

macrumors 604
Jan 26, 2014
7,597
5,769
Horsens, Denmark
With what I've written above, do you see that the important thing is not how many instructions or even whether they are "advanced"? Keep in mind that some things which seem 'advanced' from the software perspective are dead easy when designing gates, and some things which seem trivial are a giant pain in the butt.

I agree with that, always have. And I find it an argument to why the RISC/CISC terms are bad especially in the modern day, which was my point, so no disagreement with your excellent post :)
 

Sydde

macrumors 68030
Aug 17, 2009
2,563
7,061
IOKWARDI
… more addressing modes in ARM (pre/post-increment, index register sign extension control — don't remember if x86 has that too) …

This calls for some clarification. The breadth of addressing modes is instruction-dependent. Some opcodes can use up to 4 variations on the address mode, but others do not have those options. In assembly language, it can look like a load or store has many different address modes, but in machine code, those modes resolve to different base instruction codes.

Intel, by contrast, makes its variety of address modes uniformly available to every instruction that can access memory. An operand may be in a register or in a memory location that can be specified in one of about a dozen different ways. Most code will not use those elaborate address forms combining a base, an offset and a scaled index, but a few will, and the decoder will have to dig through the code stream to figure out what the spec is and how many bytes the instruction takes up.

As to the pre/post index update forms, those are very useful. They allow any general register to behave exactly like a stack pointer (there are no actual instructions that use the dedicated stack pointer implicitly) and also make it easy to scan through arrays of data structures concisely. The Intel design only facilitates the use of the dedicated stack pointer for stacking behavior and requires multiple instructions for scanning large-component arrays.
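A small example of the array-scanning point; the assembly in the comments is indicative codegen, not exact compiler output, and the struct is made up for illustration:

typedef struct { float x, y, z, pad; } Point;   /* hypothetical 16-byte element */

float sum_x(const Point *p, int n)
{
    float s = 0.0f;
    for (int i = 0; i < n; ++i)
        s += p[i].x;
    /* AArch64 can fold the pointer bump into the load with post-indexing:
     *     ldr  s1, [x0], #16        ; load p->x, then advance p by 16
     *     fadd s0, s0, s1
     * x86-64 needs a separate pointer or index update each iteration:
     *     addss xmm0, dword ptr [rdi]
     *     add   rdi, 16
     */
    return s;
}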

Intel made some design choices that seemed to make sense in the 1980s but end up wasting code space. The uniform instruction format of a RISC ISA imposes some real limitations on what one instruction can do, but in the practical world, those limitations converge with the way program code actually works.
 

thebart

macrumors 6502a
Feb 19, 2023
514
517
When I first moved from a 12gb PC to a 16gb M1 mini, I basically replicated all my apps and workflow, except for a couple apps that didn't have a Mac version. I found that the Mac filled up memory and hit the swap faster than the PC, even with 4gb more. (I should note that the PC had a dedicated GPU. I don't know how much of a difference that makes, but if I look in the system monitor, I see a few GPU processes that eat up about 500MB each.)

I don't think Macs use any less memory or are more memory efficient. Maybe they're faster at swapping, so you don't see as much impact. If Apple didn't charge an arm and a leg for memory upgrades, everybody would just get 16GB, and so much of this discourse is just copium.
 

leman

macrumors Core
Oct 14, 2008
19,516
19,664
When I first moved from a 12gb PC to a 16gb M1 mini, I basically replicated all my apps and workflow, except for a couple apps that didn't have a Mac version. I found that the Mac filled up memory and hit the swap faster than the PC, even with 4gb more. (I should note that the PC had a dedicated GPU. I don't know how much of a difference that makes, but if I look in the system monitor, I see a few GPU processes that eat up about 500MB each.)

I don't think Macs use any less memory or are more memory efficient. Maybe they're faster at swapping, so you don't see as much impact. If Apple didn't charge an arm and a leg for memory upgrades, everybody would just get 16GB, and so much of this discourse is just copium.

Did you notice any difference in perceived performance? Hangs, glitches, stutter on any system? Was there a difference in perceived smoothness?

Looking at how fast RAM fills up is not helpful because different systems simply have different behavior. For example, Apple uses system memory for SSD cache. If none of your applications need memory right now, the system will happily "steal" multiple GB for this purpose. Hitting the swap also doesn't mean much; only when you repeatedly see heavy swapping activity and memory warnings can you be certain that something is off.
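One way to put a number on "heavy swapping activity" on macOS is to poll the vm.swapusage sysctl and watch whether swap keeps growing under load; a minimal sketch (error handling kept to the bare minimum):

#include <stdio.h>
#include <sys/sysctl.h>

int main(void)
{
    struct xsw_usage sw;
    size_t len = sizeof(sw);

    /* vm.swapusage reports total/used/free swap in bytes on macOS. */
    if (sysctlbyname("vm.swapusage", &sw, &len, NULL, 0) != 0) {
        perror("sysctlbyname");
        return 1;
    }
    printf("swap used: %.1f MB of %.1f MB\n",
           sw.xsu_used / 1048576.0, sw.xsu_total / 1048576.0);
    return 0;
}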
 
  • Like
Reactions: Chuckeee

dmccloud

macrumors 68040
Sep 7, 2009
3,138
1,899
Anchorage, AK
On Intel Macs, the graphics cards have built-in dedicated memory.

This would only be true for those Macs that shipped with either nVidia or AMD graphics solutions. The majority of Macs in the Intel era used some variant of Intel's integrated graphics, which means that part of the system RAM was permanently allocated to the GPU and therefore not available to the CPU.
 
  • Like
Reactions: MacPowerLvr

dmccloud

macrumors 68040
Sep 7, 2009
3,138
1,899
Anchorage, AK
I'm curious—for systems with separate RAM and VRAM, are there particular classes of items that take up a significant amount of VRAM, yet that would not be mirrored in RAM?

For instance are the textures and shaders for a video game copied from RAM to VRAM, after which they can be deleted from RAM? If so, for those particular items, the dGPU's separate memory pool would act as additional memory.

In most cases, the CPU and GPU are handling different data, so there is minimal swapping between system RAM and VRAM. There are other factors at play which preclude VRAM from being treated as additional system RAM. For starters, most dedicated videocards are actually running higher-spec memory than modern CPUs support. For example, even a previous-generation Radeon 6700XT is running GDDR6, while current AM5 CPUs only run DDR5 and Intel's 13th/14th gen parts can run either DDR4 or DDR5 depending on the motherboard being used. Beyond the simple generational differences, there are significant differences between DDR and GDDR, including a wider data bus with GDDR and lower power consumption compared to desktop RAM. DDR is also optimized for latency (which is why system builders often talk about memory timings, aka the CLxx numbers) instead of bandwidth (which is how GDDR is designed).
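A rough sense of the bandwidth gap, using illustrative part configurations (the specific cards/modules below are assumptions for the sake of the arithmetic): peak bandwidth is just bus width times per-pin data rate.

#include <stdio.h>

/* peak bandwidth in GB/s = (bus width in bits / 8) * per-pin rate in GT/s */
static double peak_gb_per_s(double bus_bits, double gts) { return bus_bits / 8.0 * gts; }

int main(void)
{
    printf("GDDR6, 192-bit @ 16 GT/s       : %6.1f GB/s\n", peak_gb_per_s(192, 16.0));
    printf("DDR5-5600, dual channel 128-bit: %6.1f GB/s\n", peak_gb_per_s(128, 5.6));
    printf("LPDDR5-6400, 256-bit           : %6.1f GB/s\n", peak_gb_per_s(256, 6.4));
    return 0;
}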

These charts compare GDDR, DDR, and LPDDR in relation to both overall bandwidth and power efficiency. What's interesting to me is that LPDDR beats standard DDR in both bandwidth and power efficiency.
 

Attachments

  • Screenshot 2024-01-05 at 10.04.50 AM.jpg (chart comparing GDDR, DDR, and LPDDR bandwidth and power efficiency)

thebart

macrumors 6502a
Feb 19, 2023
514
517
Did you notice any difference in perceived performance? Hangs, glitches, stutter on any system? Was there a difference in perceived smoothness?

Looking at how fast RAM fills up is not helpful because different systems simply have different behavior. For example, Apple uses system memory for SSD cache. If none of your applications need memory right now, the system will happily "steal" multiple GB for this purpose. Hitting the swap also doesn't mean much; only when you repeatedly see heavy swapping activity and memory warnings can you be certain that something is off.
Well my PC was 11yo, so of course the m1 is way smoother and faster.

I assume Windows also tries to use as much RAM as possible to speed things up. And Mac OS will reduce cache to avoid unnecessary swapping

Anyway, the topic is whether MacOS+AS is more memory efficient, and my admittedly limited experience says it isn't. It uses sheer processing power and tight integration to swap really well. But swapping only gives you some headroom before you run into a brick wall; otherwise truly NOBODY would need more than 8GB.
 
Last edited: