RAM Requirements With AS

leman · Nov 8, 2020

pshufd said:
Legacy CISC has a lot of cruft.

I recall that the VAX architecture had this one instruction that was truly huge.

Intel would be in a much better place if they had taken the opportunity to redesign their ISA for the 64-bit era. Of course, Intel was pushing Itanium, and the rest is history... a great example of how the short-sightedness of a small group of people can make or break progress.

pshufd · Nov 8, 2020

leman said:
Intel would be in a much better place if they had taken the opportunity to redesign their ISA for the 64-bit era. Of course, Intel was pushing Itanium, and the rest is history... a great example of how the short-sightedness of a small group of people can make or break progress.

It wasn't Intel's choice.

AMD came out with x64 and Intel was forced to follow.

Itanium had a lot of RAS features that the vast majority of customers don't want or need.

mr_roboto · Nov 8, 2020

leman said:
As far as I know, unaligned memory access is undefined behavior in C and C++ (forbidden by the standard).

Undefined isn't forbidden. Rather the opposite, in fact - for better or for worse, the C and C++ spec/compiler ecosystem uses undefined behavior to permit a huge number of optimizations.

What Every C Programmer Should Know About Undefined Behavior #1/3

People occasionally ask why LLVM-compiled code sometimes generates SIGTRAP signals when the optimizer is turned on. After digging in, they find that Clang generated a "ud2" instruction (assuming X86 code) - the same as is generated by __builtin_trap().

blog.llvm.org

I believe it is true that dereferencing an unaligned pointer is undefined behavior. But if you do that with almost any x86 compiler, you are likely to get exactly the result you would expect. This is the danger of C family languages; you're invoking undefined behavior all the time, but you think it's OK because the compiler doesn't complain and the code doesn't look like something that's undefined and usually everything works great. Until it doesn't...
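A minimal sketch of the usual way to sidestep this in C, assuming you need to read a 32-bit value from a buffer at an arbitrary offset. The helper name `load_u32_unaligned` is made up for illustration. `memcpy` is defined for any alignment, and compilers typically turn it into a plain load on targets that tolerate misaligned access.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address that may not be
   4-byte aligned. memcpy has defined behavior for any alignment. */
static uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* byte-wise copy, no alignment requirement */
    return v;
}

/* The pattern being warned about: cast and dereference.
   Undefined behavior if p is misaligned for uint32_t, even though it
   usually appears to work on x86:

       uint32_t bad = *(const uint32_t *)p;
*/
```

The result is in the host's byte order, same as the cast version would give when it happens to work.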

On AArch64, it seems there's a mode bit per exception level (EL0 through EL3) which tells the processor whether misaligned memory accesses should be treated as exceptions in ELx. Since Apple is trying to support low-cost migration of x86 code, I expect that they've designed their processors to support fast misaligned accesses in hardware and will not force exceptions with this mode bit.

pshufd said:
If ARM has instructions to write out a single byte, word, etc, then I'm wrong - I've never really looked at ARM. My last look at RISC was PowerPC.

pshufd said:
It sounds like ARM is more CISC than RISC.

I think you're remembering what you worked on at DEC as the definition of RISC. Alpha was the only mainstream RISC I'm aware of which attempted to leave out support for byte and word memory accesses. PowerPC never did that; it supported 8- and 16-bit loads and stores. Here's an example instruction:

lhz (Load Half and Zero) instruction

Alpha didn't stay purist for long, either. The lack of byte and word access proved to be such an immediate disaster in the first-gen 21064 that they added "Byte Word eXtensions" (BWX) to the second generation 21164.
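To illustrate why leaving out byte stores hurts, here is a rough sketch of what software has to do when memory can only be written a whole word at a time. This is a simplification in C, not actual Alpha code: the helper name, the 32-bit word size, and the little-endian byte numbering are all assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store one byte into memory that can only be accessed
   as aligned 32-bit words. Bytes within a word are numbered little-endian. */
static void store_byte_via_words(uint32_t *words, size_t byte_index, uint8_t value)
{
    size_t   word  = byte_index / 4;                  /* word holding the byte */
    unsigned shift = (unsigned)(byte_index % 4) * 8;  /* bit offset in the word */
    uint32_t w;

    assert(words != NULL);
    w  = words[word];                   /* load the whole word */
    w &= ~((uint32_t)0xFF << shift);    /* clear the target byte */
    w |= (uint32_t)value << shift;      /* insert the new byte */
    words[word] = w;                    /* store the whole word back */
}
```

One logical byte write turns into a load, two logical operations and a store, and the read-modify-write is not atomic with respect to anything else touching the neighbouring bytes.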

Fomalhaut · Nov 8, 2020

aeronatis said:
When I compare 4K 10-bit HEVC video editing between my iPad Pro (LumaFusion) and MacBook Pro (Final Cut Pro), if the project is basic colour and light corrections, the iPad Pro performs the export in half the time. However, when I add too many effects, apply a LUT, or increase the number of layers, the MacBook Pro gets ahead.

Which MBP do you have? 13" or 16"?

pshufd · Nov 8, 2020

mr_roboto said:
I think you're remembering what you worked on at DEC as the definition of RISC. Alpha was the only mainstream RISC I'm aware of which attempted to leave out support for byte and word memory accesses. PowerPC never did that; it supported 8- and 16-bit loads and stores. Here's an example instruction:

lhz (Load Half and Zero) instruction

Alpha didn't stay purist for long, either. The lack of byte and word access proved to be such an immediate disaster in the first-gen 21064 that they added "Byte Word eXtensions" (BWX) to the second generation 21164.

That sounds right. The PowerPC stuff that I did was mostly Altivec. I probably thought of it as being like Alpha though.

Nermal · Nov 8, 2020

pshufd said:
It sounds like ARM is more CISC than RISC.

64-bit Arm is certainly a lot more 'CISCy' than 32-bit was.

aeronatis · Nov 9, 2020

Fomalhaut said:
Which MBP do you have? 13" or 16"?

I have the 16" model with an i9-9880H CPU, a Radeon Pro 5500M 8 GB graphics card and 32 GB of RAM.

theluggage · Nov 9, 2020

mailman199 said:
With the switch to AS is there a change in how the system utilizes RAM? I’m curious if it will be more efficient and therefore need less RAM than an Intel Mac. iPads seem to be very snappy with lower amounts of RAM.

One thing to watch for is whether the new ASi chips for Mac have dedicated VRAM - if they don't, and the GPU uses shared system memory, that's a few gigs extra RAM requirement c.f. (say) an iMac with a discrete GPU. However, that's already an issue on some Macs with integrated graphics.

pshufd said:
It sounds like ARM is more CISC than RISC.

I'm not sure how much of an issue CISC vs. RISC is in 2020 in terms of per-core performance - RISC basically "won" when Intel switched to the Pentium Pro architecture which is (coarsely speaking) a RISC-like core with an x86 instruction translator. A big part of any ASi advantage is going to be how many extra cores, GPUs, codec accelerators, neural engines and pizza ovens Apple can pack on the die. Even the supercomputer projects using ARM are mainly using it as a controller to feed custom vector processors and other exotic hardware.

Anyway, RISC vs. CISC is a bit more than just "lots of simple instructions" - it's about designing the instruction set for the benefit of optimising compilers rather than making it easy for humans writing lovingly hand-crafted assembly.

I haven't touched ARM assembly language since the early days of ARM 2/3 and 24-bit addressing but even then, the instructions could get pretty "complex" in their own way: e.g. every single instruction could be made conditional (avoids having to flush the pipeline on a jump) and every instruction that manipulated data could also perform a bitwise shift/rotate operation (a hard-wired "barrel shifter" was part of the ARM hardware). The reasoning was always efficiency rather than programmer convenience - and trying to ensure that most instructions executed in a single clock cycle rather than being expanded to multiple steps of microcode.

pshufd said:
I recall that the VAX architecture had this one instruction that was truly huge.

Yes - the VAX "Evaluate polynomial" instruction was the poster child for the problem with CISC. It's not hard to see why instructions like that would be great if you were writing number-crunching code in assembly language or calling VAX-specific libraries, but would rarely get used in code compiled from standard programming languages.
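For anyone wondering what a polynomial-evaluation instruction does in one go, this is roughly the loop a compiler would emit from ordinary source code instead. It is a sketch using Horner's rule; the function name and the coefficient ordering (highest-order term first) are choices made for this example, not a description of the VAX encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluate c[0]*x^n + c[1]*x^(n-1) + ... + c[n] with Horner's rule.
   coeffs must hold degree + 1 values, highest-order coefficient first. */
static double eval_poly(const double *coeffs, size_t degree, double x)
{
    double result;
    size_t i;

    assert(coeffs != NULL);
    result = coeffs[0];
    for (i = 1; i <= degree; i++)
        result = result * x + coeffs[i];   /* one multiply and one add per term */
    return result;
}
```

Each iteration is just a multiply and an add, which is exactly the kind of thing a simple pipeline handles well without a dedicated instruction.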

Andropov · Nov 9, 2020

theluggage said:
One thing to watch for is whether the new ASi chips for Mac have dedicated VRAM - if they don't, and the GPU uses shared system memory, that's a few gigs extra RAM requirement c.f. (say) an iMac with a discrete GPU.

From the WWDC videos I would expect RAM to be shared by the CPU and GPU. So no discrete VRAM.

Krevnik · Nov 9, 2020

theluggage said:
Anyway, RISC vs. CISC is a bit more than just "lots of simple instructions" - it's about designing the instruction set for the benefit of optimising compilers rather than making it easy for humans writing lovingly hand-crafted assembly.

Agreed.

That said, I still prefer ARM/PPC over x86 when having to debug crash stacks using disassembly. I find it faster and easier to grok iOS disassembly the few times I need to do it. With x86, how dense an instruction can be, along with having to remember which syntax you are working with, makes it slower to work with if you aren't doing it regularly. And honestly, you shouldn't be doing it regularly.

pshufd said:
It wasn't Intel's choice.

AMD came out with x64 and Intel was forced to follow.

Technically true, but it's not like Intel was doing anything with 64-bit other than IA-64. AMD saw an opportunity and took it. Intel was forced because they utterly missed the mark on where they should invest.

As much as engineers would love to get a fresh start from time to time, the thing I keep seeing in the market is "evolution trumps revolution" when it comes to architecture.

pshufd · Nov 9, 2020

Krevnik said:
Technically true, but it's not like Intel was doing anything with 64-bit other than IA-64. AMD saw an opportunity and took it. Intel was forced because they utterly missed the mark on where they should invest.

As much as engineers would love to get a fresh start from time to time, the thing I keep seeing in the market is "evolution trumps revolution" when it comes to architecture.

Intel won with that approach for a very long time. They could have kept winning but took their eye off the ball for a very long time.

x64 outlasted a lot of architectures from the 80s and 90s.

Fomalhaut · Nov 9, 2020

aeronatis said:
I have the 16" model with an i9-9880H CPU, a Radeon Pro 5500M 8 GB graphics card and 32 GB of RAM.

That's exactly what I have too, so the fact your iPad exports in half the time is truly impressive.

dmccloud · Nov 9, 2020

mikeboss said:
11.0.1 Release Candidate build 20B5022a

fresh boot, exactly the same set of apps started on both machines

intel:

View attachment 1537474

arm:

View attachment 1537476

What's interesting about these two images is that while there really isn't much of a difference in the total amount of RAM in use (4.2GB for Intel, 5.25GB AS), the Intel actually has a slightly higher memory pressure despite the lower amount of RAM in use. Granted, we would need to see a comparison between an Intel based Mac and a production AS Mac to be certain (I assume this was from the DTK?), but it appears that even the ARM-based processor here might have slightly improved memory management compared to Intel. That could make things quite interesting after tomorrow...

Krevnik · Nov 9, 2020

dmccloud said:
What's interesting about these two images is that while there really isn't much of a difference in the total amount of RAM in use (4.2GB for Intel, 5.25GB AS), the Intel actually has a slightly higher memory pressure despite the lower amount of RAM in use. Granted, we would need to see a comparison between an Intel based Mac and a production AS Mac to be certain (I assume this was from the DTK?), but it appears that even the ARM-based processor here might have slightly improved memory management compared to Intel. That could make things quite interesting after tomorrow...

Less wired (immovable) memory on the AS Mac means less memory pressure. Not all RAM pages are created equal. Read-only pages like code pages are the cheapest to evict/reload, for example.

But these screenshots aren't great examples. They really need to be sorted better, and not by PID. As it is, I can't even verify the claim that "the same set of apps started on both machines" is true. Because it doesn't look true based on the screenshots. Sorted by memory usage would be more interesting, IMO.

pshufd said:
Intel won with that approach for a very long time. They could have kept winning but took their eye off the ball for a very long time.

x64 outlasted a lot of architectures from the 80s and 90s.

And it still has quite a bit of life left in it on the desktop, going by what AMD is doing. Intel itself is the problem, IMO. It's more that with Apple having a silicon team, when it was clear that Intel wasn't doing the job, Apple's first thought seems to have been "can we do it in house and cut out these dependencies entirely?"

Apple could have continued for a while longer by going AMD. But I guess they saw a long-term business opportunity and decided to take it rather than try to maintain status quo.

pshufd · Nov 9, 2020

Krevnik said:
And it still has quite a bit of life left in it on the desktop, going by what AMD is doing. Intel itself is the problem, IMO. It's more that with Apple having a silicon team, when it was clear that Intel wasn't doing the job, Apple's first thought seems to have been "can we do it in house and cut out these dependencies entirely?"

Apple could have continued for a while longer by going AMD. But I guess they saw a long-term business opportunity and decided to take it rather than try to maintain status quo.

The rumors of ARM have been around since 2008 so I think that they have been thinking about this even before the Intel transition was complete. Apple has the size and scale where they can simply hire talent to do what they need and the consolidation of major architectures in the 1980s and 1990s means that there is likely a lot of talent out there. They can also just hire from AMD and Intel.

Intel dropped the ball for a very long time. I still think about Intel parts first. My recent build has a 10700 and it's a nice system. I bought the parts before Zen3 came out, though, and I might have gone with Zen3 if it had been out at the time. Apple Silicon really has my interest. I don't think that I could run my full workload on it right now but I could run parts of my workload on it which would be sufficient.

aeronatis · Nov 9, 2020

Fomalhaut said:
That's exactly what I have too, so the fact your iPad exports in half the time is truly impressive.

Exactly! Like I said, as the project gets more complex, the MacBook Pro gets the upper hand. Though this could simply be due to the system RAM difference. I guess we are about to see 🙏🏼

jtara · Nov 14, 2020

Nobody has brought up the issue of swapping. With only 16GB, many popular Mac use cases will be doing a LOT of swapping of data.

Could it be that between something about the architecture along with much faster flash memory (I haven't seen benchmarks for the flash?) it's less of an issue?

But the idea of a lot of swapping onto flash gives me the willies!

I mean, yes, flash lifetime has improved over the past few years - and dramatically. But still... has the flash reliability/lifetime also been further dramatically improved?

I would assume the standard advice of getting the largest flash storage you can - whether you actually need it for storage or not - applies here.

Kung gu · Nov 14, 2020

jtara said:
Nobody has brought up the issue of swapping. With only 16GB, many popular Mac use cases will be doing a LOT of swapping of data.

Could it be that between something about the architecture along with much faster flash memory (I haven't seen benchmarks for the flash?) it's less of an issue?

But the idea of a lot of swapping onto flash gives me the willies!

I mean, yes, flash lifetime has improved over the past few years - and dramatically. But still... has the flash reliability/lifetime also been further dramatically improved?

I would assume the standard advice of getting the largest flash storage you can - whether you actually need it for storage or not - applies here.

Remember, the 16GB limit is for the MBA, the 2-port MBP and the entry-level Mac mini. The Mac mini will likely get more RAM when the M1X or M2X is released. And this is unified memory, so who knows; wait till the reviews.

cmaier · Nov 14, 2020

aeronatis said:
I read on the Pixelmator forums that the reason the Pixelmator Photo app has not released a Photos extension for iPadOS yet is that the required RAM is much higher than what Apple allows for extensions. Thus, I think the next iPad Pro should have at least 8 gigs of RAM (which is probably the highest we can hope for, though).

The wording I used (simpler) was just to summarise the difference between the two instruction sets in one word. Otherwise, we can write pages of details. I would not want to cause any misunderstanding.

If you summarise the two instruction sets for anyone who doesn't want to go too deep, you can basically say one is more like having add, sum, read instructions while the other is like having factorial, sine, sine squared (again, these are figurative terms, not the actual instructions), which is why CISC-based devices perform complex work more easily while RISC-based devices perform simpler tasks more easily. A RISC device would be like issuing multiple "multiply" instructions instead of just one "factorial" instruction, so the code will be longer on RISC compared to CISC.

When I compare 4K 10-bit HEVC video editing between my iPad Pro (LumaFusion) and MacBook Pro (Final Cut Pro), if the project is basic colour and light corrections, the iPad Pro performs the export in half the time. However, when I add too many effects, apply a LUT, or increase the number of layers, the MacBook Pro gets ahead.

No, the difference in instructions between RISC and CISC usually boils down to things like performing memory accesses vs. just register accesses.

So, a typical CISC instruction may be “add register A to the contents of memory address X and put the result in memory address Y.”

But in RISC you’d instead have to have multiple instructions: (a) fetch the contents of memory address X and put them in register B; (b) add register A to register B and put the result in register C; (c) put the contents of register C in memory address Y.

It's nothing so interesting as "multiply" vs. "factorial."
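To make the add example above concrete, here is a toy model in C. Everything in it (the `struct machine`, the register and memory sizes, the function names) is invented for illustration and does not correspond to any real ISA. It only shows that the single memory-to-memory operation and the load/add/store sequence produce the same result.

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_WORDS = 16, NUM_REGS = 4 };

/* Toy machine state: a small memory and a small register file. */
struct machine {
    int32_t mem[MEM_WORDS];
    int32_t reg[NUM_REGS];
};

/* "CISC-style": one operation reads memory, adds a register, writes memory. */
static void add_mem(struct machine *m, int ra, int x, int y)
{
    assert(x >= 0 && x < MEM_WORDS && y >= 0 && y < MEM_WORDS);
    m->mem[y] = m->reg[ra] + m->mem[x];
}

/* "RISC-style" primitives: only load and store touch memory. */
static void load(struct machine *m, int rd, int addr)
{
    assert(addr >= 0 && addr < MEM_WORDS);
    m->reg[rd] = m->mem[addr];
}

static void add(struct machine *m, int rd, int ra, int rb)
{
    m->reg[rd] = m->reg[ra] + m->reg[rb];
}

static void store(struct machine *m, int rs, int addr)
{
    assert(addr >= 0 && addr < MEM_WORDS);
    m->mem[addr] = m->reg[rs];
}

/* The same work as add_mem, expressed as steps (a), (b), (c). */
static void add_mem_risc(struct machine *m, int ra, int rb, int rc, int x, int y)
{
    load(m, rb, x);        /* (a) fetch mem[x] into register B */
    add(m, rc, ra, rb);    /* (b) C = A + B, registers only */
    store(m, rc, y);       /* (c) write register C to mem[y] */
}
```

The RISC-style version needs two extra scratch registers and three instructions, but each step is simple and touches memory at most once.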

aeronatis · Nov 14, 2020

cmaier said:
No, the difference in instructions between RISC and CISC usually boils down to things like performing memory accesses vs. just register accesses.

So, a typical CISC instruction may be “add register A to the contents of memory address X and put the result in memory address Y.”

But in RISC you’d instead have to have multiple instructions: (a) fetch the contents of memory address X and put them in register B; (b) add register A to register B and put the result in register C; (c) put the contents of register C in memory address Y.

It's nothing so interesting as "multiply" vs. "factorial."

And I especially emphasised that those were not the actual instructions 🤦🏻‍♂️

cmaier · Nov 15, 2020

aeronatis said:
And I especially emphasised that those were not the actual instructions 🤦🏻‍♂️

Understood, but the inference from your choice of fake instructions was that somehow the CISC instructions were more powerful in the sense that they perform more complicated algorithms that are of use to the programmer. Which is almost never the case.

The difference, almost always, is far more mechanical. CISC instructions can access memory and registers in the same instruction, perform math on memory contents, calculate an address in memory and then use that calculated address to fetch something and perform math on it, etc.

RISC machines require that if you want to get something from memory you use a LOAD instruction. If you want to put something in memory you use a STORE instruction. And if you do math, the results go into the register file, and the operands come from the register file (or are embedded as immediates in the instruction itself).

The complexity of the mathematical or logical instructions themselves is generally equivalent. (There are exceptions to every rule, of course, but the few truly complicated math instructions in various CISC machines also tend to not get used very much.)

aeronatis · Nov 15, 2020

cmaier said:
Understood, but the inference from your choice of fake instructions was that somehow the CISC instructions were more powerful in the sense that they perform more complicated algorithms that are of use to the programmer. Which is almost never the case.

The difference, almost always, is far more mechanical. CISC instructions can access memory and registers in the same instruction, perform math on memory contents, calculate an address in memory and then use that calculated address to fetch something and perform math on it, etc.

RISC machines require that if you want to get something from memory you use a LOAD instruction. If you want to put something in memory you use a STORE instruction. And if you do math, the results go into the register file, and the operands come from the register file (or are embedded as immediates in the instruction itself).

The complexity of the mathematical or logical instructions themselves is generally equivalent. (There are exceptions to every rule, of course, but the few truly complicated math instructions in various CISC machines also tend to not get used very much.)

If you go back through how the conversation went, you would understand why I gave that example. I originally mentioned that the idea that RISC machines would require less RAM than CISC is a misconception. I just didn't go technical. That's all. The message you quote was my reply to someone else.

As for "not more powerful", that is not what I said either. I just said CISC instructions would be more complex, and yes, that is the case. You can write pages of exceptions, but that doesn't change the basic difference of RISC vs CISC. In fact, a RISC machine could require more RAM in most cases.

cmaier · Nov 15, 2020

aeronatis said:
If you go back through how the conversation went, you would understand why I gave that example. I originally mentioned that the idea that RISC machines would require less RAM than CISC is a misconception. I just didn't go technical. That's all. The message you quote was my reply to someone else.

As for "not more powerful", that is not what I said either. I just said CISC instructions would be more complex, and yes, that is the case. You can write pages of exceptions, but that doesn't change the basic difference of RISC vs CISC. In fact, a RISC machine could require more RAM in most cases.

Mostly true. On the other hand, RISC machines almost always have more registers than CISC machines, which means less data memory is required for scratch. Very small effect. But the effect caused by having to store more op codes in memory (that’s really what the difference ends up being - the operands end up being more or less a wash) is small, too: around a 15% difference in instruction memory, which is a small portion of total memory. Most of those studies were done when instructions were 32-bit, though. With 64-bit, the operands grew much more than op codes [which may not have grown at all, depending on ISA], so the percentage may be smaller now.
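A back-of-the-envelope version of that argument, as a sketch. The 15% figure is the one quoted above; the share of total memory that is code is not given in the thread, so the 10% used in the comment below is purely an assumed number for illustration.

```c
#include <assert.h>

/* Fractional growth in total memory footprint when only the code portion
   grows. code_share is the fraction of total memory occupied by
   instructions; code_growth is how much larger the code gets
   (0.15 means 15% larger). */
static double total_growth(double code_share, double code_growth)
{
    assert(code_share >= 0.0 && code_share <= 1.0);
    return code_share * code_growth;
}

/* Example with an assumed (hypothetical) code share of 10%:
   total_growth(0.10, 0.15) is about 0.015, i.e. roughly 1.5% more
   memory overall. */
```

Whatever the real code share is for a given workload, the overall effect is that share multiplied by the code growth, so it stays small as long as data dominates the footprint.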

aeronatis · Nov 15, 2020

cmaier said:
Mostly true. On the other hand, RISC machines almost always have more registers than CISC machines, which means less data memory is required for scratch. Very small effect. But the effect caused by having to store more op codes in memory (that’s really what the difference ends up being - the operands end up being more or less a wash) is small, too: around a 15% difference in instruction memory, which is a small portion of total memory. Most of those studies were done when instructions were 32-bit, though. With 64-bit, the operands grew much more than op codes [which may not have grown at all, depending on ISA], so the percentage may be smaller now.

Exactly. I cannot wait to get my hands on the MacBook Air and compare it to the MacBook Pro 16". Previously I compared the iPad Pro with the MacBook Pro, and the iPad Pro had better timeline smoothness and much shorter export times for simple edits, while the MacBook Pro took the lead exporting highly edited videos (4 GB vs 32 GB RAM is a factor too, though). So I wonder how the M1 chip with 16 GB RAM will compare.

cmaier · Nov 15, 2020

aeronatis said:
Exactly. I cannot wait to get my hands on the MacBook Air and compare it to the MacBook Pro 16". Previously I compared the iPad Pro with the MacBook Pro, and the iPad Pro had better timeline smoothness and much shorter export times for simple edits, while the MacBook Pro took the lead exporting highly edited videos (4 GB vs 32 GB RAM is a factor too, though). So I wonder how the M1 chip with 16 GB RAM will compare.

We don’t yet have a lot of information on i/o bandwidth on these new systems. Should be interesting.
