
handheldgames

Original poster
Regarding how the cMP 4,1/5,1 CPUs address RAM:

There have been countless threads saying 3 sticks of RAM in a 4,1/5,1 is FASTER than 4 sticks of RAM, and that when you run 4 sticks of RAM, there is a performance penalty.

Well... This should throw a wrench in the theory. In this test of my cMP, 4 sticks is faster than 3.

990x with 48GB r2x4 registered ECC, 16x16x16x0 @ 1333MHz, Geekbench 4 - Single-Core: 3046, Multi-Core: 14125
https://browser.geekbench.com/v4/cpu/8589551

990x with 56GB r2x4 registered ECC, 16x16x16x8 @ 1333MHz, Geekbench 4 - Single-Core: 3096, Multi-Core: 14340
https://browser.geekbench.com/v4/cpu/8589636

Go figure
 
most 16gb sticks don't play well with any other size / type of ram present, last i heard. I'm curious how the second configuration worked at all.
 
Unfortunately, there is no documentation on how the cMP addresses RAM with regard to dual-channel, triple-channel or triple-channel interleaved modes. The lack of difference in the benchmarks implies that 3 and 4 sticks share the same memory addressing configuration.

This is independent of the speed the Mac Pro runs the RAM at:
  • Non-registered, non-ECC: 1-3 sticks = 1333MHz, 4 sticks = 1066MHz
  • Non-registered, ECC: 1-3 sticks = 1333MHz, 4 sticks = 1333MHz
  • Registered 2x4, ECC: 3 sticks = 1333MHz, 4 sticks = 1333MHz
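For a rough sense of what those speeds mean on paper, here is a minimal sketch that turns the list above into theoretical peak bandwidth. It assumes the "MHz" figures are really transfer rates (MT/s), that each DDR3 channel has a 64-bit (8-byte) data path, and that the CPU has three memory channels. The speed table is copied from the post as stated, not verified, and the names `POSTED_SPEEDS` and `peak_bandwidth_gb_s` are made up for illustration.

```python
# Speed table as stated in the post above (not independently verified).
# Keys: (registered, ecc); values: {stick_count: speed in MT/s}.
POSTED_SPEEDS = {
    (False, False): {1: 1333, 2: 1333, 3: 1333, 4: 1066},
    (False, True): {1: 1333, 2: 1333, 3: 1333, 4: 1333},
    (True, True): {3: 1333, 4: 1333},
}


def peak_bandwidth_gb_s(mt_per_s, channels, bus_bytes=8):
    """Theoretical peak: transfers per second * bytes per transfer * channels."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


per_channel = peak_bandwidth_gb_s(1333, channels=1)
triple = peak_bandwidth_gb_s(1333, channels=3)

# Non-registered, non-ECC case: how much the clock drop alone costs at 4 sticks.
udimm = POSTED_SPEEDS[(False, False)]
slow_ratio = udimm[4] / udimm[3]

print(f"Per channel at 1333: {per_channel:.2f} GB/s")
print(f"Three channels at 1333: {triple:.2f} GB/s")
print(f"4-stick non-ECC UDIMM runs at {slow_ratio:.0%} of the 3-stick clock")
```

These are ceilings only. Measured Geekbench numbers sit well below them, and the sketch says nothing about how the fourth stick changes channel interleaving.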
 
You're looking at the wrong Geekbench result.

When we say 3 sticks is fastest, we are talking about memory bandwidth, not CPU performance.
I have run extensive tests on my Mac Pros, and in order from slowest to fastest, it's:

1 stick
4 sticks
2 sticks
3 sticks

Yes, 4 sticks has lower bandwidth than 2 sticks.
 
Cough... Memory bandwidth, etc is in the Geekbench Results.

3 sticks:
bandwidth: 12.4 GB/sec
Memory Copy 7.43 GB/sec

vs

4 sticks:
bandwidth: 12.7 GB/sec
Memory Copy 7.60 GB/sec
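To put the gap between those two runs in perspective, a quick sketch that converts the figures quoted above into percentage differences. The helper name `pct_change` is made up for illustration; the numbers are the ones in this post.

```python
def pct_change(before, after):
    """Percentage change going from `before` to `after`."""
    return (after - before) / before * 100.0


# Geekbench 4 memory figures quoted above, in GB/sec.
three_sticks = {"bandwidth": 12.4, "memory_copy": 7.43}
four_sticks = {"bandwidth": 12.7, "memory_copy": 7.60}

deltas = {name: pct_change(three_sticks[name], four_sticks[name])
          for name in three_sticks}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f}%")
# bandwidth: +2.4%
# memory_copy: +2.3%
```

A difference of a couple of percent from one run per configuration is small enough that repeated runs would be needed before reading much into it.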
 
When I put 3x8GB into my 4/5,1 cMP with a x5680, it had a GB score about 7% higher than when I added the 4th stick. I believe this is consistent with what others are seeing, but perhaps I am mistaken.
 
most 16gb sticks don't play well with any other size / type of ram present, last i heard. I'm curious how the second configuration worked at all.

As long as the 4th DIMM is also an RDIMM, it should be able to work with the 16GB RDIMMs without any issue.
 
The CPU spec for memory handling is actually quite different between the i7 and the Xeon.
[Attached image]

Maybe the i7 hits the memory bandwidth limit quickly. Therefore, 3 or 4 sticks doesn’t really matter (that “faster” is very tiny, within the normal margin of error).
 
from recent tests conducted by another member https://forums.macrumors.com/threads/mac-pro-cpu-compatibility-list.1954766/page-20#post-26066561

It seems the Core i7 990X at least acts very much like a W3690 in regard to memory (with the same 56GB limit, and running at 1333MHz despite being listed to only work at 1066 in the 990X's case).

It even works fine with RDIMMs; the only thing is that ECC is shown as disabled in System Profiler.

It's very interesting, as most PC X58 motherboards won't POST with RDIMMs AFAIK...

I love how the Mac Pro's firmware will drag a CPU, kicking and screaming so to speak, into supporting unofficial memory configurations :)

What really needs to be done about this whole how-many-RAM-sticks thing is for someone to actually run some proper memory benchmarks in whatever OS and finally see which configuration is fastest.
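As a starting point for that kind of test, here is a minimal sketch of a memory-copy microbenchmark. It is not how Geekbench measures anything: it is single-threaded, so it will not saturate all memory channels, and Python overhead makes the absolute figure only a rough indicator. The function name `copy_bandwidth_gb_s` and its parameters are made up for illustration.

```python
import time


def copy_bandwidth_gb_s(size_mb=64, repeats=3):
    """Time a large in-memory copy and return the best rate in GB/s.

    Counts only the bytes copied (the size of the source buffer),
    not the combined read + write traffic.
    """
    # Fill the buffer with non-zero data so its pages are really allocated.
    src = bytearray(b"\xa5") * (size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # forces a full copy of the buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst
    return len(src) / best / 1e9


if __name__ == "__main__":
    print(f"Best copy rate: {copy_bandwidth_gb_s():.2f} GB/s")
```

To compare configurations, the same script would have to be run several times per DIMM layout on the same OS install, with the buffer much larger than the CPU cache.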
 
When I put 3x8GB into my 4/5,1 cMP with a x5680, it had a GB score about 7% higher than when I added the 4th stick. I believe this is consistent with what others are seeing, but perhaps I am mistaken.

I had similar results before moving to 2x4 RDIMMS

Great insight. I can confirm that while testing the RDIMMs, they refused to POST in a Gigabyte x58/975x system.


Although RDIMMs are a tad slower on the bench, they can be routinely picked up on eBay for $40/16GB.
Is this the single core result? How about the multi cores result? Same pattern?

Good question. Digging deeper, it looks like the slowdown is not across the board. In the last results set, Three sticks was faster in Multi-core operations. Single core was faster with 4 sticks.

Kicking the tires, I just ran another bench with 16x16x16x8 that leaves the 3 stick test in the dust. Go figure, the test run came in at 3150/14863, the fastest Geekbench 4 64-bit numbers I've hit on this cMP.


Geekbench Multi-Core 16x16x16x8
Memory Copy
4072
11.3 GB/sec

Memory Latency
6238
69.4 ns

Memory Bandwidth
3570
19.1 GB/sec



Geekbench Multi-Core 16x16x16
Memory Copy
3643
10.1 GB/sec

Memory Latency
6134
70.6 ns

Memory Bandwidth
3207
17.1 GB/sec
 
I have 48GB (16x16x16) RAM from OWC, and just ran Geekbench 4.2.3 in tryout mode. This is running a W3680. No reboot or anything, just closed all programs and ran it after an uptime of 7 days, 20 hours and 45 minutes.

Memory scores - -
Single core:
- Memory score = 3488
- Memory copy = 2861 / 7.93 GB/sec
- Memory latency = 6075 / 71.3 ns
- Memory bandwidth = 2442 / 13.0 GB/sec

Multi core:
- Memory score = 4451
- Memory copy = 4105 / 11.4 GB/sec
- Memory latency = 5935 / 72.9 ns
- Memory bandwidth = 3621 / 19.3 GB/sec

Overall Geekbench scores - -
- Single = 3083
- Multi = 14838

The scores are different every time I run it, of course. I've had higher and lower. Not sure if this does anything for the theory.
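One way to read these numbers against the theory is to compare the spread between two 3-stick machines with the 3-vs-4-stick gap on a single machine. This sketch uses only the multi-core memory bandwidth figures posted so far in the thread. The machines have different CPUs (990X vs W3680), so treat this as a rough sanity check rather than a controlled comparison. The variable names are made up for illustration.

```python
# Multi-core memory bandwidth (GB/sec) as posted in this thread.
three_stick = {"990X (OP)": 17.1, "W3680": 19.3}
four_stick = {"990X (OP)": 19.1}

# Spread between two different machines running the same 16x16x16 layout.
spread = max(three_stick.values()) - min(three_stick.values())

# Gap between 4 and 3 sticks on the OP's machine.
gap = four_stick["990X (OP)"] - three_stick["990X (OP)"]

print(f"Spread between 3-stick systems: {spread:.1f} GB/sec")
print(f"4-stick vs 3-stick gap (OP):    {gap:.1f} GB/sec")
print("Gap exceeds spread" if gap > spread else "Gap is within the spread")
```

With these figures, the difference between two 3-stick systems is at least as large as the 3-vs-4 difference on one system, which is why several runs per configuration matter.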
 
Update from a fresh install of Mojave... Single-core memory performance is nearly identical. In multi-core memory performance, 4 sticks is FASTER than 3 sticks.

Mac Pro 5,1. 16x16x16 48GB(link)
Single-Core Score 3205 Multi-Core Score 15311

Single-Core memory
Memory Copy 2887
8.00 GB/sec

Memory Latency 6378
67.9 ns

Memory Bandwidth 2484
13.3 GB/sec


Multi-Core memory
Memory Copy 3882
10.8 GB/sec

Memory Latency 5836
74.2 ns

Memory Bandwidth 3363
18.0 GB/sec



MacPro 5,1 16x16x16x8 56GB(link)
Single-Core Score 3200 Multi-Core Score 15506

Single-Core memory
Memory Copy 2876
7.97 GB/sec

Memory Latency 6323
68.5 ns

Memory Bandwidth 2472
13.2 GB/sec


Multi-core memory
Memory Copy 4078
11.3 GB/sec

Memory Latency 6223
69.6 ns

Memory Bandwidth 3536
18.9 GB/sec
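Because latency is a lower-is-better metric while copy and bandwidth are higher-is-better, a side-by-side comparison has to track direction. A minimal sketch using the multi-core figures from the two Mojave runs above; the names `RESULTS` and `verdict` are made up for illustration.

```python
# (3 sticks, 4 sticks, higher_is_better), multi-core figures from the runs above.
RESULTS = {
    "Memory Copy (GB/sec)": (10.8, 11.3, True),
    "Memory Latency (ns)": (74.2, 69.6, False),
    "Memory Bandwidth (GB/sec)": (18.0, 18.9, True),
}


def verdict(three, four, higher_is_better):
    """Return (percent change from 3 to 4 sticks, which layout looks better)."""
    change = (four - three) / three * 100.0
    improved = (change > 0) if higher_is_better else (change < 0)
    return change, ("4 sticks better" if improved else "3 sticks better or equal")


for metric, (three, four, higher_is_better) in RESULTS.items():
    change, label = verdict(three, four, higher_is_better)
    print(f"{metric:27s} {three:6.1f} -> {four:6.1f}  {change:+5.1f}%  {label}")
```

On this single pair of runs all three multi-core metrics favor 4 sticks by roughly 5 to 6 percent. The single-core figures above are nearly identical.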
 
I just tested this. 4 Sticks is significantly worse in benchmarks than 3 in my system.

Also, I feel like this has been pretty thoroughly tested and proven. Maybe the OP's system is unique in that it is a 990x, but I think it is also non-standard to have mixed RAM capacities. In my case all memory is identical and adding a fourth shows a marked decline.

3 Sticks:

Memory Copy 12.1 GB/s
Latency 117 ns
Bandwidth 20.4 GB/s
https://browser.geekbench.com/v4/cpu/8815505

4 Sticks:

Memory Copy 8.81 GB/s
Latency 116.8 ns
Bandwidth 14.2 GB/s
https://browser.geekbench.com/v4/cpu/8815749
 
UDIMMs will do much worse when more than one stick is installed on a single channel. On the other hand, this is not a problem for RDIMMs.

The OP uses 16GB DIMMs, which must be RDIMMs on the cMP. Therefore, he won't suffer from the 2T or 2N timing issue.

But your 24GB / 32GB config may be UDIMMs; in that case, it can explain why your pure memory performance degraded a lot when you used the 4th slot.
 
UDIMMs will do much worse when more than one stick is installed on a single channel. On the other hand, this is not a problem for RDIMMs.
he was saying this in my thread also. but then also tells us how non-ecc RDIMM is faster, even though it doesn't exist.
 
It is a fact that RDIMMs work better when more than one DIMM is installed on a single memory channel.

https://en.wikipedia.org/wiki/Registered_memory

Can you prove that is wrong? You know nothing about how memory works, how to categorize it, how to perform some simple math, how to make a reasonable assumption, how to discuss, or even how to read.

This is all I know from your thread.

And I NEVER said non-ECC RDIMM is faster; in fact, I never talked about that particular DIMM's performance at all. I just said ECC has nothing to do with being registered or not. You still can't read, even though I've said that more than 3 times.
 
Digging deeper into the results...

A 6-core Gulftown-class 990x with 4 ECC RDIMMs, EFI version 085, shows 4 RDIMMs is faster than 3 RDIMMs in multi-core, running without ECC enabled.

Two 4-core Westmere-EP x5677 CPUs with DDR3 SDRAM, EFI version 07F, show 4 SDRAM sticks (per CPU or total?) is slower than 3 SDRAM sticks (per CPU or total?) in multi-core.

The 990x is considered an unlocked consumer version of the w3690 - single DMI.
The x5677 series is dual DMI, with increased memory latency.
Could the variance be DDR3 SDRAM vs DDR3 RDIMMs?
X58 chipset vs Intel 5520 chipset?
Or EFI version 07F vs 085?
 
Can you just read the wiki link???
the specification exists, it's not a real product in general use as you claimed it to be. find one for sale. find one that was ever even made. how can you argue about speed differences of things that don't exist. that is why i did not even bother to look if your other claim was true or not, because this was such a ridiculous claim.
 
Please show me the exact post where I talked about non-ECC RDIMM performance.

I only compared "RDIMM ECC" to "UDIMM non ECC" speed. I never mentioned non-ECC RDIMM speed. I've clarified this multiple times already.

Non-ECC should be faster than ECC; however, RDIMM can be faster than UDIMM in some cases. Therefore, "RDIMM ECC" can be faster than "UDIMM non ECC" in the test if the condition is met (more than one DIMM on a single memory channel). This is what I talked about. Please show me the post where I said otherwise.

Digging deeper into the results...

A 6-core Gulftown-class 990x with 4 ECC RDIMMs, EFI version 085, shows 4 RDIMMs is faster than 3 RDIMMs in multi-core, running without ECC enabled.

Two 4-core Westmere-EP x5677 CPUs with DDR3 SDRAM, EFI version 07F, show 4 SDRAM sticks (per CPU or total?) is slower than 3 SDRAM sticks (per CPU or total?) in multi-core.

The 990x is considered an unlocked consumer version of the w3690 - single DMI.
The x5677 series is dual DMI, with increased memory latency.
Could the variance be DDR3 SDRAM vs DDR3 RDIMMs?
X58 chipset vs Intel 5520 chipset?
Or EFI version 07F vs 085?

It's hard to tell.

But I don't think the BootROM has any effect on this matter.

Dual processors may make a difference. However, previous threads show that even a single-processor cMP can still show a significant memory performance difference when the 4th slot is in use. Therefore, we can also safely remove this variable (as well as the chipset difference).

The remaining suspects are Xeon vs i7, and the memory spec itself.

It's very hard to tell if the i7 makes the difference because so few users here run one. Therefore, unless you have some UDIMMs on hand to test with, it won't be easy to find out which is the most significant parameter.
 
Please show me the exact post where I talked about non-ECC RDIMM performance.
registered memory has nothing to do with ECC. And registered memory do perform better

non ECC RAM can be registered.

RDIMM will be faster

Registered or not has absolutely nothing to do with ECC.

Registered NON ECC RAM exist.

you going to deny you made these remarks?
also, go ahead and find a registered non-ecc memory for sale, try newegg or ebay, where are they?
 