...I'm not an expert on this so someone else might correct me, but, if the memory controller even supports an arrangement of 4, 4, 16, -
If Apple continues its trend of using unbuffered modules for the smaller DIMM capacities, then that isn't going to work. Mixing unbuffered and registered (buffered) DIMM modules isn't supported.
If the 8GB modules are registered (buffered), then you may want to mix and match with 16GB modules later. But if you know you are headed to 16/32GB DIMMs long term, buying the 4GB config is very likely more cost effective in the end. You just have to resign yourself to either repurposing or selling those 4GB modules.
, then best case you're only going to get multi-channel performance across 12GB (3x4GB), although it could default to single channel mode for the whole lot (no striping).
The notion of channels here is horribly muddled. There are four pathways (controllers) to DIMMs in the Xeon E5 packages. If you plug in 4 DIMMs (one on each of the four pathways) you will be using all four. If you only plug in two DIMMs you will be leaving two of those pathways (controllers) idle [essentially flushing potential bandwidth down the drain].
If any of those four pathways has multiple DIMM sockets assigned to it, the timing on all four controllers/pathways is set to a different timing model so each one can juggle talking to more than one DIMM. There are still four pathways. The notion that things somehow go single file across all the DIMMs is flawed. Down a single pathway only one DIMM is accessible at a time; that one path is single file, but there are multiple paths to memory.
On the new Mac Pro that is moot. There are four DIMM slots and four pathways, so one controller never has to juggle more than one DIMM. That is aligned with the future, since in DDR4 one DIMM per pathway is the normal mode.
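To put rough numbers on why empty pathways matter, here is a minimal back-of-the-envelope sketch. The DDR3-1866 transfer rate and 64-bit channel width are illustrative assumptions, not the spec of any particular box, but the scaling is the point: theoretical peak bandwidth grows with the number of populated pathways.

```c
/* Back-of-the-envelope sketch: peak bandwidth vs. populated pathways.
 * The 1866 MT/s rate and 64-bit pathway width are assumed, illustrative
 * figures, not measurements of a specific machine. */
#include <stdio.h>

int main(void) {
    const double mt_per_sec     = 1866e6;  /* assumed DDR3-1866 transfer rate */
    const double bytes_per_xfer = 8.0;     /* 64-bit pathway = 8 bytes per transfer */
    const double per_path_gbs   = mt_per_sec * bytes_per_xfer / 1e9;

    for (int populated = 1; populated <= 4; populated++) {
        printf("%d of 4 pathways populated: ~%.1f GB/s theoretical peak\n",
               populated, populated * per_path_gbs);
    }
    return 0;
}
```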
However, the performance impact may only be a few percent if your data sets normally fit into Intel's enormous L3 cache. Back in the FSB days, Intel started using ridiculous L3 cache sizes on their CPUs to mask underlying issues with memory performance...
And the 10-30MB L3 caches of the current Xeon E5 v2 are puny compared to "back in the day". Not.
Pushing all memory accesses through a single "front side bus" is bad because it sets up a choke point. 1-2 cores can share a single access resource. Maybe 2-4. But 6, 8, 10, 12 cores all end up hammering on the same "door" to and from memory. A large room with many people inside will have multiple exits because it is hard to get lots of folks through one door, especially if there is in/out traffic. One way of decreasing traffic jams is to have multiple doors/lanes/paths.
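To make the one-door picture concrete, here is a rough STREAM-triad-style sketch (my own illustrative code, not a calibrated benchmark; it assumes GCC with OpenMP). Past a certain thread count the measured GB/s flattens out, because every extra core is just queuing up at the same memory paths.

```c
/* Rough STREAM-triad-style sketch: each added thread puts more pressure
 * on the same memory pathways, so measured GB/s levels off once the
 * paths saturate, no matter how many cores you throw at it.
 * Build (assumption: GCC with OpenMP): gcc -O2 -fopenmp triad.c -o triad */
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define N (32L * 1024 * 1024)   /* doubles per array: ~256MB each, far past any L3 */

int main(void) {
    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));
    if (!a || !b || !c) return 1;

    for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

    for (int threads = 1; threads <= omp_get_max_threads(); threads *= 2) {
        omp_set_num_threads(threads);
        double t0 = omp_get_wtime();

        #pragma omp parallel for
        for (long i = 0; i < N; i++)
            a[i] = b[i] + 3.0 * c[i];    /* triad: two reads + one write per element */

        double secs  = omp_get_wtime() - t0;
        double bytes = 3.0 * N * sizeof(double);
        printf("%2d thread(s): ~%.1f GB/s\n", threads, bytes / secs / 1e9);
    }
    free(a); free(b); free(c);
    return 0;
}
```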
And even now that they've dramatically improved memory performance with an on-die controller, they still use oversized L3 caches, so any issues with the way you configure your memory will have little impact on overall performance.
The gross assumption here is that prefetch prediction will fill the L3 cache ahead of time, or that the problem is only L3-sized. 20MB of data isn't a lot, and even less if it is being split 6 ways.
The on-die controller is there because the memory pressure that 4+ cores inflict on the memory pathway(s) is quite heavy. You don't want the memory access bandwidth shared. Just as the number of cores is increasing to bring new levels of parallelism, you also have to increase the parallelism in the I/O subsystem to go along with it. Otherwise you are merely creating chokepoints. The L3 cache isn't going to make the chokepoint go away, because it is filled from the exact same choked pathway to memory.
That is just a small band-aid on the root cause issue.
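For a rough sense of scale (the 25MB L3 and 512MB working set below are illustrative assumptions, not measurements), split the shared cache across active cores and compare each core's slice with a data set that has to come over that same choked pathway:

```c
/* Tiny arithmetic sketch of why a big L3 is still a band-aid: divide an
 * assumed 25MB shared L3 across active cores and compare each slice with
 * a hypothetical 512MB working set. Both numbers are illustrative. */
#include <stdio.h>

int main(void) {
    const double l3_mb          = 25.0;    /* assumed shared L3 size */
    const double working_set_mb = 512.0;   /* hypothetical per-job working set */
    const int core_counts[]     = {1, 2, 4, 6, 8, 12};

    for (int i = 0; i < 6; i++) {
        int cores    = core_counts[i];
        double share = l3_mb / cores;
        printf("%2d active cores: ~%.1f MB of L3 each, vs a %.0f MB working set "
               "(%.0fx larger than the cache slice)\n",
               cores, share, working_set_mb, working_set_mb / share);
    }
    return 0;
}
```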
Another point worth keeping in mind is that, in most cases, more RAM is better than faster RAM.
There is a lower limiting boundary you need to cross with faster RAM. But right... which is why it is better to have parallel paths to DIMMs (relatively cheap RAM, so you can buy more) than to just crank up the L3 (relatively very expensive RAM). Applied as a mindless rule, though, this would also put cheaper storage capacity ahead of RAM capacity. It almost always pays to have some faster RAM.
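As a ballpark illustration of why that mindless rule breaks down (the latency figures here are assumed orders of magnitude, not measurements of any specific hardware), a working set that spills out of RAM onto storage pays roughly a thousand-fold penalty per miss:

```c
/* Ballpark sketch: once an app spills out of RAM and starts paging, each
 * miss costs storage latency instead of DRAM latency. The figures are
 * rough assumed orders of magnitude, not benchmarks. */
#include <stdio.h>

int main(void) {
    const double dram_ns = 100.0;       /* assumed DRAM access, ~100 ns */
    const double ssd_ns  = 100000.0;    /* assumed SSD page-in, ~100 us */

    printf("DRAM hit : ~%.0f ns\n", dram_ns);
    printf("SSD page : ~%.0f ns (about %.0fx slower)\n", ssd_ns, ssd_ns / dram_ns);
    return 0;
}
```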
So if you need to use mismatched DIMMs to achieve your application's sweet spot, that's better than starving your apps for the sake of performance.
The real core issue folks have to let go of is the idea that they "have to" keep some subset of the old DIMMs in place. Some upgrades mean chucking all of the old stuff and starting over. Not all mismatches work well; extremely skewed mixes like 2GB and 32GB DIMMs are likely to run into pragmatic problems. There is a range between everything identically matched and matching not mattering at all.