As I wrote in another thread: using all four RAM slots slows down the other three sticks as well, because the memory drops from the faster triple-channel mode to dual-channel mode, which means a 25% lower bus clock frequency.
Not saying that apps always run slower with 4/8 sticks instead of 3/6; it depends.


Just to be certain.....

Does the fourth "automatically" and always slow the memory down or just when the memory usage tops the first three memory sticks?
 
hi. thanks for that link. but my results seem to be in line with that link. i just didn't post the 1x cpu, only the multi cpu.

Oh, those are your own benchmarks?

Kewl. Since they were different systems I thought you were culling them from here or something. Also I have a hunch that even tho the 2.8 and the 2.26 are almost the same in my graph that we're looking at the slowest BM possible for the 2.8 vs. the highest possible in the 2.26's case. I used only the highest numbers to assemble that graph but there was only one 2.8 entry as I recall. I also have a hunch that the CineBench biases the 2.26 favorably in some way. I'm pretty sure the 2.8 will be much faster at most (95%?) things.
 
Just to be certain.....

Does the fourth "automatically" and always slow the memory down or just when the memory usage tops the first three memory sticks?

If this helps... one way to think of multi-channel interleaved memory is the same way you think of striped RAID0 drive arrays.

To use this analogy on the memory configuration in the MP, the first three sticks are interleaved (a striped array)... any data that is pulled from this region of memory will gain the benefits of the interleaving (striping). Thus pulling a dataset from the interleaved (striped) range of memory will result in an effective increase of 3x the data bandwidth of non-interleaved (non-striped) memory.

The last stick is not interleaved with any other stick (in our RAID0 analogy, it's like an extra drive all on its own). Thus any data pulled from this non-interleaved stick will only have an effective data bandwidth equal to single channel performance.

Any test that utilizes all four sticks of RAM will appear slower than any test that only uses the three interleaved sticks because the performance of the non-interleaved stick will drag the overall performance down... it's not that the others are performing less. To imply otherwise is to imply that the memory controller can somehow merge the first two channels into one... this makes no sense... the traces on the board from each of the first three DIMMs back to the memory controller are fixed and each feeds an independent channel on the memory controller. While you could argue that the memory controller somehow fails to interleave data across all three channels, there's no reason for this. I believe the observed memory performance with 4 sticks populated is simply the result of the single-channel performance of the last stick dragging the overall performance down.
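
To put rough numbers on the RAID0 analogy, here is a minimal Python sketch. It assumes four equal-sized sticks, a made-up bandwidth of 1.0 unit per channel, and a test that reads every byte once. The function name and the figures are illustrative only, not measurements from a Mac Pro.

```python
def effective_bandwidth(per_channel, interleaved_size, single_size, channels=3):
    """Average bandwidth for one sequential pass over both memory regions.

    per_channel      -- bandwidth of one memory channel (arbitrary units)
    interleaved_size -- amount of data read from the interleaved region
    single_size      -- amount of data read from the lone, non-interleaved stick
    """
    # The interleaved region is served by all channels at once.
    time_interleaved = interleaved_size / (per_channel * channels)
    # The fourth stick is served by a single channel.
    time_single = single_size / per_channel
    return (interleaved_size + single_size) / (time_interleaved + time_single)


# Hypothetical case: each stick holds 1.0 unit of data.
three_sticks = effective_bandwidth(1.0, 3.0, 0.0)  # only the interleaved region
four_sticks = effective_bandwidth(1.0, 3.0, 1.0)   # test touches all four sticks

print(three_sticks, four_sticks)
```

Under these assumptions the three-stick case averages 3x single-channel bandwidth and the four-stick case averages 2x, even though the first three sticks are exactly as fast as before.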
 
A nice description! This is exactly how I assume it works and I can't see it being any other way.
 
OK Everyone, good news. GREAT NEWS. The upgrade from 2.26 to 2.93 works like a charm. my last failure was due to my own carelessness and pin bending. i am going to upgrade the memory to 1333 MHz next and run some benchmarks. just need to get that strip of MOSFETs covered somehow.

anybody got some good benchmarks to try out?

That's great!

A short step by step tutorial would be nice. :D
 
He basically already did that. We just have to read through it.

  1. Buy 2009 2.26 Mac
  2. Buy the 2.93 replacement proc. (PGA !!!)
  3. open the Mac
  4. Remove the heat sink (be careful of the CPU sticking to the bottom of the sink and maybe dropping off).
  5. Clean both surfaces and reapply thermal paste
  6. Insert new CPU.
  7. Replace the heat sink and put back the daughter card.
  8. Close up and connect up the Mac
  9. Turn it on.
Hehehe He can modify that if there's anything left out. :D
 
Oh, those are your own benchmarks?

Kewl. Since they were different systems I thought you were culling them from here or something. Also I have a hunch that even tho the 2.8 and the 2.26 are almost the same in my graph that we're looking at the slowest BM possible for the 2.8 vs. the highest possible in the 2.26's case. I used only the highest numbers to assemble that graph but there was only one 2.8 entry as I recall. I also have a hunch that the CineBench biases the 2.26 favorably in some way. I'm pretty sure the 2.8 will be much faster at most (95%?) things.

Tesselator, yes those were my benchmarks. i had a variation of 5% on a few tests in a row. i just rounded off.

I'm sure the new architecture is faster. maybe cinebench doesn't test things which could be faster with the Nehalem yet. Maybe their script can be remade with some new instructions that take advantage of the new architecture. then again, the whole point of a benchmark is for one popular test to be exactly the same on different machines.

The only thing that really bothers me about the new mac pro is the price. i think the 2.26 with 6GB should be around $2500 not $3300. there isn't that much in it to justify the price hike. plus, the video card isn't superb, so it has to be upgraded too, in addition to drives, etc. just my opinion :)
 
He basically already did that. We just have to read through it.

  1. Buy 2009 2.26 Mac
  2. Buy the 2.93 replacement proc. (PGA !!!)
  3. open the Mac
  4. Remove the heat sink (be careful of the CPU sticking to the bottom of the sink and maybe dropping off).
  5. Clean both surfaces and reapply thermal paste
  6. Insert new CPU.
  7. Replace the heat sink and put back the daughter card.
  8. Close up and connect up the Mac
  9. Turn it on.
Hehehe He can modify that if there's anything left out. :D

I did a CPU upgrade for my old PC last year. It also involved things like upgrading the BIOS, changing BIOS settings, and fitting a new and better CPU fan.

The steps you described involved nothing like that. It sounds pretty much like a piece of cake. :)

But, what do you mean by PGA? And also do we know if the 3.2GHz CPU might work as well without extra cooling measures, or would that be out of the question already?
 
Also supporting diminishing returns...

Multi-threaded speedup:

2.66 quad - 413%
2.93 quad - 376%

2.26 octo - 641%
2.66 octo - 640%
2.93 octo - 629%

Faster you go, less of a benefit.
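
One way to read those percentages is as speedup per physical core. Below is a small Python sketch that uses only the figures quoted above and assumes 4 physical cores for the quads and 8 for the octos. Hyper-Threading is the likely reason a quad can come out above 1.0x per core.

```python
# Multi-threaded speedup figures quoted above, as percentages.
speedup_pct = {
    ("quad", 2.66): 413,
    ("quad", 2.93): 376,
    ("octo", 2.26): 641,
    ("octo", 2.66): 640,
    ("octo", 2.93): 629,
}

# Assumption: physical core counts, not logical (Hyper-Threading) threads.
physical_cores = {"quad": 4, "octo": 8}

# Speedup expressed as a multiple, divided by the number of physical cores.
per_core = {
    key: pct / 100 / physical_cores[key[0]]
    for key, pct in speedup_pct.items()
}

for (model, ghz), value in per_core.items():
    print(f"{ghz:.2f} {model}: {value:.2f}x per physical core")
```

Within each model the per-core figure drops as the clock speed rises, which is the diminishing return described above.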

Cinebench10_Numbers.jpg

All my Mac Pro lust is now gone. I actually like Vista ;)

CINEBENCH R10
****************************************************

Tester :

Processor : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
MHz : 3800
Number of CPUs : 8
Operating System : WINDOWS 64 BIT 6.0.6001

Graphics Card : GeForce GTX 275/PCI/SSE2
Resolution : <fill this out>
Color Depth : <fill this out>

****************************************************

Rendering (Single CPU): 5352 CB-CPU
Rendering (Multiple CPU): 21949 CB-CPU

Multiprocessor Speedup: 4.02

Shading (OpenGL Standard) : 6800 CB-GFX


****************************************************
 
Umm... Thanks?
 
Heat Sink / Heatspreader?

yes it can be upgraded. if anyone here decides to upgrade cpu's, be really careful. the original cpu's are NOT clamped down and when you remove the heatsink, the CPUs hang on to the heatsink for a second, then drop down onto the board and in my case bent a pin or two. so be really careful.

at least we know that all the dual socket versions 2.26 and up are all the same machine.

also, if you go 2.66 or higher, you should switch to 1333MHz memory to take advantage of the speed.

here's a pic with 1 of my 2.93's installed. 2nd socket is empty. notice the QPI 6.4 GT/s (QPI stands for QuickPath Interconnect). the 2.26 QPI is 5.86 GT/s.

What did you do with regard to the heat sink / heatspreader issue? Or has that already been addressed and I completely missed it? Last I understood, the retail processors come with the heatspreaders glued on, and are too big for the heat sink in the '09 MP to fit over top. How did you resolve this?
 
In addition to the last question posted, I would also like to know if you were ever able to confirm that 1333MHz non-ECC memory will both work and get recognised at 1333MHz speeds by OS X.
 
I did a CPU upgrade for my old PC last year. It also involved things like upgrading the BIOS, changing BIOS settings, and fitting a new and better CPU fan.

The steps you described involved nothing like that. It sounds pretty much like a piece of cake. :)

But, what do you mean by PGA? And also do we know if the 3.2GHz CPU might work as well without extra cooling measures, or would that be out of the question already?

PGA = Pin Grid Array. It means these processors have pins again. ;)

And yeah as for the BIOS and fans and stuff it's a Mac so none of that nonsense. :D I suppose you could switch over to active cooling if you wanted to tho. Hmmm.

3.2GHz Xeon cooling will be on Intel's site. If it's like the '06 and '08 Macs tho it will be the same. Xeons are great that way. All my Xeon machines so far (for the past 15 years or more) have been passively cooled - even in ancient 4-processor quad systems.


@ davewolfs,
Something is very wrong with your test results. (Oh, NM, that's Windows 64-bit. Yes, I see. :) )
 
I'm curious...how long does it take for your new Mac Pro to boot
(from the moment of pushing the power button til the dock appears)?
 
1333MHz memory on 2.66/2.93 Octos ??

According to documentation 1333MHz memory will only work if you only use the first slot on each memory channel (i.e. more than 3 DIMMs per processor means no 1333MHz speed, probably why Apple didn't bother with it). If 1333MHz works it would be nice if you could just confirm this for the sake of completeness if possible. Thanks.

Has anyone got memory working at 1333MHz ? (I've searched the threads and they seem to peter out.) Given that the memory is connected directly to the processor, I would have thought that provided it is three channel (ie 3 sticks per processor) and only one stick per channel it should automatically run at the correct speed.
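
For a sense of what is at stake, here is a back-of-the-envelope Python sketch of theoretical peak bandwidth. It assumes the standard 64-bit (8-byte) DDR3 channel and uses the nominal 1066 and 1333 transfer rates. Real throughput will be lower, and whether the Mac Pro actually runs DIMMs at 1333 is exactly the open question here.

```python
BYTES_PER_TRANSFER = 8  # a DDR3 channel is 64 bits wide


def peak_mb_per_s(mega_transfers_per_s, channels=1):
    """Theoretical peak bandwidth in MB/s for the given number of channels."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * channels


for rate in (1066, 1333):
    single = peak_mb_per_s(rate)
    triple = peak_mb_per_s(rate, channels=3)
    print(f"DDR3-{rate}: {single} MB/s per channel, {triple} MB/s triple channel")

# Relative gain from 1066 to 1333, independent of channel count.
gain = peak_mb_per_s(1333) / peak_mb_per_s(1066)
print(f"gain: {gain:.2f}x")
```

So on paper the faster memory is worth about 25% more bandwidth per channel, provided the one-stick-per-channel condition holds.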

Similarly it would be nice to know if registered memory will run. Given that everyone other than Apple seem to allow both registered and unregistered and the various speeds (800/1066/1333) and they all use the same chip set(s) I would have thought it would all be ok in the Apple too.

The trouble is it is rather expensive to experiment, and I don't have a Mac Pro anyway (part of my interest in the question is to try and decide whether to get a Mac Pro or a Dell or build something myself).
 
I'm curious...how long does it take for your new Mac Pro to boot
(from the moment of pushing the power button til the dock appears)?

I can tell you mine. It's not a new MP tho; it's a 2006 2.66 GHz.

When my machine was new it took 16 seconds from power button push to desktop display - In OS 10.4.x. With only 1 HDD, 4 total cores (two X5150 xeons), and 2GB RAM.

Now in OS 10.5.6 with 8-cores (X5355 Xeons), 9 USB devices, 4 total HDDs (3 are in RAID 0), two 24" LCDs, 12 GB of RAM, and a whole butt-load of start-up apps, it takes 43 seconds until I can click on something on the desktop and have it react. Now it takes 16 seconds just for the Mac Bong sound to go off.

I think if I disconnected all the USB stuff, went back to a single HDD, and removed all the start-up programs then it would return to the 16 second (or so) startup time.

Again tho, this is a 2006 model and not one of the new ones.
 
What did you do with regard to the heat sink / heatspreader issue? Or has that already been addressed and I completely missed it? Last I understood, the retail processors come with the heatspreaders glued on, and are too big for the heat sink in the '09 MP to fit over top. How did you resolve this?

Indeed too bad this question remains unanswered. The pictures on page 25 do show a great difference between the CPU's. A problem is described but the answer didn't come up in the thread.

Tesselator said:
PGA = Pin Grid Array. It means these processors have pins again. ;)

You mean the board has pins, right? The CPU part doesn't seem to have pins according to some pictures a few pages back.
 
Indeed too bad this question remains unanswered. The pictures on page 25 do show a great difference between the CPU's. A problem is described but the answer didn't come up in the thread.

It could be why it never booted. Looking at the pics and the heat sink retention mechanism, there's just no way you can secure the heat sink on a CPU that's twice as thick (or more). If it was secured using screws, you could just use longer screws, but this is some sort of fixed length push-pin setup.
 
You mean the board has pins, right? The CPU part doesn't seem to have pins according to some pictures a few pages back.

Well, there's two kinds. There's PGA and LGA. Here's some pics of the LGA type:

0d9dd0e1b3de45fff168ff91f4b5324e.jpg
landgridnehalem.jpg


In the case of LGA procs the socket has the pins. Like so:
3.jpg
LGA, by the way, stands for Land Grid Array, and I guess the "Land" is as in landing pads.


And here's some pics of the PGA type that Apple seems to be using in their 2009 models:

dunnington_06_sm.jpg
intel-dunnington-600.jpg


While I can't seem to find better images on-line here you can clearly see the CPU package
itself has the pins on.

And I guess we all know what a ZIF socket for PGA types look like ya? I think they're probably
available in surface mounted form factor as well - probably BGA for Ball Grid Array. :D


.
 
It could be why it never booted. Looking at the pics and the heat sink retention mechanism, there's just no way you can secure the heat sink on a CPU that's twice as thick (or more). If it was secured using screws, you could just use longer screws, but this is some sort of fixed length push-pin setup.

It did boot, it booted just fine after he fixed a pin he bent when originally removing the heat sinks...
 
I thought he could only get it to boot after fixing the bent pin AND putting the stock processor back in. But it's possible I missed something. At any rate, I'm shocked if it worked... and I really doubt it would work for long.
 
I thought he could only get it to boot after fixing the bent pin AND putting the stock processor back in. But it's possible I missed something. At any rate, I'm shocked if it worked... and I really doubt it would work for long.

yes it can be upgraded. if anyone here decides to upgrade cpu's, be really careful. the original cpu's are NOT clamped down and when you remove the heatsink, the CPUs hang on to the heatsink for a second, then drop down onto the board and in my case bent a pin or two. so be really careful.

at least we know that all the dual socket versions 2.26 and up are all the same machine.

also, if you go 2.66 or higher, you should switch to 1333MHz memory to take advantage of the speed.

here's a pic with 1 of my 2.93's installed. 2nd socket is empty. notice the QPI 6.4 GT/s (QPI stands for QuickPath Interconnect). the 2.26 QPI is 5.86 GT/s.

He did have success. Would be nice if we could get an update...? ;)
 
Thought I'd post some pictures of my new Mac Pro Quad in this thread since it is for unboxing. I'm a bit disappointed there wasn't an outer cardboard box like I got with my iMac

3515873644_98b5174041_b.jpg


3515874338_1bd3aa9d05_b.jpg


3515875014_57d4f2ec79_b.jpg


3515066287_3b6a3c7610_b.jpg


3515066825_d27fbe2236_b.jpg
 
I thought he could only get it to boot after fixing the bent pin AND putting the stock processor back in. But it's possible I missed something. At any rate, I'm shocked if it worked... and I really doubt it would work for long.

Yeah, you got it quite wrong.

It worked and has been working ever since. The only reason it didn't work was that he had a bent pin. Once that was discovered he unbent it and the system recognized the processor and everything. Read back through the thread. It's all in there.

These systems ARE upgradable to any compatible processor just like all prior Mac Pros.
 
Hope these show up. Nobody yell if they don't.

Couple more pics.

Do the octo's have pins? I don't see them on the quad.

i meant the pins on the board. problem is, the 2.26 is only being held by a small amount of cpu compound, so when i removed it, the cpu popped off and bent one pin on the board, but i fixed it and everything's back to normal.

yup, this is completely different. look at your picture and the CPU lockdown lever. look at my picture, the 2.26 has no lever:

2.26: [attached image]

2.66: [attached image]


Well, there's two kinds. There's PGA and LGA. Here's some pics of the LGA type:

0d9dd0e1b3de45fff168ff91f4b5324e.jpg
landgridnehalem.jpg


In the case of LGA procs the socket has the pins. Like so:
3.jpg
LGA, by the way, stands for Land Grid Array, and I guess the "Land" is as in landing pads.


And here's some pics of the PGA type that Apple seems to be using in their 2009 models:

dunnington_06_sm.jpg
intel-dunnington-600.jpg


While I can't seem to find better images on-line here you can clearly see the CPU package
itself has the pins on.

And I guess we all know what a ZIF socket for PGA types look like ya? I think they're probably
available in surface mounted form factor as well - probably BGA for Ball Grid Array. :D


.

Still, according to the posts and pics of raziel777 and unethical (see page 44 for raziel777's pics), it seems they agree the pins are on the board, not on the chips. Also, the CPUs look like the ones you posted and described as LGA. That means the 2009 Mac Pro uses LGA CPUs, right?
 