Yeah, that's kinda what I thought too. From the benchmarks I've created and from looking over the GFX I/O stack, it seems like a similar situation to the SATA II vs. SATA III debate. In a few rare cases the slower bus speed caps performance, but in over 90% of cases it doesn't.

How about on the power issue? What say you? Would you go for the additional PSU or just use native power?

The tweaker in me would be stymied were I to use just native power over the long haul without adding another PSU, so adding one is what I've done.
 
An old kid is welcomed on a new block - Fans of CUDA and self-builders, "Rejoice!"

In post no. 531, above, I indicated that EVGA had discontinued the 2012 SR-X motherboard for E5 processors, but continues to sell the 2009 SR-2 motherboard for Westmere and Nehalem processors. The E5 CPU chips that went into the SR-X have limited tweakability, and many tweakers openly stated that they were sticking with the SR-2 for now. That all may soon change. A favorite motherboard manufacturer has announced new entries for the Sandy/Ivy Bridge dual CPU market [ http://www.legitreviews.com/news/15049/ ] and [ http://forums.legitreviews.com/about44037.html ]. Gigabyte's new entries are welcome, especially the GA-7PESH3 (supports 4-Way SLI / CrossFireX) and the GA-6PXSV4 (supports 3-Way SLI / CrossFireX). I look forward to getting my hands on the GA-7PESH3, even though I have no intention whatsoever of using either SLI (Nvidia's creation) or CrossFireX (ATI's creation).

I have Titan orphans in mind. CUDA performs best with GPUs that aren't connected by SLI. So 4 double-width 16X PCI-e 3.0 slots (from 7 individual slots total), and the other qualities inherent in Gigabyte's products, take me back into the arms of Gigabyte. Other than SuperMicro, which sells grade A dual+ CPU motherboards, that motherboard market had been left by EVGA to Intel, Tyan and Asus. Tyan and Intel haven't done anything in recent years to rekindle my interest in them, and Asus, which I consider a credible player in the single CPU motherboard market, has disappointed me and others greatly with the poor quality of its dual Sandy Bridge motherboards. I have long been a fan of Gigabyte single CPU motherboards because of their innovation, quality, durability, tweakability and adaptability. While the lack of real tweakability of the E5 chips is Intel's decision, the adaptability of Gigabyte's motherboards is reflected by the fact that most hackintoshers and fans of Linux prize(d) Gigabyte's single CPU motherboards because they were/are the easiest to install non-Windows OSes on and the most compatible for maintaining those OSes. Here's to hoping that Gigabyte hasn't lost its juice.
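
For anyone wondering what "CUDA without SLI" looks like in practice, here's a minimal sketch of my own (illustrative only, not tied to any particular app or board) showing how CUDA enumerates each card as its own device so you can hand each one independent work:

// Minimal sketch (illustrative only): CUDA addresses each card as its own
// device; no SLI bridge is involved. Compile with nvcc.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void busyKernel(float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = sqrtf((float)i);          // trivial per-thread work
}

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);                   // every card shows up separately
    printf("CUDA devices found: %d\n", count);

    const int n = 1 << 20;
    for (int dev = 0; dev < count; ++dev) {
        cudaSetDevice(dev);                       // target one card at a time
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        printf("Device %d: %s (%d SMs)\n", dev, prop.name, prop.multiProcessorCount);

        float *d_out = nullptr;
        cudaMalloc(&d_out, n * sizeof(float));
        busyKernel<<<(n + 255) / 256, 256>>>(d_out, n);   // independent work per card
        cudaDeviceSynchronize();
        cudaFree(d_out);
    }
    return 0;
}

That per-device model is why a bridge buys you nothing for CUDA work; the code simply addresses each card directly.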
 
I have Titan orphans in mind. CUDA performs best with GPUs that aren't connected by SLI. So 4 double-width 16X PCI-e 3.0 slots (from 7 individual slots total),

The SR-X already had 4 x 16 lane slots right? So is there new ground to be broken or is it just promising that the board may be friendlier to work with?
 
Can a stable, $600, EEB form factor motherboard w/4 Titans break new ground?

The SR-X already had 4 x 16 lane slots right? So is there new ground to be broken or is it just promising that the board may be friendlier to work with?
True - the SR-X motherboard already had 4 x 16 lane slots, but boy did it require a large case. Importantly, though, EVGA has discontinued the SR-X. The people I know who purchased them were not significantly impressed by their quality/workmanship, especially those who had already owned the EVGA SR-2. Even the EVGA SR-2s have their own idiosyncrasies - some of which are great and others of which are not great at all. Still, purchasers of the SR-X felt more satisfied with their EVGA purchases than did we who purchased the dual socket Asus boards.

In this light, whether there is new ground to be broken depends first on whether Gigabyte maintains the quality and compatibility that it was and is known for in its single socket boards. If so, that puts it above Asus (at the bottom) and above EVGA and Intel (which sit above Asus), and would make the dual socket Gigabyte boards on a par with those of SuperMicro. However, SuperMicro doesn't have anything in the $600 range (where the Gigabyte board is) that has slots for four Titans and comes in a form factor that doesn't require a special/custom case costing a lot more money [Gigabyte's board is EEB, rather than HPTX (EVGA SR-2) or those funky, custom form factors used by SuperMicro for which you'll pay more for the case than for the motherboard].

Moreover, you can purchase the Gigabyte board now, and from what I've read in the owner's manual, the real ground breaker could begin to appear if the actual product's bios conforms to what the manual states about the allowable settings. The Asus manual had a couple of bios settings that never made it into any of the bios releases, and those unrealized bios settings were what influenced me to buy the Asus board rather than the SR-X. So I'm hopeful that the bios settings set forth in Gigabyte's manual are what I'll confront after my board arrives and is installed. Unlike Asus and EVGA, Gigabyte makes no pretense that you'll be able to increase the base clock setting of the CPUs; but if this board is more stable than Asus' and EVGA's, and so long as it runs four Titans stably over time, it'd make me happy. Only then can new ground really be broken. Intel took away my ability to really tweak the latest Xeon CPUs for maximizing CPU performance; but what Intel taketh away, I hope that Gigabyte along with Nvidia give back to me in spades, i.e., something to tweak that is multiples more powerful, namely the GPGPU. So let the new plowing with a friendlier motherboard begin!
 
What Nvidia hath given, Nvidia has now taken away, piece by piece. Not!

If you've read my earlier posts about how to maximize Titan's double precision floating point peak performance, you, like me, might be surprised to learn that through the latest Nvidia drivers, etc., Nvidia has likely now seen fit to minimize the actual advantages of owning two or more Titans. How can that be? Because the sources from which I purchased my Titans limited me to purchasing just one of the overclocked versions at a time, I acquired mine one by one over the last month and a half. When I prepared the earlier posts about Titan's astounding performance, I was basing my observations on the Titan in WolfPackPrime0. WolfPackPrime0 is slot challenged, having only four 16x slots, and I had (and have) a PCI-e SSD and a GT 640 in two of those four slots. That was fine for me then, since I was in the acquisition stage and didn't mind being limited to one Titan.

After I got three of them (a critical mass for me), I did what I had planned to do all along, i.e., put at least three of them together in a different system. So that is what I did, and I updated my Titans' driver software with a fresh install of the latest and, I was hoping, greatest Nvidia could dish out. Then I beheld, when running CPU-Z and Precision X after making my planned tweaks, that only one of the Titan cards consistently behaved as I had planned. Occasionally a second one would briefly experience a big boost in performance, but never consistently. The third Titan never exhibited more than factory performance; the Nvidia Control Panel wouldn't even unchain the third card at all - you can select the change, but it won't take effect.

Luckily, I hadn't changed anything on WolfPackPrime0 in the interim, so I studied the changes in software versions and copied files from WolfPackPrime0 to WolfPack3 (EVGA SR-2 running Windows 7) to replace the updated files from Nvidia (and I deleted Nvidia Update and disabled any future downloads). The result: I'm happy to report that now all three Titans perform as a team, as I had planned that they would. Too bad that greed would make a company try to hobble a product line, even after purchases have been made, to attempt to drive customers who want spectacular performance from multiple purchases of a seemingly high priced product ($1k) to a higher priced product, e.g., the Tesla K20 series. My software retrograde shows that Nvidia knew how to get it right (at least for me) in the first instance. Here's to hoping that Nvidia gets it right once more.

BTW - Before the software retrograde, my three Titan cards (after I had attempted to tweak all three of them) attained a LuxMark score [ http://www.luxrender.net/luxmark/results/user/TheRealTutor ] of about 1.78 times the score attained by one fully tweaked Titan [3136 / 1760]. With the software retrograde, the three of them (all fully tweaked in the same way and to the same extent) now score very close to three times the score attained by one fully tweaked Titan [5188 / 1760 = 2.95]. Also, on LuxMark tests my three fully tweaked Titans score at least 1.7 times the scores attained by my three fully tweaked GTX 580s, except for one test where the delta was only 1.52x. LuxMark is Open CL based (no CUDA here). So the top scores will continue to go to ATI cards because they excel at Open CL chores. Too bad not many apps make good use of Open CL. If they did, that could improve the fate of ATI.
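
If you want to check my arithmetic, here's the sort of throwaway helper I could use to turn those raw scores into scaling figures (the three score values are just the ones quoted above; nothing else is measured or assumed):

// Throwaway sanity check of the multi-GPU scaling numbers quoted above.
#include <cstdio>

int main() {
    const double oneCard   = 1760.0;   // one fully tweaked Titan
    const double beforeFix = 3136.0;   // three Titans on the updated drivers
    const double afterFix  = 5188.0;   // three Titans after the software retrograde

    printf("Before: %.2fx of one card (%.0f%% of ideal 3x scaling)\n",
           beforeFix / oneCard, 100.0 * beforeFix / (3.0 * oneCard));
    printf("After:  %.2fx of one card (%.0f%% of ideal 3x scaling)\n",
           afterFix / oneCard, 100.0 * afterFix / (3.0 * oneCard));
    return 0;
}

It prints about 1.78x (59% of ideal) before the rollback and 2.95x (98% of ideal) after - which is the whole story in two lines.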

How three of my GTX 580s compare to three of my GTX Titans on CUDA chores will be tested in the not too distant future. I'm shooting for doing it within the next two to three weeks.

N.B. None of the three card ensembles were connected through SLI.
 
If you've read my earlier posts about how to maximize Titan's double precision floating point peak performance, you, like me, might be surprised to learn that through the latest Nvidia drivers, etc., Nvidia has likely now seen fit to minimize the actual advantages of owning two or more Titans.

Are you running under OS X? If so, a couple of things deserve clarification, I think:

1) There are no Mac drivers that match the Windows release that supported the GeForce GTX Titan card (i.e. release 313). The current drivers in 10.8.3 are still based on release 304, and probably don't contain all the necessary bits to truly make the Titan card work.

2) Enabling the full double-precision performance under Windows requires checking a box in the control panel. I believe this lowers the overall clock speed of the cores, since it enables all the additional double-precision processing power. AFAIK there's no way to do this under OS X, right?

To be honest, I'm waiting to see what NVIDIA will do with their next web driver. 10.8.3 has been out for a few weeks now, so hopefully we'll see a new driver shortly. I haven't been following this discussion closely so I'm not sure what kind of system you're running on, but I think it's a bit premature to claim that NVIDIA is taking away features when:

- 10.8.3 only just came out.
- The Titan cards only just came out.
- It's not an official Mac product.

It took a couple of months before the GTX 680 cards started working under OS X for example, so the fact that you can even run the card this quickly is a step in the right direction, no?

Edit: And if you're not talking about OS X, please just ignore this. To clarify, what software update are you talking about exactly?
 
Are you running under OS X?


Nope. WolfPackPrime0 runs Linux and Windows, and so does WolfPack3 (EVGA SR-2 running Windows 7). I used to run OSX on WolfPacks 3 and 4, but not any longer. Since I rely more on GPUs now, I use Windows more and more because of, among other things, better software and driver compatibility, more useful PCI-e slots and more powerful PSUs. Of course I still run OSX on my Mac Pros. OSX, especially 10.6.7, once gave Westmere and Nehalem systems a performance edge over the Windows releases of that era. With the newer CPUs the story has changed radically.


1) There are no Mac drivers that match the Windows release that supported the GeForce GTX Titan card (i.e. release 313). The current drivers in 10.8.3 are still based on release 304, and probably don't contain all the necessary bits to truly make the Titan card work.
True.

2) Enabling the full double-precision performance under Windows requires checking a box in the control panel. I believe this lowers the overall clock speed of the cores, since it enables all the additional double-precision processing power. AFAIK there's no way to do this under OS X, right?
True for the last sentence, but only partially complete as to the sentence immediately preceding it. See post no. 564, above. It describes how you can not only recover from the downclocking, but also exceed the original clock level. Add to that the greater CUDA feature set, unmatched single precision floating point peak performance and the elective increase in double precision floating point peak performance, and we're talking about big deltas.
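
If you'd like to verify for yourself whether the full-rate DP mode (and whatever clocks you've dialed in) is actually in effect on a given card, a quick way is to time a double precision FMA loop and look at the achieved GFLOPS. Here's a rough sketch of my own - not a standard benchmark, and the grid size and iteration count are just numbers picked for illustration:

// Rough sketch: estimate achieved double-precision throughput on device 0.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void dpFmaKernel(double *out, int iters) {
    double a = 1.0 + threadIdx.x * 1e-7, b = 0.5, c = 0.25;
    for (int i = 0; i < iters; ++i) {
        a = fma(a, b, c);          // each fma = 2 double-precision FLOPs
        b = fma(b, c, a);
        c = fma(c, a, b);
    }
    out[blockIdx.x * blockDim.x + threadIdx.x] = a + b + c;  // keep the work live
}

int main() {
    const int blocks = 2048, threads = 256, iters = 20000;
    double *d_out;
    cudaMalloc(&d_out, blocks * threads * sizeof(double));

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);  cudaEventCreate(&t1);
    dpFmaKernel<<<blocks, threads>>>(d_out, iters);      // warm-up launch
    cudaEventRecord(t0);
    dpFmaKernel<<<blocks, threads>>>(d_out, iters);      // timed launch
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);
    double flops = 2.0 * 3.0 * (double)iters * blocks * threads;  // 3 FMAs per iteration
    printf("Approx. DP throughput: %.0f GFLOPS\n", flops / (ms * 1e6));
    cudaFree(d_out);
    return 0;
}

Roughly speaking, with DP chained to the reduced 1/24 rate you'd expect a number in the low hundreds of GFLOPS; with the Titan unchained it should climb toward the card's double precision peak.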

To be honest, I'm waiting to see what NVIDIA will do with their next web driver. 10.8.3 has been out for a few weeks now, so hopefully we'll see a new driver shortly. I haven't been following this discussion closely so I'm not sure what kind of system you're running on, but I think it's a bit premature to claim that NVIDIA is taking away features when:

- 10.8.3 only just came out.
- The Titan cards only just came out.
- It's not an official Mac product.

It took a couple of months before the GTX 680 cards started working under OS X for example, so the fact that you can even run the card this quickly is a step in the right direction, no?

And I would completely agree with you if your original assumption - that I was running three of my Titans on a Mac Pro - were true.

Edit: And if you're not talking about OS X, please just ignore this. To clarify, what software update are you talking about exactly?

Point updates can occur if you have Nvidia Update installed (it's installed automatically), and in my case it did install point (X.xxx) updates without telling me what they were all about. I guess I should have been suspicious of the little reminders. So, in the end, I replaced the files that Nvidia had updated (mainly just .dll files, but boy did they make a big difference). I deleted Nvidia Update to disable any future, surreptitious, regressive "updates."
 
And I would completely agree with you if your original assumption - that I was running three of my Titans on a Mac Pro - were true.

Sure, my bad. This is the Mac Pro forum after all, but yes I have not read this entire thread and didn't know what kind of system you were talking about.
 
Eclecticism - Tasting all BEFORE deciding what one likes or dislikes

I'm a child of the 1950's who in 1967 first saw Captain James T. Kirk pull a device from his belt and communicate with his crew in a spacecraft hovering in outer space just above a planet where the aliens looked like us [or were we, in fact, the aliens to them?] (that could never happen in real life, could it? If not all of it, at least a part of it?), and who was taught to hide in an underground shelter in our back yard in case of nuclear attack and how to forage for food when the attack ended (that could have, but did not, happen). Seeing sci-fi touch reality in a cold war world broadened my appetite for knowledge. I'm an eclectic, geeking boomer whose tools presently consist of fully functional Mac, Power Computing, Atari, Commodore, Radio Shack, Dell, HP and other then-brand-name systems, as well as self-built systems.

Sure, my bad. This is the Mac Pro forum after all,...didn't know what kind of system you were talking about.
You made "no bad." This is, of course, the Mac Pro forum. Your question prompted me to add a clarification to my earlier post just to make things clearer about the nature of my systems. So I thank you for that reason.

Here's why I provide in the Mac Pro forum information about CPU/GPU and other related performance-improving (and performance-negating) issues covering Macs and a broad range of other computer systems. Some Mac Pro users work in mixed system environments where they and/or their co-workers may also use Linux and Windows systems. Some who own/use only Mac Pros may run other OSes because the Mac Pro is flexible enough to present other faces, such as via Windows in Boot Camp or, on the Mac side, through software such as Parallels. Moreover, some users run Mac OS on PCs, especially since 2006 when Apple dumped Motorola for Intel. Since that time, Mac hardware has become more PCish. Additionally, some who currently use only Macs may like to keep current on what Windows and Linux PCs have to offer in terms of performance so that they will be more knowledgeable about things such as what Apple can or can't offer, what might be available or unavailable for Apple to adopt and incorporate into the Mac Pro, what's there in the Windows world for Mac Pro users to stuff into their own toolboxes, or what's out there if Apple ever decides to dump the Mac Pro. So for anyone who falls into one of these categories, or into a related category that I may have missed, I have tried to consolidate into this thread coverage of everything that might be done to help us get stuff processed faster, no matter what the hardware present.

...but yes I have not read this entire thread and ...

Thanks for visiting. I hope that you return soon and often. And who knows what you might get out of the other Mac Pro threads if you keep current with future posts here, break up the prior posts in this thread into chunks small enough to taste regularly, and get fed some non-denominational Geekfood by some who simply see the Mac Pro, the HP, the Dell and other systems as tools that have more in common than some might care to admit.
 
BTW - Before the software retrograde, my three Titan cards (after I had attempted to tweak all three of them) attained a LuxMark score [ http://www.luxrender.net/luxmark/results/user/TheRealTutor ] of about 1.78 times the score attained by one fully tweaked Titan [3136 / 1760]. With the software retrograde, the three of them (all fully tweaked in the same way and to the same extent) now score very close to three times the score attained by one fully tweaked Titan [5188 / 1760 = 2.95]. Also, on LuxMark tests my three fully tweaked Titans score at least 1.7 times the scores attained by my three fully tweaked GTX 580s, except for one test where the delta was only 1.52x. LuxMark is Open CL based (no CUDA here). So the top scores will continue to go to ATI cards because they excel at Open CL chores. Too bad not many apps make good use of Open CL. If they did, that could improve the fate of ATI.

N.B. None of the three card ensembles were connected through SLI.

How can your 3 Titans get beaten by two 670s or two 680s or one 580 or one 680 or one 670 or one 660 TI or one 460 or two 550s?
Even a guy with 4 Titans was low on the list.
I'm guessing the above were extremely overclocked with liquid cooling and maybe connected with SLI?
Or even more likely, I don't understand the test.
http://www.luxrender.net/luxmark/top/top20/LuxBall HDR/GPU
 
How can your 3 Titans get beaten by two 670s or two 680s or one 580 or one 680 or one 670 or one 660 TI or one 460 or two 550s?
Even a guy with 4 Titans was low on the list.
Hello Topper,

There are mainly two technologies that harvest the tremendous computing potential of video cards (commonly referred to as graphics processing units "GPUs" or general purpose graphics processing units "GPGPUs"). One of them is CUDA ["CUDA™ is a parallel computing platform and programming model that enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU)."] The word "parallel" is key. Compute tasks (jobs) are devised and sent primarily to many different cores at once, rather than being devised to be sent serially - one core does one part, then the next core does something else to that part. So for simplicity's sake, just think of the difference between serial and parallel processing as being how a job is devised, broken up and worked on. Only Nvidia GPUs support CUDA - [ https://developer.nvidia.com/what-cuda ].
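
To make that serial vs. parallel distinction concrete, here's a toy sketch (mine, not taken from any real app; it assumes an Nvidia card and the CUDA toolkit): the CPU function walks the array one element after another, while the CUDA kernel gives every element its own thread and the hardware runs thousands of those threads at once.

// Toy illustration of serial vs. parallel processing; compile with nvcc.
#include <cstdio>
#include <cuda_runtime.h>

// Serial: one core visits the elements one after another.
void scaleSerial(float *y, const float *x, float s, int n) {
    for (int i = 0; i < n; ++i) y[i] = s * x[i];
}

// Parallel: each element is handled by its own CUDA thread.
__global__ void scaleParallel(float *y, const float *x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = s * x[i];
}

int main() {
    const int n = 1 << 20;
    float *x = new float[n], *y = new float[n];
    for (int i = 0; i < n; ++i) x[i] = (float)i;

    scaleSerial(y, x, 2.0f, n);                        // CPU baseline

    float *d_x, *d_y;
    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_y, n * sizeof(float));
    cudaMemcpy(d_x, x, n * sizeof(float), cudaMemcpyHostToDevice);
    scaleParallel<<<(n + 255) / 256, 256>>>(d_y, d_x, 2.0f, n);   // thousands of threads at once
    cudaMemcpy(y, d_y, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("last element: %f (expect %f)\n", y[n - 1], 2.0f * (n - 1));
    cudaFree(d_x); cudaFree(d_y);
    delete[] x; delete[] y;
    return 0;
}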

The other technology that harvests the tremendous computing potential of video cards is Open CL [ https://www.khronos.org/opencl/ ]. The "CL" in Open CL stands for computing language. Like CUDA, Open CL is a parallel computing platform and programming model that enables dramatic increases in computing performance by harnessing the power of the GPU. ATI's GPUs support only Open CL. Thus, ATI's GPUs such as the Radeon 5970 can handle only Open CL chores. Nvidia GPUs such as the GTX 680 can handle both CUDA and Open CL chores. Although ATI's GPUs don't support CUDA, at Open CL chores the ATI cards more often than not excel over Nvidia's cards. The problem for ATI is that there are fewer applications that take advantage of Open CL (like Lux 3d Render does) than there are applications that take advantage of CUDA (like certain apps in Adobe CS and the vast majority of 3d apps). There are many other kinds of apps that take advantage of CUDA. See, e.g., http://www.nvidia.com/docs/IO/123576/nv-applications-catalog-lowres.pdf .
I'm guessing the above were extremely overclocked with liquid cooling and maybe connected with SLI?
Or even more likely, I don't understand the test.
http://www.luxrender.net/luxmark/top/top20/LuxBall HDR/GPU
Generally, video cards are more tweakable now than current processors, the exception being the i7 CPUs. So it's possible that every test score at the LuxMark site depends on the overclocking handiwork of its owner. Open CL performance can be increased by a CrossFireX connection (ATI's GPU connection technology). CUDA performance isn't increased by SLI (Nvidia's GPU connection technology), but the Open CL performance of Nvidia's GPUs can, like that of ATI's GPUs, be increased by a multiple card connection - Nvidia calls its connectivity SLI. Both companies created their card connectivity solutions mainly to enhance Open GL performance - that's display performance (like frames per second {"fps"} for things like gaming or displaying HD video). That is, the "G" in Open GL represents graphics display potential.

Certain series in Nvidia's product lines emphasize/excel at different things. For example, the GTX 6xx line was generally designed to compete better against ATI by having much improved video display ability for gaming and video. So those cards excel at Open CL and Open GL chores. The GTX 5xx line is known for better CUDA double precision floating point peak performance than the GTX 6xx line has. See post #590, below. The Titan represents Nvidia's attempt to put the best of both into a single card, i.e., a card that developers desire for excellent compute ability and that gamers desire for fast and smooth graphics. The card isn't perfect - it's not that it beats everything else in either category - but it's highly respectable in both.

In sum, it is to be expected that ATI cards excel at LuxMark (a test of Open CL performance), and overclocking an earlier card can make it outperform a newer one. Because Open CL and Open GL performance can be enhanced by card connectivity, you need to be aware of that fact and be careful when comparing LuxMark results: card connectivity can mask the fact that what you're really seeing is many cards connected to get maximum performance, yet appearing as one to the bench testing software, as may be shown in the Device Count column. So at first blush, the results are not what they may appear to be. One way you may be able to spot this is to look at the number of units in the Device Name(s) column and compare it with the number of units shown for a bench of another single card of that variety - seeing multiples of a single card's unit count can be a dead giveaway that card connectivity is involved.
 
In sum, it is to be expected that ATI cards excel at LuxMark (a test of Open CL performance), and overclocking an earlier card can make it outperform a newer one. Because Open CL and Open GL performance can be enhanced by card connectivity, you need to be aware of that fact and be careful when comparing LuxMark results: card connectivity can mask the fact that what you're really seeing is many cards connected to get maximum performance, yet appearing as one to the bench testing software, as may be shown in the Device Count column. So at first blush, the results are not what they may appear to be. One way you may be able to spot this is to look at the number of units in the Device Name(s) column and compare it with the number of units shown for a bench of another single card of that variety - seeing multiples of a single card's unit count can be a dead giveaway that card connectivity is involved.

Interesting, thank you.
You go through an awful lot of work. I hope you get something out of it besides the satisfaction.
 
Isn't satisfaction the ultimate reason why we want money?

Interesting, thank you.
You go through an awful lot of work. I hope you get something out of it besides the satisfaction.

The satisfaction that I get from helping others is worth more to me than cash, gold, jewels, a winter home on the Côte d'Azur aka the French Riviera, etc. Also, the work that I do in the background to help others is what I sometimes refer to as Alzheimer's insurance: If I forget 10%, 20%, 30% or more, but less than all, of what you all have helped me to learn by asking me questions, how much shall I still remember? Hopefully - a whole lot. Moreover, I hope that I bring a smile to those who I assist. The hope of making others smile brings me the greatest satisfaction. All that I've earned and acquired over my almost 60 years on this side of the dirt can be taken from me at any moment. I cannot rely on any of it when my shell begins rapid decay. But the satisfaction that comes from serving others - I shall claim even long after I have left this shell behind.
 
The problem for ATI is that there are fewer applications that take advantage of Open CL (like Lux 3d Render does) than there are applications that take advantage of CUDA (like certain apps in Adobe CS and the vast majority of 3d apps).

This is changing within Adobe; they're embracing OpenCL for some of their Suite applications. The first of which was Photoshop, which started using OpenCL acceleration for certain tasks in... CS5.5 I think? Maybe it was CS6. One of the revisions of Premiere Pro CS6 made OpenCL available for certain AMD/ATI processors that were found on iMacs.

When CS[6.5|7|Next|Whatever-the-Hell-They're-calling-it] comes out in May, it's supposed to have far broader support for OpenCL acceleration on all of the AMD processors.

So they excel at Open CL and GL chores. The GTX 5xx line is known for its better CUDA compute ability than does the GTX 6xx line.

I've read that same thing here a few times, but it doesn't appear to be true. Over on the Adobe forums, a couple of guys have devised a fairly impressive benchmark for Premiere Pro. The 6XX line cards all perform significantly better than the 5XX cards do. I haven't looked into the specifics, but it's generally accepted that the 6-series cards are the ones to get for CUDA processing with Adobe apps.

jas
 
This is changing within Adobe; they're embracing OpenCL for some of their Suite applications. The first of which was Photoshop, which started using OpenCL acceleration for certain tasks in... CS5.5 I think? Maybe it was CS6. One of the revisions of Premiere Pro CS6 made OpenCL available for certain AMD/ATI processors that were found on iMacs.

Thanks for putting that information out here.

I don't invest in the stock of either Nvidia or ATI. I have 4 overclocked GTX 680s and 1 overclocked GTX 690. I also have GTX 295s, 480s, 580s, and Titans. Moreover, I have a slew of ATI cards as well, in the 49xx, 59xx, 69xx and, most recently, 79xx families. ATI cards cannot take any advantage of CUDA, and if ATI tried to do so, Nvidia would drag ATI into court because CUDA is proprietary and Nvidia owns it, whereas Open CL is, well, open to all. That's why Nvidia cards take advantage of both technologies, but ATI can take advantage of only Open CL. But if you asked me which technology would I want to reign supreme, it would, of course, be the open one, not the proprietary one.

GTX 6xx cards have much lower double precision floating point compute capability than similarly numbered GTX 5xx cards (e.g., 680 vs. 580 or 670 vs. 570). GTX 6xx cards have higher single precision floating point compute capability than similarly numbered GTX 5xx cards. The Titan is in a class all its own, being likely the best possible amalgam of the GTX 580, GTX 680 and Tesla K20, all wrapped up in one. So if you wrote an application that takes greater advantage of single precision capability, the GTX 6xx cards would be favored, and if you wrote an application that takes greater advantage of double precision capability, the GTX 5xx cards would be favored. I don't expect any application to remain static, so there will likely be change.
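
To put rough numbers on that, here's a back-of-the-envelope sketch using the usual peak-FLOPS formula (2 FLOPs per FMA x cores x clock), with core counts, clocks and DP fractions as I recall them from the published specs - treat the output as approximate, not gospel:

// Back-of-the-envelope theoretical peaks; the specs below are my rough
// recollection of the published figures and are approximate.
#include <cstdio>

struct Card { const char *name; int cores; double clockGHz; double dpFraction; };

int main() {
    Card cards[] = {
        { "GTX 580",    512, 1.544, 1.0 / 8.0  },   // Fermi, shader clock, DP at 1/8
        { "GTX 680",   1536, 1.006, 1.0 / 24.0 },   // Kepler GK104, DP at 1/24
        { "GTX Titan", 2688, 0.837, 1.0 / 3.0  },   // GK110 with full-rate DP enabled
    };
    const int numCards = sizeof(cards) / sizeof(cards[0]);
    for (int i = 0; i < numCards; ++i) {
        double sp = 2.0 * cards[i].cores * cards[i].clockGHz;   // single precision, GFLOPS
        double dp = sp * cards[i].dpFraction;                   // double precision, GFLOPS
        printf("%-10s  ~%4.0f SP GFLOPS   ~%4.0f DP GFLOPS\n", cards[i].name, sp, dp);
    }
    return 0;
}

It prints roughly 1.6 TFLOPS SP / 200 GFLOPS DP for the 580, 3.1 TFLOPS SP / 130 GFLOPS DP for the 680, and 4.5 TFLOPS SP / 1.5 TFLOPS DP for a Titan with full-rate DP enabled (a bit less in practice once the DP clock drop kicks in) - which is the whole 5xx vs. 6xx vs. Titan story in three lines.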

When CS[6.5|7|Next|Whatever-the-Hell-They're-calling-it] comes out in May, it's supposed to have far broader support for OpenCL acceleration on all of the AMD processors.

Thanks for putting that information out here. I hope that this Adobe release comes out on time.

I've read that same thing here a few times, but it doesn't appear to be true. Over on the Adobe forums, a couple of guys have devised a fairly impressive benchmark for Premiere Pro. The 6XX line cards all perform significantly better than the 5XX cards do. I haven't looked into the specifics, but it's generally accepted that the 6-series cards are the ones to get for CUDA processing with Adobe apps.

First, does the 6xx cards' advantage over the 5xx cards in Premiere spring from what you pointed out above, namely: "One of the revisions of Premiere Pro CS6 made OpenCL available for certain AMD/ATI processors that were found on iMacs"? Remember that Nvidia cards do both CUDA and Open CL. In other words, you could be seeing that the 6xx's have an advantage because their higher single precision floating point peak expresses itself as better Open CL performance.

Secondly, I always try to pick the right tool for the job at hand. I don't use a hammer where a screw needs to be set. If a card is optimized for single precision applications and has a higher single precision floating point peak performance value than another card, then the card with the higher value will most likely get its computations for a single precision application done faster. The Tesla K10 is such a card. The Tesla K20 is designed to be the leader in double precision applications. Here's what Nvidia recommends the Tesla K10 (the high end single precision wonder) for: among other things, signal, image and video processing and video analytics. And that's what I have deployed my GTX 680s and 690 for, because that is what they do best.

My statement which you quoted earlier - "The problem for ATI is that there are fewer applications that take advantage of Open CL (like Lux 3d Render does) than there are applications that take advantage of CUDA (like certain apps in Adobe CS and the vast majority of 3d apps)." - was intended to show the breadth of CUDA's applicability, not as an assessment that the GTX 6xx series performs worse at CS6 chores than the GTX 5xx series, especially since CS6 also takes advantage of Open GL. Along that same line, I have dedicated my 3 ATI 5970s to video production tasks because they excel at that.

I doubt that many Mac users have the patience to get more than one high end video card installed in their systems, or care to go through what it takes to install an additional high end card - so the issue when upgrading the GPU in a Mac Pro usually is which one high end card to get. If one of them asked me, "Should I get a GTX 6xx or a GTX 5xx because I can't afford that Titan?," I'd ask them, "What do you intend to use it for?" If they responded, "Only for video production," then I'd say, "Go with the 6xx series." However, if they responded with "Animation/3d and video production," "Animation/3d," "Computational physics," "Biochemistry simulations," "Computational finance" or other double precision applications, then I'd respond, "Go with the 5xx series because it's known for better CUDA double precision floating point peak performance than the GTX 6xx line."

Again, thanks jas. I know that at least one person is reading what I write. You have shown me that I could have been more precise by using "greater double precision floating point peak performance" instead of just the words "compute ability," and you made me flesh out some important distinctions. So I'll make this correction: "The GTX 5xx line is known for better CUDA double precision floating point peak performance than the GTX 6xx line has. See post #590, below."

P.S. I'll update this with some pics to show you the numbers supporting what I'm talking about.
 
But if you asked me which technology would I want to reign supreme, it would, of course, be the open one, not the proprietary one.

I'd agree that it should reign supreme. Right now, at least on Macs with nVidia cards and Adobe software (highly restrictive, I know) it's not the case. CUDA performs better on the nVidia cards, it appears, than OpenCL on the nVidias or the ATIs. These are very specific use cases though, so as usual, your mileage may vary, don't try this at home, etc, etc.

First, does the 6xx cards' advantage over the 5xx cards in Premiere spring from what you pointed out above, namely: "One of the revisions of Premiere Pro CS6 made OpenCL available for certain AMD/ATI processors that were found on iMacs"?

Nope. As of this writing, with CS6.0.x, Adobe only supports using OpenCL on very specific ATI/AMD processors as found in the iMacs. They have NO official support for the 500-series or 600-series nVidia cards (CUDA or OpenCL) on the Mac. At all. The benchmarks I was referring to were run on Windows rigs specifically built for video editing. On Windows, OpenCL and ATI/AMD isn't even an option as far as Adobe is concerned. Only CUDA with nVidias.

So if you followed that ridiculously-written paragraph, you'll understand: it was CUDA processing that was tested and it appears as though the 600-series cards are better at it. At least as far as Premiere Pro is concerned.

jas
 
jas, Thanks again for this update. Please visit often and shed your light.

I'd agree that it should reign supreme. Right now, at least on Macs with nVidia cards and Adobe software (highly restrictive, I know) it's not the case. CUDA performs better on the nVidia cards, it appears, than OpenCL on the nVidias or the ATIs. These are very specific use cases though, so as usual, your mileage may vary, don't try this at home, etc, etc.



Nope. As of this writing, with CS6.0.x, Adobe only supports using OpenCL on very specific ATI/AMD processors as found in the iMacs. They have NO official support for the 500-series or 600-series nVidia cards (CUDA or OpenCL) on the Mac. At all. The benchmarks I was referring to were run on Windows rigs specifically built for video editing. On Windows, OpenCL and ATI/AMD isn't even an option as far as Adobe is concerned. Only CUDA with nVidias.

So if you followed that ridiculously-written paragraph, you'll understand: it was CUDA processing that was tested and it appears as though the 600-series cards are better at it. At least as far as Premiere Pro is concerned.

jas

Then this just confirms what I had read on Nvidia's website about how to choose the right Tesla card, and it shows that the same guidance also applies - just as I had conjectured before last reconfiguring my systems to maximize GPU performance - down to the GTX line. So for applications that involve mainly signal, image and/or video processing or video analytics chores, GPUs with the highest single precision floating point peak performance (currently the Titan and then the 6xx series) tend to perform better.
 
Suggestions needed.

... CUDA performs better on the nVidia cards, it appears, than OpenCL on the nVidias or the ATIs. These are very specific use cases though, so as usual, your mileage may vary, don't try this at home, etc, etc. ...
jas

I've been recently following an exceptional, specific use: Lux renderer. If you visit the LuxMark site [ http://www.luxrender.net/luxmark/ ], you'll see (a) a clear tendency for Open CL cards, especially the ATI/AMD variety, to blow the socks off of Nvidia's cards*/ and (b) that the GTX line crushes the Tesla line when it comes to Open CL rendering in that app. Lux renderer is solely Open CL. I plan to explore how CUDA vs. Open CL perform on Nvidia cards. What application(s) should I use as a base?

*/ - What we are seeing may be just a manifestation of human nature, namely, that the only child appears to get special treatment because the parent(s) have no other child upon which to bestow affection (likening ATI/AMD to the parent and Open CL as implemented by ATI/AMD to the child); but when the parents have two children to care for - one by blood and the other adopted - one of the two may be favored, though both children may be jealous of the only child (likening Nvidia to the parent, CUDA to the child by blood and Open CL as implemented by Nvidia to the adopted child). In sum, ATI/AMD has but one child on which to concentrate, and AMD knows that it had better make sure that one child shines.
 
I plan to explore how CUDA vs. Open CL perform on Nvidia cards. What application(s) should I use as a base?

The only app I specifically use that can do both is Premiere Pro. With a very simple text file edit, you can get PPro to use OpenCL with any of the available ATI/AMD cards, as well as any of the available GTX cards (assuming OS X). Another simple text file edit will open up any available GTX card to CUDA processing on PPro.

As I'd mentioned, CS6 is Adobe's first foray into OpenCL for PPro (and again, only on OS X). It's not quite as optimal as their CUDA code, so I'd expect the nVidias to outperform the ATI/AMD cards pretty handily.

I doubt that answered your question; I'm unaware of a benchmark app that can switch back and forth between CUDA and OpenCL for GPUs. Were you to use PPro, I'd suggest waiting until May when they launch/hatch their next version.

jas
 
Working on a 4,1 Upgrade and have a few questions...

Apologies if either of these has been answered somewhere, but I searched for the last day or so and either my google fu is lacking, or I'm not searching for the right thing, and since this is the only thread I found with X5679's floating around... here goes.

Has anyone successfully put 2x X5679's into a 4,1 Mac Pro Firmware Upgraded to 5,1?

In theory I see no reason why it shouldn't work. They're essentially X5680's with 1x lower multipliers, I'm just not sure of the following...

  • Is the X5679 microcode in the firmware or not?
  • If not, or if it's a partial match (i.e. just says generic X5600), will it affect fan speeds, etc.?

I'm also really confused on a stupid nuanced RAM thing. Assuming the ram is DDR3 ECC Reg 10600, 9-9-9 timings, has a thermal sensor, will it work?

Basically I see a lot of ram that satisfies those conditions that is significantly cheaper than the "Apple certified" stuff.

I just want to know if it will all work so long as it meets those conditions, or is it a trial and error hit and miss type thing?

If so, does anyone maintain a list of confirmed positives anywhere?

Thanks for the help guys.
 
Any ECC ram at the proper speed should work, so long as it meets the voltage and speed requirements. "Apple Certified" ram generally is going to have a temperature sensor that can be read by OSX, but other ram should work just fine.
 
Working on a 4,1 Upgrade and have a few questions...

Apologies if either of these has been answered somewhere, but I searched for the last day or so and either my google fu is lacking, or I'm not searching for the right thing, and since this is the only thread I found with X5679's floating around... here goes.

Has anyone successfully put 2x X5679's into a 4,1 Mac Pro Firmware Upgraded to 5,1?
Not that I recall; but they have used 5675s, 5680s and 5690s (and I'd bet that those are not individually covered in Apple's firmware). Some who build Hackintoshes use 5679s [ http://browser.primatelabs.com/geekbench2/search?page=1&q=X5679 ].

In theory I see no reason why it shouldn't work. They're essentially X5680's with 1x lower multipliers, I'm just not sure of the following...


  • Is the X5679 microcode in the firmware or not?

Probably not.

  • If not, or if it's a partial match (i.e. just says generic X5600), will it affect fan speeds, etc.?

This ["... it's a partial match (i.e. Just says generic X5600)"] appears to be more likely the case, since released steppings of 5675s, 5680s and 5690s work fine.

I'm aware of cases where a few individuals have had to reset SMC by doing as instructed here: [ http://support.apple.com/kb/ht3964?viewlocale=de_de ] (but I recommend doing it in all cases) and/or use SMC fan control [ http://www.macupdate.com/app/mac/23049/smcfancontrol ] (but I recommend using it in all cases).

I'm also really confused on a stupid nuanced RAM thing. Assuming the ram is DDR3 ECC Reg 10600, 9-9-9 timings, has a thermal sensor, will it work?

It should.

Basically I see a lot of ram that satisfies those conditions that is significantly cheaper than the "Apple certified" stuff.

I just want to know if it will all work so long as it meets those conditions, or is it a trial and error hit and miss type thing?

Most often it will work, but for a few it's been a little "trial and error hit and miss type thing."

If so, does anyone maintain a list of confirmed positives anywhere?... .

Not that I'm aware of.
 
News of more Titan developments from other forum members

Thanks to the resourcefulness of MacRumors regular Garamond [ https://forums.macrumors.com/threads/1572951/ ], and to the resourcefulness of rampagedev, who has done some testing, we now know how the Titan can score in Unigine Heaven Benchmark 4.0, that the Titan supports up to 4 monitors, and that after installing CUDA, full CUDA support exists in CUDA supported applications [ http://rampagedev.wordpress.com/2013/04/07/nvidia-titian-for-mac/ ].

Thanks to the resourcefulness of MacRumors regular xcodeSyn [ https://forums.macrumors.com/threads/1572951/ ], we see that there may soon be a slightly scaled down version of the Titan (the Titan LE) with 5 GB of GDDR5, a 320-bit memory bus, and slightly fewer cores, for a price of around $659 - $759 [ http://wccftech.com/nvidia-geforce-gtx-titan-le-spotted-5-gb-gddr5-320-bit-memory/ ].
 
If you're looking for a home for many Titans, but you're not tied to the Mac, then -

Give a look to this house if you need CUDA big time for heavy 3d work: [ http://www.supermicro.com/products/system/4U/7047/SYS-7047GR-TRF.cfm ]. It costs under $1700 and allows you to add up to 4 Titans, up to 512GB of DDR3-1600MHz ECC Registered DIMMs in 16 DIMM sockets, two Sandy Bridge E5-2600s, and, as a fifth video card, at a minimum something like the half-size EVGA 04G-P4-2647-KR GeForce GT 640 4GB (around $110 at Newegg) [ http://www.newegg.com/Product/Product.aspx?Item=N82E16814130818 ] for interactivity while scene building, animation creation, or scene/animation modification while the Titans are rendering.
 