Repurposing old ATI video cards - I want apps that also take advantage of OCL 1.0
At one time (in the not too distant past) I favored ATI video cards solely for their OpenGL [http://en.wikipedia.org/wiki/OpenGL] display rendering ability. Although I also owned Nvidia video cards back then (such as GTX 295s and GTX 480s), they were no match for the OpenGL ability of my overclocked 4890s and 5970s, especially in my multi-OS systems. Not until PunkNugget and others helped to get me really interested in CUDA [http://en.wikipedia.org/wiki/CUDA] compute capability did I pull my Nvidia cards out of storage and learn about CUDA for my parallel compute needs.
Regardless of what anyone thinks about the recently announced Mac Pro, or what one might think that I think of it, for selfish monetary reasons I truly hope that it, as well as all things Apple, is a booming success. Moreover, its inclusion of ATI FirePro GPUs could give OpenCL [http://en.wikipedia.org/wiki/OpenCL] compute development a much needed shot in the arm, beneficial to both OSX and Windows systems. Although both Nvidia and ATI video cards support OpenCL, ATI cards do not support CUDA. However, in addition to ATI cards' excellence at OpenGL, they are also excellent at OpenCL.
After thinking about how I could leverage what I already own in light of Apple's rMP announcement, I reviewed some of my "old" video cards [Radeon 4890s and Radeon HD 5970s] using information contained here:
http://en.wikipedia.org/wiki/Compar...essing_units#Radeon_R700_.28HD_4xxx.29_Series . This is how I evaluated those cards for potential future repurposing:
Model - Radeon HD 4890 vs Radeon HD 5970
Launch - Apr 2, 2009 vs Nov 18, 2009
Code name - RV790 XT vs Hemlock XT
Fab (nm) - 55 vs 40
Transistors (Million) - 959 vs 2154x2
Die Size (mm²) - 282 vs 334x2
Bus interface - PCIe 2.0 x16 vs PCIe 2.1 x16
Memory (MiB) - 1024 / 2048 vs 1024x2; 2048x2
Clock rate: Core (MHz) - 850 / Memory (MHz) - 975 vs 725 - 725 / 1000 - 1000
Config core (unified shaders : texture mapping units : render output units) - 800:40:16 vs 1600:80:32 x2
Fillrate: Pixel (GP/s) - 13.6 / Texture (GT/s) - 34 vs 46.4 / 116.0
Memory Bandwidth (GB/s) - 124.8 / Bus type - GDDR5 / Bus width (bit) - 256 vs 128x2 / GDDR5 / 256x2
GFLOPS (Single-precision) - 1360 vs 4640
TDP (W) - 190 vs Idle - 51 Max. 294
GFLOPS/W (Single-precision) - 7.16 vs 15.78
GFLOPS (Double-precision) - 272 [x3 = 816; +10% OC = 897.6] vs 928 [x3 = 2,784; +10% OC = 3,062.4]
5970 vs. 4890 double precision: 928 / 272 = 3.41
5970 vs. 4890 single precision: 4640 / 1360 = 3.41
So 3 Radeon 5970s have the peak compute capability of about 10.23 Radeon 4890s (3 x 3.41).
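For anyone who wants to redo this arithmetic with other cards, here is a minimal Python sketch of the calculations above. The GFLOPS and TDP figures are copied from the comparison; the function names and the dictionary layout are my own illustration, and the three-card totals assume that peak throughput scales linearly with card count and with a 10% overclock, which real workloads rarely achieve.

```python
# Peak figures copied from the comparison above (GFLOPS and watts).
CARDS = {
    "HD 4890": {"sp_gflops": 1360, "dp_gflops": 272, "tdp_w": 190},
    "HD 5970": {"sp_gflops": 4640, "dp_gflops": 928, "tdp_w": 294},
}


def gflops_per_watt(card):
    """Single-precision peak GFLOPS divided by maximum TDP."""
    return card["sp_gflops"] / card["tdp_w"]


def speedup(fast, slow, key):
    """Ratio of one card's peak figure to another's for the given key."""
    return fast[key] / slow[key]


def system_peak(card, key, count=3, overclock=0.10):
    """Peak GFLOPS for `count` identical cards with a fractional overclock.

    Assumption: throughput scales linearly with card count and clock.
    """
    return card[key] * count * (1 + overclock)


if __name__ == "__main__":
    old, new = CARDS["HD 4890"], CARDS["HD 5970"]
    print(f"GFLOPS/W: {gflops_per_watt(old):.2f} vs {gflops_per_watt(new):.2f}")
    print(f"DP ratio: {speedup(new, old, 'dp_gflops'):.2f}")
    print(f"SP ratio: {speedup(new, old, 'sp_gflops'):.2f}")
    for name, card in CARDS.items():
        dp = system_peak(card, "dp_gflops")
        print(f"3x {name} at +10% OC: {dp:.1f} GFLOPS double precision")
```

Swapping in another card only needs a new entry in CARDS with the same three keys.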
In three systems, I can repurpose six of my ATI 4890 cards (I'll keep one of my other three in each of my three 2007 Mac Pros) and my three ATI 5970 cards. Each of those three systems uses a Gigabyte UD5 1366 motherboard (and an i7 980X CPU), with 3 dual-wide PCIe slots [two being x16 and one being x8].
Single-precision compute capability of the system with 3 Radeon 5970s is [4,640 x 3 = 13,920; +10% OC = 15,312 or] 15.31 TFLOPS. {I wonder how Mari will run on this system since it has over 2x 7 TFLOPS; better yet, how will Mari run on AlphaCanisLupus0, which has over 16 TFLOPS of double precision peak floating point performance and 50.1 TFLOPS of single precision peak floating point performance?} Single-precision compute capability of each of the two systems with three Radeon 4890s is [1,360 x 3 = 4,080; +10% OC = 4,488 or] 4.49 TFLOPS.
So you Mac application software developers, let the OpenCL floods flow, but just make sure that those apps also support OpenCL 1.0 cards (my 4890s), even though those often mentioned FirePros include OpenCL 1.2 support. My 5970s should work just fine because they include OpenCL 1.2 support.
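To show what "also support OpenCL 1.0 cards" could look like in an app, here is a rough Python sketch of a version check with a fallback. It relies on the OpenCL specification's format for the CL_DEVICE_VERSION string, "OpenCL <major>.<minor> <vendor-specific information>", which a real app would get from clGetDeviceInfo. The function names, the code-path labels and the sample strings are hypothetical, purely for illustration, and are not taken from any actual application or driver.

```python
import re

# Per the OpenCL specification, CL_DEVICE_VERSION has the form
# "OpenCL <major>.<minor> <vendor-specific information>".
_VERSION_RE = re.compile(r"^OpenCL (\d+)\.(\d+)")


def parse_opencl_version(version_string):
    """Return (major, minor) from a CL_DEVICE_VERSION-style string."""
    match = _VERSION_RE.match(version_string)
    if match is None:
        raise ValueError(f"unrecognized OpenCL version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def pick_code_path(version_string):
    """Choose a kernel path by device version (labels are hypothetical)."""
    if parse_opencl_version(version_string) >= (1, 2):
        return "opencl-1.2-path"
    # Older devices get kernels restricted to OpenCL 1.0 features.
    return "opencl-1.0-fallback"


if __name__ == "__main__":
    # Made-up sample strings; real drivers append their own vendor text.
    for sample in ("OpenCL 1.0 example-vendor-info", "OpenCL 1.2 example-vendor-info"):
        print(sample, "->", pick_code_path(sample))
```

The point of the design is that the app decides per device, so one machine could run both an OpenCL 1.0 card and an OpenCL 1.2 card side by side.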