I've been following what you've all been saying with some interest -- you seem like a most knowledgeable bunch!
I've just bought a GTX 660 (specifically this) for my Mac Pro 3,1 (with a stock 8800GT in it at the moment). It's due to arrive in a few days, and I'm wondering whether or not to upgrade to 10.8.3 in the meantime. As there are conflicting reports about the 660 and sleep crashes -- and sleep is something that's important to me (!) -- I'm going to delay.
Do any of you have any comments on the 660 and 10.8.3? I note that on the front page of this topic it's not on the "known good" page, but it is recommended, and on previous versions of the front page it has been suggested.
I've got the extra 6-pin power lead and the NVIDIA drivers installed, and shall therefore let you know for certain how it gets on when it arrives!
I have an EVGA 660 on a 3,1 running 10.8.3. I just updated today. No issues so far. I cannot comment on the issue others have brought up about the PCI speed. I ran CUDA-Z and got the same results as the other guy above, but I don't know what they were before the update. I can say that my Unigine benchmark score has not changed. I have no issue with sleep that I know of. I have noticed that it takes a while to restart and sometimes gets hung up, but I have not troubleshot this and I'm not sure it is due to the graphics card. When the updated NVIDIA drivers are available maybe it will run a little better, but I don't really see a difference between 10.8.3 and 10.8.2. Hope this helps.
This is my CUDA-Z info if you are interested.
CUDA-Z Report
=============
Version: 0.6.163
http://cuda-z.sf.net/
OS Version: Mac OS X 10.8.3 12D78
Driver Version: 8.10.44 304.10.65f03
Driver Dll Version: 5.0
Runtime Dll Version: 4.20
Core Information
----------------
Name: GeForce GTX 660
Compute Capability: 3.0
Clock Rate: 1124 MHz
PCI Location: 0:2:0
Multiprocessors: 5 (960 Cores)
Threads Per Multiproc.: 2048
Warp Size: 32
Regs Per Block: 65536
Threads Per Block: 1024
Threads Dimensions: 1024 x 1024 x 64
Grid Dimensions: 2147483647 x 65535 x 65535
Watchdog Enabled: Yes
Integrated GPU: No
Concurrent Kernels: Yes
Compute Mode: Default
Memory Information
------------------
Total Global: 2047.81 MiB
Bus Width: 192 bits
Clock Rate: 3004 MHz
Error Correction: No
L2 Cache Size: 48 KiB
Shared Per Block: 48 KiB
Pitch: 2048 MiB
Total Constant: 64 KiB
Texture Alignment: 512 B
Texture 1D Size: 65536
Texture 2D Size: 65536 x 65536
Texture 3D Size: 4096 x 4096 x 4096
GPU Overlap: Yes
Map Host Memory: Yes
Unified Addressing: No
Async Engine: Yes, Unidirectional
Performance Information
-----------------------
Memory Copy
Host Pinned to Device: 2926.18 MiB/s
Host Pageable to Device: 2398.9 MiB/s
Device to Host Pinned: 3179.18 MiB/s
Device to Host Pageable: 2377.81 MiB/s
Device to Device: 46.9479 GiB/s
GPU Core Performance
Single-precision Float: 1161.87 Gflop/s
Double-precision Float: 85.029 Gflop/s
32-bit Integer: 368.964 Giop/s
24-bit Integer: 367.91 Giop/s
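For anyone trying to judge the PCI speed question from the Memory Copy figures above, here is a rough Python sketch that compares the two pinned-memory rates with the theoretical per-direction limits of an x16 link. The function and variable names are my own, and the x16 width is an assumption, since the report does not state the negotiated link width or speed. The per-lane figures are the usual 250 MB/s (PCIe 1.x) and 500 MB/s (PCIe 2.0) after 8b/10b encoding.

```python
# Rough sanity check of the CUDA-Z pinned copy rates against PCIe link limits.
# Assumption: the card sits in an x16 slot; the report does not confirm this.

MIB = 1024 ** 2  # bytes in one MiB (CUDA-Z reports MiB/s)
MB = 1000 ** 2   # bytes in one MB (PCIe limits are usually quoted in MB/s)

# Usable per-lane, per-direction rate in MB/s after 8b/10b encoding.
PCIE_LANE_MB_S = {"1.x": 250.0, "2.0": 500.0}

# Pinned-memory figures copied from the report above, in MiB/s.
MEASURED_MIB_S = {
    "host_to_device_pinned": 2926.18,
    "device_to_host_pinned": 3179.18,
}


def link_limit_mb_s(gen, lanes=16):
    """Theoretical one-direction limit for a PCIe link, in MB/s."""
    return PCIE_LANE_MB_S[gen] * lanes


def mib_to_mb(rate_mib_s):
    """Convert a rate from MiB/s to MB/s."""
    return rate_mib_s * MIB / MB


def utilisation(rate_mib_s, gen, lanes=16):
    """Fraction of the theoretical link limit that a measured rate uses."""
    return mib_to_mb(rate_mib_s) / link_limit_mb_s(gen, lanes)


if __name__ == "__main__":
    for name, rate in MEASURED_MIB_S.items():
        for gen in ("1.x", "2.0"):
            share = utilisation(rate, gen)
            print(f"{name}: {share:.0%} of a PCIe {gen} x16 link")
```

Both pinned rates come out below the 4000 MB/s ceiling of a PCIe 1.x x16 link, so these numbers on their own are consistent with either link speed and cannot settle the question. A before-and-after comparison on the same machine would be more telling.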