Maximizing GPU CUDA 3d Compute Performance By GPU Selection
GPU Performance Review
I. My CUDA GPUs' Titan Equivalency */ (TE), from highest to lowest (fastest OctaneRender **/ Benchmark V1.20 score to slowest):
1) EVGA GTX 780 Ti Superclock (SC) ACX / 3gig (G) = TE of 1.319
2) EVGA GTX 690 / 4G = TE of 1.202
3) EVGA GTX Titan SC / 6G = TE of 1.185
4) EVGA GTX 590 Classified (C) / 3G = TE of 1.13
Titan that Bare Feats tested = TE of 1.0
5) EVGA GTX 480 SC / 1.5G = TE of .613
6) EVGA GTX 580 Classified (C) / 3G = TE of .594
7) Galaxy 680 / 4G = TE of .593
8) BFG Tech GTX 295 = As indicated in section III below, this card's compute capability is deprecated in the latest version of OctaneRender, so I use it for ACS and Blender.
9) EVGA GT 640 / 4G = TE of .097
I use the EVGA GT 640 / 4G only for video output (as a 4G frame buffer) and for interactivity when building and tweaking 3d scenes.
II. Processing Power (GFLOPS) ***/
1) EVGA GTX 780 Ti Superclock (SC) ACX / 3gig (G) = 5,046/210
2) EVGA GTX 690 / 4G = 2 x 2,810.88 = 5,621.76
3) EVGA GTX Titan SC / 6G = 4,500/1,300-1,500
4) EVGA GTX 590 Classified (C) / 3G = 2,488.3
5) EVGA GTX 480 SC / 1.5G = 1,344.96
6) EVGA GTX 580 Classified (C) / 3G = 1,581.1
7) Galaxy 680 / 4G = 3,090.43
8) BFG Tech GTX 295 = 1,788.48
9) EVGA GT 640 / 4G = 729.6
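One way to read the figures above alongside the TE values in section I is to divide each card's TE by its single precision GFLOPS. The Python sketch below does only that arithmetic. The names CARDS and te_per_tflops are illustrative, not part of any tool, and the numbers are copied from sections I and II. Because the GFLOPS figures are for reference design cards, the ratios are rough.

```python
# Each entry: (TE from section I, single precision GFLOPS from section II).
# The GTX 295 is left out because it has no TE value in section I.
CARDS = {
    "EVGA GTX 780 Ti SC ACX / 3G": (1.319, 5046.0),
    "EVGA GTX 690 / 4G": (1.202, 5621.76),
    "EVGA GTX Titan SC / 6G": (1.185, 4500.0),
    "EVGA GTX 590 C / 3G": (1.13, 2488.3),
    "EVGA GTX 480 SC / 1.5G": (0.613, 1344.96),
    "EVGA GTX 580 C / 3G": (0.594, 1581.1),
    "Galaxy 680 / 4G": (0.593, 3090.43),
    "EVGA GT 640 / 4G": (0.097, 729.6),
}


def te_per_tflops(name):
    """Return TE per 1,000 single precision GFLOPS for one card."""
    te, gflops = CARDS[name]
    return te / (gflops / 1000.0)


if __name__ == "__main__":
    # List the cards from the highest ratio to the lowest.
    for name in sorted(CARDS, key=te_per_tflops, reverse=True):
        print(f"{name}: {te_per_tflops(name):.3f} TE per 1,000 GFLOPS")
```

A higher ratio means the card delivers more OctaneRender benchmark performance per listed GFLOPS, which helps when comparing cards from different GPU generations.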
III. CUDA Compute Capability
CUDA compute capability (CCC), along with differences in GPU memory amounts, can affect which features the renderer can actually use.
Compute capabilities required for OctaneRender features:
Compute Capability Limitations
1) CCC of 1.0: Octane Render Version 1.20+ not supported.
2) CCC of 1.1 or lower: no PMC kernel or matte material (shadow capture)
3) CCC of 1.1 -> 1.3: no PMC or matte material (shadow capture)
4) CCC of 2.0 and 2.1: no limitations
5) CCC of 3.0 and 3.5: no limitations
Compute Capability Of My Cards
1) EVGA GTX 780 Ti Superclock (SC) ACX / 3gig (G) = CCC of 3.5
2) EVGA GTX 690 / 4G = CCC of 3.0
3) EVGA GTX Titan SC / 6G = CCC of 3.5
4) EVGA GTX 590 Classified (C) / 3G = CCC of 2.0
5) EVGA GTX 480 SC / 1.5G = CCC of 2.0
6) EVGA GTX 580 C / 3G = CCC of 2.0
7) Galaxy 680 / 4G = CCC of 3.0
8) BFG Tech GTX 295 = CCC of 1.3
9) EVGA GT 640 / 4G = CCC of 3.5
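The two lists above can be combined into a small lookup. This is a minimal Python sketch that only encodes the limitations as listed in this post. The function names parse_ccc and octane_limitation are hypothetical, and the sketch does not query the GPU or OctaneRender itself.

```python
def parse_ccc(text):
    """Turn a compute capability string such as '3.5' into (3, 5)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def octane_limitation(ccc):
    """Look up the limitation listed in this post for a compute capability."""
    version = parse_ccc(ccc)
    if version <= (1, 0):
        return "OctaneRender 1.20+ not supported"
    if version <= (1, 3):
        return "no PMC kernel or matte material (shadow capture)"
    if (2, 0) <= version <= (3, 5):
        return "no limitations"
    # Anything else is outside the range this post covers.
    return "not covered by this list"


# Compute capabilities copied from the list of my cards above.
MY_CARDS = {
    "EVGA GTX 780 Ti SC ACX / 3G": "3.5",
    "EVGA GTX 690 / 4G": "3.0",
    "EVGA GTX Titan SC / 6G": "3.5",
    "EVGA GTX 590 C / 3G": "2.0",
    "EVGA GTX 480 SC / 1.5G": "2.0",
    "EVGA GTX 580 C / 3G": "2.0",
    "Galaxy 680 / 4G": "3.0",
    "BFG Tech GTX 295": "1.3",
    "EVGA GT 640 / 4G": "3.5",
}

if __name__ == "__main__":
    for card, ccc in MY_CARDS.items():
        print(f"{card} (CCC {ccc}): {octane_limitation(ccc)}")
```

Comparing (major, minor) tuples rather than floats avoids rounding surprises when checking version ranges.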
*/ I use TE to compare how GPUs perform relative to the Titan that Bare Feats tested here: http://barefeats.com/gputitan.html (see post # 865, above [ https://forums.macrumors.com/showthread.php?p=18267072&coined#post18267072 ]). For example, my EVGA GTX 780 Ti Superclock (SC) ACX / 3gig (G), with a TE of 1.319, is 1.319 times as fast as the Titan that Bare Feats tested, but my EVGA GTX 480 SC / 1.5G, with a TE of .613, is only about 60% as fast as that Titan. Because OctaneRender scales perfectly linearly across GPUs, two EVGA GTX 480 SC / 1.5G cards, each with a TE of .613, would together be 1.226 (2 x .613) times as fast as the Titan that Bare Feats tested.
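The TE arithmetic for a multi-GPU setup can be sketched in a few lines of Python. The TE values are copied from section I. The function names are hypothetical, and the render-time estimate rests on the assumption that render time is inversely proportional to TE, which follows only if scaling really is perfectly linear.

```python
# TE values copied from section I of this post.
TITAN_EQUIVALENCY = {
    "EVGA GTX 780 Ti SC ACX / 3G": 1.319,
    "EVGA GTX 690 / 4G": 1.202,
    "EVGA GTX Titan SC / 6G": 1.185,
    "EVGA GTX 590 C / 3G": 1.13,
    "EVGA GTX 480 SC / 1.5G": 0.613,
    "EVGA GTX 580 C / 3G": 0.594,
    "Galaxy 680 / 4G": 0.593,
    "EVGA GT 640 / 4G": 0.097,
}


def combined_te(cards):
    """Sum the TE of every card in the list (assumes linear scaling)."""
    return sum(TITAN_EQUIVALENCY[name] for name in cards)


def estimated_render_seconds(titan_seconds, cards):
    """Estimate render time from a reference Titan's time.

    Assumption: render time is inversely proportional to combined TE.
    """
    return titan_seconds / combined_te(cards)


if __name__ == "__main__":
    pair = ["EVGA GTX 480 SC / 1.5G"] * 2
    print(f"Combined TE of two GTX 480 SC cards: {combined_te(pair):.3f}")
    # 100 seconds is a made-up reference time, used only for illustration.
    print(f"Estimated time: {estimated_render_seconds(100.0, pair):.1f} s")
```

Listing a card name more than once models several identical cards, so the same function covers mixed and matched sets.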
**/ OctaneRender [ http://render.otoy.com/features.php ] presently has plugin support for:
ArchiCAD
Blender
Daz Studio
Lightwave
Poser
Rhino
3ds Max
AutoCAD
Cinema4D
Inventor
Maya
Revit
Softimage
Plugin support is in development for:
SketchUp
Modo
Carrara
Otherwise, OctaneRender imports .obj files, and there are free community-developed exporter scripts for:
Autodesk 3D Studio Max
Autodesk Maya
Autodesk Softimage XSI
Blender
Maxon Cinema 4D
SketchUp
Modo
***/ These figures come from Wikipedia [ http://en.wikipedia.org/wiki/Comparison_of_Nvidia_graphics_processing_units ], and it's my understanding that they reflect the performance of reference design GPUs. Almost all of my GPUs are non-reference designs, so their performance exceeds these figures. Moreover, only for recent GPUs does Wikipedia set forth both single precision and double precision floating point peak performance. When two figures are given, the larger, first figure reflects single precision and the smaller, second figure reflects double precision.
Furthermore, since Mac users may find it difficult to tweak their video cards, I recommend that, if you seek peak performance, you purchase a card with a well-binned GPU that is superclocked at purchase. It's my experience that EVGA does this better than its competitors. It's also my observation that the performance differential is almost always a lot greater than the difference in cost. Finally, it's my observation that higher memory speeds have a significant impact on OctaneRender performance.
In Mavericks, GTX Titans (newer production) and GTX 780 Tis with the GK110B processor currently cause applications that call OpenCL (OCL) to crash.