Many very important GPGPU apps, especially in the "deep learning" field, deal with recognizing repeating patterns in mind-numbing quantities of data. (That's why the number of CUDA cores you have, thousands or millions, matters.) The CUDA cores in your Tesla are doing this as they optimize your battery life and help you park.

Aiden, can you explain what Nvidia means by "mixed precision performance"?
You don't need full (single) or double precision floating point when looking coarsely at huge datasets. Nvidia's CUDA supports "half precision" floating point (16-bit floats; see https://en.wikipedia.org/wiki/Half-precision_floating-point_format ), so the memory and/or bandwidth requirements are 50% of those for single precision floating point, and if the processing is faster as well, that's an additional benefit.
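To make that concrete, here's a minimal CUDA sketch (assuming a toolkit that ships cuda_fp16.h; the buffer size, fill value, and the to_half kernel name are just illustrative) that downconverts an FP32 buffer to FP16 and prints the footprint of each:

```cuda
#include <cstdio>
#include <vector>
#include <cuda_fp16.h>

// Downconvert an FP32 buffer to FP16: same element count, half the bytes.
__global__ void to_half(const float *in, __half *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __float2half(in[i]);
}

int main() {
    const int n = 1 << 20;                                    // ~1M elements
    printf("FP32: %zu bytes, FP16: %zu bytes\n",
           n * sizeof(float), n * sizeof(__half));            // 4 MiB vs 2 MiB

    std::vector<float> host(n, 3.14159f);
    float *d_in; __half *d_out;
    cudaMalloc(&d_in,  n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(__half));
    cudaMemcpy(d_in, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    to_half<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
    cudaDeviceSynchronize();
    cudaFree(d_in); cudaFree(d_out);
    return 0;
}
```

The halved footprint also means twice as many samples can stay resident in VRAM at once, which is often the bigger win than the raw arithmetic speedup.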
(And MathPunk is right, but short floats make it a three-tiered game: use half-precision floats for the coarse work, promote to 32-bit floats for the next level, and go to 64-bit for the critical stuff.)
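A sketch of what that tiering can look like in practice (purely an illustration under my own naming, not anything from Nvidia's docs): store the vectors in FP16, do the arithmetic and the per-block reduction in FP32, and do the final accumulation in FP64 on the host.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_fp16.h>

// Tier 1: the data lives in FP16, halving memory traffic.
__global__ void fill_half(__half *v, float value, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] = __float2half(value);
}

// Tier 2: per-element math and the block reduction run in FP32.
__global__ void dot_partial(const __half *a, const __half *b, float *partial, int n) {
    __shared__ float cache[256];
    float sum = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += gridDim.x * blockDim.x)
        sum += __half2float(a[i]) * __half2float(b[i]);    // promote FP16 -> FP32
    cache[threadIdx.x] = sum;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {         // tree reduction in shared memory
        if (threadIdx.x < s) cache[threadIdx.x] += cache[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) partial[blockIdx.x] = cache[0];
}

int main() {
    const int n = 1 << 20, threads = 256, blocks = 64;
    __half *a, *b; float *partial;
    cudaMalloc(&a, n * sizeof(__half));
    cudaMalloc(&b, n * sizeof(__half));
    cudaMalloc(&partial, blocks * sizeof(float));

    fill_half<<<(n + threads - 1) / threads, threads>>>(a, 0.5f, n);
    fill_half<<<(n + threads - 1) / threads, threads>>>(b, 2.0f, n);
    dot_partial<<<blocks, threads>>>(a, b, partial, n);

    std::vector<float> h(blocks);
    cudaMemcpy(h.data(), partial, blocks * sizeof(float), cudaMemcpyDeviceToHost);

    double total = 0.0;                                    // Tier 3: FP64 final accumulation
    for (float p : h) total += p;
    printf("dot product = %f (expected %d)\n", total, n);  // 0.5 * 2.0 * n = n

    cudaFree(a); cudaFree(b); cudaFree(partial);
    return 0;
}
```

The point of the layering is that the rounding error of each tier stays small relative to what that tier is asked to do: FP16 only ever holds individual inputs, FP32 only holds short partial sums, and only the FP64 pass sees the full-length accumulation where error would otherwise pile up.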
Probably not important for wedding videos (you don't want the bride's gown to be "approximately white" unless she's "approximately a virgin"), but very important for apps like Siri that use GPGPU programming to respond in near real time to fuzzy input.
Also look at the second link from golem.de. In Luxmark 3 the R9 290X is faster, much faster, than the Titan X.

And I could post a link to benchmarks that show a MacBook Air destroying a twelve-core MP6,1 in H.264 encoding, and claim that the Air is faster, much faster, than the MP6,1. But I won't, because we both know that a single benchmark is irrelevant unless it is exactly what you do every day to bring home the bread.
And the koala cub is cute....