Delta memory compression can help in a memory-constrained environment. But the GTX 980 Ti has the same amount of memory as the GTX 1060 and much higher memory bandwidth, so it will not help here.
That's fine, keep ignoring my point.
If the game is limited by nothing but raw shader math horsepower, then TFLOPs will give an accurate representation of how well a given GPU will run the game. The game runs faster on the GTX 1060 than raw TFLOPs suggest. My conclusion is that the game is not limited by raw TFLOPs. Your conclusion is that I'm an idiot and NVIDIA and AMD are artificially limiting the performance in this game in a grand conspiracy to make the GTX 1060 look better than it actually is. I wonder which is more likely to be correct?
Is stuff like the delta memory compression taken into consideration in your raw TFLOPs measurement?
http://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/8
I don't think it is, and it's one of the major new things in the Pascal architecture.
Oh look, there's a better implementation of asynchronous compute as well:
http://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/9
Let's just ignore all of the non-shader related improvements in Pascal, and the real benchmark data, and conclude that I don't know what I'm talking about.
Again, my fundamental point relates to the comparison between the 1060 and the RX 480. On paper, the RX 480 should destroy the 1060, but in nearly all game benchmarks the 1060 wins (and often wins by a significant margin). How do you explain this?
Asynchronous compute would help. But what we have seen so far is that the GTX 1060 is losing performance with asynchronous compute turned on in DX12 games. Secondly, for Pascal GPUs the performance difference with this setting on is within 1-2%.
And again, the difference in raw performance between the two GPUs is 4.4 vs 5.9 TFLOPs.
I do not know why you are bringing the RX 480 into this.
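For anyone following the numbers, here is a rough sketch of what the 4.4 vs 5.9 TFLOPs gap means on paper. The formula (2 FLOPs per shader per clock, for a fused multiply-add) is the usual way peak FP32 throughput is quoted. The shader count and boost clock used for the GTX 1060 are my own assumption from public spec sheets, not something stated in this thread, and the function names are made up for illustration.

```python
def peak_tflops(shader_count: int, clock_ghz: float) -> float:
    """Theoretical peak FP32 throughput: 2 FLOPs (one FMA) per shader per clock."""
    return 2 * shader_count * clock_ghz / 1000.0


def paper_advantage(slower_tflops: float, faster_tflops: float) -> float:
    """Fractional lead the faster GPU would have if shader math were the only limit."""
    return faster_tflops / slower_tflops - 1.0


# Assumed spec-sheet values for the GTX 1060 (not from the thread):
# 1280 shaders at a 1.708 GHz boost clock.
gtx_1060 = peak_tflops(1280, 1.708)

# The two figures quoted in the thread.
advantage = paper_advantage(4.4, 5.9)

print(f"GTX 1060 peak: {gtx_1060:.2f} TFLOPs")
print(f"On-paper lead of the 5.9 TFLOPs card: {advantage:.0%}")
```

If a benchmark shows a gap much smaller than that on-paper lead, or the slower card ahead, that only tells you the game is not purely limited by shader throughput. It does not say which other factor is responsible.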