Nvidia Volta.

koyoot · Aug 19, 2017

For those who are interested in hardware, not fanboyism.

Nvidia has finally released the Volta whitepaper:
https://images.nvidia.com/content/volta-architecture/pdf/Volta-Architecture-Whitepaper-v1.0.pdf

Two things are important.

Simultaneous Execution of FP32 and INT32 Operations
Unlike Pascal GPUs, which could not execute FP32 and INT32 instructions simultaneously, the Volta GV100 SM includes separate FP32 and INT32 cores, allowing simultaneous execution of FP32 and INT32 operations at full throughput, while also increasing instruction issue throughput. Dependent instruction issue latency is also reduced for core FMA math operations, requiring only four clock cycles on Volta, compared to six cycles on Pascal.
Many applications have inner loops that perform pointer arithmetic (integer memory address calculations) combined with floating-point computations that will benefit from simultaneous execution of FP32 and INT32 instructions. Each iteration of a pipelined loop can update addresses (INT32 pointer arithmetic) and load data for the next iteration while simultaneously processing the current iteration in FP32.
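
To put rough numbers on the two points in the quoted passage, here is a back-of-the-envelope sketch. It is a toy model, not a simulator: the function names and the instruction mix of the loop are made up for illustration, and only the 4-cycle and 6-cycle latencies come from the whitepaper. Real SMs also hide latency by switching between many warps, which this ignores.

```python
# Toy model only: one warp, no latency hiding, idealised overlap.

def dependent_chain_cycles(n_fma, latency):
    """Approximate cycles for one warp to get through n_fma
    back-to-back dependent FMAs at a given dependent-issue latency."""
    return n_fma * latency

def loop_issue_slots(n_fp32, n_int32, separate_pipes):
    """Issue slots needed per loop iteration.

    separate_pipes=False: FP32 and INT32 share one pipe (Pascal-like).
    separate_pipes=True:  they overlap fully (idealised Volta-like).
    """
    if separate_pipes:
        return max(n_fp32, n_int32)
    return n_fp32 + n_int32

# Latencies from the whitepaper: 6 cycles (Pascal), 4 cycles (Volta).
pascal_chain = dependent_chain_cycles(100, latency=6)
volta_chain = dependent_chain_cycles(100, latency=4)

# Hypothetical inner loop: 8 FP32 math ops plus 4 INT32 address updates.
shared = loop_issue_slots(8, 4, separate_pipes=False)
split = loop_issue_slots(8, 4, separate_pipes=True)

print("dependent FMA chain:", pascal_chain, "vs", volta_chain, "cycles")
print("loop issue slots:   ", shared, "vs", split)
```

In this simplified picture the integer address math costs nothing extra as long as there are fewer INT32 than FP32 instructions in the loop.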

This basically means that Nvidia finally has 1:1 parity with GCN in compute throughput, and will not have to rely on software to get there.

We have to look at the layout this way:
64 cores per SM with a 256 KB register file, a warp size of 32 threads, and a 4-cycle dependent-issue latency. This is the first hardware layout from Nvidia that I am content with, and the first that I know will not require software to gain performance, because the hardware alone is capable enough. Also, separate FP32 and INT32 cores are meaningful for both throughput and latency. And one last thing: the increased L1 cache size will reduce latency even more, and increase the bandwidth and resources available to the cores.

Volta finally has proper compute capabilities, matching GCN in sheer throughput. AMD will have a huge problem competing with Nvidia, because for the first time Nvidia has, maybe not a hardware advantage in compute, but on-par hardware, and its software is simply better.

tuxon86 · Aug 19, 2017

AMD already has a huge problem competing with NVidia...

koyoot · Aug 19, 2017

tuxon86 said:
AMD already has a huge problem competing with NVidia...

In the software department? Yes. In hardware? Nope. Volta finally brings parity on this front (well, AMD still has a slight advantage because it can still do more work each cycle, because of the difference between a 32-thread warp and a 64-thread wavefront). But overall, hardware with Volta will be on par.

cube · Aug 19, 2017

koyoot said:
In the software department? Yes. In hardware? Nope. Volta finally brings parity on this front (well, AMD still has a slight advantage because it can still do more work each cycle, because of the difference between a 32-thread warp and a 64-thread wavefront). But overall, hardware with Volta will be on par.

Why is Vega behind in MSAA?

koyoot · Aug 19, 2017

cube said:
Why is Vega behind in MSAA?

I suppose it is the interplay of the new pipeline and the state of Vega's drivers. In some cases Vega is running basic GCN drivers, just to get it running at all. I don't know how long we would have had to wait for Vega's release if AMD had wanted to launch it with proper drivers.

tuxon86 · Aug 19, 2017

koyoot said:
In the software department? Yes. In hardware? Nope. Volta finally brings parity on this front (well, AMD still has a slight advantage because it can still do more work each cycle, because of the difference between a 32-thread warp and a 64-thread wavefront). But overall, hardware with Volta will be on par.

In an alternate reality maybe. In actual use, NVidia is way ahead.

koyoot · Aug 19, 2017

tuxon86 said:
In an alternate reality maybe. In actual use, NVidia is way ahead.

The only hardware advantage Nvidia has with Volta is Tensor cores. Everything else, layout, execution pipeline mimics GCN compute pipeline. The only real and most meaningful advantage Nvidia has is CUDA and drivers. But this is software, not hardware.

beaker7 · Aug 19, 2017

koyoot said:
In the software department? Yes. In hardware? Nope.

What difference does it make?

koyoot · Aug 19, 2017

beaker7 said:
What difference does it make?

You don't understand? Well, it means that with poorly optimized software, faster hardware will end up slower than hardware that has optimized software. If software is optimized for both, the slower and the faster hardware, which one will be faster in the end?

I have posted here many, many times already what the Split Kernel addition to Blender did for AMD GPUs. The GTX 1060 using CUDA was not able to catch up to the RX 480 using OpenCL. A small change that made OpenCL mimic CUDA in this department made a huge difference in performance for the faster hardware.

Back to topic. If Nvidia releases a GV107 chip early next year, I think I will build an SFF computer (again...) with a Core i7-8700T, 16 GB of RAM, and a GTX 2050 Ti, if it brings satisfying performance levels (higher than the GTX 1060).

tuxon86 · Aug 19, 2017

koyoot said:
The only hardware advantage Nvidia has with Volta is Tensor cores. Everything else, layout, execution pipeline mimics GCN compute pipeline. The only real and most meaningful advantage Nvidia has is CUDA and drivers. But this is software, not hardware.

And yet they barely match the performance of a year-old card from their competitor and can't match their competitor's flagship.

koyoot said:
You don't understand? Well, it means that with poorly optimized software, faster hardware will end up slower than hardware that has optimized software. If software is optimized for both, the slower and the faster hardware, which one will be faster in the end?

I have posted here many, many times already what the Split Kernel addition to Blender did for AMD GPUs. The GTX 1060 using CUDA was not able to catch up to the RX 480 using OpenCL. A small change that made OpenCL mimic CUDA in this department made a huge difference in performance for the faster hardware.

Back to topic. If Nvidia releases a GV107 chip early next year, I think I will build an SFF computer (again...) with a Core i7-8700T, 16 GB of RAM, and a GTX 2050 Ti, if it brings satisfying performance levels (higher than the GTX 1060).

The finewine theory has been debunked by just about everybody...

koyoot · Aug 19, 2017

tuxon86 said:
And yet they barely match the performance of a year-old card from their competitor and can't match their competitor's flagship.

In games? Sure. Most likely because the drivers are not ready. What will happen when the drivers are optimized?

Tuxon, please, what are you trying to prove? Why does every technical discussion end this way?

tuxon86 said:
The finewine theory has been debunked by just about everybody...

Then explain why the RX 580, which is the same Polaris 10 as the RX 480, is currently matching the GTX 1070 in games?

FineWine has nothing to do with this. It is just the robustness of GCN, and the way games are programmed.

beaker7 · Aug 19, 2017

koyoot said:
You don't understand?

Of course I understand. I just personally don't care whether it's hardware or software. I choose the best overall tool for the job.

Also found it humorous...a poster on a mac site claiming that superior hardware is more important than superior software.

koyoot said:
In games? Sure. Most likely because the drivers are not ready. What will happen when the drivers are optimized?

AMD has been promising to make their drivers suck less since prehistoric times.

koyoot · Aug 19, 2017

beaker7 said:
Of course I understand. I just personally don't care whether it's hardware or software. I choose the best overall tool for the job.

Also found it humorous...a poster on a mac site claiming that superior hardware is more important than superior software.

AMD has been promising to make their drivers suck less since prehistoric times.

Haha, that is a good one.

Well, there are technical reasons why Apple's software will be better optimized for AMD hardware than for Nvidia, and that is something we should care about. I actually do not care about anything other than performance, price, and power either; together these make the best product for me. This is why I currently use a Core i7-7700T and a GTX 1050 Ti. And as I have posted, this will change with Coffee Lake and Volta, because somehow I do not see AMD coming up with optimizations "soon enough".

In the context of Volta, it would be interesting to know whether consumer GPUs will be made on 12 nm FFN or on 16 nm FF+, like the Pascal GPUs. A 12 nm process would make possible, for example, a 1024-core/192-bit GV107 chip with a 160-180 mm² die size, because of the process's 20% higher density.
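
The arithmetic behind that die-size estimate can be sketched as follows. This is a rough illustration only: the 200 mm² starting point is an assumed example value, not a figure from the whitepaper, and `scaled_die_area` is a made-up helper. It also assumes the whole die shrinks uniformly with the density gain, which is optimistic, since I/O and analog blocks usually scale worse than logic.

```python
def scaled_die_area(area_mm2, density_gain):
    """Area of the same design after a density improvement.

    density_gain=0.20 means 20% more transistors per mm^2.
    Assumes uniform scaling of the entire die (optimistic).
    """
    return area_mm2 / (1.0 + density_gain)

baseline_mm2 = 200.0  # assumed 16 nm die size, for illustration only
shrunk = scaled_die_area(baseline_mm2, 0.20)

print(f"{baseline_mm2:.0f} mm2 at 16 nm -> {shrunk:.1f} mm2 at 12 nm")
```

Under these assumptions a design of that size would land inside the 160-180 mm² range mentioned above.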

tuxon86 · Aug 19, 2017

koyoot said:
In games? Sure. Most likely because the drivers are not ready. What will happen when the drivers are optimized?

Tuxon, please, what are you trying to prove? Why does every technical discussion end this way?
Then explain why the RX 580, which is the same Polaris 10 as the RX 480, is currently matching the GTX 1070 in games?

FineWine has nothing to do with this. It is just the robustness of GCN, and the way games are programmed.

Discussions tend to end this way because you think you're superior to everyone and that differing opinions or conclusions are somehow a personal affront against your superior intellect or something...

As I previously said, we have access to the same source material as you do and are free to reach whatever conclusions we like after studying it. You don't have to agree, but you have to respect it.

koyoot · Aug 19, 2017

tuxon86 said:
Discussions tend to end this way because you think you're superior to everyone and that differing opinions or conclusions are somehow a personal affront against your superior intellect or something...

As I previously said, we have access to the same source material as you do and are free to reach whatever conclusions we like after studying it. You don't have to agree, but you have to respect it.

Aaaaaaand, aren't you doing the same thing you accuse me of?

I am the problem here?

Yes, you have access to the materials. However, all you want to do is prove to a guy over the internet how wrong he is, because he looks at things from a broader point of view than you do. When I try to discuss hardware, and show it to you from different views, not just one, black or white, like you do, you see this as a problem. Maybe it's not me who is the problem here?

The funniest part is that you believe I am trying to talk away Vega's poor performance compared to its power draw, using games as the most important metric.

It has always been this way on this forum. But it's not my fault. You have said: Vega is a failure because it cannot reach its counterpart flagship? Are you sure about it? In all cases? Is your comment showing the whole truth about this hardware?

Who is misleading here about hardware, as you have accused me of before?

If you are not interested in hardware, please stop responding.

h9826790 · Aug 19, 2017

koyoot said:
In the software department? Yes. In hardware? Nope. Volta finally brings parity on this front (well, AMD still has a slight advantage because it can still do more work each cycle, because of the difference between a 32-thread warp and a 64-thread wavefront). But overall, hardware with Volta will be on par.

The problem is that overall product performance is the combination of BOTH hardware and software. No matter how good AMD's hardware is, if they fail to provide software that is up to standard, the hardware is useless and the overall value is much, much lower (the value is pretty much based on final performance, not on the hardware architecture).

In fact, I have always wondered why AMD cards are so bad on efficiency. I suspect it's because AMD has powerful hardware, but only hardware. Powerful hardware demands more power (makes sense) but doesn't really do more work (because of poor software).

If it were a car, it would be a bit like a powerful engine that keeps burning fuel, but without a proper gearbox the car actually runs slowly. In the end, the powerful hardware just lowers overall efficiency without increasing performance. It becomes a disadvantage, not an advantage.

On the other hand, Nvidia knows how to create a good product: a good balance of hardware and software, with hardware that is just enough to deliver the performance and nothing more to draw unnecessary power. They also make sure the software can release the hardware's potential.

If a car only needs one engine to achieve the required speed, giving it four engines doesn't mean the architecture is advanced; it just burns more fuel and increases weight, complexity, maintenance cost, etc. I don't know what AMD's plan is (or whether they really have one), but as you said, even in terms of hardware AMD has no advantage now. And we can be 99% sure AMD won't be able to deliver any driver that comes even close to the performance of Nvidia's driver. So whether it's compute or graphics or anything else, AMD has no way to win. And most likely the situation will get worse and worse.

koyoot · Aug 19, 2017

h9826790 said:
The problem is that overall product performance is the combination of BOTH hardware and software. No matter how good AMD's hardware is, if they fail to provide software that is up to standard, the hardware is useless and the overall value is much, much lower (the value is pretty much based on final performance, not on the hardware architecture).

In fact, I have always wondered why AMD cards are so bad on efficiency. I suspect it's because AMD has powerful hardware, but only hardware. Powerful hardware demands more power (makes sense) but doesn't really do more work (because of poor software).

If it were a car, it would be a bit like a powerful engine that keeps burning fuel, but without a proper gearbox the car actually runs slowly. In the end, the powerful hardware just lowers overall efficiency without increasing performance. It becomes a disadvantage, not an advantage.

On the other hand, Nvidia knows how to create a good product: a good balance of hardware and software, with hardware that is just enough to deliver the performance and nothing more to draw unnecessary power. They also make sure the software can release the hardware's potential.

If a car only needs one engine to achieve the required speed, giving it four engines doesn't mean the architecture is advanced; it just burns more fuel and increases weight, complexity, maintenance cost, etc. I don't know what AMD's plan is (or whether they really have one), but as you said, even in terms of hardware AMD has no advantage now. And we can be 99% sure AMD won't be able to deliver any driver that comes even close to the performance of Nvidia's driver. So whether it's compute or graphics or anything else, AMD has no way to win. And most likely the situation will get worse and worse.

Exactly what I addressed in my first post about Volta, with its whitepaper. Volta currently has hardware on par with GCN, and has a software advantage (CUDA and drivers). Of course, AMD can work miracles before Volta arrives, but how likely is that?

AMD is bad at efficiency because they push their cards out of the efficiency curve. The maximum clock is the absolute maximum clock possible for the design at the designed voltage. This hinders both overclockability and efficiency. Nvidia, on the other hand, effectively throttles its GPUs, hence their efficiency and overclockability. Nvidia GPUs stay well below their maximum GPU clock to hold a certain level of efficiency. This is the advantage of a longer pipeline. If Vega stayed below its maximum, it would be much more efficient. But as I have said in the Vega thread: AMD had to push it out of its comfort zone to get decent performance levels without developing drivers that can properly utilize the hardware. This is the problem with a low budget. You just cannot afford all of those software engineers.
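
A rough way to see why pushing clocks hurts efficiency is the usual first-order approximation for dynamic power in CMOS, P ~ C * V^2 * f. The sketch below uses it with made-up operating points; the voltage and clock scales are hypothetical, not measured values for any Vega or Pascal card, and leakage is ignored entirely.

```python
def relative_dynamic_power(v_scale, f_scale):
    """Dynamic power relative to baseline, using P ~ C * V^2 * f."""
    return v_scale ** 2 * f_scale

def relative_perf_per_watt(v_scale, f_scale):
    """Perf/W relative to baseline, assuming performance scales
    linearly with clock (an idealisation)."""
    return f_scale / relative_dynamic_power(v_scale, f_scale)

# Hypothetical operating points: (label, voltage scale, clock scale).
points = [("backed off", 0.90, 0.90),
          ("baseline", 1.00, 1.00),
          ("pushed", 1.10, 1.10)]

for name, v, f in points:
    p = relative_dynamic_power(v, f)
    e = relative_perf_per_watt(v, f)
    print(f"{name:10s} clock x{f:.2f}  power x{p:.2f}  perf/W x{e:.2f}")
```

In this model the clock term cancels out of perf/W, so efficiency depends only on the voltage needed to hold that clock, which is why the last few percent of frequency are the expensive ones.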

The situation will not get worse and worse, for a very simple reason. GCN5 (Vega) is the base for every next-generation GCN GPU from AMD. AMD has to optimize for it in the graphics pipeline for every upcoming architecture. The biggest change in GCN5 was the graphics pipeline. Compute pipeline changes will be meaningful for the sheer throughput of the designs, but will not require redesigning the drivers to the degree that GCN5 required, for gaming at least.

tuxon86 · Aug 19, 2017

koyoot said:
Aaaaaaand, aren't you doing the same thing you accuse me of? I am the problem here?

Yes, you have access to the materials. However, all you want to do is prove to a guy over the internet how wrong he is, because he looks at things from a broader point of view than you do. When I try to discuss hardware, and show it to you from different views, not just one, black or white, like you do, you see this as a problem. Maybe it's not me who is the problem here?

The funniest part is that you believe I am trying to talk away Vega's poor performance compared to its power draw, using games as the most important metric.

It has always been this way on this forum. But it's not my fault. You have said: Vega is a failure because it cannot reach its counterpart flagship? Are you sure about it? In all cases? Is your comment showing the whole truth about this hardware?

Who is misleading here about hardware, as you have accused me of before?

If you are not interested in hardware, please stop responding.

It's not my job nor yours to prove anyone wrong... And this isn't your own personal forum either, so don't tell me whether I can respond or not. And, again, having a different opinion than yours doesn't mean I'm not interested in hardware. Please, stop it with the flamebaiting.

koyoot · Aug 19, 2017

tuxon86 said:
It's not my job nor yours to prove anyone wrong... And this isn't your own personal forum either, so don't tell me whether I can respond or not. And, again, having a different opinion than yours doesn't mean I'm not interested in hardware. Please, stop it with the flamebaiting.

So why are you trying to prove me wrong every time I write anything about GPUs?

Of course you can be interested in hardware if you have a different opinion. But your opinion is only an opinion. The thing is, when I discuss architecture layouts, I discuss facts. I do not care about opinions. You can have your opinion and be wrong about it. It's not my business.

My opinion about Vega can be simple. It's not worth a penny, which I wrote in my first post today. In the current form of the drivers. This is a fact. Not an opinion.

tuxon86 · Aug 19, 2017

koyoot said:
So why are you trying to prove me wrong every time I write anything about GPUs?

Of course you can be interested in hardware if you have a different opinion. But your opinion is only an opinion. The thing is, when I discuss architecture layouts, I discuss facts. I do not care about opinions. You can have your opinion and be wrong about it. It's not my business.

My opinion about Vega can be simple. It's not worth a penny, which I wrote in my first post today. In the current form of the drivers. This is a fact. Not an opinion.

Sorry to tell you this but your opinions aren't facts either.
You're doing the exact same thing that we do and get your info from the same place we do.
You're not an insider and you're not even on the same level as the reviewers who, unlike you, HAVE used and tested the card and pretty much said it's subpar and not a good deal.

AidenShaw · Aug 19, 2017

koyoot said:
The only hardware advantage Nvidia has with Volta is Tensor cores. Everything else, layout, execution pipeline mimics GCN compute pipeline. The only real and most meaningful advantage Nvidia has is CUDA and drivers. But this is software, not hardware.

Benchmark performance isn't a meaningful advantage for the green team?

koyoot said:
Im sorry but talking with you is just a waste of time.

Be careful what you ask....

koyoot · Aug 20, 2017

AidenShaw said:
Benchmark performance isn't a meaningful advantage for the green team?

Benchmark performance relies on software optimization.

koyoot · Aug 20, 2017

I have contacted a local GPU supplier in Beijing. The listed price of a Tesla V100 is a bit cheaper than I thought, and it will become available sooner as well.

The Tesla V100 PCIe will cost about the same as the Tesla P100 PCIe did at launch, about 10%-20% more than the P100 costs now, and it will become available in China next month.

As for the FP16 rate, according to their tech manager, it seems that apart from the Tensor cores' mixed-precision computation, the V100 doesn't have a 2x FP16 rate like the P100 does. This is also confirmed in the CUDA 9.0 RC programming guide: its FP16 rate is the same as its FP32 rate. So in the V100 Nvidia moved all its low-precision DL work into the Tensor cores (with better precision). I think that's a good idea.

Source: https://forum.beyond3d.com/posts/1995839/

AidenShaw · Aug 20, 2017

koyoot said:
I have contacted a local GPU supplier in Beijing. The listed price of a Tesla V100 is a bit cheaper than I thought, and it will become available sooner as well.

The Tesla V100 PCIe will cost about the same as the Tesla P100 PCIe did at launch, about 10%-20% more than the P100 costs now, and it will become available in China next month.

As for the FP16 rate, according to their tech manager, it seems that apart from the Tensor cores' mixed-precision computation, the V100 doesn't have a 2x FP16 rate like the P100 does. This is also confirmed in the CUDA 9.0 RC programming guide: its FP16 rate is the same as its FP32 rate. So in the V100 Nvidia moved all its low-precision DL work into the Tensor cores (with better precision). I think that's a good idea.

Source: https://forum.beyond3d.com/posts/1995839/

These sources all say that FP16 is 2x FP32.

http://www.anandtech.com/show/11559...ces-pcie-tesla-v100-available-later-this-year
http://www.tweaktown.com/news/58112/nvidia-volta-v100-pcie-5120-cuda-cores-16gb-hbm2-300w/index.html
https://forum.beyond3d.com/threads/nvidia-volta-speculation-thread.53930/page-23
http://www.guru3d.com/news-story/nvidia-announces-pci-express-version-of-tesla-v100-accelerator.html

koyoot · Aug 20, 2017

AidenShaw said:
These sources all say that FP16 is 2x FP32.

http://www.anandtech.com/show/11559...ces-pcie-tesla-v100-available-later-this-year
http://www.tweaktown.com/news/58112/nvidia-volta-v100-pcie-5120-cuda-cores-16gb-hbm2-300w/index.html
https://forum.beyond3d.com/threads/nvidia-volta-speculation-thread.53930/page-23
http://www.guru3d.com/news-story/nvidia-announces-pci-express-version-of-tesla-v100-accelerator.html

Then why do the Volta architecture whitepaper and the CUDA programming guides show otherwise?

Let me quote something from the Whitepaper:

Tensor Cores operate on FP16 input data with FP32 accumulation. The FP16 multiply results in a full precision product that is then accumulated using FP32 addition with the other intermediate products for a 4x4x4 matrix multiply (see Figure 9). In practice, Tensor Cores are used to perform much larger 2D or higher dimensional matrix operations, built up from these smaller elements.
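
What the quoted passage describes can be emulated in a few lines of plain Python. This is only a sketch of the arithmetic (FP16 inputs, exact products, FP32 accumulation for one 4x4x4 step D = A*B + C); the helper names are made up, and the accumulation order and rounding inside the real hardware are not something the quote specifies.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def to_fp32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def mma_4x4x4(a, b, c):
    """D = A*B + C for 4x4 matrices: FP16 inputs, FP32 accumulation."""
    d = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            acc = to_fp32(c[i][j])
            for k in range(4):
                # The product of two FP16 values is exact in a double
                # (11-bit x 11-bit significands); the sum is kept in FP32.
                prod = to_fp16(a[i][k]) * to_fp16(b[k][j])
                acc = to_fp32(acc + prod)
            d[i][j] = acc
    return d

ones = [[1.0] * 4 for _ in range(4)]
c = [[2048.0] * 4 for _ in range(4)]
d = mma_4x4x4(ones, ones, c)
print("FP32 accumulate:", d[0][0])

# Same sum accumulated purely in FP16, for comparison.
acc16 = 2048.0
for _ in range(4):
    acc16 = to_fp16(acc16 + 1.0)
print("FP16 accumulate:", acc16)
```

The comparison at the end shows why the accumulator width matters: near 2048 the spacing between FP16 values is 2, so adding 1.0 at a time is lost to rounding, while the FP32 accumulator keeps every addend.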

So it is FP16 processed in FP32, and in the Tensor cores. There is no 2xFP16 in Volta. The whitepaper is linked on the previous page of the thread, at the very top of it.

This is also a clue to what Nvidia will use for its consumer GPU architecture: GP100 reused, not GV100, because of the emerging focus on FP16x2 in the gaming market.
