Which assumes Nvidia will have fully optimized drivers for a brand new architecture on day 1...
Engineering samples are already at Nvidia headquarters. Nvidia will have worked for 15 months(!) on Volta GV102 drivers, because in the consumer space you can expect one huge surprise, especially when it comes to the multi-GPU tech being implemented. Nvidia has a huge advantage here: they can afford to stall the release of the GPUs and work on the software. AMD did not have that comfort.
 
Which assumes Nvidia will have fully optimized drivers for a brand new architecture on day 1...
That's been their pattern....
Engineering samples are already at Nvidia headquarters. Nvidia will have worked for 15 months(!) on Volta GV102 drivers, because in the consumer space you can expect one huge surprise, especially when it comes to the multi-GPU tech being implemented. Nvidia has a huge advantage here: they can afford to stall the release of the GPUs and work on the software. AMD did not have that comfort.
And GV100 drivers and Volta-ready CUDA 9 are already out. GV102/4/7... will be tweaks of GV100 - not a "new architecture".
 
Edit: Sticking to the thread. I have found a graph on another forum.
[Graph showing the Draw Stream Binning Rasterizer turned off]

What you see here is the Draw Stream Binning Rasterizer being turned off.

So essentially, all of Vega's new graphics features - the Draw Stream Binning Rasterizer, Primitive Shaders, the High Bandwidth Cache Controller, and the Intelligent Workload Distributor, which is tied to the DSBR and Primitive Shaders - are all turned off.

You are basically paying $499 for an overclocked Fiji.

When AMD enables all of those features, it will be interesting to come back to the reviews and read the impressions of everybody who thought this architecture was a failure.

There will be improvements in future drivers, but don't expect miracles. I don't expect DSBR and Primitive Shaders to bring massive performance improvements: maybe 10-20% at best. HBCC will do nothing in terms of performance.
 
There will be improvements in future drivers, but don't expect miracles. I don't expect DSBR and Primitive Shaders to bring massive performance improvements: maybe 10-20% at best. HBCC will do nothing in terms of performance.
According to the Vega whitepaper, the DSBR brings 10% more performance in the current state of software, and a huge improvement in VRAM-limited situations. What the DSBR affects are minimum and maximum framerates. HBCC improves minimum framerates. Dramatically.

Geometry and compute affect maximum framerates. Geometry is the weak link in Vega and Fiji GPUs; however, Primitive Shaders are there to deal with it. How big an improvement will we see with Primitive Shaders? It depends on the games and how geometry-heavy they are. You will see the biggest differences at higher resolutions, because the higher you go in resolution, the more geometry you have to deal with.
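
To make the binning idea concrete, here is a toy sketch of what a draw-stream binning rasterizer does conceptually: triangles are sorted into screen-space tiles so each tile's rasterization and framebuffer traffic stay in on-chip cache. This is my own simplified illustration with made-up tile and screen sizes, not AMD's actual hardware algorithm:

```cpp
#include <algorithm>
#include <array>
#include <cstdio>
#include <vector>

// Toy illustration of screen-space binning: NOT AMD's actual DSBR hardware
// algorithm, just the general idea of sorting triangles into tiles so each
// tile's rasterization and framebuffer traffic stay in on-chip cache.
struct Tri { float x[3], y[3]; };

constexpr int kTile = 32;                      // tile size in pixels (made up)
constexpr int kScreenW = 256, kScreenH = 256;  // screen size (made up)
constexpr int kTilesX = kScreenW / kTile, kTilesY = kScreenH / kTile;

int main() {
    std::vector<Tri> tris = {
        {{10, 200, 40}, {10, 30, 120}},        // two example triangles
        {{100, 250, 130}, {100, 140, 220}},
    };
    std::array<std::vector<int>, kTilesX * kTilesY> bins;  // one bin per tile

    for (int i = 0; i < (int)tris.size(); ++i) {
        const Tri& t = tris[i];
        // Conservative bounding box of the triangle, clamped to the screen,
        // expressed in tile coordinates.
        int tx0 = std::max(0, (int)std::min({t.x[0], t.x[1], t.x[2]}) / kTile);
        int ty0 = std::max(0, (int)std::min({t.y[0], t.y[1], t.y[2]}) / kTile);
        int tx1 = std::min(kTilesX - 1, (int)std::max({t.x[0], t.x[1], t.x[2]}) / kTile);
        int ty1 = std::min(kTilesY - 1, (int)std::max({t.y[0], t.y[1], t.y[2]}) / kTile);
        // Append the triangle to every tile its bounding box overlaps.
        for (int ty = ty0; ty <= ty1; ++ty)
            for (int tx = tx0; tx <= tx1; ++tx)
                bins[ty * kTilesX + tx].push_back(i);
    }
    // A binning rasterizer would now shade tile by tile instead of triangle
    // by triangle, touching each framebuffer tile only once.
    for (int b = 0; b < kTilesX * kTilesY; ++b)
        if (!bins[b].empty())
            std::printf("tile %d: %zu triangle(s)\n", b, bins[b].size());
}
```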
And GV100 drivers and Volta-ready CUDA 9 are already out. GV102/4/7... will be tweaks of GV100 - not a "new architecture".
It has nothing to do with this situation. The compute kernels GV100 has to deal with are general purpose, and writing drivers for them is much easier than creating drivers for consumer Volta, which has massive changes in its architecture layout and has to deal with gaming. Nvidia coming up with game-ready drivers on day one is not a given.
 
It has nothing to do with this situation. The compute kernels GV100 has to deal with are general purpose, and writing drivers for them is much easier than creating drivers for consumer Volta, which has massive changes in its architecture layout and has to deal with gaming. Nvidia coming up with game-ready drivers on day one is not a given.

Unlike AMD, NVIDIA actually spends time before a product release perfecting their software as much as possible. Sure, there are often minor improvements after release, especially for new games that come out, but in general their software quality has been very good on day one. So, I would be shocked if consumer Volta cards ship with sub-par drivers or with important hardware features disabled. I'm not sure why you'd think otherwise? NVIDIA has an extremely long track record of delivering good drivers and good performance on day one, unlike AMD.
 
Unlike AMD, NVIDIA actually spends time before a product release perfecting their software as much as possible. Sure, there are often minor improvements after release, especially for new games that come out, but in general their software quality has been very good on day one. So, I would be shocked if consumer Volta cards ship with sub-par drivers or with important hardware features disabled. I'm not sure why you'd think otherwise? NVIDIA has an extremely long track record of delivering good drivers and good performance on day one, unlike AMD.
It's not that I think otherwise. I just take BOTH possibilities as equally likely.

Everything depends on the architecture layout used. If they use the Paxwell architecture, there will be no problem; they can just use current drivers. If they use the GP100 architecture layout, a little modification will be required for the gaming drivers. The full Volta architecture will require quite a lot of work.

Also, as I have written, there will be one huge surprise linked to multi-GPU technology that will pave the way for Nvidia's future. And this will require quite a lot of work.
 
It's not that I think otherwise. I just take BOTH possibilities as equally likely.

Everything depends on the architecture layout used. If they use the Paxwell architecture, there will be no problem; they can just use current drivers. If they use the GP100 architecture layout, a little modification will be required for the gaming drivers. The full Volta architecture will require quite a lot of work.

Also, as I have written, there will be one huge surprise linked to multi-GPU technology that will pave the way for Nvidia's future. And this will require quite a lot of work.

NVIDIA has been redesigning their shader cores every generation since Kepler. At the very least, that requires a new shader compiler. We know that Volta has very different shader cores from Pascal's, so I'm not sure why you think they can just use the Pascal driver for Volta?

I'll ignore the last sentence as yet another cryptic "I know more than you all" comment. If you have something to say, just say it please.
 
NVIDIA has been redesigning their shader cores every generation since Kepler. At the very least, that requires a new shader compiler. We know that Volta has very different shader cores from Pascal's, so I'm not sure why you think they can just use the Pascal driver for Volta?

I'll ignore the last sentence as yet another cryptic "I know more than you all" comment. If you have something to say, just say it please.
So far, I can only give you hints about it.

As for the first part: the layout and core throughput of the CUDA architecture haven't changed since Maxwell. Consumer Pascal GPUs use the same layout, which is why there is no IPC difference between Pascal and Maxwell GPUs. That is why they could use the same drivers with little work beyond tuning for specific core counts. If Nvidia reuses the Paxwell architecture, they will not have to work on it.

Thankfully, I have technical hints that show otherwise. Volta is a new layout with 64 cores and a 256 KB register file per SM. If it were the Maxwell/consumer Pascal layout, Nvidia would not be able to implement the specific multi-GPU tech that will pave the way for Nvidia's future. This also hints that we will actually see the GV100 layout rather than the GP100 layout, because of the explicit scheduling technology that is apparent in Volta and not in HPC Pascal.
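
For a sense of why the per-SM register file size matters, here is trivial back-of-the-envelope occupancy arithmetic. The 256 KB figure is the claim above; the rest is standard register-pressure math, deliberately ignoring the other hardware caps (e.g. maximum resident warps) a real GPU also imposes:

```cpp
#include <cstdio>

// Rough occupancy arithmetic for a 256 KB per-SM register file (the figure
// claimed above). Real GPUs impose further caps, such as a maximum number
// of resident warps per SM, which this sketch deliberately ignores.
int main() {
    const int regFileBytes = 256 * 1024;          // 256 KB per SM
    const int regsPerSM    = regFileBytes / 4;    // 65,536 32-bit registers
    for (int regsPerThread : {32, 64, 128, 255}) {
        int threads = regsPerSM / regsPerThread;  // register-limited threads
        std::printf("%3d regs/thread -> %5d threads/SM (%3d warps)\n",
                    regsPerThread, threads, threads / 32);
    }
}
```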
 
According to the Vega whitepaper, the DSBR brings 10% more performance in the current state of software, and a huge improvement in VRAM-limited situations. What the DSBR affects are minimum and maximum framerates. HBCC improves minimum framerates. Dramatically.

Geometry and compute affect maximum framerates. Geometry is the weak link in Vega and Fiji GPUs; however, Primitive Shaders are there to deal with it. How big an improvement will we see with Primitive Shaders? It depends on the games and how geometry-heavy they are. You will see the biggest differences at higher resolutions, because the higher you go in resolution, the more geometry you have to deal with.

HBCC only improves minimum framerates when a game chokes from hitting the VRAM wall; HBCC is largely irrelevant outside of VRAM-limited situations. HBCC has absolutely nothing to do with performance itself.
 
HBCC only improves minimum framerates when a game chokes from hitting the VRAM wall; HBCC is largely irrelevant outside of VRAM-limited situations. HBCC has absolutely nothing to do with performance itself.
It does for scheduling. It's part of this pipeline of features; it is not a separate part of the pipeline in Vega. If you look at Vega from the perspective of separate features, you will see just bits.

It's when all of those features are working in perfect sync that you see the biggest improvements of this architecture.
 
It does for scheduling. It's part of this pipeline of features; it is not a separate part of the pipeline in Vega. If you look at Vega from the perspective of separate features, you will see just bits.

It's when all of those features are working in perfect sync that you see the biggest improvements of this architecture.

Please explain how HBCC improves performance for games where all the resources for a level fit into video memory? Once all the data has been transferred over the PCIe bus, the GPU is reading everything directly from its local video memory. Please explain how HBCC has any effect after that point?
 
Please explain how HBCC improves performance for games where all the resources for a level fit into video memory? Once all the data has been transferred over the PCIe bus, the GPU is reading everything directly from its local video memory. Please explain how HBCC has any effect after that point?
The High Bandwidth Cache Controller has a huge impact on indexing the data and delivering it when it is needed. It's like the GPU is able to create "sort of" theoretical pipelines in advance and deliver them when they are needed. Indexing data is hugely important for reducing stalls in the pipeline, and all of this is part of the Intelligent Workload Distributor feature.

All of Vega's features "sort of" blend together. They have to be perfectly synced, and that is why AMD is having so much trouble delivering working drivers.

On the matter of this Vega thread:
https://lists.freedesktop.org/archives/mesa-dev/2017-September/168270.html
https://lists.freedesktop.org/archives/mesa-dev/2017-August/164897.html
 
It does for scheduling. It's part of this pipeline of features; it is not a separate part of the pipeline in Vega. If you look at Vega from the perspective of separate features, you will see just bits.

It's when all of those features are working in perfect sync that you see the biggest improvements of this architecture.

No, HBCC will not improve performance in any way when it is not VRAM-bottlenecked. When Vega sees future improvements from either game optimizations or driver improvements, HBCC is not going to be part of the reason, period.
 
No, HBCC will not improve performance in any way when it is not VRAM-bottlenecked. When Vega sees future improvements from either game optimizations or driver improvements, HBCC is not going to be part of the reason, period.
So it will not improve framerates because you say so...?

Yes, you are correct that it increases framerates in VRAM-limited situations. Dramatically. It also increases framerates a little bit when not VRAM-limited, in "normal" circumstances, because it is part of the pipeline and has to be synced with Primitive Shaders, the DSBR, and (especially) the Intelligent Workload Distributor.
 
HBCC probably means a Vega will last you longer than a 1070 or 1080 bought now.
 
So it will not improve framerates because you say so...?

Yes, you are correct that it increases framerates in VRAM-limited situations. Dramatically. It also increases framerates a little bit when not VRAM-limited, in "normal" circumstances, because it is part of the pipeline and has to be synced with Primitive Shaders, the DSBR, and (especially) the Intelligent Workload Distributor.

No, it doesn't have to be "synced" with Primitive Shaders and the DSBR. HBCC is a separate feature that should work without developer optimization; it just reallocates pages to reduce the amount of framebuffer used. And no, it doesn't increase framerates in any way when not VRAM-limited.

[Benchmark graph]
 
The High Bandwidth Cache Controller has a huge impact on indexing the data and delivering it when it is needed. It's like the GPU is able to create "sort of" theoretical pipelines in advance and deliver them when they are needed. Indexing data is hugely important for reducing stalls in the pipeline, and all of this is part of the Intelligent Workload Distributor feature.

All of Vega's features "sort of" blend together. They have to be perfectly synced, and that is why AMD is having so much trouble delivering working drivers.

On the matter of this Vega thread:
https://lists.freedesktop.org/archives/mesa-dev/2017-September/168270.html
https://lists.freedesktop.org/archives/mesa-dev/2017-August/164897.html

What the hell are you talking about? Here's Raja explaining what the HBCC is:

[Embedded video]
TL;DR - HBCC is about automagically transferring data from huge data sets into video memory. Nothing more, nothing less. It just treats the video memory as another level of cache, so if you have a data set that is 1TB in size, HBCC will just load the N GB (where N is the size of video memory) of data that is actively being used without the developer/application having to micro-manage residency.

So, again, if all your data fits (i.e. every game that is shipping right now) then HBCC does nothing to improve performance.
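
As a concrete (and heavily simplified) model of that "video memory as another level of cache" description, the sketch below is classic demand paging with LRU eviction, with an invented page count. It is not AMD's implementation, just the textbook mechanism the TL;DR describes:

```cpp
#include <cstdint>
#include <cstdio>
#include <list>
#include <unordered_map>

// Toy model of the behaviour described above: video memory acting as a
// cache over a larger data set. Page granularity and capacity are invented
// for the sketch; this is classic demand paging with LRU eviction, not
// AMD's actual implementation.
class ToyHBCC {
    size_t capacityPages;                    // how many pages fit in "VRAM"
    std::list<uint64_t> lru;                 // most recently used at front
    std::unordered_map<uint64_t, std::list<uint64_t>::iterator> resident;
public:
    explicit ToyHBCC(size_t pages) : capacityPages(pages) {}
    // Called on every GPU access to a page of the data set.
    bool touch(uint64_t page) {
        auto it = resident.find(page);
        if (it != resident.end()) {          // hit: data already in VRAM
            lru.splice(lru.begin(), lru, it->second);
            return true;
        }
        if (resident.size() == capacityPages) {  // full: evict coldest page
            resident.erase(lru.back());
            lru.pop_back();
        }
        lru.push_front(page);                // fault the page in over PCIe
        resident[page] = lru.begin();
        return false;
    }
};

int main() {
    ToyHBCC vram(4);                         // pretend VRAM holds 4 pages
    for (uint64_t p : {0, 1, 2, 3, 0, 1, 4, 0})  // access pattern with reuse
        std::printf("page %llu: %s\n", (unsigned long long)p,
                    vram.touch(p) ? "hit (local VRAM)" : "miss (fetched)");
}
```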
 
No, it doesn't have to be "synced" with Primitive Shaders and the DSBR. HBCC is a separate feature that should work without developer optimization; it just reallocates pages to reduce the amount of framebuffer used. And no, it doesn't increase framerates in any way when not VRAM-limited.

[Benchmark graph]
In all of what you are writing, you have forgotten one important thing. It's not about developer optimizations, but DRIVER optimizations. Primitive Shaders, the DSBR, and the IWD are all disabled in the drivers.

That is why there is no difference in performance currently. All of those features are tied together.
What the hell are you talking about? Here's Raja explaining what the HBCC is:

[Embedded video]
TL;DR - HBCC is about automagically transferring data from huge data sets into video memory. Nothing more, nothing less. It just treats the video memory as another level of cache, so if you have a data set that is 1TB in size, HBCC will just load the N GB (where N is the size of video memory) of data that is actively being used without the developer/application having to micro-manage residency.

So, again, if all your data fits (i.e. every game that is shipping right now) then HBCC does nothing to improve performance.
It is not only "just storing the data". It's indexing it so the features of the Vega pipeline can dynamically access the data and fill the cores when needed. That is why it's important for Vega to have all of its features enabled. Otherwise we will not see the benefits of HBCC for games at all.
 
In all of what you are writing, you have forgotten one important thing. It's not about developer optimizations, but DRIVER optimizations. Primitive Shaders, the DSBR, and the IWD are all disabled in the drivers.

That is why there is no difference in performance currently. All of those features are tied together.

It is not only "just storing the data". It's indexing it so the features of the Vega pipeline can dynamically access the data and fill the cores when needed. That is why it's important for Vega to have all of its features enabled. Otherwise we will not see the benefits of HBCC for games at all.
Please explain the difference to the end user between these two statements:
  • Vega is crap
  • Vega is wonderful hardware, but has crap drivers, so it runs like crap
 
In all of what you are writing, you have forgotten one important thing. It's not about developer optimizations, but DRIVER optimizations. Primitive Shaders, the DSBR, and the IWD are all disabled in the drivers.

That is why there is no difference in performance currently. All of those features are tied together.

It is not only "just storing the data". It's indexing it so the features of the Vega pipeline can dynamically access the data and fill the cores when needed. That is why it's important for Vega to have all of its features enabled. Otherwise we will not see the benefits of HBCC for games at all.

HBCC DOES NOT, I repeat, DOES NOT make the GPU faster in any way, PERIOD. It does not make the GPU pull data faster through re-indexing. And you can stop bringing up the Intelligent Workload Distributor as some kind of big thing; it is just a new version of the hardware scheduler.
 
It is not only "just storing the data". It's indexing it so the features of the Vega pipeline can dynamically access the data and fill the cores when needed. That is why it's important for Vega to have all of its features enabled. Otherwise we will not see the benefits of HBCC for games at all.

Source? All the information about HBCC that I've seen directly contradicts what you're saying here. A cache controller is exactly that -- it controls what data is resident in the cache. Vega's HBCC can treat all of video memory as a giant L3 cache, pulling data from other sources (system RAM, hard drives, whatever). That means an application doesn't have to micro manage what resources are resident in video memory, it can just say "here's all my data" and let the HBCC decide what textures or buffers should be moved into video memory, based on the GPU's access patterns. This is exactly how the GPU's L2 cache controller decides what should be in L2 as opposed to just sitting in video memory. Similarly, this is exactly how an L1 cache controller decides what should be in L1 instead of L2 or video memory. Here's some background reading that you might find useful:

https://en.wikipedia.org/wiki/Cache_hierarchy

And a more in-depth article about HBCC itself:

https://techgage.com/article/a-look-at-amd-radeon-vega-hbcc/
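
To illustrate the hierarchy idea from those links, here is a schematic lookup chain with invented, order-of-magnitude latencies (not real Vega numbers): a miss at one level falls through to the next, larger and slower one, and once data is resident close to the cores, where it originally came from no longer matters:

```cpp
#include <cstdint>
#include <cstdio>
#include <set>
#include <vector>

// Schematic of a multi-level memory hierarchy. Latencies are invented,
// order-of-magnitude values purely for illustration.
struct Level {
    const char* name;
    int latency;                 // made-up cycle count for reaching this level
    std::set<uint64_t> resident; // which addresses this level currently holds
};

int access(std::vector<Level>& mem, uint64_t addr) {
    int cost = 0;
    for (size_t i = 0; i < mem.size(); ++i) {
        cost += mem[i].latency;
        if (mem[i].resident.count(addr)) {
            // Fill all faster levels on the way back (simplified model).
            for (size_t j = 0; j < i; ++j) mem[j].resident.insert(addr);
            std::printf("0x%llx: hit in %s, ~%d cycles\n",
                        (unsigned long long)addr, mem[i].name, cost);
            return cost;
        }
    }
    return cost; // unreachable if the last level holds the whole data set
}

int main() {
    std::vector<Level> mem = {
        {"L1", 4, {}},
        {"L2", 30, {}},
        {"VRAM", 300, {}},
        {"system RAM (via HBCC)", 3000, {0xBEEF}},
    };
    access(mem, 0xBEEF);  // first touch walks all the way out
    access(mem, 0xBEEF);  // now it is cached close to the GPU cores
}
```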
 
Source? All the information about HBCC that I've seen directly contradicts what you're saying here. A cache controller is exactly that -- it controls what data is resident in the cache. Vega's HBCC can treat all of video memory as a giant L3 cache, pulling data from other sources (system RAM, hard drives, whatever). That means an application doesn't have to micro manage what resources are resident in video memory, it can just say "here's all my data" and let the HBCC decide what textures or buffers should be moved into video memory, based on the GPU's access patterns. This is exactly how the GPU's L2 cache controller decides what should be in L2 as opposed to just sitting in video memory. Similarly, this is exactly how an L1 cache controller decides what should be in L1 instead of L2 or video memory. Here's some background reading that you might find useful:

https://en.wikipedia.org/wiki/Cache_hierarchy

And a more in-depth article about HBCC itself:

https://techgage.com/article/a-look-at-amd-radeon-vega-hbcc/
Exactly. But the Intelligent Workload Distributor, which is a new, next-generation scheduler, is aware of the data that has been cached and moves it according to need, feeding the cores with work. That is how HBCC can increase the performance of the GPU.

HBCC DOES NOT, I repeat, DOES NOT make the GPU faster in any way, PERIOD. It does not make the GPU pull data faster through re-indexing. And you can stop bringing up the Intelligent Workload Distributor as some kind of big thing; it is just a new version of the hardware scheduler.
How hard is it for you two to grasp the concept that HBCC has to work in tandem with ALL the other Vega features to bring performance uplifts?

Ergo: read this post fully.
 
Please explain the difference to the end user between these two statements:
  • Vega is crap
  • Vega is wonderful hardware, but has crap drivers, so it runs like crap
This forum loves car analogies, so let me explain this to you using one.

A new car is presented that has an engine designed to run on a fuel that will be released in the future. You buy it in anticipation of that fuel, but meanwhile you fill it up with "traditional" fuel.

And then you start complaining about what crap hardware you have bought, because it's running sub-par compared to what you thought it would be.
 
Exactly. But the Intelligent Workload Distributor, which is a new, next-generation scheduler, is aware of the data that has been cached and moves it according to need, feeding the cores with work. That is how HBCC can increase the performance of the GPU.

Okay, the data is in video memory. It might've gotten there because the app put it there (traditional memory management) or it might've gotten there because HBCC moved it there (new-style HBCC level 3 cache). Please explain, from this point, how HBCC is improving performance over the traditional memory management scheme? The data is already in video memory, it can't magically be accessed faster once it's in video memory. Once that data is in video memory, then the rest of the cache hierarchy operates the same no matter what, so the new L2 and standard L1 caches will read the data and feed it to the units that require it (i.e. the IWD in your example).

So, given that I'm assuming you still think otherwise, please explain how HBCC improves the performance of the GPU once the data is already in video memory, and why that wouldn't apply to a traditional memory management scheme where the data is already in video memory.

Or, even better, point us at a whitepaper that proves your point, since you seem to love to do that. All the documentation on the internet I've seen (including from Raja himself) suggests that we are correct and that you're not understanding what HBCC is.
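
One way to see the point being argued here: however a buffer became resident (one bulk, app-managed upload versus many small, demand-driven copies), the subsequent reads touch identical local memory and cost the same. A toy host-side sketch with invented sizes, where plain host memory stands in for VRAM:

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <cstring>
#include <vector>

// Toy stand-in for the argument above: no matter how a buffer became
// resident (one bulk upload vs. many small, demand-driven copies), reading
// it afterwards is identical work. Plain host memory plays the role of VRAM.
int main() {
    constexpr size_t N = 1 << 24, kChunk = 4096;  // sizes are invented
    std::vector<float> source(N, 1.0f), vramA(N), vramB(N);

    // Path 1: app-managed, one big transfer.
    std::memcpy(vramA.data(), source.data(), N * sizeof(float));
    // Path 2: "demand paged", many small transfers.
    for (size_t off = 0; off < N; off += kChunk)
        std::memcpy(vramB.data() + off, source.data() + off,
                    std::min(kChunk, N - off) * sizeof(float));

    auto read = [](const std::vector<float>& buf, const char* tag) {
        auto t0 = std::chrono::steady_clock::now();
        double sum = 0;
        for (float v : buf) sum += v;             // the actual "GPU" reads
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                      std::chrono::steady_clock::now() - t0).count();
        std::printf("%s: sum=%.0f in %lld us\n", tag, sum, (long long)us);
    };
    read(vramA, "bulk-uploaded buffer");          // both reads take
    read(vramB, "demand-paged buffer ");          // essentially equal time
}
```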
 