https://forum.beyond3d.com/threads/amd-speculation-rumors-and-discussion.56719/page-250#post-1940620

Well, it looks like Polaris is a very efficient architecture after all; however, it requires low voltage and low clocks to get there. 71 W under load for the GPU die is... roughly 2.5 times the efficiency of, for example, the R9 390 die.

The whole board consumes 264 W under load; take the memory's power consumption out of the equation and we get around 190 W for the R9 390 die.
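
Here's that arithmetic as a quick Python sketch. The ~74 W I subtract for memory and board overhead is my own assumption, picked so the numbers close, and the efficiency ratio assumes roughly comparable performance per die:

```python
# Back-of-the-envelope: R9 390 die power vs. the measured Polaris 10 die power.
r9_390_board_w = 264       # measured board power under load
memory_and_board_w = 74    # assumed GDDR5 + board overhead (not measured)
polaris_die_w = 71         # measured Polaris 10 die power under load

r9_390_die_w = r9_390_board_w - memory_and_board_w
print(f"R9 390 die: ~{r9_390_die_w} W")                        # ~190 W
print(f"Die power ratio: {r9_390_die_w / polaris_die_w:.1f}x")  # ~2.7x
```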

I'm starting to wonder what we will get with Vega, which will be an HBM2 GPU architecture. Bear in mind that all of this is me thinking out loud: estimation and analysis based on what we know. Let's look at a hypothetical 3072 GCN core GPU against the 190 W of the Hawaii/Grenada die. Vega will use the same TSMC 16 nm FF+ process as Nvidia's Pascal, so the process should be better than Samsung's 14 nm LPP.

What did we see with Fiji compared to the Grenada/Hawaii GPUs? A reduction in power consumption under load. The R9 390 consumed 264 W on average, the Fury X 246 W, and the Asus Strix Fury 200 W(!). Bear in mind that the Fiji chip was bigger and used HBM memory.

We can see a similar situation with the Polaris/Vega architectures. Vega can use less power under load than Polaris simply because of the HBM2 stacks. Two stacks will give Vega 512 GB/s and 8 GB of VRAM, which will be plenty for a 3072 GCN core/64 ROP chip, and at that configuration the memory will use about 10 W of power. Compare that to the 37 W that the memory in the RX 480 consumes...
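
The memory-side numbers, as a sketch (the per-stack bandwidth and the 10 W total are the figures floating around for HBM2, not measurements):

```python
# Rumored HBM2 figures vs. the RX 480's GDDR5 memory subsystem.
hbm2_stacks = 2
bw_per_stack_gbs = 256     # GB/s per HBM2 stack (spec figure)
gb_per_stack = 4
hbm2_power_w = 10          # claimed total for both stacks
rx480_mem_power_w = 37     # RX 480 GDDR5 figure cited above

print(f"HBM2: {hbm2_stacks * bw_per_stack_gbs} GB/s, "
      f"{hbm2_stacks * gb_per_stack} GB, ~{hbm2_power_w} W")
print(f"Memory power saved vs. RX 480: ~{rx480_mem_power_w - hbm2_power_w} W")
```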

Overall, IMO, Polaris was a Maxwell-like leap forward for AMD. Vega, on the other hand, can be a Pascal-like leap forward in terms of performance and efficiency.

What do you guys think about this possibility for small Vega? This is only me thinking out loud: 3072 GCN cores, 64 ROPs, 8 GB of HBM2 (2 stacks) with 512 GB/s, 1.4 GHz, 8.6 TFLOPs of compute power (exactly the same as the Fury X), 150 W TDP.
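
The FLOPS figure checks out with the standard GCN peak-throughput formula (2 FMA ops per core per clock):

```python
# Peak FP32 throughput = cores * 2 ops/clock (FMA) * clock (GHz) / 1000.
def tflops(cores, clock_ghz):
    return cores * 2 * clock_ghz / 1000.0

print(f"small Vega: {tflops(3072, 1.40):.1f} TFLOPs")  # 8.6
print(f"Fury X:     {tflops(4096, 1.05):.1f} TFLOPs")  # 8.6 -- identical
```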

And we still have to see what the next generation of graphics IP will bring to the table in graphics capabilities...
 
You are reading too much into one result. Just as you can get isolated samples that overclock well, you can get isolated samples that underclock well. Stock Polaris 10 is no more efficient than Maxwell.

It's hard for me to be optimistic about Vega and other future AMD GPUs when this is where things currently stand.

[chart: performance per watt at 2560×1440]
 
You are entitled to your opinions; however, you forget that you can already reduce the power consumption of Polaris at max clocks by 30 W just by undervolting the GPU.

Secondly, as has been pointed out on other forums:
[image: Polaris BIOS voltage/frequency table]

This comes from the BIOSes of the GPUs/silicon. So all Polaris GPUs should work at 0.9 V and 1060 MHz, as in this particular case (this guy was able to go even lower on voltage: 0.88 V at 1065 MHz). Differences between silicon samples will be marginal, within just a 0.05 V range.
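
For what it's worth, those savings follow from the usual dynamic-power relation P ∝ f·V². A rough sketch (the stock point of 1266 MHz at ~1.15 V is my assumption for a typical RX 480):

```python
# Dynamic power scales roughly with frequency * voltage^2.
def rel_power(f_mhz, v, f0_mhz=1266, v0=1.15):  # stock point is assumed
    return (f_mhz / f0_mhz) * (v / v0) ** 2

print(f"0.90 V @ 1060 MHz: {rel_power(1060, 0.90):.0%} of stock dynamic power")
print(f"0.88 V @ 1065 MHz: {rel_power(1065, 0.88):.0%} of stock dynamic power")
```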

Thirdly, you have proven exactly my point. Polaris was a Maxwell-like step in efficiency; Vega is very likely to be a Pascal-like step for AMD in this particular area.

The reasons behind this I have already posted: TSMC 16 nm FF+ and HBM2 memory.
 
All these performance-per-watt figures are based on mature APIs and regularly updated Windows drivers, though. Just as long as nobody expects the same to happen on their Macs; it didn't even happen when the system was more open.
 
This comes from the BIOSes of the GPUs/silicon. So all Polaris GPUs should work at 0.9 V and 1060 MHz, as in this particular case (this guy was able to go even lower on voltage: 0.88 V at 1065 MHz). Differences between silicon samples will be marginal, within just a 0.05 V range.

But it's also possible to tweak all graphics cards in this way; it's not limited to AMD or the RX 480. You could underclock and undervolt an Nvidia GPU and see similar efficiency gains.

Thirdly, you have proven exactly my point. Polaris was a Maxwell-like step in efficiency; Vega is very likely to be a Pascal-like step for AMD in this particular area.

The difference, though, is that Maxwell's increase in efficiency was achieved on the same process as the generation that came before it. Achieving efficiency gains when moving to a smaller process is expected, much like Nvidia's transition from Maxwell to Pascal. When AMD is only able to reach parity with Nvidia chips from a previous-generation process, it's not encouraging.
 
But it's also possible to tweak all graphics cards in this way; it's not limited to AMD or the RX 480. You could underclock and undervolt an Nvidia GPU and see similar efficiency gains.
Yes, that is why we see the desktop GTX 1060 in laptops and across the mobile space. There is no difference between desktop and laptop parts these days, as shown even by AMD's latest announcement of the RX 470 for laptops.
The difference, though, is that Maxwell's increase in efficiency was achieved on the same process as the generation that came before it. Achieving efficiency gains when moving to a smaller process is expected, much like Nvidia's transition from Maxwell to Pascal. When AMD is only able to reach parity with Nvidia chips from a previous-generation process, it's not encouraging.
You forget that AMD, with Vega, can achieve Pascal efficiency on the same 16 nm node. And it appears that we are stuck with the 14/16 nm node for quite some time, possibly up to 2020, so every upcoming GPU lineup will be on this node.

And one last bit: Maxwell was an architecture designed for the 20 nm process, which failed. Nvidia stripped the architecture of all its FP64 capabilities and unified memory, ported it to 28 nm, then added Pascal between Maxwell and Volta (which is Maxwell respun on 16 nm; the true new GPU on a new architecture is the GP100 chip), and here we are in 2016. You can see it in the lineups Nvidia brought out before and after their plans changed.
 
You forget that AMD, with Vega, can achieve Pascal efficiency on the same 16 nm node. And it appears that we are stuck with the 14/16 nm node for quite some time, possibly up to 2020, so every upcoming GPU lineup will be on this node.

But we don't actually know this. What we know is that AMD manufactured a GPU on Global Foundries' 14 nm process that achieved parity with Maxwell and fell short of Pascal, which was manufactured on TSMC's 16 nm process. AMD gained some efficiency, but we don't know how much came from architecture enhancements and how much came from the new process. It's likely that TSMC has a better process than Global Foundries, but the only data point we have for comparison is Apple's chips, which were sourced from both companies and indicated a small advantage for TSMC. We don't have any GPUs to compare.

We do know that AMD lagged behind Nvidia on 28 nm, even while AMD was using the more efficient HBM and Nvidia was using GDDR5. I wouldn't assume that Vega is going to achieve parity with Pascal. It's possible, but just as with AMD's Zen, I am not going to buy into any hype from a company that has been lagging behind its competitors for years.

And one last bit: Maxwell was an architecture designed for the 20 nm process, which failed. Nvidia stripped the architecture of all its FP64 capabilities and unified memory, ported it to 28 nm, then added Pascal between Maxwell and Volta (which is Maxwell respun on 16 nm; the true new GPU on a new architecture is the GP100 chip), and here we are in 2016. You can see it in the lineups Nvidia brought out before and after their plans changed.

I am not really sure what you are trying to say here. Nvidia designs separate GPUs for gaming and compute, so the gaming GPUs are focused on single-precision performance while the compute GPUs (GP100) have much more double-precision compute. The architectures are the same, except that GP100 has extra features like NVLink and no display output.

AMD doesn't have the same design bandwidth and releases GPUs that tend to be jacks of all trades (i.e., some gaming and some compute).

I am not sure how the failure of the 20 nm process indicates a weaker product from Nvidia. They adapted to the processes available to them, just as Intel added Kaby Lake after Skylake when it looked like the 10 nm process was going to be delayed.
 
But we don't actually know this. What we know is that AMD manufactured a GPU on Global Foundries' 14 nm process that achieved parity with Maxwell and fell short of Pascal, which was manufactured on TSMC's 16 nm process. AMD gained some efficiency, but we don't know how much came from architecture enhancements and how much came from the new process. It's likely that TSMC has a better process than Global Foundries, but the only data point we have for comparison is Apple's chips, which were sourced from both companies and indicated a small advantage for TSMC. We don't have any GPUs to compare.

We do know that AMD lagged behind Nvidia on 28 nm, even while AMD was using the more efficient HBM and Nvidia was using GDDR5. I wouldn't assume that Vega is going to achieve parity with Pascal. It's possible, but just as with AMD's Zen, I am not going to buy into any hype from a company that has been lagging behind its competitors for years.
I do not think you know exactly what you are writing here. The only place where AMD lagged behind Nvidia was where they were bottlenecked by software. Have you compared the latest scores from APIs that do not bottleneck either architecture? Have you actually calculated GFLOPs/watt for the AMD and Nvidia architectures to say this with a straight face? Because it appears that you didn't. Fiji had the highest efficiency of all the 28 nm GPUs, both in DX12/Vulkan gaming and in GFLOPs/watt. Only with the latest node was Nvidia able to beat them on this metric.
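
For the record, this is what the GFLOPs/watt calculation looks like with spec-sheet numbers (peak FP32 divided by board power; a paper exercise, since measured draw varies):

```python
# Peak FP32 GFLOPs per watt from spec-sheet figures, not measured power.
cards = {
    # name: (shader cores, clock in GHz, board power in W)
    "Fury X (Fiji)":      (4096, 1.050, 275),
    "GTX 980 Ti (GM200)": (2816, 1.075, 250),
}
for name, (cores, ghz, watts) in cards.items():
    gflops = cores * 2 * ghz  # 2 FMA ops per core per clock
    print(f"{name}: {gflops:.0f} GFLOPs, {gflops / watts:.1f} GFLOPs/W")
```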

I am not really sure what you are trying to say here. Nvidia designs separate GPUs for gaming and compute, so the gaming GPUs are focused on single-precision performance while the compute GPUs (GP100) have much more double-precision compute. The architectures are the same, except that GP100 has extra features like NVLink and no display output.

AMD doesn't have the same design bandwidth and releases GPUs that tend to be jacks of all trades (i.e., some gaming and some compute).

I am not sure how the failure of the 20 nm process indicates a weaker product from Nvidia. They adapted to the processes available to them, just as Intel added Kaby Lake after Skylake when it looked like the 10 nm process was going to be delayed.
What I said is absolutely simple. Maxwell was designed for the 20 nm process. When that process failed, Nvidia ported it to 28 nm and sold it as the GTX 9XX series. They also ported it to 16 nm FF+ and sell it as the GTX 10XX series. The only true, new Pascal GPU, with a completely different architecture, is the GP100 chip.

The only GPU that was a jack of all trades was the HD 7970 in early 2013, with its 1/4 FP64 ratio. Every subsequent generation of consumer AMD GPUs has had a very marginal FP64 ratio.

And one last bit: where did I write that the failed 20 nm process resulted in a weaker Nvidia product? Or was it you who understood it that way...?
 
I do not think you know exactly what you are writing here. The only place where AMD lagged behind Nvidia was where they were bottlenecked by software. Have you compared the latest scores from APIs that do not bottleneck either architecture? Have you actually calculated GFLOPs/watt for the AMD and Nvidia architectures to say this with a straight face? Because it appears that you didn't. Fiji had the highest efficiency of all the 28 nm GPUs, both in DX12/Vulkan gaming and in GFLOPs/watt. Only with the latest node was Nvidia able to beat them on this metric.

Real-world performance is much more important than theoretical performance. When Fiji was released there were zero DX12 games out, which meant it lost just about every DX11 test to the 980 Ti. Even now there are only a handful of DX12 games, and not all of them gain performance when moving from DX11. When it comes to compute, GM200 also tends to beat Fiji in single-precision tasks. If I'm buying a graphics card, I would much rather buy something that has great performance now than something with a vague promise that it will be better someday.

The only true, new Pascal GPU, with a completely different architecture, is the GP100 chip.

You keep saying this, but it doesn't make any sense. GP104 is just as much Pascal as GP100, just as GM107, GM204, GM200, etc. are all Maxwell.

And one last bit: where did I write that the failed 20 nm process resulted in a weaker Nvidia product? Or was it you who understood it that way...?

You stated that because Nvidia couldn't make GPUs on the 20 nm process, they stopped designing improved architectures. I disagree with that.
 
Real-world performance is much more important than theoretical performance. When Fiji was released there were zero DX12 games out, which meant it lost just about every DX11 test to the 980 Ti. Even now there are only a handful of DX12 games, and not all of them gain performance when moving from DX11. When it comes to compute, GM200 also tends to beat Fiji in single-precision tasks. If I'm buying a graphics card, I would much rather buy something that has great performance now than something with a vague promise that it will be better someday.
I would like to see OpenCL performance. You need four GTX 980 Tis to equal three Fury X GPUs in OpenCL applications. As for DX12/Vulkan/Metal/Mantle, it was actually on the rise, so here again we will not agree.
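
On paper, at least, the raw-FLOPs arithmetic points that way (spec numbers again; real OpenCL throughput depends heavily on the workload):

```python
# Aggregate peak FP32 for the two setups, spec-sheet numbers only.
fury_x_tflops = 4096 * 2 * 1.050 / 1000     # ~8.6 per card
gtx_980ti_tflops = 2816 * 2 * 1.075 / 1000  # ~6.1 per card

print(f"3x Fury X: {3 * fury_x_tflops:.1f} TFLOPs")    # ~25.8
print(f"4x 980 Ti: {4 * gtx_980ti_tflops:.1f} TFLOPs")  # ~24.2
```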


You keep saying this, but it doesn't make any sense. GP104 is just as much Pascal as GP100, just as GM107, GM204, GM200, etc. are all Maxwell.
The architecture layout for GM204 and the architecture layout for GP104 are exactly the same. They differ in core counts and register file size, which is smaller for consumer Pascal GPUs. The only new Nvidia GPU with the Pascal architecture was/is GP100, with a new architecture layout.


You stated that because Nvidia couldn't make GPUs on the 20 nm process, they stopped designing improved architectures. I disagree with that.
I stated that because of the failure of the 20 nm process, Nvidia had to port Maxwell back to 28 nm, and then reused this architecture in mainstream consumer parts on the 16 nm process. Which is absolutely correct if you look at the architecture design layout (128 cores in each SM, exactly the same cache sizes, etc.). There are two differences: the GDDR5X memory controller (which in all fairness is essentially the same as GDDR5, because GP104 can use both types of memory; they are not separate chips, it is the same chip in the GTX 1070 and 1080), and asynchronous compute finally working properly (properly meaning that Nvidia GPUs do not tank in performance with this option enabled in the rendering pipeline).
 
It seems, then, that Polaris is as efficient as the previous NVIDIA generation, and that Vega would match Pascal.

I got an RX 460 to put together an easy eGPU for my MBP, but it seems to be no faster than my GTX 960/Skylake setup, at least under TB2.

If the GTX 1050 stays at 75 W and can be made to work with the Mac, maybe I'll get one and use the RX 460 for other pending upgrades.
 
http://www.anandtech.com/show/10663/analyzing-sonys-playstation-4-pro-announcement/2

Some time ago I wrote that the next-generation AMD GPU architecture will not be compatible with previous generations of GCN, and that they will have to be emulated through drivers. To achieve compatibility with the previous generation of GPUs in consoles, the graphics hardware has to be binary 1:1 compatible.

The thing is, I thought it was related to GCN4. It appears the real next-generation graphics IP is the one we will see ending up in Vega and the Zen APUs, which can be confirmed by looking at how AMD designates Vega/Greenland: Graphics IP v9. Polaris is Graphics IP v8.
 
Well, I got some more information on this. It turns out that everything the head of AMD's Linux driver team said about the upcoming GPU lineup was correct; people just thought it was all about Polaris. Vega is the true next-generation AMD graphics IP. Polaris was developed to go... into the PlayStation 4 Pro. It is the same GPU. It is Vega that will require a driver overhead layer to create backwards compatibility with previous generations of GPUs on the software side. The Polaris design was in fact paid for by... Sony. That's why it has 36 CUs, exactly the same count as the PS4 Pro. AMD just tuned it for the desktop and mobile markets.

At the same time, it explains the high demand for full Polaris dies: consoles, upcoming Macs, the desktop market, the mining market...

Interesting times are still to come...
 
If this is so, will we see Polaris in the nMP or just Vega?

It would make sense for Apple to just target Vega in this context, but the cost issue is paramount. Even a cut-down Vega should cost more than Polaris, I guess, unless it has both GDDR5 and HBM2 memory controllers and the GPU is modular enough to easily yield additional SKUs at the lower end.
Interesting indeed.
It must have been a good deal with Sony, no exclusivity.
 
If this is so, will we see Polaris in the nMP or just Vega?

Hopefully Vega, to be honest. OpenCL for Polaris is lacklustre in macOS, down roughly a third compared to Windows.


Some benchmarks of the GTX 1080 against the RX 480: it didn't do too badly for a third of the price, but Vega is needed for true compute performance.

 
p.l, that would be great, but cost issues can get in the way, unless Vega is indeed modular and they make a full lineup with it, which I doubt. koyoot has already said how it goes, and it seems correct.
Polaris is barely out the door, and I don't see AMD replacing it this soon.
 
p.l, that would be great, but cost issues can get in the way, unless Vega is indeed modular and they make a full lineup with it, which I doubt. koyoot has already said how it goes, and it seems correct.
Polaris is barely out the door, and I don't see AMD replacing it this soon.


Even if that's true, and let's say it is, all they are offering is the same performance as the D700s with more RAM.
The only way forward would be to bring OpenCL 2.0 to macOS so Polaris can take full advantage; there is a 30% performance difference between OpenCL 1.2 and OpenCL 2.0 in my testing.
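
If you want to see which OpenCL version macOS actually exposes for a card, a quick pyopencl query (assuming you have pyopencl installed) will show it; on a Mac it reports OpenCL 1.2 across the board, which is exactly the problem:

```python
# List every OpenCL platform/device and the OpenCL version it reports.
import pyopencl as cl

for platform in cl.get_platforms():
    for device in platform.get_devices():
        print(device.name, "->", device.version)
```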
 
Even if that's true, and let's say it is, all they are offering is the same performance as the D700s with more RAM.
The only way forward would be to bring OpenCL 2.0 to macOS so Polaris can take full advantage; there is a 30% performance difference between OpenCL 1.2 and OpenCL 2.0 in my testing.

As some say, all these performance and efficiency numbers come from Windows. Nobody knows how many gigaflops per watt these AMD and Nvidia cards can do on a Mac. It's certainly not as good as Windows in any application, benchmark, or game, because the drivers and APIs are ****.
 
As some say, all these performance and efficiency numbers come from Windows. Nobody knows how many gigaflops per watt these AMD and Nvidia cards can do on a Mac. It's certainly not as good as Windows in any application, benchmark, or game, because the drivers and APIs are ****.
Jobs and crew pushed for elegant APIs, not fast and efficient ones.
 