Your slide shows Mantle being shipped 6 months before Metal. Do you seriously believe that:

1) AMD shared details with Apple in the years leading up to their release?
2) Apple built Metal from the ground up in less than 6 months?

Apple was working on Metal for years before they announced it in WWDC in 2014. Are there similarities between all the new APIs? Sure, they all have command queues and command buffers, but OpenCL had command queues and Apple was the major force behind that API, so I think it's more likely that the folks at Apple said "yeah we like this command queue idea, let's use that for our new combined API" than some guy from AMD calling them up and telling them all about their new top-secret project. Mantle is basically a thin wrapper over GCN, which a cynical person could view as AMD admitting that their D3D driver team was terrible and couldn't compete with NVIDIA so they basically gave the GCN hardware specs to app developers and said "good luck".

In my experience, the "low overhead" nature of Metal makes it by far the easiest of the next-gen APIs to work with. Yes, it shares the common aspects of all next-gen APIs such as command buffers with encoding on multiple CPU threads or precompiled state objects, but that's about the extent of the similarities. So, if you want to claim that because Metal has command queues it's based on Mantle, I'd argue that it draws more from OpenCL and D3D11 than the low-level APIs like Mantle, Vulkan and D3D12.
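The shared pattern described above (command buffers encoded on multiple CPU threads, then committed to a command queue in app-defined order) can be sketched as a toy model in plain Python. No real GPU API is involved; all class and function names here are made up for illustration.

```python
import threading
from queue import Queue

class CommandBuffer:
    """Toy stand-in for a GPU command buffer: it just records commands."""
    def __init__(self, label):
        self.label = label
        self.commands = []

    def encode(self, cmd):
        self.commands.append(cmd)

def encode_pass(buf, draw_calls):
    # Each CPU thread encodes into its own buffer with no shared state,
    # which is what makes multi-threaded encoding cheap in these APIs.
    for call in draw_calls:
        buf.encode(call)

gpu_queue = Queue()  # toy stand-in for the API's command queue
buffers = [CommandBuffer(f"pass{i}") for i in range(3)]
threads = [
    threading.Thread(target=encode_pass, args=(buf, [f"draw{j}" for j in range(4)]))
    for buf in buffers
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Commit in app-defined order: submission order is decided by the app, not the driver.
for buf in buffers:
    gpu_queue.put(buf)
```

The point of the sketch is only that encoding is parallel while submission order stays deterministic, which is the part all of these APIs (and OpenCL's command queues before them) have in common.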

Asgorath, I appreciate your insight here. Any idea if Metal will bring macOS closer to performance parity with Windows and DirectX in things like games? With most major 3D engines making sure to have iOS support, I am hoping to see broader support for cross-platform gaming on the Mac.
 
DirectX 12 was also developed for years before Mantle. And you know what is funniest? Microsoft took Mantle's feature set as a base for DX12 in the process of developing it.

Vulkan and LiquidVR are to Mantle like Michael Jackson in "Black or White": same guy, with the same personality, just a different surface.
DX12 and Metal are to Mantle like the Disco Volante vs the Alfa Romeo 8C: same skeleton, same chassis, same engine, but the ride is softer, the exhaust is tuned differently, the surface is different. And the feeling is different. Makes sense?

And I agree with you. Metal is definitely the easiest to work with, easier I think than even DX11, despite the years of development, education and experience many developers have with that API.
It must be lonely when you're the only one who supposedly knows the truth... even after others have posted links that discredit your belief.
Nobody has disproven anything I've said.
 

Nobody has disproven anything I've said.

You keep saying no one has disproved you because you just ignore what many others have been telling you.
DirectX 12 isn't based on Mantle, and neither is Metal. I know that you wish it with all your heart, but you are wrong and have been proved wrong before in similar discussions.
 
Whether Metal is easy to develop for or not, at the end of the day the big-name developers already struggle to make deadlines for the three main platforms.

And if the Mac has mobile GPUs, they have little incentive to develop for it. Customers will just cry like grown-up babies that Mac games don't perform like they do on Windows. And we know for sure Windows will always have better drivers with full feature sets, too.

So Metal games will be mostly iOS ports and some stuff from companies like Feral. Nothing major will change when it comes to AAA titles.

I'm interested not in Mac gaming but in what Metal will do for pro apps. Nothing materialised there after the big hype more than a year ago. I think you still won't see much for about one more year.
 

Unfortunately, Metal is still missing some key features when compared with DX11, which is now ~7 years old. That makes it really hard for folks who want to port DX11 games from Windows to macOS, because those games are doing things that are either impossible or really difficult to emulate with Metal. This is confirmed by the fact that there are basically no actual games that use Metal, aside from perhaps the new World of Warcraft expansion (Legion). UE4 has a Metal back end, so perhaps when games that use UE4 get released we'll see better Mac versions.

I'm happy to just agree to disagree on this whole Metal vs Mantle thing. Are they similar in some regards? Sure. Does that mean Metal was based on Mantle? I haven't seen any evidence to suggest so. I think Apple likes being in control of their platform, and I think AMD is struggling and might overstate the relevance or importance of Mantle in the grand scheme of things (even Vulkan ended up being pretty different to Mantle once Khronos started working on it). Did AMD succeed in pushing the other next-gen APIs to be lower level? I would say so. Is that actually a good thing? Based on the number of DX12 titles that exist, especially those that perform better than their DX11 versions, I kind of doubt it. We learned all of this back in the 90s with 3dfx Glide and other low-level APIs (i.e. they're really hard to work with, not future-proof, etc.).
 
Hey, found some benchmarks of the RX 480 vs Titan X in DaVinci Resolve 12.5; looks like the RX 480 8GB is a good-value card.
(Plus more cards later on; it looks like it works as well as or better than the GTX 1070, but it's early days, so things may change with driver updates or something.)
https://forum.blackmagicdesign.com/viewtopic.php?f=21&t=49713
(Also seen reports of Resolve 12.5 using much more VRAM, giving people with 2GB cards some problems.)

They seem happy to have Titan X performance at such a price.
 
Compute performance of the two cards is similar: 6.1 TFLOPs vs 5.8 TFLOPs. If your software is properly coded, that is what you will see: similar performance on GPUs that have similar compute performance.
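For reference, those peak numbers follow directly from shader count times clock times 2 FLOPs per clock (one fused multiply-add per ALU per cycle). A quick sanity check, assuming the commonly published specs (boost clock for the RX 480, base clock for the Maxwell Titan X):

```python
def peak_tflops(shaders, clock_mhz, flops_per_clock=2):
    # 2 FLOPs per clock per ALU = one fused multiply-add per cycle
    return shaders * clock_mhz * 1e6 * flops_per_clock / 1e12

rx480 = peak_tflops(2304, 1266)    # RX 480: 2304 stream processors @ 1266 MHz boost
titanx = peak_tflops(3072, 1000)   # Titan X (Maxwell): 3072 CUDA cores @ 1000 MHz base
print(round(rx480, 1), round(titanx, 1))  # → 5.8 6.1
```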
 

You keep listing all these raw theoretical maximum performance numbers. If you compare the 480 vs the 1060, the 480 wins in every metric of theoretical performance, often by a considerable margin. However, in real games and applications, the 1060 is usually 10% or more ahead.

So, as usual, take these theoretical maximum numbers with a huge grain of salt. If every application ran at the theoretical maximum, the 480 would wipe the floor with the 1060, but in real life it's the reverse.
 
It appears that your reading comprehension from context is not working ;).
---------

Here are the Standard Candle results for the new $199 AMD Radeon RX 480 (RED) and the Nvidia Titan X (GREEN).
Resolve 12.5 Studio, X99 motherboard, 12-core Xeon.

Blur:
09 Nodes: RX480: 24 fps, Titan X: 24 fps
18 Nodes: RX480: 20 fps, Titan X: 16 fps
30 Nodes: RX480: 13 fps, Titan X: 11 fps
66 Nodes: RX480: 6 fps, Titan X: 5-6 fps

TNR:
1: RX480: 24 fps, Titan X: 24 fps
2: RX480: 18 fps, Titan X: 17-20 fps
4: RX480: 9 fps, Titan X: 11 fps
6: RX480: 7 fps, Titan X: 8 fps

Big thanks to Peter Richards of fixafilm.com for purchasing the 480 card; as there are no compute-based reviews of the card, he thought it important that it be tested for OpenCL performance. Let me know if you want any other benchmarks run.
--------

This is the post that was linked in the post I responded to.


Secondly, what if in DX11 games AMD GPUs are bottlenecked by the nature of the API, and by CPU overhead? Have you considered this? What happens when you lift the CPU overhead and the GPUs are no longer bottlenecked by the nature of the API, as in DX12 and Vulkan?

And again, what I have posted: a properly coded application that does not bottleneck the hardware in any way will show exactly that: a 6.1 TFLOPs GPU will be slightly faster than a 5.8 TFLOPs GPU, and an 8.6 TFLOPs GPU will be MUCH faster than a 5.8 TFLOPs GPU.

The whole point, even if you wanted to contradict it, has been proven correct.
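That scaling claim is simple arithmetic on the peak numbers, under the stated (and clearly idealised) assumption that both GPUs reach their theoretical maximum:

```python
def expected_speedup(tflops_a, tflops_b):
    """Speedup of GPU A over GPU B, assuming both reach their peak throughput."""
    return tflops_a / tflops_b

print(f"{expected_speedup(6.1, 5.8):.2f}x")  # 1.05x: only slightly faster
print(f"{expected_speedup(8.6, 5.8):.2f}x")  # 1.48x: much faster
```

As the reply above points out, real applications rarely hit those peaks, so the ratio is an upper bound on the expected gap, not a prediction.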
 

It's easy to pick out compute tests that favor one vendor over another. Another OpenCL test shows the 980 Ti with much better performance than the RX 480.

Secondly, what if in DX11 games AMD GPUs are bottlenecked by the nature of the API, and by CPU overhead? Have you considered this? What happens when you lift the CPU overhead and the GPUs are no longer bottlenecked by the nature of the API, as in DX12 and Vulkan?

Could be, but don't expect crazy performance breakthroughs. Comparing DirectX 11 vs 12 on the RX 480 at 1440p, Ars Technica shows that AMD sees a 15% increase in Ashes of the Singularity, a 3% increase in Hitman and an 11% decrease in Rise of the Tomb Raider.

Additionally, with the release of Pascal, Nvidia has also implemented its own limited form of async compute, meaning that AMD is no longer the sole beneficiary of async compute performance increases from these new APIs.
 
Nvidia is using preemption, not true hardware-level asynchronous compute. Games are using asynchronous shaders.

The Hotel scene in the LuxMark benchmark, aside from a GPU's compute performance, relies heavily on rasterisation, which is something AMD lacks in terms of performance. Unfortunately.
 
A bit off topic, but when the RX 480 power issue is sorted, would it be possible for me to run one of these next to an ATI 5770 in a 5,1 Mac Pro (as AMD/ATI are pretty much the same make)?
 

Regardless of whether it's "real" async compute, benchmarks that test this show that Pascal sees increases in performance where Maxwell sees none.
As for running an RX 480 next to a 5770 in a Mac Pro: currently it is not possible. There is some limited driver support in macOS Sierra, but as far as I know it either does not fully work or is unstable.
 
Again, completely false.

http://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/9

AnandTech explains in great detail how Maxwell and Pascal implement asynchronous compute, and the improvements that are in the Pascal version. Preemption is something completely different and has nothing to do with this.
You are absolutely right. Pascal and Maxwell both have async compute; both can execute compute workloads in parallel with graphics workloads, without preemption. Maxwell was just very hard to tune to get a performance increase.
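The benefit being argued over here can be stated as toy scheduling arithmetic (no real GPU involved; the millisecond figures are hypothetical): if an independent compute job can fill ALU time left idle by the graphics pass, the frame time approaches the longer of the two rather than their sum.

```python
def frame_time_ms(graphics_ms, compute_ms, async_compute=False):
    # Serial: the compute job waits for the graphics pass to finish.
    # Async: the compute job fills idle ALU time, so in the ideal case
    # the two overlap completely and only the longer one matters.
    if async_compute:
        return max(graphics_ms, compute_ms)
    return graphics_ms + compute_ms

# Hypothetical 10 ms graphics pass plus a 3 ms compute pass:
print(frame_time_ms(10, 3))                      # serial: 13 ms
print(frame_time_ms(10, 3, async_compute=True))  # ideal overlap: 10 ms
```

Real hardware lands somewhere between the two numbers, which is why Pascal shows gains where Maxwell shows roughly none.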
 

The power is no issue. Yes, the card draws more than spec from the PCIe slot (in most cases); however, this supply seems to be shared across all 4 slots at 300W, with no slot-specific limit/measurement on my 4,1.

My RX480 runs fine on a single 6-pin + PCIe in Windows, with both the old and new driver (I did *not* test the power-saving setting that clocks it down even more).

Considering I stole power from the bay and HDD drives to supply 2 RX270s before just fine, a dual RX480 setup *should* work without any mods if your other PCIe draw is low (like the usual PCIe SSD and a USB 3.0 card).
 

It is still unknown at this moment. The manual says the total cannot exceed 300W; however, 75W per slot is the normal standard. It's hard to tell whether Apple used poor wording (they never say you can install non-Apple-approved PCIe cards in there), or whether a single slot really can deliver up to 300W (with no other card installed).

TBH, it's hard to believe that any of those slots can deliver 4x the normal standard.
 

Yes, 300W on a single slot alone is electrically dangerous; however, the RX480's maximum measured slot consumption is 90-100W, which is not yet a problem for the pins or the PCB traces. A single card seems to work just fine, even without the new driver's hotfix setting.

My best guess for the basis of the limit would be the Apple-approved/official GPUs (the 5770 + 3x Nvidia config is IIRC officially supported) and the RAID adapter, but I doubt either draws 75W+.
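Under the reading that the 300W budget is shared across the four slots, the arithmetic is easy to check. The per-card draws below are rough assumptions taken from the figures quoted in this thread, not measurements:

```python
SHARED_BUDGET_W = 300   # Mac Pro manual: combined limit across all 4 PCIe slots
PCIE_SPEC_SLOT_W = 75   # per-slot limit in the PCIe spec

# Assumed slot draws in watts (RX 480 slot figure from the posts above):
slot_draws_w = {"RX480 #1": 95, "RX480 #2": 95, "PCIe SSD": 10, "USB 3.0 card": 5}

total_w = sum(slot_draws_w.values())
print(total_w, total_w <= SHARED_BUDGET_W)  # → 205 True
# Each RX 480 still exceeds the 75 W per-slot spec figure,
# which is exactly the open question in this thread.
```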
 
Interesting information: it appears that Zen and Vega can be made on TSMC's 16 nm FF+ process, not GloFo/Samsung's 14 nm LPP. The other possibility is IBM's 14 nm FDSOI.
 
@koyoot, if you look at a post further down on the Blackmagic forum, the GTX 1070 has about the same performance as the RX 480 (in one app, so this only matters if you're thinking of video grading):

"Using the Blur settings on the original candle test on the reference GTX 1070

9 Nodes - 24 (we managed to go up to 16 with it at 24 and on the 17th node it dropped to 21)
18 Nodes - 19-20
30 Nodes - 12-13
66 Nodes - 6"

But I think this is in OpenCL mode; I don't think the CUDA side has been fully updated, but I'm not sure, so I may be wrong.
The cards are all new, so I assume there will be some speedup as things get updated.

And a lot of this really comes down to what you (or "I") want/need, not whether one card is better.
If I need CUDA then ATI is a no-go, but if I want a cheap OpenCL card then ATI is worth looking at. It's still early days, plus the RX 480 is a cheaper card than most, which needs to be mentioned.

(And at the moment they're both as much use to me as a toaster for video editing, as there are still no OS X drivers.)
PS: that's a retro joke https://en.wikipedia.org/wiki/Video_Toaster :p
 
Well, it should be similar, because it also has compute performance similar to both the RX 480 and the Titan X.

What I am posting is all about optimising software to fully utilise the compute performance of GPUs, not about which GPU is better than another. Properly optimised software will always show that a 6 TFLOPs GPU is on par with another 6 TFLOPs GPU; these are mathematical algorithms. You should then pick hardware by your wallet, not brand favouritism. But that is the "ideal" world.
 