Ethosik

Contributor
Original poster
I accidentally posted this in the MacBook Pro forums. I guess mods are not around to move messages?

So, I am getting a GTX 1080 FE to replace my GTX 980 SC card (currently in a custom-built Windows computer). Final Cut Pro X runs better on AMD, but will it run ANY better if I upgrade the Mac Pro from the AMD Radeon 5870 (the stock card it came with) to the EVGA GTX 980 SC?

That AMD card is so old, I am hoping the 980 will be better at rendering.
 
980 is 2-10x faster on everything?

3DMark GPU Score AMD Radeon 5870:
4280

3DMark GPU Score EVGA GTX 980:
21330
rgds
 
I think it really depends on what you do in FCPX, which version of FCPX you run, and the OS X version as well (which determines the driver version).

A 980 can finish BruceX in ~28s. (The driver version affects the result A LOT! Make sure you use the correct, most up-to-date version. Otherwise, the 980 may run 100% slower in FCPX!)

http://barefeats.com/cmp12c6c.html
https://forums.macrumors.com/threads/fcpx-amd-vs-nvidia.1956128/page-2#post-22657893

http://barefeats.com/gtx980ti.html

My 7950 can finish this test maybe 10% faster than the 980 in FCPX. (My 2x 7950 setup finishes BruceX in 15s, so a single card should finish in around 25s, since two cards cannot achieve 100% scaling.)

A 5870 comes out about 24-42% slower than the 7950 in a few FCPX tests.

http://barefeats.com/gpu7950b.html
http://barefeats.com/tube05.html

To put everything together, a 5870 may be just 10% slower than a 980 for some particular jobs in some situations, or much, much slower in others.

If you have a Mac Pro with a 5870 on hand now, you could run BruceX and compare your time to other results. Of course, BruceX may not be accurate at all for your workflow, but it can still give you a rough idea. Or, if you can develop a small test that fits your workflow and upload it here, we may be able to tell you our results and let you decide whether the upgrade is worth it, or which card offers better value for you.
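
The arithmetic behind these BruceX estimates can be sketched in a few lines of Python. This is only a rough illustration using the times quoted in this post; the helper names are made up for the example, and it assumes that "N% slower" means the job takes N% longer, which is one possible reading of the Bare Feats numbers.

```python
# BruceX render times quoted in the thread, in seconds.
GTX_980_S = 28.0       # single GTX 980
DUAL_7950_S = 15.0     # two 7950s working together
SINGLE_7950_S = 25.0   # the poster's estimate for one 7950


def scaling_efficiency(single_s, multi_s, cards=2):
    """Fraction of the ideal multi-GPU speedup that was actually achieved."""
    speedup = single_s / multi_s
    return speedup / cards


def slowed_time(base_s, percent_slower):
    """Assumption: 'N% slower' means the job takes N% longer than base_s."""
    return base_s * (1 + percent_slower / 100)


def percent_longer(time_s, reference_s):
    """How much longer time_s is than reference_s, in percent."""
    return (time_s / reference_s - 1) * 100


eff = scaling_efficiency(SINGLE_7950_S, DUAL_7950_S)
print(f"dual 7950 scaling efficiency: {eff:.0%}")

for pct in (24, 42):
    t = slowed_time(SINGLE_7950_S, pct)
    print(f"5870 at {pct}% slower than a 7950: {t:.1f}s, "
          f"{percent_longer(t, GTX_980_S):.0f}% longer than the 980")
```

Under that reading, the low end of the 24-42% range puts the 5870 roughly 11% behind the 980, which is where the "just 10% slower" figure comes from, while the high end is closer to 27%.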
 
BruceX runs in 14 seconds on my single TITAN X GPU with 12GB of video memory. People love to complain about NVIDIA's performance in FCPX, but most of the time they're using a card with 2GB of memory which is simply not enough for 4K video.
 
So I am better off selling my 980? It's only a 10% improvement over the 5870, as you say.
 
I really don't know. If you only deal with small, short videos most of the time, maybe the 5870 actually works better (no need to worry about bugs, OS updates, etc.).

Anyway, it seems that you have both cards on hand, so why not run some real-world tests that fit your workflow and then decide?
 
Right, BruceX is not the most representative FCP test ever (how many people are actually working with 5K video streams?). The 980 will be so much faster in a wide variety of apps, that you'd be better off just trying it yourself and seeing what the impact is.
 
It is really stupid that NVIDIA cards are so horrible at OpenCL that they cannot even beat the 5870 that came out in 2010 (if even that).
 
I really want to. If OS X had native support for Maxwell cards, I would do that straight away. I am not worried about the web driver stuff myself. However, I share the Mac Pro with my wife. If she boots to a black screen while I am out of town, that will be a disaster :p (e.g. if she does an OS X update, or for some reason performs a PRAM reset, etc.)
 
Okay, let's try and break this down one more time. Performance is usually limited by one or more bottlenecks. FCP is very rarely limited by raw GPU compute performance, i.e. OpenCL execution. It is generally limited by things like:

- Disk speed (SSD is faster than mechanical, etc).
- PCIe bus speed (Gen 3 in nMP is faster than Gen 2 in cMP, etc).
- Amount of memory on your graphics card (6GB is better than 2GB, etc).

You cannot just blame bad FCP performance on "horrible OpenCL". Fire up the OpenGL Driver Monitor and take a look at the stats when you run your FCP test or normal workflow. I'd be very surprised if the "GPU Core Utilization" is maxed out at 100% for the entire test. If it is not, then raw GPU compute perf is absolutely not the problem. As we've been discussing to death in other threads, there are plenty of OpenCL benchmarks where Maxwell-based GPUs absolutely demolish everything else.
 
Everything that I have read says that FCP is very heavily reliant on OpenCL. The fact that h9826790 said that it is only a 10% increase (I have seen this elsewhere too) means a lot when going from a 2009/2010 card to the top-of-the-line card from 1.5 generations ago (if you consider the Ti editions a .5 generation).

Either FCP barely uses the GPU, which is false because I see massive benchmark improvements going from the 5870 to the 7950, or NVIDIA is just not that good with the OpenCL that FCP uses. That is a fact. There are some OpenCL tasks that AMD cards are better for. I was just hoping for more than a 10% improvement going from the 2009/2010 card to a card from 1.5 generations ago.

Even .5 generations ago, people have reported only seeing about a 10% improvement (going from the 5870 to the GTX 980 Ti SC). So I am probably looking at less than 10%.

The AMD 7950 is looking to be the better option as it is around a 25% increase in performance. And that card is getting old too!
 
My 12GB TITAN X runs BruceX in 14 seconds, per the other FCP performance thread. That's better than any single GPU result I've seen posted (edit: the 2 D700s in the nMP get a similar score). It also disproves your claim that NVIDIA is terrible at FCP or OpenCL. Here's another hint: if you compare 2 GPUs where one has massively more theoretical horsepower and get similar results, then the bottleneck is not related to raw GPU horsepower. This is performance analysis 101.

From what I've seen, FCP is heavily reliant on sending raw video frames across the PCIe bus to the GPU, doing some work on those frames, and then sending raw video frames back across the PCIe bus. So, if you don't have a fast connection to your GPU, or your GPU doesn't have enough framebuffer memory to hold all those video frames, then your performance is going to be terrible.
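
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the 5120x2700 frame size (what I understand the BruceX project to use), 16 bytes per pixel (RGBA with 32-bit floats, which FCP may or may not use internally), and the approximate theoretical x16 bandwidth per direction for PCIe Gen 2 and Gen 3. Real-world throughput will be lower.

```python
# Approximate theoretical PCIe x16 bandwidth per direction, in bytes/s.
PCIE_X16_BYTES_PER_S = {
    "gen2": 8.0e9,    # roughly 500 MB/s per lane x 16 lanes
    "gen3": 15.75e9,  # roughly 985 MB/s per lane x 16 lanes
}

GIB = 1024 ** 3


def frame_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def max_fps_one_way(frame_size, bus_bytes_per_s):
    """Upper bound on frames per second the bus can carry in one direction."""
    return bus_bytes_per_s / frame_size


def frames_in_vram(vram_gib, frame_size):
    """How many whole uncompressed frames fit in the given video memory."""
    return int(vram_gib * GIB // frame_size)


# Assumed frame format: 5K frame, RGBA, 32-bit float per channel.
fb = frame_bytes(5120, 2700, 16)
print(f"one frame: {fb / 1e6:.0f} MB")

for gen, bw in PCIE_X16_BYTES_PER_S.items():
    print(f"{gen}: at most {max_fps_one_way(fb, bw):.0f} frames/s each way")

for vram in (1, 4, 12):
    print(f"{vram} GB card holds at most {frames_in_vram(vram, fb)} frames")
```

With these assumed numbers, a single frame is a couple of hundred megabytes, so both the bus generation and the amount of video memory put hard ceilings on throughput no matter how fast the GPU cores are.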
 
What do you mean by that? The performance gains going from an AMD 5870 to a 7950 are big. That card is less powerful and has less memory than my 980 SC, but it certainly beats my 980 in Final Cut Pro X. I fail to see what is going on here. Sometimes, AMD cards are better at some OpenCL stuff. This just might be the case.

Granted, the 7950 is getting too old and still very expensive for the Mac edition, so I do not want to get that one :(.
 
What's your disk read/write speed? Internal RAID or external?
What codec is the footage you normally work with?
How much memory do you have?
What version OS X and FCP are you using?

The answers to these questions are just as important to FCP performance as which GPU you use. If you are editing from an external drive hooked up via FW800, or a cheap LaCie Rugged via USB3, it literally does not matter which graphics card you have: you're maxing out at 80 - 100 MB/s and performance will be horrible.
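
As a rough way to check whether the disk is the limit, the sketch below compares a drive's sustained throughput with the data rate of the footage. The bitrates are approximate ProRes 422 target rates as I recall them from Apple's published figures (about 147 Mb/s for 1080p29.97 and about 589 Mb/s for UHD 29.97); treat them, and the function names, as assumptions and substitute the real rate of your own media.

```python
# Approximate ProRes 422 target data rates in megabits per second (assumed).
PRORES_422_MBPS = {
    "1080p29.97": 147,
    "2160p29.97": 589,
}


def mbps_to_MBps(megabits_per_s):
    """Convert megabits per second to megabytes per second."""
    return megabits_per_s / 8


def streams_supported(disk_MBps, stream_mbps):
    """Whole simultaneous streams a disk can feed at its sustained rate."""
    return int(disk_MBps // mbps_to_MBps(stream_mbps))


for disk in (80, 100):
    for fmt, rate in PRORES_422_MBPS.items():
        n = streams_supported(disk, rate)
        print(f"{disk} MB/s disk, ProRes 422 {fmt}: {n} stream(s)")
```

Under these assumed rates an 80 - 100 MB/s drive can feed only a single UHD ProRes 422 stream, so multicam or layered 4K timelines would stall on the disk long before the GPU matters.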
 
Um, BTW. The Titan X blows away even my GTX 980. So I do not think that is a fair comparison to make. You have 3x the amount of video memory that I do! The Titan X still blows away the GTX 1080 in some areas. Not a fair comparison at all.

You guys are saying NVIDIA is just as good as AMD in EVERY aspect of OpenCL? I have seen benchmarks that show differently. NVIDIA does some things better than AMD, and likewise AMD leads in some benchmarks.

Disk read/write is 230MB/s (SSD over SATA2).
Codec: Both ProRes 422 and H.264
Memory: 4GB on my GTX 980, 1GB on my 5870, system memory is 32GB
OS X: El Capitan (latest) and the latest FCP.
When tested with my 980, I used the latest NVIDIA drivers.
 
No, I'm just objecting to your claim that NVIDIA has "horrible OpenCL" and that this is the root cause of bad FCP performance. As we've discussed at length, certain OpenCL apps run better on AMD, while others run better on NVIDIA. I'm using my TITAN X results to indicate that certain NVIDIA cards run FCP very, very well. It may be that you need more video memory on NVIDIA GPUs than you do on AMD GPUs to make them work well in FCP, I don't know.
 
I didn't mean horrible OpenCL in general, I meant horrible at the OpenCL that FCP uses. There is a difference. And maybe the Titan X overcomes that with more memory and higher speeds. I am sure NVIDIA beats AMD in some OpenCL tasks. But it looks like FCP is not one of them :(.
 
I would improve your read/write speeds. Also, there is practically nothing that will help H.264. It seems to be CPU and single-thread dependent. Transcoding it will be slow as molasses. I transcode footage on set, and anytime I get GoPro footage, I cringe. A maxed-out trash can and a maxed-out oMP will yield roughly the same speeds... about realtime.

ProRes is also CPU dependent, and FCP is not that great at multithreading. Resolve is much better at multithreading, but even when transcoding ProRes, I still only use about 60-70% of my CPU power (I have dual 3.46 GHz 6-core CPUs, capable of 24 threads).

My point is you currently have bottlenecks that could be addressed.
 
Last I checked x264 uses OpenCL.

How should I be exporting my videos so that hardware acceleration is used?
 
Just be careful that FCPX isn't crashed by Nvidia's terrible web drivers.

There are reports of Premiere and AE crashing badly, and of cards that used to work no longer working.

"Mature Maxwell drivers" some people said around here.

http://www.tonymacx86.com/threads/n...vers-for-os-x-10-11-6-346-03-15.198033/page-3

Adobe really botched their latest CC update. I had the audio repeating glitch at the end of ALL my exports. The last few words I said were repeated at the end :(
 
Nothing wrong with Adobe from my end on both Mac and Windows on multiple machines. As long as we avoid Nvidia's Mac web drivers. They have been stuck at version 346 for over a year and each new driver for each new macOS update brings new problems.
 
Did you update to CC 2015.3? If not, stay on the version you currently have.

https://forums.adobe.com/thread/2048147

This bug doubles the time my workflow takes. I am mostly on Final Cut Pro now because of it.
 