...In software they're only enabling one GPU for compute, and one for graphics, and it would be trivial to change this if so.

This assumption is incorrect. The statement made was that, by default, Mavericks uses one GPU to physically control displays and the other to process memory-intensive computations, taking some of the burden off the CPU.

Adobe Photoshop CC and Adobe Premiere CC have taken advantage of OpenCL to assign specific computational tasks to the GPU, but have not yet made use of both GPUs in the process.

http://community.amd.com/community/amd-blogs/amd/blog/2014/01/30/new-adobe-photoshop-cc-blockbuster-features-scream-on-amd-graphics

The software option that some manufacturers have already taken advantage of is to engage the power of both GPUs in aiding the CPU by processing specific computations through OpenCL. Apple's Final Cut Pro and the Mari 3D paint software have enabled the power of both GPUs with OpenCL.

http://arstechnica.com/apple/2014/01/two-steps-forward-a-review-of-the-2013-mac-pro/4/

"OpenCL is currently used in the realtime rendering. We need to render the screen a few times each frame to find out what texture tiles are visible, we then need to process that image to work out the set of unique tiles. The images we pre-render can be very large (up to 4k x 4k) so that is a huge amount of pixels to process in real time.

We use OpenCL on OS X to perform this processing. This processing has the option of working on the second GPU. This is set in the OpenCL preferences tab. We are working on offloading more of the processing onto OpenCL and the second GPU." - Mari product manager, Jack Greasley
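
For a sense of what "working on the second GPU" means at the API level, here is a minimal sketch using the standard OpenCL C API. The policy of treating the second enumerated device as the compute GPU is an assumption for illustration, not anything Apple or The Foundry documents:

    #include <stdio.h>
    #include <OpenCL/opencl.h>

    int main(void) {
        // Grab the (single) Apple OpenCL platform.
        cl_platform_id platform;
        clGetPlatformIDs(1, &platform, NULL);

        // Enumerate up to four GPU devices.
        cl_device_id gpus[4];
        cl_uint num_gpus = 0;
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 4, gpus, &num_gpus);

        for (cl_uint i = 0; i < num_gpus; i++) {
            char name[128];
            clGetDeviceInfo(gpus[i], CL_DEVICE_NAME, sizeof(name), name, NULL);
            printf("GPU %u: %s\n", i, name);
        }

        // Assumed policy: with two GPUs present, build the compute context
        // on the second device, leaving the first free to drive displays.
        if (num_gpus > 1) {
            cl_int err;
            cl_context ctx = clCreateContext(NULL, 1, &gpus[1], NULL, NULL, &err);
            // ... compile kernels and enqueue work against ctx here ...
            clReleaseContext(ctx);
        }
        return 0;
    }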
 
nMP GPU routing

This assumption is incorrect. The statement made was that, by default, Mavericks uses one GPU to physically control displays and the other to process memory-intensive computations, taking some of the burden off the CPU.


I think you misunderstand what I'm saying. I now think that both GPUs are plumbed to the TB controllers and the displays, and obviously they've not enabled this in software. This is something Apple already does in every MBP with HD and discrete graphics. In fact they can seamlessly switch between the GPUs, balancing either for performance or for power.

Here they could do the same, except balancing between compute/graphics or graphics/graphics. They've decided to hardcode it to compute/graphics performance. I think in the future they could add a feature where it can go to graphics/graphics performance too, even automatically, as it does for the HD/discrete feature.

Imagine this: a new checkbox in the Displays tab of System Preferences, which says "Allow compute GPU to be used for graphics" or some such. Cocoa can detect when an app is making heavy use of OpenCL. When it is, a GPU is dedicated to compute; when it's not, that GPU drives a display. I have a suspicion now that this is a feature we'll be seeing in a year or so. Given the block diagram, it would have been insane for the hardware engineers not to plumb the lines for the DP connection from the "compute" GPU.

The final "graphics" step would be a form of Crossfire, where graphics work is shared between GPU's. I feel confident Apple will never enable this feature. It's really difficult to do, probably Vendor specific, and after all that it's highly problematic and it just enables higher metrics (FPS) and worse graphics (screen tears and other artifacts) for games, which is not a market Apple especially targets.
 
Indeed, many reviews state that Crossfire tends to be a hit-and-miss experience. However, it's still better to have it than not, IMO (I find it kind of ironic that it's available on Windows but not on OS X, for a Mac Pro).

I was wondering if Nvidia's SLI also suffers from tearing, etc.
 
I don't know if this is helpful to the discussion, but here's a hardware block diagram of the nMP that was posted elsewhere here some time ago, and it clearly shows all display ports and TB controllers connected to GPU B. It also shows a direct connection between the two GPUs (the label is hard to read, though). You might wonder if it's legit, but I have no reason to believe it's not. There's simply too much detail in here, if you can read the fine print...

http://imgur.com/ItIqxDY
 
I don't know if this is helpful to the discussion, but here's a hardware block diagram of the nMP that was posted elsewhere here some time ago, and it clearly shows all display ports and TB controllers connected to GPU B. It also shows a direct connection between the two GPUs (the label is hard to read, though). You might wonder if it's legit, but I have no reason to believe it's not. There's simply too much detail in here, if you can read the fine print...

Interesting ... this would take a lot of work even to create. Whoever did this had some motivation; it does seem unlikely to be a fake. I wish it were higher resolution.

[Image: nMP hardware block diagram, http://imgur.com/ItIqxDY]


It's hard to read, but let's see if we can verify it in any way.


  • (Verified) It correctly shows the "B" (or 2) GPU as the Render GPU.
  • (Unknown) It shows six DP outputs going from the Render GPU. Assume for the moment that this was the only GPU: why would you do that? You can daisy-chain two monitors over one TB port, so it doesn't seem like you would need all six if you ran two displays to two TB ports not on the same controller.
  • (Probably wrong) It labels the HDMI connection as being tied to Thunderbolt Connector B. However, Apple has said it's tied to the "Zero" TB bus, IIRC (I can't seem to find the link now).
  • (Unknown) It shows a "DVO" connection between the GPUs (Digital Video Out?). This is weird; I have no idea what that means. It would seem to indicate that the compute GPU sends video data to the Render GPU.

Not sure what to make of it, but it certainly looks like a valid block diagram. And it opens more questions than it answers; for all we know, the Compute GPU could still send video via the DVO output to the Render GPU. Perhaps this indicates Apple is planning some kind of Crossfire-like solution. All speculation ...
 
Perhaps this indicates Apple is planning some kind of Crossfire-like solution. All speculation ...

Well, Crossfire is already available under Windows, so drivers and a software implementation are what's needed.
 
Well, Crossfire is already available under Windows, so drivers and a software implementation are what's needed.

As I talked about above, I think the problem is philosophical rather than technological. Apple can get the Crossfire drivers for Windows from AMD, and the engineers too, but the question is: do they want to? As I said, even at best Crossfire is problematic, and it would consume engineering resources for a feature that isn't what they're designing the computer for, namely gaming.

And again, Apple takes a purist approach. When there's an engineering solution that can't be made perfect, they'll drop it and push an alternate approach. In this case I think it's highly likely they'll never implement Crossfire, instead opting to push software developers to use OpenCL in addition to OpenGL. That's the purist approach, and frankly the one that will always give the best results. At any rate it's certainly the approach they're taking already, and it's working, as usual for Apple.
 
As I talked about above, I think the problem is philosophical rather than technological. Apple can get the Crossfire drivers for Windows from AMD, and the engineers too, but the question is: do they want to? As I said, even at best Crossfire is problematic, and it would consume engineering resources for a feature that isn't what they're designing the computer for, namely gaming.

And again, Apple takes a purist approach. When there's an engineering solution that can't be made perfect, they'll drop it and push an alternate approach. In this case I think it's highly likely they'll never implement Crossfire, instead opting to push software developers to use OpenCL in addition to OpenGL. That's the purist approach, and frankly the one that will always give the best results. At any rate it's certainly the approach they're taking already, and it's working, as usual for Apple.


I've spoken to Feral Interactive before, and they mentioned they'd like to do just that: one GPU for OpenGL, and the other for CL in games.

Aspyr recently updated Civilization V to take advantage of OpenCL for better compute, and it certainly helped performance at 4K.

It takes time and effort, though. In the end, drivers still play a massive part in performance in both cases.
 
I've spoken to Feral Interactive before, and they mentioned they'd like to do just that: one GPU for OpenGL, and the other for CL in games.

Seems like at least some of the physics could go on the second GPU. Of course the CPU is more than adequate for much of that.

Aspyr recently updated Civilization V to take advantage of OpenCL for better compute, and it certainly helped performance at 4K.

I didn't notice huge jumps in performance, but I didn't do close testing and don't have a 4K monitor (do you have a link to that?). It's hard to test at any rate. But I did run it while watching the performance tool on the compute GPU, and it was hard to see any activity. Pretty low-level; all Civ does is texture unpacking. Presumably it's just taking the textures from disk, decompressing them and whatnot. It probably happens infrequently.

It takes time and effort, though. In the end, drivers still play a massive part in performance in both cases.

Agreed.
 
Seems like at least some of the physics could go on the second GPU. Of course the CPU is more than adequate for much of that.
Agreed.

Here ya go. I see the wording was changed; before, it stated that both GPUs are being used. Interesting.

http://www.aspyr.com/news_articles/civilization-v-gets-open-cl-update-for-mac-pro


Also, Battlefield uses OpenCL for physics, and it's the same for Lara Croft's TressFX hair. So if developers can take the time to get it offloaded to a different GPU that specialises in OpenCL, there could be some nice improvements.
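
To make the offloading idea concrete, here is a toy sketch of the kind of embarrassingly parallel physics step (Euler integration of a particle system) that suits OpenCL. The kernel and all names in it are illustrative, not taken from Battlefield or TressFX:

    // One work-item per particle: integrate velocity, then position.
    __kernel void integrate_particles(__global float4 *pos,
                                      __global float4 *vel,
                                      const float dt)
    {
        size_t i = get_global_id(0);
        float4 gravity = (float4)(0.0f, -9.81f, 0.0f, 0.0f);
        vel[i] += gravity * dt;   // accumulate acceleration
        pos[i] += vel[i] * dt;    // advance position
    }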

It takes time and money, though.
We'd be better off at the moment with better OpenGL driver support, especially now that it's been revealed that driver overhead can be dramatically reduced in OpenGL:

https://developer.nvidia.com/content/how-modern-opengl-can-radically-reduce-driver-overhead-0

http://www.geeks3d.com/20140321/opengl-approaching-zero-driver-overhead/
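
For a flavour of what those articles advocate, the core trick is replacing many per-object draw calls with one big call. A minimal sketch, assuming the per-instance transforms already live in a VBO configured with glVertexAttribDivisor (glDrawArraysInstanced is core since GL 3.1 and available on OS X):

    #include <OpenGL/gl3.h>

    // One instanced draw replaces N per-object draw calls, cutting the
    // per-call driver overhead the linked articles are talking about.
    void draw_batched(GLuint vao, GLsizei verts_per_mesh, GLsizei instances)
    {
        glBindVertexArray(vao);
        glDrawArraysInstanced(GL_TRIANGLES, 0, verts_per_mesh, instances);
    }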

AMD needs to shape up its driver support. Games aside, all of this, plus improved OpenCL potential on another GPU, could dramatically help the professional industry.
 
Another possibility to consider is that the GPUs are optimized for their respective tasks. GPU compute has very different requirements than GPU render. Sometimes vendors optimize for one or the other; here the Tahiti architecture was designed to do well at both compute and render. In particular, different implementations are necessary for floating-point compute, double precision, etc.
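
As an aside, an app can at least query some of those compute-side capabilities per device. A small sketch using standard OpenCL calls (the device handle is assumed to come from clGetDeviceIDs):

    #include <OpenCL/opencl.h>

    // Returns nonzero if the given GPU advertises double-precision
    // support, one of the capabilities that differs between render-
    // and compute-oriented parts.
    int has_fp64(cl_device_id device)
    {
        cl_device_fp_config fp = 0;
        clGetDeviceInfo(device, CL_DEVICE_DOUBLE_FP_CONFIG,
                        sizeof(fp), &fp, NULL);
        return fp != 0;
    }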

These are labeled as "Dual D700s", but of course we don't know what that means, as there's no public information about these chips (it's a pure marketing name). We already know these are customized FirePro chips, but not how much they are customized.

At any rate, Crossfire in Windows works, so we at least know that it's possible. However, I'm now suspecting that Apple will simply have the attitude that if you want Crossfire gaming then Boot Camp into Windows, but on the Mac side it's firmly in the Compute/Render camp.
 
Here ya go. I see the wording was changed; before, it stated that both GPUs are being used. Interesting.

I had read that before but not in detail. Let's see ...

Mac gamers will now be able to fully take advantage of OpenCL framework, which means better visuals and improved performance for the critically acclaimed strategy game.

Better visuals, why? Maybe now with OpenCL they can do more physics or something? Or maybe they just added things across the board. OK, I have several Mac laptops, an old Mac Pro and Windows machines. I'll do direct comparisons to see.

Additionally, the game should run faster (meaning less waiting in between turns in late-game situations)

This is probably the texture unpacking.

and fewer crashes.

This means they just fixed some bugs.
 
I had read that before but not in detail. Let's see ...

Better visuals, why? Maybe now with OpenCL they can do more physics or something? Or maybe they just added things across the board. OK, I have several Mac laptops, an old Mac Pro and Windows machines. I'll do direct comparisons to see.

This is probably the texture unpacking.

This means they just fixed some bugs.

Then I see: the less waiting between turns is because the AI's moves are processed faster.

Visuals-wise, offloading some AI (if possible with OpenCL) means that the rendering GPU can get more graphical work done.

Do post a comparison, please; it's always interesting to see what the actual improvements are.
 
Then I see: the less waiting between turns is because the AI's moves are processed faster.

Visuals-wise, offloading some AI (if possible with OpenCL) means that the rendering GPU can get more graphical work done.


I work in AI, but am not sure what the gaming guys use. For GPU compute, the main AI techniques you can use are neural nets and image processing (visual recognition). Perhaps HMMs (Hidden Markov Models) can be implemented in OpenCL as well.

My impression is that game AI is mostly custom algorithms. With Civ I'd imagine chained decision-making algorithms. In this case I don't see a case for running it on the GPU.

That doesn't disagree with your observation that it runs faster; that could be due to other optimizations. Don't know for sure.
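
To illustrate the neural-net point: the workhorse of such a net is a dense matrix-vector step, which maps naturally onto one work-item per output neuron, unlike branchy, chained decision logic. A toy OpenCL sketch, with all names made up for illustration:

    __kernel void dense_layer(__global const float *weights, // rows x cols, row-major
                              __global const float *input,   // cols entries
                              __global float *output,        // rows entries
                              const int cols)
    {
        int row = get_global_id(0);
        float sum = 0.0f;
        for (int c = 0; c < cols; c++)
            sum += weights[row * cols + c] * input[c];
        output[row] = tanh(sum);  // squashing activation
    }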
 
I work in AI, but am not sure what the gaming guys use. For GPU compute, the main AI techniques you can use are neural nets and image processing (visual recognition). Perhaps HMMs (Hidden Markov Models) can be implemented in OpenCL as well.

My impression is that game AI is mostly custom algorithms. With Civ I'd imagine chained decision-making algorithms. In this case I don't see a case for running it on the GPU.

That doesn't disagree with your observation that it runs faster; that could be due to other optimizations. Don't know for sure.

Quite true. They specifically changed their statement that the second GPU was being used. So they probably got called out on this very topic, and reworded it all.

Once I get a 4K monitor I'll certainly run Civ 5 to see if there are any real differences. Although I'm usually too optimistic about these things.
 
I suspect this is what Apple is thinking. They take a purist approach, so here they might be planning on never releasing the capability of splitting displays across GPUs. If they did, I'd expect to see a "Use GPU for compute" option or something like that in the Displays pane of System Preferences.

I hinted at this above, but there have been software hooks since 10.6 for pinning output from one card to a card hooked to a different display. You could use those APIs to either do something like CrossFire, or do rendering from the second "compute only" GPU.

OpenGL and OpenCL.
 
I hinted at this above, but there have been software hooks since 10.6 for pinning output from one card to a card hooked to a different display. You could use those APIs to either do something like CrossFire, or do rendering from the second "compute only" GPU.

OpenGL and OpenCL.


What do you mean by "pinning output from one card to a card hooked to a different display"? I'd guess you're talking about the IOSurface API? If not, which one do you mean?

If you're talking about IOSurface, that's rendering on the "offline" GPU (the one not connected to the display you're using). Naively one might think that a developer could do their own Crossfire implementation using this, but the answer is no. The problem is data I/O: transferring frames from the offline GPU to the online one would swamp any improvements you're going for. Apple has many cautions to this effect.

Behind the scenes, Cocoa manages this for the developer; for example, when the user moves your app from one screen to another on a different GPU, IOSurface moves you over to the other GPU seamlessly. But you can't use that yourself (without faster GPU-to-GPU DMA transfer).
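
On the "offline" GPU point, one opt-in at context-creation time is the kCGLPFAAllowOfflineRenderers attribute, which is a real CGL flag; the rest of this sketch is a bare-bones skeleton with no error handling:

    #include <OpenGL/OpenGL.h>

    // Create a GL context that is allowed to see "offline" renderers,
    // i.e. a GPU with no display attached.
    CGLContextObj create_offline_capable_context(void)
    {
        CGLPixelFormatAttribute attrs[] = {
            kCGLPFAAccelerated,
            kCGLPFAAllowOfflineRenderers, // include GPUs not driving a display
            (CGLPixelFormatAttribute)0
        };
        CGLPixelFormatObj pix = NULL;
        GLint npix = 0;
        CGLChoosePixelFormat(attrs, &pix, &npix);

        CGLContextObj ctx = NULL;
        CGLCreateContext(pix, NULL, &ctx);
        CGLDestroyPixelFormat(pix);
        return ctx; // caller then picks a virtual screen per renderer
    }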
 
Interesting ... this would take a lot of work even to create. ... It's hard to read, but let's see if we can verify it in any way. ... Not sure what to make of it, but it certainly looks like a valid block diagram. And it opens more questions than it answers ...

I think the diagram fully supports what we already knew or suspected, including:

- One GPU, the render GPU, drives all the displays, while the other is dedicated to compute
- We know current AMD GPUs support up to six displays, or three 4K displays using MST over DP 1.2 (which is consistent with this diagram and explains why the nMP supports six displays plus HDMI, or three 4K displays)
- Each Falcon Ridge dual-port TB controller has a pair of DP inputs and a single x4 PCIe input (one display per port, or one 4K display using MST per controller)
- The diagram shows a Crossfire-like connection between the GPUs, which has been confirmed to be working in Windows (DVO could stand for anything)
- The HDMI port shares display bandwidth with one of the TB buses (Controller B), which could have been arbitrarily called bus 0, with its TB ports labelled 5 and 6 (if I'm not mistaken)... So nothing about it is orderly.

There are really no outstanding questions about the hardware architecture that I can see. If there are any questions, they have to do with what OS X and developers can do with these resources in software, which seems to be the path you're on.
 
I think the diagram fully supports what we already knew or suspected, including: ...

Well, you more or less restated what I said and just said the questions aren't questions. I still maintain that, one, we don't know if there is a custom bus between the two GPUs, because it is known that with PCIe 3.0 AMD uses PCI bus mastering to transfer the data. So again, we don't know if that is true or not.

Second - and this is the key question - it's not answered why only one GPU has TWO DP ports routed to the TB controllers. I can't think of any reason for that, and pins and traces aren't put down for no reason. I briefly wondered if it might be due to 4K, but as you say DP 1.2 supports 4K, and the TB controller they use supports DP 1.2.

The other points are minor, but this last one is the key as to whether we'll see an OS X Crossfire-like solution, or at least the choice of whether you will ever be able to do rendering on the compute GPU.

So while this is still an open question and hasn't been definitively proved, for myself I've given up on it and am continuing to research ways to do pure compute on that second GPU. If I were a more serious gamer I'd buy a PC and do it right. Chasing down that path on the Mac is a mistake in my opinion, but that doesn't keep it from being an OK gaming platform, and one good enough for me.
 