Interesting idea, an internal PCIe slot just for compute. It gets around the DisplayPort requirement for Thunderbolt. A problem would be which cards are supported and whether they need non-Apple-provided drivers.
It is primarily a design for compute, with a "display out" option.

The Tesla M40 card still has an external-facing edge, but that whole edge is devoted to blowing air out (no output sockets). Not sure how entangled the macOS OpenCL stack is (I imagine the Metal one is) with initializing the graphics stack on a card, but conceptually enabling a "compute only" card should be simpler than a compute + graphics card. If it is not, there is something 'off' with the functional decomposition/decoupling inside of macOS.
But yes, the graphics out option opens up the supported-cards issue and the question of whether that is going to be a healthy ecosystem or not.
I think a better solution (from Apple's perspective) would be to support GPUs over Thunderbolt 3. Then you don't have potentially unwanted space inside the case and don't have to worry about power and cooling of arbitrary cards. The viability of this depends on how much compute tasks like FCP rendering are constrained by PCIe bandwidth. It also doesn't solve the driver problem.
That would help with keeping the system on the desktop in a less obtrusive form. But as you point out, the graphics driver problem isn't solved. In fact, it gets more complex because now they also have to support hot plugging. So it is even more software, and it opens the door to random hacked Windows card #42 being thrown at macOS.
There is an even deeper problem. One reason why the dual GPU solution found a limited audience is that the implementation Apple did was a "copy, then work, then copy back" GPGPU solution. For real "general purpose" (the GP in GPGPU) you need a flatter, more uniform memory access solution. Apple pragmatically stopped at OpenCL 1.2, which stops way short of that. If they want to broaden the use cases, then they need more of an architecture where the local GPU RAM is used more as a cache (or general access store) than the model where "copy, work, and perhaps copy again" dominates.
So the Thunderbolt model, where bandwidth is low, doesn't really broaden the scope of what they were covering. So if fewer software segments pick it up ... are they really in a better position? Gaming would trend up (and a few other application segments), but is that really what they are going after with the Mac Pro?
The reason why some folks are trying to drag compute + graphics onto a super large single card is the latency/bandwidth constraints to either another card and/or main memory. Thunderbolt just cranks those constraints higher.
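To put rough numbers on that, here is a back-of-envelope sketch of the "copy, work, copy back" pattern over an internal x16 slot versus a Thunderbolt 3 link. The link figures are theoretical PCIe 3.0 peaks (8 GT/s per lane, 128b/130b encoding; Thunderbolt 3 tunnels at most a PCIe 3.0 x4 link, and real throughput is lower still). The workload numbers and function names are made up purely for illustration; nothing here is measured on a Mac.

```python
# Back-of-envelope model of a "copy in, work, copy back" GPGPU job.
# Link rates are theoretical PCIe 3.0 peaks, not measurements.

# PCIe 3.0: 8 GT/s per lane with 128b/130b encoding, converted to GB/s.
PCIE3_LANE_GB_PER_S = 8.0 * 128 / 130 / 8

LINKS = {
    "internal PCIe 3.0 x16": 16 * PCIE3_LANE_GB_PER_S,
    # Thunderbolt 3 carries at most a PCIe 3.0 x4 tunnel.
    "Thunderbolt 3 (PCIe 3.0 x4)": 4 * PCIE3_LANE_GB_PER_S,
}


def round_trip_seconds(gb_in, gb_out, compute_s, link_gb_per_s):
    """Total time: copy data to the card, compute, copy results back."""
    return gb_in / link_gb_per_s + compute_s + gb_out / link_gb_per_s


def copy_fraction(gb_in, gb_out, compute_s, link_gb_per_s):
    """Share of the total time spent just moving data over the link."""
    total = round_trip_seconds(gb_in, gb_out, compute_s, link_gb_per_s)
    return (total - compute_s) / total


if __name__ == "__main__":
    # Hypothetical job: 4 GB in, 4 GB out, 1 second of GPU work.
    for name, rate in LINKS.items():
        total = round_trip_seconds(4.0, 4.0, 1.0, rate)
        share = copy_fraction(4.0, 4.0, 1.0, rate)
        print(f"{name}: {total:.2f} s total, {share:.0%} spent copying")
```

The point of the model is only the ratio: with a quarter of the lanes, the copy phases take four times as long, so any job that is not heavily compute-bound spends most of its time on the wire.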
Thunderbolt opens up more potential systems that could buy a card. But I don't think it really helps with the leading root cause of that single, monster-card driver issue that they ran into.
From PikerAlpha's Xeon iMac blog post reply:
Yes, the servers for datacenters and the next Mac Pro may share some parts, but it remains to be seen if Apple is really willing to sell the same hardware to us.
Together with the news of Apple cutting ties with Supermicro, this move would be intriguing.
Plus, though we can't know the validity of his source, the mMP will supposedly be "much more like a PC".
Amazon, Google, Facebook, and Microsoft all have custom boards for their data centers. How many of them sell them? Or create products with them to share parts? Zero. It is not necessary if you have a large-scale data center operation as a major cloud player. Not ordering from Supermicro can simply mean ordering from the same custom shop(s) that one of those other major players orders from, with perhaps a slightly different modification/customization.
This is wishful thinking hooey. Apple isn't going to try to put data center server node boards onto people's desktops. It just isn't going to happen unless someone in Cupertino is smoking major drugs.
I suspect someone has tech porn lust for an LGA 3647 (Socket P) solution for a Mac Pro, as opposed to the far more likely Socket R4, LGA 2066 ( https://en.wikipedia.org/wiki/LGA_2066 ), that the Skylake-W variant of Xeon E5 1xxx v5 will use. Apple is likely going after modestly high core counts at faster base clocks and higher Turbo deltas rather than a maximum core count target (with a substantive trade-off of lower base and turbo clocks). Servicing hundreds to thousands of independent users doing independent tasks over a long-distance internet link is a substantively different workload than a single user doing mostly single-focus tasks from local data sources.
For the vast majority of "pro" users sitting at a single user workstation, Socket P doesn't make sense. For a small, small subset, sometimes, but in general no. The Mac Pro isn't going to be aiming at some sub 1% of the market.