I think I remember, when I posted the slides from Intel where they were saying eGPUs are an official thing with TB3, that there was something about them being hot-pluggable.

There is a difference between being optionally present and being mandatory for TB v3 certification. There is little to indicate that this is a mandatory TB v3 requirement for all systems.

" ... the biggest hold-up has always been handling what to do about GPU hot-plugging and the so-called “surprise removal” scenario. Intel tells us that they have since solved that problem, and are now able to move forward with external graphics. The company is initially partnering with AMD on this endeavor – though nothing excludes NVIDIA in the long-run – ... "

Only initially working with one vendor suggests this is an optional feature. This subset of the certification is likely limited to specific OS and graphics driver combinations. It is also a bit contradictory in that the 'they' in "they have since solved" consists solely of Intel. If you need partners to solve the problem, then "they" isn't just Intel.


I don't think this is a problem anymore when Intel themselves say to put a GPU in an external PCIe enclosure.

It never was solely Intel's call to make. If Windows or OS X doesn't fully support it, then there was no single-provider solution for Intel to offer. There still isn't. Intel is talking about it because the other major players in the solution are on board and the group (Intel and the others) are relatively confident in making it work across a holistic span of operations... (not just plug-and-pray coupled with "don't touch it after it appears to be working").

There has always been a bit of overselling of just how little additional OS and driver support Thunderbolt requires. It isn't so much 'new' capabilities as existing ones that most OS and driver implementors could (and did) optionally ignore with little blowback (e.g., hot-plug PCIe protocols).
 
I'm sorry, but that shows me that you don't even understand what AMD has shown, and it invalidates your point of view.
....
But I am a fan of GPU engineering. And the more I dig into the GCN architecture, the more I am amazed by the capabilities it has.

Not sure if you know what you are looking at. Direct-connected external GPUs instantiated into a single OS instance are fundamentally different from paravirtualized GPUs off on some remote grid node. Those are two different problem-solution pairs. One is about local connectivity in a single OS instance. The other is about multiple remote OS instances sharing the same hardware.
 
Now you're being silly playing typo cop... I bet you understand it was virtual...

Virtual isn't the point. What you are virtualizing, and at what level of abstraction that virtualization occurs, is.

This "could" mean redirecting the work to the other GPU presently in the machine. When the eGPU is connected than the work would be send to it.

APIs like Metal/Vulkan/etc. aren't abstracted away from the hardware. What you have are down-to-the-hardware instructions. The instructions sent are going to be specific to that hardware. If it were different hardware, you'd have an incrementally different set of instructions.


What you are proposing is putting another abstraction layer back in: hey, "down to the metal" API, talk to this non-hardware driver. Lots of folks are going to tons of effort to take that out at this point in time. So if the external GPU has 8 GB of VRAM and you map this onto an internal GPU with 2 GB of VRAM, how is that performance going to work out? Similarly, an internal iGPU has relatively uniform shared memory access; what happens when the external GPU has its own VRAM? The timing is going to get thrown off, and for multithreaded apps with locking and synchronization there are all kinds of hazards that may pop up, because the code has implicit expectations that the known hardware can do what it is supposed to do.

The hand-waving workaround is to present only the lowest common denominator to the apps above. There may not be a common denominator once you factor in GPU architecture geometry, feature sets, and RAM storage locality.

In short, if you look at the problem from high up on the graphics stack (e.g., from the OpenGL layer), it is just a simple virtualization problem. If you look at the problem from down inside the kernel, it is a substantially different story.
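
To make the "no common denominator" point concrete, here is a minimal sketch (assuming a Vulkan loader and headers are available) of how each physical GPU reports its own limits and memory heaps. An iGPU with shared memory and an eGPU with its own VRAM will answer these queries very differently, which is exactly what any virtualization layer would have to paper over:

```cpp
// Minimal sketch: enumerate the Vulkan physical devices and print a couple of
// the properties that differ between an iGPU and an external dGPU (name,
// a representative limit, and the memory heap sizes).
#include <vulkan/vulkan.h>
#include <cstdio>
#include <vector>

int main() {
    VkApplicationInfo app{VK_STRUCTURE_TYPE_APPLICATION_INFO};
    app.apiVersion = VK_API_VERSION_1_0;
    VkInstanceCreateInfo ci{VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO};
    ci.pApplicationInfo = &app;

    VkInstance instance;
    if (vkCreateInstance(&ci, nullptr, &instance) != VK_SUCCESS) return 1;

    uint32_t count = 0;
    vkEnumeratePhysicalDevices(instance, &count, nullptr);
    std::vector<VkPhysicalDevice> gpus(count);
    vkEnumeratePhysicalDevices(instance, &count, gpus.data());

    for (VkPhysicalDevice gpu : gpus) {
        VkPhysicalDeviceProperties props;
        VkPhysicalDeviceMemoryProperties mem;
        vkGetPhysicalDeviceProperties(gpu, &props);
        vkGetPhysicalDeviceMemoryProperties(gpu, &mem);

        std::printf("%s: maxImageDimension2D=%u\n",
                    props.deviceName, props.limits.maxImageDimension2D);
        for (uint32_t i = 0; i < mem.memoryHeapCount; ++i)
            std::printf("  heap %u: %llu MB%s\n", i,
                        (unsigned long long)(mem.memoryHeaps[i].size >> 20),
                        (mem.memoryHeaps[i].flags & VK_MEMORY_HEAP_DEVICE_LOCAL_BIT)
                            ? " (device local)" : "");
    }
    vkDestroyInstance(instance, nullptr);
    return 0;
}
```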
 
Now you're being silly playing typo cop... I bet you understand it was virtual...

The role of this virtual driver is to manage the plugging and unplugging of the eGPU. It is the one that, when it detects the loss of connection, will do what is necessary to keep the system going. This "could" mean redirecting the work to the other GPU presently in the machine. When the eGPU is connected, the work would be sent to it.
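
Conceptually, that virtual driver amounts to a routing shim that submits work to whichever GPU is present. Here is a purely hypothetical sketch; the Gpu, CommandBuffer, and VirtualGpu names are illustrative only and do not correspond to any real driver API:

```cpp
// Hypothetical sketch of the "virtual driver" idea: a shim that submits work to
// the eGPU while it is attached and falls back to the internal GPU when the
// link drops. Gpu, CommandBuffer, and VirtualGpu are illustrative, not a real API.
#include <memory>
#include <stdexcept>

struct CommandBuffer { /* encoded GPU commands */ };

struct Gpu {
    bool connected = true;
    void submit(const CommandBuffer&) {
        if (!connected) throw std::runtime_error("device lost");
        // ... hand the work to the real hardware driver ...
    }
};

class VirtualGpu {
public:
    VirtualGpu(std::shared_ptr<Gpu> internal, std::shared_ptr<Gpu> external)
        : internal_(std::move(internal)), external_(std::move(external)) {}

    // Route work to the eGPU if it is still attached, otherwise fall back.
    void submit(const CommandBuffer& cb) {
        if (external_ && external_->connected) {
            try { external_->submit(cb); return; }
            catch (const std::runtime_error&) { external_.reset(); } // surprise removal
        }
        internal_->submit(cb); // NOTE: any state that lived only in eGPU VRAM is gone
    }

private:
    std::shared_ptr<Gpu> internal_;
    std::shared_ptr<Gpu> external_;
};

int main() {
    VirtualGpu vgpu(std::make_shared<Gpu>(), std::make_shared<Gpu>());
    vgpu.submit(CommandBuffer{}); // goes to the eGPU while it is present
}
```

The comment on the fallback line is where the disagreement in this thread lives: redirecting future submissions is the easy part, while anything that lived only in the eGPU's VRAM is simply gone at that point.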

You don't really understand the problem. The problem is that GPUs store data. It's like having hot swappable RAM. If you pull a stick of RAM while your machine is running, in theory any app that stored data on that chip is going to suddenly have data missing and will crash. Same is true of a hard drive. If an app has very important data sitting on that hard drive when it suddenly gets disconnected the app will crash.

What you're saying is just like saying "I can't disconnect a hard drive while an app is reading/writing to it, but if I talk to the hard drive like it's a monitor then I can!" It's nonsense.

" ... the biggest hold-up has always been handling what to do about GPU hot-plugging and the so-called “surprise removal” scenario. Intel tells us that they have since solved that problem, and are now able to move forward with external graphics. The company is initially partnering with AMD on this endeavor – though nothing excludes NVIDIA in the long-run – ... "

This is the big reason why Apple hasn't supported eGPUs under Thunderbolt 2.
 
You don't really understand the problem. The problem is that GPUs store data. It's like having hot swappable RAM. If you pull a stick of RAM while your machine is running, in theory any app that stored data on that chip is going to suddenly have data missing and will crash. Same is true of a hard drive. If an app has very important data sitting on that hard drive when it suddenly gets disconnected the app will crash.

What you're saying is just like saying "I can't disconnect a hard drive while an app is reading/writing to it, but if I talk to the hard drive like it's a monitor then I can!" It's nonsense.



This is the big reason why Apple hasn't supported eGPUs under Thunderbolt 2.

That's just a matter of how and what you have to do prior to disconnecting it. In your example, the USB drive or HDD, you have to eject it before removing it if you don't want any problems. The same thing could apply to the eGPU. No one is advocating an arbitrary pull of the plug!

And an eGPU doesn't have to be managed 100% like an internal GPU, just like external drives aren't managed 100% like internal ones. On Windows, for example, I can't mix and match an external USB drive and an internal drive in a RAID. But both can be accessed transparently by the same filesystem.

What I meant by my comment "treat it as an external monitor" is that the eGPU becomes its own appliance/independent device.
 
That's just a matter of how and what you have to do prior to disconnecting it. In your example, the USB drive or HDD, you have to eject it before removing it if you don't want any problems. The same thing could apply to the eGPU. No one is advocating an arbitrary pull of the plug!

Every app that draws to the screen stores data on the GPU, whether or not it is doing so on purpose. You're talking about shutting down the machine first, which doesn't work so great on a laptop.

And if the plug does come loose or gets disconnected, that means the whole machine crashes, with any unsaved data lost.
 
Every app that draws to the screen stores data on the GPU, whether or not it is doing so on purpose. You're talking about shutting down the machine first, which doesn't work so great on a laptop.

And if the plug does come loose or gets disconnected, that means the whole machine crashes, with any unsaved data lost.

No, I'm saying the user uses an eject function which tells the applications and the OS that the eGPU is about to be disconnected. I never said that you could yank the cable... Same as with an external drive.
 
No, I'm saying the user uses an eject function which tells the applications and the OS that the eGPU is about to be disconnected. I never said that you could yank the cable... Same as with an external drive.

That doesn't work without re-writing every single application to understand that, and even then you're copying data that may not be compatible with the GPU you're switching to. If an application is using DirectX 12, and it has to move to a DirectX 11 GPU, what does it do, just explode?

Every single application that uses the GPU would be corrupted. Your eject function is: shut down the computer, unplug the GPU, and then turn the computer back on. That's the best it gets, because you're never going to be in a state where the GPU is not busy and ready to be ejected. What you're saying is basically "I want to come up with a way to safely eject my boot disk." Not going to happen without a shutdown. The Finder, the Window Server, and the login screen all use the GPU. That means to be in a state where you can eject a GPU, those things have to be shut down, and shutting down the window server basically means killing every application on the machine.

Also, nowhere in OpenGL or DirectX is there this concept of ejectable GPUs. You're talking about rewriting how the graphics APIs work and then rewriting every application that uses graphics. When you use OpenGL or DirectX, it's all based around the GPU never disconnecting.

What Intel and AMD did was way easier: work together and make sure the drivers can copy data back and forth. That's way easier than trying to redefine every application that uses the GPU. But it requires that Intel and other vendors work together instead of hacking something together on top of an existing implementation.
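
Roughly, "the drivers copy data back and forth" means a driver-level migration step along these lines. This is a hypothetical outline; the GpuDevice type and function names are illustrative and are not Intel's or AMD's actual implementation:

```cpp
// Hypothetical outline of a driver-level migration when an eGPU is about to
// detach: quiesce, read the resources back into system RAM, recreate them on
// the surviving GPU, then repoint the contexts. Names are illustrative only.
#include <cstdint>
#include <unordered_map>
#include <vector>

using ResourceId = std::uint64_t;

struct GpuDevice {
    // Stand-in for VRAM: resource id -> raw bytes.
    std::unordered_map<ResourceId, std::vector<std::uint8_t>> vram;
    void wait_idle() { /* drain all in-flight command buffers */ }
    void upload(ResourceId id, std::vector<std::uint8_t> bytes) { vram[id] = std::move(bytes); }
};

// Move every resource that lives on `from` over to `to` before `from` goes away.
void migrate_resources(GpuDevice& from, GpuDevice& to) {
    from.wait_idle();                          // 1. nothing may still be executing
    for (const auto& [id, bytes] : from.vram)  // 2. copy each allocation through host RAM
        to.upload(id, bytes);
    // 3. the driver would then retarget contexts/swapchains at `to`; apps keep
    //    running only if `to` supports an equivalent feature set and has room.
}

int main() {
    GpuDevice egpu, igpu;
    egpu.upload(1, {0xDE, 0xAD});   // pretend a texture lives on the eGPU
    migrate_resources(egpu, igpu);  // the "eject": move it to the internal GPU
}
```

The hard part is step 3: the surviving GPU has to offer a compatible feature set and enough memory to hold what came off the eGPU, which is why this ends up being coordinated vendor driver work rather than something an application can bolt on.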

The other, more basic reason this is crazy is that Apple will never actually condone a setup where one accidental cable disconnect brings down the whole system. Even if there is an "eject process", the accidental-disconnect problem means it would never happen. Imagine if you wiggle your machine just the wrong way and the whole thing just crashes.
 
The other, more basic reason this is crazy is that Apple will never actually condone a setup where one accidental cable disconnect brings down the whole system. Even if there is an "eject process", the accidental-disconnect problem means it would never happen. Imagine if you wiggle your machine just the wrong way and the whole thing just crashes.

I'm glad the interested companies are working together on a hopefully elegant solution.

On the other hand, I bet a LOT of people would have been perfectly happy for the last couple of years if we simply had optional screws on both ends of the connectors. A wiggle wouldn't result in an accidental unplug, and anything that broke the connection at that point is not a valid use case anyway. As a side benefit, those pros who want to prevent their critical external storage from accidentally unplugging would benefit too.

Sonnet even has a Thunderbolt screw adapter, but of course it only works on custom ports with the threaded hole:

[Image: Sonnet ThunderLok Thunderbolt connector screw]
 
That doesn't work without re-writing every single application to understand that, and even then you're copying data that may not be compatible with the GPU you're switching to. If an application is using DirectX 12, and it has to move to a DirectX 11 GPU, what does it do, just explode?

Every single application that uses the GPU would be corrupted. Your eject function is: shut down the computer, unplug the GPU, and then turn the computer back on. That's the best it gets, because you're never going to be in a state where the GPU is not busy and ready to be ejected. What you're saying is basically "I want to come up with a way to safely eject my boot disk." Not going to happen without a shutdown. The Finder, the Window Server, and the login screen all use the GPU. That means to be in a state where you can eject a GPU, those things have to be shut down, and shutting down the window server basically means killing every application on the machine.

Also, nowhere in OpenGL or DirectX is there this concept of ejectable GPUs. You're talking about rewriting how the graphics APIs work and then rewriting every application that uses graphics. When you use OpenGL or DirectX, it's all based around the GPU never disconnecting.

What Intel and AMD did was way easier: work together and make sure the drivers can copy data back and forth. That's way easier than trying to redefine every application that uses the GPU. But it requires that Intel and other vendors work together instead of hacking something together on top of an existing implementation.

The other, more basic reason this is crazy is that Apple will never actually condone a setup where one accidental cable disconnect brings down the whole system. Even if there is an "eject process", the accidental-disconnect problem means it would never happen. Imagine if you wiggle your machine just the wrong way and the whole thing just crashes.

You're overthinking it. You're saying that an eGPU has to work and be used the same way an internal one is. I'm saying it doesn't. I don't need to shut down my computer to remove my external disk array; I just have to use the eject feature. I don't need to eject my boot device since it's inside the machine. And I can't RAID both of them because the OS doesn't permit it, and it would be the same with the eGPU. Your application would be able to use it when it is present, but independently of your internal GPU.

As for not having application or DX and OGL support... well, eGPUs aren't available yet, are they? So why would there be support for something that isn't available yet? Build it and they will come, as the movie says.

As for the wiggle thing... use a better connector. I've never once disconnected a wire by moving my laptop or PC... but then again, I'm prudent when handling $$$ electronics.
 
You're overthinking it. You're saying that an eGPU has to work and be used the same way an internal one is. I'm saying it doesn't. I don't need to shut down my computer to remove my external disk array; I just have to use the eject feature. I don't need to eject my boot device since it's inside the machine. And I can't RAID both of them because the OS doesn't permit it, and it would be the same with the eGPU. Your application would be able to use it when it is present, but independently of your internal GPU.

As for not having application or DX and OGL support... well, eGPUs aren't available yet, are they? So why would there be support for something that isn't available yet? Build it and they will come, as the movie says.

As for the wiggle thing... use a better connector. I've never once disconnected a wire by moving my laptop or PC... but then again, I'm prudent when handling $$$ electronics.

I'm going to repeat it again so you understand: Disconnecting a GPU using the current Thunderbolt 2 mechanisms requires shutting down the machine. There is no magical technical workaround even if you could code one. You can say "They just need an eject sequence!" all you want, but you have to shut down the window server to make that work, and shutting down the window server shuts down every single other app. And that's just the smallest problem with this whole scheme.

And I'll say it again: this will never fly with Apple, even with an eject sequence. Grab your closed laptop in a hurry, unplug all the cables, and oops, your whole machine crashed. That's way worse than a disk disconnect. Never going to happen. Apple's never going to sanction something where an accidental disconnect crashes your whole machine. Never ever ever.

If you want to say "But eject sequence!" please just again refer to the above paragraph.

Same thing happened with SSDs. "Why doesn't Apple just enable TRIM for all SSDs?" Because some SSDs commit seppuku when TRIM is enabled. "But that's the user's fault for picking a bad SSD!" Yes, but Apple isn't in the business of putting users in situations where they can easily destroy a bunch of data. They don't want to have to deal with that. They don't want the support burden. They don't want the data loss. They don't want a call from a user who upgraded to 10.8, whose computer is suddenly losing bits, and where it looks like Apple's fault. That's why we ended up getting what we got with TRIM, and why it's an opt-in in the Terminal.

I don't know if we'll get eGPUs right away with Thunderbolt 3, but that's why we don't have them now.
 
I'm going to repeat it again so you understand: Disconnecting a GPU using the current Thunderbolt 2 mechanisms requires shutting down the machine.

Apple were happy shipping a laptop that you had to log out of / reboot to switch between the integrated and discrete graphics built into the actual machine, so having to log out / shut down to disconnect an eGPU isn't really THAT much of a stretch. It's certainly a "you'll never encounter this" problem for a desktop system.

I think there's a great deal of "why wouldn't you just buy a new machine?" - not moustache-twirly villain "haha, now we'll make more money because they can't upgrade" but the sort of blind spot wealthy people (and if there's one place where Apple lacks diversity, it's in the wealth of the people who make decisions) have when it comes to products and services which are below THEIR impulse purchase floor, but above other people's ceilings.

Bets on a TB3 Display, with graphics card in it, which is non-upgradable - "Just buy a new display" as the justification for what to do when the card is out of date.
 
Rendering clusters aren't anything new either.
true.
An improvement over the old/current style of clusters is that we may not need to use rendering nodes... or rather, I currently can't simply plug another computer into this one and have its resources available to use; instead, the other computer needs application-specific node software installed and an IP connection initiated. (Further, under this implementation, the node resources don't kick in immediately: if I start a render, it might be 30 seconds until all systems are go.)

If we could just plug something into the computer and have it function as if those nodes were internal... that'd be sweet, and hopefully something the eGPU designers are striving for.
 
I'm going to repeat it again so you understand: Disconnecting a GPU using the current Thunderbolt 2 mechanisms requires shutting down the machine. There is no magical technical workaround even if you could code one. You can say "They just need an eject sequence!" all you want, but you have to shut down the window server to make that work, and shutting down the window server shuts down every single other app. And that's just the smallest problem with this whole scheme.

And I'll say it again: this will never fly with Apple, even with an eject sequence. Grab your closed laptop in a hurry, unplug all the cables, and oops, your whole machine crashed. That's way worse than a disk disconnect. Never going to happen. Apple's never going to sanction something where an accidental disconnect crashes your whole machine. Never ever ever.

If you want to say "But eject sequence!" please just again refer to the above paragraph.

Same thing happened with SSDs. "Why doesn't Apple just enable TRIM for all SSDs?" Because some SSDs commit seppuku when TRIM is enabled. "But that's the user's fault for picking a bad SSD!" Yes, but Apple isn't in the business of putting users in situations where they can easily destroy a bunch of data. They don't want to have to deal with that. They don't want the support burden. They don't want the data loss. They don't want a call from a user who upgraded to 10.8, whose computer is suddenly losing bits, and where it looks like Apple's fault. That's why we ended up getting what we got with TRIM, and why it's an opt-in in the Terminal.

I don't know if we'll get eGPUs right away with Thunderbolt 3, but that's why we don't have them now.

Keyword "CURRENT"... What I'm discussing is a new system. Try to keep up.
Try to read what I posted again, this time with this fact in mind.
Take note of the point where I said that an eGPU doesn't have to be managed the way an internal GPU is, just like an external drive isn't managed the way an internal one is.

This isn't about you, so calm down.
 
true.
An improvement over the old/current style of clusters is that we may not need to use rendering nodes... or rather, I currently can't simply plug another computer into this one and have its resources available to use; instead, the other computer needs application-specific node software installed and an IP connection initiated. (Further, under this implementation, the node resources don't kick in immediately: if I start a render, it might be 30 seconds until all systems are go.)

If we could just plug something into the computer and have it function as if those nodes were internal... that'd be sweet, and hopefully something the eGPU designers are striving for.

You already can with an ASIC device. Look at Bitcoin mining for an example of this.
 
Apple were happy shipping a laptop that you had to log out of / reboot to switch between the integrated and discrete graphics built into the actual machine, so having to log out / shut down to disconnect an eGPU isn't really THAT much of a stretch. It's certainly a "you'll never encounter this" problem for a desktop system.

I think there's a great deal of "why wouldn't you just buy a new machine?" - not moustache-twirly villain "haha, now we'll make more money because they can't upgrade" but the sort of blind spot wealthy people (and if there's one place where Apple lacks diversity, it's in the wealth of the people who make decisions) have when it comes to products and services which are below THEIR impulse purchase floor, but above other people's ceilings.

Bets on a TB3 Display, with graphics card in it, which is non-upgradable - "Just buy a new display" as the justification for what to do when the card is out of date.

But that GPU would never be out of date, since it is already perfectly paired to drive that monitor at optimal capability (resolution & framerate). In short, from a marketing point of view, that GPU would only be out of date when the display is.
 
Show of hands, how many people posting opinions and "facts" about eGPUs here have actually set up and used one?

Last time I hot plugged one into nMP it didn't KP.

It showed up as an Nvidia Chip Model in System Profiler.

Had to restart to use it but it didn't crash the system like it did a few OS revisions ago.

In Windows it's even easier: it can be hot-plugged, you get the "new device" sound, and if drivers are already installed, it starts working. Plug a display into it and start using it.

It can be unplugged too; you'd just better have another display connected via the iGPU or you won't see anything. So, with OS X being a superior OS...

Time to get out of the armchairs and try some stuff guys.
 
Show of hands, how many people posting opinions and "facts" about eGPUs here have actually set up and used one?

Last time I hot plugged one into nMP it didn't KP.

It showed up as an Nvidia Chip Model in System Profiler.

Had to restart to use it but it didn't crash the system like it did a few OS revisions ago.

In Windows it's even easier: it can be hot-plugged, you get the "new device" sound, and if drivers are already installed, it starts working. Plug a display into it and start using it.

It can be unplugged too; you'd just better have another display connected via the iGPU or you won't see anything. So, with OS X being a superior OS...

Time to get out of the armchairs and try some stuff guys.

Ah! Thank you!
 
Apple were happy shipping a laptop that you had to log out of / reboot to switch between the integrated and discrete graphics built into the actual machine, so having to log out / shut down to disconnect an eGPU isn't really THAT much of a stretch. It's certainly a "you'll never encounter this" problem for a desktop system.

I think there's a great deal of "why wouldn't you just buy a new machine?" - not moustache-twirly villain "haha, now we'll make more money because they can't upgrade" but the sort of blind spot wealthy people (and if there's one place where Apple lacks diversity, it's in the wealth of the people who make decisions) have when it comes to products and services which are below THEIR impulse purchase floor, but above other people's ceilings.

Bets on a TB3 Display, with graphics card in it, which is non-upgradable - "Just buy a new display" as the justification for what to do when the card is out of date.

It's a good idea to have a GPU in an external display for notebooks to connect to. Annoyingly, the industry decided to suppress the MXM format, which promised to bring upgradable GPUs to notebooks and slim-format computers.
 
Apple were happy shipping a laptop that you had to log out of / reboot to switch between the integrated and discrete graphics built into the actual machine, so having to log out / shut down to disconnect an eGPU isn't really THAT much of a stretch. It's certainly a "you'll never encounter this" problem for a desktop system.

Right, but you couldn't accidentally disconnect one of the GPUs. That's the real issue here. You could switch GPUs, but mis-using the system wouldn't cause the machine to crash.

The whole reason that worked is because none of the GPUs could unexpectedly disconnect.
 
In Windows it's even easier: it can be hot-plugged, you get the "new device" sound, and if drivers are already installed, it starts working. Plug a display into it and start using it.

Yeah, try that in OS X...

DirectX got some enhancements in Vista so the whole system won't just die on a GPU crash. Apps aren't necessarily going to survive that; they can die too. It's not meant to work reliably, it's just meant to make sure the system doesn't totally die and at least some apps can survive.
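
Concretely, since Vista a Direct3D application is expected to check for those errors itself after presenting a frame and then rebuild its device. A rough sketch (assuming the usual D3D11 device and swap-chain setup, omitted here):

```cpp
// Sketch of how a D3D11 app is expected to react to the Vista-era device
// removed/reset errors: check the Present() result and rebuild everything.
// Assumes an existing device and swap chain from the usual D3D11 setup.
#include <d3d11.h>
#include <dxgi.h>

bool present_frame(IDXGISwapChain* swapChain, ID3D11Device* device) {
    HRESULT hr = swapChain->Present(1, 0);
    if (hr == DXGI_ERROR_DEVICE_REMOVED || hr == DXGI_ERROR_DEVICE_RESET) {
        // The GPU hung and was reset by the OS, or it physically went away.
        HRESULT reason = device->GetDeviceRemovedReason();
        (void)reason; // e.g. DXGI_ERROR_DEVICE_HUNG, DXGI_ERROR_DEVICE_REMOVED, ...
        // Every resource created on this device is now invalid; the app must
        // tear down and recreate the device, swap chain, textures, buffers, ...
        return false; // caller should rebuild the device or bail out
    }
    return SUCCEEDED(hr);
}
```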

OS X has some plumbing to deal with driver crashes, but not necessarily a GPU simply going missing. And OS X doesn't guarantee stability after a driver crash either.

(Sorry for the double post, the forum usually appends subsequent posts to the first.)
 
You already can with an ASIC device. Look at Bitcoin mining for an example of this.
What's an ASIC? Can you only add the hardware you want?
Like, if I just wanted additional CPU cores, could I plug only those in, instead of the RAM, GPU, and everything else included in a complete computer?
 
Yeah, try that in OS X...

DirectX got some enhancements in Vista so the whole system won't just die on a GPU crash. Apps aren't necessarily going to survive that; they can die too. It's not meant to work reliably, it's just meant to make sure the system doesn't totally die and at least some apps can survive.

OS X has some plumbing to deal with driver crashes, but not necessarily a GPU simply going missing. And OS X doesn't guarantee stability after a driver crash either.

(Sorry for the double post, the forum usually appends subsequent posts to the first.)

He did try it on OS X and it didn't crash. You should take the time to read before replying.
 