Technically you don't have to tell the kernel it has a GPU connected in order to use it as a compute accelerator. Nvidia could expose its GPU as just another high-bandwidth peripheral and adjust its libraries to handle it. The drawbacks: running outside the kernel carries some performance penalty, the GPU can't be used to display anything, and it isn't available to standard GPGPU APIs like OpenCL or Metal, only CUDA.
That seems like one possible path. The display driver stack is only going to be available to kernel drivers, but they might be able to ship CUDA-only drivers.
The issue is going to be that since they don't ship anything in kernel space, they can't patch the CUDA functions in for every application. CUDA applications would have to come up with a new way to talk to the driver in user space across process boundaries. It's still kind of a hacky mess.
DriverKit doesn't really give applications a clean way to provide custom services to other applications, and certainly not in a way that would work with existing CUDA apps.
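For context, the mechanism DriverKit does give you is an IOKit user client: an app opens a connection to the dext and calls numbered "external methods" on it. Rough sketch of what that looks like from the app side (the "NVCompute" class name and selector 0 are made up here, just to show the shape of the call):

    #include <IOKit/IOKitLib.h>
    #include <stdio.h>

    int main(void) {
        /* Look up the (hypothetical) dext by its IOService class name. */
        io_service_t service = IOServiceGetMatchingService(
            kIOMasterPortDefault, IOServiceMatching("NVCompute"));
        if (service == IO_OBJECT_NULL) {
            fprintf(stderr, "driver not found\n");
            return 1;
        }

        /* Open a user client connection to the driver. */
        io_connect_t conn;
        if (IOServiceOpen(service, mach_task_self(), 0, &conn) != KERN_SUCCESS) {
            IOObjectRelease(service);
            return 1;
        }

        /* Every request is an external method call identified by a selector. */
        uint64_t in = 42, out = 0;
        uint32_t outCnt = 1;
        IOConnectCallScalarMethod(conn, 0 /* hypothetical selector */,
                                  &in, 1, &out, &outCnt);

        IOServiceClose(conn);
        IOObjectRelease(service);
        return 0;
    }

That covers app-to-driver calls (build with -framework IOKit), but it's per-connection plumbing between one process and the dext; it's not a way for a vendor runtime to transparently stand in for the service that existing CUDA binaries expect.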