You mean this?
Metal for Pro Apps - WWDC19 - Videos - Apple Developer
Metal is the platform-optimized graphics and compute framework at the heart of GPU acceleration on Apple platforms. Learn key aspects of...
developer.apple.com
Actually, you can either use each GPU independently and even transfer data between GPUs over Infinity Fabric with specific commands, or link multiple GPUs on a single workload and let both the Infinity Fabric and the GPUs work as a single device. It's called the "Metal Peer Group API".
Peer Groups don't actually virtualize, or make opaque, much at all.
"... In Metal, resources are created by device objects, and are always associated with the device object that created them. Peer groups don't change that association. If a resource is associated with a device object, and you want to access it on another device object, you need to copy the data to a resource associated with the second device object. ..."
Transferring Data Between Connected GPUs | Apple Developer Documentation
Use high-speed connections between GPUs to transfer data quickly.
developer.apple.com
Note the "don't change that association" above. Peer groups remove some of the grunt work of doing the copy (or remote access), but they keep that step explicit. What you more pragmatically have is a group of GPUs that can more easily share data, but the data is still explicitly assigned to individual GPUs. That all lines up with Metal's general mindset of more direct access to the low-level GPU mechanisms and of explicitly directing things. The app developer has to create the view, initiate the copy, and synchronize the data (none of that is done for them without code invocations).
The following means some 'work' needs to be explicitly laid out by applications.
"... To copy data between members of a peer group, make a remote view on the second GPU that’s connected to the resource you want to copy. .."
Once the views are created, apps can handily access them, but Metal does next to nothing to automagically provide the kind of normal, flat address space that apps get from virtual host memory, or that virtual SMT/Hyper-Threading provides on the CPU side.
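To make that explicitness concrete, here is a rough Swift sketch of the steps an app has to spell out itself per the linked Apple doc: allocate the destination on the second GPU, make a remote view of the source, encode a blit copy, and synchronize. The function name and error handling are mine; `peerGroupID`, `makeRemoteBufferView(_:)`, and the blit encoder are from Apple's documentation, and real code would synchronize with a `MTLSharedEvent` rather than blocking.

```swift
import Metal

// Hedged sketch: copy a buffer's contents from gpuA to gpuB, which must be
// members of the same peer group. Nothing here happens implicitly.
func copyAcrossPeerGroup(from gpuA: MTLDevice, to gpuB: MTLDevice,
                         source: MTLBuffer) -> MTLBuffer? {
    // Peer transfers only work between devices in the same (nonzero) peer group.
    guard gpuA.peerGroupID != 0, gpuA.peerGroupID == gpuB.peerGroupID else {
        return nil
    }
    // The destination resource is explicitly created on, and stays owned by, gpuB.
    guard let destination = gpuB.makeBuffer(length: source.length,
                                            options: .storageModePrivate),
          // The "remote view": gpuB's window onto gpuA's buffer.
          let remoteSource = source.makeRemoteBufferView(gpuB),
          let queue = gpuB.makeCommandQueue(),
          let commandBuffer = queue.makeCommandBuffer(),
          let blit = commandBuffer.makeBlitCommandEncoder() else {
        return nil
    }
    // The copy itself is an explicit blit command, scheduled by the app.
    blit.copy(from: remoteSource, sourceOffset: 0,
              to: destination, destinationOffset: 0,
              size: source.length)
    blit.endEncoding()
    commandBuffer.commit()
    commandBuffer.waitUntilCompleted() // simplification; use MTLSharedEvent in practice
    return destination
}
```

Every step above, view creation, the copy, the wait, is application code; nothing about the peer group flattens the two devices into one address space.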
I suspect the "chop up the work/data for me and distribute it" role may land more so in a future AVFramework or "ComputationXYZFramework" that Apple may deploy. I'd be surprised if a Metal 3 feature started growing a bigger opaque layer. (The other major stakeholders in the API, like game engine developers, will probably want their own layer for that, and will want Metal to provide them better building blocks... not pre-fab houses.)
The problem with Apple and GPGPU is that they are just following CUDA; they don't lead and don't introduce anything new, just another redundant proprietary API.
That is a bit of a stretch. The Message Passing Interface (MPI) has existed since the early 1990s and had concepts of "smoothing out" access to memory across the multiple nodes/hosts being used to solve a distributed computational problem. How to route that intercommunication down onto a "faster" fabric is a general problem that was well underway before the GPGPU concept took off.
The AMD vs. Nvidia fanboy wars tend to drift into treating CUDA as the point of origin of good HPC / "highly parallel" compute concepts, but there was lots of work before any of those became trendy.