Exactly!
The Mac App Store version is crippled, the result of an arrangement between BMD and Apple. Apps on Apple's store can't access some deeper functions and hardware features because they're heavily sandboxed. So Apple allowed BMD to ship a version that could make proper use of the trashcan Mac Pro (especially with its two GPUs) yet couldn't use CUDA and the like. Until v12 it might have made sense to buy it if you were running a trashcan Mac Pro, because at $299 it was cheaper than the real Studio version, and also because you couldn't buy Studio from BMD online; you had to get a physical copy. But recently BMD dropped the price of the real Studio version to $299 and switched from a USB dongle to a serial-number license, making it easier to buy online.
And, icing on the cake, if you have a dongle version, they will happily provide you with a serial number, effectively giving you a free second copy.
If they continue at this rate (innovations/perf/prices), it won't be long before Premiere is gone for good and only YouTubers and mid-range users stick with FCPX.
Just sharing: I managed to run some tests on v14.
I can see options for choosing CUDA, OpenCL, or METAL (or leaving it on Auto).
Did a quick test.
1) Import a 10 s 720p@60 video with zero editing
2) Deliver as 3840x2160@30 MP4 H.264 to an SSD
Results
CUDA 1:24 with poor output quality (the video became seriously interlaced)
OpenCL 1:11 with good video quality
METAL 1:07 with good video quality
Auto 1:08 with good video quality
As for the power draw of the 1080 Ti: METAL drew up to 120 W, while all other options drew up to 60 W.
Clearly the GPU was doing something, but it wasn't that busy.
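To put the timings above side by side, here is a small Python sketch that converts them to seconds and compares each API against the fastest run. The frame count is my own assumption (a 10 s clip delivered at 30 fps should be about 300 output frames), so treat the fps column as a rough estimate, not a measurement.

```python
# Export times exactly as reported in the test above (m:ss).
timings = {"CUDA": "1:24", "OpenCL": "1:11", "METAL": "1:07", "Auto": "1:08"}


def to_seconds(mmss):
    """Convert an 'm:ss' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Assumption: 10 s of video delivered at 30 fps is 300 output frames.
OUTPUT_FRAMES = 10 * 30

seconds = {api: to_seconds(t) for api, t in timings.items()}
fastest = min(seconds.values())

# Print the runs from fastest to slowest.
for api, secs in sorted(seconds.items(), key=lambda kv: kv[1]):
    slowdown = (secs / fastest - 1) * 100   # percent slower than the best run
    fps = OUTPUT_FRAMES / secs              # estimated export frame rate
    print(f"{api:<7}{secs:>4} s  {fps:4.1f} fps  +{slowdown:4.1f}% vs fastest")
```

On these numbers OpenCL, METAL and Auto all land within a few seconds of each other, and CUDA is the only clear outlier.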
What I suspect is that the GPU can only do the rendering part, not the encoding part (the video engine is still disabled; only GPGPU is in use). Therefore, no matter how fast the GPU can render, it still has to wait for the CPU to encode. My understanding is as follows.
1) We edit a video and want to export it
2) The GPU renders each frame and stores it somewhere, maybe in cache, maybe in memory
3) The CPU then compares the differences between frames and encodes them accordingly
Therefore, once you hit the CPU encoding limit, upgrading the GPU won't help. I think that is at least true on the macOS side. The CUDA hardware encoding library simply doesn't exist on macOS (I can use the nvenc option in FFmpeg on Windows, but not on macOS).
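If you want to check the nvenc point on your own machine, a rough Python sketch like this lists the NVENC encoders your FFmpeg build reports. It relies on `ffmpeg -hide_banner -encoders`, which prints one encoder per row as flags, name, description; the helper names are my own, and which encoders show up depends entirely on how your FFmpeg was built.

```python
import shutil
import subprocess


def nvenc_encoders(encoders_text):
    """Pick NVENC encoder names out of `ffmpeg -encoders` output."""
    names = []
    for line in encoders_text.splitlines():
        parts = line.split()
        # Encoder rows look like "<flags> <name> <description...>".
        if len(parts) >= 2 and "nvenc" in parts[1]:
            names.append(parts[1])
    return names


def installed_nvenc_encoders():
    """Ask the local ffmpeg binary, if any; an empty list means none found."""
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        return []
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=False,
    )
    return nvenc_encoders(result.stdout)


if __name__ == "__main__":
    found = installed_nvenc_encoders()
    print(found if found else "no NVENC encoders reported by ffmpeg")
```

An empty result only tells you that this particular FFmpeg build has no NVENC support; it says nothing about what Resolve itself can use.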
The CPU encoding limit also explains why you can get some improvement by upgrading the GPU (until you hit the 60 FPS limit): while the rendering part is the bottleneck, a faster GPU does help.
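The bottleneck idea can be sketched as a toy model in Python. It assumes the GPU render stage and the CPU encode stage overlap, so export speed is capped by the slower of the two. All the numbers are made up for illustration (the 60 fps CPU cap just mirrors the limit mentioned above), and `export_fps` is a hypothetical helper, not anything from Resolve.

```python
def export_fps(gpu_render_fps, cpu_encode_fps):
    """Throughput of a two-stage pipeline is capped by its slower stage."""
    return min(gpu_render_fps, cpu_encode_fps)


# Hypothetical CPU encoding cap, for illustration only.
CPU_ENCODE_FPS = 60.0

# Sweep over made-up GPU render speeds to see where the upgrade stops paying off.
for gpu_fps in (30.0, 45.0, 60.0, 120.0, 240.0):
    result = export_fps(gpu_fps, CPU_ENCODE_FPS)
    limiter = "GPU" if gpu_fps < CPU_ENCODE_FPS else "CPU"
    print(f"GPU renders {gpu_fps:5.0f} fps -> export {result:5.0f} fps (limited by {limiter})")
```

In this model a faster GPU helps right up to the CPU cap, and after that the export speed stays flat no matter how much GPU you add.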
Anyway, based on the test results, I will leave the setting at Auto and let the software decide which API to use. And I'll definitely avoid manually selecting CUDA.