I think your speeds represent the maximum available on a cMP without a bifurcation card. I get your speeds or somewhat less with a 960 EVO on my 6-core.
On my 12-core, I have the popular Syba/IO Crest PCIe card which, in slot 2 (x16), allows for 2 blades (one 970 EVO and one 970 EVO+ in my case) at PCIe 3.0 speeds, or close to it. They both get 2.4GB/s writes and 2.8GB/s reads. The cards have a fan and a heat sink, and cost around $200. My only complaint is that copying between the two blades is way too slow.
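For anyone wondering where those ceilings come from, here is a rough back-of-the-envelope sketch. It assumes the cMP slots are PCIe 2.0 and that each blade is a x4 device; the function name and table are just mine for illustration, and the numbers are theoretical per-direction link rates before protocol overhead, so real-world results will be lower.

```python
# Theoretical per-direction PCIe link bandwidth, before protocol overhead.
# Assumption: cMP slots are PCIe 2.0 and NVMe blades are x4 devices.
# Each entry is (transfer rate in GT/s, line-encoding efficiency).
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),     # PCIe 2.0: 5 GT/s with 8b/10b encoding
    3: (8.0, 128 / 130),  # PCIe 3.0: 8 GT/s with 128b/130b encoding
}


def link_bandwidth_gbps(generation: int, lanes: int) -> float:
    """Return the theoretical one-direction bandwidth of a link in GB/s."""
    rate_gt, efficiency = PCIE_GENERATIONS[generation]
    # GT/s * efficiency gives usable Gbit/s per lane; divide by 8 for GB/s.
    return rate_gt * efficiency / 8 * lanes


if __name__ == "__main__":
    print(f"PCIe 2.0 x4 : {link_bandwidth_gbps(2, 4):.2f} GB/s")
    print(f"PCIe 2.0 x16: {link_bandwidth_gbps(2, 16):.2f} GB/s")
    print(f"PCIe 3.0 x4 : {link_bandwidth_gbps(3, 4):.2f} GB/s")
```

Under those assumptions, a single blade on a plain x4 adapter is capped by the PCIe 2.0 x4 link, while a card in the x16 slot has enough upstream bandwidth for a blade to run close to its own PCIe 3.0 x4 limit.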
I have Catalina on one blade and both Mojave and Big Sur in separate APFS volumes on the other. I have not been successful using multiple boot volumes with the OpenCore EFI boot. I've tried more than once, but not conclusively.
On the 6-core, I have Catalina (OC-EFI) on the blade, and both Mojave and Big Sur on volumes on an eSATA3 SSD. I am not sure, but I believe I can just remove the OC-EFI from Catalina and put it on either of the other boot volumes. That way I can have Big Sur on the NVMe as the EFI drive. The only problem is that Big Sur doesn't clone with SD or CCC yet, at least not without workarounds.