I've managed to disable SIP in Sierra, though I had to do it differently than you would on a real Mac. In ESXi the Cmd-R (or whatever) keystroke doesn't work at boot, so I had to use the VMware boot menu to boot the recovery partition, then disable SIP. VoodooHDA actually works, but it pops and distorts when the VM has a lot of I/O going on. So watching YouTube is painful while, for example, emails are downloading.
I used a Mac VM on ESXi for about a year as my primary work machine and never could resolve the audio and video issues. Granted, the host was a server-class machine with limited A/V resources. I took the same steps you mention without positive results. So I got an old Mini and loaded it with the backup from the VM. It worked fine until SIP came on the scene and VoodooHDA caused issues with booting. So I removed all the non-Apple kext files and it worked fine after that.
So beware if you use Migration Assistant or restore from a Time Machine backup of a VM: the non-Apple kexts come along with it.
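If it helps anyone doing the same cleanup, here is a rough sketch of how the non-Apple kexts could be listed before deciding what to remove. It is not what I actually ran, just an illustration: the function name `third_party_kexts` is made up, and it assumes each kext bundle has a `Contents/Info.plist` with a `CFBundleIdentifier` key. Treat the output as a list to review, not a list to delete, since macOS itself ships a few vendor kexts whose identifiers don't start with `com.apple.`.

```python
import plistlib
from pathlib import Path


def third_party_kexts(extensions_dir):
    """Return (kext name, bundle id) pairs whose bundle id is not com.apple.*.

    Hypothetical helper for illustration only; it reads Info.plist files
    and never modifies or deletes anything.
    """
    found = []
    for kext in sorted(Path(extensions_dir).glob("*.kext")):
        info = kext / "Contents" / "Info.plist"
        try:
            with info.open("rb") as fh:
                bundle_id = plistlib.load(fh).get("CFBundleIdentifier", "")
        except Exception:
            # Missing or unreadable plist: report the kext rather than hide it.
            bundle_id = ""
        if not bundle_id.startswith("com.apple."):
            found.append((kext.name, bundle_id or "unknown"))
    return found


if __name__ == "__main__":
    # Assumption: third-party kexts normally sit in /Library/Extensions,
    # but older installers may have written to /System/Library/Extensions.
    for directory in ("/Library/Extensions", "/System/Library/Extensions"):
        for name, bundle_id in third_party_kexts(directory):
            print(f"{directory}/{name}  ({bundle_id})")
```

Running it on the restored machine prints one line per kext that doesn't carry an Apple identifier, which gives you something to go through by hand after a migration or restore.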
I got around the video issues by installing a GTX 1050 Ti, enabling GPU passthrough in ESXi, then installing the Nvidia web driver remotely through the VMware console. It works perfectly, though I've had one minor issue with fullscreen video playback. I even got HDMI audio working through VoodooHDA, but it has more issues with I/O than the onboard audio.
I don't actually have a server running ESXi. It's a Dell T5500 workstation with two video slots, so the logistics of GPU passthrough are easier to accomplish than with a true server. Fortunately it was 100% compatible with ESXi, though I don't have hardware monitoring.