Here is the VMware shared pass-through graphics compatibility guide:
https://www.vmware.com/resources/compatibility/pdf/vi_sptg_guide.pdf
Deploying Hardware-Accelerated Graphics with VMware Horizon 7:
https://techzone.vmware.com/resource/deploying-hardware-accelerated-graphics-vmware-horizon-7
Installation, Configuration, and Setup
For graphics acceleration, you need to install and configure the following components:
- ESXi 6.x host
- Virtual machine
- Guest operating system
- Horizon 7 version 7.x desktop pool settings
- License server
ESXi 6.x Host
Installing the graphics card and configuring the ESXi host vary based on the type of graphics acceleration.
Installing and Configuring the ESXi Host for vSGA or vGPU
- Install the graphics card on the ESXi host.
- Put the host in maintenance mode.
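If you prefer the command line to the vSphere client, maintenance mode can also be toggled from the ESXi shell with standard esxcli commands:
esxcli system maintenanceMode set --enable true
esxcli system maintenanceMode get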
- If you are using an NVIDIA Tesla P card, disable ECC.
- If you are using an NVIDIA Tesla M card, set the card to graphics mode (the default is compute) using GpuModeSwitch, which comes as a bootable ISO or a VIB.
a. Install GpuModeSwitch without an NVIDIA driver installed:
esxcli software vib install --no-sig-check -v /<path_to_vib>/NVIDIA-GpuModeSwitch-1OEM.xxx.0.0.xxxxxxx.x86_64.vib
b. Reboot the host.
c. Change all GPUs to graphics mode:
gpumodeswitch --gpumode graphics
d. Remove GpuModeSwitch:
esxcli software vib remove -n NVIDIA-VMware_ESXi_xxx_GpuModeSwitch_Driver
- Install the GPU VIB:
esxcli software vib install -v /<path_to_vib>/NVIDIA-VMware_ESXi_xxx_Host_Driver_xxx.xx-1OEM.xxx.0.0.xxxxxxx.vib
If you are using ESXi 6.0, vSGA and vGPU have different VIB files.
If you are using ESXi 6.5, both vSGA and vGPU use the same VIB file.
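After installing the VIB, you can confirm that it registered correctly (the same check appears later under Troubleshooting):
esxcli software vib list | grep -i NVIDIA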
- Reboot, and take the host out of maintenance mode.
- If you are using an NVIDIA card and vSphere 6.5 or later, in the vSphere Web Client, navigate to Host > Configure > Hardware > Graphics > Host Graphics > Edit to open the Edit Host Graphics Settings window.
a. For vGPU, select Shared Direct. For vSGA, select Shared.
b. If you are using vGPU with different profiles per GPU, select Group VMs on GPU until full (GPU consolidation). With this setting, VMs with different profiles are placed on different GPUs, and VMs with the same profile are placed on the same GPU until it is full. This method prevents you from running out of free GPUs for the remaining profiles.
Example:
The host has a single M60 card, which has two GPUs, each with 8 GB of memory. Two VMs with a 4 GB frame-buffer profile and four VMs with a 2 GB profile are trying to run. A physical GPU can host only one vGPU profile type at a time, so if the first two VMs started have the same 4 GB profile and are placed on different GPUs, no GPU remains available for the 2 GB profile. With Group VMs on GPU until full (GPU consolidation), virtual machines with the same profile start on the same GPU.
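On ESXi 6.5 and later, the same host graphics settings can also be applied from the ESXi shell through the esxcli graphics namespace. A minimal sketch (option values assume a current 6.5 build):
esxcli graphics host set --default-type SharedPassthru
esxcli graphics host set --shared-passthru-assignment-policy Consolidation
esxcli graphics host get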
Installing and Configuring the ESXi Host for MxGPU
- Install the graphics card on the ESXi host.
- Put the host in maintenance mode.
- In the BIOS of the ESXi host, verify that single-root IO virtualization (SR-IOV) is enabled and that one of the following is also enabled.
- Intel Virtualization Technology support for Direct I/O (Intel VT-d)
- AMD IO memory management unit (IOMMU)
- Browse to the location of the AMD FirePro VIB driver and AMD VIB install utility:
cd /<path_to_vib>
- Make the VIB install utility executable, and execute it:
chmod +x mxgpu-install.sh && sh mxgpu-install.sh -i
- In the script, select the option that suits your environment:
Enter the configuration mode([A]uto/[H]ybrid/[M]anual,default:A)A
- For the number of virtual functions, enter the number of users you want to run on a GPU:
Please enter number of VFs: (default:4): 8
- Choose whether you want to keep performance fixed and independent of the number of active VMs:
Do you want to enable Predictable Performance? ([Y]es/[N]o,default:N)N
…
Done
The configuration needs a reboot to take effect
- Reboot and take the host out of maintenance mode.
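As a quick sanity check after the reboot, you can verify that the SR-IOV virtual functions appeared as display-class devices (lspci is available in the ESXi shell):
lspci | grep -i display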
Installing and Configuring the ESXi Host for vDGA
- Install the graphics card on the ESXi host.
- In the BIOS of the ESXi host, verify that Intel VT-d or AMD IOMMU is enabled.
- To enable pass-through for the GPU in the vSphere Web Client, navigate to Host > Configure > Hardware > PCI Devices > Edit.
- In the All PCI Devices window, select the GPU, and reboot.
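To confirm that the GPU is now configured for pass-through, list the display-class PCI devices (the same query used later under Troubleshooting) and check the owner fields; a pass-through device is no longer owned by the VMkernel:
esxcli hardware pci list -c 0x0300 -m 0xff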
Virtual Machine
Configure the general settings for the virtual machine, and then configure it according to the type of graphics acceleration you are using.
General Settings for Virtual Machines
Hardware version – The recommended virtual hardware version is the highest that all hosts support. The minimum is hardware version 11.
CPU – The number of CPUs required depends on usage and is determined by actual workload. As a starting point, consider these numbers:
Knowledge workers: 2
Power users: 4
Designers: 6
Memory – The amount of memory required depends on usage and is determined by actual workload. As a starting point, consider these amounts:
Knowledge workers: 2 GB
Power users: 4 GB
Designers: 8 GB
Virtual network adapter – The recommended virtual network adapter is VMXNET3.
Virtual storage controller – The recommended virtual disk is LSI Logic SAS, but demanding workloads using local flash-based storage might benefit from using VMware Paravirtual.
Other devices – We recommend removing devices that are not used, such as a COM port, a printer port, DVD, or floppy.
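For reference, these general settings correspond to .vmx entries roughly like the following (an illustrative fragment for a power-user VM, not a complete file):
virtualHW.version = "11"
numvcpus = "4"
memSize = "4096"
ethernet0.virtualDev = "vmxnet3"
scsi0.virtualDev = "lsisas1068"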
Now that you have configured the general settings for the virtual machine, configure the settings for the type of graphics acceleration.
Virtual Machine Settings for vSGA
Configure the virtual machine as follows if you are using vSGA.
- Enable 3D graphics by selecting Enable 3D Support.
- Set the 3D Renderer to Automatic or Hardware.
Automatic uses hardware acceleration if the host that the virtual machine is starting in has a capable and available hardware GPU. If a hardware GPU is not available, the virtual machine uses software 3D rendering for 3D tasks. The Automatic option allows the virtual machine to be started on or migrated to (via vSphere vMotion) any host (vSphere version 5.0 or later) and to use the best solution available on that host.
Hardware uses only hardware-accelerated GPUs. If a hardware GPU is not present in a host, the virtual machine does not start, or you cannot perform a live vSphere vMotion migration to that host. Migration is possible as long as the host that the virtual machine is being moved to has a capable and available hardware GPU. The Hardware option guarantees that a virtual machine always uses hardware 3D rendering when a GPU is available, but it limits the virtual machine to using hosts that have hardware GPUs.
- Select the amount of video memory (3D Memory).
3D Memory has a default of 96 MB, a minimum of 64 MB, and a maximum of 512 MB.
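Behind the scenes, these options map to .vmx entries similar to the following (a hedged example; svga.vramSize is expressed in bytes, here 256 MB):
mks.enable3d = "TRUE"
svga.vramSize = "268435456"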
Virtual Machine Settings for vGPU
Configure the virtual machine as follows if you are using vGPU.
- On the vSphere console, select your virtual machine, and navigate to Edit Settings.
- Add a shared PCI device to the virtual machine, and select the appropriate PCI device to enable GPU pass-through on the virtual machine. In this case, select NVIDIA GRID vGPU.
- From the GPU Profile drop-down menu, select the correct profile.
The last part of the GPU Profile string (for example, 4q) indicates the size of the frame buffer (VRAM) in gigabytes and the required GRID license. For the VRAM, 0 means 512 MB, 1 means 1024 MB, and so on, so a 4q profile has a 4 GB frame buffer. The possible GRID license types are:
b – GRID Virtual PC virtual GPUs for business desktop computing
a – GRID Virtual Application virtual GPUs for Remote Desktop Session Hosts
q – Quadro Virtual Datacenter Workstation (vDWS) for workstation-specific graphics features and accelerations, such as up to four 4K monitors and certified drivers for professional applications
- Select Reserve all memory so that all of the virtual machine's memory is reserved when it is created.
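If you want to verify the result in the .vmx file, a vGPU assignment looks roughly like this (a hedged example for an M60 card with a 4q profile; the profile name must match one offered by the host):
pciPassthru0.present = "TRUE"
pciPassthru0.virtualDev = "vmiop"
pciPassthru0.vgpu = "grid_m60-4q"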
Virtual Machine Settings for MxGPU and vDGA
Configure the virtual machine as follows if you are using MxGPU or vDGA.
- For devices with a large BAR size (for example, Tesla P40), you must use vSphere 6.5 and set the following advanced configuration parameters on the VM:
- firmware="efi"
- pciPassthru.use64bitMMIO="TRUE"
- pciPassthru.64bitMMIOSizeGB="64"
- Add a PCI device (virtual functions are presented as PCI devices) to the virtual machine, and select the appropriate PCI device to enable GPU pass-through.
With MxGPU, you can also do this by installing the Radeon Pro Settings for the VMware vSphere Client Plug-in.
To add a PCI device to multiple machines at once over SSH:
a. Browse to the AMD FirePro VIB driver and AMD VIB install utility:
cd /<path_to_vib>
b. Edit vms.cfg:
vi vms.cfg
Press I, and change the instances of .* to match the names of the VMs that require a GPU. For example, the following pattern matches VM names that include MxGPU, such as WIN10-MxGPU-001 or WIN8.1-MxGPU-002:
.*MxGPU.*
To save and quit, press Esc, type :wq, and press Enter.
c. Assign the virtual functions to the VMs:
sh mxgpu-install.sh -a assign
Eligible VMs:
WIN10-MxGPU-001
WIN10-MxGPU-002
WIN8.1-MxGPU-001
WIN8.1-MxGPU-002
These VMs will be assigned a VF, is it OK?[Y/N]y
d. Press Enter.
- Select Reserve all guest memory (All locked).
Guest Operating System
Install and configure the guest operating system.
Windows Guest Operating System
For a Windows guest operating system, install and configure as follows.
- Install Windows 7, Windows 10, or Windows Server 2012 R2, and install all updates.
- The following installations are also recommended.
- Install common Microsoft runtimes and features.
Before updating Windows in the VM, install the required versions of Microsoft runtimes that are patched by Windows Update and that can run side by side in the image. For example, install:
- .NET Framework (3.5, 4.5, and so on)
- Visual C++ Redistributables x86 / x64 (2005 SP1, 2008, 2012, and so on)
- Install Microsoft updates.
Install the updates to Microsoft Windows and other Microsoft products with Windows Update or Windows Server Update Service. You might need to first manually install Windows Update Client for Windows 8.1 and Windows Server 2012 R2: March 2016.
- Tune Windows with the VMware OS Optimization Tool, using the default options.
- If you are not using vSGA:
- Obtain the GPU drivers from the GPU vendor (with vGPU, this is a matched pair with the VIB file).
- Install the GPU device drivers in the guest operating system of the virtual machine. For MxGPU, make sure that the GPU Server option is selected.
- Install VMware Tools™ and Horizon Agent (select 3D RDSH feature for Windows 2012 R2 Remote Desktop Session Hosts) in the guest operating system.
- Reboot the system.
Red Hat Enterprise Linux Operating System for vGPU and vDGA
For a Red Hat Enterprise Linux guest operating system, install and configure as follows.
- Install Red Hat Enterprise Linux 6.9 or 7.4 x64, install all updates, and reboot.
- Install gcc, kernel makefiles, and headers:
sudo yum install gcc-c++ kernel-devel-$(uname -r) kernel-headers-$(uname -r) -y
- Disable libvirt:
sudo systemctl disable libvirtd.service
- Disable the open-source nouveau driver.
a. Open the following configuration file using vi:
sudo vi /etc/default/grub
If you are using RHEL 6.x:
sudo vi /boot/grub/grub.conf
b. Find the line for GRUB_CMDLINE_LINUX, and add blacklist=nouveau to the line.
c. Add the line blacklist=nouveau anywhere in the following configuration file:
sudo vi /etc/modprobe.d/blacklist.conf
- Generate new grub.cfg and initramfs files:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
sudo dracut /boot/initramfs-$(uname -r).img $(uname -r) -f
- Reboot.
- Install the NVIDIA driver, and acknowledge all questions:
init 3
chmod +x NVIDIA-Linux-x86_64-xxx.xx-grid.run
sudo ./NVIDIA-Linux-x86_64-xxx.xx-grid.run
- (Optional) Install the CUDA Toolkit (run file method recommended), but do not install the included driver.
- Add license server information:
sudo cp /etc/nvidia/gridd.conf.template /etc/nvidia/gridd.conf
sudo vi /etc/nvidia/gridd.conf
Set ServerAddress and BackupServerAddress to the DNS names or IPs of your license servers, and FeatureType to 1 for vGPU and 2 for vDGA.
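The resulting gridd.conf entries look like the following (the host names here are hypothetical; FeatureType=1 is shown for vGPU):
ServerAddress=gridlicense1.example.com
BackupServerAddress=gridlicense2.example.com
FeatureType=1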
- Install the Horizon Agent:
tar -zxvf VMware-horizonagent-linux-x86_64-7.3.0-6604962.tar.gz
cd VMware-horizonagent-linux-x86_64-7.3.0-6604962
sudo ./install_viewagent.sh
[Screenshot: the NVIDIA X Server Settings window, showing the results of installation and configuration for a Red Hat Enterprise Linux guest operating system.]
Horizon 7 version 7.x Pool and Farm Settings
When you create a farm in Horizon 7, configuring a 3D farm is the same as configuring a normal farm. When you create a View desktop pool in Horizon 7, configure the pool as normal until you reach the Desktop Pool Settings section.
- In the Add Desktop Pool window, scroll to the Remote Display Protocol section.
- For the 3D Renderer option, do one of the following.
- For vSGA, select either Hardware or Automatic.
- For vDGA or MxGPU, select Hardware.
- For vGPU, select NVIDIA GRID VGPU.
Automatic and Hardware behave as described earlier in Virtual Machine Settings for vSGA.
For Horizon 7 version 7.0 or 7.1, configure the amount of VRAM you want each virtual desktop to have. If you are using vGPU, also select the profile to use. With Horizon 7 version 7.1, you can use vGPU with instant clones, but the profile must match the profile set on the parent VM with the vSphere Web Client.
3D Memory has a default of 96 MB, a minimum of 64 MB, and a maximum of 512 MB.
With Horizon 7 version 7.2 and later, the video memory and vGPU profile are inherited from the VM or VM snapshot.
License Server
For vGPU with GRID 2.0, you must install a license server. See the GRID Virtual GPU User Guide included with your NVIDIA driver download.
Resource Monitoring
Various tools are available for monitoring resources when using graphics acceleration.
gpuvm
To better manage the GPU resources available on an ESXi host, examine the current GPU resource allocation. The ESXi command-line query utility gpuvm lists the GPUs installed on an ESXi host and displays the amount of GPU memory that is allocated to each virtual machine on that host.
gpuvm
Xserver unix:0, GPU maximum memory 2076672KB
pid 118561, VM "Test-VM-001", reserved 131072KB of GPU memory
pid 664081, VM "Test-VM-002", reserved 261120KB of GPU memory
GPU memory left 1684480KB
nvidia-smi
To get a summary of the vGPUs currently running on each physical GPU in the system, run nvidia-smi without arguments.
Thu Oct 5 09:28:05 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.73 Driver Version: 384.73 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P40 On | 00000000:84:00.0 Off | Off |
| N/A 38C P0 60W / 250W | 12305MiB / 24575MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 135930 M+C+G manual 4084MiB |
| 0 223606 M+C+G centos3D004 4084MiB |
| 0 223804 M+C+G centos3D003 4084MiB |
+-----------------------------------------------------------------------------+
To monitor vGPU engine usage across multiple vGPUs, run nvidia-smi vgpu with the -u or --utilization option:
nvidia-smi vgpu -u
The following usage statistics are reported once every second for each vGPU.
# gpu   vgpu    sm   mem   enc   dec
# Idx     Id     %     %     %     %
    0  11924     6     3     0     0
    1  11903     8     3     0     0
    2  11908    10     4     0     0
Key:
gpu – GPU ID
vgpu – vGPU ID
sm – Compute
mem – Memory controller bandwidth
enc – Video encoder
dec – Video decoder
Troubleshooting
Try these troubleshooting techniques to address general problems or a specific symptom.
General Troubleshooting for Graphics Acceleration
If an issue arises with vSGA, vGPU, or vDGA, or if Xorg fails to start, try one or more of the following solutions in any order.
Verify That the GPU Driver Loads
To verify that the GPU VIB is installed, run one of the following commands.
- For AMD-based GPUs:
#esxcli software vib list | grep fglrx
- For NVIDIA-based GPUs:
#esxcli software vib list | grep NVIDIA
If the VIB is installed correctly, the output resembles the following:
NVIDIA-VMware   304.59-1-OEM.510.0.0.799733   NVIDIA   VMwareAccepted   2012-11-14
To verify that the GPU driver loads, run the following command.
- For AMD-based GPUs:
#esxcli system module load -m fglrx
- For NVIDIA-based GPUs:
#esxcli system module load -m nvidia
If the driver loads correctly, the output resembles the following:
Unable to load module /usr/lib/vmware/vmkmod/nvidia: Busy
The Busy message means that the module is already loaded, which confirms that the driver is running.
If the GPU driver does not load, check the vmkernel.log:
# vi /var/log/vmkernel.log
On AMD hardware, search for FGLRX. On NVIDIA hardware, search for NVRM. Often, an issue with the GPU is identified in the vmkernel.log.
Verify That Display Devices Are Present in the Host
To make sure that the graphics adapter is installed correctly, run the following command on the ESXi host:
# esxcli hardware pci list -c 0x0300 -m 0xff
The output should resemble the following example, even if some of the particulars differ:
000:001:00.0
Address: 000:001:00.0
Segment: 0x0000
Bus: 0x01
Slot: 0x00
Function: 0x00
VMkernel Name:
Vendor Name: NVIDIA Corporation
Device Name: NVIDIA Quadro 6000
Configured Owner: Unknown
Current Owner: VMkernel
Vendor ID: 0x10de
Device ID: 0x0df8
SubVendor ID: 0x103c
SubDevice ID: 0x0835
Device Class: 0x0300
Device Class Name: VGA compatible controller
Programming Interface: 0x00
Revision ID: 0xa1
Interrupt Line: 0x0b
IRQ: 11
Interrupt Vector: 0x78
PCI Pin: 0x69
Check the PCI Bus Slot Order
If you installed a second, lower-end GPU in the server to drive the console, the ESXi console session might attach to the higher-end card instead. If this occurs, swap the two GPUs between PCIe slots, or change the primary GPU setting in the server BIOS, so that the low-end console card comes first.
Check Xorg Logs
If the correct devices are present according to the previous troubleshooting checks, view the Xorg log file to see whether there is an obvious issue:
# vi /var/log/Xorg.log
Troubleshooting Specific Issues in Graphics Acceleration
This section describes solutions to specific issues that could arise in graphics acceleration deployments.
Problem:
sched.mem.min error when starting the virtual machine.
Solution:
Check sched.mem.min.
If you get a vSphere error about sched.mem.min, add the following parameter to the VMX file of the virtual machine:
sched.mem.min = "4096"
Note: The number in quotes, 4096 in the previous example, must match the amount of configured virtual machine memory. The example is for a virtual machine with 4 GB of RAM.
Problem:
Only able to use one display in Windows 10 with vGPU -0B or -0Q profiles.
Solution:
Use a profile that supports more than one virtual display head and has at least 1 GB of frame buffer.
To reduce the possibility of memory exhaustion, vGPU profiles with 512 MB or less of frame buffer support only one virtual display head on a Windows 10 guest OS.
Problem:
Unable to use NVENC with vGPU -0B or -0Q profiles.
Solution:
If you require NVENC to be enabled, use a profile that has at least 1 GB of frame buffer.
Using the frame buffer for the NVIDIA hardware-based H.264 / HEVC video encoder (NVENC) might cause memory exhaustion with vGPU profiles that have 512 MB or less of frame buffer. To reduce the possibility of memory exhaustion, NVENC is disabled on profiles that have 512 MB or less of frame buffer.
Problem:
Unable to load vGPU driver in the guest operating system.
Depending on the versions of drivers in use, the vSphere VM’s log file reports one of the following errors.
- A version mismatch between guest and host drivers:
vthread-10| E105: vmiop_log: Guest VGX version(2.0) and Host VGX version(2.1) do not match
- A signature mismatch:
vthread-10| E105: vmiop_log: VGPU message signature mismatch
Solution:
Install the latest NVIDIA vGPU release driver matching the installed VIB on ESXi in the VM.
Problem:
Tesla-based vGPU fails to start.
Solution:
Disable error-correcting code (ECC) on all GPUs.
Tesla GPUs support ECC, but the NVIDIA GRID vGPU does not support ECC memory. If ECC memory is enabled, the NVIDIA GRID vGPU fails to start. The following error is logged in the VMware vSphere VM’s log file:
vthread10|E105: Initialization: VGX not supported with ECC Enabled.
- Use nvidia-smi to list the status of all GPUs.
- Check whether ECC is enabled on the GPUs.
- Change the ECC status to Off on each GPU for which ECC is enabled by executing the following command:
nvidia-smi -i <id> -e 0
(<id> is the index of the GPU as reported by nvidia-smi)
- Reboot the host.
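For example, the following sequence checks the ECC state of all GPUs and then disables ECC on GPU 0 (the query field names are from recent nvidia-smi versions and may vary by driver):
nvidia-smi --query-gpu=index,ecc.mode.current --format=csv
nvidia-smi -i 0 -e 0
reboot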
Problem:
Single vGPU benchmark scores are lower than those of a pass-through GPU.
Solution:
Disable the Frame Rate Limiter (FRL) by adding the configuration parameter pciPassthru0.cfg.frame_rate_limiter with a value of 0 in the VM’s advanced configuration options.
FRL is enabled on all vGPUs to ensure balanced performance across multiple vGPUs that are resident on the same physical GPU. FRL is designed to provide a good interactive remote graphics experience, but it can reduce scores in benchmarks that depend on measuring frame-rendering rates as compared to the same benchmarks running on a pass-through GPU.
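In the VM's .vmx file, the entry looks like this:
pciPassthru0.cfg.frame_rate_limiter = "0"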
Problem:
VMs configured with large memory fail to initialize the vGPU when booted.
When starting multiple VMs configured with large amounts of RAM (typically more than 32 GB per VM), a VM might fail to initialize the vGPU. The NVIDIA GRID GPU is present in Windows Device Manager but displays a warning sign and the following device status:
Windows has stopped this device because it has reported problems. (Code 43)
The vSphere VM’s log file contains these error messages:
vthread10|E105: NVOS status 0x29
vthread10|E105: Assertion Failed at 0x7620fd4b:179
vthread10|E105: 8 frames returned by backtrace
...
vthread10|E105: VGPU message 12 failed, result code: 0x29
...
vthread10|E105: NVOS status 0x8
vthread10|E105: Assertion Failed at 0x7620c8df:280
vthread10|E105: 8 frames returned by backtrace
...
vthread10|E105: VGPU message 26 failed, result code: 0x8
Solution:
A vGPU reserves a portion of the VM's frame buffer for use in GPU mapping of VM system memory. The default reservation is sufficient to support up to 32 GB of system memory. You can accommodate up to 64 GB by adding the configuration parameter pciPassthru0.cfg.enable_large_sys_mem with a value of 1 in the VM's advanced configuration options.
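Added through the advanced configuration options, the resulting .vmx entry is:
pciPassthru0.cfg.enable_large_sys_mem = "1"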
Summary
VMware Horizon 7 offers three technologies for hardware-accelerated graphics, each with its own advantages.
- Virtual Shared Pass-Through Graphics Acceleration (MxGPU or vGPU) – Best match for nearly all use cases.
- Virtual Shared Graphics Acceleration (vSGA) – For light graphical workloads that use only DirectX9 or OpenGL 2.1 and require the maximum level of consolidation.
- Virtual Dedicated Graphics Acceleration (vDGA) – For heavy graphical workloads that require the maximum level of performance.
If you look at the list of supported cards, you will notice the NVIDIA GRID K1 and GRID K2. You can convert consumer NVIDIA cards to a K1 or K2:
" It normally requires hard wire mod then after flash the GPU BIOS it will act like Grid/Quadro but the function is limited. For example, even you could mode as K1/K2 but you can only use pass-through to a single VM. It's good for testing propose, but I won't use that for production.
Current gen K1/K2 price has been going down since the announcement of the new gen. Quadro on the other hand can be found in reasonable price. K4000 can be found around $450. You will be better off use pass-through for that. If more user session is required I would do RDS/XenApp with K4000 pass-through. You can easily get 10-20 users using CAD that way with reasonable performance."
For example, a conversion of a GT 640 to a GRID K1:
http://www.eevblog.com/forum/chat/h...HPSESSID=r3ift8acta09i2bg0iildomg85#msg213332
More on vDGA:
https://www.vmware.com/content/dam/...phics-acceleration-deployment-white-paper.pdf
http://images.nvidia.com/content/grid/vmware/horizon-with-grid-vgpu-faq.pdf
https://images.nvidia.com/content/pdf/vgpu/guides/vgpu-deployment-guide-horizon-on-vsphere-final.pdf