Using a Dell host with two GPUs installed, I get unexpectedly high GPU usage in VMs
If you use one or two GPUs in the Dell R730, make sure that the correct Dell second riser is installed, especially if you add a GPU after buying the server. An R730 ordered from Dell without a GPU will most likely not have the correct riser installed.
There are two versions of the second riser, "Riser 3". The "Default" riser has dual 8x PCIe slots, whereas the GPU-enabled "Alternate" riser has a single 16x PCIe slot. The correct Dell part number for the 16x riser is "0800JH".
Any GPU card installed on Riser 3 with the incorrect 8x riser will prematurely show 100% GPU utilization on all GPU cores in nvidia-smi. GPU usage on that card will be abnormal and may exhaust the available GPU resources early. Another indicator of this problem is a significantly degraded end-user experience, with poor graphics performance on VMs located on those GPU cores.
Do not install a GPU on the "Default" 8x Riser 3.
The Dell PowerEdge R730 and R730xd Technical Guide (http://i.dell.com/sites/doccontent/shared-content/data-sheets/en/Documents/Dell-PowerEdge-R730-and-R730xd-Technical-Guide-v1-7.pdf) states:
"The GPUs are installed on the PCIe x16 3.0 interfaces available on riser 2 and GPU-optional riser 3. A system must have the optional riser 3 with a single x16 slot to support two double-wide GPUs."
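The symptom described above (GPUs pinned at 100% utilization in nvidia-smi) can be checked from a script. The following is only a rough sketch, assuming that nvidia-smi is available on the host and supports the --query-gpu fields used here. The helper names and the expected link width of 16 are illustrative assumptions, not part of any Dell or NVIDIA tooling. Note also that the link width reported by nvidia-smi may not reflect the riser slot on every card, so checking the riser part number physically remains the reliable test.

```python
import csv
import io
import subprocess

# Fields requested from nvidia-smi, in this order.
QUERY_FIELDS = "index,name,utilization.gpu,pcie.link.width.current"

# Assumption for illustration: a GPU on the correct riser should negotiate 16 lanes.
EXPECTED_WIDTH = 16


def parse_gpu_rows(csv_text):
    """Parse 'csv,noheader,nounits' output from nvidia-smi into a list of dicts."""
    rows = []
    for rec in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(rec) != 4:
            continue  # skip blank or malformed lines
        index, name, util, width = (field.strip() for field in rec)
        try:
            rows.append({
                "index": int(index),
                "name": name,
                "utilization": int(util),
                "link_width": int(width),
            })
        except ValueError:
            continue  # skip rows with non-numeric values such as "[N/A]"
    return rows


def suspicious(rows, expected_width=EXPECTED_WIDTH):
    """Return GPUs at 100% utilization or with a link narrower than expected."""
    return [
        r for r in rows
        if r["utilization"] >= 100 or r["link_width"] < expected_width
    ]


def query_nvidia_smi():
    """Run nvidia-smi on the local host and return the parsed rows."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_rows(result.stdout)
```

On a host with the GPUs installed, calling suspicious(query_nvidia_smi()) returns the GPUs worth a closer look. A GPU that sits at 100% while its VMs are idle matches the symptom described above, but a busy GPU on the correct riser will also be flagged, so treat the result as a hint rather than a diagnosis.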