NVIDIA ESXi Driver

This is part 3 of a series of blog articles on the subject of using GPUs with VMware vSphere.

Part 1 of this series presents an overview of the various options for using GPUs on vSphere

Part 2 describes the DirectPath I/O (Passthrough) mechanism for GPUs

Part 3 gives details on setting up the NVIDIA Virtual GPU (vGPU) technology for GPUs on vSphere

Part 4 explores the setup for the Bitfusion FlexDirect method of using GPUs

In this article, we describe the NVIDIA vGPU (formerly “GRID”) method for using GPU devices on vSphere. The focus is on using GPUs for compute workloads (such as machine learning, deep learning and high performance computing applications); GPU usage for virtual desktop infrastructure (VDI) is not covered here.

The method of GPU usage on vSphere described here makes use of the products within the NVIDIA vGPU family. This family includes the “NVIDIA Virtual Compute Server” (vCS) and the “NVIDIA Quadro Virtual Datacenter Workstation” (vDWS) products for GPU access and management on vSphere, as well as other products. Here, we use the term “NVIDIA vGPU” as a synonym for the software product you choose from the vGPU family. NVIDIA recommends the vCS product for machine learning and AI workloads; vDWS was used for that purpose before vCS appeared on the market. These are licensed software products from NVIDIA.

Figure 1: Parts of the NVIDIA vGPU product shown in the ESXi Hypervisor and in virtual machines

Figure 1 shows the relationship of the parts of the NVIDIA vGPU product to each other in the overall vSphere and virtual machine architecture.

The NVIDIA vGPU software includes two separate components:

  1. The NVIDIA Virtual GPU Manager, which is loaded as a VMware Installation Bundle (VIB) into the vSphere ESXi hypervisor itself, and
  2. A separate guest OS NVIDIA vGPU driver, which is installed within the guest operating system of your virtual machine (the “guest VM driver”).

Using the NVIDIA vGPU technology with vSphere allows you to choose between dedicating a full GPU device to one virtual machine and sharing a GPU device among several virtual machines.

The reasons for choosing this NVIDIA vGPU option are:

  • the applications in your VMs do not need the power of a full GPU;
  • there is a limited number of GPU devices and you want them to be available to more than one team of users simultaneously;
  • you sometimes want to dedicate a full GPU device to one VM, but at other times allow a VM only partial use of a GPU.

The released versions of the NVIDIA vGPU Manager and guest VM drivers that you install must be compatible. For all the versions of the software, versions of vSphere and the hardware versions, consult the current NVIDIA Release Notes. At the time of this writing, the NVIDIA vGPU release notes are located here.
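When comparing versions, it can help to read the driver version directly from the host VIB file name. The following is a minimal sketch, assuming the file name follows the same pattern as the VIB used in the installation step of this article (`..._Host_Driver_<version>-...`). The function names `vib_driver_version` and `driver_branch` are hypothetical helpers, not part of any NVIDIA or VMware tool. Matching driver branches is only a first sanity check, and the NVIDIA release notes remain the authority on compatibility.

```shell
#!/bin/sh
# Hypothetical helper: extract the NVIDIA driver version (e.g. 390.42)
# from a host driver VIB file name or path.
vib_driver_version() {
    # Drop the directory part, then capture the digits and dots that
    # follow "Host_Driver_" up to the next hyphen.
    basename "$1" | sed -n 's/.*Host_Driver_\([0-9][0-9.]*\)-.*/\1/p'
}

# Hypothetical helper: major branch of a version string (390.42 -> 390).
driver_branch() {
    printf '%s\n' "${1%%.*}"
}

# File name taken from the install command in this article.
vib='NVIDIA-VMware_ESXi_6.7_Host_Driver_390.42-1OEM.670.0.0.7535516.vib'
host_version=$(vib_driver_version "$vib")
echo "Host driver version: $host_version"
echo "Driver branch: $(driver_branch "$host_version")"
```

The branch printed here can then be compared by eye with the version of the guest VM driver package you downloaded.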

1. NVIDIA vGPU Setup on the vSphere Host Server

The vSphere ESXi host server-specific part of the setup process is described first here. In order to set up the NVIDIA vGPU environment you will need:

  • The licensed NVIDIA vGPU product (including the VIB for vSphere and the guest OS driver)
  • Administrator login access to the console of your vSphere/ESXi machine

VMware recommends that you choose vSphere version 6.7 for this work. Choosing vSphere 6.7 Update 1 will allow you to use the vMotion feature with your GPU-enabled VMs. If you choose to use vSphere 6.5, ensure you are on Update 1 before proceeding.

Carefully review the prerequisites and other details in the NVIDIA vGPU Software User Guide document.

The host server part of the NVIDIA vGPU installation process makes use of a “vib install” technique that is used in vSphere for installing drivers into the ESXi hypervisor itself. For more information on using vSphere VIBs, you should check this material

The NVIDIA vGPU Manager is contained in the VIB package that is downloaded from NVIDIA’s website. The package can be found by searching for the NVIDIA Quadro Virtual DataCenter Workstation (or vDWS) products on the NVIDIA site.

To install the NVIDIA vGPU Manager software into the vSphere ESXi hypervisor follow the procedure below.

1.1 Set the GPU Device to vGPU Mode Using the vSphere Host Graphics Setting

A GPU card can be configured in one of two modes: vSGA (shared virtual graphics) and vGPU. The NVIDIA card should be configured in vGPU mode. This is specifically for use of the GPU in compute workloads, such as machine learning or high performance computing applications.

Access the ESXi host server either using the ESXi shell or through SSH. SSH is disabled by default, so you will need to enable SSH access using the ESXi management console. To enable vGPU mode on the ESXi host, execute this command:

# esxcli graphics host set --default-type SharedPassthru

You may also get to this setting through the vSphere Client by choosing your host server and using the navigation

“Configure -> Hardware -> Graphics -> Host Graphics tab -> Edit”

A server reboot is required once the setting has been changed. The settings should appear as shown in Figure 2 below.

Figure 2: Edit Host Graphics screen for a Host Server in the vSphere Client

1.2 Check the Host Graphics Settings

To check from the command line that the settings have taken effect, type:

# esxcli graphics host get

The output of this command should show SharedPassthru as the default graphics type.
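For scripted checks, the same verification can be done by searching the command output for the SharedPassthru value. This is a minimal sketch: `is_vgpu_mode` is a hypothetical helper name, and the sample line used for the demonstration is an assumed shape of the output, so the check matches only on the value itself and not on any field label.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the text on stdin mentions
# the SharedPassthru graphics type, fails (exit 1) otherwise.
is_vgpu_mode() {
    grep -qi 'SharedPassthru'
}

# On an ESXi host the check would be wired up like this (not run here):
#   esxcli graphics host get | is_vgpu_mode && echo "vGPU mode enabled"

# Demonstration with an assumed sample line; the real field label may differ.
sample='Default Graphics Type: SharedPassthru'
if printf '%s\n' "$sample" | is_vgpu_mode; then
    echo "vGPU mode enabled"
else
    echo "vGPU mode NOT enabled"
fi
```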

1.3 Install the NVIDIA vGPU Manager VIB

Before installing the VIB, place the ESXi host server into maintenance mode (i.e. all virtual machines are moved away or quiesced):

# esxcli system maintenanceMode set --enable true

Note: Install the VIB only after enabling vGPU mode on your ESXi host. If you instead try to enable “Shared Direct” for this device in the vSphere Client UI after the VIB is installed, the setting will not take effect.
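If the installation is scripted, the ordering in this note can be enforced with a small guard. The sketch below is illustrative only: `guarded_vib_install` and the `ESXCLI` variable are hypothetical names, and the check assumes that the host graphics output contains the string SharedPassthru once vGPU mode is active. It uses only the esxcli commands shown in this article and does not handle maintenance mode, which still has to be enabled first.

```shell
#!/bin/sh
# ESXCLI can be pointed at a stub for dry runs; on a real host it stays "esxcli".
ESXCLI=${ESXCLI:-esxcli}

# Hypothetical guard: install the VIB only if the host already reports
# the SharedPassthru (vGPU) graphics type.
guarded_vib_install() {
    vib_path=$1
    if "$ESXCLI" graphics host get | grep -qi 'SharedPassthru'; then
        "$ESXCLI" software vib install -v "$vib_path"
    else
        echo "Host is not in vGPU (SharedPassthru) mode; not installing $vib_path" >&2
        return 1
    fi
}

# Example call on an ESXi host (path is a placeholder, not run here):
#   guarded_vib_install /vmfs/volumes/<datastore>/NVIDIA/<host-driver>.vib
```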

To install the VIB, use a command similar to the following (where the path to your VIB may differ):

# esxcli software vib install -v /vmfs/volumes/ARL-ESX14-DS1/NVIDIA/NVIDIA-VMware_ESXi_6.7_Host_Driver_390.42-1OEM.670.0.0.7535516.vib

This command produces the following output:

Installation Result
   Message: Operation finished successfully.
   Reboot Required: false
   VIBs Installed: NVIDIA_bootbank_NVIDIA-VMware_ESXi_6.7_Host_Driver_390.42-1OEM.670.0.0.7535516
   VIBs Removed:
   VIBs Skipped:

Take the ESXi host server out of maintenance mode using this command:

# esxcli system maintenanceMode set --enable false

1.4 List the VIB that was installed in the ESXi Hypervisor

To list the VIBs installed on the ESXi host and confirm that the NVIDIA VIB was installed correctly, use the command:

# esxcli software vib list | grep -i NVIDIA

The license was acquired successfully from the correct server URL
The system is licensed for GRID vGPU

4. Install and Test the CUDA Libraries

This section describes two approaches to completing the task: the first uses containers to simplify the versioning of the various components, and the second is a more manual approach.

4.1 Installation Using Containers

The CUDA and ML frameworks can be installed using Docker containers. This option is provided so that the person doing the installation can avoid the complexity of installing each component one by one. The approach involves the use of the “nvidia-docker” tool along with the CUDA, machine learning or high performance computing containers supplied by NVIDIA. It is detailed in the VMware “Enabling Machine Learning as a Service with GPU Acceleration” document.

Both methods of installation work with virtual machines on vSphere.

4.2 Manual Installation

This section describes the installation of the CUDA libraries without the use of containers.

4.2.1 Download the Libraries

Ensure that the version of the CUDA libraries you use is compatible with the version of your guest operating system driver. As an example of version compatibility, version 9.1 of the CUDA libraries is compatible with the “390” version of the drivers.

Download the appropriate packages (using a “wget” command, for example). The file name examples we use here for CUDA version 9.1 may not apply to your installation as subsequent versions become available.

4.2.2 CUDA Library Installation

To install the CUDA libraries, use the command:

# sudo sh cuda_9.1.85_387.26_linux

Accept the EULA conditions.

When prompted to install the driver, answer “no”.

Respond “yes” to the questions about libraries, symbolic links and samples.

4.3 CUDA Application Code Testing

Compile and run the example “deviceQuery” program. This should work correctly.

Compile and run the “vectorAdd” program. Ensure that the GPU is working.

You may now proceed to installing cuDNN and your chosen platform for machine learning application development (such as TensorFlow), or you may use the container method for installation as an alternative.