Sunday, February 06, 2022

Enabling vGPU in OpenShift Containerized Virtualization

There is a lot of discussion about using GPUs for AI/ML workloads, and while many of those workloads run in containers, there are still use cases where they run in virtual machines.  In OpenShift, when using Containerized virtualization, one can run virtual machines and use a PCI passthrough configuration to pass a GPU into a virtual machine.  This is clearly defined in the documentation here.  However, in some cases the virtual machine does not need the entire GPU, so rather than waste cycles we can pass a slice of the GPU into the virtual machine as a vGPU.  In this blog I will demonstrate how to configure a virtual GPU and pass it into a Linux virtual machine.

Before we begin, let's make a few assumptions about what has already been configured.  We assume a working OpenShift 4.9 cluster; it could be a full cluster, a compact cluster or, as in my case, a single node cluster (SNO).  We also assume that Containerized virtualization and the Node Feature Discovery operator have been installed via OperatorHub.
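If you want to sanity-check those prerequisites, the installed operators can be listed.  This is just a quick check; the exact ClusterServiceVersion names and versions will vary by cluster:

```shell
# List installed operators across all namespaces; expect entries for
# OpenShift Virtualization (kubevirt-hyperconverged) and the
# Node Feature Discovery operator (names/versions will differ per cluster)
oc get csv -A | grep -Ei 'kubevirt|nfd'
```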


Now that we have the basic assumptions out of the way, let's begin the process of enabling virtual GPUs.  The very first step is to label the nodes that have a GPU installed:

$ oc get nodes
NAME    STATUS   ROLES           AGE   VERSION
sno2    Ready    master,worker   1d   v1.22.3+e790d7f

$ oc label nodes sno2 hasGpu=true
node/sno2 labeled
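We can confirm the label took effect by selecting on it; only the GPU nodes should be returned:

```shell
# List only nodes carrying the hasGpu=true label
oc get nodes -l hasGpu=true
```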

With the node labeled (the label will be used later when we deploy the driver), we can now create the MachineConfig to enable the IOMMU:

$ cat << EOF > ~/100-master-kernel-arg-iommu.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 100-master-iommu
spec:
  config:
    ignition:
      version: 3.2.0
  kernelArguments:
    - intel_iommu=on
EOF

With the MachineConfig created, let's go ahead and apply it to the cluster:

$ oc create -f ~/100-master-kernel-arg-iommu.yaml
machineconfig.machineconfiguration.openshift.io/100-master-iommu created

Wait for the nodes where the MachineConfig is applied to reboot.
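One way to follow the rollout (assuming a standard MachineConfigPool setup where our node is in the master pool) is to watch the pool status; the node reboots as part of the rollout:

```shell
# Watch the master MachineConfigPool; UPDATED returns to True
# once the new MachineConfig has been applied and the node rebooted
oc get mcp master -w
```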

Once the nodes have rebooted, we can verify the MachineConfig was applied by running the following:

$ oc get MachineConfig 100-master-iommu
NAME               GENERATEDBYCONTROLLER   IGNITIONVERSION   AGE
100-master-iommu                           3.2.0             6m25s
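To double-check that the kernel argument actually landed on the node, we can also inspect the kernel command line through a debug pod (node name is from my environment):

```shell
# Confirm intel_iommu=on is present on the node's kernel command line
oc debug node/sno2 -- chroot /host cat /proc/cmdline | grep intel_iommu
```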

Now let's go ahead and build the driver container that will apply the NVIDIA driver to the worker nodes that have GPUs.   I should note that in order to proceed, the NVIDIA GRID drivers need to be obtained from NVIDIA here.  I will be using the following driver in this example to build my container: NVIDIA-Linux-x86_64-470.63-vgpu-kvm.run.  The first step is to determine the driver-toolkit release image our current cluster is using.  We can find that by running the following command:

$ oc adm release info --image-for=driver-toolkit
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce897bc72101dacc82aa593974fa0d8a421a43227b540fbcf1e303ffb1d3f1ea

Next we will take that release image and reference it in a Dockerfile in a directory called vgpu:

$ mkdir -p ~/vgpu
$ cat << 'EOF' > ~/vgpu/Dockerfile
FROM quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce897bc72101dacc82aa593974fa0d8a421a43227b540fbcf1e303ffb1d3f1ea
ARG NVIDIA_INSTALLER_BINARY
ENV NVIDIA_INSTALLER_BINARY=${NVIDIA_INSTALLER_BINARY:-NVIDIA-Linux-x86_64-470.63-vgpu-kvm.run}

RUN dnf -y install git make sudo gcc \
&& dnf clean all \
&& rm -rf /var/cache/dnf

RUN mkdir -p /root/nvidia
WORKDIR /root/nvidia
ADD ${NVIDIA_INSTALLER_BINARY} .
RUN chmod +x /root/nvidia/${NVIDIA_INSTALLER_BINARY}
ADD entrypoint.sh .
RUN chmod +x /root/nvidia/entrypoint.sh

RUN mkdir -p /root/tmp
EOF

Next create the following entrypoint.sh script and place that in the vgpu directory as well:

$ cat << 'EOF' > ~/vgpu/entrypoint.sh
#!/bin/sh
/usr/sbin/rmmod nvidia
/root/nvidia/${NVIDIA_INSTALLER_BINARY} --kernel-source-path=/usr/src/kernels/$(uname -r) --kernel-install-path=/lib/modules/$(uname -r)/kernel/drivers/video/ --silent --tmpdir /root/tmp/ --no-systemd

/usr/bin/nvidia-vgpud &
/usr/bin/nvidia-vgpu-mgr &

while true; do sleep 15 ; /usr/bin/pgrep nvidia-vgpu-mgr ; if [ $? -ne 0 ] ; then echo "nvidia-vgpu-mgr is not running" && exit 1; fi; done
EOF

Also place the NVIDIA-Linux-x86_64-470.63-vgpu-kvm.run binary into the vgpu directory.   You should then have the following:

$ ls
Dockerfile  entrypoint.sh  NVIDIA-Linux-x86_64-470.63-vgpu-kvm.run


At this point, change into the vgpu directory and use the podman build command to build the driver container locally:

$ cd ~/vgpu
$ podman build --build-arg NVIDIA_INSTALLER_BINARY=NVIDIA-Linux-x86_64-470.63-vgpu-kvm.run -t ocp-nvidia-vgpu-installer .
STEP 1/11: FROM quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ce897bc72101dacc82aa593974fa0d8a421a43227b540fbcf1e303ffb1d3f1ea
STEP 2/11: ARG NVIDIA_INSTALLER_BINARY
--> Using cache 13aa17a1fd44bb7afea0a1b884b7005aaa51091e47dfe14987b572db9efab1f2
--> 13aa17a1fd4
STEP 3/11: ENV NVIDIA_INSTALLER_BINARY=${NVIDIA_INSTALLER_BINARY:-NVIDIA-Linux-x86_64-470.63-vgpu-kvm.run}
--> Using cache e818b281ad40c0e78ef4c01a71d73b45b509392100262fddbf542c457d697255
--> e818b281ad4
STEP 4/11: RUN dnf -y install git make sudo gcc && dnf clean all && rm -rf /var/cache/dnf
--> Using cache d6f3687a545589cf096353ad792fb464a6961ff204c49234ced26d996da9f1c8
--> d6f3687a545
STEP 5/11: RUN mkdir -p /root/nvidia
--> Using cache 708b464de69de2443edb5609623478945af6f9498d73bf4d47c577e29811a414
--> 708b464de69
STEP 6/11: WORKDIR /root/nvidia
--> Using cache 6cb724eeb99d21a30f50a3c25954426d4719af84ef43bda7ab0aeab6e7da81a8
--> 6cb724eeb99
STEP 7/11: ADD ${NVIDIA_INSTALLER_BINARY} .
--> Using cache 71dd0491be7e3c20a742cd50efe26f54a5e2f61d4aa8846cd5d7ccd82f27ab45
--> 71dd0491be7
STEP 8/11: RUN chmod +x /root/nvidia/${NVIDIA_INSTALLER_BINARY}
--> Using cache 85d64dc8b702936412fa121aaab3733a60f880aa211e0197f1c8853ddbb617b5
--> 85d64dc8b70
STEP 9/11: ADD entrypoint.sh .
--> Using cache 9d49c87387f926ec39162c5e1c2a7866c1494c1ab8f3912c53ea6eaefe0be254
--> 9d49c87387f
STEP 10/11: RUN chmod +x /root/nvidia/entrypoint.sh
--> Using cache 79d682f8471fc97a60b6507d2cff164b3b9283a1e078d4ddb9f8138741c033b5
--> 79d682f8471
STEP 11/11: RUN mkdir -p /root/tmp
--> Using cache bcbb311e35999cb6c55987049033c5d278ee93d76a97fe9203ce68257a9f8ebd
COMMIT ocp-nvidia-vgpu-installer
--> bcbb311e359
Successfully tagged localhost/ocp-nvidia-vgpu-installer:latest
Successfully tagged quay.io/bschmaus/ocp-nvidia-vgpu-installer:latest
bcbb311e35999cb6c55987049033c5d278ee93d76a97fe9203ce68257a9f8ebd

Once the container is built, push it to a private repository that is accessible only by the organization that purchased the NVIDIA GRID license.  It is not legal to freely distribute the driver image.
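If the local build was not already tagged for your registry, tag it explicitly before pushing (the registry path below follows my example environment; substitute your own):

```shell
# Re-tag the locally built image for the private registry
# (quay.io/bschmaus is illustrative; use your own registry/org)
podman tag localhost/ocp-nvidia-vgpu-installer:latest \
  quay.io/bschmaus/ocp-nvidia-vgpu-installer:latest
```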

$ podman push quay.io/bschmaus/ocp-nvidia-vgpu-installer:latest
Getting image source signatures
Copying blob b0b1274fc88c done  
Copying blob 525ed45dbdb1 done  
Copying blob 8aa226ded434 done  
Copying blob d9ad9932e964 done  
Copying blob 5bc03dec6239 done  
Copying blob ab10e1e28fa3 done  
Copying blob 4eff86c961b3 done  
Copying blob e1790381e6f7 done  
Copying blob a8701ba769cc done  
Copying blob 38a3912b1d62 done  
Copying blob 257db9f06185 done  
Copying blob 09fd8acd3579 done  
Copying config bcbb311e35 done  
Writing manifest to image destination
Storing signatures


Now that we have a driver image, let's create the resources that will apply that driver image, via a daemonset, to the cluster nodes carrying the hasGpu label.  The file will look similar to the one below, but the container image path will need to be updated to fit one's environment.

$ cat << EOF > ~/1000-drivercontainer.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: simple-kmod-driver-container
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: simple-kmod-driver-container
rules:
- apiGroups:
  - security.openshift.io
  resources:
  - securitycontextconstraints
  verbs:
  - use
  resourceNames:
  - privileged
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: simple-kmod-driver-container
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: simple-kmod-driver-container
subjects:
- kind: ServiceAccount
  name: simple-kmod-driver-container
userNames:
- system:serviceaccount:simple-kmod-demo:simple-kmod-driver-container
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: simple-kmod-driver-container
spec:
  selector:
    matchLabels:
      app: simple-kmod-driver-container
  template:
    metadata:
      labels:
        app: simple-kmod-driver-container
    spec:
      serviceAccount: simple-kmod-driver-container
      serviceAccountName: simple-kmod-driver-container
      hostPID: true
      hostIPC: true
      containers:
      - image: quay.io/bschmaus/ocp-nvidia-vgpu-installer:latest
        name: simple-kmod-driver-container
        imagePullPolicy: Always
        command: ["/root/nvidia/entrypoint.sh"]
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "systemctl stop kmods-via-containers@simple-kmod"]
        securityContext:
          privileged: true
          capabilities:
            add: ["SYS_ADMIN"]
        volumeMounts:
        - mountPath: /dev/vfio/
          name: vfio
        - mountPath: /sys/fs/cgroup
          name: cgroup
      volumes:
      - hostPath:
          path: /sys/fs/cgroup
          type: Directory
        name: cgroup
      - hostPath:
          path: /dev/vfio/
          type: Directory
        name: vfio
      nodeSelector:
        hasGpu: "true"
EOF

Now that we have our driver daemonset yaml, let's apply it to the cluster:

$ oc create -f 1000-drivercontainer.yaml
serviceaccount/simple-kmod-driver-container created
role.rbac.authorization.k8s.io/simple-kmod-driver-container created
rolebinding.rbac.authorization.k8s.io/simple-kmod-driver-container created
daemonset.apps/simple-kmod-driver-container created

We can validate the daemonset is running by looking at the daemonsets in the namespace where it was created (openshift-nfd in my case):

$ oc get daemonset simple-kmod-driver-container 
NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
simple-kmod-driver-container   1         1         1       1            1           hasGpu=true     9m23s
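We can also check the driver pod itself and its logs; since the pod name suffix is generated, we select by label.  The installer output should show the kernel module being built and loaded:

```shell
# Find the driver pod and tail its logs via the app label
oc get pods -l app=simple-kmod-driver-container
oc logs -l app=simple-kmod-driver-container --tail=20
```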

Now let's validate further by logging into the worker node as the core user, sudo'ing to root, and listing the loaded kernel modules.  We should see the NVIDIA drivers loaded:

# sudo bash
# lsmod| grep nvi
nvidia_vgpu_vfio       65536  0
nvidia              35274752  10 nvidia_vgpu_vfio
mdev                   20480  2 vfio_mdev,nvidia_vgpu_vfio
vfio                   36864  3 vfio_mdev,nvidia_vgpu_vfio,vfio_iommu_type1
drm                   569344  4 drm_kms_helper,nvidia,mgag200

Once we have confirmed the NVIDIA drivers are loaded, let's enumerate the possible mdev_type devices for our GPU card.   Using the command below we can show the different options for carving up the card from a vGPU perspective.  In the example below we have a variety of ways we could use this card.  Note, however, that only one nvidia-(n) type can be used per physical GPU.  That is, if we choose nvidia-22 and carve each GPU into a single vGPU, we end up with one vGPU per physical GPU on the card.   If we instead chose nvidia-15, we would end up with eight vGPUs per physical GPU on the card.

# for device in /sys/class/mdev_bus/*; do \
    for mdev_type in "$device"/mdev_supported_types/*; do \
      MDEV_TYPE=$(basename $mdev_type); \
      DESCRIPTION=$(cat $mdev_type/description); \
      NAME=$(cat $mdev_type/name); \
      echo "mdev_type: $MDEV_TYPE --- description: $DESCRIPTION --- name: $NAME"; \
    done; \
  done | sort | uniq
mdev_type: nvidia-11 --- description: num_heads=2, frl_config=45, framebuffer=512M, max_resolution=2560x1600, max_instance=16 --- name: GRID M60-0B
mdev_type: nvidia-12 --- description: num_heads=2, frl_config=60, framebuffer=512M, max_resolution=2560x1600, max_instance=16 --- name: GRID M60-0Q
mdev_type: nvidia-13 --- description: num_heads=1, frl_config=60, framebuffer=1024M, max_resolution=1280x1024, max_instance=8 --- name: GRID M60-1A
mdev_type: nvidia-14 --- description: num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=8 --- name: GRID M60-1B
mdev_type: nvidia-15 --- description: num_heads=4, frl_config=60, framebuffer=1024M, max_resolution=5120x2880, max_instance=8 --- name: GRID M60-1Q
mdev_type: nvidia-16 --- description: num_heads=1, frl_config=60, framebuffer=2048M, max_resolution=1280x1024, max_instance=4 --- name: GRID M60-2A
mdev_type: nvidia-17 --- description: num_heads=4, frl_config=45, framebuffer=2048M, max_resolution=5120x2880, max_instance=4 --- name: GRID M60-2B
mdev_type: nvidia-18 --- description: num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=5120x2880, max_instance=4 --- name: GRID M60-2Q
mdev_type: nvidia-19 --- description: num_heads=1, frl_config=60, framebuffer=4096M, max_resolution=1280x1024, max_instance=2 --- name: GRID M60-4A
mdev_type: nvidia-20 --- description: num_heads=4, frl_config=60, framebuffer=4096M, max_resolution=5120x2880, max_instance=2 --- name: GRID M60-4Q
mdev_type: nvidia-210 --- description: num_heads=4, frl_config=45, framebuffer=2048M, max_resolution=5120x2880, max_instance=4 --- name: GRID M60-2B4
mdev_type: nvidia-21 --- description: num_heads=1, frl_config=60, framebuffer=8192M, max_resolution=1280x1024, max_instance=1 --- name: GRID M60-8A
mdev_type: nvidia-22 --- description: num_heads=4, frl_config=60, framebuffer=8192M, max_resolution=5120x2880, max_instance=1 --- name: GRID M60-8Q
mdev_type: nvidia-238 --- description: num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=8 --- name: GRID M60-1B4

In my example I am going to use nvidia-22 and pass one vGPU per physical GPU.  To do this we echo a unique UUID into the create file under the device's mdev_supported_types path.   I will do this twice, once for each physical GPU device.  Note that this can only be done once per device; if attempted more than once, an I/O error will result.

# echo `uuidgen` > /sys/class/mdev_bus/0000:3e:00.0/mdev_supported_types/nvidia-22/create
# echo `uuidgen` > /sys/class/mdev_bus/0000:3d:00.0/mdev_supported_types/nvidia-22/create
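We can confirm the vGPU mediated devices were created by listing them; each one shows up as a UUID-named entry matching what uuidgen produced:

```shell
# Each created vGPU appears as a UUID-named mdev device
ls /sys/bus/mdev/devices/
```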

Now that we have created our vGPU devices, we need to expose them to Containerized virtualization so they can be consumed by a virtual machine.  To do this we need to patch the kubevirt-hyperconverged configuration.  First let's create the patch file:

$ cat << EOF > ~/kubevirt-hyperconverged-patch.yaml
spec:
  permittedHostDevices:
    mediatedDevices:
    - mdevNameSelector: "GRID M60-8Q"
      resourceName: "nvidia.com/GRID_M60_8Q"
EOF

With the patch file created, we merge it into the existing kubevirt-hyperconverged configuration using the oc patch command:

$ oc patch hyperconverged kubevirt-hyperconverged -n openshift-cnv --patch "$(cat ~/kubevirt-hyperconverged-patch.yaml)" --type=merge
hyperconverged.hco.kubevirt.io/kubevirt-hyperconverged patched
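To confirm the permitted host devices were merged, we can query the resource back out (the jsonpath below is one way to do it):

```shell
# Show the mediated devices now permitted by the HyperConverged CR
oc get hyperconverged kubevirt-hyperconverged -n openshift-cnv \
  -o jsonpath='{.spec.permittedHostDevices.mediatedDevices}'
```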

Once applied, wait a few minutes for the configuration to be reloaded.  Then, to validate it, run the oc describe node command against the node and look for the GPU devices under Capacity and Allocatable.   In our example we see two devices, because we had 2 physical GPUs and created vGPUs using nvidia-22, which allows one vGPU per physical GPU.

$ oc describe node| sed '/Capacity/,/System/!d;/System/d'
Capacity:
  cpu:                            24
  devices.kubevirt.io/kvm:        1k
  devices.kubevirt.io/tun:        1k
  devices.kubevirt.io/vhost-net:  1k
  ephemeral-storage:              936104940Ki
  hugepages-1Gi:                  0
  hugepages-2Mi:                  0
  memory:                         131561680Ki
  nvidia.com/GRID_M60_8Q:         2
  pods:                           250
Allocatable:
  cpu:                            23500m
  devices.kubevirt.io/kvm:        1k
  devices.kubevirt.io/tun:        1k
  devices.kubevirt.io/vhost-net:  1k
  ephemeral-storage:              862714311276
  hugepages-1Gi:                  0
  hugepages-2Mi:                  0
  memory:                         130410704Ki
  nvidia.com/GRID_M60_8Q:         2
  pods:                           250

At this point VMs can be deployed that consume the available vGPUs on the node.  To do this we create a VirtualMachine resource like the example below.  Notice that we define hostDevices in this file and pass in the NVIDIA host device by its vGPU resource name:

$ cat << EOF > ~/fedora-vm.yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  annotations:
    name.os.template.kubevirt.io/fedora34: Fedora 33 or higher
    vm.kubevirt.io/validations: |
      [
        {
          "name": "minimal-required-memory",
          "path": "jsonpath::.spec.domain.resources.requests.memory",
          "rule": "integer",
          "message": "This VM requires more memory.",
          "min": 1073741824
        }
      ]
  name: fedora
  namespace: openshift-nfd
  labels:
    app: fedora
    os.template.kubevirt.io/fedora34: 'true'
    vm.kubevirt.io/template: fedora-server-large
    vm.kubevirt.io/template.namespace: openshift
    vm.kubevirt.io/template.revision: '1'
    vm.kubevirt.io/template.version: v0.16.4
    workload.template.kubevirt.io/server: 'true'
spec:
  dataVolumeTemplates:
    - metadata:
        creationTimestamp: null
        name: fedora-rootdisk-uqf5j
      spec:
        pvc:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 40Gi
          storageClassName: hostpath-provisioner
          volumeMode: Filesystem
        source:
          http:
            url: >-
              https://download-ib01.fedoraproject.org/pub/fedora/linux/releases/34/Cloud/x86_64/images/Fedora-Cloud-Base-34-1.2.x86_64.raw.xz
  running: true
  template:
    metadata:
      annotations:
        vm.kubevirt.io/flavor: large
        vm.kubevirt.io/os: fedora
        vm.kubevirt.io/workload: server
      creationTimestamp: null
      labels:
        kubevirt.io/domain: fedora
        kubevirt.io/size: large
        os.template.kubevirt.io/fedora34: 'true'
        vm.kubevirt.io/name: fedora
        workload.template.kubevirt.io/server: 'true'
    spec:
      domain:
        cpu:
          cores: 12
          sockets: 1
          threads: 1
        devices:
          disks:
            - disk:
                bus: virtio
              name: cloudinitdisk
            - bootOrder: 1
              disk:
                bus: virtio
              name: rootdisk
          hostDevices:
            - deviceName: nvidia.com/GRID_M60_8Q
              name: GRID_M60_8Q
          interfaces:
            - macAddress: '02:01:53:00:00:00'
              masquerade: {}
              model: virtio
              name: default
          networkInterfaceMultiqueue: true
          rng: {}
        machine:
          type: pc-q35-rhel8.4.0
        resources:
          requests:
            memory: 32Gi
      evictionStrategy: LiveMigrate
      hostname: fedora
      networks:
        - name: default
          pod: {}
      terminationGracePeriodSeconds: 180
      volumes:
        - cloudInitNoCloud:
            userData: |
              #cloud-config
              user: fedora
              password: password
              chpasswd:
                expire: false
              ssh_authorized_keys:
                - >-
                  ssh-rsa
                  SSH-KEY-HERE
          name: cloudinitdisk
        - dataVolume:
            name: fedora-rootdisk-uqf5j
          name: rootdisk
EOF

Let's go ahead and create the virtual machine:

$ oc create -f ~/fedora-vm.yaml
virtualmachine.kubevirt.io/fedora created

Wait a few moments for the virtual machine to get to a running state.   We can confirm it's running with oc get vms:

$ oc get vms
NAME              AGE     STATUS    READY
fedora            8m17s   Running   True

Now let's expose the running virtual machine's SSH port using the virtctl command so we can ssh into it:

$ virtctl expose vmi fedora --port=22 --name=fedora-ssh --type=NodePort
Service fedora-ssh successfully exposed for vmi fedora

We can confirm the SSH port is exposed, and get the node port it uses, by running the oc get svc command:

$ oc get svc
NAME                                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)           AGE
fedora-ssh                               NodePort    172.30.220.248   none          22:30106/TCP      7s

Now let's ssh into the fedora virtual machine (via the node's IP and the NodePort) and become root:

$ ssh fedora@10.11.176.230 -p 30106
The authenticity of host '[10.11.176.230]:30106 ([10.11.176.230]:30106)' can't be established.
ECDSA key fingerprint is SHA256:Zmpcpm8vgQc3Oa72RFL0iKU/OPjHshAbHyGO7Smk8oE.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[10.11.176.230]:30106' (ECDSA) to the list of known hosts.
Last login: Wed Feb  9 18:41:14 2022
[fedora@fedora ~]$ sudo bash
[root@fedora fedora]#

Once at a root prompt, we can run lspci and see the NVIDIA vGPU we passed into the virtual machine listed as a device:

[root@fedora fedora]# lspci|grep NVIDIA
06:00.0 VGA compatible controller: NVIDIA Corporation GM204GL [Tesla M60] (rev a1)

At this point all that remains is to install the NVIDIA guest drivers in the virtual machine and fire up your favorite application to take advantage of the vGPU!
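As a rough sketch of that last step (package names are standard Fedora build prerequisites, but the GRID guest driver filename below is an assumption; use the guest driver that matches your GRID release, obtained from NVIDIA):

```shell
# Inside the Fedora VM: install build prerequisites for the NVIDIA guest driver
sudo dnf -y install gcc make kernel-devel-$(uname -r)

# Run the GRID *guest* driver installer (filename is illustrative;
# it must match the GRID release of the host vGPU driver)
sudo ./NVIDIA-Linux-x86_64-470.63.01-grid.run --silent

# Confirm the vGPU is usable from inside the guest
nvidia-smi
```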
