Assumptions
We assume an OpenShift cluster is already running with some kind of backend storage and OpenShift Virtualization installed. In our example environment we have 3 control plane nodes and 3 worker nodes. The worker nodes have Mellanox BlueField-3 (BF3) cards and NVIDIA A40 GPUs. We are using OpenShift Data Foundation as the backing storage where needed, which is useful when live migration is a requirement. With the assumptions covered, we can begin configuring the system for GPUDirect RDMA.
Enable Device PassThrough
Before we can consume the devices in a virtual machine, we need to enable device passthrough on the worker nodes for the Mellanox cards and the GPU devices so they can be used directly by the virtual machines. First we need to enable intel_iommu, which we can do by creating the following MachineConfig.
$ cat <<EOF > 100-worker-iommu.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 100-worker-iommu
spec:
  config:
    ignition:
      version: 3.2.0
  kernelArguments:
    - intel_iommu=on
EOF
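Once the MachineConfig rolls out, a quick sanity check (a sketch; substitute a real worker node name for the placeholder) confirms the kernel argument landed:

```shell
# Verify the intel_iommu kernel argument on a worker after the reboot
# (<worker-node> is a placeholder for one of your worker node names)
oc debug node/<worker-node> -- chroot /host \
  grep -o 'intel_iommu=on' /proc/cmdline
```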
Next we will create a Butane file that contains the vendor/device PCI IDs of the devices we wish to bind to vfio-pci, which enables them for passthrough.
$ cat <<EOF > 100-worker-vfiopci.bu
variant: openshift
version: 4.16.0
metadata:
  name: 100-worker-vfiopci
  labels:
    machineconfiguration.openshift.io/role: worker
storage:
  files:
    - path: /etc/modprobe.d/vfio.conf
      mode: 0644
      overwrite: true
      contents:
        inline: |
          options vfio-pci ids=10de:2235,10de:145a,15b3:a2dc,15b3:c2d5,15b3:1021,15b3:0237,15b3:0016
    - path: /etc/modules-load.d/vfio-pci.conf
      mode: 0644
      overwrite: true
      contents:
        inline: vfio-pci
EOF
After building the Butane file above, we can pass it through the butane command to generate the corresponding custom resource file.
$ butane 100-worker-vfiopci.bu -o 100-worker-vfiopci.yaml
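If you need to confirm the vendor/device ID pairs for your own hardware before editing the vfio.conf contents, lspci on a worker node will show them; a sketch:

```shell
# List NVIDIA and Mellanox PCI devices; the [vendor:device] IDs appear
# in brackets at the end of each line
lspci -nn | grep -Ei 'nvidia|mellanox'
```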
Next we need to blacklist the mlx5_core driver so it does not load on the worker nodes. We do not need to do this for the GPU drivers because the nouveau driver is blacklisted by default in OpenShift.
$ cat <<EOF > 99-machine-config-blacklist-mlx5_core.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-blacklist-mlx5-core
spec:
  kernelArguments:
    - "module_blacklist=mlx5_core"
EOF
With the MachineConfigs generated we can go ahead and create them on the cluster.
$ oc create -f 100-worker-iommu.yaml
$ oc create -f 100-worker-vfiopci.yaml
$ oc create -f 99-machine-config-blacklist-mlx5_core.yaml
One by one the nodes will reboot as the MachineConfigs are applied to them. Wait for all the worker nodes in the cluster to reboot before proceeding, and confirm with the output of the oc get mcp command.
$ oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-08f7504a24cb5e9734f3cfe995db08c6 True False False 3 3 3 0 122d
worker rendered-worker-8c3ff0c3b0d16b30f7eb76992fd7d3b1 True False False 3 3 3 0 122d
Once confirmed we can proceed to the next section about exposing the devices to OpenShift.
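Before moving on, it can be worth confirming on a worker that the devices are actually claimed by vfio-pci; a sketch using one of the A40's IDs (substitute a real node name for the placeholder):

```shell
# The A40 (10de:2235) should report vfio-pci as its driver after the reboot
oc debug node/<worker-node> -- chroot /host \
  lspci -nnk -d 10de:2235 | grep 'Kernel driver in use'
```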
Expose Devices to OpenShift
Now that the devices have been configured for passthrough, we need to expose them to the kubevirt-hyperconverged configuration. We can do this by editing that configuration.
$ oc edit hyperconverged kubevirt-hyperconverged -n openshift-cnv
Once we are in edit mode we can add our devices. Our example environment, shown below, has A40 GPUs, Mellanox CX7s and Mellanox BF3s. The portion of the resourceName after nvidia.com is arbitrary. Since both BF3 and CX7 cards show up as CX7 when looking at them via lspci, I put the BF3 prefix on the ones from a BF3 card so I could tell the difference. Another thing to note is that the cards in the workers should be configured consistently: either all set for Ethernet ports or all for InfiniBand ports, as there is no way to tell the difference at this level.
  permittedHostDevices:
    pciHostDevices:
      - pciDeviceSelector: 10de:2235
        resourceName: nvidia.com/GA102GL_A40
      - pciDeviceSelector: 15b3:a2dc
        resourceName: nvidia.com/BF3_CX7
      - pciDeviceSelector: 15b3:1021
        resourceName: nvidia.com/CX7
      - pciDeviceSelector: 15b3:c2d5
        resourceName: nvidia.com/BF3_DMA
  resourceRequirements:
Once the lines are added we can save and exit the editor. We can use the following command to check that the devices are showing up properly. Note that some of my nodes had BF3 cards and some just had vanilla CX7 cards.
$ oc describe node | grep -E 'Capacity:|Allocatable:' -A14
(...)
--
Allocatable:
cpu: 127500m
devices.kubevirt.io/kvm: 1k
devices.kubevirt.io/tun: 1k
devices.kubevirt.io/vhost-net: 1k
ephemeral-storage: 1438028263499
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 262445728Ki
nvidia.com/BF3_CX7: 2
nvidia.com/BF3_DMA: 2
nvidia.com/GA102GL_A40: 2
--
Capacity:
cpu: 128
devices.kubevirt.io/kvm: 1k
devices.kubevirt.io/tun: 1k
devices.kubevirt.io/vhost-net: 1k
ephemeral-storage: 1561525616Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 263596676Ki
nvidia.com/BF3_CX7: 2
nvidia.com/BF3_DMA: 2
nvidia.com/GA102GL_A40: 2
(...)
Capacity:
cpu: 128
devices.kubevirt.io/kvm: 1k
devices.kubevirt.io/tun: 1k
devices.kubevirt.io/vhost-net: 1k
ephemeral-storage: 1561525616Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 263596668Ki
nvidia.com/CX7: 2
nvidia.com/GA102GL_A40: 2
pods: 250
--
Allocatable:
cpu: 127500m
devices.kubevirt.io/kvm: 1k
devices.kubevirt.io/tun: 1k
devices.kubevirt.io/vhost-net: 1k
ephemeral-storage: 1438028263499
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 262445692Ki
nvidia.com/CX7: 2
nvidia.com/GA102GL_A40: 2
pods: 250
--
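As an alternative to grepping the oc describe output, a jq one-liner (a sketch, assuming jq is installed) can summarize just the nvidia.com resources per node:

```shell
# Print each node name followed by its allocatable nvidia.com/* device counts
oc get nodes -o json | jq -r '.items[]
  | [.metadata.name]
    + (.status.allocatable | to_entries
       | map(select(.key | startswith("nvidia.com/")))
       | map("\(.key)=\(.value)"))
  | join(" ")'
```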
If everything looks good we can proceed to launching our virtual machines.
Launch Virtual Machines
We need to launch a few virtual machines in order to test GPUDirect RDMA with our passthrough devices. The virtual machine custom resource files will look something like the examples below, though they could differ depending on what one plans to test and what workloads will run inside the VM. We need two virtual machines running on different compute nodes, so ensure each YAML has a defined nodeSelector. Note that in these examples we reference one of the Mellanox devices and one of the NVIDIA GPU devices. The first machine is defined below.
$ cat <<EOF > rhel9-rdma1.yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  annotations:
    kubevirt.io/latest-observed-api-version: v1
    kubevirt.io/storage-observed-api-version: v1
    kubevirt.io/vm-generation: "5"
    vm.kubevirt.io/flavor: small
    vm.kubevirt.io/os: rhel9
    vm.kubevirt.io/workload: server
  labels:
    kubevirt.io/domain: rhel9-lavender-ocelot-28
    kubevirt.io/nodeName: nvd-srv-33.nvidia.eng.rdu2.dc.redhat.com
    kubevirt.io/size: small
    network.kubevirt.io/headlessService: headless
  name: rhel9-rdma1
  namespace: default
spec:
  architecture: amd64
  domain:
    cpu:
      cores: 1
      maxSockets: 16
      model: host-model
      sockets: 4
      threads: 1
    devices:
      disks:
      - disk:
          bus: virtio
        name: rootdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
      gpus:
      - deviceName: nvidia.com/GA102GL_A40
        name: gpus-orange-porpoise-63
      hostDevices:
      - deviceName: nvidia.com/BF3_CX7
        name: hostDevices-turquoise-hornet-42
      interfaces:
      - macAddress: 02:23:fc:00:00:11
        masquerade: {}
        model: virtio
        name: default
      rng: {}
    features:
      acpi:
        enabled: true
      smm:
        enabled: true
    firmware:
      bootloader:
        efi:
          secureBoot: false
      uuid: e2ff2b46-096e-521f-8680-c99c6bbae5d8
    machine:
      type: pc-q35-rhel9.4.0
    memory:
      guest: 16Gi
      maxGuest: 64Gi
    resources:
      requests:
        memory: 16Gi
  evictionStrategy: LiveMigrate
  networks:
  - name: default
    pod: {}
  nodeSelector:
    kubernetes.io/hostname: nvd-srv-33.nvidia.eng.rdu2.dc.redhat.com
  terminationGracePeriodSeconds: 180
  volumes:
  - dataVolume:
      name: rhel9-lavender-ocelot-28
    name: rootdisk
  - cloudInitNoCloud:
      userData: |
        #cloud-config
        user: cloud-user
        password: password
        chpasswd:
          expire: false
    name: cloudinitdisk
EOF
And then we have our second virtual machine defined.
$ cat <<EOF > rhel9-rdma2.yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  annotations:
    kubevirt.io/latest-observed-api-version: v1
    kubevirt.io/storage-observed-api-version: v1
    kubevirt.io/vm-generation: "3"
    vm.kubevirt.io/flavor: small
    vm.kubevirt.io/os: rhel9
    vm.kubevirt.io/workload: server
  labels:
    kubevirt.io/domain: rhel9-rdma2
    kubevirt.io/nodeName: nvd-srv-32.nvidia.eng.rdu2.dc.redhat.com
    kubevirt.io/size: small
    network.kubevirt.io/headlessService: headless
  name: rhel9-rdma2
  namespace: default
spec:
  architecture: amd64
  domain:
    cpu:
      cores: 1
      maxSockets: 16
      model: host-model
      sockets: 4
      threads: 1
    devices:
      disks:
      - disk:
          bus: virtio
        name: rootdisk
      - disk:
          bus: virtio
        name: cloudinitdisk
      gpus:
      - deviceName: nvidia.com/GA102GL_A40
        name: gpus-amaranth-dormouse-37
      hostDevices:
      - deviceName: nvidia.com/BF3_CX7
        name: hostDevices-turquoise-reptile-50
      interfaces:
      - macAddress: 02:23:fc:00:00:12
        masquerade: {}
        model: virtio
        name: default
      rng: {}
    features:
      acpi:
        enabled: true
      smm:
        enabled: true
    firmware:
      bootloader:
        efi:
          secureBoot: false
      uuid: 875d71cc-b337-5209-ad37-b1611fa77ec2
    machine:
      type: pc-q35-rhel9.4.0
    memory:
      guest: 16Gi
      maxGuest: 64Gi
    resources:
      requests:
        memory: 16Gi
  evictionStrategy: LiveMigrate
  networks:
  - name: default
    pod: {}
  nodeSelector:
    kubernetes.io/hostname: nvd-srv-32.nvidia.eng.rdu2.dc.redhat.com
  terminationGracePeriodSeconds: 180
  volumes:
  - dataVolume:
      name: rhel9-rdma2
    name: rootdisk
  - cloudInitNoCloud:
      userData: |
        #cloud-config
        user: cloud-user
        password: password
        chpasswd:
          expire: false
    name: cloudinitdisk
EOF
Once we have generated the virtual machine custom resource files we can create them on the cluster.
$ oc create -f rhel9-rdma1.yaml
$ oc create -f rhel9-rdma2.yaml
We can validate they are running by using oc get vmi.
$ oc get vmi
NAME AGE PHASE IP NODENAME READY
rhel9-rdma1 115m Running 10.128.2.66 nvd-srv-33.nvidia.eng.rdu2.dc.redhat.com True
rhel9-rdma2 114m Running 10.131.0.50 nvd-srv-32.nvidia.eng.rdu2.dc.redhat.com True
If everything looks good we can proceed to configuring the NVIDIA drivers on the virtual machines.
Prepare for NVIDIA DOCA and GPU Drivers
Now that our virtual machines are up and running we will need to configure the NVIDIA DOCA and GPU drivers to take advantage of the devices we have passed up to them.
We can use virtctl to access the console of the virtual machines and log in.
$ virtctl console rhel9-rdma1
Successfully connected to rhel9-rdma1 console. The escape sequence is ^]
rhel9-rdma1 login:
rhel9-rdma1 login: cloud-user
Password:
Last login: Fri Apr 11 18:55:51 on ttyS0
[cloud-user@rhel9-rdma1 ~]$
Once we are logged in we need to register the host with Red Hat.
$ sudo subscription-manager register
Registering to: subscription.rhsm.redhat.com:443/subscription
Username: schmaustech
Password:
The system has been registered with ID: 8d91ad2e-8d3a-4919-9030-4bd32292cc5b
The registered system name is: rhel9-rdma3
$
Next we need to enable the CodeReady repository.
$ sudo subscription-manager repos --enable=codeready-builder-for-rhel-9-x86_64-rpms
Repository 'codeready-builder-for-rhel-9-x86_64-rpms' is enabled for this system.
NVIDIA's drivers have dependencies on EPEL packages, so we will need to enable that repository as well.
$ sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm -y
Red Hat CodeReady Linux Builder for RHEL 9 x86_ 19 MB/s | 12 MB 00:00
Last metadata expiration check: 0:00:01 ago on Fri Apr 11 19:04:49 2025.
epel-release-latest-9.noarch.rpm 346 kB/s | 19 kB 00:00
Dependencies resolved.
================================================================================
Package Architecture Version Repository Size
================================================================================
Installing:
epel-release noarch 9-9.el9 @commandline 19 k
Transaction Summary
================================================================================
Install 1 Package
Total size: 19 k
Installed size: 26 k
Downloading Packages:
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : epel-release-9-9.el9.noarch 1/1
Running scriptlet: epel-release-9-9.el9.noarch 1/1
Many EPEL packages require the CodeReady Builder (CRB) repository.
It is recommended that you run /usr/bin/crb enable to enable the CRB repository.
[ 7451.044946] systemd-rc-local-generator[5104]: /etc/rc.d/rc.local is not marked executable, skipping.
Verifying : epel-release-9-9.el9.noarch 1/1
Installed products updated.
Installed:
epel-release-9-9.el9.noarch
Complete!
$
Next we need to enable the NVIDIA CUDA repository.
$ cat <<EOF | sudo tee /etc/yum.repos.d/cuda-rhel9.repo
[cuda-rhel9-x86_64]
name=cuda-rhel9-x86_64
baseurl=https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64
enabled=1
gpgcheck=1
gpgkey=https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/D42D0685.pub
EOF
We will also need to enable the NVIDIA DOCA repository.
$ cat <<EOF | sudo tee /etc/yum.repos.d/doca-rhel9.repo
[doca-rhel9-x86_64]
name=doca-rhel9-x86_64
baseurl=https://linux.mellanox.com/public/repo/doca/2.10.0/rhel9.4/x86_64
enabled=1
gpgcheck=0
EOF
After adding the required repositories we also need to make sure the nouveau driver is properly blacklisted. To do this we will need to edit the default grub file.
$ sudo vi /etc/default/grub
We just want to append modprobe.blacklist=nouveau to the GRUB_CMDLINE_LINUX line, as in the example below.
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto resume=/dev/mapper/rhel-swap rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap rhgb quiet modprobe.blacklist=nouveau"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true
We also need to create a denylist.conf file with two lines under /etc/modprobe.d.
$ echo "blacklist nouveau" | sudo tee /etc/modprobe.d/denylist.conf
$ echo "options nouveau modeset=0" | sudo tee -a /etc/modprobe.d/denylist.conf
Finally we can rebuild the dracut image and generate the new grub file.
$ sudo dracut --force
$ sudo grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Adding boot menu entry for UEFI Firmware Settings ...
done
With our repos and blacklists in place we can validate the repolist looks correct.
$ sudo dnf repolist
Updating Subscription Management repositories.
repo id repo name
codeready-builder-for-rhel-9-x86_64-rpms Red Hat CodeReady Linux Builder for RHEL 9 x86_64 (RPMs)
cuda-rhel9-x86_64 cuda-rhel9-x86_64
doca-rhel9-x86_64 doca-rhel9-x86_64
epel Extra Packages for Enterprise Linux 9 - x86_64
epel-cisco-openh264 Extra Packages for Enterprise Linux 9 openh264 (From Cisco) - x86_64
rhel-9-for-x86_64-appstream-rpms Red Hat Enterprise Linux 9 for x86_64 - AppStream (RPMs)
rhel-9-for-x86_64-baseos-rpms Red Hat Enterprise Linux 9 for x86_64 - BaseOS (RPMs)
Before we proceed to installing the NVIDIA DOCA drivers, we should reboot the VMs so our driver blacklist takes effect. Once rebooted we can proceed.
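After the reboot, a quick check (a sketch) confirms nouveau stayed out of the kernel:

```shell
# nouveau should not appear in the loaded module list after the reboot
lsmod | grep -q '^nouveau' && echo "nouveau loaded" || echo "nouveau not loaded"
```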
Install NVIDIA DOCA Drivers
Now that we have our repositories set up, we can install the DOCA drivers with the following command.
$ sudo dnf install doca-all -y
Updating Subscription Management repositories.
cuda-rhel9-x86_64 7.4 MB/s | 2.6 MB 00:00
doca-rhel9-x86_64 208 kB/s | 214 kB 00:01
Extra Packages for Enterprise Linux 9 - x86_64 33 MB/s | 23 MB 00:00
Extra Packages for Enterprise Linux 9 openh264 8.5 kB/s | 2.5 kB 00:00
Dependencies resolved.
=========================================================================================================================================
Package Arch Version Repository Size
=========================================================================================================================================
Installing:
doca-all x86_64 2.10.0-0.5.2 doca-rhel9-x86_64 6.6 k
doca-sosreport noarch 4.8.1-1.el9 doca-rhel9-x86_64 862 k
replacing sos.noarch 4.7.2-3.el9
kernel-core x86_64 5.14.0-427.42.1.el9_4 rhel-9-for-x86_64-baseos-rpms 19 M
kernel-core x86_64 5.14.0-503.35.1.el9_5 rhel-9-for-x86_64-baseos-rpms 18 M
kernel-modules x86_64 5.14.0-503.35.1.el9_5 rhel-9-for-x86_64-baseos-rpms 37 M
(...)
unbound-libs x86_64 1.16.2-8.el9_5.1 rhel-9-for-x86_64-appstream-rpms 552 k
vim-filesystem noarch 2:8.2.2637-21.el9 rhel-9-for-x86_64-baseos-rpms 17 k
xpmem x86_64 2.7.4-1.2501056.rhel9u4 doca-rhel9-x86_64 20 k
xz-devel x86_64 5.2.5-8.el9_0 rhel-9-for-x86_64-appstream-rpms 59 k
zlib-devel x86_64 1.2.11-40.el9 rhel-9-for-x86_64-appstream-rpms 47 k
Installing weak dependencies:
perl-NDBM_File x86_64 1.15-481.el9 rhel-9-for-x86_64-appstream-rpms 23 k
python3-boto3 noarch 1.28.62-1.el9 epel 164 k
Transaction Summary
=========================================================================================================================================
Install 223 Packages
Upgrade 1 Package
Total download size: 452 M
Is this ok [y/N]: y
Downloading Packages:
(1/224): collectx_1.20.2-23151356-rhel9.1-x86_6 323 kB/s | 222 kB 00:00
(2/224): clusterkit-1.15.469-1.2501056.x86_64.r 200 kB/s | 138 kB 00:00
(3/224): doca-all-2.10.0-0.5.2.x86_64.rpm 19 kB/s | 6.6 kB 00:00
(...)
(221/224): meson-0.63.3-1.el9.noarch.rpm 11 MB/s | 1.5 MB 00:00
(222/224): ninja-build-1.10.2-6.el9.x86_64.rpm 1.0 MB/s | 150 kB 00:00
(223/224): libzip-devel-1.7.3-8.el9.x86_64.rpm 3.1 MB/s | 212 kB 00:00
(224/224): libibverbs-2501mlnx56-1.2501056.x86_ 474 kB/s | 358 kB 00:00
--------------------------------------------------------------------------------
Total 14 MB/s | 452 MB 00:33
Extra Packages for Enterprise Linux 9 - x86_64 1.6 MB/s | 1.6 kB 00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Upgrading : libibverbs-2501mlnx56-1.2501056.x86_64 1/226
Running scriptlet: libibverbs-2501mlnx56-1.2501056.x86_64 1/226
Installing : doca-sdk-common-2.10.0087-1.el9.x86_64 2/226
Running scriptlet: doca-sdk-common-2.10.0087-1.el9.x86_64 2/226
Installing : ucx-1.18.0-1.2501056.x86_64 3/226
Running scriptlet: ucx-1.18.0-1.2501056.x86_64 3/226
Installing : libibumad-2501mlnx56-1.2501056.x86_64 4/226
(...)
Installing : doca-all-2.10.0-0.5.2.x86_64 224/226
Obsoleting : sos-4.7.2-3.el9.noarch 225/226
Cleanup : libibverbs-51.0-1.el9.x86_64 226/226
Running scriptlet: kernel-modules-core-5.14.0-427.42.1.el9_4.x86_64 226/226
Running scriptlet: kernel-core-5.14.0-427.42.1.el9_4.x86_64 226/226
Running scriptlet: kernel-modules-core-5.14.0-503.35.1.el9_5.x86_64 226/226
Running scriptlet: kernel-core-5.14.0-503.35.1.el9_5.x86_64 226/226
Running scriptlet: kernel-modules-5.14.0-503.35.1.el9_5.x86_64 226/226
Running scriptlet: mlnx-ofa_kernel-devel-25.01-OFED.25.01.0.5.6.1.r 226/226
Running scriptlet: libibverbs-51.0-1.el9.x86_64 226/226
[ 1456.075588] systemd-rc-local-generator[76691]: /etc/rc.d/rc.local is not marked executable, skipping.
Verifying : clusterkit-1.15.469-1.2501056.x86_64 1/226
Verifying : collectx-clxapi-1.20.2-1.x86_64 2/226
Verifying : collectx-clxapidev-1.20.2-1.x86_64 3/226
Verifying : doca-all-2.10.0-0.5.2.x86_64 4/226
(...)
Verifying : meson-0.63.3-1.el9.noarch 223/226
Verifying : libzip-devel-1.7.3-8.el9.x86_64 224/226
Verifying : libibverbs-2501mlnx56-1.2501056.x86_64 225/226
Verifying : libibverbs-51.0-1.el9.x86_64 226/226
Installed products updated.
Upgraded:
libibverbs-2501mlnx56-1.2501056.x86_64
Installed:
bzip2-devel-1.0.8-8.el9.x86_64
clusterkit-1.15.469-1.2501056.x86_64
(..)
unbound-1.16.2-8.el9_5.1.x86_64
unbound-libs-1.16.2-8.el9_5.1.x86_64
vim-filesystem-2:8.2.2637-21.el9.noarch
xpmem-2.7.4-1.2501056.rhel9u4.x86_64
xz-devel-5.2.5-8.el9_0.x86_64
zlib-devel-1.2.11-40.el9.x86_64
Complete!
Once the drivers are installed we can confirm that mst status is reporting properly.
$ sudo mst status -v
MST modules:
------------
MST PCI module is not loaded
MST PCI configuration module is not loaded
PCI devices:
------------
DEVICE_TYPE MST PCI RDMA NET NUMA
BlueField3(rev:1) NA 09:00.0 mlx5_0 net-eth1 -1
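With the DOCA drivers loaded, the passed-through port should also show up as an RDMA device inside the guest; a sketch using ibv_devinfo from the rdma-core tooling:

```shell
# Each mlx5 device should appear with an hca_id and its link layer
ibv_devinfo | grep -E 'hca_id|link_layer'
```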
Install RHEL Dependency Packages
There are some RHEL dependency packages that will be needed, so we can install those now.
$ sudo dnf install wget procps-ng pciutils jq iputils ethtool net-tools git autoconf automake libtool pciutils-devel -y
Updating Subscription Management repositories
Last metadata expiration check: 0:59:19 ago on Sat Apr 12 18:06:50 2025.
Package procps-ng-3.3.17-14.el9.x86_64 is already installed.
Package pciutils-3.7.0-5.el9.x86_64 is already installed.
Package jq-1.6-17.el9.x86_64 is already installed.
Package iputils-20210202-9.el9.x86_64 is already installed.
Package ethtool-2:6.2-1.el9.x86_64 is already installed.
Dependencies resolved.
===============================================================================================
Package Arch Version Repository Size
===============================================================================================
Installing:
autoconf noarch 2.69-39.el9 rhel-9-for-x86_64-appstream-rpms 685 k
automake noarch 1.16.2-8.el9 rhel-9-for-x86_64-appstream-rpms 693 k
git x86_64 2.43.5-2.el9_5 rhel-9-for-x86_64-appstream-rpms 55 k
libtool x86_64 2.4.6-46.el9 rhel-9-for-x86_64-appstream-rpms 585 k
net-tools x86_64 2.0-0.64.20160912git.el9 rhel-9-for-x86_64-baseos-rpms 312 k
wget x86_64 1.21.1-8.el9_4 rhel-9-for-x86_64-appstream-rpms 789 k
Upgrading:
iputils x86_64 20210202-10.el9_5 rhel-9-for-x86_64-baseos-rpms 179 k
Installing dependencies:
cpp x86_64 11.5.0-2.el9 rhel-9-for-x86_64-appstream-rpms 11 M
gcc x86_64 11.5.0-2.el9 rhel-9-for-x86_64-appstream-rpms 32 M
git-core x86_64 2.43.5-2.el9_5 rhel-9-for-x86_64-appstream-rpms 4.4 M
git-core-doc noarch 2.43.5-2.el9_5 rhel-9-for-x86_64-appstream-rpms 2.9 M
glibc-devel x86_64 2.34-125.el9_5.1 rhel-9-for-x86_64-appstream-rpms 37 k
glibc-headers x86_64 2.34-125.el9_5.1 rhel-9-for-x86_64-appstream-rpms 543 k
libmpc x86_64 1.2.1-4.el9 rhel-9-for-x86_64-appstream-rpms 65 k
libxcrypt-devel x86_64 4.4.18-3.el9 rhel-9-for-x86_64-appstream-rpms 32 k
m4 x86_64 1.4.19-1.el9 rhel-9-for-x86_64-appstream-rpms 304 k
make x86_64 1:4.3-8.el9 rhel-9-for-x86_64-baseos-rpms 541 k
perl-DynaLoader x86_64 1.47-481.el9 rhel-9-for-x86_64-appstream-rpms 26 k
perl-Error noarch 1:0.17029-7.el9 rhel-9-for-x86_64-appstream-rpms 46 k
perl-File-Compare noarch 1.100.600-481.el9 rhel-9-for-x86_64-appstream-rpms 14 k
perl-File-Copy noarch 2.34-481.el9 rhel-9-for-x86_64-appstream-rpms 20 k
perl-File-Find noarch 1.37-481.el9 rhel-9-for-x86_64-appstream-rpms 26 k
perl-Git noarch 2.43.5-2.el9_5 rhel-9-for-x86_64-appstream-rpms 39 k
perl-TermReadKey x86_64 2.38-11.el9 rhel-9-for-x86_64-appstream-rpms 40 k
perl-Thread-Queue noarch 3.14-460.el9 rhel-9-for-x86_64-appstream-rpms 24 k
perl-lib x86_64 0.65-481.el9 rhel-9-for-x86_64-appstream-rpms 15 k
perl-threads x86_64 1:2.25-460.el9 rhel-9-for-x86_64-appstream-rpms 61 k
perl-threads-shared x86_64 1.61-460.el9 rhel-9-for-x86_64-appstream-rpms 48 k
Transaction Summary
===============================================================================================
Install 27 Packages
Upgrade 1 Package
Total download size: 56 M
Downloading Packages:
(1/28): net-tools-2.0-0.64.20160912git.el9.x86_ 2.0 MB/s | 312 kB 00:00
(2/28): perl-Error-0.17029-7.el9.noarch.rpm 296 kB/s | 46 kB 00:00
(3/28): make-4.3-8.el9.x86_64.rpm 2.9 MB/s | 541 kB 00:00
(4/28): libmpc-1.2.1-4.el9.x86_64.rpm 950 kB/s | 65 kB 00:00
(5/28): perl-TermReadKey-2.38-11.el9.x86_64.rpm 403 kB/s | 40 kB 00:00
(6/28): perl-threads-2.25-460.el9.x86_64.rpm 743 kB/s | 61 kB 00:00
(7/28): m4-1.4.19-1.el9.x86_64.rpm 2.4 MB/s | 304 kB 00:00
(8/28): libxcrypt-devel-4.4.18-3.el9.x86_64.rpm 162 kB/s | 32 kB 00:00
(9/28): perl-threads-shared-1.61-460.el9.x86_64 630 kB/s | 48 kB 00:00
(10/28): perl-Thread-Queue-3.14-460.el9.noarch. 159 kB/s | 24 kB 00:00
(11/28): perl-File-Compare-1.100.600-481.el9.no 191 kB/s | 14 kB 00:00
(12/28): automake-1.16.2-8.el9.noarch.rpm 3.2 MB/s | 693 kB 00:00
(13/28): perl-File-Copy-2.34-481.el9.noarch.rpm 144 kB/s | 20 kB 00:00
(14/28): perl-File-Find-1.37-481.el9.noarch.rpm 120 kB/s | 26 kB 00:00
(15/28): perl-lib-0.65-481.el9.x86_64.rpm 85 kB/s | 15 kB 00:00
(16/28): perl-DynaLoader-1.47-481.el9.x86_64.rp 134 kB/s | 26 kB 00:00
(17/28): wget-1.21.1-8.el9_4.x86_64.rpm 5.1 MB/s | 789 kB 00:00
(18/28): autoconf-2.69-39.el9.noarch.rpm 2.8 MB/s | 685 kB 00:00
(19/28): glibc-devel-2.34-125.el9_5.1.x86_64.rp 319 kB/s | 37 kB 00:00
(20/28): libtool-2.4.6-46.el9.x86_64.rpm 7.5 MB/s | 585 kB 00:00
(21/28): glibc-headers-2.34-125.el9_5.1.x86_64. 9.0 MB/s | 543 kB 00:00
(22/28): cpp-11.5.0-2.el9.x86_64.rpm 59 MB/s | 11 MB 00:00
(23/28): git-2.43.5-2.el9_5.x86_64.rpm 508 kB/s | 55 kB 00:00
(24/28): gcc-11.5.0-2.el9.x86_64.rpm 55 MB/s | 32 MB 00:00
(25/28): git-core-doc-2.43.5-2.el9_5.noarch.rpm 19 MB/s | 2.9 MB 00:00
(26/28): git-core-2.43.5-2.el9_5.x86_64.rpm 17 MB/s | 4.4 MB 00:00
(27/28): perl-Git-2.43.5-2.el9_5.noarch.rpm 451 kB/s | 39 kB 00:00
(28/28): iputils-20210202-10.el9_5.x86_64.rpm 2.5 MB/s | 179 kB 00:00
--------------------------------------------------------------------------------
Total 38 MB/s | 56 MB 00:01
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : perl-DynaLoader-1.47-481.el9.x86_64 1/29
Installing : git-core-2.43.5-2.el9_5.x86_64 2/29
Installing : perl-File-Find-1.37-481.el9.noarch 3/29
Installing : perl-File-Copy-2.34-481.el9.noarch 4/29
Installing : perl-File-Compare-1.100.600-481.el9.noarch 5/29
Installing : perl-threads-1:2.25-460.el9.x86_64 6/29
Installing : libmpc-1.2.1-4.el9.x86_64 7/29
Installing : cpp-11.5.0-2.el9.x86_64 8/29
Installing : perl-threads-shared-1.61-460.el9.x86_64 9/29
Installing : perl-Thread-Queue-3.14-460.el9.noarch 10/29
Installing : git-core-doc-2.43.5-2.el9_5.noarch 11/29
Installing : perl-TermReadKey-2.38-11.el9.x86_64 12/29
Installing : glibc-headers-2.34-125.el9_5.1.x86_64 13/29
Installing : glibc-devel-2.34-125.el9_5.1.x86_64 14/29
Installing : libxcrypt-devel-4.4.18-3.el9.x86_64 15/29
Installing : perl-lib-0.65-481.el9.x86_64 16/29
Installing : m4-1.4.19-1.el9.x86_64 17/29
Installing : autoconf-2.69-39.el9.noarch 18/29
Installing : automake-1.16.2-8.el9.noarch 19/29
Installing : perl-Error-1:0.17029-7.el9.noarch 20/29
Installing : git-2.43.5-2.el9_5.x86_64 21/29
Installing : perl-Git-2.43.5-2.el9_5.noarch 22/29
Installing : make-1:4.3-8.el9.x86_64 23/29
Installing : gcc-11.5.0-2.el9.x86_64 24/29
Installing : libtool-2.4.6-46.el9.x86_64 25/29
Upgrading : iputils-20210202-10.el9_5.x86_64 26/29
Running scriptlet: iputils-20210202-10.el9_5.x86_64 26/29
Installing : wget-1.21.1-8.el9_4.x86_64 27/29
Installing : net-tools-2.0-0.64.20160912git.el9.x86_64 28/29
Running scriptlet: net-tools-2.0-0.64.20160912git.el9.x86_64 28/29
Running scriptlet: iputils-20210202-9.el9.x86_64 29/29
Cleanup : iputils-20210202-9.el9.x86_64 29/29
Running scriptlet: iputils-20210202-9.el9.x86_64 29/29
[ 4396.017542] systemd-rc-local-generator[23097]: /etc/rc.d/rc.local is not marked executable, skipping.
Verifying : make-1:4.3-8.el9.x86_64 1/29
Verifying : net-tools-2.0-0.64.20160912git.el9.x86_64 2/29
Verifying : perl-Error-1:0.17029-7.el9.noarch 3/29
Verifying : perl-TermReadKey-2.38-11.el9.x86_64 4/29
Verifying : libmpc-1.2.1-4.el9.x86_64 5/29
Verifying : libxcrypt-devel-4.4.18-3.el9.x86_64 6/29
Verifying : perl-threads-1:2.25-460.el9.x86_64 7/29
Verifying : m4-1.4.19-1.el9.x86_64 8/29
Verifying : perl-Thread-Queue-3.14-460.el9.noarch 9/29
Verifying : perl-threads-shared-1.61-460.el9.x86_64 10/29
Verifying : automake-1.16.2-8.el9.noarch 11/29
Verifying : perl-File-Compare-1.100.600-481.el9.noarch 12/29
Verifying : perl-File-Copy-2.34-481.el9.noarch 13/29
Verifying : perl-File-Find-1.37-481.el9.noarch 14/29
Verifying : perl-lib-0.65-481.el9.x86_64 15/29
Verifying : perl-DynaLoader-1.47-481.el9.x86_64 16/29
Verifying : wget-1.21.1-8.el9_4.x86_64 17/29
Verifying : autoconf-2.69-39.el9.noarch 18/29
Verifying : gcc-11.5.0-2.el9.x86_64 19/29
Verifying : glibc-devel-2.34-125.el9_5.1.x86_64 20/29
Verifying : libtool-2.4.6-46.el9.x86_64 21/29
Verifying : cpp-11.5.0-2.el9.x86_64 22/29
Verifying : glibc-headers-2.34-125.el9_5.1.x86_64 23/29
Verifying : git-2.43.5-2.el9_5.x86_64 24/29
Verifying : git-core-2.43.5-2.el9_5.x86_64 25/29
Verifying : git-core-doc-2.43.5-2.el9_5.noarch 26/29
Verifying : perl-Git-2.43.5-2.el9_5.noarch 27/29
Verifying : iputils-20210202-10.el9_5.x86_64 28/29
Verifying : iputils-20210202-9.el9.x86_64 29/29
Installed products updated.
Upgraded:
iputils-20210202-10.el9_5.x86_64
Installed:
autoconf-2.69-39.el9.noarch
automake-1.16.2-8.el9.noarch
cpp-11.5.0-2.el9.x86_64
gcc-11.5.0-2.el9.x86_64
git-2.43.5-2.el9_5.x86_64
git-core-2.43.5-2.el9_5.x86_64
git-core-doc-2.43.5-2.el9_5.noarch
glibc-devel-2.34-125.el9_5.1.x86_64
glibc-headers-2.34-125.el9_5.1.x86_64
libmpc-1.2.1-4.el9.x86_64
libtool-2.4.6-46.el9.x86_64
libxcrypt-devel-4.4.18-3.el9.x86_64
m4-1.4.19-1.el9.x86_64
make-1:4.3-8.el9.x86_64
net-tools-2.0-0.64.20160912git.el9.x86_64
perl-DynaLoader-1.47-481.el9.x86_64
perl-Error-1:0.17029-7.el9.noarch
perl-File-Compare-1.100.600-481.el9.noarch
perl-File-Copy-2.34-481.el9.noarch
perl-File-Find-1.37-481.el9.noarch
perl-Git-2.43.5-2.el9_5.noarch
perl-TermReadKey-2.38-11.el9.x86_64
perl-Thread-Queue-3.14-460.el9.noarch
perl-lib-0.65-481.el9.x86_64
perl-threads-1:2.25-460.el9.x86_64
perl-threads-shared-1.61-460.el9.x86_64
wget-1.21.1-8.el9_4.x86_64
Complete!
Install NVIDIA GPU Drivers
Next we need to install the NVIDIA GPU drivers.
$ sudo dnf -y module install nvidia-driver:570-open
Updating Subscription Management repositories.
Last metadata expiration check: 2:17:53 ago on Sat Apr 12 17:25:39 2025.
Dependencies resolved.
==========================================================================================================
Package Arch Version Repository Size
==========================================================================================================
Installing group/module packages:
kmod-nvidia-open-dkms noarch 3:570.124.06-1.el9 cuda-rhel9-x86_64 12 M
libnvidia-cfg x86_64 3:570.124.06-1.el9 cuda-rhel9-x86_64 151 k
libnvidia-fbc x86_64 3:570.124.06-1.el9 cuda-rhel9-x86_64 102 k
(...)
xorg-x11-drv-libinput x86_64 1.0.1-3.el9 rhel-9-for-x86_64-appstream-rpms 49 k
xorg-x11-nvidia x86_64 3:570.124.06-1.el9 cuda-rhel9-x86_64 2.4 M
xorg-x11-proto-devel noarch 2024.1-1.el9 rhel-9-for-x86_64-appstream-rpms 314 k
xorg-x11-server-Xorg x86_64 1.20.11-26.el9 rhel-9-for-x86_64-appstream-rpms 1.5 M
xorg-x11-server-common x86_64 1.20.11-26.el9 rhel-9-for-x86_64-appstream-rpms 37 k
Installing module profiles:
nvidia-driver/default
Transaction Summary
==========================================================================================================
Install 51 Packages
Total download size: 335 M
Installed size: 1.1 G
Downloading Packages:
(1/51): egl-gbm-1.1.2.1-1.el9.x86_64.rpm 90 kB/s | 22 kB 00:00
(2/51): egl-wayland-1.1.19~20250313gitf1fd514-1 162 kB/s | 44 kB 00:00
(3/51): egl-x11-1.0.1~20250324git0558d54-5.el9. 206 kB/s | 56 kB 00:00
(...)
(50/51): info-6.7-15.el9.x86_64.rpm 1.5 MB/s | 228 kB 00:00
(51/51): kernel-devel-matched-5.14.0-503.35.1.e 5.8 MB/s | 2.0 MB 00:00
--------------------------------------------------------------------------------
Total 68 MB/s | 335 MB 00:04
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : libnvidia-ml-3:570.124.06-1.el9.x86_64 1/51
(...)
Installing : nvidia-modprobe-3:570.124.06-1.el9.x86_64 38/51
Installing : nvidia-kmod-common-3:570.124.06-1.el9.noarch 39/51
Running scriptlet: nvidia-kmod-common-3:570.124.06-1.el9.noarch 39/51
Installing : kmod-nvidia-open-dkms-3:570.124.06-1.el9.noarch 40/51
Running scriptlet: kmod-nvidia-open-dkms-3:570.124.06-1.el9.noarch 40/51
[ 474.894705] nvidia: loading out-of-tree module taints kernel.
[ 474.894734] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 474.934517] nvidia-nvlink: Nvlink Core is being initialized, major device number 235
[ 474.934584] NVRM: loading NVIDIA UNIX Open Kernel Module for x86_64 570.124.06 Release Build (root@rhel9-rdma3) Sat Apr 12 19:44:54 EDT 2025
[ 475.033439] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for x86_64 570.124.06 Release Build (root@rhel9-rdma3) Sat Apr 12 19:44:24 EDT 2025
[ 475.041093] [drm] [nvidia-drm] [GPU ID 0x00000a00] Loading driver
[ 475.119117] ACPI Warning: \_SB.PCI0.S19.S00._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20230331/nsarguments-61)
[ 476.482834] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:0a:00.0 on minor 1
[ 476.482930] nvidia 0000:0a:00.0: [drm] No compatible format found
[ 476.482938] nvidia 0000:0a:00.0: [drm] Cannot find any crtc or sizes
[ 476.679230] nvidia-uvm: Loaded the UVM driver, major device number 511.
Installing : egl-x11-1.0.1~20250324git0558d54-5.el9.x86_64 41/51
Installing : egl-wayland-1.1.19~20250313gitf1fd514-1.el9.x86_64 42/51
Installing : egl-gbm-2:1.1.2.1-1.el9.x86_64 43/51
Installing : nvidia-driver-libs-3:570.124.06-1.el9.x86_64 44/51
Installing : nvidia-driver-3:570.124.06-1.el9.x86_64 45/51
Running scriptlet: nvidia-driver-3:570.124.06-1.el9.x86_64 45/51
Created symlink /etc/systemd/system/systemd-hibernate.service.wants/nvidia-hibernate.service → /usr/lib/systemd/system/nvidia-hibernate.service.
Created symlink /etc/systemd/system/multi-user.target.wants/nvidia-powerd.service → /usr/lib/systemd/system/nvidia-powerd.service.
Created symlink /etc/systemd/system/systemd-suspend.service.wants/nvidia-resume.service → /usr/lib/systemd/system/nvidia-resume.service.
Created symlink /etc/systemd/system/systemd-hibernate.service.wants/nvidia-resume.service → /usr/lib/systemd/system/nvidia-resume.service.
Created symlink /etc/systemd/system/systemd-suspend-then-hibernate.service.wants/nvidia-resume.service → /usr/lib/systemd/system/nvidia-resume.service.
Created symlink /etc/systemd/system/systemd-suspend.service.wants/nvidia-suspend.service → /usr/lib/systemd/system/nvidia-suspend.service.
Created symlink /etc/systemd/system/systemd-suspend-then-hibernate.service.wants/nvidia-suspend-then-hibernate.service → /usr/lib/systemd/system/nvidia-suspend-then-hibernate.service.
Installing : xorg-x11-nvidia-3:570.124.06-1.el9.x86_64 46/51
Installing : nvidia-xconfig-3:570.124.06-1.el9.x86_64 47/51
Installing : nvidia-settings-3:570.124.06-1.el9.x86_64 48/51
Installing : nvidia-driver-cuda-3:570.124.06-1.el9.x86_64 49/51
Installing : nvidia-libXNVCtrl-devel-3:570.124.06-1.el9.x86_64 50/51
Installing : libnvidia-fbc-3:570.124.06-1.el9.x86_64 51/51
Running scriptlet: libnvidia-fbc-3:570.124.06-1.el9.x86_64 51/51
[ 478.773624] systemd-rc-local-generator[42506]: /etc/rc.d/rc.local is not marked executable, skipping.
Verifying : egl-gbm-2:1.1.2.1-1.el9.x86_64 1/51
Verifying : egl-wayland-1.1.19~20250313gitf1fd514-1.el9.x86_64 2/51
Verifying : egl-x11-1.0.1~20250324git0558d54-5.el9.x86_64 3/51
Verifying : kmod-nvidia-open-dkms-3:570.124.06-1.el9.noarch 4/51
(...)
Verifying : kernel-devel-5.14.0-503.35.1.el9_5.x86_64 48/51
Verifying : kernel-devel-matched-5.14.0-503.35.1.el9_5.x86_64 49/51
Verifying : ed-1.14.2-12.el9.x86_64 50/51
Verifying : info-6.7-15.el9.x86_64 51/51
Installed products updated.
Installed:
bison-3.7.4-5.el9.x86_64
dkms-3.1.6-1.el9.noarch
ed-1.14.2-12.el9.x86_64
egl-gbm-2:1.1.2.1-1.el9.x86_64
egl-wayland-1.1.19~20250313gitf1fd514-1.el9.x86_64
(...)
xorg-x11-nvidia-3:570.124.06-1.el9.x86_64
xorg-x11-proto-devel-2024.1-1.el9.noarch
xorg-x11-server-Xorg-1.20.11-26.el9.x86_64
xorg-x11-server-common-1.20.11-26.el9.x86_64
Complete!
Validate GPU Drivers
We can validate that the NVIDIA GPU drivers installed correctly by running the nvidia-smi
command and listing the loaded kernel modules.
$ sudo nvidia-smi
Sat Apr 12 19:46:06 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.124.06 Driver Version: 570.124.06 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A40 Off | 00000000:0A:00.0 Off | 0 |
| 0% 27C P0 70W / 300W | 1MiB / 46068MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
$ sudo lsmod|grep nvidia
nvidia_uvm 4100096 0
nvidia_drm 143360 0
nvidia_modeset 1720320 1 nvidia_drm
nvidia 11669504 2 nvidia_uvm,nvidia_modeset
video 73728 1 nvidia_modeset
drm_kms_helper 274432 4 bochs,drm_vram_helper,nvidia_drm
drm 782336 8 drm_kms_helper,bochs,drm_vram_helper,nvidia,drm_ttm_helper,nvidia_drm,ttm
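For a scripted check, the same validation can be wrapped in a small helper. This is a sketch of our own (not part of the NVIDIA packages); it takes `lsmod`-style text as an argument so the logic can be exercised anywhere.

```shell
# Our own sketch: verify the expected NVIDIA kernel modules are loaded.
# Pass `lsmod`-style text as the argument so the logic is easy to test.
check_nvidia_modules() {
  local out="$1" mod
  for mod in nvidia nvidia_modeset nvidia_uvm nvidia_drm; do
    printf '%s\n' "$out" | grep -q "^${mod} " || { echo "missing: $mod"; return 1; }
  done
  echo "all NVIDIA modules loaded"
}

# On the VM itself: check_nvidia_modules "$(lsmod)"
```

A non-zero exit from the helper makes it easy to use in automation, for example as a post-install health check.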
Install CUDA Libraries
Next we need to install the NVIDIA CUDA libraries.
$ sudo dnf -y install cuda-toolkit-12-8
Updating Subscription Management repositories.
Last metadata expiration check: 1:01:06 ago on Sat Apr 12 18:06:50 2025.
Dependencies resolved.
===========================================================================================================================
Package Arch Version Repository Size
===========================================================================================================================
Installing:
cuda-toolkit-12-8 x86_64 12.8.1-1 cuda-rhel9-x86_64 8.8 k
Installing dependencies:
ModemManager-glib x86_64 1.20.2-1.el9 rhel-9-for-x86_64-baseos-rpms 337 k
adwaita-cursor-theme noarch 40.1.1-3.el9 rhel-9-for-x86_64-appstream-rpms 655 k
adwaita-icon-theme noarch 40.1.1-3.el9 rhel-9-for-x86_64-appstream-rpms 12 M
alsa-lib x86_64 1.2.12-1.el9 rhel-9-for-x86_64-appstream-rpms 527 k
at-spi2-atk x86_64 2.38.0-4.el9 rhel-9-for-x86_64-appstream-rpms 90 k
(...)
pipewire-alsa x86_64 1.0.1-1.el9 rhel-9-for-x86_64-appstream-rpms 59 k
pipewire-jack-audio-connection-kit x86_64 1.0.1-1.el9 rhel-9-for-x86_64-appstream-rpms 9.4 k
pipewire-pulseaudio x86_64 1.0.1-1.el9 rhel-9-for-x86_64-appstream-rpms 196 k
tracker-miners x86_64 3.1.2-4.el9_3 rhel-9-for-x86_64-appstream-rpms 942 k
xdg-desktop-portal-gtk x86_64 1.12.0-3.el9 rhel-9-for-x86_64-appstream-rpms 139 k
Transaction Summary
===========================================================================================================================
Install 232 Packages
Total download size: 5.1 G
Installed size: 9.7 G
Downloading Packages:
(1/232): cuda-compiler-12-8-12.8.1-1.x86_64.rpm 34 kB/s | 7.4 kB 00:00
(2/232): cuda-command-line-tools-12-8-12.8.1-1. 29 kB/s | 7.5 kB 00:00
(3/232): cuda-cccl-12-8-12.8.90-1.x86_64.rpm 4.2 MB/s | 1.6 MB 00:00
(4/232): cuda-crt-12-8-12.8.93-1.x86_64.rpm 705 kB/s | 118 kB 00:00
(5/232): cuda-cudart-12-8-12.8.90-1.x86_64.rpm 1.0 MB/s | 233 kB 00:00
(6/232): cuda-cuobjdump-12-8-12.8.90-1.x86_64.r 1.5 MB/s | 265 kB 00:00
(...)
(228/232): ostree-libs-2024.9-1.el9_5.x86_64.rp 6.0 MB/s | 470 kB 00:00
(229/232): nss-util-3.101.0-10.el9_2.x86_64.rpm 719 kB/s | 92 kB 00:00
(230/232): tzdata-java-2025b-1.el9.noarch.rpm 3.6 MB/s | 228 kB 00:00
(231/232): libxslt-1.1.34-9.el9_5.1.x86_64.rpm 3.5 MB/s | 245 kB 00:00
(232/232): java-17-openjdk-headless-17.0.14.0.7 58 MB/s | 45 MB 00:00
--------------------------------------------------------------------------------
Total 112 MB/s | 5.1 GB 00:46
cuda-rhel9-x86_64 8.4 kB/s | 1.6 kB 00:00
Importing GPG key 0xD42D0685:
Userid : "cudatools <cudatools@nvidia.com>"
Fingerprint: 610C 7B14 E068 A878 070D A4E9 9CD0 A493 D42D 0685
From : https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/D42D0685.pub
Key imported successfully
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Running scriptlet: copy-jdk-configs-4.0-3.el9.noarch 1/1
Running scriptlet: java-17-openjdk-headless-1:17.0.14.0.7-2.el9.x86_64 1/1
Preparing : 1/1
Installing : cuda-toolkit-config-common-12.8.90-1.noarch 1/232
Installing : cuda-toolkit-12-config-common-12.8.90-1.noarch 2/232
Installing : cuda-toolkit-12-8-config-common-12.8.90-1.noarch 3/232
Installing : nspr-4.35.0-17.el9_2.x86_64 4/232
Installing : alsa-lib-1.2.12-1.el9.x86_64 5/232
Installing : libogg-2:1.3.4-6.el9.x86_64 6/232
Installing : avahi-libs-0.8-21.el9.x86_64 7/232
Installing : libvorbis-1:1.3.7-5.el9.x86_64 8/232
(...)
Running scriptlet: copy-jdk-configs-4.0-3.el9.noarch 232/232
Running scriptlet: wireplumber-0.4.14-1.el9.x86_64 232/232
Created symlink /etc/systemd/user/pipewire-session-manager.service → /usr/lib/systemd/user/wireplumber.service.
Created symlink /etc/systemd/user/pipewire.service.wants/wireplumber.service → /usr/lib/systemd/user/wireplumber.service.
Running scriptlet: java-17-openjdk-headless-1:17.0.14.0.7-2.el9.x86 232/232
Running scriptlet: fontconfig-2.14.0-2.el9_1.x86_64 232/232
Running scriptlet: java-17-openjdk-1:17.0.14.0.7-2.el9.x86_64 232/232
Running scriptlet: cuda-nvvp-12-8-12.8.93-1.x86_64 232/232
Running scriptlet: nsight-compute-2025.1.1-2025.1.1.2-1.x86_64 232/232
Running scriptlet: pipewire-pulseaudio-1.0.1-1.el9.x86_64 232/232
[ 4710.758477] systemd-rc-local-generator[27383]: /etc/rc.d/rc.local is not marked executable, skipping.
Verifying : cuda-cccl-12-8-12.8.90-1.x86_64 1/232
Verifying : cuda-command-line-tools-12-8-12.8.1-1.x86_64 2/232
Verifying : cuda-compiler-12-8-12.8.1-1.x86_64 3/232
Verifying : cuda-crt-12-8-12.8.93-1.x86_64 4/232
Verifying : cuda-cudart-12-8-12.8.90-1.x86_64 5/232
Verifying : cuda-cudart-devel-12-8-12.8.90-1.x86_64 6/232
Verifying : cuda-cuobjdump-12-8-12.8.90-1.x86_64 7/232
(...)
Verifying : nss-util-3.101.0-10.el9_2.x86_64 229/232
Verifying : ostree-libs-2024.9-1.el9_5.x86_64 230/232
Verifying : libxslt-1.1.34-9.el9_5.1.x86_64 231/232
Verifying : tzdata-java-2025b-1.el9.noarch 232/232
Installed products updated.
Installed:
ModemManager-glib-1.20.2-1.el9.x86_64
adwaita-cursor-theme-40.1.1-3.el9.noarch
adwaita-icon-theme-40.1.1-3.el9.noarch
alsa-lib-1.2.12-1.el9.x86_64
at-spi2-atk-2.38.0-4.el9.x86_64
at-spi2-core-2.40.3-1.el9.x86_64
(...)
xcb-util-keysyms-0.4.0-17.el9.x86_64
xcb-util-renderutil-0.3.9-20.el9.x86_64
xcb-util-wm-0.4.1-22.el9.x86_64
xdg-dbus-proxy-0.1.3-1.el9.x86_64
xdg-desktop-portal-1.12.6-1.el9.x86_64
xdg-desktop-portal-gtk-1.12.0-3.el9.x86_64
xkeyboard-config-2.33-2.el9.noarch
xml-common-0.6.3-58.el9.noarch
xorg-x11-fonts-Type1-7.5-33.el9.noarch
Complete!
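With the toolkit installed we can confirm it is usable by asking nvcc for its version. As a small illustration, the cuda_release helper below (our own, not part of the toolkit) pulls the release number out of that output.

```shell
# Hypothetical helper (our own): extract the release number from
# `nvcc --version` output, whose last line looks like
# "Cuda compilation tools, release 12.8, V12.8.93".
cuda_release() { sed -n 's/.*release \([0-9.]*\),.*/\1/p'; }

# On the VM: /usr/local/cuda/bin/nvcc --version | cuda_release
```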
Build Perftest with CUDA Libraries
Now we can move on to building the perftest binaries, first cloning the repository.
$ git clone https://github.com/linux-rdma/perftest.git
Cloning into 'perftest'...
remote: Enumerating objects: 6237, done.
remote: Counting objects: 100% (2711/2711), done.
remote: Compressing objects: 100% (163/163), done.
remote: Total 6237 (delta 2600), reused 2548 (delta 2548), pack-reused 3526 (from 2)
Next we need to export our paths for the CUDA libraries.
$ export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
$ export LIBRARY_PATH=/usr/local/cuda/lib64:$LIBRARY_PATH
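These exports only last for the current shell session. If we want them to persist across logins, the same paths can be placed in a profile script; a sketch, assuming /etc/profile.d/cuda.sh as the file name:

```shell
# /etc/profile.d/cuda.sh -- file name is our choice; any profile.d script works
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export LIBRARY_PATH=/usr/local/cuda/lib64:$LIBRARY_PATH
```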
Now we can change directories into the perftest project and run autogen.sh.
$ cd ./perftest
$ ./autogen.sh
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, 'config'.
libtoolize: copying file 'config/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
libtoolize: copying file 'm4/libtool.m4'
libtoolize: copying file 'm4/ltoptions.m4'
libtoolize: copying file 'm4/ltsugar.m4'
libtoolize: copying file 'm4/ltversion.m4'
libtoolize: copying file 'm4/lt~obsolete.m4'
libtoolize: 'AC_PROG_RANLIB' is rendered obsolete by 'LT_INIT'
configure.ac:55: installing 'config/compile'
configure.ac:59: installing 'config/config.guess'
configure.ac:59: installing 'config/config.sub'
configure.ac:36: installing 'config/install-sh'
configure.ac:36: installing 'config/missing'
Makefile.am: installing 'config/depcomp'
Next we need to run configure, passing it the path to the CUDA header.
$ ./configure CUDA_H_PATH=/usr/local/cuda/include/cuda.h
configure: loading site script /usr/share/config.site
checking for a BSD-compatible install... /bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether make supports nested variables... (cached) yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking whether gcc understands -c and -o together... yes
checking whether make supports the include directive... yes (GNU style)
checking dependency style of gcc... gcc3
checking for g++... g++
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking dependency style of g++... gcc3
checking dependency style of gcc... gcc3
checking build system type... x86_64-pc-linux-gnu
checking host system type... x86_64-pc-linux-gnu
checking how to print strings... printf
checking for a sed that does not truncate output... /bin/sed
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for fgrep... /bin/grep -F
checking for ld used by gcc... /bin/ld
checking if the linker (/bin/ld) is GNU ld... yes
checking for BSD- or MS-compatible name lister (nm)... /bin/nm -B
checking the name lister (/bin/nm -B) interface... BSD nm
checking whether ln -s works... yes
checking the maximum length of command line arguments... 1572864
checking how to convert x86_64-pc-linux-gnu file names to x86_64-pc-linux-gnu format... func_convert_file_noop
checking how to convert x86_64-pc-linux-gnu file names to toolchain format... func_convert_file_noop
checking for /bin/ld option to reload object files... -r
checking for objdump... objdump
checking how to recognize dependent libraries... pass_all
checking for dlltool... no
checking how to associate runtime and link libraries... printf %s\n
checking for ar... ar
checking for archiver @FILE support... @
checking for strip... strip
checking for ranlib... ranlib
checking command to parse /bin/nm -B output from gcc object... ok
checking for sysroot... no
checking for a working dd... /bin/dd
checking how to truncate binary pipes... /bin/dd bs=4096 count=1
checking for mt... no
checking if : is a manifest tool... no
checking how to run the C preprocessor... gcc -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for dlfcn.h... yes
checking for objdir... .libs
checking if gcc supports -fno-rtti -fno-exceptions... no
checking for gcc option to produce PIC... -fPIC -DPIC
checking if gcc PIC flag -fPIC -DPIC works... yes
checking if gcc static flag -static works... no
checking if gcc supports -c -o file.o... yes
checking if gcc supports -c -o file.o... (cached) yes
checking whether the gcc linker (/bin/ld -m elf_x86_64) supports shared libraries... yes
checking whether -lc should be explicitly linked in... no
checking dynamic linker characteristics... GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... yes
checking how to run the C++ preprocessor... g++ -E
checking for ld used by g++... /bin/ld -m elf_x86_64
checking if the linker (/bin/ld -m elf_x86_64) is GNU ld... yes
checking whether the g++ linker (/bin/ld -m elf_x86_64) supports shared libraries... yes
checking for g++ option to produce PIC... -fPIC -DPIC
checking if g++ PIC flag -fPIC -DPIC works... yes
checking if g++ static flag -static works... no
checking if g++ supports -c -o file.o... yes
checking if g++ supports -c -o file.o... (cached) yes
checking whether the g++ linker (/bin/ld -m elf_x86_64) supports shared libraries... yes
checking dynamic linker characteristics... (cached) GNU/Linux ld.so
checking how to hardcode library paths into programs... immediate
checking for ranlib... (cached) ranlib
checking for ANSI C header files... (cached) yes
checking infiniband/verbs.h usability... yes
checking infiniband/verbs.h presence... yes
checking for infiniband/verbs.h... yes
checking for ibv_get_device_list in -libverbs... yes
checking for rdma_create_event_channel in -lrdmacm... yes
checking for umad_init in -libumad... yes
checking for log in -lm... yes
checking for ibv_reg_dmabuf_mr in -libverbs... yes
checking pci/pci.h usability... yes
checking pci/pci.h presence... yes
checking for pci/pci.h... yes
checking for pci_init in -lpci... yes
checking for cuMemGetHandleForAddressRange in -lcuda... yes
checking for efadv_create_qp_ex in -lefa... yes
checking for mlx5dv_create_qp in -lmlx5... yes
checking for hnsdv_query_device in -lhns... no
checking that generated files are newer than configure... done
configure: creating ./config.status
config.status: creating Makefile
config.status: creating config.h
config.status: executing depfiles commands
config.status: executing libtool commands
config.status: executing man commands
Finally we can run make
to build the binaries.
$ make -j
make all-am
make[1]: Entering directory '/home/cloud-user/perftest'
ln -s .././man/perftest.1 man/ib_read_bw.1
ln -s .././man/perftest.1 man/ib_write_bw.1
ln -s .././man/perftest.1 man/ib_send_bw.1
ln -s .././man/perftest.1 man/ib_atomic_bw.1
ln -s .././man/perftest.1 man/ib_read_lat.1
ln -s .././man/perftest.1 man/ib_write_lat.1
ln -s .././man/perftest.1 man/ib_send_lat.1
ln -s .././man/perftest.1 man/raw_ethernet_bw.1
ln -s .././man/perftest.1 man/ib_atomic_lat.1
ln -s .././man/perftest.1 man/raw_ethernet_lat.1
ln -s .././man/perftest.1 man/raw_ethernet_burst_lat.1
CC src/send_bw.o
ln -s .././man/perftest.1 man/raw_ethernet_fs_rate.1
CC src/multicast_resources.o
CC src/perftest_communication.o
CC src/get_clock.o
CC src/perftest_parameters.o
CC src/perftest_resources.o
CC src/perftest_counters.o
CC src/host_memory.o
CC src/mmap_memory.o
CC src/cuda_memory.o
CC src/raw_ethernet_resources.o
CC src/send_lat.o
CC src/write_lat.o
CC src/write_bw.o
CC src/read_lat.o
CC src/read_bw.o
CC src/atomic_lat.o
CC src/atomic_bw.o
CC src/raw_ethernet_send_bw.o
CC src/raw_ethernet_send_lat.o
CC src/raw_ethernet_send_burst_lat.o
CC src/raw_ethernet_fs_rate.o
AR libperftest.a
CCLD ib_send_bw
CCLD ib_send_lat
CCLD ib_write_bw
CCLD ib_write_lat
CCLD ib_read_lat
CCLD ib_read_bw
CCLD ib_atomic_bw
CCLD raw_ethernet_bw
CCLD ib_atomic_lat
CCLD raw_ethernet_lat
CCLD raw_ethernet_burst_lat
CCLD raw_ethernet_fs_rate
make[1]: Leaving directory '/home/cloud-user/perftest'
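Because CUDA support is only compiled in when configure finds cuda.h, it is worth confirming the resulting binaries actually advertise the CUDA options. A quick sketch, using a helper of our own:

```shell
# Hypothetical check (the helper is ours): a perftest binary built with
# CUDA support lists the --use_cuda options in its help text.
has_cuda_support() { grep -q -- '--use_cuda'; }

# On the VM:
#   ./ib_write_bw --help 2>&1 | has_cuda_support && echo "CUDA support compiled in"
```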
Configure Secondary Interface in Virtual Machines
Inside our virtual machines we need to confirm the device is visible and then configure IP addresses on the interfaces. First we can look at the mst status output.
$ mst status -v
MST modules:
------------
MST PCI module is not loaded
MST PCI configuration module is not loaded
PCI devices:
------------
DEVICE_TYPE MST PCI RDMA NET NUMA
BlueField3(rev:1) NA 09:00.0 mlx5_0 net-eth1 -1
Next we can find our eth1 interface.
$ nmcli con show
NAME UUID TYPE DEVICE
System eth0 5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03 ethernet eth0
Wired connection 1 6ca36168-6830-3427-8853-c89c61c8b70b ethernet eth1
lo 7d903415-466e-41c0-9e52-062f9a33270c loopback lo
We will bring down the interface.
$ nmcli con down "Wired connection 1"
Connection 'Wired connection 1' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/3)
Modify the interface to add a static IP address and an MTU of 9000.
$ nmcli con modify "Wired connection 1" ipv4.method manual ipv4.addresses 192.168.12.2/24 ethernet.mtu 9000
Then bring the interface back up.
$ nmcli con up "Wired connection 1"
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/4)
This completes setting up the network connectivity inside the virtual machines.
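Before running any tests it can be worth verifying the 9000 MTU end to end, for example with a non-fragmentable ping at the largest payload that fits. The payload size is the MTU minus the IPv4 and ICMP headers, as the small sketch below computes:

```shell
# Largest ICMP payload that fits in a 9000-byte MTU:
# 9000 - 20 (IPv4 header) - 8 (ICMP header) = 8972
mtu=9000
payload=$((mtu - 20 - 8))
echo "$payload"   # prints 8972

# From one VM to the other (addresses from our example network):
#   ping -c 3 -M do -s "$payload" 192.168.12.1
```

If the ping fails with "message too long", jumbo frames are not active somewhere along the path.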
Run Performance Tests
Now that we have configured our virtual machines with all the requirements we can run some tests to confirm that GPUDirect RDMA is working properly. To do this we will use the perftest tooling we built and run the ib_write_bw
command. This requires opening two console sessions, one into each virtual machine. In the first console session we will run the listener ib_write_bw
command and in the second we will run the initiator. In the first VM we will run the following command.
$ sudo /home/cloud-user/perftest/ib_write_bw -R -T 41 -s 65536 -F -x 3 -m 4096 --report_gbits -q 16 -D 60 -d mlx5_0 -p 10000 --source_ip 192.168.12.1
WARNING: BW peak won't be measured in this run.
Perftest doesn't supports CUDA tests with inline messages: inline size set to 0
************************************
* Waiting for client to connect... *
************************************
The second VM should have the following command.
$ sudo /home/cloud-user/perftest/ib_write_bw -R -T 41 -s 65536 -F -x 3 -m 4096 --report_gbits -q 16 -D 60 -d mlx5_0 -p 10000 --source_ip 192.168.12.2 192.168.12.1
If we go back to the first VM's console screen we should see output similar to the results of our run below.
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 16 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON Lock-free : OFF
ibv_wr* API : ON Using DDP : OFF
CQ Moderation : 1
CQE Poll Batch : 16
Mtu : 4096[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : ON
Data ex. method : rdma_cm TOS : 41
---------------------------------------------------------------------------------------
Waiting for client rdma_cm QP to connect
Please run the same command with the IB/RoCE interface IP
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x0129 PSN 0x40432d
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x012a PSN 0xfa7213
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x012b PSN 0x152561
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x012d PSN 0x28af9c
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x012e PSN 0x5aa37f
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x012f PSN 0x280c75
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0130 PSN 0x42d30
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0131 PSN 0x659969
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0132 PSN 0xb18159
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0133 PSN 0x9c8667
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0134 PSN 0x6af97f
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0135 PSN 0x315a6
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0136 PSN 0xd4499a
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0137 PSN 0xc79c3f
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0138 PSN 0xf1a591
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0139 PSN 0x999481
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
remote address: LID 0000 QPN 0x004b PSN 0xe23c56
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x004c PSN 0x985e88
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x004d PSN 0x9ff132
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x004e PSN 0xb3d99
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0050 PSN 0x9e6638
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0051 PSN 0x802b3a
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0052 PSN 0xca4511
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0053 PSN 0x9dea36
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0054 PSN 0x4016a2
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0055 PSN 0xbdac7c
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0056 PSN 0x1b0e70
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0057 PSN 0xf88643
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0058 PSN 0x3e4a73
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0059 PSN 0xb8eea4
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x005a PSN 0xd47892
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x005b PSN 0xac51ee
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[Gb/sec] BW average[Gb/sec] MsgRate[Mpps]
65536 22422716 0.00 391.87 0.747423
---------------------------------------------------------------------------------------
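As a sanity check on the report, the MsgRate column is simply the average bandwidth divided by the message size in bits; for our host-memory run the numbers line up closely with the reported 0.747423 Mpps.

```shell
# MsgRate[Mpps] = BW_average[Gb/s] / (message size in bits) / 1e6
# For the run above: 391.87 Gb/s average with 65536-byte messages.
awk 'BEGIN { printf "%.3f\n", 391.87e9 / (65536 * 8) / 1e6 }'   # prints 0.747
```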
Next we can run another test in our corresponding virtual machines, adding the switches --use_cuda=0 --use_cuda_dmabuf
to our original commands. This ensures we are now testing with the GPU and using DMA-BUF. Note that DMA-BUF is now preferred over the nvidia-peermem module. The first VM should have the following command.
$ sudo /home/cloud-user/perftest/ib_write_bw -R -T 41 -s 65536 -F -x 3 -m 4096 --report_gbits -q 16 -D 60 -d mlx5_0 -p 10000 --source_ip 192.168.12.1 --use_cuda=0 --use_cuda_dmabuf
WARNING: BW peak won't be measured in this run.
Perftest doesn't supports CUDA tests with inline messages: inline size set to 0
************************************
* Waiting for client to connect... *
************************************
The second VM should have the following command.
$ sudo /home/cloud-user/perftest/ib_write_bw -R -T 41 -s 65536 -F -x 3 -m 4096 --report_gbits -q 16 -D 60 -d mlx5_0 -p 10000 --source_ip 192.168.12.2 192.168.12.1 --use_cuda=0 --use_cuda_dmabuf
If we go back to the first VM's console screen we should see output similar to the results of our run below.
************************************
* Waiting for client to connect... *
************************************
initializing CUDA
Listing all CUDA devices in system:
CUDA device 0: PCIe address is 0A:00
Picking device No. 0
[pid = 2206, dev = 0] device name = [NVIDIA A40]
creating CUDA Ctx
making it the current CUDA Ctx
CUDA device integrated: 0
cuMemAlloc() of a 2097152 bytes GPU buffer
allocated GPU buffer address at 00007fcdd8600000 pointer=0x7fcdd8600000
using DMA-BUF for GPU buffer address at 0x7fcdd8600000 aligned at 0x7fcdd8600000 with aligned size 2097152
Calling ibv_reg_dmabuf_mr(offset=0, size=2097152, addr=0x7fcdd8600000, fd=40) for QP #0
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 16 Transport type : IB
Connection type : RC Using SRQ : OFF
PCIe relax order: ON Lock-free : OFF
ibv_wr* API : ON Using DDP : OFF
CQ Moderation : 1
CQE Poll Batch : 16
Mtu : 4096[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : ON
Data ex. method : rdma_cm TOS : 41
---------------------------------------------------------------------------------------
Waiting for client rdma_cm QP to connect
Please run the same command with the IB/RoCE interface IP
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x013c PSN 0xcf0fd
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x013d PSN 0x953a3
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x013e PSN 0xd28fb1
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x013f PSN 0x12d3ac
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0140 PSN 0x325e4f
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0141 PSN 0x997705
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0142 PSN 0xcbec80
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0143 PSN 0x13ee79
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0144 PSN 0x181929
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0145 PSN 0x7009f7
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0146 PSN 0x6d5dcf
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0147 PSN 0xe7abb6
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0148 PSN 0xba8e6a
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x0149 PSN 0x65c8cf
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x014a PSN 0x13fee1
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
local address: LID 0000 QPN 0x014b PSN 0x377b91
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:01
remote address: LID 0000 QPN 0x004a PSN 0x70c5e6
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x004b PSN 0xa050d8
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x004c PSN 0x75fd42
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x004d PSN 0xc4c069
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x004e PSN 0xe1f8c8
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0050 PSN 0xb2f28a
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0051 PSN 0x3b0221
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0052 PSN 0x3aca06
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0053 PSN 0xe04232
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0054 PSN 0xd398cc
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0055 PSN 0x808c80
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0056 PSN 0x319313
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0057 PSN 0xcb9f03
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0058 PSN 0x9f4ff4
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x0059 PSN 0x19c7a2
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
remote address: LID 0000 QPN 0x005a PSN 0xf75bbe
GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:12:02
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[Gb/sec] BW average[Gb/sec] MsgRate[Mpps]
65536 10383932 0.00 181.47 0.346131
---------------------------------------------------------------------------------------
deallocating GPU buffer 00007fcdd8600000
destroying current CUDA Ctx
This concludes the workflow for testing GPUDirect RDMA inside of OpenShift Virtualization virtual machines.