GPU support for Kubernetes workloads

MKE provides GPU support for Kubernetes workloads running on Linux worker nodes. This exercise walks you through setting up your system to use underlying GPU support, and through deploying GPU-targeted workloads.

Installing the GPU drivers

GPU drivers are required for setting up GPU support. The installation of these drivers can occur either before or after the installation of MKE.

The following procedure installs the NVIDIA driver by way of a runfile on your Linux host. Note that this procedure uses driver version 440.59, the latest available and verified version at the time of writing.


This procedure describes how to manually install these drivers, but it is recommended that you use a pre-existing automation system to automate installation and patching of the drivers along with the kernel and other host software.

  1. Check that your system contains an NVIDIA GPU:

    lspci | grep -i nvidia
  2. Verify that your GPU is a supported NVIDIA GPU product.

  3. Install dependencies.

  4. Verify that your system is up to date and running the latest kernel.

  5. Install the following packages, depending on your OS:

    • Ubuntu:

      sudo apt-get install -y gcc make curl linux-headers-$(uname -r)
    • RHEL:

      sudo yum install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc make curl elfutils-libelf-devel
  6. Ensure that the i2c_core and ipmi_msghandler kernel modules are loaded:

    sudo modprobe -a i2c_core ipmi_msghandler

    To persist the change across reboots:

    echo -e "i2c_core\nipmi_msghandler" | sudo tee /etc/modules-load.d/nvidia.conf

    Ensure that all of the NVIDIA libraries are present under a single directory on the host (referred to below as ${NVIDIA_OPENGL_PREFIX}), and register that directory with the dynamic linker:

    sudo mkdir -p $NVIDIA_OPENGL_PREFIX/lib
    echo "${NVIDIA_OPENGL_PREFIX}/lib" | sudo tee /etc/
    sudo ldconfig
  7. Run the installation (the runfile name shown assumes the version 440.59 installer for x86_64, downloaded from NVIDIA):

    sudo sh NVIDIA-Linux-x86_64-440.59.run --opengl-prefix="${NVIDIA_OPENGL_PREFIX}"
  8. Load the NVIDIA Unified Memory kernel module and create device files for the module on startup:

    sudo tee /etc/systemd/system/nvidia-modprobe.service << END
    [Unit]
    Description=NVIDIA modprobe

    [Service]
    Type=oneshot
    RemainAfterExit=yes
    ExecStart=/usr/bin/nvidia-modprobe -c0 -u

    [Install]
    WantedBy=multi-user.target
    END

    sudo systemctl enable nvidia-modprobe
    sudo systemctl start nvidia-modprobe
  9. Enable the NVIDIA persistence daemon to initialize GPUs and keep them initialized:

    sudo tee /etc/systemd/system/nvidia-persistenced.service << END
    [Unit]
    Description=NVIDIA Persistence Daemon
    Wants=syslog.target

    [Service]
    Type=forking
    PIDFile=/var/run/nvidia-persistenced/nvidia-persistenced.pid
    Restart=always
    ExecStart=/usr/bin/nvidia-persistenced --verbose
    ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced

    [Install]
    WantedBy=multi-user.target
    END

    sudo systemctl enable nvidia-persistenced
    sudo systemctl start nvidia-persistenced
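The two systemd steps above share one pattern: write a unit file with tee and a heredoc, then enable and start the service. The file-writing half can be sketched as follows (a sketch only, writing to a temporary directory rather than /etc/systemd/system so it runs without root):

```shell
# Write a minimal unit file via tee + heredoc, as in steps 8 and 9.
# Illustration only: targets a temp dir instead of /etc/systemd/system.
unit_dir="$(mktemp -d)"
tee "${unit_dir}/nvidia-modprobe.service" > /dev/null << 'END'
[Unit]
Description=NVIDIA modprobe

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-modprobe -c0 -u
END
# Confirm that the file landed where expected.
grep '^Description=' "${unit_dir}/nvidia-modprobe.service"
```

With a real unit under /etc/systemd/system, you would follow this with systemctl enable and systemctl start, as shown in the steps above.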

See the NVIDIA driver installation documentation for more information.

Test the device plugin

MKE includes a GPU device plugin to instrument your GPUs, which is necessary for GPU support. To verify that the plugin is working on a GPU node, describe the node and confirm that the nvidia.com/gpu resource appears under both the Capacity and Allocatable sections:

kubectl describe node <node-name>

Capacity:
  cpu:                8
  ephemeral-storage:  40593612Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             62872884Ki
  nvidia.com/gpu:     1
  pods:               110
Allocatable:
  cpu:                7750m
  ephemeral-storage:  36399308Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             60775732Ki
  nvidia.com/gpu:     1
  pods:               110
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource        Requests    Limits
  --------        --------    ------
  cpu             500m (6%)   200m (2%)
  memory          150Mi (0%)  440Mi (0%)
  nvidia.com/gpu  0           0
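Because the resource name nvidia.com/gpu contains dots, scripting against it requires either kubectl's bracketed jsonpath syntax (used later in this guide) or plain text processing. A hedged sketch against a mocked node JSON fragment:

```shell
# Mocked fragment of a node's status, in the shape returned by
# kubectl get node <node-name> -o json (assumed here: one GPU).
node_json='{"status":{"allocatable":{"cpu":"7750m","nvidia.com/gpu":"1"}}}'

# Extract the GPU count with sed; against a live cluster you would use:
#   kubectl get node <node-name> -o jsonpath="{.status.allocatable['nvidia\.com/gpu']}"
gpus=$(printf '%s' "${node_json}" | sed -n 's/.*"nvidia\.com\/gpu":"\([0-9]*\)".*/\1/p')
echo "${gpus}"   # prints: 1
```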

Scheduling GPU workloads

To consume GPUs from your container, request the nvidia.com/gpu resource in the limits section of the container spec. The following example shows how to deploy a simple workload that reports detected NVIDIA CUDA devices.

  1. Create the example deployment:

    kubectl apply -f- <<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      creationTimestamp: null
      labels:
        run: gpu-test
      name: gpu-test
    spec:
      replicas: 1
      selector:
        matchLabels:
          run: gpu-test
      template:
        metadata:
          labels:
            run: gpu-test
        spec:
          containers:
          - command:
            - sh
            - -c
            - "deviceQuery && sleep infinity"
            image: kshatrix/gpu-example:cuda-10.2
            name: gpu-test
            resources:
              limits:
                nvidia.com/gpu: 1
    EOF
  2. If your system has available GPUs, the Pod is scheduled on one of them. After some time, the Pod should be in the Running state:

    kubectl get po
    NAME                        READY   STATUS    RESTARTS   AGE
    gpu-test-747d746885-hpv74   1/1     Running   0          14m
  3. Check the logs and look for Result = PASS to verify successful completion:

    kubectl logs <name of the pod>
    deviceQuery Starting...
    CUDA Device Query (Runtime API) version (CUDART static linking)
    Detected 1 CUDA Capable device(s)
    Device 0: "Tesla V100-SXM2-16GB"
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
    Result = PASS
  4. Determine the overall GPU capacity of your cluster by inspecting its nodes:

    echo $(kubectl get nodes -l com.docker.ucp.gpu.nvidia="true" -o jsonpath="0{range .items[*]}+{.status.allocatable['nvidia\.com/gpu']}{end}") | bc
  5. Set the replica count to the total number of available GPUs (N, from the previous step) to acquire them all:

    kubectl scale deployment/gpu-test --replicas N
  6. Verify that all of the replicas are scheduled:

    kubectl get po
    NAME                        READY   STATUS    RESTARTS   AGE
    gpu-test-747d746885-hpv74   1/1     Running   0          12m
    gpu-test-747d746885-swrrx   1/1     Running   0          11m

If you attempt to add an additional replica, scheduling fails with a FailedScheduling error and an Insufficient nvidia.com/gpu message:

kubectl scale deployment/gpu-test --replicas N+1

kubectl get po
NAME                        READY   STATUS    RESTARTS   AGE
gpu-test-747d746885-hpv74   1/1     Running   0          14m
gpu-test-747d746885-swrrx   1/1     Running   0          13m
gpu-test-747d746885-zgwfh   0/1     Pending   0          3m26s

Run kubectl describe po gpu-test-747d746885-zgwfh to see why the Pod cannot be scheduled:

Type     Reason            Age        From               Message
----     ------            ----       ----               -------
Warning  FailedScheduling  <unknown>  default-scheduler  0/2 nodes are available: 2 Insufficient nvidia.com/gpu.

Remove the deployment and corresponding pods:

kubectl delete deployment gpu-test

See also