GPU support for Kubernetes workloads

MKE provides graphics processing unit (GPU) support for Kubernetes workloads that run on Linux worker nodes. This topic describes how to configure your system to use and deploy NVIDIA GPUs.

Install the GPU drivers

GPU support requires that you install GPU drivers, which you can do either before or after installing MKE. The procedure below installs the NVIDIA driver from a runfile on your Linux host.


This procedure describes how to manually install the GPU drivers. However, Mirantis recommends that you use a pre-existing automation system to automate the installation and patching of the drivers, along with the kernel and other host software.

  1. Enable the NVIDIA GPU device plugin by setting nvidia_device_plugin to true in the MKE configuration file.
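
The MKE configuration file is TOML; assuming the default section layout, the flag sits under [cluster_config]. A minimal fragment, with all other options omitted:

```toml
[cluster_config]
  # Enable the NVIDIA GPU device plugin on Linux worker nodes
  nvidia_device_plugin = true
```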

  2. Verify that your system supports NVIDIA GPU:

    lspci | grep -i nvidia
  3. Verify that your GPU is a supported NVIDIA GPU Product.

  4. Install all the dependencies listed in the NVIDIA Minimum Requirements.

  5. Verify that your system is up to date and that you are running the latest kernel version.
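
The driver installer compiles a kernel module against the running kernel, so the headers installed in the next step must match `uname -r` exactly. A quick sketch of that check (the two directories are the usual Debian/Ubuntu and RHEL header locations):

```shell
#!/bin/sh
# The NVIDIA runfile builds a kernel module against the running kernel,
# so kernel headers matching `uname -r` must be present.
RUNNING_KERNEL="$(uname -r)"
echo "Running kernel: ${RUNNING_KERNEL}"

# Debian/Ubuntu and RHEL header locations, respectively
if [ -d "/usr/src/linux-headers-${RUNNING_KERNEL}" ] || \
   [ -d "/usr/src/kernels/${RUNNING_KERNEL}" ]; then
    echo "Matching kernel headers: found"
else
    echo "Matching kernel headers: missing" >&2
fi
```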

  6. Install the following packages:

    • Ubuntu:

      sudo apt-get install -y gcc make curl linux-headers-$(uname -r)
    • RHEL:

      sudo yum install -y kernel-devel-$(uname -r) \
      kernel-headers-$(uname -r) gcc make curl elfutils-libelf-devel
  7. Verify that the i2c_core and ipmi_msghandler kernel modules are loaded:

    sudo modprobe -a i2c_core ipmi_msghandler
  8. Persist the change across reboots:

    echo -e "i2c_core\nipmi_msghandler" | sudo tee /etc/modules-load.d/nvidia.conf
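
The command above writes a plain list, one module name per line, which systemd-modules-load reads at boot. The resulting /etc/modules-load.d/nvidia.conf contains:

```
i2c_core
ipmi_msghandler
```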
  9. Choose a prefix under which the NVIDIA libraries will be installed on the host, create its lib directory, and register it with the dynamic linker:

    export NVIDIA_OPENGL_PREFIX=/opt/kubernetes/nvidia
    sudo mkdir -p $NVIDIA_OPENGL_PREFIX/lib
    echo "${NVIDIA_OPENGL_PREFIX}/lib" | sudo tee /etc/ld.so.conf.d/nvidia.conf
    sudo ldconfig
  10. Download and install the NVIDIA GPU driver:

    export NVIDIA_DRIVER_VERSION=<version-number>
    curl -LSf "https://us.download.nvidia.com/XFree86/Linux-x86_64/${NVIDIA_DRIVER_VERSION}/NVIDIA-Linux-x86_64-${NVIDIA_DRIVER_VERSION}.run" -o nvidia.run
    sudo sh nvidia.run --opengl-prefix="${NVIDIA_OPENGL_PREFIX}"

    Set <version-number> to the NVIDIA driver version of your choice.

  11. Load the NVIDIA Unified Memory kernel module and create device files for the module on startup:

    sudo tee /etc/systemd/system/nvidia-modprobe.service << END
    [Unit]
    Description=NVIDIA modprobe

    [Service]
    Type=oneshot
    RemainAfterExit=true
    ExecStart=/usr/bin/nvidia-modprobe -c0 -u

    [Install]
    WantedBy=multi-user.target
    END

    sudo systemctl enable nvidia-modprobe
    sudo systemctl start nvidia-modprobe
  12. Enable the NVIDIA persistence daemon to initialize GPUs and keep them initialized:

    sudo tee /etc/systemd/system/nvidia-persistenced.service << END
    [Unit]
    Description=NVIDIA Persistence Daemon

    [Service]
    Type=forking
    ExecStart=/usr/bin/nvidia-persistenced --verbose
    ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced

    [Install]
    WantedBy=multi-user.target
    END

    sudo systemctl enable nvidia-persistenced
    sudo systemctl start nvidia-persistenced
  13. Test the device plugin by reviewing the node description and verifying that the node reports the nvidia.com/gpu resource:

    kubectl describe node <node-name>

    Example output:

    Capacity:
      cpu:                8
      ephemeral-storage:  40593612Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             62872884Ki
      nvidia.com/gpu:     1
      pods:               110
    Allocatable:
      cpu:                7750m
      ephemeral-storage:  36399308Ki
      hugepages-1Gi:      0
      hugepages-2Mi:      0
      memory:             60775732Ki
      nvidia.com/gpu:     1
      pods:               110
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      Resource        Requests    Limits
      --------        --------    ------
      cpu             500m (6%)   200m (2%)
      memory          150Mi (0%)  440Mi (0%)
      nvidia.com/gpu  0           0

Schedule GPU workloads

The following example describes how to deploy a simple workload that reports detected NVIDIA CUDA devices.

  1. Create a practice Deployment that requests nvidia.com/gpu in the limits section. The Pod will be scheduled on any node with an available GPU.

    kubectl apply -f- <<EOF
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      creationTimestamp: null
      labels:
        run: gpu-test
      name: gpu-test
    spec:
      replicas: 1
      selector:
        matchLabels:
          run: gpu-test
      template:
        metadata:
          labels:
            run: gpu-test
        spec:
          containers:
          - command:
            - sh
            - -c
            - "deviceQuery && sleep infinity"
            image: kshatrix/gpu-example:cuda-10.2
            name: gpu-test
            resources:
              limits:
                nvidia.com/gpu: 1
    EOF
  2. Verify that it is in the Running state:

    kubectl get pods | grep "gpu-test"

    Example output:

    NAME                        READY   STATUS    RESTARTS   AGE
    gpu-test-747d746885-hpv74   1/1     Running   0          14m
  3. Review the logs. The presence of Result = PASS indicates a successful deployment:

    kubectl logs <name of the pod>

    Example output:

    deviceQuery Starting...
    CUDA Device Query (Runtime API) version (CUDART static linking)
    Detected 1 CUDA Capable device(s)
    Device 0: "Tesla V100-SXM2-16GB"
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
    Result = PASS
  4. Determine the overall GPU capacity of your cluster by inspecting its nodes:

    echo $(kubectl get nodes -l com.docker.ucp.gpu.nvidia="true" \
    -o jsonpath="0{range .items[*]}+{.status.allocatable['nvidia\.com/gpu']}{end}") | bc
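
The jsonpath template above emits one "+<count>" fragment per GPU node, producing an arithmetic expression such as 0+1+1 that bc then evaluates; the leading 0 keeps the expression valid even when no node reports GPUs. The same evaluation on simulated jsonpath output, using shell arithmetic in place of bc:

```shell
#!/bin/sh
# Simulated jsonpath output: one "+<count>" fragment per GPU node
jsonpath_out="+1+2+1"
# The leading 0 keeps the expression valid when no node reports GPUs
total=$(( 0${jsonpath_out} ))
echo "Total GPUs: ${total}"    # prints: Total GPUs: 4
```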
  5. Set the proper replica number to acquire all available GPUs, with N equal to the total from the previous step:

    kubectl scale deployment/gpu-test --replicas=N
  6. Verify that all of the replicas are scheduled:

    kubectl get pods | grep "gpu-test"

    Example output:

    NAME                        READY   STATUS    RESTARTS   AGE
    gpu-test-747d746885-hpv74   1/1     Running   0          12m
    gpu-test-747d746885-swrrx   1/1     Running   0          11m
  7. Remove the Deployment and corresponding Pods:

    kubectl delete deployment gpu-test


If you attempt to add an additional replica beyond the available GPU capacity, the new Pod fails to schedule with a FailedScheduling error that reports Insufficient nvidia.com/gpu.

  1. Add an additional replica:

    kubectl scale deployment/gpu-test --replicas=N+1
    kubectl get pods | grep "gpu-test"

    Example output:

    NAME                        READY   STATUS    RESTARTS   AGE
    gpu-test-747d746885-hpv74   1/1     Running   0          14m
    gpu-test-747d746885-swrrx   1/1     Running   0          13m
    gpu-test-747d746885-zgwfh   0/1     Pending   0          3m26s
  2. Review the status of the pending Pod:

    kubectl describe po gpu-test-747d746885-zgwfh

    Example output:

    Type     Reason            Age        From               Message
    ----     ------            ----       ----               -------
    Warning  FailedScheduling  <unknown>  default-scheduler  0/2 nodes are available: 2 Insufficient nvidia.com/gpu.