Docker Enterprise provides GPU support for Kubernetes workloads. This exercise walks you through setting up your system to use underlying GPU support, and through deploying GPU-targeted workloads.
To complete the steps, you need a Docker Hub account and an Amazon AWS account or equivalent. The instructions use AWS instances, but you can complete them on any platform supported by Docker Enterprise.
This section describes how to install an MKE cluster with one or more Linux instances. You will use this cluster in the remaining steps.
Create the first Linux instance of a two-node, Linux-only MKE cluster using the steps at https://aws.amazon.com/getting-started/tutorials/launch-a-virtual-machine, selecting the Ubuntu 18.04 AMI.
Log into your Linux instance and install Mirantis Container Runtime.
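If you do not already have a standard way to install MCR, the following apt-based sketch shows the general shape of the installation on Ubuntu. The repository URL placeholder and the stable-19.03 channel are assumptions; take the exact values from the Mirantis Container Runtime installation documentation for your subscription and OS.
$ export MCR_URL="<MCR apt repository URL from the Mirantis docs>"   # hypothetical placeholder, not a real URL
$ curl -fsSL "${MCR_URL}/gpg" | sudo apt-key add -                   # trust the MCR package signing key
$ sudo add-apt-repository "deb [arch=amd64] ${MCR_URL} $(lsb_release -cs) stable-19.03"
$ sudo apt-get update && sudo apt-get install -y docker-ee docker-ee-cli containerd.io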
Install MKE 3.3.0 on this first Linux instance:
Download the MKE offline bundle using the following command:
$ curl -o ucp_images.tar.gz https://packages.docker.com/caas/ucp_images_3.3.0.tar.gz
Load the MKE images using the following command:
$ docker load < ucp_images.tar.gz
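Optionally, confirm that the images are now in the local image cache:
$ docker image ls docker/ucp   # should list the docker/ucp:3.3.0 image you just loaded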
Run the following command to install MKE. Substitute your password for <password> and the public IP address of your VM for the <public IP> placeholder.
$ docker container run \
--rm \
--interactive \
--tty \
--name ucp \
--volume /var/run/docker.sock:/var/run/docker.sock \
docker/ucp:3.3.0 \
install \
--admin-password <password> \
--debug \
--force-minimums \
--san <public IP>
On completion of the command, you will have a single-node MKE cluster with the Linux instance as its Manager node.
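You can confirm this from the instance itself (an optional check):
$ docker node ls   # should show a single node whose MANAGER STATUS is Leader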
From a browser on your local system, log in to the MKE web UI that you just installed.
Navigate to the nodes list and click on Add Node at the top right of the page.
On the Add Node page, select Linux as the node type and Worker as the node role.
Optionally, you may also select and set custom listen and advertise addresses.
A command line will be generated that includes a join-token. It should look something like:
docker swarm join ... --token <join-token> ...
Copy this command line from the UI for use later.
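The generated command typically has the following shape; the token and address shown here are placeholders rather than values from your cluster:
docker swarm join --token SWMTKN-1-<join-token> <manager address>:2377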
For each additional Linux instance that you need to add to the cluster, do the following:
Create the instance using the steps at https://aws.amazon.com/getting-started/tutorials/launch-a-virtual-machine, selecting an Ubuntu 16.04 or 18.04 AMI.
Log into your Linux instance and install Mirantis Container Runtime.
Download the MKE offline bundle using the following command:
$ curl -o ucp_images.tar.gz https://packages.docker.com/caas/ucp_images_3.3.0.tar.gz
Load the MKE images using the following command:
$ docker load < ucp_images.tar.gz
Add your Linux instance to the MKE cluster by running the swarm join command line that you copied earlier.
To access your MKE cluster, you need both the docker CLI and the kubectl CLI configured on your local system.
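One common way to configure both CLIs against the cluster is an MKE client bundle, which you can download from your user profile in the MKE web UI. The sketch below assumes the bundle was saved as ucp-bundle-admin.zip and that you are using bash; the exact file name may differ.
unzip ucp-bundle-admin.zip -d mke-bundle && cd mke-bundle
eval "$(<env.sh)"      # sets DOCKER_HOST, TLS variables, and KUBECONFIG for the current shell
docker version         # should now report the MKE cluster rather than a local engine
kubectl get nodes      # should list the manager and worker nodes you added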
Perform the following steps once you are finished installing the MKE cluster.
GPU drivers are required for GPU support, and you can install them either before or after installing MKE.
The following procedure installs the NVIDIA driver by way of a runfile on your Linux host. It uses driver version 440.59, the latest available and verified version at the time of this writing.
Note
This procedure describes how to install the drivers manually, but it is recommended that you use your existing automation system to install and patch the drivers along with the kernel and other host software.
Ensure that your NVIDIA GPU is supported:
lspci | grep -i nvidia
Verify that your GPU is a supported NVIDIA GPU Product.
Install dependencies.
Verify that your system is up-to-date, and you are running the latest kernel.
Install the following packages depending on your OS.
Ubuntu:
sudo apt-get install -y gcc make curl linux-headers-$(uname -r)
RHEL:
sudo yum install -y kernel-devel-$(uname -r) kernel-headers-$(uname -r) gcc make curl elfutils-libelf-devel
Ensure that i2c_core and ipmi_msghandler kernel modules are loaded:
sudo modprobe -a i2c_core ipmi_msghandler
To persist the change across reboots:
echo -e "i2c_core\nipmi_msghandler" | sudo tee /etc/modules-load.d/nvidia.conf
Install all of the NVIDIA libraries under a single directory on the host and register that directory with the dynamic linker:
NVIDIA_OPENGL_PREFIX=/opt/kubernetes/nvidia
sudo mkdir -p $NVIDIA_OPENGL_PREFIX/lib
echo "${NVIDIA_OPENGL_PREFIX}/lib" | sudo tee /etc/ld.so.conf.d/nvidia.conf
sudo ldconfig
Download and run the installer:
NVIDIA_DRIVER_VERSION=440.59
curl -LSf https://us.download.nvidia.com/XFree86/Linux-x86_64/${NVIDIA_DRIVER_VERSION}/NVIDIA-Linux-x86_64-${NVIDIA_DRIVER_VERSION}.run -o nvidia.run
sudo sh nvidia.run --opengl-prefix="${NVIDIA_OPENGL_PREFIX}"
Note
The --opengl-prefix option must be set to /opt/kubernetes/nvidia.
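Once the installer finishes, you can confirm that the driver is loaded (an optional check):
nvidia-smi   # should report driver version 440.59 and list the installed GPU(s)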
Load the NVIDIA Unified Memory kernel module and create device files for the module on startup:
sudo tee /etc/systemd/system/nvidia-modprobe.service << END
[Unit]
Description=NVIDIA modprobe
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/nvidia-modprobe -c0 -u
[Install]
WantedBy=multi-user.target
END
sudo systemctl enable nvidia-modprobe
sudo systemctl start nvidia-modprobe
Enable the NVIDIA persistence daemon to initialize GPUs and keep them initialized:
sudo tee /etc/systemd/system/nvidia-persistenced.service << END
[Unit]
Description=NVIDIA Persistence Daemon
Wants=syslog.target
[Service]
Type=forking
PIDFile=/var/run/nvidia-persistenced/nvidia-persistenced.pid
Restart=always
ExecStart=/usr/bin/nvidia-persistenced --verbose
ExecStopPost=/bin/rm -rf /var/run/nvidia-persistenced
[Install]
WantedBy=multi-user.target
END
sudo systemctl enable nvidia-persistenced
sudo systemctl start nvidia-persistenced
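To confirm that both units are active (an optional check):
systemctl is-active nvidia-modprobe nvidia-persistenced   # both should report: active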
See Driver Persistence (https://docs.nvidia.com/deploy/driver-persistence/index.html) for more information.
MKE includes a GPU device plugin that exposes GPUs to Kubernetes as the nvidia.com/gpu resource, which is necessary for GPU support. Verify that the GPU is reported in the capacity of the node on which you installed the drivers:
kubectl describe node <node-name>
...
Capacity:
  cpu:                8
  ephemeral-storage:  40593612Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             62872884Ki
  nvidia.com/gpu:     1
  pods:               110
Allocatable:
  cpu:                7750m
  ephemeral-storage:  36399308Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             60775732Ki
  nvidia.com/gpu:     1
  pods:               110
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource        Requests    Limits
  --------        --------    ------
  cpu             500m (6%)   200m (2%)
  memory          150Mi (0%)  440Mi (0%)
  nvidia.com/gpu  0           0
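To see the allocatable GPU count per node at a glance, you can adapt the jsonpath query used later in this exercise (an optional check; nodes without GPUs show an empty value):
kubectl get nodes -o jsonpath="{range .items[*]}{.metadata.name}={.status.allocatable['nvidia\.com/gpu']} {end}"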
To consume GPUs from your container, request nvidia.com/gpu in the limits section. The following example shows how to deploy a simple workload that reports detected NVIDIA CUDA devices.
Create the example deployment:
kubectl apply -f- <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    run: gpu-test
  name: gpu-test
spec:
  replicas: 1
  selector:
    matchLabels:
      run: gpu-test
  template:
    metadata:
      labels:
        run: gpu-test
    spec:
      containers:
      - command:
        - sh
        - -c
        - "deviceQuery && sleep infinity"
        image: kshatrix/gpu-example:cuda-10.2
        name: gpu-test
        resources:
          limits:
            nvidia.com/gpu: "1"
EOF
If your cluster has available GPUs, the pod is scheduled on a node that has one. After a short time, the pod should be in the Running state:
NAME READY STATUS RESTARTS AGE
gpu-test-747d746885-hpv74 1/1 Running 0 14m
Check the logs and look for Result = PASS to verify successful completion:
kubectl logs <name of the pod>
deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Tesla V100-SXM2-16GB"
...
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS
Determine the overall GPU capacity of your cluster by inspecting its nodes:
echo $(kubectl get nodes -l com.docker.ucp.gpu.nvidia="true" -o jsonpath="0{range .items[*]}+{.status.allocatable['nvidia\.com/gpu']}{end}") | bc
To acquire all available GPUs, scale the deployment so that the replica count (N in the following command) matches the total GPU capacity:
kubectl scale deployment/gpu-test --replicas N
Verify that all of the replicas are scheduled:
kubectl get po
NAME READY STATUS RESTARTS AGE
gpu-test-747d746885-hpv74 1/1 Running 0 12m
gpu-test-747d746885-swrrx 1/1 Running 0 11m
If you attempt to add one more replica, the extra pod remains Pending with a FailedScheduling event and an Insufficient nvidia.com/gpu message:
kubectl scale deployment/gpu-test --replicas N+1
kubectl get po
NAME READY STATUS RESTARTS AGE
gpu-test-747d746885-hpv74 1/1 Running 0 14m
gpu-test-747d746885-swrrx 1/1 Running 0 13m
gpu-test-747d746885-zgwfh 0/1 Pending 0 3m26s
Run kubectl describe po gpu-test-747d746885-zgwfh to see why the pod cannot be scheduled:
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/2 nodes are available: 2 Insufficient nvidia.com/gpu.
Remove the deployment and corresponding pods:
kubectl delete deployment gpu-test
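To confirm that the pods have been removed (an optional check):
kubectl get pods -l run=gpu-test   # should show no gpu-test pods once termination completes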