Upgrade an MKE installation

This section helps you upgrade Mirantis Kubernetes Engine (MKE).

Note

Kubernetes Ingress cannot be deployed on a cluster after MKE is upgraded from 3.2.6 to 3.3.0. A fresh install of 3.3.0 is not impacted. For more information about how to reproduce and work around the issue, see the release notes.

Before upgrading to a new version of MKE, review the MKE 3.3.x Release Notes for details on new features and any other information that may be relevant to upgrading to a particular version.

As part of the upgrade process, you’ll upgrade the Mirantis Container Runtime installed on each node of the cluster to version 19.03 or higher. You should plan for the upgrade to take place outside of business hours, to ensure there’s minimal impact to your users.

Also, don’t make changes to MKE configurations while you’re upgrading it. This can lead to misconfigurations that are difficult to troubleshoot.

Environment checklist

Complete the checks as detailed in the following areas:

Systems

  • Confirm time sync across all nodes (and check time daemon logs for any large time drift)

  • Check the system requirements: production deployments need 4 vCPU and 16 GB of memory for MKE managers and MSR replicas

  • Review the full MKE, MSR, and MCR port requirements

  • Ensure that your cluster nodes meet the minimum requirements

  • Before performing any upgrade, ensure that you meet all minimum requirements listed in MKE System requirements, including port openings (MKE 3.x added more required ports for Kubernetes), memory, and disk space. For example, manager nodes must have at least 8GB of memory.

Note

If you are upgrading a cluster to UCP 3.0.2 or higher on Microsoft Azure, please ensure that all of the Azure prerequisites are met.

Storage

  • Check /var/ storage allocation and increase if it is over 70% usage.

  • In addition, check all nodes’ local file systems for any disk storage issues (and MSR back-end storage, for example, NFS).

  • If you are not using the Overlay2 storage driver, take this opportunity to switch to it for improved stability. Note that the transition from Device Mapper to Overlay2 is a destructive rebuild.
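
The 70% check on /var can be scripted and run on each node. The following is a minimal sketch, assuming a POSIX shell with df and awk available. The function names fs_usage_pct and over_threshold are hypothetical helpers introduced for illustration and are not part of MKE.

```shell
# Print the usage percentage (digits only) of the file system that holds path $1.
fs_usage_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage $1 is above threshold $2 (default: 70, as in the checklist).
over_threshold() {
  [ "$1" -gt "${2:-70}" ]
}

# Report on /var for the current node.
pct=$(fs_usage_pct /var)
if [ -z "$pct" ]; then
  echo "could not read usage for /var"
elif over_threshold "$pct"; then
  echo "/var is at ${pct}% usage: increase the allocation before upgrading"
else
  echo "/var is at ${pct}% usage"
fi
```

The same helpers can be pointed at other mount points, such as the MSR back-end storage, by passing a different path to fs_usage_pct.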

Operating system

  • If the cluster nodes run an older OS branch (Ubuntu 14.x, RHEL 7.3, and so on), consider patching all relevant packages, including the kernel, to the most recent version.

  • Perform a rolling restart of each node before the upgrade, to confirm that the in-memory settings match the startup scripts.

  • Run check-config.sh on each cluster node after the rolling restart to detect any kernel compatibility issues.

Procedural

  • Perform Swarm, MKE and MSR backups before upgrading

  • Gather Compose file/service/stack files

  • Generate an MKE support dump (as a point-in-time record) before upgrading

  • Preinstall MKE, MSR, and MCR images. If your cluster is offline (with no connection to the internet), Mirantis provides tarballs containing all of the required container images. If your cluster is online, you can pull the required container images onto your nodes with the following command:

    $ docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
    mirantis/ucp:3.3.11 images --list | xargs -L 1 docker pull
    
  • Load troubleshooting packages (netshoot, etc)

  • Best order for upgrades: MCR, MKE, and then MSR. Note that the scope of this section is limited to upgrade instructions for MKE.

Set upgrade strategy

Important

In all upgrade workflows, manager nodes are automatically upgraded in place. You cannot control the order of manager node upgrades.

For each worker node that requires an upgrade, you can upgrade that node in place or you can replace the node with a new worker node. The type of upgrade you perform depends on what is needed for each node:

  • Automated, in-place cluster upgrade: Performed on any manager node. Automatically upgrades the entire cluster.

  • Manual cluster upgrade, existing nodes in place: Automatically upgrades manager nodes and allows you to control the upgrade order of worker nodes. This type of upgrade is more advanced than the automated, in-place cluster upgrade.

  • Manual cluster upgrade, replace all worker nodes using blue-green deployment: Performed using the CLI. Allows you to stand up a new cluster in parallel to the current one and cut over when complete. You can join new worker nodes, schedule workloads to run on the new nodes, pause, drain, and remove old worker nodes in batches of multiple nodes rather than one at a time, and shut down servers to remove worker nodes. This type of upgrade is the most advanced.

Back up your cluster

Before starting an upgrade, make sure that your cluster is healthy. If a problem occurs, this makes it easier to find and troubleshoot it.
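
One possible first check from the command line is to look for swarm nodes that are not in the Ready state. The following is a minimal sketch, assuming it runs on a manager node. The helper name unhealthy_nodes is hypothetical, and the check covers swarm node status only, not the full health view that the MKE web interface provides.

```shell
# Print the hostname of every node whose swarm status is not "Ready".
# Expects lines of "<hostname> <status> <availability>" on standard input.
unhealthy_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# On a manager node, feed the helper live data (skipped when docker is absent).
if command -v docker >/dev/null 2>&1; then
  docker node ls --format '{{.Hostname}} {{.Status}} {{.Availability}}' | unhealthy_nodes
fi
```

An empty result means every node reports Ready at the swarm level.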

Create a backup of your cluster. This allows you to recover if something goes wrong during the upgrade process.

Note

The backup archive is version-specific, so you can’t use it during the upgrade process. For example, if you create a backup archive for a UCP 2.2 cluster, you can’t use the archive file after you upgrade to UCP 3.0.

Upgrade Mirantis Container Runtime

For each node that is part of your cluster, upgrade the Mirantis Container Runtime installed on that node to Mirantis Container Runtime version 19.03 or higher.

Start with the manager nodes, and then move on to the worker nodes. On each node:

  1. Log in to the node using SSH.

  2. Upgrade the Mirantis Container Runtime to version 19.03 or higher.

  3. Make sure the node is healthy. In your browser, navigate to Nodes in the MKE web interface, and check that the node is healthy and is part of the cluster.
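
After step 2, the engine version on a node can be compared against the 19.03 minimum from the command line. The following is a minimal sketch, assuming GNU sort (for the -V version ordering) is available. The helper name version_at_least is hypothetical.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Relies on "sort -V" placing the lower version first.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Check the engine on this node (skipped when the docker CLI is not installed).
if command -v docker >/dev/null 2>&1; then
  engine_version=$(docker version --format '{{.Server.Version}}' 2>/dev/null)
  if version_at_least "$engine_version" 19.03; then
    echo "engine $engine_version meets the 19.03 minimum"
  else
    echo "engine '${engine_version:-unknown}' is below 19.03 or could not be read"
  fi
fi
```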

Perform the MKE upgrade

There are three methods for upgrading MKE to version 3.3.11, all of which make use of the CLI:

  • Automated in-place cluster upgrade

  • Phased in-place cluster upgrade

  • Replacement of existing worker nodes using blue-green deployment

Automated in-place cluster upgrade

The automated in-place cluster upgrade approach updates all MKE components on all nodes within the MKE cluster. The upgrade is done node by node, but once the user has initiated an upgrade, it works its way through the entire cluster. This is the traditional approach to upgrading MKE and is often used when the order in which MKE worker nodes are upgraded is not important.

To upgrade MKE, ensure that all MCR instances have been upgraded to the corresponding new version. Then SSH to one MKE manager node and run the following command. Do not run the upgrade command on a workstation with a client bundle.

$ docker container run --rm -it \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.3.11 \
  upgrade \
  --interactive

The upgrade command will print messages regarding the progress of the upgrade as it automatically upgrades MKE on all nodes in the cluster.

Phased in-place cluster upgrade

The second MKE upgrade method is a phased approach that allows granular control of the MKE upgrade process. Once initiated, this method upgrades all MKE components on individual MKE worker nodes, giving the user more control to migrate workloads and control traffic when upgrading the cluster. A user can temporarily run MKE worker nodes with different versions of MKE and MCR.

The phased in-place cluster upgrade workflow is useful when a user wants to manually control how workloads and traffic are migrated around a cluster during an upgrade. The process can also be used if a user wants to add additional worker node capacity during an upgrade to handle failover. Worker nodes can be added to a partially upgraded MKE Cluster, workloads migrated across, and previous worker nodes then taken offline and upgraded.

To start a phased upgrade of MKE, first upgrade all manager nodes to the new MKE version. To tell MKE to upgrade the manager nodes but not any worker nodes, pass the --manual-worker-upgrade flag to the upgrade command.

Before you upgrade MKE, ensure that MCR on all MKE manager nodes has been upgraded to the corresponding new version. SSH to an MKE manager node and run the following command. Do not run the upgrade command on a workstation with a client bundle.

$ docker container run --rm -it \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.3.11 \
  upgrade \
  --manual-worker-upgrade \
  --interactive

The --manual-worker-upgrade flag adds an upgrade-hold label to all worker nodes. MKE constantly monitors this label and upgrades a node as soon as the label is removed from it.

To trigger the upgrade on a worker node, remove the label:

$ docker node update --label-rm com.docker.ucp.upgrade-hold <node name or id>
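
When several workers are ready to be upgraded, the label removal can be wrapped in a loop. The following is a minimal sketch. The function name release_hold and the DRY_RUN variable are hypothetical, and the worker names in the example call are placeholders. With DRY_RUN left at its default of 1, the commands are only printed so that the order can be reviewed first.

```shell
# Remove the upgrade-hold label from each named worker, in the order given.
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 runs them on a manager node.
release_hold() {
  for node in "$@"; do
    if [ "${DRY_RUN:-1}" = 1 ]; then
      echo "docker node update --label-rm com.docker.ucp.upgrade-hold $node"
    else
      docker node update --label-rm com.docker.ucp.upgrade-hold "$node"
    fi
  done
}

# Example: preview the commands for two placeholder workers.
release_hold worker-1 worker-2
```

Running the function with DRY_RUN=0 on a manager node executes the same commands instead of printing them.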

Optional

Join new worker nodes to the cluster. Once the manager nodes have been upgraded to a new MKE version, new worker nodes can be added to the cluster, assuming they are running the corresponding new MCR version.

The swarm join token can be found in the MKE UI, or retrieved while logged in to an MKE manager node over SSH. For more information, refer to Join Linux nodes to your cluster.

$ docker swarm join --token SWMTKN-<YOUR TOKEN> <manager ip>:2377

Replace existing worker nodes using blue-green deployment

The blue-green deployment workflow for replacing existing worker nodes creates a parallel environment for the new deployment. This can greatly reduce downtime, upgrades worker node engines without disrupting workloads, and allows traffic to be migrated to the new environment with worker node rollback capability.

Note

Steps 2 through 6 can be repeated for groups of nodes - you do not have to replace all worker nodes in the cluster at one time.

  1. Upgrade manager nodes

    • The --manual-worker-upgrade flag automatically upgrades manager nodes first, and then allows you to control the upgrade of the MKE components on the worker nodes using node labels.

      $ docker container run --rm -it \
      --name ucp \
      --volume /var/run/docker.sock:/var/run/docker.sock \
      mirantis/ucp:3.3.11 \
      upgrade \
      --manual-worker-upgrade \
      --interactive
      
  2. Join new worker nodes

    • New worker nodes have a newer MCR already installed and run the new MKE version when they join the cluster. On the manager node, run a command similar to the following example to get the Swarm join token:

      docker swarm join-token worker
      
    • On the node to be joined:

      docker swarm join --token SWMTKN-<YOUR TOKEN> <manager ip>:2377
      
  3. Join MCR to the cluster

      docker swarm join --token SWMTKN-<YOUR TOKEN> <manager ip>:2377

  4. Pause all existing worker nodes

    • This ensures that new workloads are not deployed on existing nodes.

      docker node update --availability pause <node name>
      
  5. Drain paused nodes for workload migration

    • Redeploy workloads on all existing nodes to new nodes. Because all existing nodes are paused, workloads are automatically rescheduled onto new nodes.

      docker node update --availability drain <node name>
      
  6. Remove drained nodes

    • After each node is fully drained, it can be shut down and removed from the cluster. On each worker node that is being removed from the cluster, run the following command:

      docker swarm leave
      
    • When the old worker becomes unresponsive, run a command similar to the following example on the manager node:

      docker node rm <node name>
      
  7. Remove old MKE agents

    • After the upgrade completes, remove the old MKE agents, including the s390x and Windows agents, that were carried over from the previous install by running the following commands on the manager node:

      docker service rm ucp-agent
      docker service rm ucp-agent-win
      docker service rm ucp-agent-s390x
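
Steps 4 through 6 can be applied to a batch of old workers at a time. The following is a minimal sketch that only prints the commands for a batch, so that the sequence can be reviewed before anything is run. The function name retire_batch_plan and the node names in the example call are hypothetical.

```shell
# Print the commands that retire a batch of old workers: pause every node,
# then drain every node, then have each node leave and remove it.
# Node names are passed as arguments; nothing is executed.
retire_batch_plan() {
  for node in "$@"; do
    echo "docker node update --availability pause $node"
  done
  for node in "$@"; do
    echo "docker node update --availability drain $node"
  done
  for node in "$@"; do
    echo "# on $node itself: docker swarm leave"
    echo "docker node rm $node"
  done
}

# Example: plan for a batch of two placeholder workers.
retire_batch_plan old-worker-1 old-worker-2
```

Pausing the whole batch before draining any node keeps workloads from being rescheduled onto another old node in the same batch.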
      

Troubleshooting

  • Upgrade compatibility

    The upgrade command automatically checks for multiple ucp-worker-agents before proceeding with the upgrade. The existence of multiple ucp-worker-agents might indicate that the cluster is still in the middle of a prior manual upgrade, and you must resolve the conflicting node label issues before proceeding with the upgrade.

  • Upgrade failures

    For worker nodes, an upgrade failure can be rolled back by changing the node label back to the previous target version. Rollback of manager nodes is not supported.

  • Kubernetes errors in node state messages after upgrading MKE

    The following information applies if you have upgraded to UCP 3.0.0 or newer:

    • After performing an MKE upgrade from 2.2.x to 3.x.x, you might see unhealthy nodes in your MKE dashboard with any of the following errors:

      Awaiting healthy status in Kubernetes node inventory
      Kubelet is unhealthy: Kubelet stopped posting node status
      
    • Alternatively, you may see other port errors such as the one below in the ucp-controller container logs:

      http: proxy error: dial tcp 10.14.101.141:12388: connect: no route to host
      

Upgrade offline

Upgrading Mirantis Kubernetes Engine is the same, whether your hosts have access to the internet or not.

The only difference when upgrading on an offline host is that instead of pulling the MKE images directly to the host, you use a computer that’s connected to the internet to download a single package with all the images. Then you copy this package to the host where you upgrade MKE.

Download the offline package

To get an MKE package from the command line, use wget to download the version you want to install. Even if you are installing offline, you need a way to get the files. You can see the current packages in the MKE Deployment Guide: Versions available.

  1. Use the MKE package URL for the version you want to install in place of the <mke-package-url> parameter of the wget command.

    $ wget <mke-package-url> -O ucp.tar.gz
    
  2. Now that you have the package on your local machine, you can transfer it to the machines where you want to upgrade MKE.

For each machine that you want to manage with MKE:

  1. Copy the offline package to the machine.

    $ scp ucp.tar.gz <user>@<host>:
    
  2. Use ssh to log in to the hosts where you transferred the package.

  3. Load the MKE images.

    Once the package is transferred to the hosts, use the docker load command to load the Docker images from the tar archive:

    $ docker load -i ucp.tar.gz
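
For more than a handful of machines, the copy and load steps can be generated per host. The following is a minimal sketch that only prints the commands. The function name offline_load_plan, the user name, and the host names are hypothetical placeholders, and the sketch assumes that ucp.tar.gz sits in the current directory and lands in the remote user's home directory.

```shell
# Print the copy-and-load commands for each target host; nothing is executed.
# $1 is the SSH user, the remaining arguments are host names.
offline_load_plan() {
  user=$1
  shift
  for host in "$@"; do
    echo "scp ucp.tar.gz $user@$host:"
    echo "ssh $user@$host docker load -i ucp.tar.gz"
  done
}

# Example: plan for two placeholder hosts.
offline_load_plan admin node-1 node-2
```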