Install MKE

Mirantis Kubernetes Engine (MKE) is a containerized application that you can install on-premise or on a cloud infrastructure. MKE requires specific hardware and software in order to run, as outlined in the following sections.

Prerequisites

Important

Due to a Kubernetes limitation, MKE containers will not run in Hyper-V isolation mode on Windows.

Hardware requirements

Minimum hardware requirements

  • 8GB of RAM for manager nodes

  • 4GB of RAM for worker nodes

  • 2 vCPUs for manager nodes

  • 10GB of free disk space for the /var partition for manager nodes

  • 500MB of free disk space for the /var partition for worker nodes

Software requirements

  • All nodes must be running the same version of Mirantis Container Runtime v19.03 or higher.

  • Linux kernel version 3.10 or higher. For ease of debugging, the host OS kernel versions should match as closely as possible across all nodes.

  • A static IP address for each node in the cluster.

  • MKE does not currently support user namespaces for nodes.

Default install directories

  • /var/lib/docker (Docker Data Root Directory)

  • /var/lib/kubelet (Kubelet Data Root Directory)

  • /var/lib/containerd (Containerd Data Root Directory)

Ports used

When installing MKE on a host, you need to open specific ports to incoming traffic. Each of these ports listens for incoming traffic from a set of hosts, called the Scope of that port. The three scopes are:

  • External

    Traffic arrives from outside the cluster through end-user interaction.

  • Internal

    Traffic arrives from other hosts in the same cluster.

  • Self

    Traffic arrives to that port only from processes on the same host.

Open the following ports for incoming traffic on the respective host types:

Hosts | Port | Scope | Purpose
managers, workers | TCP 179 | Internal | Port for BGP peers, used for Kubernetes networking
managers | TCP 443 (configurable) | External, Internal | Port for the MKE web UI and API
managers | TCP 2376 (configurable) | Internal | Port for the Docker Swarm manager. Used for backwards compatibility
managers | TCP 2377 (configurable) | Internal | Port for control communication between swarm nodes
managers, workers | UDP 4789 | Internal | Port for overlay networking
managers | TCP 6443 (configurable) | External, Internal | Port for Kubernetes API server endpoint
managers, workers | TCP 6444 | Self | Port for Kubernetes API reverse proxy
managers, workers | TCP, UDP 7946 | Internal | Port for gossip-based clustering
managers, workers | TCP 9099 | Self | Port for Calico health check
managers, workers | TCP 10250 | Internal | Port for Kubelet
managers, workers | TCP 12376 | Internal | Port for a TLS authentication proxy that provides access to Mirantis Container Runtime
managers, workers | TCP 12378 | Self | Port for Etcd reverse proxy
managers | TCP 12379 | Internal | Port for Etcd Control API
managers | TCP 12380 | Internal | Port for Etcd Peer API
managers | TCP 12381 | Internal | Port for the MKE cluster certificate authority
managers | TCP 12382 | Internal | Port for the MKE client certificate authority
managers | TCP 12383 | Internal | Port for the authentication storage backend
managers | TCP 12384 | Internal | Port for the authentication storage backend for replication across managers
managers | TCP 12385 | Internal | Port for the authentication service API
managers | TCP 12386 | Internal | Port for the authentication worker
managers | TCP 12388 | Internal | Internal port for the Kubernetes API server
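
Before installing, you may want to confirm that the required ports are reachable between hosts. The following is a minimal reachability sketch, assuming netcat is available on the nodes; substitute the ports relevant to the node type you are checking:

# Check a few of the manager-facing ports from another cluster host
$ nc -zv <manager-ip> 443
$ nc -zv <manager-ip> 2377
$ nc -zv <manager-ip> 6443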

SLES considerations

  • CLOUD_NETCONFIG_MANAGE

    For SUSE Linux Enterprise Server 15 (SLES 15) installations, disable CLOUD_NETCONFIG_MANAGE prior to installing MKE, as shown in the following steps. A scripted version of this edit appears after the SLES considerations list.

    1. In the network interface configuration file at /etc/sysconfig/network/ifcfg-eth0, set CLOUD_NETCONFIG_MANAGE="no".

    2. Run service network restart.

  • Connection tracking

    Enable connection tracking on the loopback interface for SLES.

    Kubernetes controllers in Calico can’t reach the Kubernetes API server unless connection tracking is enabled on the loopback interface. SLES disables connection tracking by default so you need to enable it.

    Run the following commands for each node in the cluster:

    sudo mkdir -p /etc/sysconfig/SuSEfirewall2.d/defaults
    echo FW_LO_NOTRACK=no | sudo tee /etc/sysconfig/SuSEfirewall2.d/defaults/99-docker.cfg
    
    sudo SuSEfirewall2 start
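
The CLOUD_NETCONFIG_MANAGE edit described in the first item above can also be scripted. The following is a minimal sketch that assumes eth0 is the interface in question:

sudo sed -i 's/^CLOUD_NETCONFIG_MANAGE=.*/CLOUD_NETCONFIG_MANAGE="no"/' /etc/sysconfig/network/ifcfg-eth0
sudo service network restart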
    

Enable ESP traffic

For overlay networks with encryption to work, you need to ensure that IP protocol 50 (Encapsulating Security Payload) traffic is allowed.
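
Where a host firewall filters by protocol, a minimal sketch for allowing ESP with iptables is shown below; this assumes iptables manages the host firewall, so adapt it to firewalld or your cloud security groups as appropriate:

$ sudo iptables -A INPUT -p esp -j ACCEPT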

Enable IP-in-IP traffic

The default networking plugin for MKE is Calico, which uses IP Protocol Number 4 for IP-in-IP encapsulation.

If you’re deploying to AWS or another cloud provider, enable IP-in-IP traffic for your cloud provider’s security group.
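
On hosts where iptables filters traffic, a minimal sketch for allowing IP-in-IP (IP protocol 4) is shown below; on AWS, the equivalent is a security group rule permitting protocol 4 between cluster members:

$ sudo iptables -A INPUT -p 4 -j ACCEPT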

Timeout settings

Make sure the networks you’re using allow the MKE components enough time to communicate before they time out.

Component | Timeout (ms) | Configurable
Raft consensus between manager nodes | 3000 | no
Gossip protocol for overlay networking | 5000 | no
etcd | 500 | yes
RethinkDB | 10000 | no
Stand-alone cluster | 90000 | no

Time Synchronization

In distributed systems like MKE, time synchronization is critical to ensure proper operation. As a best practice to ensure consistency between the MCRs in a MKE cluster, all MCRs should regularly synchronize time with a Network Time Protocol (NTP) server. If a server's clock is skewed, MKE may exhibit unexpected behavior, including poor performance and even failures.
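
To confirm that a node's clock is synchronized before installing, you can query the time service. The following is a minimal check, assuming systemd-timesyncd or chrony is in use:

# Shows whether the system clock is synchronized
$ timedatectl status

# If chrony is in use, shows the current offset from the NTP source
$ chronyc tracking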

Hostname strategy

Before installing MKE on your cluster nodes, you should plan for a common hostname strategy.

Decide if you want to use short hostnames, like engine01, or Fully Qualified Domain Names (FQDN), like node01.company.example.com.

Whichever you choose, confirm your naming strategy is consistent across the cluster, because Mirantis Container Runtime and MKE use hostnames.

IP considerations

Static addresses

MKE requires each node on the cluster to have a static IPv4 address. Before installing MKE, ensure your network and nodes are configured to support this.

Avoid IP range conflicts

The following table lists recommendations to avoid IP range conflicts.

Component | Subnet | Range | Default IP address
Mirantis Container Runtime | default-address-pools | CIDR range for interface and bridge networks | 172.17.0.0/16 - 172.30.0.0/16, 192.168.0.0/16
Swarm | default-addr-pool | CIDR range for Swarm overlay networks | 10.0.0.0/8
Kubernetes | pod-cidr | CIDR range for Kubernetes pods | 192.168.0.0/16
Kubernetes | service-cluster-ip-range | CIDR range for Kubernetes services | 10.96.0.0/16

MCR considerations

Mirantis Container Runtime uses two IP ranges for the docker0 and docker_gwbridge interfaces.

default-address-pools defines a pool of CIDR ranges that are used to allocate subnets for local bridge networks.

By default the first available subnet (172.17.0.0/16) is assigned to docker0 and the next available subnet (172.18.0.0/16) is assigned to docker_gwbridge.

Both the docker0 and docker_gwbridge subnet can be modified by changing the default-address-pools.

default-address-pools

A list of IP address pools for local bridge networks. Each entry in the list contains the following:

  • base: CIDR range to be allocated for bridge networks.

  • size: CIDR netmask that determines the subnet size to allocate from the base pool.

By default, default-address-pools contains the following values.

{
  "default-address-pools": [
   {"base":"172.17.0.0/16","size":16}, <-- docker0
   {"base":"172.18.0.0/16","size":16}, <-- docker_gwbridge
   {"base":"172.19.0.0/16","size":16},
   {"base":"172.20.0.0/16","size":16},
   {"base":"172.21.0.0/16","size":16},
   {"base":"172.22.0.0/16","size":16},
   {"base":"172.23.0.0/16","size":16},
   {"base":"172.24.0.0/16","size":16},
   {"base":"172.25.0.0/16","size":16},
   {"base":"172.26.0.0/16","size":16},
   {"base":"172.27.0.0/16","size":16},
   {"base":"172.28.0.0/16","size":16},
   {"base":"172.29.0.0/16","size":16},
   {"base":"172.30.0.0/16","size":16},
   {"base":"192.168.0.0/16","size":20}
   ]
 }

For example, {"base":"192.168.0.0/16","size":20} will allocate /20 subnets from 192.168.0.0/16 yielding the following subnets for bridge networks:

192.168.0.0/20 (192.168.0.1 - 192.168.15.255)

192.168.16.0/20 (192.168.16.1 - 192.168.31.255)

192.168.32.0/20 (192.168.32.1 - 192.168.47.255)

192.168.48.0/20 (192.168.48.1 - 192.168.63.255)

192.168.64.0/20 (192.168.64.1 - 192.168.79.255)

...

192.168.240.0/20 (192.168.240.1 - 192.168.255.255)

If the size matches the netmask of the base, that pool contains one subnet. For example, {"base":"172.17.0.0/16","size":16} creates one subnet 172.17.0.0/16 (172.17.0.1 - 172.17.255.255).

docker0

MCR creates and configures the host system with a virtual network interface called docker0, which is an ethernet bridge. If you don’t specify a different network when starting a container, the container is connected to the bridge and all traffic coming from and going to the container flows over the bridge to MCR, which handles routing on behalf of the container.

docker0 has a configurable IP range, and MCR allocates IP addresses in this range to containers connected to the default bridge. The bridge has default settings that you can override. For example, the default subnet for docker0 is the first pool in default-address-pools, which contains 172.17.0.0/16.

The recommended way to configure the docker0 settings is to customize the daemon.json file.

If only the subnet needs to be customized, it can be changed by modifying the first pool of default-address-pools in the daemon.json file as shown.

{
   "default-address-pools": [
         {"base":"172.17.0.0/16","size":16}, <-- Modify this value
         {"base":"172.18.0.0/16","size":16},
         {"base":"172.19.0.0/16","size":16},
         {"base":"172.20.0.0/16","size":16},
         {"base":"172.21.0.0/16","size":16},
         {"base":"172.22.0.0/16","size":16},
         {"base":"172.23.0.0/16","size":16},
         {"base":"172.24.0.0/16","size":16},
         {"base":"172.25.0.0/16","size":16},
         {"base":"172.26.0.0/16","size":16},
         {"base":"172.27.0.0/16","size":16},
         {"base":"172.28.0.0/16","size":16},
         {"base":"172.29.0.0/16","size":16},
         {"base":"172.30.0.0/16","size":16},
         {"base":"192.168.0.0/16","size":20}
   ]
}

Note

Modifying this value can also affect the docker_gwbridge if the size doesn’t match the netmask of the base.
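
Changes to default-address-pools only take effect after the Docker daemon is restarted. The following is a minimal sketch for applying the edit, assuming the default /etc/docker/daemon.json location and a systemd-managed daemon:

# Edit the first pool as shown above, then restart the daemon
$ sudo vi /etc/docker/daemon.json
$ sudo systemctl restart docker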

You can also use the fixed-cidr setting to configure a CIDR range.

{
  "fixed-cidr": "172.17.0.0/16"
}

fixed-cidr: Specifies the subnet for docker0, using standard CIDR notation. The default is 172.17.0.0/16; the network gateway will be 172.17.0.1, and container IPs will be allocated from the range 172.17.0.2 - 172.17.255.254.

You can also use the bip setting to configure a gateway IP and CIDR range.

{
  "bip": "172.17.0.1/16"
}

bip contains the gateway IP address and CIDR netmask of the docker0 network. The notation is <gateway IP>/<CIDR netmask>, and the default is 172.17.0.1/16, which makes the docker0 network gateway 172.17.0.1 and the subnet 172.17.0.0/16.

docker_gwbridge

The docker_gwbridge is a virtual network interface that connects the overlay networks (including the ingress network) to an individual MCR’s physical network. Docker creates it automatically when you initialize a swarm or join a Docker host to a swarm, but it is not a Docker device. It exists in the kernel of the Docker host. The default subnet for docker_gwbridge is the next available subnet in default-address-pools which with defaults is 172.18.0.0/16.

Note

If you need to customize the docker_gwbridge settings, you must do so before joining the host to the swarm, or after temporarily removing the host from the swarm.

The recommended way to configure the docker_gwbridge settings is to use the daemon.json file.

For docker_gwbridge, the second available subnet is allocated from default-address-pools. Any customizations made to the docker0 interface can affect which subnet is allocated. With the default default-address-pools settings, you would modify the second pool.

{
    "default-address-pools": [
       {"base":"172.17.0.0/16","size":16},
       {"base":"172.18.0.0/16","size":16}, <-- Modify this value
       {"base":"172.19.0.0/16","size":16},
       {"base":"172.20.0.0/16","size":16},
       {"base":"172.21.0.0/16","size":16},
       {"base":"172.22.0.0/16","size":16},
       {"base":"172.23.0.0/16","size":16},
       {"base":"172.24.0.0/16","size":16},
       {"base":"172.25.0.0/16","size":16},
       {"base":"172.26.0.0/16","size":16},
       {"base":"172.27.0.0/16","size":16},
       {"base":"172.28.0.0/16","size":16},
       {"base":"172.29.0.0/16","size":16},
       {"base":"172.30.0.0/16","size":16},
       {"base":"192.168.0.0/16","size":20}
   ]
}

Swarm

Swarm uses a default address pool of 10.0.0.0/8 for its overlay networks. If this conflicts with your current network implementation, please use a custom IP address pool. To specify a custom IP address pool, use the --default-addr-pool command line option during Swarm initialization.

The Swarm default-addr-pool setting is separate from the MCR default-address-pools setting. They are two separate ranges that are used for different purposes.

Note

Currently, you cannot set this flag during the MKE installation process. To deploy with a custom IP pool, Swarm must first be initialized using this flag, and MKE must be installed on top of it.
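
A minimal sketch of initializing Swarm with a custom pool before installing MKE (the 10.20.0.0/16 value is illustrative only; choose a range that does not conflict with your infrastructure):

$ docker swarm init --default-addr-pool 10.20.0.0/16 --advertise-addr <node-ip-address>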

Kubernetes

There are two internal IP ranges used within Kubernetes that may overlap and conflict with the underlying infrastructure:

  • The Pod Network - Each Pod in Kubernetes is given an IP address from either the Calico or Azure IPAM services. In a default installation Pods are given IP addresses on the 192.168.0.0/16 range. This can be customized at install time by passing the --pod-cidr flag to the MKE install command.

  • The Services Network - When a user exposes a Service in Kubernetes, it is accessible via a VIP that comes from the Cluster IP Range. By default on MKE this range is 10.96.0.0/16. Beginning with 3.1.8, this value can be changed at install time with the --service-cluster-ip-range flag (see the example after this list).
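
As referenced above, both ranges can be overridden at install time. The following is a minimal sketch; the CIDR values are illustrative only, so choose ranges that do not conflict with your underlying infrastructure:

docker container run --rm -it --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.3.2 install \
  --host-address <node-ip-address> \
  --pod-cidr 10.32.0.0/16 \
  --service-cluster-ip-range 10.96.0.0/16 \
  --interactive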

Avoid firewall conflicts

For SUSE Linux Enterprise Server 12 SP2 (SLES12), the FW_LO_NOTRACK flag is turned on by default in the openSUSE firewall. This speeds up packet processing on the loopback interface, and breaks certain firewall setups that need to redirect outgoing packets via custom rules on the local machine.

To turn off the FW_LO_NOTRACK option, edit the /etc/sysconfig/SuSEfirewall2 file and set FW_LO_NOTRACK="no". Save the file and restart the firewall or reboot.

For SUSE Linux Enterprise Server 12 SP3, the default value for FW_LO_NOTRACK was changed to no.

For Red Hat Enterprise Linux (RHEL) 8, if firewalld is running and FirewallBackend=nftables is set in /etc/firewalld/firewalld.conf, change this to FirewallBackend=iptables, or you can explicitly run the following commands to allow traffic to enter the default bridge (docker0) network:

firewall-cmd --permanent --zone=trusted --add-interface=docker0
firewall-cmd --reload

Load balancing

MKE doesn’t include a load balancer. You can configure your own load balancer to balance user requests across all manager nodes.

If you plan to use a load balancer, you need to decide whether you’ll add the nodes to the load balancer using their IP addresses or their FQDNs. Whichever you choose, be consistent across nodes. When this is decided, take note of all IPs or FQDNs before starting the installation.

By default, MKE and MSR both use port 443. If you plan on deploying MKE and MSR, your load balancer needs to distinguish traffic between the two by IP address or port number.

  • If you want to configure your load balancer to listen on port 443:

    • Use one load balancer for MKE and another for MSR.

    • Use the same load balancer with multiple virtual IPs.

  • Configure your load balancer to expose MKE or MSR on a port other than 443.

If you want to install MKE in a high-availability configuration that uses a load balancer in front of your MKE controllers, include the appropriate IP address and FQDN of the load balancer’s VIP by using one or more --san flags in the MKE install command or when you’re asked for additional SANs in interactive mode.
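
As a minimal sketch of a TCP pass-through configuration, assuming HAProxy as the load balancer and placeholder manager addresses (adjust the backend entries, health checks, and any MSR handling for your environment):

cat <<'EOF' | sudo tee -a /etc/haproxy/haproxy.cfg
frontend mke_https
    bind *:443
    mode tcp
    default_backend mke_managers

backend mke_managers
    mode tcp
    balance roundrobin
    server manager1 <manager-1-ip>:443 check
    server manager2 <manager-2-ip>:443 check
    server manager3 <manager-3-ip>:443 check
EOF
sudo systemctl restart haproxy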

Use an external Certificate Authority

You can customize MKE to use certificates signed by an external Certificate Authority. When using your own certificates, you need to have a certificate bundle that has:

  • A ca.pem file with the root CA public certificate.

  • A cert.pem file with the server certificate and any intermediate CA public certificates. This certificate should also have SANs for all addresses used to reach the MKE manager.

  • A key.pem file with the server private key.
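
Before installing, you can sanity-check the bundle with openssl. This is a quick verification sketch, not a required step:

$ openssl verify -CAfile ca.pem cert.pem
$ openssl x509 -in cert.pem -noout -text | grep -A 1 'Subject Alternative Name'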

You can have a certificate for each manager, with a common SAN. For example, on a three-node cluster, you can have:

  • node1.company.example.org with SAN mke.company.org

  • node2.company.example.org with SAN mke.company.org

  • node3.company.example.org with SAN mke.company.org

You can also install MKE with a single externally-signed certificate for all managers, rather than one for each manager node. In this case, the certificate files are copied automatically to any new manager nodes joining the cluster or being promoted to a manager role.

Customize named volumes

Skip this step if you want to use the defaults provided by MKE.

MKE uses named volumes to persist data. If you want to customize the drivers used to manage these volumes, you can create the volumes before installing MKE. When you install MKE, the installer will notice that the volumes already exist, and it will start using them.

If these volumes don’t exist, they’ll be automatically created when installing MKE.

Install mirantis/ucp image

You can install MKE using the mirantis/ucp image, which has commands to install and manage MKE.

  1. Use ssh to log in to the host where you want to install MKE.

  2. Run the following commands.

    # Pull the latest version of MKE
    docker image pull mirantis/ucp:3.3.2
    
    # Install MKE
    docker container run --rm -it --name ucp \
      -v /var/run/docker.sock:/var/run/docker.sock \
      mirantis/ucp:3.3.2 install \
      --host-address <node-ip-address> \
      --interactive
    

    This runs the install command in interactive mode, so that you’re prompted for any necessary configuration values. To find what other options are available in the install command, including how to install MKE on a system with SELinux enabled, check the MKE CLI Reference documentation.

Important

MKE will install Project Calico for container-to-container communication for Kubernetes. A platform operator may choose to install an alternative CNI plugin, such as Weave or Flannel. Please see Install an unmanaged CNI plugin for more information.

License your installation

Note

Typically, the use of MKE is by subscription only. For testing purposes, however, Mirantis offers a free trial license by request.

To license your installation of MKE, click Upload License and navigate to your license (.lic) file. When you’re finished selecting the license, MKE updates with the new settings.

Install offline

The procedure to install Mirantis Kubernetes Engine on a host is the same, whether the host has access to the internet or not.

The only difference when installing on an offline host is that instead of pulling the MKE images from Docker Hub, you use a computer that’s connected to the internet to download a single package with all the images. Then you copy this package to the host where you install MKE.

Note

You can only install offline if all of the nodes are offline. Offline installation will fail if manager nodes have internet access and worker nodes do not.

Versions available

Use a computer with internet access to download the MKE package from the following links.

Download the offline package using the CLI

You can also use the link urls to get the MKE package from the command line:

$ wget <mke-package-url> -O ucp.tar.gz

Now that you have the package in your local machine, you can transfer it to the machines where you want to install MKE.

For each machine that you want to manage with MKE:

  1. Copy the MKE package to the machine.

    $ scp ucp.tar.gz <user>@<host>
    
  2. Use ssh to log in to the hosts where you transferred the package.

  3. Load the MKE images.

    Once the package is transferred to the hosts, you can use the docker load command, to load the Docker images from the tar archive:

    $ docker load -i ucp.tar.gz
    

Install on AWS

Mirantis Kubernetes Engine (MKE) can be installed on top of AWS without any customization, so this document is optional. However, if you are deploying Kubernetes workloads with MKE and want to leverage the AWS Kubernetes cloud provider, which provides dynamic volume and load balancer provisioning, follow this guide. This guide is not required if you are only deploying Swarm workloads.

The requirements for installing MKE on AWS are included in the following sections.

Hostnames

The instance hostname must be of the form ip-<private ip>.<region>.compute.internal. For example: ip-172-31-15-241.us-east-2.compute.internal

Instance tags

The instance must be tagged with kubernetes.io/cluster/<UniqueID for Cluster> and given a value of owned or shared. If the resources created by the cluster are considered owned and managed by the cluster, the value should be owned. If the resources can be shared between multiple clusters, it should be tagged as shared.

kubernetes.io/cluster/1729543642a6 owned
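
A minimal sketch of applying the tag with the AWS CLI; the instance ID and cluster ID below are placeholders:

$ aws ec2 create-tags \
    --resources <instance-id> \
    --tags Key=kubernetes.io/cluster/<cluster-id>,Value=owned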

Instance profile for managers

Manager nodes must have an instance profile with appropriate policies attached to enable introspection and provisioning of resources. The following example is very permissive:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [ "ec2:*" ],
      "Resource": [ "*" ]
    },
    {
      "Effect": "Allow",
      "Action": [ "elasticloadbalancing:*" ],
      "Resource": [ "*" ]
    },
    {
      "Effect": "Allow",
      "Action": [ "route53:*" ],
      "Resource": [ "*" ]
    },
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [ "arn:aws:s3:::kubernetes-*" ]
    }
  ]
}

Instance profile for workers

Worker nodes must have an instance profile with appropriate policies attached to enable access to dynamically provisioned resources. The following example is very permissive:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [ "arn:aws:s3:::kubernetes-*" ]
    },
    {
      "Effect": "Allow",
      "Action": "ec2:Describe*",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:AttachVolume",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "ec2:DetachVolume",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [ "route53:*" ],
      "Resource": [ "*" ]
    }
  ]
}

VPC tags

The VPC must be tagged with kubernetes.io/cluster/<UniqueID for Cluster> and given a value of owned or shared. If the resources created by the cluster are considered owned and managed by the cluster, the value should be owned. If the resources can be shared between multiple clusters, it should be tagged shared.

kubernetes.io/cluster/1729543642a6 owned

Subnet tags

Subnets must be tagged with kubernetes.io/cluster/<UniqueID for Cluster> and given a value of owned or shared. If the resources created by the cluster are considered owned and managed by the cluster, the value should be owned. If the resources may be shared between multiple clusters, it should be tagged shared. For example:

kubernetes.io/cluster/1729543642a6 owned

Once all prerequisites have been met, run the following command to install MKE on a manager node. The --host-address flag maps to the private IP address of the manager node.

$ docker container run --rm -it \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.3.2 install \
  --host-address <ucp-ip> \
  --cloud-provider aws \
  --interactive

Install on Azure

Mirantis Kubernetes Engine (MKE) closely integrates with Microsoft Azure for its Kubernetes Networking and Persistent Storage feature set. MKE deploys the Calico CNI provider. In Azure, the Calico CNI leverages the Azure networking infrastructure for data path networking and the Azure IPAM for IP address management. There are infrastructure prerequisites required prior to MKE installation for the Calico / Azure integration.

Prerequisites

You must meet the following infrastructure prerequisites to successfully deploy MKE on Azure. Failure to meet these prerequisites may result in significant errors during the installation process.

  • All MKE Nodes (Managers and Workers) need to be deployed into the same Azure Resource Group. The Azure Networking components (Virtual Network, Subnets, Security Groups) could be deployed in a second Azure Resource Group.

  • The Azure Virtual Network and Subnet must be appropriately sized for your environment, as addresses from this pool will be consumed by Kubernetes Pods.

  • All MKE worker and manager nodes need to be attached to the same Azure Subnet.

  • Internal IP addresses for all nodes should be set to Static rather than the default of Dynamic.

  • The Azure Virtual Machine Object Name needs to match the Azure Virtual Machine Computer Name and the Node Operating System’s Hostname which is the FQDN of the host, including domain names. Note that this requires all characters to be in lowercase.

  • An Azure Service Principal with Contributor access to the Azure Resource Group hosting the MKE Nodes. This Service principal will be used by Kubernetes to communicate with the Azure API. The Service Principal ID and Secret Key are needed as part of the MKE prerequisites. If you are using a separate Resource Group for the networking components, the same Service Principal will need Network Contributor access to this Resource Group.

  • Kubernetes pods integrate into the underlying Azure networking stack, from an IPAM and routing perspective with the Azure CNI IPAM module. Therefore Azure Network Security Groups (NSG) impact pod to pod communication. End users may expose containerized services on a range of underlying ports, resulting in a manual process to open an NSG port every time a new containerized service is deployed on to the platform. This would only affect workloads deployed on to the Kubernetes orchestrator. It is advisable to have an open NSG between all IPs on the Azure Subnet passed into MKE at install time. To limit exposure, this Azure subnet should be locked down to only be used for Container Host VMs and Kubernetes Pods. Additionally, end users can leverage Kubernetes Network Policies to provide micro segmentation for containerized applications and services.

MKE requires the following information for the installation:

  • subscriptionId - The Azure Subscription ID in which the MKE objects are being deployed.

  • tenantId - The Azure Active Directory Tenant ID in which the MKE objects are being deployed.

  • aadClientId - The Azure Service Principal ID.

  • aadClientSecret - The Azure Service Principal Secret Key.

Networking

MKE configures the Azure IPAM module for Kubernetes to allocate IP addresses for Kubernetes pods. The Azure IPAM module requires each Azure VM which is part of the Kubernetes cluster to be configured with a pool of IP addresses.

There are two options for provisioning IPs for the Kubernetes cluster on Azure:

  • An automated mechanism provided by MKE which allows for IP pool configuration and maintenance for standalone Azure virtual machines (VMs). This service runs within the calico-node daemonset and provisions 128 IP addresses for each node by default.

    If a VXLAN dataplane is used, MKE automatically uses Calico IPAM. You don’t need to do anything specific for Azure IPAM.

  • Manual provisioning of additional IP addresses for each Azure VM. This can be done through the Azure Portal, the Azure CLI ($ az network nic ip-config create), or an ARM template.

Azure Configuration File

For MKE to integrate with Microsoft Azure, all Linux MKE Manager and Linux MKE Worker nodes in your cluster need an identical Azure configuration file, azure.json. Place this file within /etc/kubernetes on each host. Since the configuration file is owned by root, set its permissions to 0644 to ensure the container user has read access.

The following is an example template for azure.json. Replace *** with real values, and leave the other parameters as is.

{
    "cloud":"AzurePublicCloud",
    "tenantId": "***",
    "subscriptionId": "***",
    "aadClientId": "***",
    "aadClientSecret": "***",
    "resourceGroup": "***",
    "location": "***",
    "subnetName": "***",
    "securityGroupName": "***",
    "vnetName": "***",
    "useInstanceMetadata": true
}
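
The following is a minimal sketch for distributing the file to a node, assuming you have populated azure.json locally and have sudo access on the target host:

$ scp azure.json <user>@<host>:/tmp/azure.json
$ ssh <user>@<host> 'sudo mkdir -p /etc/kubernetes && sudo install -o root -g root -m 0644 /tmp/azure.json /etc/kubernetes/azure.json'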

There are some optional parameters for Azure deployments:

  • primaryAvailabilitySetName - The Worker Nodes availability set.

  • vnetResourceGroup - The Virtual Network Resource group, if your Azure Network objects live in a separate resource group.

  • routeTableName - If you have defined multiple Route tables within an Azure subnet.

Guidelines for IPAM Configuration

Warning

You must follow these guidelines and either use the appropriate size network in Azure or take the proper action to fit within the subnet. Failure to follow these guidelines may cause significant issues during the installation process.

The subnet and the virtual network associated with the primary interface of the Azure VMs needs to be configured with a large enough address prefix/range. The number of required IP addresses depends on the workload and the number of nodes in the cluster.

For example, in a cluster of 256 nodes, make sure that the address space of the subnet and the virtual network can allocate at least 128 * 256 IP addresses, in order to run a maximum of 128 pods concurrently on a node. This would be in addition to initial IP allocations to VM network interface cards (NICs) during Azure resource creation.

Accounting for IP addresses that are allocated to NICs during VM bring-up, set the address space of the subnet and virtual network to 10.0.0.0/16. This ensures that the network can dynamically allocate at least 32768 addresses, plus a buffer for initial allocations for primary IP addresses.

Note

The Azure IPAM module queries an Azure VM’s metadata to obtain a list of IP addresses which are assigned to the VM’s NICs. The IPAM module allocates these IP addresses to Kubernetes pods. You configure the IP addresses as ipConfigurations in the NICs associated with a VM or scale set member, so that Azure IPAM can provide them to Kubernetes when requested.

Manually provision IP address pools as part of an Azure VM scale set

Configure IP Pools for each member of the VM scale set during provisioning by associating multiple ipConfigurations with the scale set’s networkInterfaceConfigurations. The following is an example networkProfile configuration for an ARM template that configures pools of 32 IP addresses for each VM in the VM scale set.

"networkProfile": {
  "networkInterfaceConfigurations": [
    {
      "name": "[variables('nicName')]",
      "properties": {
        "ipConfigurations": [
          {
            "name": "[variables('ipConfigName1')]",
            "properties": {
              "primary": "true",
              "subnet": {
                "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
              },
              "loadBalancerBackendAddressPools": [
                {
                  "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/loadBalancers/', variables('loadBalancerName'), '/backendAddressPools/', variables('bePoolName'))]"
                }
              ],
              "loadBalancerInboundNatPools": [
                {
                  "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/loadBalancers/', variables('loadBalancerName'), '/inboundNatPools/', variables('natPoolName'))]"
                }
              ]
            }
          },
          {
            "name": "[variables('ipConfigName2')]",
            "properties": {
              "subnet": {
                "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
              }
            }
          }
          .
          .
          .
          {
            "name": "[variables('ipConfigName32')]",
            "properties": {
              "subnet": {
                "id": "[concat('/subscriptions/', subscription().subscriptionId,'/resourceGroups/', resourceGroup().name, '/providers/Microsoft.Network/virtualNetworks/', variables('virtualNetworkName'), '/subnets/', variables('subnetName'))]"
              }
            }
          }
        ],
        "primary": "true"
      }
    }
  ]
}

Adjust the IP Count Value

During a MKE installation, a user can alter the number of Azure IP addresses MKE will automatically provision for pods. By default, MKE provisions 128 addresses, from the same Azure subnet as the hosts, for each VM in the cluster. However, if you have manually attached additional IP addresses to the VMs (via an ARM template, the Azure CLI, or the Azure Portal) or you are deploying into a small Azure subnet (less than /16), you can use the --azure-ip-count flag at install time.

Note

Do not set the --azure-ip-count variable to a value of less than 6 if you have not manually provisioned additional IP addresses for each VM. The MKE installation will need at least 6 IP addresses to allocate to the core MKE components that run as Kubernetes pods. This is in addition to the VM’s private IP address.

Below are some example scenarios which require the --azure-ip-count variable to be defined.

Scenario 1 - Manually Provisioned Addresses

If you have manually provisioned additional IP addresses for each VM, and want to disable MKE from dynamically provisioning more IP addresses for you, then you would pass --azure-ip-count 0 into the MKE installation command.

Scenario 2 - Reducing the number of Provisioned Addresses

You may want to reduce the number of IP addresses dynamically allocated from the default of 128 to a custom value if you are:

  • Primarily using the Swarm Orchestrator

  • Deploying MKE on a small Azure subnet (for example, /24)

  • Planning to run a small number of Kubernetes pods on each node

For example, to provision 16 addresses per VM, pass --azure-ip-count 16 into the MKE installation command.

If you need to adjust this value post-installation, refer to the instructions on how to download the MKE configuration file, change the value, and update the configuration via the API. If you reduce the value post-installation, existing VMs will not be reconciled, and you will have to manually edit the IP count in Azure.

Run the following command to install MKE on a manager node. The --pod-cidr option maps to the IP address range that you have configured for the Azure subnet, and --host-address maps to the private IP address of the manager node. Finally, if you want to adjust the number of IP addresses provisioned to each VM, pass --azure-ip-count.

Note

The pod-cidr range must match the Azure Virtual Network's Subnet attached to the hosts. For example, if the Azure Virtual Network had the range 172.0.0.0/16 with VMs provisioned on an Azure Subnet of 172.0.1.0/24, then the Pod CIDR should also be 172.0.1.0/24.

docker container run --rm -it \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.3.2 install \
  --host-address <ucp-ip> \
  --pod-cidr <ip-address-range> \
  --cloud-provider Azure \
  --interactive

Azure custom roles

Deploy an MKE Cluster into a single resource group

A resource group is a container that holds resources for an Azure solution. These resources are the virtual machines (VMs), networks, and storage accounts associated with the swarm.

To create a custom, all-in-one role with permissions to deploy an MKE cluster into a single resource group:

  1. Create the role permissions JSON file.

    {
      "Name": "Docker Platform All-in-One",
      "IsCustom": true,
      "Description": "Can install and manage Docker platform.",
      "Actions": [
        "Microsoft.Authorization/*/read",
        "Microsoft.Authorization/roleAssignments/write",
        "Microsoft.Compute/availabilitySets/read",
        "Microsoft.Compute/availabilitySets/write",
        "Microsoft.Compute/disks/read",
        "Microsoft.Compute/disks/write",
        "Microsoft.Compute/virtualMachines/extensions/read",
        "Microsoft.Compute/virtualMachines/extensions/write",
        "Microsoft.Compute/virtualMachines/read",
        "Microsoft.Compute/virtualMachines/write",
        "Microsoft.Network/loadBalancers/read",
        "Microsoft.Network/loadBalancers/write",
        "Microsoft.Network/loadBalancers/backendAddressPools/join/action",
        "Microsoft.Network/networkInterfaces/read",
        "Microsoft.Network/networkInterfaces/write",
        "Microsoft.Network/networkInterfaces/join/action",
        "Microsoft.Network/networkSecurityGroups/read",
        "Microsoft.Network/networkSecurityGroups/write",
        "Microsoft.Network/networkSecurityGroups/join/action",
        "Microsoft.Network/networkSecurityGroups/securityRules/read",
        "Microsoft.Network/networkSecurityGroups/securityRules/write",
        "Microsoft.Network/publicIPAddresses/read",
        "Microsoft.Network/publicIPAddresses/write",
        "Microsoft.Network/publicIPAddresses/join/action",
        "Microsoft.Network/virtualNetworks/read",
        "Microsoft.Network/virtualNetworks/write",
        "Microsoft.Network/virtualNetworks/subnets/read",
        "Microsoft.Network/virtualNetworks/subnets/write",
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Resources/subscriptions/resourcegroups/read",
        "Microsoft.Resources/subscriptions/resourcegroups/write",
        "Microsoft.Security/advancedThreatProtectionSettings/read",
        "Microsoft.Security/advancedThreatProtectionSettings/write",
        "Microsoft.Storage/*/read",
        "Microsoft.Storage/storageAccounts/listKeys/action",
        "Microsoft.Storage/storageAccounts/write"
      ],
      "NotActions": [],
      "AssignableScopes": [
        "/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
      ]
    }
    
  2. Create the Azure RBAC role.

    az role definition create --role-definition all-in-one-role.json
    

Deploy MKE compute resources

Compute resources act as servers for running containers.

To create a custom role to deploy MKE compute resources only:

  1. Create the role permissions JSON file.

    {
      "Name": "Docker Platform",
      "IsCustom": true,
      "Description": "Can install and run Docker platform.",
      "Actions": [
        "Microsoft.Authorization/*/read",
        "Microsoft.Authorization/roleAssignments/write",
        "Microsoft.Compute/availabilitySets/read",
        "Microsoft.Compute/availabilitySets/write",
        "Microsoft.Compute/disks/read",
        "Microsoft.Compute/disks/write",
        "Microsoft.Compute/virtualMachines/extensions/read",
        "Microsoft.Compute/virtualMachines/extensions/write",
        "Microsoft.Compute/virtualMachines/read",
        "Microsoft.Compute/virtualMachines/write",
        "Microsoft.Network/loadBalancers/read",
        "Microsoft.Network/loadBalancers/write",
        "Microsoft.Network/networkInterfaces/read",
        "Microsoft.Network/networkInterfaces/write",
        "Microsoft.Network/networkInterfaces/join/action",
        "Microsoft.Network/publicIPAddresses/read",
        "Microsoft.Network/virtualNetworks/read",
        "Microsoft.Network/virtualNetworks/subnets/read",
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Resources/subscriptions/resourcegroups/read",
        "Microsoft.Resources/subscriptions/resourcegroups/write",
        "Microsoft.Security/advancedThreatProtectionSettings/read",
        "Microsoft.Security/advancedThreatProtectionSettings/write",
        "Microsoft.Storage/storageAccounts/read",
        "Microsoft.Storage/storageAccounts/listKeys/action",
        "Microsoft.Storage/storageAccounts/write"
      ],
      "NotActions": [],
      "AssignableScopes": [
        "/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
      ]
    }
    
  2. Create the Docker Platform RBAC role.

    az role definition create --role-definition platform-role.json
    

Deploy MKE network resources

Network resources are services inside your cluster. These resources can include virtual networks, security groups, address pools, and gateways.

To create a custom role to deploy MKE network resources only:

  1. Create the role permissions JSON file.

    {
      "Name": "Docker Networking",
      "IsCustom": true,
      "Description": "Can install and manage Docker platform networking.",
      "Actions": [
        "Microsoft.Authorization/*/read",
        "Microsoft.Network/loadBalancers/read",
        "Microsoft.Network/loadBalancers/write",
        "Microsoft.Network/loadBalancers/backendAddressPools/join/action",
        "Microsoft.Network/networkInterfaces/read",
        "Microsoft.Network/networkInterfaces/write",
        "Microsoft.Network/networkInterfaces/join/action",
        "Microsoft.Network/networkSecurityGroups/read",
        "Microsoft.Network/networkSecurityGroups/write",
        "Microsoft.Network/networkSecurityGroups/join/action",
        "Microsoft.Network/networkSecurityGroups/securityRules/read",
        "Microsoft.Network/networkSecurityGroups/securityRules/write",
        "Microsoft.Network/publicIPAddresses/read",
        "Microsoft.Network/publicIPAddresses/write",
        "Microsoft.Network/publicIPAddresses/join/action",
        "Microsoft.Network/virtualNetworks/read",
        "Microsoft.Network/virtualNetworks/write",
        "Microsoft.Network/virtualNetworks/subnets/read",
        "Microsoft.Network/virtualNetworks/subnets/write",
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Resources/subscriptions/resourcegroups/read",
        "Microsoft.Resources/subscriptions/resourcegroups/write"
      ],
      "NotActions": [],
      "AssignableScopes": [
        "/subscriptions/6096d756-3192-4c1f-ac62-35f1c823085d"
      ]
    }
    
  2. Create the Docker Networking RBAC role.

    az role definition create --role-definition networking-role.json
    

Upgrade MKE

This section helps you upgrade Mirantis Kubernetes Engine (MKE).

Note

Kubernetes Ingress cannot be deployed on a cluster after MKE is upgraded from 3.2.6 to 3.3.0. A fresh install of 3.3.0 is not impacted. For more information about how to reproduce and work around the issue, see the release notes.

Before upgrading to a new version of MKE, check the Docker Enterprise Release Notes. There you’ll find information about new features, breaking changes, and other relevant information for upgrading to a particular version.

As part of the upgrade process, you’ll upgrade the Mirantis Container Runtime installed on each node of the cluster to version 19.03 or higher. You should plan for the upgrade to take place outside of business hours, to ensure there’s minimal impact to your users.

Also, don’t make changes to MKE configurations while you’re upgrading it. This can lead to misconfigurations that are difficult to troubleshoot.

Environment checklist

Complete the checks as detailed in the following areas:

Systems

  • Confirm time sync across all nodes (and check time daemon logs for any large time drifting)

  • Check system requirements (production: 4 vCPU/16GB for MKE managers and MSR replicas)

  • Review the full UCP/MSR/MCR port requirements

  • Ensure that your cluster nodes meet the minimum requirements

  • Before performing any upgrade, ensure that you meet all minimum requirements listed in MKE System requirements, including port openings (MKE 3.x added more required ports for Kubernetes), memory, and disk space. For example, manager nodes must have at least 8GB of memory.

Note

If you are upgrading a cluster to UCP 3.0.2 or higher on Microsoft Azure, please ensure that all of the Azure prerequisites are met.

Storage

  • Check /var storage allocation and increase it if usage is over 70% (a quick check is shown after this list).

  • In addition, check all nodes’ local file systems for any disk storage issues (and MSR back-end storage, for example, NFS).

  • If you are not using the overlay2 storage driver, take this opportunity to switch to it for improved stability. Note that the transition from devicemapper to overlay2 is a destructive rebuild.
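
A quick way to check current usage on each node, as referenced in the list above (a simple sketch; adapt to your own monitoring tooling):

# Free space on the /var partition and disk usage attributable to Docker
$ df -h /var
$ docker system df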

Operating system

  • If the cluster nodes are running an older OS release (Ubuntu 14.x, RHEL 7.3, and so on), consider patching all relevant packages to the most recent versions (including the kernel).

  • Perform a rolling restart of each node before the upgrade (to confirm that in-memory settings match the startup scripts).

  • Run check-config.sh on each cluster node (after rolling restart) for any kernel compatibility issues.

Procedural

  • Perform Swarm, MKE and MSR backups before upgrading

  • Gather Compose file/service/stack files

  • Generate a MKE support dump (as a point-in-time reference) before upgrading

  • Preinstall MCR/MKE/MSR images. If your cluster is offline (with no connection to the internet), Mirantis provides tarballs containing all of the required container images. If your cluster is online, you can pull the required container images onto your nodes with the following command:

    $ docker run --rm mirantis/ucp:3.3.2 images --list | xargs -L 1 docker pull
    
  • Load troubleshooting packages (netshoot, etc)

  • Best order for upgrades: MCR, MKE, and then MSR. Note that the scope of this topic is limited to upgrade instructions for MKE.

Set upgrade strategy

Important

In all upgrade workflows, manager nodes are automatically upgraded in place. You cannot control the order of manager node upgrades.

For each worker node that requires an upgrade, you can upgrade that node in place or you can replace the node with a new worker node. The type of upgrade you perform depends on what is needed for each node:

Automated, in-place cluster upgrade
    Performed on any manager node. Automatically upgrades the entire cluster.

Manual cluster upgrade, existing nodes in place
    Automatically upgrades manager nodes and allows you to control the upgrade order of worker nodes. This type of upgrade is more advanced than the automated, in-place cluster upgrade.

Manual cluster upgrade, replace all worker nodes using blue-green deployment
    Performed using the CLI. This type of upgrade allows you to stand up a new cluster in parallel to the current one and cut over when complete. You can join new worker nodes, schedule workloads to run on the new nodes, pause, drain, and remove old worker nodes in batches rather than one at a time, and shut down servers to remove worker nodes. This is the most advanced type of upgrade.

Back up your cluster

Before starting an upgrade, make sure that your cluster is healthy. If a problem occurs, this makes it easier to find and troubleshoot it.

Create a backup of your cluster. This allows you to recover if something goes wrong during the upgrade process.

Note

The backup archive is version-specific, so you can’t use it during the upgrade process. For example, if you create a backup archive for a UCP 2.2 cluster, you can’t use the archive file after you upgrade to UCP 3.0.

Upgrade Mirantis Container Runtime

For each node that is part of your cluster, upgrade the Mirantis Container Runtime installed on that node to Mirantis Container Runtime version 19.03 or higher.

Starting with the manager nodes, and then worker nodes:

  1. Log into the node using ssh.

  2. Upgrade the Mirantis Container Runtime to version 19.03 or higher. See Mirantis Container Runtime.

  3. Make sure the node is healthy. In your browser, navigate to Nodes in the MKE web interface, and check that the node is healthy and is part of the cluster.
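
A quick way to confirm the runtime version on each node before moving on (a verification sketch, not part of the formal procedure):

$ docker version --format '{{.Server.Version}}'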

Perform the Upgrade to MKE

There are three different methods for upgrading MKE to version 3.3.2, all of which make use of the CLI:

  • Automated in-place cluster upgrade

  • Phased in-place cluster upgrade

  • Replacing of existing worker nodes using blue-green deployment

Automated in-place cluster upgrade

The automated in-place cluster upgrade approach updates all MKE components on all nodes within the MKE cluster. The upgrade is done node by node; once initiated, it works its way through the entire cluster. This is the traditional approach to upgrading MKE and is often used when the order in which MKE worker nodes are upgraded is not important.

To upgrade MKE, ensure all MCR instances have been upgraded to the corresponding new version. Then SSH to one MKE manager node and run the following command. The upgrade command should not be run on a workstation with a client bundle.

$ docker container run --rm -it \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.3.2 \
  upgrade \
  --interactive

The upgrade command will print messages regarding the progress of the upgrade as it automatically upgrades MKE on all nodes in the cluster.

Phased in-place cluster upgrade

The second MKE upgrade method is a phased approach that allows granular control of the MKE upgrade process. Once initiated, this method upgrades all MKE components on individual MKE worker nodes, giving you more control to migrate workloads and manage traffic while upgrading the cluster. You can temporarily run MKE worker nodes with different versions of MCR and MKE.

The phased in-place cluster upgrade workflow is useful when a user wants to manually control how workloads and traffic are migrated around a cluster during an upgrade. The process can also be used if a user wants to add additional worker node capacity during an upgrade to handle failover. Worker nodes can be added to a partially upgraded MKE Cluster, workloads migrated across, and previous worker nodes then taken offline and upgraded.

To start a phased upgrade of MKE, first upgrade all manager nodes to the new MKE version. To tell MKE to upgrade the manager nodes but not any worker nodes, pass --manual-worker-upgrade into the upgrade command.

To upgrade MKE, ensure that MCR on all MKE manager nodes has been upgraded to the corresponding new version. SSH to a MKE manager node and run the following command. The upgrade command should not be run on a workstation with a client bundle.

$ docker container run --rm -it \
  --name ucp \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  mirantis/ucp:3.3.2 \
  upgrade \
  --manual-worker-upgrade \
  --interactive

The --manual-worker-upgrade flag adds an upgrade-hold label to all worker nodes. MKE constantly monitors this label; when the label is removed from a node, MKE upgrades that node.

To trigger the upgrade on a worker node, you will have to remove the label.

$ docker node update --label-rm com.docker.ucp.upgrade-hold <node name or id>
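
To see which worker nodes still carry the hold label, you can inspect the label key shown above. A minimal sketch:

$ for n in $(docker node ls -q); do
    docker node inspect --format '{{ .Description.Hostname }}: {{ index .Spec.Labels "com.docker.ucp.upgrade-hold" }}' "$n"
  done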

Optional

Join new worker nodes to the cluster. Once the manager nodes have been upgraded to a new MKE version, new worker nodes can be added to the cluster, assuming they are running the corresponding new MCR version.

The swarm join token can be found in the MKE UI, or while ssh’d on a MKE manager node. For more information, refer to Join Linux nodes to your cluster.

$ docker swarm join --token SWMTKN-<YOUR TOKEN> <manager ip>:2377

Replace existing worker nodes using blue-green deployment

The replace existing worker nodes using blue-green deployment workflow creates a parallel environment for the new deployment. This can greatly reduce downtime, allows worker node engines to be upgraded without disrupting workloads, and allows traffic to be migrated to the new environment with worker node rollback capability.

Note

Steps 2 through 6 can be repeated for groups of nodes - you do not have to replace all worker nodes in the cluster at one time.

  1. Upgrade manager nodes

    • The --manual-worker-upgrade command automatically upgrades manager nodes first, and then allows you to control the upgrade of the MKE components on the worker nodes using node labels.

      $ docker container run --rm -it \
      --name ucp \
      --volume /var/run/docker.sock:/var/run/docker.sock \
      mirantis/ucp:3.3.2 \
      upgrade \
      --manual-worker-upgrade \
      --interactive
      
  2. Join new worker nodes

    • New worker nodes have newer MCRs already installed and have the new MKE version running when they join the cluster. On the manager node, run commands similar to the following examples to get the Swarm Join token and add new worker nodes:

      docker swarm join-token worker
      
      • On the node to be joined:

      docker swarm join --token SWMTKN-<YOUR TOKEN> <manager ip>:2377
      
  3. Join MCR to the cluster.

      docker swarm join --token SWMTKN-<YOUR TOKEN> <manager ip>:2377

  4. Pause all existing worker nodes

    • This ensures that new workloads are not deployed on existing nodes.

      docker node update --availability pause <node name>
      
  5. Drain paused nodes for workload migration

    • Redeploy workloads on all existing nodes to new nodes. Because all existing nodes are paused, workloads are automatically rescheduled onto new nodes.

      docker node update --availability drain <node name>
      
  6. Remove drained nodes

    • After each node is fully drained, it can be shut down and removed from the cluster. On each worker node that is being removed from the cluster, run a command similar to the following example:

      docker swarm leave <node name>
      
    • Run a command similar to the following example on the manager node when the old worker becomes unresponsive:

      docker node rm <node name>
      
  7. Remove old MKE agents

    • After upgrade completion, remove the old MKE agents, including the s390x and Windows agents, that were carried over from the previous install by running the following commands on the manager node:

      docker service rm ucp-agent
      docker service rm ucp-agent-win
      docker service rm ucp-agent-s390x
      

Troubleshooting

  • Upgrade compatibility

    The upgrade command automatically checks for multiple ucp-worker-agents before proceeding with the upgrade. The existence of multiple ucp-worker-agents might indicate that the cluster is still in the middle of a prior manual upgrade, and you must resolve the conflicting node label issues before proceeding with the upgrade.

  • Upgrade failures

    For worker nodes, an upgrade failure can be rolled back by changing the node label back to the previous target version. Rollback of manager nodes is not supported.

  • Kubernetes errors in node state messages after upgrading MKE

    The following information applies if you have upgraded to UCP 3.0.0 or newer:

    • After performing a MKE upgrade from 2.2.x to 3.x.x, you might see unhealthy nodes in your MKE dashboard with any of the following errors listed:

      Awaiting healthy status in Kubernetes node inventory
      Kubelet is unhealthy: Kubelet stopped posting node status
      
    • Alternatively, you may see other port errors such as the one below in the ucp-controller container logs:

      http: proxy error: dial tcp 10.14.101.141:12388: connect: no route to host
      

Upgrade Offline

The procedure for upgrading Mirantis Kubernetes Engine is the same whether your hosts have access to the internet or not.

The only difference when installing on an offline host is that instead of pulling the MKE images directly to the computer you’re installing on, you use a computer that’s connected to the internet to download a single package with all the images. Then you copy this package to the host where you upgrade MKE.

Download the offline package

To install a MKE package from the command line, use wget to pull down the version you want to install. Even if you are installing offline, you need a way to get the files. You can see the current packages in the Versions available section of this topic.

  1. Use the MKE package URL for the version you want to install in place of the <mke-package-url> parameter of the wget command.

    $ wget <mke-package-url> -O ucp.tar.gz

  2. Now that you have the package on your local machine, transfer it to the machines where you want to upgrade MKE.

For each machine that you want to manage with MKE:

  1. Copy the offline package to the machine.

    $ scp ucp.tar.gz <user>@<host>
    
  2. Use ssh to log in to the hosts where you transferred the package.

  3. Load the MKE images.

    Once the package is transferred to the hosts, you can use the docker load command, to load the Docker images from the tar archive:

    $ docker load -i ucp.tar.gz
    

Uninstall MKE

MKE is designed to scale as your applications grow in size and usage. You can add and remove nodes from the cluster to make it scale to your needs.

You can also uninstall MKE from your cluster. In this case, the MKE services are stopped and removed, but your Mirantis Container Runtimes will continue running in swarm mode. Your applications will continue running normally.

If you only want to remove a single node from the MKE cluster, remove that node from the cluster rather than uninstalling MKE.

After you uninstall MKE from the cluster, you’ll no longer be able to enforce role-based access control (RBAC) to the cluster, or have a centralized way to monitor and manage the cluster. After uninstalling MKE from the cluster, you will no longer be able to join new nodes using docker swarm join, unless you reinstall MKE.

To uninstall MKE, log in to a manager node using ssh, and run the following command:

docker container run --rm -it \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --name ucp \
  mirantis/ucp:3.3.2 uninstall-ucp --interactive

This runs the uninstall command in interactive mode, so that you are prompted for any necessary configuration values.

If the uninstall-ucp command fails, you can run the following commands to manually uninstall MKE:

# Run the following command on one manager node to remove remaining MKE services
docker service rm $(docker service ls -f name=ucp- -q)

# Run the following command on each manager node to remove remaining MKE containers
docker container rm -f $(docker container ps -a -f name=ucp- -f name=k8s_ -q)

# Run the following command on each manager node to remove remaining MKE volumes
docker volume rm $(docker volume ls -f name=ucp -q)

The MKE configuration is kept in case you want to reinstall MKE with the same configuration. If you want to also delete the configuration, run the uninstall command with the --purge-config option.

Refer to the MKE CLI Reference documentation to see the available options.

Once the uninstall command finishes, MKE is completely removed from all the nodes in the cluster. You don’t need to run the command again from other nodes.

Swarm mode CA

After uninstalling MKE, the nodes in your cluster will still be in swarm mode, but you can’t join new nodes until you reinstall MKE, because swarm mode relies on MKE to provide the CA certificates that allow nodes in the cluster to identify one another. Also, since swarm mode is no longer controlling its certificates, if the certificates expire after you uninstall MKE, the nodes in the swarm won’t be able to communicate at all. To fix this, either reinstall MKE before the certificates expire or disable swarm mode by running docker swarm leave --force on every node.

Restore IP tables

When you install MKE, the Calico network plugin changes the host’s IP tables. When you uninstall MKE, the IP tables aren’t reverted to their previous state. After you uninstall MKE, restart the node to restore its IP tables.