Update notes

This section describes the specific actions that you, as a Cloud Operator, need to complete to accurately plan and successfully perform the update of your Mirantis OpenStack for Kubernetes (MOSK) cluster to version 22.1. Consider this information as a supplement to the generic update procedure published in Operations Guide: Update a MOSK cluster.

Additionally, read through the Cluster update known issues for the problems that are known to occur during the update and their recommended workarounds.

Features

Virtual CPU mode - new default

Starting from MOSK 22.1, the virtual CPU mode is set to host-model by default, which replaces the previous default kvm64 CPU model.

The new default option provides performance and workload portability, namely reliable live and cold migration of instances between hosts, and the ability to run modern guest operating systems, such as Windows Server.

For deployments where the virtual CPU mode settings are customized through spec:services, remove this customization in favor of the defaults after the update.
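The cleanup described above can be sketched as a small helper that strips the CPU overrides from a copy of the spec:services mapping. This is a minimal sketch, not the official procedure: the nesting path and the key names (`cpu_mode`, `cpu_model`, `cpu_models`) are assumptions about where such a customization typically lives, so verify them against your own OpenStackDeployment object before relying on the result.

```python
from copy import deepcopy

# Assumed location of the libvirt CPU settings inside spec:services.
# Verify this path against your own OpenStackDeployment object.
CPU_PATH = ("compute", "nova", "values", "conf", "nova", "libvirt")
# Assumed option names that carry the CPU customization.
CPU_KEYS = ("cpu_mode", "cpu_model", "cpu_models")


def drop_cpu_customization(services: dict) -> dict:
    """Return a copy of spec:services without the CPU mode overrides."""
    result = deepcopy(services)
    # Walk down the assumed path, remembering each parent on the way.
    parents = []
    node = result
    for key in CPU_PATH:
        if not isinstance(node, dict) or key not in node:
            return result  # nothing customized, nothing to remove
        parents.append((node, key))
        node = node[key]
    if not isinstance(node, dict):
        return result
    for key in CPU_KEYS:
        node.pop(key, None)
    # Prune containers that became empty, from the innermost outwards.
    for parent, key in reversed(parents):
        if parent[key] == {}:
            del parent[key]
        else:
            break
    return result


# Hypothetical customization, for illustration only.
services = {"compute": {"nova": {"values": {"conf": {"nova": {
    "libvirt": {"cpu_mode": "custom", "cpu_models": "kvm64"},
    "DEFAULT": {"debug": True},
}}}}}}
cleaned = drop_cpu_customization(services)
```

The helper works on a copy, so the original mapping stays available for comparison before you apply the change to the cluster.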

Update impact and maintenance windows planning

Host OS kernel version upgrade to v5.4

MOSK 22.1 updates the host OS kernel to v5.4. All nodes in the cluster are restarted to apply the change.

| Node group | Sequential restart | Impact on end users and workloads |
|---|---|---|
| Kubernetes master nodes | Yes | No |
| Control plane nodes | Yes | No |
| Storage nodes | Yes | No |
| Compute nodes | Yes | 15-20 minutes of downtime for workloads hosted on a compute node depending on the hardware specifications of the node |
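For maintenance window planning, the per-node figures above can be turned into a rough estimate for the compute node group. The sketch below assumes that compute nodes are restarted strictly one at a time and that each restart takes the 15-20 minutes quoted as workload downtime. It ignores the other node groups and any additional update steps, so treat the result as a lower bound. The function name and the node count are hypothetical.

```python
from typing import Tuple


def compute_restart_window(compute_nodes: int,
                           min_per_node: int = 15,
                           max_per_node: int = 20) -> Tuple[int, int]:
    """Return (lower, upper) total minutes for restarting compute nodes one by one."""
    if compute_nodes < 0:
        raise ValueError("compute_nodes must not be negative")
    return compute_nodes * min_per_node, compute_nodes * max_per_node


# Hypothetical cluster with 12 compute nodes.
low, high = compute_restart_window(12)
print(f"Compute node restarts: {low}-{high} minutes in total")
```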

Up to 1 minute of downtime for TF data plane

During the update of the Kubernetes master nodes, the internal DNS service of the Kubernetes cluster is temporarily unavailable. As a result, Tungsten Fabric vRouter pods lose connection with the control plane, which causes up to 1 minute of downtime for the Tungsten Fabric data plane nodes and affects tenant networking.

Post-upgrade actions

Manual restart of TF vRouter agent pods

To complete the update of a cluster with Tungsten Fabric, manually restart the Tungsten Fabric vRouter agent pods on all compute nodes. Restarting a vRouter agent on a compute node causes 30-60 seconds of networking downtime for each instance hosted on that node. If this downtime is unacceptable for some workloads, we recommend that you migrate them before restarting the vRouter pods.
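One way to prepare the manual restart is to generate the per-node commands up front and run them one node at a time. The sketch below only builds `kubectl delete pod` command lines and does not execute anything. It assumes that the vRouter agent pods run in a namespace named `tf` and that their controller recreates a pod after it is deleted. The node names and pod names are hypothetical.

```python
from typing import Dict, List

# Assumption: the vRouter agent pods run in the "tf" namespace.
TF_NAMESPACE = "tf"


def restart_commands(pods_by_node: Dict[str, str],
                     namespace: str = TF_NAMESPACE) -> List[List[str]]:
    """Build one `kubectl delete pod` command per compute node, in node-name order."""
    commands = []
    for node in sorted(pods_by_node):
        pod = pods_by_node[node]
        commands.append(["kubectl", "-n", namespace, "delete", "pod", pod])
    return commands


# Hypothetical node and pod names, for illustration only.
pods = {"cmp-2": "tf-vrouter-agent-bbbbb", "cmp-1": "tf-vrouter-agent-aaaaa"}
for cmd in restart_commands(pods):
    print(" ".join(cmd))
```

Keeping the commands as a reviewed list makes it easier to restart the agents node by node and to migrate sensitive workloads off a node before its command is run.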

Warning

Under certain rare circumstances, the reload of the vRouter kernel module triggered by the restart of a vRouter agent is known to hang due to the inability to complete the drop_caches operation. Watch the status and logs of the vRouter agent being restarted and trigger the reboot of the node if necessary.
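Watching a restarted agent can be sketched as a polling loop with a deadline. The helper below is a generic illustration, not part of the product: `is_ready` stands for whatever readiness check you use (for example, a function that inspects the pod status), and the 300-second timeout is an arbitrary placeholder. A `False` result only signals that the agent did not become ready in time, at which point you would inspect its logs and decide whether to reboot the node.

```python
import time
from typing import Callable


def wait_until_ready(is_ready: Callable[[], bool],
                     timeout_s: float = 300.0,
                     interval_s: float = 5.0,
                     clock: Callable[[], float] = time.monotonic,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False  # possible hang: check the logs, consider a reboot
        sleep(interval_s)
```

The `clock` and `sleep` parameters exist only so that the loop can be exercised without real waiting.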