Cluster update known issues

This section lists the cluster update known issues with workarounds for the Mirantis OpenStack for Kubernetes release 21.6.

[4288] Cluster update failure with kubelet being stuck

A MOS managed cluster may fail to update to the latest Cluster release with kubelet being stuck and reporting authorization errors.

The cluster is affected by the issue if you see the Failed to make webhook authorizer request: context canceled error in the kubelet logs:

docker logs ucp-kubelet --since 5m 2>&1 | grep 'Failed to make webhook authorizer request: context canceled'

As a workaround, restart the ucp-kubelet container on the affected node(s):

ctr -n com.docker.ucp snapshot rm ucp-kubelet
docker rm -f ucp-kubelet


Ignore failures in the output of the first command, if any.
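
The check and the restart above can be combined into a small per-node script. The following is a minimal sketch, not part of the official procedure: the function names kubelet_authorizer_stuck and restart_ucp_kubelet are hypothetical, and the commands inside them are the ones listed above.

```shell
# Hypothetical helper: reads kubelet log text on stdin and succeeds
# if the authorization error described in this issue is present.
kubelet_authorizer_stuck() {
  grep -q 'Failed to make webhook authorizer request: context canceled'
}

# Hypothetical helper: applies the workaround on the current node.
# A failure of the snapshot removal is ignored, as noted above.
restart_ucp_kubelet() {
  ctr -n com.docker.ucp snapshot rm ucp-kubelet || true
  docker rm -f ucp-kubelet
}

# Usage on an affected node:
#   if docker logs ucp-kubelet --since 5m 2>&1 | kubelet_authorizer_stuck; then
#     restart_ucp_kubelet
#   fi
```

Keeping the detection separate from the restart makes it possible to run only the check across all nodes first and restart only the nodes that report the error.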

[16987] Cluster update fails at Ceph CSI pod eviction

An update of a MOS managed cluster may fail with the "ceph csi-driver is not evacuated yet, waiting…" error during the Ceph CSI pod eviction.

Workaround:

  1. Scale the affected StatefulSet of the pod that fails to init down to 0 replicas. If the pod belongs to a DaemonSet, such as nova-compute, make sure that it is not scheduled on the affected node.

  2. On every csi-rbdplugin pod, search for stuck csi-vol:

    rbd device list | grep <csi-vol-uuid>
  3. Unmap the affected csi-vol:

    rbd unmap -o force /dev/rbd<i>
  4. Delete the volumeattachment of the affected pod:

    kubectl get volumeattachments | grep <csi-vol-uuid>
    kubectl delete volumeattachment <id>
  5. Scale the affected StatefulSet back to the original number of replicas and wait until its state is Running. If it is a DaemonSet, run the pod on the affected node again.
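
Steps 2 and 3 above require picking the device path for a given csi-vol out of the rbd device list output. The following is a minimal sketch of a filter for that; the function name find_rbd_device is hypothetical, and the sketch assumes that the device path (/dev/rbd<i>) is the last column of the output.

```shell
# Hypothetical helper: reads `rbd device list` output on stdin and prints
# the device path of every mapping whose line contains the given csi-vol UUID.
# Assumption: the device path is the last whitespace-separated column.
find_rbd_device() {
  awk -v id="$1" 'index($0, id) > 0 { print $NF }'
}

# Usage inside a csi-rbdplugin pod:
#   rbd device list | find_rbd_device <csi-vol-uuid>
#   rbd unmap -o force /dev/rbd<i>
```

The helper only prints candidates. Run the unmap command manually after confirming that the printed device belongs to the stuck volume.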

[15525] HelmBundle controller gets stuck during cluster update

The HelmBundle controller that handles OpenStack releases gets stuck during cluster update and does not apply HelmBundle changes. The issue is caused by an unlimited release history, which increases the amount of RAM consumed by Tiller. The workaround is to manually limit the release history to 3 revisions.


  1. Remove the old releases:

    1. Clean up releases in the stacklight namespace:

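      function cleanup_release_history {
         pattern=$1; left_items=${2:-3}   # ConfigMap name pattern; number of newest revisions to keep
         for i in $(kubectl -n stacklight get cm | grep "$pattern" | awk '{print $1}' | sort -V | head -n -${left_items}); do
           kubectl -n stacklight delete cm "$i"
         done
      }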

      For example:

      kubectl -n stacklight get cm |grep "openstack-cinder.v" | awk '{print $1}'
      cleanup_release_history openstack-cinder.v
  2. Fix the releases in the FAILED state:

    1. Connect to one of StackLight Helm controller pods and list the releases in the FAILED state:

      kubectl -n stacklight exec -it stacklight-helm-controller-699cc6949-dtfgr -- sh
      ./helm --host localhost:44134 list

      Example of system response:

      # openstack-heat            2313   Wed Jun 23 06:50:55 2021   FAILED   heat-0.1.0-mcp-3860      openstack
      # openstack-keystone        76     Sun Jun 20 22:47:50 2021   FAILED   keystone-0.1.0-mcp-3860  openstack
      # openstack-neutron         147    Wed Jun 23 07:00:37 2021   FAILED   neutron-0.1.0-mcp-3860   openstack
      # openstack-nova            1      Wed Jun 23 07:09:43 2021   FAILED   nova-0.1.0-mcp-3860      openstack
      # openstack-nova-rabbitmq   15     Wed Jun 23 07:04:38 2021   FAILED   rabbitmq-0.1.0-mcp-2728  openstack
    2. Determine the reason for a release failure. Typically, this is due to changes in the immutable objects (jobs). For example:

      ./helm --host localhost:44134 history openstack-mariadb

      Example of system response:

      REVISION   UPDATED                    STATUS     CHART                   APP VERSION   DESCRIPTION
      173        Thu Jun 17 20:26:14 2021   DEPLOYED   mariadb-0.1.0-mcp-2710                Upgrade complete
      212        Wed Jun 23 07:07:58 2021   FAILED     mariadb-0.1.0-mcp-2728                Upgrade "openstack-mariadb" failed: Job.batch "openstack-...
      213        Wed Jun 23 07:55:22 2021   FAILED     mariadb-0.1.0-mcp-2728                Upgrade "openstack-mariadb" failed: Job.batch "exporter-c...
    3. Remove the FAILED job and roll back the release. For example:

      kubectl -n openstack delete job -l application=mariadb
      ./helm --host localhost:44134 rollback openstack-mariadb 213
    4. Verify that the release is in the DEPLOYED state. For example:

      ./helm --host localhost:44134 history openstack-mariadb
    5. Perform the steps above for all releases in the FAILED state one by one.

  3. Set the TILLER_HISTORY_MAX environment variable in the StackLight Helm controller deployment to 3:

    kubectl -n stacklight edit deployment stacklight-helm-controller
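
Before deleting anything in step 1, it can help to preview which ConfigMaps fall outside the newest three revisions, and in step 2 to extract only the names of the FAILED releases. The following is a minimal sketch of two filters that read text on stdin and delete nothing. The function names select_old_releases and failed_releases are hypothetical, head -n with a negative count assumes GNU coreutils, and the column layout is assumed to match the example responses above.

```shell
# Hypothetical helper: reads `kubectl get cm` output on stdin and prints the
# release ConfigMap names that are older than the newest N revisions.
#   $1: ConfigMap name pattern, for example openstack-cinder.v
#   $2: number of revisions to keep (default: 3)
select_old_releases() {
  local pattern=$1 keep=${2:-3}
  grep -F -- "$pattern" | awk '{print $1}' | sort -V | head -n "-${keep}"
}

# Hypothetical helper: reads `helm list` output on stdin and prints the
# names of the releases that have a FAILED status column.
failed_releases() {
  awk '{
    sub(/^[ \t]*#[ \t]*/, "")   # drop a leading "# " prefix, if any
    for (i = 2; i <= NF; i++) if ($i == "FAILED") { print $1; break }
  }'
}

# Usage:
#   kubectl -n stacklight get cm | select_old_releases openstack-cinder.v 3
#   ./helm --host localhost:44134 list | failed_releases
```

Reviewing the printed names first reduces the risk of removing the ConfigMap of the currently deployed revision.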

[18871] MySQL crashes during managed cluster update or instance live migration

MySQL may crash during instance live migration or during a managed cluster update from version 6.19.0 to 6.20.0. After the crash, the affected MariaDB replica cannot connect to the cluster and gets stuck in the CrashLoopBackOff state.

Workaround:

  1. Verify that other MariaDB replicas are up and running and have joined the cluster:

    1. Verify that at least 2 pods are running and operational (2/2 and Running):

      kubectl -n openstack get pods |grep maria

      Example of system response where the pods mariadb-server-0 and mariadb-server-2 are operational:

      mariadb-controller-77b5ff47d5-ndj68   1/1     Running     0          39m
      mariadb-server-0                      2/2     Running     0          39m
      mariadb-server-1                      0/2     Running     0          39m
      mariadb-server-2                      2/2     Running     0          39m
    2. Log in to each operational pod and verify that the node is Primary and the cluster size is at least 2. For example:

      mysql -u root -p$MYSQL_DBADMIN_PASSWORD -e "show status;" |grep -e \
      wsrep_cluster_size -e "wsrep_cluster_status" -e "wsrep_local_state_comment"

      Example of system response:

      wsrep_cluster_size          2
      wsrep_cluster_status        Primary
      wsrep_local_state_comment   Synced
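  2. Remove the content of the /var/lib/mysql directory on the failed replica, mariadb-server-1 in this example. Run the command through a shell so that the wildcard is expanded inside the container:

    kubectl -n openstack exec -it mariadb-server-1 -- sh -c 'rm -rf /var/lib/mysql/*'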
  3. Restart the affected MariaDB pod by deleting it:

    kubectl -n openstack delete pod mariadb-server-1
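
Because step 2 wipes the data directory of a replica, the health check from step 1 is worth automating so that it is not skipped. The following is a minimal sketch of such a check; the function name galera_node_healthy is hypothetical, and the usage line assumes that MYSQL_DBADMIN_PASSWORD is set in the environment of the container that kubectl exec selects.

```shell
# Hypothetical helper: reads `show status;` output on stdin and succeeds only
# if the node reports Primary, Synced, and a cluster size of at least $1
# (default: 2). Missing variables are treated as unhealthy.
galera_node_healthy() {
  awk -v min="${1:-2}" '
    $1 == "wsrep_cluster_size"        { size = $2 + 0 }
    $1 == "wsrep_cluster_status"      { status = $2 }
    $1 == "wsrep_local_state_comment" { state = $2 }
    END { exit !(size >= min && status == "Primary" && state == "Synced") }
  '
}

# Usage for an operational replica:
#   kubectl -n openstack exec mariadb-server-0 -- sh -c \
#     'mysql -u root -p$MYSQL_DBADMIN_PASSWORD -e "show status;"' \
#     | galera_node_healthy && echo "mariadb-server-0 is healthy"
```

Proceed with removing the data of the failed replica only if the check succeeds on at least two other replicas.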