OpenStack Controller maintenance API

When LCM creates the ClusterMaintenanceRequest object, the OpenStack Controller (Rockoon) ensures that all OpenStack components are in the Healthy state, which means that the pods are up and running, and the readiness probes are passing.

ClusterMaintenanceRequest object creation flow

ClusterMaintenanceRequest - create
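
As an illustration of the Healthy condition described above, the following minimal sketch evaluates pod objects shaped like the Kubernetes Pod API (status.phase and the Ready condition). How the OpenStack Controller computes the state internally is not covered here, so the function names and the exact rule are assumptions made for this example.

```python
from typing import Any, Dict, List


def pod_is_healthy(pod: Dict[str, Any]) -> bool:
    """Return True if the pod is Running and its Ready condition is True.

    Assumption: "up and running" maps to status.phase == "Running" and
    "readiness probes are passing" maps to the Ready condition.
    """
    status = pod.get("status", {})
    if status.get("phase") != "Running":
        return False
    conditions = status.get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


def components_healthy(pods: List[Dict[str, Any]]) -> bool:
    """Return True only if at least one pod exists and every pod is healthy."""
    return bool(pods) and all(pod_is_healthy(p) for p in pods)
```

In this sketch, an empty pod list is treated as not healthy, so a component with no running pods does not pass the check by accident.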

When LCM creates the NodeMaintenanceRequest, the OpenStack Controller:

  1. Prepares components on the node for maintenance by removing nova-compute from scheduling.

  2. Triggers the instance migration workflow if the reboot of the node is possible. The operator can configure the instance migration flow through a Kubernetes node annotation and should set the required option before the managed cluster update starts. For configuration details, refer to Instance migration configuration for hosts.

    Also, since MOSK 25.1, cloud users can mark their instances for LCM to handle them individually during host maintenance operations. This allows for greater flexibility during cluster updates, especially for workloads that are sensitive to live migration. For details, refer to Configure per-instance migration mode.

  3. Suspends the node maintenance if it cannot migrate instances due to errors. The maintenance remains suspended until all instances are migrated manually or the openstack.lcm.mirantis.com/instance_migration_mode annotation is set to skip.

NodeMaintenanceRequest object creation flow

NodeMaintenanceRequest - create
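
The condition in step 3 can be sketched as a small decision function. Only the annotation key and the skip value come from the text above. The function name and the instances_left parameter are hypothetical and do not reflect the OpenStack Controller code.

```python
from typing import Dict

# Annotation key and the "skip" value are taken from the text above.
MIGRATION_MODE_ANNOTATION = "openstack.lcm.mirantis.com/instance_migration_mode"


def maintenance_can_proceed(node_annotations: Dict[str, str], instances_left: int) -> bool:
    """Decide whether a suspended node maintenance may continue.

    Hypothetical helper: maintenance continues if the operator set the
    migration mode to "skip", or if no instances remain on the node
    (for example, after a manual migration).
    """
    if node_annotations.get(MIGRATION_MODE_ANNOTATION) == "skip":
        return True
    return instances_left == 0
```

For example, a node with two instances left and no annotation stays suspended, while the same node with the annotation set to skip proceeds.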

When the node maintenance is over, LCM removes the NodeMaintenanceRequest object and the OpenStack Controller:

  • Verifies that the Kubernetes Node becomes Ready.

  • Verifies that all OpenStack components on a given node are Healthy, which means that the pods are up and running, and the readiness probes are passing.

  • Ensures that the OpenStack components are connected to RabbitMQ. For example, the Neutron agents on the node become alive, and compute instances are in the UP state.

Note

The OpenStack Controller allows only one nodeworkloadlock object to be in the inactive state at a time. Therefore, nodes are updated sequentially, one after another.

NodeMaintenanceRequest object removal flow

NodeMaintenanceRequest - delete
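
The sequential behavior described in the note can be illustrated with a short sketch. It assumes each nodeworkloadlock is represented by a plain "active" or "inactive" string keyed by node name. The function and the data shape are hypothetical, chosen only to show why one inactive lock blocks the rest.

```python
from typing import Dict, List, Optional


def next_node_to_release(lock_states: Dict[str, str], pending: List[str]) -> Optional[str]:
    """Return the next node whose lock may become inactive, or None.

    Hypothetical model: if any lock is already inactive, no other node may
    enter maintenance, which makes the node update sequential.
    """
    if any(state == "inactive" for state in lock_states.values()):
        return None
    for node in pending:
        if lock_states.get(node) == "active":
            return node
    return None
```

With two pending nodes and all locks active, the first pending node is selected. While its lock is inactive, the function returns None for every other node.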

When the cluster maintenance is over, the OpenStack Controller sets the ClusterWorkloadLock object back to active and the update completes.

ClusterMaintenanceRequest object removal flow

ClusterMaintenanceRequest - delete