Parallelizing node update operations¶
Available since MOSK 23.2 TechPreview
MOSK enables you to parallelize node update operations, significantly improving the efficiency of your deployment. This capability applies to any operation that utilizes the Node Maintenance API, such as cluster updates or graceful node reboots.
The core implementation of parallel updates is handled by the LCM Controller, which ensures seamless execution of parallel operations. The LCM Controller starts performing an operation on a node only when all NodeWorkloadLock objects for that node are marked as inactive. By default, the LCM Controller creates one NodeMaintenanceRequest at a time.
Each application controller, including Ceph, OpenStack, and Tungsten Fabric Controllers, manages parallel NodeMaintenanceRequest objects independently. The controllers determine how to handle and execute parallel node maintenance requests based on specific requirements of their respective applications.
To understand the workflow of the Node Maintenance API, refer to
WorkloadLock objects.
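For illustration, a NodeWorkloadLock that no longer blocks LCM operations on its node may look similar to the following sketch. The object name, API version, and field layout are assumptions made for this example; refer to WorkloadLock objects for the exact schema of your release.

```yaml
# Hypothetical NodeWorkloadLock released by the OpenStack Controller.
# Field names and values are illustrative assumptions, not an exact schema.
apiVersion: lcm.mirantis.com/v1alpha1
kind: NodeWorkloadLock
metadata:
  name: openstack-kaas-node-example      # typically <controller>-<node name>
spec:
  nodeName: kaas-node-example            # node targeted by the lock
  controllerName: openstack              # application controller that owns the lock
status:
  state: inactive                        # inactive means LCM can operate on this node
```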
Enhancing parallelism during node updates¶
- Set the node update order.
  You can optimize parallel updates by setting the order in which nodes are updated. To accomplish this, configure the upgradeIndex parameter of the Machine object, as illustrated in the sketch after this list. For the procedure, refer to Mirantis Container Cloud: Change upgrade order for machines.
- Increase parallelism.
  Boost parallelism by adjusting the maximum number of worker node updates allowed during LCM operations using the spec.providerSpec.value.maxWorkerUpgradeCount configuration parameter, which is set to 1 by default. The sketch after this list includes an example. For configuration details, refer to Mirantis Container Cloud: Configure the parallel update of worker nodes.
- Execute LCM operations.
  Run LCM operations, such as cluster updates, taking advantage of the increased parallelism.
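The following sketch illustrates the two settings described above. The placement of upgradeIndex in the Machine object and the API versions shown are assumptions made for this example; follow the referenced Mirantis Container Cloud procedures for the exact fields of your release.

```yaml
# Illustrative sketch only, not a verified manifest.
---
# Machine object: machines with lower upgradeIndex values are updated earlier.
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: example-worker-1
spec:
  providerSpec:
    value:
      upgradeIndex: 1              # assumed field placement; update this machine first
---
# Cluster object: allow several worker nodes to be updated in parallel.
apiVersion: cluster.k8s.io/v1alpha1
kind: Cluster
metadata:
  name: example-cluster
spec:
  providerSpec:
    value:
      maxWorkerUpgradeCount: 3     # default is 1; raise to increase parallelism
```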
OpenStack nodes update¶
By default, the OpenStack Controller handles the NodeMaintenanceRequest objects as follows:
- Updates the OpenStack controller nodes sequentially (one by one).
- Updates the gateway nodes sequentially. Technically, you can increase the number of gateway node upgrades allowed in parallel using the nwl_parallel_max_gateway parameter, but Mirantis does not recommend doing so.
- Updates the compute nodes in parallel. The default number of allowed parallel updates is 30. You can adjust this value through the nwl_parallel_max_compute parameter.
Parallelism considerations for compute nodes
When considering parallelism for compute nodes, take into account that during restarts of certain pods, for example, the openvswitch-vswitchd pods, a brief instance downtime may occur. Select a suitable level of parallelism to minimize the impact on workloads and to prevent excessive load on the control plane nodes.
If your cloud environment is distributed across failure domains, represented by Nova availability zones, you can limit the parallel updates of nodes to only those within the same availability zone. This behavior is controlled by the respect_nova_az option in the OpenStack Controller.
The OpenStack Controller configuration is stored in the openstack-controller-config configMap of the osh-system namespace. The options are picked up automatically after the configMap is updated. To learn more about the OpenStack Controller configuration parameters, refer to OpenStack Controller configuration.
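As an illustration, the maintenance-related options could be set in this configMap similar to the following sketch. The data key name and the section layout are assumptions made for this example; refer to OpenStack Controller configuration for the exact format and defaults.

```yaml
# Illustrative sketch only: the data key and section names are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: openstack-controller-config
  namespace: osh-system
data:
  extra_conf.ini: |
    [maintenance]
    # Allow up to 10 compute nodes in maintenance at the same time (default is 30)
    nwl_parallel_max_compute = 10
    # Keep gateway node updates sequential, as recommended by Mirantis
    nwl_parallel_max_gateway = 1
    # Limit parallel compute updates to nodes within the same Nova availability zone
    respect_nova_az = true
```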
Ceph nodes update¶
By default, the Ceph Controller handles the NodeMaintenanceRequest objects as follows:
- Updates the non-storage nodes sequentially. Non-storage nodes include all nodes that have the mon, mgr, rgw, or mds roles.
- Updates the storage nodes in parallel. The default number of allowed parallel updates is calculated automatically based on the minimal failure domain in a Ceph cluster.
Parallelism calculations for storage nodes
The Ceph Controller automatically calculates the parallelism number in the following way:
- Finds the minimal failure domain for a Ceph cluster. For example, the minimal failure domain is rack.
- Filters all currently requested nodes by the minimal failure domain. For example, parallelism equals 5, and LCM requests 3 nodes from the rack1 rack and 2 nodes from the rack2 rack.
- Handles each filtered node group one by one. For example, the controller handles in parallel all nodes from rack1 before processing the nodes from rack2.
The Ceph Controller handles non-storage nodes before the storage ones. If there are node requests for both node types, the Ceph Controller first handles the non-storage nodes sequentially. Therefore, Mirantis recommends assigning a higher-priority upgrade index to the non-storage nodes to decrease the total upgrade time.
If the minimal failure domain is host, the Ceph Controller updates only one storage node per failure domain unit. This results in updating all Ceph nodes sequentially, despite the potential for increased parallelism.
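For context, the failure domain that drives this calculation is defined in the Ceph cluster specification. The following sketch assumes it is set per pool in the KaaSCephCluster object; the exact fields may differ between releases, so consult the Ceph configuration documentation.

```yaml
# Illustrative sketch only: field names are assumptions based on a typical
# KaaSCephCluster specification.
apiVersion: kaas.mirantis.com/v1alpha1
kind: KaaSCephCluster
metadata:
  name: example-ceph-cluster
  namespace: example-project
spec:
  cephClusterSpec:
    pools:
    - name: kubernetes
      role: kubernetes
      deviceClass: hdd
      replicated:
        size: 3
      # With a rack-level failure domain, storage nodes within the same rack can
      # be placed into maintenance in parallel; with host, updates are effectively
      # sequential because each node is its own failure domain.
      failureDomain: rack
```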
Tungsten Fabric nodes update¶
By default, the Tungsten Fabric Controller handles the NodeMaintenanceRequest objects as follows:
- Updates the Tungsten Fabric Controller and gateway nodes sequentially.
- Updates the vRouter nodes in parallel. The Tungsten Fabric Controller allows updating up to 30 vRouter nodes in parallel.
Maximum number of vRouter nodes in maintenance
While the Tungsten Fabric Controller can process up to 30 NodeMaintenanceRequest objects targeting vRouter nodes, the actual number may be lower. This is due to a check that ensures OpenStack readiness to unlock the relevant nodes for maintenance. If OpenStack allows the maintenance, the Tungsten Fabric Controller verifies the vRouter pods. Upon successful verification, the NodeWorkloadLock object is switched to the maintenance mode.