Precautions for cluster machine deletion

Before deleting a cluster machine, carefully read the following essential information to ensure a successful machine deletion:

  • We recommend deleting cluster machines using the Container Cloud web UI or API instead of using the cloud provider tools directly. Otherwise, the cluster deletion or detachment may hang and additional manual steps will be required to clean up machine resources.
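
    For example, the API path is to delete the corresponding Machine object from the management cluster. A minimal sketch, assuming that Container Cloud exposes machines as machines.cluster.k8s.io objects in the cluster project namespace; replace the placeholders with your values:

    ```bash
    # Delete a machine through the Container Cloud management cluster API
    # rather than through the cloud provider tools.
    kubectl --kubeconfig <mgmt-kubeconfig> -n <project> \
      delete machines.cluster.k8s.io <machine-name>
    ```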

  • An operational managed cluster must contain a minimum of 3 Kubernetes manager machines, required to maintain the etcd quorum, and 2 Kubernetes worker machines.

    The deployment of the cluster does not start until the minimum number of machines is created.
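
    To verify the current machine count before a deletion, you can list the cluster machines with their roles and statuses. A minimal sketch, assuming the same machines.cluster.k8s.io resource and project namespace as above:

    ```bash
    # List all machines of the cluster; verify that at least 3 manager and
    # 2 worker machines remain present and healthy.
    kubectl --kubeconfig <mgmt-kubeconfig> -n <project> \
      get machines.cluster.k8s.io -o wide
    ```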

    A machine with the manager role is automatically deleted during the cluster deletion. Manual deletion of manager machines is allowed only for the purpose of node replacement or recovery.

    Support status of manager machine deletion

    • Since the Cluster releases 17.0.0, 16.0.0, and 14.1.0, the feature is generally available.

    • Before the Cluster releases 16.0.0 and 14.1.0, the feature is available within the Technology Preview features scope for non-MOSK-based clusters.

    • Before the Cluster release 17.0.0, the feature is not supported for MOSK.

  • Consider the following precautions before deleting manager machines:

    • Create a new manager machine to replace the deleted one as soon as possible. This is necessary because, after machine removal, the cluster has limited capabilities to tolerate faults. Deletion of manager machines is intended only for replacement or recovery of failed nodes.

    • You can delete a manager machine only if your cluster has at least two manager machines in the Ready state.
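
      For example, assuming that manager machines carry the cluster.sigs.k8s.io/control-plane label and report readiness in status.providerStatus.status (both assumptions to verify against your Cluster release):

      ```bash
      # Print each manager machine with its reported status; proceed with the
      # deletion only if at least 2 of them are Ready.
      kubectl -n <project> get machines.cluster.k8s.io \
        -l cluster.sigs.k8s.io/control-plane \
        -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.providerStatus.status}{"\n"}{end}'
      ```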

    • Do not delete more than one manager machine at once to prevent cluster failure and data loss.

    • For MOSK-based clusters, after deletion of a manager machine, proceed with additional manual steps described in Mirantis OpenStack for Kubernetes Operations Guide: Replace a failed controller node.

    • Before replacing a failed manager machine, make sure that all Deployments with replicas configured to 1 are ready.
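
      One way to verify this on the managed cluster is to filter single-replica Deployments with jq (a sketch; any equivalent check works):

      ```bash
      # List Deployments configured with a single replica and show how many
      # replicas are ready; every listed entry must report ready=1.
      kubectl get deployments --all-namespaces -o json \
        | jq -r '.items[]
            | select(.spec.replicas == 1)
            | "\(.metadata.namespace)/\(.metadata.name) ready=\(.status.readyReplicas // 0)"'
      ```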

    • For the bare metal provider, ensure that the machine to delete is not a Ceph Monitor. Otherwise, migrate the Ceph Monitor to keep an odd number of Ceph Monitors in the quorum after the machine deletion. For details, see Migrate a Ceph Monitor before machine replacement.
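
      To check whether the machine hosts a Ceph Monitor, you can inspect the Rook pods, assuming Ceph is deployed by Rook in the rook-ceph namespace (typical for Container Cloud bare metal clusters):

      ```bash
      # Show the nodes that run Ceph Monitor pods; the machine to delete must
      # not appear in the NODE column.
      kubectl get pods -n rook-ceph -l app=rook-ceph-mon -o wide
      ```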

    • On managed clusters, deleting a machine assigned to a machine pool without decreasing the replicas count of the pool automatically recreates the machine in the pool. Therefore, to delete a machine from a machine pool, first decrease the pool replicas count.
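
      For example, assuming machine pools are exposed as machinepools.kaas.mirantis.com objects with a spec.replicas field (verify the resource name and field against your release):

      ```bash
      # Decrease the pool replicas count (here, to 2) so that the pool
      # controller does not recreate the machine you are about to delete.
      kubectl -n <project> patch machinepools.kaas.mirantis.com <pool-name> \
        --type=merge -p '{"spec":{"replicas":2}}'
      ```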

  • If StackLight in HA mode is enabled and you are going to delete a machine with the StackLight label:

    • Make sure that at least 3 machines with the StackLight label remain after the deletion. Otherwise, add an additional machine with this label before the deletion. After the deletion, perform the additional steps described in the deletion procedure, if required.
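
      For example, on the managed cluster, assuming the StackLight label is applied to nodes as stacklight=enabled (verify the exact label against your configuration):

      ```bash
      # Count the nodes that carry the StackLight label; at least 3 must
      # remain after the planned deletion.
      kubectl get nodes -l stacklight=enabled --no-headers | wc -l
      ```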

    • Do not delete more than 1 machine with the StackLight label at a time. Since StackLight in HA mode uses local volumes bound to machines, the data from these volumes on the deleted machine is purged while its replicas remain on other machines. Removing more than 1 such machine at once can cause data loss.

  • If you move the StackLight label to a new worker machine on an existing cluster, manually deschedule all StackLight components from the old worker machine from which you remove the StackLight label. For details, see Deschedule StackLight Pods from a worker machine.
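
    To see which StackLight Pods still run on the old worker machine before descheduling them, assuming StackLight runs in the stacklight namespace:

    ```bash
    # List the StackLight pods still scheduled on the old worker node.
    kubectl get pods -n stacklight -o wide \
      --field-selector spec.nodeName=<old-node-name>
    ```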

  • If the machine being deleted has a prioritized upgrade index and you want to preserve the same upgrade order, manually set the required index on the new node that replaces the deleted one. Otherwise, the new node is automatically assigned the greatest upgrade index, which is prioritized last. To set the upgrade index, refer to Change the upgrade order of a machine or machine pool.
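
    For example, assuming the upgrade index is exposed as the upgradeIndex field under spec.providerSpec.value of the Machine object (a hypothetical field path to verify against the referenced procedure for your Cluster release):

    ```bash
    # Hypothetical field path: confirm where upgradeIndex lives in the Machine
    # object of your Cluster release before applying this patch.
    kubectl -n <project> patch machines.cluster.k8s.io <new-machine-name> \
      --type=merge -p '{"spec":{"providerSpec":{"value":{"upgradeIndex":1}}}}'
    ```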