Precautions for a cluster machine deletion

Before deleting a cluster machine, review the following precautions to ensure that the deletion succeeds:

  • Mirantis recommends deleting cluster machines using the Container Cloud web UI or API instead of using the provider tools directly. Otherwise, the cluster deletion or detachment may hang and additional manual steps will be required to clean up machine resources.

  • An operational managed cluster must contain a minimum of 3 Kubernetes manager machines to meet the etcd quorum and 2 Kubernetes worker machines.

    The deployment of the cluster does not start until the minimum number of machines is created.

    A machine with the manager role is automatically deleted during the cluster deletion. Manual deletion of manager machines is allowed only for the purpose of node replacement or recovery.

  • Consider the following precautions before deleting manager machines:

    • Create a new manager machine to replace the deleted one as soon as possible, because the cluster has limited capability to tolerate faults after the machine removal. Deletion of manager machines is intended only for replacement or recovery of failed nodes.

    • You can delete a manager machine only if your cluster has at least two manager machines in the Ready state.

    • Do not delete more than one manager machine at once to prevent cluster failure and data loss.

    • After deleting a manager machine, proceed with the additional manual steps described in Replace a failed controller node.

    • Before replacing a failed manager machine, make sure that all Deployments with replicas configured to 1 are ready.

    • Ensure that the machine to delete does not host a Ceph Monitor. Otherwise, migrate the Ceph Monitor first to keep an odd number of Ceph Monitors in the quorum after the machine deletion. For details, see Migrate a Ceph Monitor before machine replacement.

  • If StackLight in HA mode is enabled and you are going to delete a machine with the StackLight label:

    • Make sure that at least 3 machines with the StackLight label remain after the deletion. Otherwise, add another machine with this label before the deletion. After the deletion, perform the additional steps described in the deletion procedure, if required.

    • Do not delete more than 1 machine with the StackLight label. Because StackLight in HA mode uses local volumes bound to machines, the data stored in these volumes on the deleted machine is purged, while its replicas remain on other machines. Removing more than 1 machine can cause data loss.

  • If you move the StackLight label to a new worker machine on an existing cluster, manually deschedule all StackLight components from the old worker machine from which you remove the label. For details, see Container Cloud documentation: StackLight operations - Deschedule StackLight Pods from a worker machine.

  • If the machine being deleted has a prioritized upgrade index and you want to preserve the same upgrade order, manually set the required index on the new node that replaces the deleted one. Otherwise, the new node is automatically assigned the greatest upgrade index, which is prioritized last. To set the upgrade index, refer to Change the upgrade order of a machine or machine pool.
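
The manager, Ceph Monitor, and StackLight precautions above can be expressed as a simple pre-deletion check. The following Python sketch is illustrative only: the Machine class, its fields, and the deletion_blockers function are hypothetical names introduced for this example and are not part of the Container Cloud API. It also assumes that the "at least two manager machines in the Ready state" rule counts all manager machines in the cluster, including the one to delete if it is Ready.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Machine:
    """Hypothetical, simplified view of a cluster machine (not a Container Cloud type)."""
    name: str
    role: str                   # "manager" or "worker"
    ready: bool = True          # machine is in the Ready state
    stacklight: bool = False    # machine carries the StackLight label
    ceph_monitor: bool = False  # machine hosts a Ceph Monitor


def deletion_blockers(machines: List[Machine], target: str,
                      stacklight_ha: bool = True) -> List[str]:
    """Return the reasons why deleting `target` would violate the precautions."""
    by_name = {m.name: m for m in machines}
    if target not in by_name:
        raise KeyError(f"unknown machine: {target}")
    victim = by_name[target]
    blockers: List[str] = []

    # Manager rule: at least two manager machines must be Ready.
    # Assumption: the count is cluster-wide and may include the target itself.
    if victim.role == "manager":
        ready_managers = [m for m in machines if m.role == "manager" and m.ready]
        if len(ready_managers) < 2:
            blockers.append("fewer than 2 manager machines are in the Ready state")

    # Ceph rule: migrate the Ceph Monitor before deleting its machine.
    if victim.ceph_monitor:
        blockers.append("machine hosts a Ceph Monitor; migrate it first")

    # StackLight HA rule: at least 3 labeled machines must remain afterwards.
    if stacklight_ha and victim.stacklight:
        remaining = [m for m in machines if m.stacklight and m.name != target]
        if len(remaining) < 3:
            blockers.append("fewer than 3 StackLight machines would remain")

    return blockers
```

An empty result means that none of the modeled precautions is violated. The sketch does not cover the remaining items, such as deleting only one machine at a time or verifying that single-replica Deployments are ready, which still require a manual check.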