Precautions for a cluster machine deletion

Before deleting a cluster machine, carefully read the following essential information for a successful machine deletion:

  • We recommend deleting cluster machines using the Container Cloud web UI or API instead of using the cloud provider tools directly. Otherwise, the cluster deletion or detachment may hang and additional manual steps will be required to clean up machine resources.

  • An operational managed cluster must contain a minimum of 3 Kubernetes manager nodes and 2 Kubernetes worker nodes. The deployment of the cluster does not start until the minimum number of nodes is created.

    A machine with the manager node role is automatically deleted during the cluster deletion.

    • Before Container Cloud 2.17.0, to meet the etcd quorum and prevent the deployment failure, deletion of the manager nodes is prohibited.

    • Since Container Cloud 2.17.0, deletion of the manager nodes is allowed for non-MOSK-based clusters within the Technology Preview features scope for the purpose of node replacement or recovery.

  • Consider the following precautions before deleting manager machines from non-MOSK-based clusters:

    • Create a new manager machine to replace the deleted one as soon as possible. This is necessary since after machine removal, the cluster has limited capabilities to tolerate faults. Deletion of manager machines is intended only for replacement or recovery of failed nodes.

    • You can delete a manager machine only if your cluster has at least two manager machines in the Ready state.

    • Do not delete more than one manager machine at once to prevent cluster failure and data loss.

    • For the Equinix Metal and bare metal providers, ensure that the machine to delete is not a Ceph Monitor. Otherwise, migrate the Ceph Monitor to keep the odd number quorum of Ceph Monitors after the machine deletion. For details, see Migrate a Ceph Monitor before machine replacement.

    • On managed clusters, deletion of a machine assigned to a machine pool without decreasing replicas count of a pool automatically recreates the machine in the pool. Therefore, to delete a machine from a machine pool, first decrease the pool replicas count.

  • If StackLight in HA mode is enabled and you are going to delete a machine with the StackLight label:

    • Make sure that at least 3 machines with the StackLight label remain after the deletion. Otherwise, add an additional machine with such label before the deletion. After the deletion, perform the additional steps described in the deletion procedure, if required.

    • Do not delete more than 1 machine with the StackLight label. Since StackLight in HA mode uses local volumes bound to machines, the data from these volumes on the deleted machine will be purged but its replicas remain on other machines. Removal of more than 1 machine can cause data loss.

  • If you move the StackLight label to a new worker machine on an existing cluster, manually deschedule all StackLight components from the old worker machine, which you remove the StackLight label from. For details, see Deschedule StackLight Pods from a worker machine.

  • If the machine being deleted has a prioritized upgrade index and you want to preserve the same upgrade order, manually set the required index to the new node that replaces the deleted one. Otherwise, the new node is automatically set the greatest upgrade index that is prioritized the last. To set the upgrade index, refer to Change the upgrade order of a machine or machine pool.