Delete a cluster machine

This section describes how to scale down an existing management, regional, or managed cluster by deleting machines through the Mirantis Container Cloud web UI.

Precautions

Before deleting a cluster machine, carefully read the following essential information to ensure a successful machine deletion:

  • We recommend deleting cluster machines through the Container Cloud web UI or API instead of using the cloud provider tools directly. Otherwise, the cluster deletion or detachment may hang and additional manual steps will be required to clean up the machine resources. For an API-based deletion sketch, see the example after this list.

  • An operational managed cluster must contain a minimum of 3 Kubernetes manager nodes and 2 Kubernetes worker nodes. The deployment of the cluster does not start until the minimum number of nodes is created.

    A machine with the manager node role is automatically deleted during the cluster deletion.

    Before Container Cloud 2.17.0, deletion of manager nodes was prohibited to maintain the etcd quorum and prevent deployment failure.

  • Since Container Cloud 2.17.0, you can delete manager machines within the Technology Preview features scope and with the following precautions:

    • Create a new manager machine to replace the deleted one as soon as possible. This is necessary since after machine removal, the cluster has limited capabilities to tolerate faults. Deletion of manager machines is intended only for replacement or recovery of failed nodes.

    • You can delete a manager machine only if your cluster has at least two manager machines in the Ready state. To verify this through the API, see the sketch after this list.

    • Do not delete more than one manager machine at once to prevent cluster failure and data loss.

    • For MOSK-based clusters, after deletion of a manager machine, proceed with additional manual steps described in Mirantis OpenStack for Kubernetes Operations Guide: Replace a failed controller node.

    • For the Equinix Metal and bare metal providers, ensure that the machine to delete is not a Ceph Monitor. Otherwise, migrate the Ceph Monitor to keep the odd number quorum of Ceph Monitors after the machine deletion. For details, see Migrate a Ceph Monitor before machine replacement.

    • On managed clusters, if you delete a machine assigned to a machine pool without decreasing the replicas count of the pool, the machine is automatically recreated in the pool. Therefore, to delete a machine from a machine pool, first decrease the replicas count of the pool.

  • If StackLight in HA mode is enabled and you are going to delete a machine with the StackLight label:

    • Make sure that at least 3 machines with the StackLight label will remain after the deletion. Otherwise, add a machine with this label before the deletion. After the deletion, perform the additional steps described below.

    • Do not delete more than 1 machine with the StackLight label at a time. Since StackLight in HA mode uses local volumes bound to machines, the data from these volumes on the deleted machine is purged, but its replicas remain on the other machines. Removing more than 1 such machine at once can cause data loss.
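The following is a minimal sketch of how these precautions can be verified, and a machine deleted, through the API instead of the web UI. It assumes kubeconfig access to the management cluster and Cluster API-style Machine resources in the project namespace; the exact resource names, labels, and output columns may differ between Container Cloud releases:

  # Hypothetical pre-deletion checks and API-based deletion;
  # <project-namespace> and <machine-name> are placeholders.
  NAMESPACE=<project-namespace>

  # Inspect the machines and their statuses, for example, to confirm that
  # at least two manager machines remain in the Ready state:
  kubectl get machines -n "${NAMESPACE}" -o wide

  # Delete a machine through the API instead of the cloud provider tools:
  kubectl delete machine <machine-name> -n "${NAMESPACE}"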

Delete a machine from a cluster

  1. Carefully read the machine deletion precautions.

  2. Log in to the Container Cloud web UI with the m:kaas:namespace@operator or m:kaas:namespace@writer permissions.

  3. Switch to the required project using the Switch Project action icon located at the top of the main left-side navigation panel.

  4. In the Clusters tab, click the required cluster name to open the list of machines running on it.

  5. For the Equinix Metal and bare metal providers, ensure that the machine being deleted is not a Ceph Monitor. If it is, migrate the Ceph Monitor to keep the odd number quorum of Ceph Monitors after the machine deletion. For details, see Migrate a Ceph Monitor before machine replacement.

    If you delete a machine on the regional cluster, refer to the known issue 23853 to complete the deletion.

  6. If the machine is assigned to a machine pool, decrease the replicas count of the pool as described in Change replicas count of a machine pool.
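    If you manage machine pools through the API, a patch similar to the following sketch could decrease the replicas count. The MachinePool resource name and the spec.replicas field are assumptions that may differ in your Container Cloud release:

      # Hypothetical API-based alternative; <pool-name>, <project-namespace>,
      # and <new-count> are placeholders
      kubectl patch machinepool <pool-name> -n <project-namespace> \
        --type merge -p '{"spec":{"replicas":<new-count>}}'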

  7. Click the More action icon in the last column of the machine you want to delete and select Delete. Confirm the deletion.

    Deleting a machine automatically frees up the resources allocated to this machine.

  8. Applicable only to managed clusters created before Container Cloud 2.17.0.

    If StackLight in HA mode is enabled and the deleted machine had the StackLight label, perform the following steps:

    1. Connect to the managed cluster as described in steps 5-7 in Connect to a Mirantis Container Cloud cluster.

    2. Identify the pods in the Pending state:

      kubectl get po -n stacklight | grep Pending
      

      Example of system response:

      elasticsearch-master-2          0/1       Pending       0       49s
      patroni-12-0                    0/3       Pending       0       51s
      patroni-13-0                    0/3       Pending       0       48s
      prometheus-alertmanager-1       0/1       Pending       0       47s
      prometheus-server-0             0/2       Pending       0       47s
      
    3. Verify that the reason for the Pending state of the pod is a volume node affinity conflict:

      kubectl describe pod <POD_NAME> -n stacklight
      

      Example of system response:

      Events:
        Type     Reason            Age    From               Message
        ----     ------            ----   ----               -------
        Warning  FailedScheduling  6m53s  default-scheduler  0/6 nodes are available:
                                                             3 node(s) didn't match node selector,
                                                             3 node(s) had volume node affinity conflict.
        Warning  FailedScheduling  6m53s  default-scheduler  0/6 nodes are available:
                                                             3 node(s) didn't match node selector,
                                                             3 node(s) had volume node affinity conflict.
      
    4. Obtain the PVC of one of the pods:

      kubectl get pod <POD_NAME> -n stacklight -o=jsonpath='{range .spec.volumes[*]}{.persistentVolumeClaim}{"\n"}{end}'
      

      Example of system response:

      {"claimName":"elasticsearch-master-elasticsearch-master-2"}
      
    5. Remove the PVC using the obtained name. For example, for elasticsearch-master-elasticsearch-master-2:

      kubectl delete pvc elasticsearch-master-elasticsearch-master-2 -n stacklight
      
    6. Delete the pod:

      kubectl delete po <POD_NAME> -n stacklight
      
    7. Verify that a new pod is created and scheduled on the spare node. This may take some time. For example:

      kubectl get po elasticsearch-master-2 -n stacklight
      NAME                     READY   STATUS   RESTARTS   AGE
      elasticsearch-master-2   1/1     Running  0          7m1s
      
    8. Repeat the steps above for the remaining pods in the Pending state.
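      If many pods are stuck in the Pending state, you can script sub-steps 2-6. The following sketch combines the commands from this procedure; review each pod manually first, since the sketch assumes that every Pending pod in the stacklight namespace is blocked by a volume node affinity conflict:

      # Clean up the PVCs and pods of all Pending StackLight pods
      for pod in $(kubectl get po -n stacklight --field-selector=status.phase=Pending -o name); do
        # Collect the PVC names referenced by the pod
        pvcs=$(kubectl get "${pod}" -n stacklight \
          -o=jsonpath='{range .spec.volumes[*]}{.persistentVolumeClaim.claimName}{"\n"}{end}')
        for pvc in ${pvcs}; do
          # The PVC stays in Terminating until the pod releases it,
          # so do not wait for the deletion to complete
          kubectl delete pvc "${pvc}" -n stacklight --wait=false
        done
        kubectl delete "${pod}" -n stacklight
      done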