Delete a cluster machine using CLI

Available since MOSK 23.3

This section instructs you on how to scale down an existing management or managed cluster through the Container Cloud API. To delete a machine using the Container Cloud web UI, see Delete a cluster machine using web UI.

Using the Container Cloud API, you can delete a cluster machine using the following methods:

- Recommended. Enable the delete field in the providerSpec section of the required Machine object. It allows aborting graceful machine deletion before the node is removed from Docker Swarm.
- Not recommended. Apply the delete request to the Machine object.
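The recommended method comes down to toggling a single boolean field, which the following minimal sketch wraps in a helper. The function name `set_machine_delete` is hypothetical. The sketch also assumes that setting delete back to false is how a graceful deletion is aborted, which this section implies but does not state explicitly.

```shell
#!/usr/bin/env bash
# Sketch: toggle the delete field in providerSpec.value of a Machine object.
# Assumption: setting delete back to false aborts a graceful deletion, and this
# works only while the node has not yet been removed from Docker Swarm.

# Usage: set_machine_delete <projectName> <machineName> <true|false>
set_machine_delete() {
  local project="$1" machine="$2" value="$3"

  # Accept only JSON booleans so that the patch body stays valid.
  case "$value" in
    true|false) ;;
    *) echo "value must be true or false" >&2; return 1 ;;
  esac

  kubectl patch machines.cluster.k8s.io -n "$project" "$machine" \
    --type=merge \
    -p "{\"spec\":{\"providerSpec\":{\"value\":{\"delete\":${value}}}}}"
}
```

For example, `set_machine_delete <projectName> <machineName> true` requests the deletion, and the same call with `false` attempts to abort it.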

You can control machine deletion steps by following a specific machine deletion policy.

Overview of machine deletion policies

The deletion policy of the Machine resource used in the Container Cloud API defines specific steps occurring before a machine deletion.

The Container Cloud API contains the following types of deletion policies: graceful, unsafe, forced. By default, the graceful deletion policy is used.

You can change the deletion policy before the machine deletion. If the deletion process has already started, you can reduce the deletion policy restrictions in the following order only: graceful > unsafe > forced.
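The ordering rule can be checked before a policy change is attempted. The sketch below is illustrative only: the helper names are hypothetical, it assumes that skipping a level (graceful directly to forced) is allowed, and it assumes that the policy is stored in `spec.providerSpec.value.deletionPolicy`, a field path that this section does not confirm.

```shell
#!/usr/bin/env bash
# Sketch: validate a deletion policy change against graceful > unsafe > forced.

# Map a policy name to its position in the order; fail for unknown names.
policy_rank() {
  case "$1" in
    graceful) echo 0 ;;
    unsafe)   echo 1 ;;
    forced)   echo 2 ;;
    *) return 1 ;;
  esac
}

# Succeed only if moving from policy $1 to policy $2 reduces restrictions.
# Assumption: skipping a level (graceful -> forced) is permitted.
can_relax_policy() {
  local from to
  from="$(policy_rank "$1")" || return 2
  to="$(policy_rank "$2")" || return 2
  [ "$to" -gt "$from" ]
}

# Usage: set_deletion_policy <projectName> <machineName> <policy>
# Assumption: the policy lives in spec.providerSpec.value.deletionPolicy.
set_deletion_policy() {
  local project="$1" machine="$2" policy="$3"
  policy_rank "$policy" >/dev/null || {
    echo "unknown policy: $policy" >&2
    return 1
  }
  kubectl patch machines.cluster.k8s.io -n "$project" "$machine" \
    --type=merge \
    -p "{\"spec\":{\"providerSpec\":{\"value\":{\"deletionPolicy\":\"${policy}\"}}}}"
}
```

Once a deletion has started, a caller would run `can_relax_policy <current> <target>` first and patch the object only if the check succeeds.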

Graceful machine deletion

Recommended

During a graceful machine deletion, the provider and LCM controllers perform the following steps:

1. Cordon and drain the node being deleted.
2. Remove the node from Docker Swarm.
3. Send the delete request to the corresponding Machine resource.
4. Remove the provider resources such as the VM instance, network, volume, and so on. Remove the related Kubernetes resources.
5. Remove the finalizer from the Machine resource. This step completes the machine deletion from Kubernetes resources.

Caution

You can abort a graceful machine deletion only before the corresponding node is removed from Docker Swarm.

During a graceful machine deletion, the Machine object status displays prepareDeletionPhase with the following possible values:

started
The provider controller prepares the machine for deletion by cordoning and draining the node, and so on.
completed
The preparation for deletion is complete, and the LCM Controller starts removing the machine resources.
aborting
The provider controller attempts to uncordon the node. If the attempt fails, the status changes to failed.
failed
An error occurred in the deletion workflow.
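The phase values above can be read and interpreted with a small helper, sketched below. The function names are hypothetical, and the JSONPath `.status.providerStatus.prepareDeletionPhase` is an assumption about where the field sits in the Machine status. Verify it against the output of `kubectl get ... -o yaml` on your cluster.

```shell
#!/usr/bin/env bash
# Sketch: read prepareDeletionPhase of a Machine and describe what it means.

# Usage: get_deletion_phase <projectName> <machineName>
# Assumption: the phase is exposed at .status.providerStatus.prepareDeletionPhase.
get_deletion_phase() {
  kubectl get machines.cluster.k8s.io -n "$1" "$2" \
    -o jsonpath='{.status.providerStatus.prepareDeletionPhase}'
}

# Translate a phase value into a short description based on the list above.
explain_deletion_phase() {
  case "$1" in
    started)   echo "preparing: the node is being cordoned and drained" ;;
    completed) echo "prepared: the LCM Controller is removing machine resources" ;;
    aborting)  echo "aborting: the provider controller is uncordoning the node" ;;
    failed)    echo "failed: error in the deletion workflow" ;;
    "")        echo "no deletion phase reported" ;;
    *)         echo "unknown phase: $1" ;;
  esac
}
```

A typical call chains the two helpers: `explain_deletion_phase "$(get_deletion_phase <projectName> <machineName>)"`.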

Unsafe machine deletion

During an unsafe machine deletion, the provider and LCM controllers perform the following steps:

1. Send the delete request to the corresponding Machine resource.
2. Remove the provider resources such as the VM instance, network, volume, and so on. Remove the related Kubernetes resources.
3. Remove the finalizer from the Machine resource. This step completes the machine deletion from Kubernetes resources.

Forced machine deletion

During a forced machine deletion, the provider and LCM controllers perform the following steps:

1. Send the delete request to the corresponding Machine resource.
2. Remove the provider resources such as the VM instance, network, volume, and so on. Remove the related Kubernetes resources.
3. Remove the finalizer from the Machine resource. This step completes the machine deletion from Kubernetes resources.

This policy type allows deleting a Machine resource even if the provider or LCM controller gets stuck at some step. However, if a controller fails, this policy may require a manual cleanup of machine resources. For details, see Delete a machine from a cluster using CLI.

Caution

Consider the following precautions applied to the forced machine deletion policy:

Use the forced machine deletion only if either graceful or unsafe machine deletion fails.
If the forced machine deletion fails at any step, the LCM Controller removes the finalizer anyway.

Before starting the forced machine deletion, back up the related Machine resource:

kubectl get machine -n <projectName> <machineName> -o json > deleted_machine.json
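Because the manual cleanup steps read values from this backup file, it may be worth checking that the backup actually succeeded before starting the deletion. The following is a minimal sketch with a hypothetical helper name. It refuses to overwrite an existing file so that an earlier backup is not lost, and it treats an empty file as a failure.

```shell
#!/usr/bin/env bash
# Sketch: back up a Machine resource and verify that the backup is usable.

# Usage: backup_machine <projectName> <machineName> [outputFile]
backup_machine() {
  local project="$1" machine="$2" out="${3:-deleted_machine.json}"

  # Keep any earlier backup intact.
  if [ -e "$out" ]; then
    echo "refusing to overwrite existing file: $out" >&2
    return 1
  fi

  # Same command as above; remove the partial file if kubectl fails.
  kubectl get machine -n "$project" "$machine" -o json > "$out" || {
    rm -f "$out"
    return 1
  }

  # An empty file cannot be used for the cleanup steps.
  [ -s "$out" ] || {
    echo "backup is empty: $out" >&2
    rm -f "$out"
    return 1
  }
}
```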

Delete a machine from a cluster using CLI

Carefully read the machine deletion precautions.
Log in to the host where your management cluster kubeconfig is located and where kubectl is installed.
For the bare metal provider, ensure that the machine being deleted is not a Ceph Monitor. If it is, migrate the Ceph Monitor to keep the odd number quorum of Ceph Monitors after the machine deletion. For details, see Migrate a Ceph Monitor before machine replacement.
Select from the following options:
- Recommended. In the providerSpec.value section of the Machine object, set delete to true:
```
kubectl patch machines.cluster.k8s.io -n <projectName> <machineName> --type=merge -p '{"spec":{"providerSpec":{"value":{"delete":true}}}}'
```
  Replace the parameters enclosed in angle brackets with the corresponding values.
- Delete the Machine object.
```
kubectl delete machines.cluster.k8s.io -n <projectName> <machineName>
```
After a successful unsafe or graceful machine deletion, the resources allocated to the machine are automatically freed up.
If you applied the forced machine deletion, verify that all machine resources are freed up. Otherwise, manually clean up resources:
1. Delete the Kubernetes Node object related to the deleted Machine object:
  
  Note
  
  Since MOSK 23.1, skip this step as the system performs it automatically.
  1. Log in to the host where your managed cluster kubeconfig is located.
  2. Verify whether the Node object for the deleted Machine object still exists:
    kubectl get node $(jq -r '.status.nodeRef.name' deleted_machine.json)
    If the system response is positive:
    1. Log in to the host where your management cluster kubeconfig is located.
    2. Delete the LcmMachine object with the same name and project name as the deleted Machine object:
      kubectl delete lcmmachines.lcm.mirantis.com -n <projectName> <machineName>
2. Clean up the provider resources:
  1. Log in to the host that contains the management cluster kubeconfig and has jq installed.
  2. If the deleted machine was located on the managed cluster, delete the Ceph node as described in High-level workflow of Ceph OSD or node removal.
  3. Obtain the BareMetalHost object that relates to the deleted machine:
    BMH=$(jq -r '.metadata.annotations."metal3.io/BareMetalHost"| split("/") | .[1]' deleted_machine.json)
  4. Delete the BareMetalHost credentials:
    kubectl delete secret -n <projectName> <machineName>-user-data
  5. Deprovision the related bare metal host object:
    Since the management cluster update to 16.4.0 (MCC 2.29.0)
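    Caution: While the Cluster release of the management cluster is 16.4.0, BareMetalHostInventory operations are allowed to m:kaas@management-admin only. This limitation is lifted once the management cluster is updated to the Cluster release 16.4.1 or later.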
    
    kubectl patch baremetalhost -n <projectName> ${BMH} --type merge --patch '{"spec": {"image": null, "userData": null}}'
    kubectl patch baremetalhostinventories -n <projectName> ${BMH} --type merge --patch '{"spec": {"online":false}}'
    kubectl patch baremetalhost -n <projectName> ${BMH} --type merge --patch '{"spec": {"consumerRef": null}}'
    Before the management cluster update to 16.4.0 (MCC 2.29.0)
    kubectl patch baremetalhost -n <projectName> ${BMH} --type merge --patch '{"spec": {"image": null, "userData": null, "online":false}}'
    kubectl patch baremetalhost -n <projectName> ${BMH} --type merge --patch '{"spec": {"consumerRef": null}}'
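The jq expression in the cleanup steps takes the second "/"-separated element of the `metal3.io/BareMetalHost` annotation. The sketch below shows the same split in plain shell, with a guard against an empty or malformed value, so that the patch commands are never run with an empty `${BMH}`. The helper name is hypothetical, and the `<namespace>/<name>` format of the annotation value is an assumption inferred from that jq expression.

```shell
#!/usr/bin/env bash
# Sketch: extract the BareMetalHost name from an annotation value.
# Assumption: the value has the form <namespace>/<name>.

# Usage: bmh_name_from_annotation <annotationValue>
bmh_name_from_annotation() {
  local ref="$1" rest name

  case "$ref" in
    */*) ;;
    *) echo "unexpected annotation value: '$ref'" >&2; return 1 ;;
  esac

  rest="${ref#*/}"    # drop everything up to and including the first "/"
  name="${rest%%/*}"  # keep the second element only, like jq's .[1]

  # An empty name would make the later patch commands target nothing.
  [ -n "$name" ] || { echo "empty host name in: '$ref'" >&2; return 1; }
  printf '%s\n' "$name"
}
```

With this helper, a value such as `BMH=$(bmh_name_from_annotation "<namespace>/<name>") || exit 1` stops the cleanup early instead of running the patch commands against an empty host name.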
Strongly recommended. Back up MKE as described in Mirantis Kubernetes Engine documentation: Back up MKE.

Since the procedure above modifies the cluster configuration, a fresh backup is required to restore the cluster in case further reconfigurations fail.