Delete a cluster machine using CLI
This section instructs you on how to scale down an existing management or MOSK cluster through the MOSK management API. To delete a machine using the MOSK management console, see Delete a cluster machine using the management console.
Using the MOSK management API, you can delete a cluster machine using the following methods:
- Recommended. Enable the delete field in the providerSpec section of the required Machine object. This method allows aborting a graceful machine deletion before the node is removed from Docker Swarm.
- Not recommended. Apply the delete request to the Machine object.
You can control machine deletion steps by following a specific machine deletion policy.
Overview of machine deletion policies
The deletion policy of the Machine resource used in the MOSK management API defines the specific steps that occur before a machine is deleted.
The MOSK management API supports the following deletion policies: graceful, unsafe, and forced. By default, the graceful deletion policy is used.
You can change the deletion policy before the machine deletion starts. If the deletion process has already started, you can only relax the policy restrictions, in the following order: graceful > unsafe > forced.
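Before starting a deletion, the policy can be switched with the same patch mechanism shown later in this section. A minimal sketch, assuming the policy is exposed as a deletionPolicy field under providerSpec.value and using hypothetical namespace and machine names; verify the field name against your MOSK release documentation:

```shell
# Hypothetical names for illustration only.
PROJECT="managed-ns"
MACHINE="worker-3"
# Assumed field: providerSpec.value.deletionPolicy (graceful | unsafe | forced).
PATCH='{"spec":{"providerSpec":{"value":{"deletionPolicy":"unsafe"}}}}'
# Guarded so the sketch is a no-op on a host without kubectl:
if command -v kubectl >/dev/null; then
  kubectl patch machines.cluster.k8s.io -n "${PROJECT}" "${MACHINE}" \
    --type=merge -p "${PATCH}"
fi
# Inspect the payload locally:
echo "${PATCH}" | jq -r '.spec.providerSpec.value.deletionPolicy'
```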
Graceful machine deletion
Recommended
During a graceful machine deletion, the provider and LCM controllers perform the following steps:
1. Cordon and drain the node being deleted.
2. Remove the node from Docker Swarm.
3. Send the delete request to the corresponding Machine resource.
4. Remove the provider resources such as the VM instance, network, volume, and so on. Remove the related Kubernetes resources.
5. Remove the finalizer from the Machine resource. This step completes the machine deletion from Kubernetes resources.
Caution
You can abort a graceful machine deletion only before the corresponding node is removed from Docker Swarm.
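A hedged sketch of the abort, assuming it is performed by resetting the delete flag to false (mirroring the patch that starts the deletion) and using hypothetical namespace and machine names:

```shell
# Hypothetical names for illustration only.
PROJECT="managed-ns"
MACHINE="worker-3"
# Assumption: resetting delete to false aborts a graceful deletion that has
# not yet removed the node from Docker Swarm.
PATCH='{"spec":{"providerSpec":{"value":{"delete":false}}}}'
# Guarded so the sketch is a no-op on a host without kubectl:
if command -v kubectl >/dev/null; then
  kubectl patch machines.cluster.k8s.io -n "${PROJECT}" "${MACHINE}" \
    --type=merge -p "${PATCH}"
fi
echo "${PATCH}" | jq -r '.spec.providerSpec.value.delete'
```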
During a graceful machine deletion, the Machine object status displays prepareDeletionPhase with the following possible values:

- started - Provider controller prepares a machine for deletion by cordoning and draining the machine, and so on.
- completed - LCM Controller starts removing the machine resources since the preparation for deletion is complete.
- aborting - Provider controller attempts to uncordon the node. If the attempt fails, the status changes to failed.
- failed - An error occurred in the deletion workflow.
Unsafe machine deletion
During an unsafe machine deletion, the provider and LCM controllers perform the following steps:
1. Send the delete request to the corresponding Machine resource.
2. Remove the provider resources such as the VM instance, network, volume, and so on. Remove the related Kubernetes resources.
3. Remove the finalizer from the Machine resource. This step completes the machine deletion from Kubernetes resources.
Forced machine deletion
During a forced machine deletion, the provider and LCM controllers perform the following steps:
1. Send the delete request to the corresponding Machine resource.
2. Remove the provider resources such as the VM instance, network, volume, and so on. Remove the related Kubernetes resources.
3. Remove the finalizer from the Machine resource. This step completes the machine deletion from Kubernetes resources.
This policy type allows deleting a Machine resource even if the provider or
LCM controller gets stuck at some step. But this policy may require a manual
cleanup of machine resources in case of a controller failure. For details, see
Delete a machine from a cluster using CLI.
Caution
Consider the following precautions applied to the forced machine deletion policy:
Use the forced machine deletion only if either graceful or unsafe machine deletion fails.
If the forced machine deletion fails at any step, the LCM Controller removes the finalizer anyway.
Before starting the forced machine deletion, back up the related Machine resource:

kubectl get machine -n <projectName> <machineName> -o json > deleted_machine.json
Delete a machine from a cluster using CLI
Carefully read the machine deletion precautions.
Log in to the host where your management cluster kubeconfig is located and where kubectl is installed.

For the bare-metal provider, ensure that the machine being deleted is not a Ceph Monitor. If it is, migrate the Ceph Monitor to keep the odd number quorum of Ceph Monitors after the machine deletion. For details, see Move Ceph Monitor before node replacement.
Select from the following options:
Recommended. In the providerSpec.value section of the Machine object, set delete to true:

kubectl patch machines.cluster.k8s.io -n <projectName> <machineName> --type=merge -p '{"spec":{"providerSpec":{"value":{"delete":true}}}}'

Replace the parameters enclosed in angle brackets with the corresponding values.
Not recommended. Delete the Machine object:

kubectl delete machines.cluster.k8s.io -n <projectName> <machineName>
After a successful unsafe or graceful machine deletion, the resources allocated to the machine are automatically freed up.

If you applied the forced machine deletion, verify that all machine resources are freed up. Otherwise, manually clean up the resources:

Log in to the host where the management cluster kubeconfig is located and jq is installed.

If the deleted machine was located on the MOSK cluster, delete the Ceph node as described in Remove a Ceph node.
Obtain the BareMetalHost object that relates to the deleted machine:

BMH=$(jq -r '.metadata.annotations."metal3.io/BareMetalHost" | split("/") | .[1]' deleted_machine.json)
Delete the BareMetalHost credentials:

kubectl delete secret -n <projectName> <machineName>-user-data
Deprovision the related bare-metal host object:

kubectl patch baremetalhost -n <projectName> ${BMH} --type merge --patch '{"spec": {"image": null, "userData": null}}'
kubectl patch baremetalhostinventories -n <projectName> ${BMH} --type merge --patch '{"spec": {"online":false}}'
kubectl patch baremetalhost -n <projectName> ${BMH} --type merge --patch '{"spec": {"consumerRef": null}}'
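The jq extraction in the steps above assumes the metal3.io/BareMetalHost annotation has the <namespace>/<name> form. A quick offline check with a mock backup file and hypothetical names:

```shell
# Mock of the backup produced by the precaution step earlier in this section;
# only the annotation consulted by jq is included. Names are hypothetical.
cat > deleted_machine.json <<'EOF'
{"metadata": {"annotations": {"metal3.io/BareMetalHost": "managed-ns/worker-3"}}}
EOF
# Same extraction as in the procedure: keep the name part after the slash.
BMH=$(jq -r '.metadata.annotations."metal3.io/BareMetalHost" | split("/") | .[1]' deleted_machine.json)
echo "${BMH}"   # worker-3
```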
Strongly recommended. Back up MKE as described in Create backups of Mirantis Kubernetes Engine.
Since the procedure above modifies the cluster configuration, a fresh backup is required to restore the cluster in case further reconfigurations fail.
Important
Because the MKE restoration process is complicated, we strongly recommend contacting Mirantis support for assistance.
If you still decide to restore MKE from a backup on your own, you must scale down helm-controller on the cluster being restored if the MKE version of the affected cluster after the restore differs from the MKE version in the ClusterRelease object that is set in the MOSK Cluster objects in the management cluster:

- If you are restoring MKE on a management cluster, scale down helm-controller on each affected MOSK cluster before starting the restore. This prevents unintended Ceph and OpenStack downgrades on MOSK clusters after the management cluster is restored.
- If you are restoring MKE on a MOSK cluster, scale down helm-controller immediately after the restore completes. Because the restore rolls the cluster back to an older release, this prevents it from triggering a premature upgrade of Helm releases.
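The scale-down itself can be sketched with kubectl. The helm-controller deployment name comes from the text above; the kube-system namespace is an assumption to verify for your deployment:

```shell
# Hypothetical sketch: the kube-system namespace is an assumption, not
# confirmed by this document. Guarded so it is a no-op without kubectl.
if command -v kubectl >/dev/null; then
  # Scale down before (management cluster) or right after (MOSK cluster)
  # the restore, as described above; scale back up once versions align.
  kubectl -n kube-system scale deployment helm-controller --replicas=0
fi
```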