Delete a cluster machine using CLI¶
Available since 2.21.0 for non-MOSK clusters as TechPreview
This section instructs you on how to scale down an existing management, regional, or managed cluster through the Container Cloud API. To delete a machine using the Container Cloud web UI, see Delete a cluster machine using web UI.
Using the Container Cloud API, you can delete a cluster machine using the following methods:
Recommended. Enable the delete field in the providerSpec section of the required Machine object. It allows aborting a graceful machine deletion before the node is removed from Docker Swarm.
Not recommended. Apply the delete request to the Machine object.
You can control machine deletion steps by following a specific machine deletion policy.
Overview of machine deletion policies¶
The deletion policy of the Machine resource used in the Container Cloud API defines the specific steps that occur before a machine is deleted.
The Container Cloud API supports the following deletion policy types: graceful, unsafe, and forced.
By default, the unsafe deletion policy is used. In future Container Cloud releases, the default policy will be changed to the graceful one.
You can change the deletion policy before the machine deletion. If the deletion process has already started, you can reduce the deletion policy restrictions in the following order only: graceful > unsafe > forced.
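For example, to switch a machine to the graceful policy before deletion, you can patch the Machine object. This is a minimal sketch, assuming the policy is set through the deletionPolicy field in the providerSpec.value section of the Machine object:
# Assumes the deletionPolicy field in providerSpec.value; verify the field name in your Machine resource
kubectl patch machines.cluster.k8s.io -n <projectName> <machineName> --type=merge -p '{"spec":{"providerSpec":{"value":{"deletionPolicy":"graceful"}}}}'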
Graceful machine deletion¶
During a graceful machine deletion, the cloud provider and LCM controllers perform the following steps:
Cordon and drain the node being deleted.
Remove the node from Docker Swarm.
Send the delete request to the corresponding Machine resource.
Remove the provider resources such as the VM instance, network, volume, and so on. Remove the related Kubernetes resources.
Remove the finalizer from the Machine resource. This step completes the machine deletion from Kubernetes resources.
Caution
You can abort a graceful machine deletion only before the corresponding node is removed from Docker Swarm.
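One possible way to abort a graceful deletion before the node is removed from Docker Swarm, assuming the deletion was triggered through the delete field in the providerSpec section, is to revert that field to false:
# Assumes the same providerSpec delete field is used to abort; applicable only before the node leaves Docker Swarm
kubectl patch machines.cluster.k8s.io -n <projectName> <machineName> --type=merge -p '{"spec":{"providerSpec":{"value":{"delete":false}}}}'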
During a graceful machine deletion, the Machine object status displays prepareDeletionPhase with the following possible values:
started
Cloud provider controller prepares a machine for deletion by cordoning, draining the machine, and so on.
completed
LCM Controller starts removing the machine resources since the preparation for deletion is complete.
aborting
Cloud provider controller attempts to uncordon the node. If the attempt fails, the status changes to failed.
failed
Error in the deletion workflow.
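To check the current phase, you can, for example, query the Machine object and filter its status output for the field:
kubectl get machines.cluster.k8s.io -n <projectName> <machineName> -o yaml | grep prepareDeletionPhase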
Unsafe machine deletion¶
During an unsafe machine deletion, the cloud provider and LCM controllers perform the following steps:
Send the delete request to the corresponding Machine resource.
Remove the provider resources such as the VM instance, network, volume, and so on. Remove the related Kubernetes resources.
Remove the finalizer from the Machine resource. This step completes the machine deletion from Kubernetes resources.
Forced machine deletion¶
During a forced machine deletion, the cloud provider and LCM controllers perform the following steps:
Send the delete request to the corresponding Machine resource.
Remove the provider resources such as the VM instance, network, volume, and so on. Remove the related Kubernetes resources.
Remove the finalizer from the Machine resource. This step completes the machine deletion from Kubernetes resources.
This policy type allows deleting a Machine resource even if the cloud provider or LCM controller gets stuck at some step. However, this policy may require a manual cleanup of machine resources in case of a controller failure. For details, see Delete a machine from a cluster using CLI.
Caution
Consider the following precautions applied to the forced machine deletion policy:
Use the forced machine deletion only if either graceful or unsafe machine deletion fails.
If the forced machine deletion fails at any step, the LCM Controller removes the finalizer anyway.
Before starting the forced machine deletion, back up the related Machine resource:
kubectl get machine -n <projectName> <machineName> -o json > deleted_machine.json
Delete a machine from a cluster using CLI¶
Carefully read the machine deletion precautions.
Log in to the Container Cloud web UI with the m:kaas:namespace@operator or m:kaas:namespace@writer permissions.
Log in to the host where your management cluster kubeconfig is located and where kubectl is installed.
For the bare metal provider, ensure that the machine being deleted is not a Ceph Monitor. If it is, migrate the Ceph Monitor to keep the odd number quorum of Ceph Monitors after the machine deletion. For details, see Migrate a Ceph Monitor before machine replacement.
If the machine is assigned to a machine pool, decrease the replicas count of the pool as described in Change replicas count of a machine pool.
Select from the following options:
Recommended. In the providerSpec.value section of the Machine object, set delete to true:
kubectl patch machines.cluster.k8s.io -n <projectName> <machineName> --type=merge -p '{"spec":{"providerSpec":{"value":{"delete":true}}}}'
Replace the parameters enclosed in angle brackets with the corresponding values.
Not recommended. Delete the Machine object:
kubectl delete machines.cluster.k8s.io -n <projectName> <machineName>
After a successful unsafe or graceful machine deletion, the resources allocated to the machine are automatically freed up.
If you applied the forced machine deletion, verify that all machine resources are freed up. Otherwise, manually clean up the resources:
Delete the Kubernetes Node object related to the deleted Machine object:
Note
Since Container Cloud 2.23.0, skip this step as the system performs it automatically.
Log in to the host where your managed cluster kubeconfig is located.
Verify whether the Node object for the deleted Machine object still exists:
kubectl get node $(jq -r '.status.nodeRef.name' deleted_machine.json)
If the system response is positive:
Log in to the host where your management cluster kubeconfig is located.
Delete the LcmMachine object with the same name and project name as the deleted Machine object:
kubectl delete lcmmachines.lcm.mirantis.com -n <projectName> <machineName>
Clean up the provider-specific resources. Select from the following options:
Bare metal
Log in to the host that contains the following configuration:
Management cluster kubeconfig
jq installed
If the deleted machine was located on a managed cluster, delete the Ceph node as described in High-level workflow of Ceph OSD or node removal.
Obtain the name of the BareMetalHost object that relates to the deleted machine:
BMH=$(jq -r '.metadata.annotations."metal3.io/BareMetalHost" | split("/") | .[1]' deleted_machine.json)
Delete the BareMetalHost credentials:
kubectl delete secret -n <projectName> <machineName>-user-data
Deprovision the related BareMetalHost object:
kubectl patch baremetalhost -n <projectName> ${BMH} --type merge --patch '{"spec": {"image": null, "userData": null, "online": false}}'
kubectl patch baremetalhost -n <projectName> ${BMH} --type merge --patch '{"spec": {"consumerRef": null}}'
OpenStack
Log in to the host that contains the following configuration:
Management cluster
kubeconfig
OpenStack credentials configured
Required tools: kubectl, jq, openstack-cli
Obtain the instance ID of the deleted machine:
SERVER_ID=$(jq -r ".status.providerStatus.providerInstanceState.id" deleted_machine.json)
Verify whether the OpenStack server still exists:
openstack server show ${SERVER_ID}
If the system response is positive, delete the OpenStack server:
openstack server delete ${SERVER_ID}
Delete the floating IP on the related managed cluster:
PORT=$(openstack port list --device-id <serverID> -c ID -f value)
FLOATING=$(openstack floating ip list --port ${PORT} -c ID -f value)
openstack floating ip delete ${FLOATING}
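Optionally, verify that the server and the floating IP no longer exist; both commands are expected to report that the corresponding resource is not found:
openstack server show ${SERVER_ID}
openstack floating ip show ${FLOATING}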
vSphere
Log in to the host that contains the following configuration:
Management cluster
kubeconfig
vSphere credentials configured
Required tools: kubectl, jq, govc
Obtain the VM UUID that relates to the deleted machine:
VM_UUID=$(jq -r ".status.providerStatus.providerInstanceState.id" deleted_machine.json)
Verify whether the VM still exists:
govc vm.info -vm.uuid ${VM_UUID}
If the system response is positive, delete the VM:
govc vm.destroy -vm.uuid ${VM_UUID}
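Optionally, verify that the VM no longer exists; the following command is expected to return no VM information:
govc vm.info -vm.uuid ${VM_UUID}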