Remove Ceph OSD manually
Warning
This procedure is valid for MOSK clusters that use the deprecated KaaSCephCluster custom resource (CR) instead of the MiraCeph CR that is available since MOSK 25.2 as the new Ceph configuration entry point. For the equivalent procedure with the MiraCeph CR, refer to the corresponding section.
You may need to manually remove a Ceph OSD, for example, in the following cases:
- If you have removed a device or node from the KaaSCephCluster spec.cephClusterSpec.nodes or spec.cephClusterSpec.nodeGroups section with manageOsds set to false.
- If you do not want to rely on Ceph LCM operations and want to manage the life cycle of Ceph OSDs manually.
To safely remove one or multiple Ceph OSDs from a Ceph cluster, perform the following procedure for each Ceph OSD one by one.
Warning
The procedure presupposes cleanup of the Ceph OSD disk or logical volume partitions.
To remove a Ceph OSD manually:
Edit the KaaSCephCluster resource on the management cluster:

kubectl --kubeconfig <mgmtKubeconfig> -n <moskClusterProjectName> edit kaascephcluster

Substitute <mgmtKubeconfig> with the management cluster kubeconfig and <moskClusterProjectName> with the project name of the MOSK cluster.

In the spec.cephClusterSpec.nodes section, remove the required storageDevices item from the corresponding node spec or exclude the required item from storageDeviceFilter. If storageDevices becomes empty after the removal and the node spec has no roles specified, also remove the node spec.
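For illustration only, the following sketch shows a nodes section with the storageDevices item to delete; the node and device names are hypothetical and must match your own KaaSCephCluster spec:

spec:
  cephClusterSpec:
    nodes:
      storage-worker-1:        # hypothetical node name
        storageDevices:
        - name: sdb            # this device stays in the cluster
          config:
            deviceClass: hdd
        - name: sdc            # delete this item to remove its Ceph OSD
          config:
            deviceClass: hdd

If sdc were the only remaining device and the node spec had no roles, you would remove the whole storage-worker-1 entry instead.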
Obtain kubeconfig of the MOSK cluster and provide it as an environment variable:

export KUBECONFIG=<pathToMoskKubeconfig>
Verify that all Ceph OSDs are up and in, the Ceph cluster is healthy, and no rebalance or recovery is in progress:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l \
  app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s

Example of system response:

cluster:
  id:     8cff5307-e15e-4f3d-96d5-39d3b90423e4
  health: HEALTH_OK
...
osd: 4 osds: 4 up (since 10h), 4 in (since 10h)
Stop the rook-ceph/rook-ceph-operator deployment to avoid premature reorchestration of the Ceph cluster:

kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 0
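Optionally, you can confirm that the operator has actually scaled down before proceeding, for example:

kubectl -n rook-ceph get deploy rook-ceph-operator

The READY column should report 0/0.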
Enter the ceph-tools pod:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l \
  app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') bash
Mark the required Ceph OSD as out:

ceph osd out osd.<ID>

Note

In the command above and in the steps below, substitute <ID> with the number of the Ceph OSD to remove.
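If you are unsure which ID belongs to the removed device, you can, for example, map hosts and devices to OSD IDs from the ceph-tools pod; the grep pattern below is only illustrative:

ceph osd tree
ceph osd metadata | grep -E '"id"|"hostname"|"devices"'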
Wait until data backfilling to other OSDs is complete:

ceph -s

Once all of the PGs are active+clean, backfilling is complete and it is safe to remove the disk.

Note

For additional information on PGs backfilling, run ceph pg dump_stuck.
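As an additional quick check, the PG summary should eventually report only active+clean states; the counts below are hypothetical:

ceph pg stat

Example of system response:

96 pgs: 96 active+clean; 12 GiB data, 36 GiB used, 2.1 TiB / 2.2 TiB avail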
Exit from the ceph-tools pod:

exit

Scale the rook-ceph/rook-ceph-osd-<ID> deployment to 0 replicas:

kubectl -n rook-ceph scale deploy rook-ceph-osd-<ID> --replicas 0
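To confirm that the corresponding OSD pod has terminated, you can list the remaining OSD pods, for example:

kubectl -n rook-ceph get pod -l app=rook-ceph-osd

The pod of the scaled-down rook-ceph-osd-<ID> deployment should no longer be listed.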
Enter the ceph-tools pod:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l \
  app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') bash

Verify that the number of Ceph OSDs that are up and in has decreased by one daemon:

ceph -s

Example of system response:

osd: 4 osds: 3 up (since 1h), 3 in (since 5s)
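At this point, the stopped OSD is still present in the CRUSH map; for illustration, you can check that it now appears as down:

ceph osd tree | grep "osd.<ID>"

The status column for osd.<ID> should show down with a reweight of 0.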
Remove the Ceph OSD from the Ceph cluster:
ceph osd purge <ID> --yes-i-really-mean-it
Delete the Ceph OSD auth entry, if present. Otherwise, skip this step.

ceph auth del osd.<ID>
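To check whether the auth entry still exists before deleting it, you can query it explicitly; the command fails with an ENOENT error if the entry is already gone:

ceph auth get osd.<ID>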
If you have removed the last Ceph OSD on the node and want to remove this node from the Ceph cluster, remove the CRUSH map entry:
ceph osd crush remove <nodeName>
Substitute <nodeName> with the name of the node where the removed Ceph OSD was placed.

Verify that the removed Ceph OSD, and the node failure domain if you removed it, is no longer present in the CRUSH map:
ceph osd tree
If you have removed the node, it will be removed from the CRUSH map.
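For illustration only, after removing osd.3 and its node from a hypothetical four-OSD cluster, the output could look similar to the following:

ID  CLASS  WEIGHT   TYPE NAME           STATUS  REWEIGHT  PRI-AFF
-1         2.72910  root default
-3         0.90970      host storage-0
 0    hdd  0.90970          osd.0           up   1.00000  1.00000
-5         0.90970      host storage-1
 1    hdd  0.90970          osd.1           up   1.00000  1.00000
-7         0.90970      host storage-2
 2    hdd  0.90970          osd.2           up   1.00000  1.00000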
Exit from the ceph-tools pod:

exit

Clean up the disk used by the removed Ceph OSD. For details, see the official Rook documentation.
Warning
If you are using multiple Ceph OSDs per device or metadata device, make sure that you can clean up the entire disk. Otherwise, clean up only the logical volume partitions of the volume group instead, by running lvremove <lvpartition_uuid> from any Ceph OSD pod that belongs to the same host as the removed Ceph OSD.
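The following is a minimal sketch of the cleanup described in the Rook documentation, assuming /dev/sdX is the disk of the removed Ceph OSD and that no other Ceph OSD uses it; adjust the device path and run the commands on the corresponding host:

sgdisk --zap-all /dev/sdX
dd if=/dev/zero of=/dev/sdX bs=1M count=100 oflag=direct,dsync

If the device is shared with other Ceph OSDs, remove only the logical volume of the removed OSD with lvremove, as described in the warning above, instead of wiping the whole disk.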
Delete the rook-ceph/rook-ceph-osd-<ID> deployment previously scaled to 0 replicas:

kubectl -n rook-ceph delete deploy rook-ceph-osd-<ID>
Substitute <ID> with the number of the removed Ceph OSD.

Scale the rook-ceph/rook-ceph-operator deployment to 1 replica and wait for the orchestration to complete:

kubectl -n rook-ceph scale deploy rook-ceph-operator --replicas 1
kubectl -n rook-ceph get pod -w
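After the operator pods are up again, you can verify, for example, that the Ceph cluster has returned to a healthy state with one OSD less:

kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l \
  app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') -- ceph -s

The health field should report HEALTH_OK, and the osd line should no longer count the removed daemon.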
Once done, Ceph OSD removal is complete.