After a physical disk replacement, you can use the Ceph LCM API to redeploy
a failed Ceph OSD. The common flow of replacing a failed Ceph OSD is as
follows:
1. Remove the obsolete Ceph OSD from the Ceph cluster by device name, by Ceph
   OSD ID, or by path.
2. Add a new Ceph OSD on the new disk to the Ceph cluster.
Remove a failed Ceph OSD by device name, path, or ID
Warning
The procedure below presupposes that the Operator knows the exact
device name, by-path, or by-id of the replaced device, as well as the
node on which the replacement occurred.
Warning
Since Container Cloud 2.23.0 and 2.23.1 for MOSK
23.1, a Ceph OSD removal using by-path, by-id, or device name is
not supported if a device was physically removed from a node. Therefore, use
cleanupByOsdId instead. For details, see
Remove a failed Ceph OSD by Ceph OSD ID.
Warning
Since Container Cloud 2.25.0, Mirantis does not recommend
setting device name or device by-path symlink in the
cleanupByDevice field as these identifiers are not persistent and
can change at node boot. Remove Ceph OSDs with by-id symlinks
specified in the path field or use cleanupByOsdId instead.
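The distinction in the warnings above can be checked mechanically before a request is written. The following is a minimal sketch, not part of the Ceph LCM API: the helper names `classify_device_identifier` and `is_persistent` are hypothetical, and the sample device references are illustrative only. It assumes that by-id and by-path symlinks contain a `by-id` or `by-path` path component, and treats everything else as a plain device name.

```python
from pathlib import PurePosixPath


def classify_device_identifier(identifier: str) -> str:
    """Return 'by-id', 'by-path', or 'name' for a device reference.

    Hypothetical helper: it only inspects the path components and does
    not touch the file system.
    """
    parts = PurePosixPath(identifier).parts
    if "by-id" in parts:
        return "by-id"
    if "by-path" in parts:
        return "by-path"
    return "name"


def is_persistent(identifier: str) -> bool:
    """Only by-id symlinks are treated as stable across node reboots."""
    return classify_device_identifier(identifier) == "by-id"


if __name__ == "__main__":
    # Illustrative device references, not taken from a real node.
    for ref in (
        "sdb",
        "/dev/disk/by-path/pci-0000:00:1f.2-ata-2",
        "/dev/disk/by-id/scsi-EXAMPLE",
    ):
        print(ref, classify_device_identifier(ref), is_persistent(ref))
```

A check like this could gate which field is used: a by-id reference goes to the path field, while a device name or by-path reference suggests falling back to cleanupByOsdId.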
Open the KaaSCephCluster resource in the <managedClusterProjectName>
namespace for editing. Substitute <managedClusterProjectName> with the
corresponding value.
In the nodes section, remove the required device:
    spec:
      cephClusterSpec:
        nodes:
          <machineName>:
            storageDevices:
            - name: <deviceName>  # remove the entire item from storageDevices list
              # fullPath: <deviceByPath> if device is specified with symlink instead of name
              config:
                deviceClass: hdd
Substitute <machineName> with the machine name of the node where the
device <deviceName> or <deviceByPath> is going to be replaced.
Save KaaSCephCluster and close the editor.
Create a KaaSCephOperationRequest CR template and save it as
replace-failed-osd-<machineName>-<deviceName>-request.yaml:
    apiVersion: kaas.mirantis.com/v1alpha1
    kind: KaaSCephOperationRequest
    metadata:
      name: replace-failed-osd-<machineName>-<deviceName>
      namespace: <managedClusterProjectName>
    spec:
      osdRemove:
        nodes:
          <machineName>:
            cleanupByDevice:
            - name: <deviceName>
              # If a device is specified with by-path or by-id (since Container
              # Cloud 2.19.0 and 2.20.1 for MOSK 22.4) instead of
              # name, path: <deviceByPath> or <deviceById>.
      kaasCephCluster:
        name: <kaasCephClusterName>
        namespace: <managedClusterProjectName>
Substitute <kaasCephClusterName> with the corresponding
KaaSCephCluster resource from the <managedClusterProjectName>
namespace.
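When the same request has to be prepared for several nodes, the template can be filled in programmatically. The sketch below is one possible way to do that with plain string formatting. The function names and the sample values (`worker-1`, `sdb`, `managed-ns`, `ceph-cluster`) are hypothetical. The manifest layout follows the template above, with the `path` key used instead of `name` when a by-path or by-id symlink is supplied.

```python
from typing import Optional

# Manifest layout copied from the request template in this procedure.
TEMPLATE = """\
apiVersion: kaas.mirantis.com/v1alpha1
kind: KaaSCephOperationRequest
metadata:
  name: replace-failed-osd-{machine}-{device_name}
  namespace: {project}
spec:
  osdRemove:
    nodes:
      {machine}:
        cleanupByDevice:
        - {device_key}: {device_value}
  kaasCephCluster:
    name: {cluster}
    namespace: {project}
"""


def render_request(machine: str, device_name: str, project: str,
                   cluster: str, device_path: Optional[str] = None) -> str:
    """Fill in the template; use 'path' when a symlink is given."""
    if device_path:
        key, value = "path", device_path
    else:
        key, value = "name", device_name
    return TEMPLATE.format(machine=machine, device_name=device_name,
                           project=project, cluster=cluster,
                           device_key=key, device_value=value)


def request_filename(machine: str, device_name: str) -> str:
    """File name convention used in this procedure."""
    return f"replace-failed-osd-{machine}-{device_name}-request.yaml"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(request_filename("worker-1", "sdb"))
    print(render_request("worker-1", "sdb", "managed-ns", "ceph-cluster"))
```

The short device name is kept in the resource and file names even when a symlink is passed, because a full path contains characters that are not valid in a Kubernetes object name.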
Open the KaaSCephCluster resource in the <managedClusterProjectName>
namespace for editing. Substitute <managedClusterProjectName> with the
corresponding value.
In the nodes section, add a new device:
    spec:
      cephClusterSpec:
        nodes:
          <machineName>:
            storageDevices:
            - fullPath: <deviceByID> # Since Container Cloud 2.25.0 if device is supposed to be added with by-id
              # name: <deviceByID> # Prior Container Cloud 2.25.0 if device is supposed to be added with by-id
              # fullPath: <deviceByPath> # if device is supposed to be added with by-path
              config:
                deviceClass: hdd
Substitute <machineName> with the machine name of the node where the device
<deviceByID> or <deviceByPath> is going to be added as a Ceph OSD.
Verify that the new Ceph OSD has appeared in the Ceph cluster and is in
and up. The fullClusterInfo section should not contain any issues.
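As a complementary check, the in and up state can also be read from the Ceph CLI. The sketch below assumes that the Operator can run `ceph osd dump --format json` against the cluster and knows the ID of the new Ceph OSD. Neither assumption comes from this procedure. The function name is hypothetical and the sample data is made up. The code relies only on the `osds` list in that output, where each entry carries the integer fields `osd`, `up`, and `in`.

```python
import json


def osd_is_up_and_in(osd_dump_json: str, osd_id: int) -> bool:
    """Return True if the given OSD is reported as both up and in.

    Expects the JSON text produced by `ceph osd dump --format json`.
    An OSD that is missing from the dump is reported as False.
    """
    dump = json.loads(osd_dump_json)
    for osd in dump.get("osds", []):
        if osd.get("osd") == osd_id:
            return osd.get("up") == 1 and osd.get("in") == 1
    return False


# Made-up sample that mimics the relevant part of the dump.
SAMPLE_DUMP = json.dumps({
    "osds": [
        {"osd": 0, "up": 1, "in": 1},
        {"osd": 1, "up": 0, "in": 1},
    ]
})

if __name__ == "__main__":
    for osd_id in (0, 1, 2):
        print(osd_id, osd_is_up_and_in(SAMPLE_DUMP, osd_id))
```

This does not replace the review of the fullClusterInfo section. It only confirms the state of a single Ceph OSD.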