Replace a failed Ceph OSD disk with a metadata device as a device name¶
Apply the following procedure if a Ceph OSD failed because of a data disk outage
and the metadata partition is not specified in the BareMetalHostProfile
custom resource (CR). In this scenario, the Ceph cluster
automatically creates the required metadata logical volume on the desired device.
Remove a Ceph OSD with a metadata device as a device name¶
To remove the affected Ceph OSD with a metadata device as a device name, follow the Remove a failed Ceph OSD by ID with a defined metadata device procedure and capture the following details:
While editing KaasCephCluster in the nodes section, capture the metadataDevice path to reuse it during re-creation of the Ceph OSD.

Example of the spec.nodes section:

```yaml
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - name: <deviceName> # remove the entire item from the storageDevices list
          # fullPath: <deviceByPath> if the device is specified using by-path instead of name
          config:
            deviceClass: hdd
            metadataDevice: /dev/nvme0n1
```

In the example above, save the metadataDevice device name /dev/nvme0n1
.

During verification of removeInfo, capture the usedPartition value of the metadata device located in the deviceMapping.<metadataDevice> section.

Example of the removeInfo section:

```yaml
removeInfo:
  cleanUpMap:
    <nodeName>:
      osdMapping:
        "<osdID>":
          deviceMapping:
            <dataDevice>:
              deviceClass: hdd
              devicePath: <dataDeviceByPath>
              devicePurpose: block
              usedPartition: /dev/ceph-d2d3a759-2c22-4304-b890-a2d87e056bd4/osd-block-ef516477-d2da-492f-8169-a3ebfc3417e2
              zapDisk: true
            <metadataDevice>:
              deviceClass: hdd
              devicePath: <metadataDeviceByPath>
              devicePurpose: db
              usedPartition: /dev/ceph-b0c70c72-8570-4c9d-93e9-51c3ab4dd9f9/osd-db-ecf64b20-1e07-42ac-a8ee-32ba3c0b7e2f
          uuid: ef516477-d2da-492f-8169-a3ebfc3417e2
```

In the example above, capture the following values from the <metadataDevice> section:

- ceph-b0c70c72-8570-4c9d-93e9-51c3ab4dd9f9 - the name of the volume group that contains all metadata partitions on the <metadataDevice> disk
- osd-db-ecf64b20-1e07-42ac-a8ee-32ba3c0b7e2f - the name of the logical volume that relates to the failed Ceph OSD
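When scripting the replacement, the volume group and logical volume names can be extracted from the captured usedPartition path with plain POSIX shell parameter expansion. A minimal sketch, using the example value from the removeInfo output above:

```shell
# Example usedPartition value captured from the removeInfo section
USED_PARTITION="/dev/ceph-b0c70c72-8570-4c9d-93e9-51c3ab4dd9f9/osd-db-ecf64b20-1e07-42ac-a8ee-32ba3c0b7e2f"

# Drop the /dev/ prefix, then keep everything before the next slash: the volume group
VG_NAME="${USED_PARTITION#/dev/}"
VG_NAME="${VG_NAME%%/*}"

# Keep everything after the last slash: the logical volume
LV_NAME="${USED_PARTITION##*/}"

echo "volume group:   ${VG_NAME}"
echo "logical volume: ${LV_NAME}"
```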
Re-create the metadata partition on the existing metadata disk¶
After you remove the Ceph OSD disk, manually create a separate logical volume for the metadata partition in the existing volume group on the metadata device:

```bash
lvcreate -l 100%FREE -n meta_1 <vgName>
```

Substitute <vgName> with the name of the volume group captured in the usedPartition parameter.
Note

If you removed more than one Ceph OSD, replace 100%FREE with the corresponding partition size. For example:

```bash
lvcreate -L <partitionSize> -n meta_1 <vgName>
```

Substitute <partitionSize> with the value that matches the size of the other partitions placed on the affected metadata drive, for example, 16G. To obtain <partitionSize>, use the output of the lvs command. Note that the uppercase -L option accepts a size such as 16G, while the lowercase -l option accepts only extents or a percentage such as 100%FREE.
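To pick a matching partition size before running lvcreate, you can inspect the affected volume group with the standard LVM reporting tools. A hedged sketch; the commands must run on the node that owns the metadata disk, and <vgName> is the volume group captured earlier:

```shell
# List the existing metadata logical volumes and their sizes in the volume group
lvs --units g -o lv_name,lv_size <vgName>

# Check how much free space remains in the volume group
vgs --units g -o vg_name,vg_size,vg_free <vgName>
```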
During execution of the lvcreate command, the system asks you whether to wipe the ceph_bluestore signature found on the metadata device. For example:

```
WARNING: ceph_bluestore signature detected on /dev/ceph-b0c70c72-8570-4c9d-93e9-51c3ab4dd9f9/meta_1 at offset 0. Wipe it? [y/n]:
```

In the interactive shell, answer n to keep all existing metadata partitions intact. After you answer n, the system outputs the following:

```
Aborted wiping of ceph_bluestore.
1 existing signature left on the device.
Logical volume "meta_1" created.
```
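To double-check the result on the node, you can list the volume group contents; the new meta_1 volume should appear alongside the remaining osd-db-* volumes (a hedged sketch, assuming <vgName> is the volume group used above):

```shell
# Verify that the re-created metadata partition exists in the volume group
lvs <vgName>
```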
Re-create the Ceph OSD with the re-created metadata partition¶
Open the KaasCephCluster CR for editing:

```bash
kubectl edit kaascephcluster -n <managedClusterProjectName>
```

Substitute <managedClusterProjectName> with the corresponding value.

In the nodes section, add the replaced device with the same metadataDevice path as in the previous Ceph OSD:

```yaml
spec:
  cephClusterSpec:
    nodes:
      <machineName>:
        storageDevices:
        - name: <deviceByID> # Recommended. Add a new device by ID, for example, /dev/disk/by-id/...
          # fullPath: <deviceByPath> # Add a new device by path, for example, /dev/disk/by-path/...
          config:
            deviceClass: hdd
            metadataDevice: /dev/<vgName>/meta_1
```
Substitute <machineName> with the machine name of the node where the new device <deviceByID> or <deviceByPath> must be added. Also, specify metadataDevice with the path to the logical volume created during the Re-create the metadata partition on the existing metadata disk procedure.

Wait for the replaced disk to apply to the Ceph cluster as a new Ceph OSD.
You can monitor the application state using either the status section of the KaaSCephCluster CR or the rook-ceph-tools Pod:

```bash
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph -s
```
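In addition to ceph -s, the following standard Ceph CLI calls, run from the same rook-ceph-tools Pod, can help confirm that the replacement OSD joined the cluster (shown as a hedged sketch):

```shell
# Confirm the new OSD is present, up, and in
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd tree

# Review per-OSD capacity and utilization, including the new OSD
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd df
```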