# Replace a failed Ceph node
After a physical node replacement, you can use the Pelagia LCM API to redeploy failed Ceph nodes. The common flow of replacing a failed Ceph node is as follows:
1. Remove the obsolete Ceph node from the Ceph cluster.
2. Add a new Ceph node with the same configuration to the Ceph cluster.
Ceph OSD removal requires a `CephOsdRemoveTask` CR. For a workflow overview, see Creating a Ceph OSD remove task.
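Before you begin, it can help to confirm which OSDs belong to the failed node. A minimal sketch, assuming the standard Rook toolbox deployment `rook-ceph-tools` is available in the `rook-ceph` namespace (it may be named differently, or be absent, in your installation):

```bash
# Print the OSD tree to identify the failed node and its OSD IDs
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd tree
```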
## Remove a failed Ceph node
Open the `CephDeployment` CR for editing:

```bash
kubectl -n pelagia edit cephdpl
```
In the `nodes` section, remove the entry of the node to replace. If you use device filters, update the regular expression accordingly. For example:

```yaml
spec:
  nodes:
  - name: <nodeName> # remove the entire entry for the node to replace
    devices: {...}
    roles: [...]
```
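To double-check which node entries exist before editing, you can list them non-interactively. A sketch, assuming a single `CephDeployment` object in the `pelagia` namespace:

```bash
# Print the names of all nodes currently defined in the CephDeployment
kubectl -n pelagia get cephdpl -o jsonpath='{.items[0].spec.nodes[*].name}'
```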
Substitute `<nodeName>` with the name of the node to replace.

Save the `CephDeployment` changes and close the editor.

Create a `CephOsdRemoveTask` CR template and save it as `replace-failed-<nodeName>-task.yaml`:

```yaml
apiVersion: lcm.mirantis.com/v1alpha1
kind: CephOsdRemoveTask
metadata:
  name: replace-failed-<nodeName>-task
  namespace: pelagia
spec:
  nodes:
    <nodeName>:
      completeCleanUp: true
```
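To avoid hand-editing placeholders, you can generate the file with the node name substituted. A minimal sketch; the `NODE` variable and its example value are assumptions:

```bash
# Render the CephOsdRemoveTask manifest for a specific node
NODE=storage-worker-5   # replace with the failed node name
cat > replace-failed-${NODE}-task.yaml <<EOF
apiVersion: lcm.mirantis.com/v1alpha1
kind: CephOsdRemoveTask
metadata:
  name: replace-failed-${NODE}-task
  namespace: pelagia
spec:
  nodes:
    ${NODE}:
      completeCleanUp: true
EOF
```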
Apply the template to the cluster:

```bash
kubectl apply -f replace-failed-<nodeName>-task.yaml
```
Verify that the corresponding task has been created:

```bash
kubectl -n pelagia get cephosdremovetask
```
Verify that the `removeInfo` section appeared in the `CephOsdRemoveTask` CR `status`:

```bash
kubectl -n pelagia get cephosdremovetask replace-failed-<nodeName>-task -o yaml
```

Example of system response:

```yaml
removeInfo:
  cleanupMap:
    <nodeName>:
      osdMapping:
        ...
        <osdId>:
          deviceMapping:
            ...
            <deviceName>:
              path: <deviceByPath>
              partition: "/dev/ceph-b-vg_sdb/osd-block-b-lv_sdb"
              type: "block"
              class: "hdd"
              zapDisk: true
```
Definition of values in angle brackets:

- `<nodeName>` - the underlying machine node name, for example, `storage-worker-5`.
- `<osdId>` - the actual Ceph OSD ID for the device being replaced, for example, `1`.
- `<deviceName>` - the actual device name on the node, for example, `sdb`.
- `<deviceByPath>` - the actual device `by-path` on the node, for example, `/dev/disk/by-path/pci-0000:00:1f.9`.
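If you want to cross-check the reported mapping against Ceph itself, `ceph osd metadata` shows the devices backing an OSD. A sketch, again assuming a `rook-ceph-tools` toolbox deployment:

```bash
# Show the devices and partition path backing the OSD being removed
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- \
  ceph osd metadata <osdId> | grep -E '"devices"|"bluestore_bdev_partition_path"'
```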
Verify that the `cleanupMap` section matches the required removal and wait for the `ApproveWaiting` phase to appear in `status`:

```bash
kubectl -n pelagia get cephosdremovetask replace-failed-<nodeName>-task -o yaml
```

Example of system response:

```yaml
status:
  phase: ApproveWaiting
```
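Instead of polling manually, you can block until the phase is reached. A sketch using `kubectl wait`; the `--for=jsonpath` form requires kubectl 1.23 or later:

```bash
# Wait up to 10 minutes for the task to reach the ApproveWaiting phase
kubectl -n pelagia wait cephosdremovetask/replace-failed-<nodeName>-task \
  --for=jsonpath='{.status.phase}'=ApproveWaiting --timeout=10m
```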
Edit the `CephOsdRemoveTask` CR and set the `approve` flag to `true`:

```bash
kubectl -n pelagia edit cephosdremovetask replace-failed-<nodeName>-task
```

For example:

```yaml
spec:
  approve: true
```
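Equivalently, you can set the flag without opening an editor by applying a merge patch:

```bash
# Approve the removal task non-interactively
kubectl -n pelagia patch cephosdremovetask replace-failed-<nodeName>-task \
  --type merge -p '{"spec":{"approve":true}}'
```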
Review the following `status` fields of the Ceph LCM CR processing:

- `status.phase` - the current state of task processing
- `status.messages` - the description of the current phase
- `status.conditions` - the full history of task processing before the current phase
- `status.removeInfo.issues` and `status.removeInfo.warnings` - error and warning messages that occurred during task processing, if any
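For a quick look at the first two fields without scrolling through the full CR, a small sketch:

```bash
# Print the current phase and phase description of the task
kubectl -n pelagia get cephosdremovetask replace-failed-<nodeName>-task \
  -o jsonpath='{.status.phase}{"\n"}{.status.messages}{"\n"}'
```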
Verify that the `CephOsdRemoveTask` has been completed. For example:

```yaml
status:
  phase: Completed # or CompletedWithWarnings if there are non-critical issues
```
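If the phase is `CompletedWithWarnings`, you can extract the accumulated messages directly. A sketch:

```bash
# Print any warnings and issues recorded during task processing
kubectl -n pelagia get cephosdremovetask replace-failed-<nodeName>-task \
  -o jsonpath='{.status.removeInfo.warnings}{"\n"}{.status.removeInfo.issues}{"\n"}'
```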
Remove the device cleanup jobs:

```bash
kubectl delete jobs -n pelagia -l app=pelagia-lcm-cleanup-disks
```
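To confirm that no cleanup jobs remain:

```bash
# Expect "No resources found" once all cleanup jobs are deleted
kubectl -n pelagia get jobs -l app=pelagia-lcm-cleanup-disks
```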
## Deploy a new Ceph node after removal of a failed one
> **Note**
>
> You can spawn a Ceph OSD on a raw device, but it must be clean and contain no data or partitions. If you want to add a device that was previously in use, also ensure it is raw and clean. To clean up all data and partitions from a device, refer to the official Rook documentation.
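For reference, the Rook documentation describes wiping a disk along these lines. A destructive sketch; run it on the node that owns the disk, and triple-check the device path first:

```bash
# DESTRUCTIVE: removes all data and partitions from the device
DISK=/dev/sdb                  # example device; adjust to your disk
sgdisk --zap-all $DISK         # clear GPT and MBR partition tables
dd if=/dev/zero of=$DISK bs=1M count=100 oflag=direct,dsync  # zero the first 100 MiB
partprobe $DISK                # ask the kernel to re-read the partition table
```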
Open the `CephDeployment` CR for editing:

```bash
kubectl -n pelagia edit cephdpl
```

In the `nodes` section, add the new device:

```yaml
spec:
  nodes:
  - name: <nodeName> # add new configuration for the replaced Ceph node
    devices:
    - fullPath: <deviceByID> # Recommended. Contains either the disk serial number or wwn
      # name: <deviceByID> # Not recommended. Use if the device should be added with by-id
      # fullPath: <deviceByPath> # Use if the device should be added with by-path
      config:
        deviceClass: hdd
    ...
```
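To find stable `by-id` and `by-path` identifiers for the new disk, list them on the node itself. A sketch; `sdb` is an example device name:

```bash
# Map the device name to its persistent identifiers
ls -l /dev/disk/by-id/ | grep sdb
ls -l /dev/disk/by-path/ | grep sdb
```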
Substitute `<nodeName>` with the name of the replaced node and configure it as required.

Verify that all Ceph daemons from the replaced node have appeared in the Ceph cluster and are `in` and `up`. The `healthReport` section of the `CephDeploymentHealth` CR should not contain any issues:

```bash
kubectl -n pelagia get cephdeploymenthealth -o yaml
```
Example of system response:

```yaml
status:
  healthReport:
    rookCephObjects:
      cephCluster:
        ceph:
          health: HEALTH_OK
          ...
    cephDaemons:
      cephDaemons:
        mgr:
          info:
          - 'a is active mgr, standbys: [b]'
          status: ok
        mon:
          info:
          - 3 mons, quorum [a b c]
          status: ok
        osd:
          info:
          - 3 osds, 3 up, 3 in
          status: ok
```
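You can also cross-check the daemon counts directly against Ceph. A sketch, assuming the `rook-ceph-tools` toolbox deployment is present:

```bash
# Expect all mons in quorum and the new OSDs reported as up and in
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph -s
```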
Verify the Ceph node in the Rook namespace:

```bash
kubectl -n rook-ceph get pod -o wide | grep <nodeName>
```
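To narrow the output to OSD pods only, you can filter by the standard Rook label (an assumption about your Rook labeling):

```bash
# List only the Ceph OSD pods scheduled on the replaced node
kubectl -n rook-ceph get pod -l app=rook-ceph-osd -o wide | grep <nodeName>
```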