Add, remove, or reconfigure Ceph OSDs with metadata devices

Mirantis Ceph Controller simplifies Ceph cluster management by automating lifecycle management (LCM) operations. This section describes how to add, remove, or reconfigure Ceph OSDs that use a separate metadata device.

Add a Ceph OSD with a metadata device

  1. From the Ceph disks defined in the BareMetalHostProfile object, which was configured as described in Configure Ceph disks in a host profile, select one disk for data and one logical volume for metadata of the Ceph OSD to add to the Ceph cluster.

    Note

    If you add a new disk after machine provisioning, manually prepare the required machine devices using Logical Volume Manager (LVM) 2 on the existing node because BareMetalHostProfile does not support in-place changes.

  2. Open the KaaSCephCluster object for editing:

    kubectl -n <managedClusterProjectName> edit kaascephcluster
    

    Substitute <managedClusterProjectName> with the corresponding value.

  3. In the nodes.<machineName>.storageDevices section, specify the parameters for a Ceph OSD as required. For the parameters description, see Node parameters.

    Example configuration of the nodes section with the new storageDevices item:

    nodes:
      kaas-node-5bgk6:
        roles:
        - mon
        - mgr
        storageDevices:
        - config: # existing item
            deviceClass: hdd
          fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
        - config: # new item
            deviceClass: hdd
            metadataDevice: /dev/bluedb/meta_1
          fullPath: /dev/disk/by-id/scsi-0ATA_HGST_HUS724040AL_PN1334PEHN1VBC

    The same configuration with storage devices addressed by device name:

    nodes:
      kaas-node-5bgk6:
        roles:
        - mon
        - mgr
        storageDevices:
        - config: # existing item
            deviceClass: hdd
          name: sdb
        - config: # new item
            deviceClass: hdd
            metadataDevice: /dev/bluedb/meta_1
          name: sdc
    

    Warning

    Since Container Cloud 2.25.0, Mirantis highly recommends using the non-wwn by-id symlinks to specify storage devices in the storageDevices list.

    For details, see Addressing storage devices.

  4. Verify that the Ceph OSD is successfully deployed on the specified node:

    kubectl -n <managedClusterProjectName> get kaascephcluster -o yaml
    

    In the system response, the fullClusterInfo section should not contain any issues.

    Example of a successful system response:

    status:
      fullClusterInfo:
        daemonsStatus:
          ...
          osd:
            running: '4/4 running: 4 up, 4 in'
            status: Ok
    
  5. Obtain the name of the node on which the machine with the Ceph OSD is running:

    kubectl -n <managedClusterProjectName> get machine <machineName> -o jsonpath='{.status.nodeRef.name}'
    

    Substitute <managedClusterProjectName> and <machineName> with the corresponding values.

  6. Verify the Ceph OSD status:

    kubectl -n rook-ceph get pod -l app=rook-ceph-osd -o wide | grep <nodeName>
    

    Substitute <nodeName> with the value obtained in the previous step.

    Example of system response:

    rook-ceph-osd-0-7b8d4d58db-f6czn   1/1     Running   0          42h   10.100.91.6   kaas-node-6c5e76f9-c2d2-4b1a-b047-3c299913a4bf   <none>           <none>
    rook-ceph-osd-1-78fbc47dc5-px9n2   1/1     Running   0          21h   10.100.91.6   kaas-node-6c5e76f9-c2d2-4b1a-b047-3c299913a4bf   <none>           <none>
    rook-ceph-osd-3-647f8d6c69-87gxt   1/1     Running   0          21h   10.100.91.6   kaas-node-6c5e76f9-c2d2-4b1a-b047-3c299913a4bf   <none>           <none>
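
The verification in step 4 can also be scripted. The following is a minimal sketch, not an official tool: the function name osds_all_running is hypothetical, and it assumes that the osd.running summary always starts with "<running>/<total>", as in the example response above. The jsonpath in the commented usage is derived from that example and assumes a single KaaSCephCluster object in the namespace.

```shell
# Return success (0) when the osd.running summary reports all Ceph OSDs running.
# Assumption: the summary starts with "<running>/<total>", for example
# "4/4 running: 4 up, 4 in".
osds_all_running() {
  summary=$1
  case $summary in
    [0-9]*/[0-9]*) ;;        # looks like "<digits>/<digits>..."
    *) return 1 ;;           # empty or unexpected format
  esac
  running=${summary%%/*}     # text before the first "/"
  rest=${summary#*/}         # text after the first "/"
  total=${rest%%[!0-9]*}     # leading digits of the remainder
  [ "$running" = "$total" ]
}

# Usage against a live cluster (not executed here):
# summary=$(kubectl -n <managedClusterProjectName> get kaascephcluster \
#   -o jsonpath='{.items[0].status.fullClusterInfo.daemonsStatus.osd.running}')
# osds_all_running "$summary" && echo "all Ceph OSDs are running"
```

The check only compares the two counters. Review the rest of the fullClusterInfo section for issues as described in step 4.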
    

Remove a Ceph OSD with a metadata device

Note

Ceph OSD removal requires the KaaSCephOperationRequest custom resource (CR). For the workflow overview and the description of the spec and phases, see High-level workflow of Ceph OSD or node removal.

Warning

If a Ceph pool uses the non-recommended replicated.size of less than 3, Ceph OSD removal cannot be performed. The minimal replica size equals half of the specified replicated.size, rounded up.

For example, if replicated.size is 2, the minimal replica size is 1, and if replicated.size is 3, the minimal replica size is 2. A replica size of 1 allows Ceph to have PGs with only one Ceph OSD in the acting state, which may cause a PG_TOO_DEGRADED health warning that blocks Ceph OSD removal. Mirantis recommends setting replicated.size to 3 for each Ceph pool.
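
The rounding rule from the warning above can be reproduced with integer arithmetic. This is only an illustration of the formula as stated in this section. The function name min_replica_size is hypothetical and is not part of any Ceph or Container Cloud tooling.

```shell
# Minimal replica size = replicated.size / 2, rounded up.
# With integer division, ceil(n / 2) equals (n + 1) / 2.
min_replica_size() {
  echo $(( ($1 + 1) / 2 ))
}

for size in 2 3; do
  echo "replicated.size=$size -> minimal replica size $(min_replica_size "$size")"
done
```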

  1. Open the KaaSCephCluster object of the managed cluster for editing:

    kubectl edit kaascephcluster -n <managedClusterProjectName>
    

    Substitute <managedClusterProjectName> with the corresponding value.

  2. Remove the required Ceph OSD specification from the spec.cephClusterSpec.nodes.<machineName>.storageDevices list.

    Example configuration of the nodes section with the item to remove:

    nodes:
      kaas-node-5bgk6:
        roles:
        - mon
        - mgr
        storageDevices:
        - config:
            deviceClass: hdd
          fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
        - config: # remove the entire item entry from storageDevices list
            deviceClass: hdd
            metadataDevice: /dev/bluedb/meta_1
          fullPath: /dev/disk/by-id/scsi-0ATA_HGST_HUS724040AL_PN1334PEHN1VBC

    The same configuration with storage devices addressed by device name:

    nodes:
      kaas-node-5bgk6:
        roles:
        - mon
        - mgr
        storageDevices:
        - config:
            deviceClass: hdd
          name: sdb
        - config: # remove the entire item entry from storageDevices list
            deviceClass: hdd
            metadataDevice: /dev/bluedb/meta_1
          name: sdc
    
  3. Create a YAML template for the KaaSCephOperationRequest CR. For example:

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: KaaSCephOperationRequest
    metadata:
      name: remove-osd-<machineName>-sdb
      namespace: <managedClusterProjectName>
    spec:
      osdRemove:
        nodes:
          <machineName>:
            cleanupByDevice:
            - name: sdb
      kaasCephCluster:
        name: <kaasCephClusterName>
        namespace: <managedClusterProjectName>
    

    Substitute <machineName> with the name of the machine that hosts the Ceph OSD, <managedClusterProjectName> with the corresponding cluster namespace, and <kaasCephClusterName> with the corresponding KaaSCephCluster name.

    Warning

    Since Container Cloud 2.25.0, Mirantis does not recommend setting a device name or a device by-path symlink in the cleanupByDevice field because these identifiers are not persistent and can change at node boot. Instead, remove Ceph OSDs using by-id symlinks specified in the path field, or use cleanupByOsdId.

    For details, see Addressing storage devices.

    Note

    • Since Container Cloud 2.23.0 and 2.23.1 for MOSK 23.1, cleanupByDevice is not supported if a device was physically removed from a node. Therefore, use cleanupByOsdId instead. For details, see Remove a failed Ceph OSD by Ceph OSD ID.

    • Before Container Cloud 2.23.0 and 2.23.1 for MOSK 23.1, if the storageDevice item was specified with by-id, specify the path parameter in the cleanupByDevice section instead of name.

    • If the storageDevice item was specified with a by-path device path, specify the path parameter in the cleanupByDevice section instead of name.

  4. Apply the template on the management cluster in the corresponding namespace:

    kubectl apply -f remove-osd-<machineName>-sdb.yaml
    
  5. Verify that the corresponding request has been created:

    kubectl get kaascephoperationrequest remove-osd-<machineName>-sdb -n <managedClusterProjectName>
    
  6. Verify that the removeInfo section appeared in the KaaSCephOperationRequest CR status:

    kubectl -n <managedClusterProjectName> get kaascephoperationrequest remove-osd-<machineName>-sdb -o yaml
    

    Example of system response:

    status:
      childNodesMapping:
        kaas-node-d4aac64d-1721-446c-b7df-e351c3025591: <machineName>
      osdRemoveStatus:
        removeInfo:
          cleanUpMap:
            kaas-node-d4aac64d-1721-446c-b7df-e351c3025591:
              osdMapping:
                "10":
                  deviceMapping:
                    sdb:
                      path: "/dev/disk/by-path/pci-0000:00:1t.9"
                      partition: "/dev/ceph-b-vg_sdb/osd-block-b-lv_sdb"
                      type: "block"
                      class: "hdd"
                      zapDisk: true
                "5":
                  deviceMapping:
                    /dev/sdc:
                      deviceClass: hdd
                      devicePath: /dev/disk/by-path/pci-0000:00:0f.0
                      devicePurpose: block
                      usedPartition: /dev/ceph-2d11bf90-e5be-4655-820c-fb4bdf7dda63/osd-block-e41ce9a8-4925-4d52-aae4-e45167cfcf5c
                      zapDisk: true
                    /dev/sdf:
                      deviceClass: hdd
                      devicePath: /dev/disk/by-path/pci-0000:00:12.0
                      devicePurpose: db
                      usedPartition: /dev/bluedb/meta_1
    
  7. Verify that the cleanUpMap section matches the required removal and wait for the ApproveWaiting phase to appear in status:

    kubectl -n <managedClusterProjectName> get kaascephoperationrequest remove-osd-<machineName>-sdb -o yaml
    

    Example of system response:

    status:
      phase: ApproveWaiting
    
  8. In the KaaSCephOperationRequest CR, set the approve flag to true:

    kubectl -n <managedClusterProjectName> edit kaascephoperationrequest remove-osd-<machineName>-sdb
    

    Configuration snippet:

    spec:
      osdRemove:
        approve: true
    
  9. To track request processing, review the following status fields of the KaaSCephOperationRequest CR:

    • status.phase - current state of request processing

    • status.messages - description of the current phase

    • status.conditions - full history of request processing before the current phase

    • status.removeInfo.issues and status.removeInfo.warnings - error and warning messages that occurred during request processing, if any

  10. Verify that the KaaSCephOperationRequest has been completed.

    Example of a successful status.phase value:

    status:
      phase: Completed # or CompletedWithWarnings if there are non-critical issues
    
  11. Remove the device cleanup jobs:

    kubectl delete jobs -n ceph-lcm-mirantis -l app=miraceph-cleanup-disks
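
Steps 3 through 8 of the procedure above can be sketched as a small set of shell helpers. This is a hedged illustration, not a supported tool. The function names and argument order are hypothetical, the 10-minute timeout is arbitrary, and waiting with `kubectl wait --for=jsonpath` assumes a reasonably recent kubectl. The manifest uses the path field with a by-id symlink, as recommended in the warning in step 3.

```shell
# Print a KaaSCephOperationRequest manifest that removes one Ceph OSD
# addressed by a persistent by-id symlink instead of a device name.
# Arguments: <requestName> <namespace> <machineName> <devicePath> <kaasCephClusterName>
render_osd_remove_request() {
  request_name=$1
  namespace=$2
  machine=$3
  device_path=$4
  cluster=$5
  cat <<EOF
apiVersion: kaas.mirantis.com/v1alpha1
kind: KaaSCephOperationRequest
metadata:
  name: ${request_name}
  namespace: ${namespace}
spec:
  osdRemove:
    nodes:
      ${machine}:
        cleanupByDevice:
        - path: ${device_path}
  kaasCephCluster:
    name: ${cluster}
    namespace: ${namespace}
EOF
}

# Block until the request reaches the ApproveWaiting phase (step 7).
# Arguments: <requestName> <namespace>
wait_for_approve_waiting() {
  kubectl -n "$2" wait "kaascephoperationrequest/$1" \
    --for=jsonpath='{.status.phase}'=ApproveWaiting --timeout=10m
}

# Set the approve flag (step 8). Call this only after reviewing cleanUpMap.
# Arguments: <requestName> <namespace>
approve_request() {
  kubectl -n "$2" patch kaascephoperationrequest "$1" \
    --type merge -p '{"spec":{"osdRemove":{"approve":true}}}'
}

# Usage against the management cluster (not executed here):
# render_osd_remove_request remove-osd-<machineName>-sdc <managedClusterProjectName> \
#   <machineName> /dev/disk/by-id/<deviceId> <kaasCephClusterName> | kubectl apply -f -
# wait_for_approve_waiting remove-osd-<machineName>-sdc <managedClusterProjectName>
# approve_request remove-osd-<machineName>-sdc <managedClusterProjectName>
```

The wait and approve helpers are deliberately separate so that the cleanUpMap section can still be reviewed manually between them, as step 7 requires.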
    

Reconfigure a partition of a Ceph OSD metadata device

There is no hot reconfiguration procedure for existing Ceph OSDs. To reconfigure an existing Ceph node, remove and re-add a Ceph OSD with a metadata device using one of the following options:

  • Since Container Cloud 2.24.0, if metadata device partitions are specified in the BareMetalHostProfile object as described in Configure Ceph disks in a host profile, the metadata device definition is an LVM path in metadataDevice of the KaaSCephCluster object.

    Therefore, automated LCM cleans up the logical volume without removing it, so the volume can be reused. For this reason, to reconfigure a partition of a Ceph OSD metadata device:

    1. Remove a Ceph OSD from the Ceph cluster as described in Remove a Ceph OSD with a metadata device.

    2. Add the same Ceph OSD but with a modified configuration as described in Add a Ceph OSD with a metadata device.

  • Before Container Cloud 2.24.0 or if metadata device partitions are not specified in the BareMetalHostProfile object as described in Configure Ceph disks in a host profile, the most common definition of a metadata device is a full device name (by-path or by-id) in metadataDevice of the KaaSCephCluster object for Ceph OSD. For example, metadataDevice: /dev/nvme0n1. In this case, to reconfigure a partition of a Ceph OSD metadata device:

    1. Remove a Ceph OSD from the Ceph cluster as described in Remove a Ceph OSD with a metadata device. Automated LCM will clean up the data device and will remove the metadata device partition for the required Ceph OSD.

    2. Reconfigure the metadata device partition manually to use it during addition of a new Ceph OSD.

      Manual reconfiguration of a metadata device partition:

      1. Log in to the Ceph node running a Ceph OSD to reconfigure.

      2. Find the metadata device used for Ceph OSDs. It should contain LVM partitions with the osd--db substring in their names:

        lsblk
        

        Example of system response:

        ...
        vdf               252:80   0   32G  0 disk
        ├─ceph--7831901d--398e--415d--8941--e78486f3b019-osd--db--4bdbb0a0--e613--416e--ab97--272f237b7eab
        │                 253:3    0   16G  0 lvm
        └─ceph--7831901d--398e--415d--8941--e78486f3b019-osd--db--8f439d5c--1a19--49d5--b71f--3c25ae343303
                          253:5    0   16G  0 lvm
        

        Note the volume group UUID and the logical volume sizes. In the example above, the volume group UUID is ceph--7831901d--398e--415d--8941--e78486f3b019 (lsblk prints each hyphen of an LVM name doubled) and the size of each logical volume is 16G.

      3. Find the volume group of the metadata device:

        vgs
        

        Example of system response:

        VG                                        #PV #LV #SN Attr   VSize   VFree
        ceph-508c7a6d-db01-4873-98c3-52ab204b5ca8   1   1   0 wz--n- <32.00g    0
        ceph-62d84b29-8de5-440c-a6e9-658e8e246af7   1   1   0 wz--n- <32.00g    0
        ceph-754e0772-6d0f-4629-bf1d-24cb79f3ee82   1   1   0 wz--n- <32.00g    0
        ceph-7831901d-398e-415d-8941-e78486f3b019   1   2   0 wz--n- <48.00g <17.00g
        lvm_root                                    1   1   0 wz--n- <61.03g    0
        

        Capture the volume group with the name that matches the prefix of LVM partitions of the metadata device. In the example above, the required volume group is ceph-7831901d-398e-415d-8941-e78486f3b019.

      4. Manually create an LVM partition for the new Ceph OSD by creating a new logical volume in the obtained volume group:

        lvcreate -L <lvSize> -n <lvName> <vgName>
        

        Substitute the following parameters:

        • <lvSize> with the previously obtained logical volume size. In the example above, it is 16G.

        • <lvName> with a new logical volume name. For example, meta_1.

        • <vgName> with the previously obtained volume group name. In the example above, it is ceph-7831901d-398e-415d-8941-e78486f3b019.

        Note

        Manually created partitions can be removed only manually, or during a complete metadata disk removal, or during the Machine object removal or re-provisioning.

    3. Add the same Ceph OSD but with a modified configuration and manually created logical volume of the metadata device as described in Add a Ceph OSD with a metadata device.

      For example, instead of metadataDevice: /dev/bluedb/meta_1, define metadataDevice: /dev/ceph-7831901d-398e-415d-8941-e78486f3b019/meta_1, which points to the logical volume created manually in the previous step.
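
In the manual reconfiguration steps above, the name printed by lsblk has to be matched by eye against the vgs output. The following sketch shows one way to derive the volume group name from the lsblk name. It relies on the device-mapper convention of joining "<vg>-<lv>" with a single hyphen and doubling every hyphen that belongs to the names themselves. The function name dm_name_to_vg is hypothetical, and the lvcreate values in the commented usage are taken from the example in this section.

```shell
# Convert a device-mapper name, as printed by lsblk, into the LVM volume
# group name that vgs reports.
dm_name_to_vg() {
  printf '%s\n' "$1" | awk '{
    gsub(/--/, "@")   # protect doubled hyphens; "@" is not valid in LVM names
    sub(/-.*/, "")    # drop the single-hyphen separator and the LV part
    gsub(/@/, "-")    # restore the hyphens that belong to the VG name
    print
  }'
}

# Usage on the Ceph node (not executed here):
# vg=$(dm_name_to_vg "<name of an osd--db partition from lsblk>")
# vgs "$vg"                          # confirm that VFree covers the new volume
# lvcreate -L 16G -n meta_1 "$vg"    # then set metadataDevice: /dev/$vg/meta_1
```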