Migrate Ceph cluster to address storage devices using by-id

The by-id identifier is the only persistent device identifier for a Ceph cluster that remains stable after the cluster upgrade or any other maintenance. Therefore, Mirantis recommends using device by-id symlinks rather than device names or by-path symlinks.

Container Cloud uses the device by-id identifier as the default method of addressing the underlying devices of Ceph OSDs. Therefore, migrate all existing Ceph clusters that still use device names or device by-path symlinks to the by-id format.

This section explains how to configure the KaaSCephCluster specification to use the by-id symlinks instead of disk names and by-path identifiers as the default method of addressing storage devices.

Note

Mirantis recommends avoiding wwn symlinks as by-id identifiers because they are not persistent: they may be discovered inconsistently during node boot.

Besides migrating to by-id, consider using the fullPath field instead of the name field for by-id symlinks in the spec.cephClusterSpec.nodes.storageDevices section. This approach makes the field names and their use cases easier to understand.

Note

MOSK enables you to use fullPath for the by-id symlinks since MCC 2.25.0 (Cluster release 17.0.0). For earlier product versions, use the name field instead.

Migrate the Ceph nodes section to by-id identifiers

Available since MCC 2.25.0 (Cluster release 17.0.0)

Make sure that your managed cluster is not currently running an upgrade or any other maintenance process.

Obtain the list of all KaaSCephCluster storage devices that use disk names or disk by-path symlinks as identifiers of Ceph node storage devices:

kubectl -n <managedClusterProject> get kcc -o yaml

Substitute <managedClusterProject> with the corresponding managed cluster namespace.

Output example:

spec:
  cephClusterSpec:
    nodes:
      ...
      managed-worker-1:
        storageDevices:
        - config:
            deviceClass: hdd
          name: sdc
        - config:
            deviceClass: hdd
          fullPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2
      managed-worker-2:
        storageDevices:
        - config:
            deviceClass: hdd
          name: /dev/disk/by-id/wwn-0x26d546263bd312b8
        - config:
            deviceClass: hdd
          name: /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dsdc
      managed-worker-3:
        storageDevices:
        - config:
            deviceClass: nvme
          name: nvme3n1
        - config:
            deviceClass: hdd
          fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS

Verify the items from the storageDevices sections to be moved to the by-id symlinks. The list of the items to migrate includes:
- A disk name in the name field. For example, sdc, nvme3n1, and so on.
- A disk /dev/disk/by-path symlink in the fullPath field. For example, /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2.
- A disk /dev/disk/by-id symlink in the name field.
  
  Note
  
  This condition applies since MCC 2.25.0 (Cluster release 17.0.0).
- A disk /dev/disk/by-id/wwn symlink, which is programmatically calculated at boot. For example, /dev/disk/by-id/wwn-0x26d546263bd312b8.
For the example above, migrate both items of managed-worker-1, both items of managed-worker-2, and the first item of managed-worker-3. The second item of managed-worker-3 is already configured in the required format, so leave it as is.
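The criteria listed above can be sketched as a small shell helper. This is only an illustration: the function name needs_migration and its two arguments (the field name and the identifier value) are hypothetical and are not part of any Container Cloud tooling. The helper also assumes that any fullPath value outside /dev/disk/by-id should be migrated, and it applies the rule for by-id symlinks in the name field, which is valid since MCC 2.25.0 (Cluster release 17.0.0).

```shell
# Hypothetical helper: decide whether one storageDevices item needs migration.
# $1 is the field that holds the identifier (name or fullPath).
# $2 is the identifier value.
# Returns 0 if the item should be migrated, 1 if it can stay as is.
needs_migration() {
  field="$1"
  value="$2"

  # wwn symlinks are not recommended, whatever field they are stored in.
  case "$value" in
    /dev/disk/by-id/wwn-*) return 0 ;;
  esac

  # Disk names and by-id symlinks stored in the name field are migrated.
  if [ "$field" = "name" ]; then
    return 0
  fi

  case "$value" in
    /dev/disk/by-id/*) return 1 ;;  # non-wwn by-id symlink in fullPath
    *) return 0 ;;                  # by-path (assumption: anything else too)
  esac
}
```

For instance, needs_migration name sdc returns 0, while needs_migration fullPath /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS returns 1, which matches the assessment of the example output.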
To migrate all affected storageDevices items to by-id symlinks, open the KaaSCephCluster custom resource for editing:
```
kubectl -n <managedClusterProject> edit kcc
```

For each affected node from the spec.cephClusterSpec.nodes section, obtain a corresponding status.providerStatus.hardware.storage section from the Machine custom resource:

kubectl -n <managedClusterProject> get machine <machineName> -o yaml

Substitute <managedClusterProject> with the corresponding cluster namespace and <machineName> with the machine name.

Output example for managed-worker-1:

status:
  providerStatus:
    hardware:
      storage:
      - byID: /dev/disk/by-id/wwn-0x05ad99618d66a21f
        byIDs:
        - /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_05ad99618d66a21f
        - /dev/disk/by-id/scsi-305ad99618d66a21f
        - /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_05ad99618d66a21f
        - /dev/disk/by-id/wwn-0x05ad99618d66a21f
        byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:0
        byPaths:
        - /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:0
        name: /dev/sda
        serialNumber: 05ad99618d66a21f
        size: 61
        type: hdd
      - byID: /dev/disk/by-id/wwn-0x26d546263bd312b8
        byIDs:
        - /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_26d546263bd312b8
        - /dev/disk/by-id/scsi-326d546263bd312b8
        - /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_26d546263bd312b8
        - /dev/disk/by-id/wwn-0x26d546263bd312b8
        byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2
        byPaths:
        - /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2
        name: /dev/sdb
        serialNumber: 26d546263bd312b8
        size: 32
        type: hdd
      - byID: /dev/disk/by-id/wwn-0x2e52abb48862dbdc
        byIDs:
        - /dev/disk/by-id/lvm-pv-uuid-MncrcO-6cel-0QsB-IKaY-e8UK-6gDy-k2hOtf
        - /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_2e52abb48862dbdc
        - /dev/disk/by-id/scsi-32e52abb48862dbdc
        - /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dbdc
        - /dev/disk/by-id/wwn-0x2e52abb48862dbdc
        byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:1
        byPaths:
        - /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:1
        name: /dev/sdc
        serialNumber: 2e52abb48862dbdc
        size: 61
        type: hdd

For each affected storageDevices item of the Machine in question, obtain a correct by-id symlink from status.providerStatus.hardware.storage.byIDs. The by-id symlink must contain the value of status.providerStatus.hardware.storage.serialNumber and must not contain wwn.

For managed-worker-1, according to the example output above, we can use the following by-id symlinks:
- Replace the first item of storageDevices that contains name: sdc with fullPath: /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dbdc.
- Replace the second item of storageDevices that contains fullPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2 with fullPath: /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_26d546263bd312b8.
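The selection rule (the symlink contains the serial number and does not contain wwn) can be sketched as a simple filter. The function name pick_by_id is hypothetical. It assumes that the byIDs entries of one disk are passed on standard input, one per line, and that the serial number is passed as the first argument. The filter only narrows the list. If several candidates remain, the choice among them is still yours.

```shell
# Hypothetical helper: read candidate by-id symlinks on stdin (one per line)
# and print those that contain the serial number in $1 and do not contain "wwn".
pick_by_id() {
  serial="$1"
  # -F treats the serial number as a fixed string, not as a pattern.
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -F -- "$serial" | grep -v -- 'wwn' || true
}

# Usage with the byIDs of /dev/sdc from the example output above.
printf '%s\n' \
  /dev/disk/by-id/lvm-pv-uuid-MncrcO-6cel-0QsB-IKaY-e8UK-6gDy-k2hOtf \
  /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_2e52abb48862dbdc \
  /dev/disk/by-id/scsi-32e52abb48862dbdc \
  /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dbdc \
  /dev/disk/by-id/wwn-0x2e52abb48862dbdc |
  pick_by_id 2e52abb48862dbdc
```

With this input, the filter drops the lvm-pv-uuid and wwn entries and prints the three scsi- symlinks, including the scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dbdc symlink used in the example.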

Replace all affected storageDevices items in KaaSCephCluster with the obtained ones.

Note

Prior to MCC 2.25.0 (Cluster release 17.0.0), place the by-id symlinks in the name field instead of the fullPath field.

The resulting example of the storage device identifier migration:

spec:
  cephClusterSpec:
    nodes:
      ...
      managed-worker-1:
        storageDevices:
        - config:
            deviceClass: hdd
          fullPath: /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dbdc
        - config:
            deviceClass: hdd
          fullPath: /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_26d546263bd312b8
      managed-worker-2:
        storageDevices:
        - config:
            deviceClass: hdd
          fullPath: /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_031d9054c9b48f79
        - config:
            deviceClass: hdd
          fullPath: /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dsdc
      managed-worker-3:
        storageDevices:
        - config:
            deviceClass: nvme
          fullPath: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
        - config:
            deviceClass: hdd
          fullPath: /dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS

Save and quit editing the KaaSCephCluster custom resource.

After you save the changes, re-orchestration occurs. The procedure should not cause any actual changes to the state of the Ceph OSDs in the cluster.

Migrate the Ceph nodeGroups section to by-id identifiers

Available since MCC 2.25.0 (Cluster release 17.0.0)

Besides the nodes section, your cluster may contain the nodeGroups section specified with disk names instead of by-id symlinks. Unlike the in-place replacement of storage device identifiers in the nodes section, nodeGroups requires a different approach because the same spec section applies to several nodes.

To migrate nodeGroups storage devices, use the deviceLabels section to assign the same labels to the corresponding disks of different nodes, and then reference these labels in the node groups. For the deviceLabels section specification, refer to Ceph advanced configuration: extraOpts.

The following procedure describes how to keep the nodeGroups section but use unique by-id identifiers instead of disk names.

To migrate the Ceph nodeGroups section to by-id identifiers:

Make sure that your managed cluster is not currently running an upgrade or any other maintenance process.

Obtain the list of all KaaSCephCluster storage devices that use disk names or disk by-path symlinks as identifiers of Ceph node group storage devices:

kubectl -n <managedClusterProject> get kcc -o yaml

Substitute <managedClusterProject> with the corresponding managed cluster namespace.

Output example of the KaaSCephCluster nodeGroups section with disk names used as identifiers:

spec:
  cephClusterSpec:
    nodeGroups:
      ...
      rack-1:
        nodes:
        - node-1
        - node-2
        spec:
          crush:
            rack: "rack-1"
          storageDevices:
          - name: nvme0n1
            config:
              deviceClass: nvme
          - name: nvme1n1
            config:
              deviceClass: nvme
          - name: nvme2n1
            config:
              deviceClass: nvme
      rack-2:
        nodes:
        - node-3
        - node-4
        spec:
          crush:
            rack: "rack-2"
          storageDevices:
          - name: nvme0n1
            config:
              deviceClass: nvme
          - name: nvme1n1
            config:
              deviceClass: nvme
          - name: nvme2n1
            config:
              deviceClass: nvme
      rack-3:
        nodes:
        - node-5
        - node-6
        spec:
          crush:
            rack: "rack-3"
          storageDevices:
          - name: nvme0n1
            config:
              deviceClass: nvme
          - name: nvme1n1
            config:
              deviceClass: nvme
          - name: nvme2n1
            config:
              deviceClass: nvme

Verify the items from the storageDevices sections to be moved to by-id symlinks. The list of the items to migrate includes:
- A disk name in the name field. For example, sdc, nvme3n1, and so on.
- A disk /dev/disk/by-path symlink in the fullPath field. For example, /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2.
- A disk /dev/disk/by-id symlink in the name field.
  
  Note
  
  This condition applies since MCC 2.25.0 (Cluster release 17.0.0).
- A disk /dev/disk/by-id/wwn symlink, which is programmatically calculated at boot. For example, /dev/disk/by-id/wwn-0x26d546263bd312b8.
All storageDevices sections in the example above contain disk names in the name field. Therefore, you need to replace them with by-id symlinks.
Open the KaaSCephCluster custom resource for editing to start migration of all affected storageDevices items to by-id symlinks:
```
kubectl -n <managedClusterProject> edit kcc
```

For each affected Ceph node group in the nodeGroups section, add disk labels to the deviceLabels section for every affected storage device on each node listed in the nodes field of that node group. Verify that each disk label maps to the by-id symlink of the corresponding disk.

For example, if the node group rack-1 contains two nodes, node-1 and node-2, and its spec contains three items with name, obtain the proper by-id symlinks for these disk names on both nodes and assign the same disk labels to them on each node. The following example contains the labels for the by-id symlinks of the nvme0n1, nvme1n1, and nvme2n1 disks on node-1 and node-2, respectively:

spec:
  cephClusterSpec:
    extraOpts:
      deviceLabels:
        node-1:
          nvme-1: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
          nvme-2: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R372150
          nvme-3: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R183266
        node-2:
          nvme-1: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB4040ALR-00007_S46FNY0R900128
          nvme-2: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB4040ALR-00007_S46FNY0R805840
          nvme-3: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB4040ALR-00007_S46FNY0R848469

Note

Use the same set of device labels on all nodes of a node group. This allows a single unified spec to cover the different by-id symlinks of different nodes.
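If you collect the by-id symlinks in a plain list first, a short helper can render the deviceLabels fragment for you. The following is a minimal sketch: the function name emit_device_labels and the input format (whitespace-separated node, label, and symlink triples, grouped by node) are assumptions made for illustration. The output starts at deviceLabels, so indent it as needed when you paste it under spec.cephClusterSpec.extraOpts.

```shell
# Hypothetical helper: read "node label symlink" triples on stdin,
# grouped by node, and print them as a deviceLabels YAML fragment.
emit_device_labels() {
  awk '
    BEGIN { print "deviceLabels:" }
    NF != 3 { next }                                   # skip malformed lines
    $1 != node { node = $1; print "  " node ":" }      # start a new node
    { print "    " $2 ": " $3 }                        # label: symlink
  '
}

# Usage with two of the disks of node-1 from the example above.
printf '%s\n' \
  'node-1 nvme-1 /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543' \
  'node-1 nvme-2 /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R372150' |
  emit_device_labels
```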

Example of the full deviceLabels section for the nodeGroups section:

spec:
  cephClusterSpec:
    extraOpts:
      deviceLabels:
        node-1:
          nvme-1: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
          nvme-2: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R372150
          nvme-3: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R183266
        node-2:
          nvme-1: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB4040ALR-00007_S46FNY0R900128
          nvme-2: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB4040ALR-00007_S46FNY0R805840
          nvme-3: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB4040ALR-00007_S46FNY0R848469
        node-3:
          nvme-1: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB00T2B0A-00007_S46FNY0R900128
          nvme-2: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB00T2B0A-00007_S46FNY0R805840
          nvme-3: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB00T2B0A-00007_S46FNY0R848469
        node-4:
          nvme-1: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB00Z4SA0-00007_S46FNY0R286212
          nvme-2: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB00Z4SA0-00007_S46FNY0R350024
          nvme-3: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB00Z4SA0-00007_S46FNY0R300756
        node-5:
          nvme-1: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB8UK0QBD-00007_S46FNY0R577024
          nvme-2: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB8UK0QBD-00007_S46FNY0R718411
          nvme-3: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB8UK0QBD-00007_S46FNY0R831424
        node-6:
          nvme-1: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB01DAU34-00007_S46FNY0R908440
          nvme-2: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB01DAU34-00007_S46FNY0R945405
          nvme-3: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB01DAU34-00007_S46FNY0R224911

For each affected node group in the nodeGroups section, replace every field that contains a non-persistent disk identifier with the devLabel field set to the corresponding disk label from the deviceLabels section.

For the example above, the updated nodeGroups section looks as follows:

spec:
  cephClusterSpec:
    nodeGroups:
      ...
      rack-1:
        nodes:
        - node-1
        - node-2
        spec:
          crush:
            rack: "rack-1"
          storageDevices:
          - devLabel: nvme-1
            config:
              deviceClass: nvme
          - devLabel: nvme-2
            config:
              deviceClass: nvme
          - devLabel: nvme-3
            config:
              deviceClass: nvme
      rack-2:
        nodes:
        - node-3
        - node-4
        spec:
          crush:
            rack: "rack-2"
          storageDevices:
          - devLabel: nvme-1
            config:
              deviceClass: nvme
          - devLabel: nvme-2
            config:
              deviceClass: nvme
          - devLabel: nvme-3
            config:
              deviceClass: nvme
      rack-3:
        nodes:
        - node-5
        - node-6
        spec:
          crush:
            rack: "rack-3"
          storageDevices:
          - devLabel: nvme-1
            config:
              deviceClass: nvme
          - devLabel: nvme-2
            config:
              deviceClass: nvme
          - devLabel: nvme-3
            config:
              deviceClass: nvme
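Before saving, it can be useful to confirm that every node of a node group defines every label that the group references. The following sketch reports the missing combinations. The function name missing_labels and its input format are hypothetical: the first argument is a space-separated list of the group nodes, the second is a space-separated list of the devLabel values used by the group, and standard input carries the node and label pairs defined in deviceLabels.

```shell
# Hypothetical helper: print every "node label" pair that a node group
# requires but that is absent from the pairs read on stdin.
# $1: space-separated nodes of the node group.
# $2: space-separated devLabel values used in the node group spec.
missing_labels() {
  awk -v nodes="$1" -v labels="$2" '
    NF >= 2 { seen[$1 " " $2] = 1 }       # remember each defined pair
    END {
      n = split(nodes, ns, " ")
      m = split(labels, ls, " ")
      for (i = 1; i <= n; i++) {
        for (j = 1; j <= m; j++) {
          key = ns[i] " " ls[j]
          if (!(key in seen)) print key    # required but not defined
        }
      }
    }
  '
}

# Usage: node-2 lacks the nvme-2 label, so the helper prints "node-2 nvme-2".
printf '%s\n' 'node-1 nvme-1' 'node-1 nvme-2' 'node-2 nvme-1' |
  missing_labels 'node-1 node-2' 'nvme-1 nvme-2'
```

An empty output means that all nodes of the group carry all the labels that the group uses.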

Save and quit editing the KaaSCephCluster custom resource.

After you save the changes, re-orchestration occurs. The procedure should not cause any actual changes to the state of the Ceph OSDs in the cluster.