Addressing storage devices
Several formats are available for specifying and addressing the storage devices
of a Ceph cluster. The default and recommended one is the /dev/disk/by-id
format. This format is reliable and unaffected by disk controller actions,
such as device name shuffling or /dev/disk/by-path recalculation.
Difference between by-id, name, and by-path formats
The storage device /dev/disk/by-id format is in most cases based on a disk
serial number, which is unique for each disk. The udev rules create a by-id
symlink in the following format, where <BusID> is the ID of the bus to which
the disk is attached and <DiskSerialNumber> is the unique disk serial number:
/dev/disk/by-id/<BusID>-<DiskSerialNumber>
Typical by-id symlinks for storage devices look as follows:
/dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
/dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
/dev/disk/by-id/ata-WDC_WD4003FZEX-00Z4SA0_WD-WMC5D0D9DMEH
In the example above, the symlinks contain the following IDs:
Bus IDs: nvme, scsi-SATA, and ata
Disk serial numbers: SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543,
HGST_HUS724040AL_PN1334PEHN18ZS, and WDC_WD4003FZEX-00Z4SA0_WD-WMC5D0D9DMEH
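The naming scheme above can be illustrated with a minimal Python sketch that splits a by-id symlink into its bus ID and serial number. The helper name split_by_id and the KNOWN_BUS_IDS list are hypothetical: the list contains only the bus IDs from the examples above and is not an exhaustive list of what udev can produce. The sketch also accepts an underscore after the bus ID because the scsi-SATA example uses one.

```python
# Hypothetical helper for illustration only. The bus IDs below are the ones
# from the examples in this section, not a complete list of udev prefixes.
KNOWN_BUS_IDS = ("scsi-SATA", "nvme", "ata")


def split_by_id(path):
    """Split a by-id symlink path into (bus_id, serial), or return None."""
    name = path.rsplit("/", 1)[-1]
    # Check longer bus IDs first so that a longer prefix wins over a shorter one.
    for bus_id in sorted(KNOWN_BUS_IDS, key=len, reverse=True):
        rest = name[len(bus_id):]
        # The bus ID is followed by a single "-" or "_" separator.
        if name.startswith(bus_id) and rest[:1] in ("-", "_") and len(rest) > 1:
            return bus_id, rest[1:]
    return None
```

For symlinks with a prefix outside the assumed list, such as wwn, the function returns None instead of guessing.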
The wwn by-id symlinks are an exception to this rule because they are
generated programmatically at boot. They are based not only on disk serial
numbers but also on other node information, so the wwn value can be
recalculated when the node reboots. As a result, this symlink type cannot
guarantee a persistent disk identifier and should not be used as a stable
storage device symlink in a Ceph cluster.
The storage device name and by-path formats cannot be considered persistent
because the sequence in which block devices are added during boot is
semi-arbitrary. Block device names, for example, nvme0n1 and sdc, are assigned
to physical disks during discovery, so the assignment may differ from the
previous node state. The same inconsistency applies to by-path symlinks
because they rely on the shortest physical path to the device at boot, which
may also differ from the previous node state.
Therefore, Mirantis highly recommends using storage device by-id symlinks
that contain disk serial numbers. This approach provides a persistent device
identifier that you can use in the Ceph cluster specification.
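As a rough illustration of how a device name relates to its by-id symlinks, the following Python sketch lists the by-id symlinks that currently resolve to a given block device and skips the wwn ones. The function name by_id_links_for and the dev_dir parameter are hypothetical, and the sketch is meant to run on the storage node itself. It is not a Container Cloud tool.

```python
import os


def by_id_links_for(device_name, dev_dir="/dev"):
    """Return sorted non-wwn by-id symlinks that resolve to the given device.

    device_name is a kernel name such as "sdc" or "nvme0n1". dev_dir is
    configurable only to make the sketch testable outside a real /dev tree.
    """
    by_id_dir = os.path.join(dev_dir, "disk", "by-id")
    target = os.path.realpath(os.path.join(dev_dir, device_name))
    matches = []
    for entry in sorted(os.listdir(by_id_dir)):
        # Skip wwn symlinks, which are not recommended as stable identifiers.
        if entry.startswith("wwn-"):
            continue
        link = os.path.join(by_id_dir, entry)
        if os.path.realpath(link) == target:
            matches.append(link)
    return matches
```

For example, by_id_links_for("sdc") returns the by-id paths that point to /dev/sdc at the time of the call. Because device names can change between boots, the result is only meaningful for the current node state.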
Example KaaSCephCluster with device by-id identifiers
Below is an example KaaSCephCluster custom resource that uses the
/dev/disk/by-id format to specify storage devices:
Note
Container Cloud supports fullPath for the by-id symlinks since 2.25.0. For
earlier product versions, use the name field instead.
apiVersion: kaas.mirantis.com/v1alpha1
kind: KaaSCephCluster
metadata:
  name: ceph-cluster-managed-cluster
  namespace: managed-ns
spec:
  cephClusterSpec:
    nodes:
      # Add the exact ``nodes`` names.
      # Obtain the name from the "get machine" list.
      cz812-managed-cluster-storage-worker-noefi-58spl:
        roles:
        - mgr
        - mon
        # All disk configuration must be reflected in ``status.providerStatus.hardware.storage`` of the ``Machine`` object
        storageDevices:
        - config:
            deviceClass: ssd
          fullPath: /dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231440912
      cz813-managed-cluster-storage-worker-noefi-lr4k4:
        roles:
        - mgr
        - mon
        storageDevices:
        - config:
            deviceClass: nvme
          fullPath: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
      cz814-managed-cluster-storage-worker-noefi-z2m67:
        roles:
        - mgr
        - mon
        storageDevices:
        - config:
            deviceClass: nvme
          fullPath: /dev/disk/by-id/nvme-SAMSUNG_ML1EB3T8HMLA-00007_S46FNY1R130423
    pools:
    - default: true
      deviceClass: ssd
      name: kubernetes
      replicated:
        size: 3
      role: kubernetes
  k8sCluster:
    name: managed-cluster
    namespace: managed-ns
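A specification like the one above can be checked for non-persistent identifiers with a short Python sketch. The function name non_persistent_devices is hypothetical, and the input is assumed to be the custom resource already loaded into a plain dictionary, for example, by a YAML parser. The sketch treats any identifier that does not start with /dev/disk/by-id/ or that uses a wwn symlink as non-persistent. This is a simplifying assumption rather than the product validation logic.

```python
def non_persistent_devices(cluster):
    """Return (node, identifier) pairs that are not non-wwn by-id paths."""
    prefix = "/dev/disk/by-id/"
    problems = []
    nodes = cluster.get("spec", {}).get("cephClusterSpec", {}).get("nodes", {})
    for node_name, node in nodes.items():
        for device in node.get("storageDevices", []):
            # A device is addressed through fullPath or through the name field.
            ident = device.get("fullPath") or device.get("name") or ""
            is_by_id = ident.startswith(prefix)
            is_wwn = ident[len(prefix):].startswith("wwn-") if is_by_id else False
            if not is_by_id or is_wwn:
                problems.append((node_name, ident))
    return problems
```

An empty result means that every listed device uses a by-id path other than wwn. Any returned pair names a node and the identifier to review.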
Migrating device names used in KaaSCephCluster to device by-id symlinks
Most existing clusters use device names as storage device identifiers in the
spec.cephClusterSpec.nodes section of the KaaSCephCluster custom resource.
Therefore, they are prone to inconsistent storage device identifiers during a
cluster update. To mitigate possible risks, refer to Migrate Ceph cluster to
address storage devices using by-id.