Provisioning failure due to device naming issues in a bare metal host profile

During bare metal host provisioning, the transition to each stage involves a host reboot. This can cause device naming issues if a device is configured using the by_name device identifier.

In Linux, the assignment of device names, such as /dev/sda, to physical disks can change, especially on systems with multiple disks or when the hardware configuration changes. For example:

  • If you add or remove a hard drive or change the boot order, the device names can shift.

  • If the system uses hardware with additional disk array controllers, such as RAID controllers in the JBOD mode, device names can shift during reboot. This can lead to unintended consequences and potential data loss if your file systems are not mounted correctly.

  • The device that is /dev/sda on the first boot may become /dev/sdb on the second boot. Consequently, your file system may not be provisioned as expected, leading to errors during disk formatting and assembling.
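
The following minimal Python sketch illustrates the shift described above. The two inventories and the serial values are invented for illustration and do not come from a real host. The sketch only shows that a lookup by kernel name can land on a different physical disk after a reboot, while a lookup by a stable property finds the same disk:

```python
# Hypothetical inventories: kernel name -> disk serial, as seen on two boots.
boot_1 = {"/dev/sda": "SERIAL-A", "/dev/sdb": "SERIAL-B"}
boot_2 = {"/dev/sda": "SERIAL-B", "/dev/sdb": "SERIAL-A"}  # enumeration order flipped


def serial_for_name(inventory, name):
    """Return the serial of whichever disk currently holds the kernel name."""
    return inventory[name]


def name_for_serial(inventory, serial):
    """Return the current kernel name of the disk with the given serial."""
    for name, disk_serial in inventory.items():
        if disk_serial == serial:
            return name
    raise LookupError(f"no disk with serial {serial}")


# Selecting by name silently lands on a different physical disk after reboot.
print(serial_for_name(boot_1, "/dev/sda"))
print(serial_for_name(boot_2, "/dev/sda"))

# Selecting by serial finds the same disk, whatever name it currently has.
print(name_for_serial(boot_1, "SERIAL-A"))
print(name_for_serial(boot_2, "SERIAL-A"))
```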

The common recommendation for Linux systems is to use universally unique identifiers (UUIDs) or labels for device identification in /etc/fstab. These identifiers are more stable than device names and ensure that the defined devices are mounted regardless of naming changes.
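
As a small illustration, the Python sketch below formats one /etc/fstab line that refers to a file system through its /dev/disk/by-uuid path. The helper name fstab_entry, the UUID value, the mount point, and the default options are assumptions chosen for the example, not values produced by the bare metal provider:

```python
def fstab_entry(uuid, mount_point, fs_type, options="defaults", dump=0, fsck_pass=2):
    """Build one /etc/fstab line that identifies the file system by UUID."""
    device = f"/dev/disk/by-uuid/{uuid}"
    return f"{device} {mount_point} {fs_type} {options} {dump} {fsck_pass}"


# The UUID below is a made-up placeholder, not a real file system identifier.
line = fstab_entry("0a1b2c3d-1111-2222-3333-444455556666", "/var/lib/data", "ext4")
print(line)
```

Because the line does not mention /dev/sda or /dev/sdb, it stays valid if the kernel assigns different names on the next boot.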

Therefore, to prevent device naming issues during bare metal host provisioning, Mirantis recommends using the workBy parameter along with device labels or filters such as minSize and maxSize instead of the by_name identifier. These device settings ensure successful bare metal host provisioning with /dev/disk/by-uuid/<UUID> or /dev/disk/by-label/<label> in /etc/fstab. For details on workBy, see BareMetalHostProfile spec.
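
To show why size filters are less fragile than device names, the following Python sketch selects a disk from a hypothetical inventory by a minimum and maximum size. The function select_by_size, the inventory layout, and the rule that exactly one disk must match are assumptions of this sketch; the actual matching semantics of minSize and maxSize are defined in BareMetalHostProfile spec:

```python
GIB = 1024 ** 3


def select_by_size(disks, min_size=None, max_size=None):
    """Return the single disk whose size lies within the given bounds (bytes)."""
    candidates = [
        disk for disk in disks
        if (min_size is None or disk["size"] >= min_size)
        and (max_size is None or disk["size"] <= max_size)
    ]
    if len(candidates) != 1:
        # This sketch refuses to guess when the filter is ambiguous or empty.
        raise ValueError(f"expected exactly one match, found {len(candidates)}")
    return candidates[0]


# Hypothetical inventory; the names may differ on the next boot, the sizes do not.
disks = [
    {"name": "/dev/sda", "size": 480 * GIB},
    {"name": "/dev/sdb", "size": 1920 * GIB},
]

system_disk = select_by_size(disks, min_size=400 * GIB, max_size=500 * GIB)
print(system_disk["name"])
```

If the two disks swap names after a reboot, the same filter still returns the 480 GiB disk.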

Overview of the device naming logic in a bare metal host profile

To manage physical devices, the bare metal provider uses the following entities:

  • The BareMetalHostProfile object

    Object that an operator creates to describe the required file system schema on a node. For details, see Create a custom bare metal host profile.

  • The status.hardware.storage fields of the BareMetalHost object

    Initial description of physical disks, which is discovered only once during bare metal host inspection.

  • The status.hostInfo.storage fields of the LCMMachine object

    Current state of physical disks during the life cycle of the Machine and LCMMachine objects.

The default device naming workflow during management of BareMetalHost and BareMetalHostProfile objects is as follows:

  1. An operator creates the BareMetalHost and BareMetalHostCredential objects.

  2. The baremetal-operator service inspects the objects.

  3. The operator creates or reviews an existing BareMetalHostProfile object using the status.hardware.storage fields of the BareMetalHost object. For details, see Create a custom bare metal host profile.

  4. The operator creates a Machine object and maps it to the related BareMetalHost and BareMetalHostProfile objects. For details, see Deploy a machine to a specific bare metal host.

  5. The baremetal-provider service starts processing the BareMetalHostProfile object and searching for suitable hardware disks to build the internal AnsibleExtra object configuration. During the building process:

  6. The cleanup and provisioning stage of BareMetalHost starts:

    • During provisioning, the selection order described in bmhp.workBy applies. For details, see Create a custom host profile.

      This logic ensures that the exact by_id name is taken from the discovery stage, as opposed to by_name, which can change during the transition from the inspection stage to the provisioning stage.

    • After provisioning finishes, /etc/fstab of the target system is generated using UUIDs.
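
The selection order in the provisioning step can be pictured with the following Python sketch. It is not the baremetal-provider implementation. The comma-separated order string, the resolve_device helper, and the disk records are assumptions made for illustration, and only the by_id and by_name identifiers named in this section are used:

```python
def resolve_device(disk, work_by):
    """Return (identifier type, value) for the first identifier the disk has."""
    for key in work_by.split(","):
        key = key.strip()
        value = disk.get(key)
        if value:
            return key, value
    raise LookupError(f"no identifier from '{work_by}' is available")


# Hypothetical disk records, as they might look after the discovery stage.
disk_with_id = {"by_id": "/dev/disk/by-id/example-disk-id", "by_name": "/dev/sda"}
disk_without_id = {"by_name": "/dev/sdb"}

# by_id is preferred whenever it was discovered; by_name is only a fallback.
print(resolve_device(disk_with_id, "by_id,by_name"))
print(resolve_device(disk_without_id, "by_id,by_name"))
```

Listing by_id before by_name means the unstable kernel name is used only when no stable identifier was discovered for the disk.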

Note

For the /dev/disk/by-id mapping in Ceph, see Addressing storage devices.