BareMetalHostProfile

This section describes the BareMetalHostProfile resource used in the Mirantis Container Cloud API to define how storage devices and the operating system are provisioned and configured.

For demonstration purposes, the Container Cloud BareMetalHostProfile custom resource (CR) is split into the following major sections:

metadata

The Container Cloud BareMetalHostProfile CR contains the following fields:

  • apiVersion

    API version of the object, which is metal3.io/v1alpha1.

  • kind

    Object type, which is BareMetalHostProfile.

  • metadata

    The metadata field contains the following subfields:

    • name

      Name of the bare metal host profile.

    • namespace

      Project in which the bare metal host profile was created.

Configuration example:

apiVersion: metal3.io/v1alpha1
kind: BareMetalHostProfile
metadata:
  name: default
  namespace: default

spec

The spec field of the BareMetalHostProfile object contains the fields to customize your hardware configuration:

Warning

Any data stored on any device defined in the fileSystems list can be deleted or corrupted during cluster (re)deployment. It happens because each device from the fileSystems list is a part of the rootfs directory tree that is overwritten during (re)deployment.

Examples of affected devices include:

  • A raw device partition with a file system on it

  • A device partition in a volume group with a logical volume that has a file system on it

  • An mdadm RAID device with a file system on it

  • An LVM RAID device with a file system on it

Neither the wipe field (deprecated) nor the wipeDevice structure (recommended since Container Cloud 2.26.0) has any effect in this case, and neither can protect data on these devices.

Therefore, to prevent data loss, move the necessary data from these file systems to another server beforehand, if required.

  • devices

    List of definitions of the physical storage devices. To configure more than three storage devices per host, add additional devices to this list. Each device in the list can have one or more partitions defined by the list in the partitions field.

    • Each device in the list must have the following fields in the properties section for device handling:

      • workBy (recommended, string)

        Defines how the device should be identified. Accepts a comma-separated string with the following recommended value (in order of priority): by_id,by_path,by_wwn,by_name. Since Container Cloud 2.25.1, this value is set by default.

      • wipeDevice (recommended, object)

        Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Enables and configures cleanup of a device or its metadata before cluster deployment. Contains the following fields:

        • eraseMetadata (dictionary)

          Enables metadata cleanup of a device. Contains the following field:

          • enabled (boolean)

            Enables the eraseMetadata option. False by default.

        • eraseDevice (dictionary)

          Configures a complete cleanup of a device. Contains the following fields:

          • blkdiscard (object)

            Executes the blkdiscard command on the target device to discard all data blocks. Contains the following fields:

            • enabled (boolean)

              Enables the blkdiscard option. False by default.

            • zeroout (string)

              Configures writing of zeroes to each block during device erasure. Contains the following options:

              • fallback - default, blkdiscard attempts to write zeroes only if the device does not support the block discard feature. In this case, the blkdiscard command is re-executed with an additional --zeroout flag.

              • always - always write zeroes.

              • never - never write zeroes.

          • userDefined (object)

            Enables execution of a custom command or shell script to erase the target device. Contains the following fields:

            • enabled (boolean)

              Enables the userDefined option. False by default.

            • command (string)

              Defines a command to erase the target device. Empty by default. Mutually exclusive with script. For the command execution, the ansible.builtin.command module is called.

            • script (string)

              Defines a plain-text script allowing pipelines (|) to erase the target device. Empty by default. Mutually exclusive with command. For the script execution, the ansible.builtin.shell module is called.

            When executing a command or a script, you can use the following environment variables:

            • DEVICE_KNAME (always defined by Ansible)

              Device kernel path, for example, /dev/sda

            • DEVICE_BY_NAME (optional)

              Link from /dev/disk/by-name/ if it was added by udev

            • DEVICE_BY_ID (optional)

              Link from /dev/disk/by-id/ if it was added by udev

            • DEVICE_BY_PATH (optional)

              Link from /dev/disk/by-path/ if it was added by udev

            • DEVICE_BY_WWN (optional)

              Link from /dev/disk/by-wwn/ if it was added by udev

        For configuration details, see Wipe a device or partition.

      • wipe (boolean, deprecated)

        Defines whether the device must be wiped of the data before being used.

        Note

        This field is deprecated since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0) in favor of wipeDevice and will be removed in one of the following releases.

        For backward compatibility, any existing wipe: true option is automatically converted to the following structure:

        wipeDevice:
          eraseMetadata:
            enabled: True
        

        Before Container Cloud 2.26.0, the wipe field is mandatory.
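
      The wipeDevice structure described above can be sketched as follows. This is a minimal illustration, not a recommended configuration: the first device enables blkdiscard with the fallback zero-fill mode, and the second device runs a user-defined script. The wipefs call inside the script is a hypothetical cleanup command chosen for the example, and placing each variant on a separate device is an assumption made for readability.

      ```yaml
      spec:
        devices:
        - device:
            workBy: by_id,by_path,by_wwn,by_name
            wipeDevice:
              eraseDevice:
                blkdiscard:
                  enabled: true
                  # fallback (default): write zeroes only if block discard is unsupported
                  zeroout: fallback
        - device:
            workBy: by_id,by_path,by_wwn,by_name
            wipeDevice:
              eraseDevice:
                userDefined:
                  enabled: true
                  # script is mutually exclusive with command
                  script: |
                    # Hypothetical cleanup; DEVICE_KNAME is the device kernel path, for example, /dev/sda
                    wipefs --all "${DEVICE_KNAME}"
      ```

      Because script is executed through the ansible.builtin.shell module, it may contain pipelines, whereas command is limited to a single command.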

    • Each device in the list can have the following fields in its properties section that affect the selection of the specific device when the profile is applied to a host:

      • type (optional, string)

        The device type. Possible values: hdd, ssd, nvme. This property is used to filter selected devices by type.

      • partflags (optional, string)

        Extra partition flags to be applied on a partition. For example, bios_grub.

      • minSizeGiB, maxSizeGiB (deprecated, optional, string)

        The lower and upper limits of the selected device size. Only the devices matching these criteria are considered for allocation. An omitted parameter means no lower or upper limit, respectively.

        The minSize and maxSize parameter names are also available for the same purpose.

        Caution

        Mirantis recommends using only one parameter name type and units throughout the configuration files. If both sizeGiB and size are used, sizeGiB is ignored during deployment and the suffix is adjusted accordingly. For example, 1.5Gi will be serialized as 1536Mi. The size without units is counted in bytes. For example, size: 120 means 120 bytes.

        Since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0), minSizeGiB and maxSizeGiB are deprecated. Instead of floats that define sizes in GiB for *GiB fields, use the <sizeNumber>Gi text notation (Ki, Mi, and so on). All newly created profiles are automatically migrated to the Gi syntax. In existing profiles, migrate the syntax manually.

      • byName (deprecated, optional, string)

        The specific device name to be selected during provisioning, such as /dev/sda.

        Warning

        With NVMe devices and certain hardware disk controllers, you cannot reliably select such a device by the system name. Therefore, use a more specific byPath, serialNumber, or wwn selector.

        Since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0), byName is deprecated and will be removed in one of the following releases. As a replacement, use byPath, serialNumber, or wwn.

      • byPath (optional, string) Since 2.26.0 (17.1.0, 16.1.0)

        The specific device name with its path to be selected during provisioning, such as /dev/disk/by-path/pci-0000:00:07.0.

      • serialNumber (optional, string) Since 2.26.0 (17.1.0, 16.1.0)

        The specific serial number of a physical disk to be selected during provisioning, such as S2RBNXAH116186E.

      • wwn (optional, string) Since 2.26.0 (17.1.0, 16.1.0)

        The specific World Wide Name number of a physical disk to be selected during provisioning, such as 0x5002538d409aeeb4.

        Warning

        When using strict filters, such as byPath, serialNumber, or wwn, Mirantis strongly recommends not combining them with a soft filter, such as minSize / maxSize. Use only one approach.
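
      As an illustration of the two selection approaches described above, the following sketch selects one device with a strict filter and another with soft filters only. The WWN is the sample value used in this section. The device type and size limits are arbitrary values assumed for the example.

      ```yaml
      spec:
        devices:
        # Strict filter: only the disk with this WWN matches
        - device:
            wwn: "0x5002538d409aeeb4"
            workBy: by_id,by_path,by_wwn,by_name
        # Soft filters: any SSD within the size range matches (illustrative limits)
        - device:
            type: ssd
            minSize: 100Gi
            maxSize: 500Gi
            workBy: by_id,by_path,by_wwn,by_name
      ```

      Each device entry uses either a strict selector or soft filters, never both, in line with the warning above.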

  • softRaidDevices Tech Preview

    List of definitions of a software-based Redundant Array of Independent Disks (RAID) created by mdadm. Use the following fields to describe an mdadm RAID device:

    • name (mandatory, string)

      Name of a RAID device. Supports the following formats:

      • dev path, for example, /dev/md0.

      • simple name, for example, raid-name that will be created as /dev/md/raid-name on the target OS.

    • devices (mandatory, list)

      List of partitions from the devices list. The resulting list of devices must expand to at least two partitions.

    • level (optional, string)

      Level of a RAID device, defaults to raid1. Possible values: raid1, raid0, raid10.

    • metadata (optional, string)

      Metadata version of RAID, defaults to 1.0. Possible values: 1.0, 1.1, 1.2. For details about the differences in metadata, see man 8 mdadm.

      Warning

      The EFI system partition partflags: ['esp'] must be a physical partition in the main partition table of the disk, not under LVM or mdadm software RAID.
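
    The softRaidDevices fields above can be combined as in the following sketch, which mirrors two equally sized partitions from two disks into one mdadm raid1 device. The partition names and sizes are illustrative and are not required values.

    ```yaml
    spec:
      devices:
      - device:
          workBy: by_id,by_path,by_wwn,by_name
        partitions:
        - name: md_root_part1
          partflags: ['raid']
          size: 60Gi
      - device:
          workBy: by_id,by_path,by_wwn,by_name
        partitions:
        - name: md_root_part2
          partflags: ['raid']
          size: 60Gi
      softRaidDevices:
      - name: /dev/md0
        level: raid1
        metadata: "1.2"
        devices:
        - partition: md_root_part1
        - partition: md_root_part2
    ```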

  • fileSystems

    List of file systems. Each file system can be created on top of a device, partition, or logical volume. If more file systems are required for additional devices, define them in this field. Each file system in the list has the following fields:

    • fileSystem (mandatory, string)

      Type of a file system to create on a partition. For example, ext4, vfat.

    • mountOpts (optional, string)

      Comma-separated string of mount options. For example, rw,noatime,nodiratime,lazytime,nobarrier,commit=240,data=ordered.

    • mountPoint (optional, string)

      Target mount point for a file system. For example, /mnt/local-volumes/.

    • partition (optional, string)

      Partition name to be selected for creation from the list in the devices section. For example, uefi.

    • logicalVolume (optional, string)

      LVM logical volume name if the file system is supposed to be created on an LVM volume defined in the logicalVolumes section. For example, lvp.
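
    A minimal sketch of the fileSystems list, assuming that the uefi partition and the lvp logical volume are defined elsewhere in the same profile. The mount options are a shortened subset of the sample string above and are not a tuning recommendation.

    ```yaml
    spec:
      fileSystems:
      # File system on a partition from the devices list
      - fileSystem: vfat
        partition: uefi
        mountPoint: /boot/efi
      # File system on an LVM logical volume from the logicalVolumes list
      - fileSystem: ext4
        logicalVolume: lvp
        mountPoint: /mnt/local-volumes/
        mountOpts: rw,noatime,nodiratime
    ```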

  • logicalVolumes

    List of LVM logical volumes. Every logical volume belongs to a volume group from the volumeGroups list and has the size attribute for a size in the corresponding units.

    You can also add a software-based RAID raid1 created by LVM using the following fields:

    • name (mandatory, string)

      Name of a logical volume.

    • vg (mandatory, string)

      Name of a volume group that must be a name from the volumeGroups list.

    • sizeGiB or size (mandatory, string)

      Size of a logical volume in gigabytes. When set to 0, all available space on the corresponding volume group will be used. The 0 value equals -l 100%FREE in the lvcreate command.

    • type (optional, string)

      Type of a logical volume. If you require a usual logical volume, you can omit this field.

      Possible values:

      • linear

        Default. A usual logical volume. This value is implied for bare metal host profiles created in Container Cloud releases earlier than 2.12.0, where the type field is unavailable.

      • raid1 Tech Preview

        Serves to build the raid1 type of LVM. Equivalent to the lvcreate --type raid1... command. For details, see man 8 lvcreate and man 7 lvmraid.

      Caution

      Mirantis recommends using only one parameter name type and units throughout the configuration files. If both sizeGiB and size are used, sizeGiB is ignored during deployment and the suffix is adjusted accordingly. For example, 1.5Gi will be serialized as 1536Mi. The size without units is counted in bytes. For example, size: 120 means 120 bytes.
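
    The following sketch shows how a logical volume refers to its volume group and how the raid1 type (Tech Preview) can be requested. It assumes that the lvm_lvp_part1 and lvm_lvp_part2 partitions are defined on two different disks in the devices list. The 100Gi size is an arbitrary value for the example.

    ```yaml
    spec:
      volumeGroups:
      - name: lvm_lvp
        devices:
        - partition: lvm_lvp_part1
        - partition: lvm_lvp_part2
      logicalVolumes:
      - name: lvp
        # Must match a name from the volumeGroups list
        vg: lvm_lvp
        type: raid1
        # Use one notation consistently; a value without units is counted in bytes
        size: 100Gi
    ```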

  • volumeGroups

    List of definitions of LVM volume groups. Each volume group contains one or more devices or partitions from the devices list. Contains the following fields:

    • devices (mandatory, list)

      List of partitions to be used in a volume group. For example:

      - partition: lvm_root_part1
      - partition: lvm_root_part2
      

    • name (mandatory, string)

      Name of a volume group to be created. For example: lvm_root.

  • preDeployScript (optional, string)

    Shell script that executes on a host before provisioning the target operating system inside the ramfs system.

  • postDeployScript (optional, string)

    Shell script that executes on a host after deploying the operating system inside the ramfs system that is chrooted to the target operating system. To use a specific default gateway (for example, to have Internet access) on this stage, refer to Migration of DHCP configuration for existing management clusters.

  • grubConfig (optional, object)

    Set of options for the Linux GRUB bootloader on the target operating system. Contains the following field:

    • defaultGrubOptions (optional, array)

      Set of options passed to the Linux GRUB bootloader. Each string in the list defines one parameter. For example:

      defaultGrubOptions:
      - GRUB_DISABLE_RECOVERY="true"
      - GRUB_PRELOAD_MODULES=lvm
      - GRUB_TIMEOUT=20
      
  • kernelParameters:sysctl (optional, object)

    List of kernel sysctl options passed to /etc/sysctl.d/999-baremetal.conf during bare metal host provisioning. For example:

    kernelParameters:
      sysctl:
        fs.aio-max-nr: "1048576"
        fs.file-max: "9223372036854775807"
    

    For the list of options prohibited to change, refer to MKE documentation: Set up kernel default protections.

    Note

    If asymmetric traffic is expected on some of the managed cluster nodes, enable the loose mode for the corresponding interfaces on those nodes by setting the net.ipv4.conf.<interface-name>.rp_filter parameter to "2" in the kernelParameters.sysctl section. For example:

    kernelParameters:
      sysctl:
        net.ipv4.conf.k8s-lcm.rp_filter: "2"
    
  • kernelParameters:modules (optional, object)

    List of options for kernel modules to be passed to /etc/modprobe.d/{filename} during a bare metal host provisioning. For example:

    kernelParameters:
      modules:
      - content: |
          options kvm_intel nested=1
        filename: kvm_intel.conf
    

Configuration example with strict filtering for devices, applicable since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0):

spec:
  devices:
  - device:
      wipe: true
      workBy: by_wwn,by_path,by_id,by_name
      wwn: "0x5002538d409aeeb4"
    partitions:
    - name: bios_grub
      partflags:
      - bios_grub
      size: 4Mi
      wipe: true
    - name: uefi
      partflags:
      - esp
      size: 200Mi
      wipe: true
    - name: config-2
      size: 64Mi
      wipe: true
    - name: lvm_root_part
      size: 0
      wipe: true
  - device:
      byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-1
      wipe: true
      workBy: by_id,by_path,by_wwn,by_name
    partitions:
    - name: lvm_lvp_part1
      size: 0
      wipe: true
  - device:
      byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-3
      wipe: true
      workBy: by_id,by_path,by_wwn,by_name
    partitions:
    - name: lvm_lvp_part2
      size: 0
      wipe: true
  - device:
      serialNumber: 'Z1X69DG6'
      wipe: true
      workBy: by_id,by_path,by_wwn,by_name
  fileSystems:
  - fileSystem: vfat
    partition: config-2
  - fileSystem: vfat
    mountPoint: /boot/efi
    partition: uefi
  - fileSystem: ext4
    logicalVolume: root
    mountPoint: /
  - fileSystem: ext4
    logicalVolume: lvp
    mountPoint: /mnt/local-volumes/
  grubConfig:
    defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=5
  ...
  logicalVolumes:
  - name: root
    size: 0
    type: linear
    vg: lvm_root
  - name: lvp
    size: 0
    type: linear
    vg: lvm_lvp
  ...
  volumeGroups:
  - devices:
    - partition: lvm_root_part
    name: lvm_root
  - devices:
    - partition: lvm_lvp_part1
    - partition: lvm_lvp_part2
    name: lvm_lvp

General configuration example with the wipeDevice option for devices, applicable since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0):

spec:
  devices:
  - device:
      wipeDevice:
        eraseMetadata:
          enabled: true
      workBy: by_wwn,by_path,by_id,by_name
    partitions:
    - name: bios_grub
      partflags:
      - bios_grub
      size: 4Mi
    - name: uefi
      partflags:
      - esp
      size: 200Mi
    - name: config-2
      size: 64Mi
    - name: lvm_root_part
      size: 0
  - device:
      minSize: 30Gi
      wipeDevice:
        eraseMetadata:
          enabled: true
      workBy: by_id,by_path,by_wwn,by_name
    partitions:
    - name: lvm_lvp_part1
      size: 0
  - device:
      minSize: 30Gi
      wipeDevice:
        eraseMetadata:
          enabled: true
      workBy: by_id,by_path,by_wwn,by_name
    partitions:
    - name: lvm_lvp_part2
      size: 0
  - device:
      wipeDevice:
        eraseMetadata:
          enabled: true
      workBy: by_id,by_path,by_wwn,by_name
  fileSystems:
  - fileSystem: vfat
    partition: config-2
  - fileSystem: vfat
    mountPoint: /boot/efi
    partition: uefi
  - fileSystem: ext4
    logicalVolume: root
    mountPoint: /
  - fileSystem: ext4
    logicalVolume: lvp
    mountPoint: /mnt/local-volumes/
  grubConfig:
    defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=5
  ...
  logicalVolumes:
  - name: root
    size: 0
    type: linear
    vg: lvm_root
  - name: lvp
    size: 0
    type: linear
    vg: lvm_lvp
  ...
  volumeGroups:
  - devices:
    - partition: lvm_root_part
    name: lvm_root
  - devices:
    - partition: lvm_lvp_part1
    - partition: lvm_lvp_part2
    name: lvm_lvp

General configuration example with the deprecated wipe option for devices, applicable before Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0):

spec:
  devices:
   - device:
       #byName: /dev/sda
       minSize: 61Gi
       wipe: true
       workBy: by_wwn,by_path,by_id,by_name
     partitions:
       - name: bios_grub
         partflags:
         - bios_grub
         size: 4Mi
         wipe: true
       - name: uefi
         partflags: ['esp']
         size: 200Mi
         wipe: true
       - name: config-2
         # limited to 64Mb
         size: 64Mi
         wipe: true
       - name: md_root_part1
         wipe: true
         partflags: ['raid']
         size: 60Gi
       - name: lvm_lvp_part1
         wipe: true
         partflags: ['raid']
         # 0 means all remaining space
         size: 0
   - device:
       #byName: /dev/sdb
       minSize: 61Gi
       wipe: true
       workBy: by_wwn,by_path,by_id,by_name
     partitions:
       - name: md_root_part2
         wipe: true
         partflags: ['raid']
         size: 60Gi
       - name: lvm_lvp_part2
         wipe: true
         # 0 means all remaining space
         size: 0
   - device:
       #byName: /dev/sdc
       minSize: 30Gi
       wipe: true
       workBy: by_wwn,by_path,by_id,by_name
  softRaidDevices:
    - name: md_root
      metadata: "1.2"
      devices:
        - partition: md_root_part1
        - partition: md_root_part2
  volumeGroups:
    - name: lvm_lvp
      devices:
        - partition: lvm_lvp_part1
        - partition: lvm_lvp_part2
  logicalVolumes:
    - name: lvp
      vg: lvm_lvp
      # 0 means all remaining space
      sizeGiB: 0
  postDeployScript: |
    #!/bin/bash -ex
    echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
  preDeployScript: |
    #!/bin/bash -ex
    echo 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"' > /etc/udev/rules.d/60-ssd-scheduler.rules
    echo $(date) 'pre_deploy_script done' >> /root/pre_deploy_done
  fileSystems:
    - fileSystem: vfat
      partition: config-2
    - fileSystem: vfat
      partition: uefi
      mountPoint: /boot/efi/
    - fileSystem: ext4
      softRaidDevice: md_root
      mountPoint: /
    - fileSystem: ext4
      logicalVolume: lvp
      mountPoint: /mnt/local-volumes/
  grubConfig:
    defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=20
  kernelParameters:
    sysctl:
    # For the list of options prohibited to change, refer to
    # https://docs.mirantis.com/mke/3.7/install/predeployment/set-up-kernel-default-protections.html
      kernel.dmesg_restrict: "1"
      kernel.core_uses_pid: "1"
      fs.file-max: "9223372036854775807"
      fs.aio-max-nr: "1048576"
      fs.inotify.max_user_instances: "4096"
      vm.max_map_count: "262144"
    modules:
      - filename: kvm_intel.conf
        content: |
          options kvm_intel nested=1
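
For orientation, the metadata and spec sections described above can be assembled into one manifest, as in the following minimal sketch. It uses only fields documented in this section. The profile name is hypothetical, and whether a single-disk layout like this one is sufficient for a particular cluster is an assumption that must be checked against the actual hardware and deployment requirements.

```yaml
apiVersion: metal3.io/v1alpha1
kind: BareMetalHostProfile
metadata:
  name: example-profile   # hypothetical name
  namespace: default
spec:
  devices:
  - device:
      workBy: by_id,by_path,by_wwn,by_name
      wipeDevice:
        eraseMetadata:
          enabled: true
    partitions:
    - name: bios_grub
      partflags:
      - bios_grub
      size: 4Mi
    - name: uefi
      partflags:
      - esp
      size: 200Mi
    - name: config-2
      size: 64Mi
    - name: lvm_root_part
      # 0 means all remaining space
      size: 0
  volumeGroups:
  - name: lvm_root
    devices:
    - partition: lvm_root_part
  logicalVolumes:
  - name: root
    vg: lvm_root
    type: linear
    size: 0
  fileSystems:
  - fileSystem: vfat
    partition: config-2
  - fileSystem: vfat
    partition: uefi
    mountPoint: /boot/efi
  - fileSystem: ext4
    logicalVolume: root
    mountPoint: /
```

The chain of references runs from partitions to volumeGroups to logicalVolumes to fileSystems, so each name used in a later section must match a name defined in an earlier one.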