BareMetalHostProfile¶
This section describes the BareMetalHostProfile resource used in the Mirantis Container Cloud API to define how the storage devices and operating system are provisioned and configured.
For demonstration purposes, the Container Cloud BareMetalHostProfile custom resource (CR) is split into the following major sections:
metadata¶
The Container Cloud BareMetalHostProfile CR contains the following fields:
apiVersion
  API version of the object that is metal3.io/v1alpha1.
kind
  Object type that is BareMetalHostProfile.
metadata
  The metadata field contains the following subfields:
  name
    Name of the bare metal host profile.
  namespace
    Project in which the bare metal host profile was created.
Configuration example:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHostProfile
metadata:
  name: default
  namespace: default
spec¶
The spec field of the BareMetalHostProfile object contains the fields to customize your hardware configuration:
Warning
Any data stored on any device defined in the fileSystems list can be deleted or corrupted during cluster (re)deployment. This happens because each device from the fileSystems list is a part of the rootfs directory tree that is overwritten during (re)deployment.
Examples of affected devices include:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
Neither the wipe field (deprecated) nor the wipeDevice structure (recommended since Container Cloud 2.26.0) has any effect in this case, and neither can protect data on these devices.
Therefore, to prevent data loss, move the necessary data from these file systems to another server beforehand, if required.
devices
List of definitions of the physical storage devices. To configure more than three storage devices per host, add additional devices to this list. Each device in the list can have one or more partitions defined by the list in the partitions field.
Each device in the list must have the following fields in the properties section for device handling:
  workBy (recommended, string)
    Defines how the device should be identified. Accepts a comma-separated string with the following recommended value (in order of priority): by_id,by_path,by_wwn,by_name. Since 2.25.1, this value is set by default.
  wipeDevice (recommended, object)
    Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Enables and configures cleanup of a device or its metadata before cluster deployment. Contains the following fields:
    eraseMetadata (dictionary)
      Enables metadata cleanup of a device. Contains the following field:
      enabled (boolean)
        Enables the eraseMetadata option. False by default.
    eraseDevice (dictionary)
      Configures a complete cleanup of a device. Contains the following fields:
      blkdiscard (object)
        Executes the blkdiscard command on the target device to discard all data blocks. Contains the following fields:
        enabled (boolean)
          Enables the blkdiscard option. False by default.
        zeroout (string)
          Configures writing of zeroes to each block during device erasure. Contains the following options:
          fallback - default, blkdiscard attempts to write zeroes only if the device does not support the block discard feature. In this case, the blkdiscard command is re-executed with an additional --zeroout flag.
          always - always write zeroes.
          never - never write zeroes.
      userDefined (object)
        Enables execution of a custom command or shell script to erase the target device. Contains the following fields:
        enabled (boolean)
          Enables the userDefined option. False by default.
        command (string)
          Defines a command to erase the target device. Empty by default. Mutually exclusive with script. For the command execution, the ansible.builtin.command module is called.
        script (string)
          Defines a plain-text script allowing pipelines (|) to erase the target device. Empty by default. Mutually exclusive with command. For the script execution, the ansible.builtin.shell module is called.
        When executing a command or a script, you can use the following environment variables:
        DEVICE_KNAME (always defined by Ansible)
          Device kernel path, for example, /dev/sda
        DEVICE_BY_NAME (optional)
          Link from /dev/disk/by-name/ if it was added by udev
        DEVICE_BY_ID (optional)
          Link from /dev/disk/by-id/ if it was added by udev
        DEVICE_BY_PATH (optional)
          Link from /dev/disk/by-path/ if it was added by udev
        DEVICE_BY_WWN (optional)
          Link from /dev/disk/by-wwn/ if it was added by udev
    For configuration details, see Wipe a device or partition.
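The eraseDevice variants described above are not covered by the examples on this page, so the following is a minimal sketch of how they might look. It makes several assumptions: the enabled key is spelled as in the eraseMetadata examples on this page, each device uses only one cleanup method, the selector values are the sample values from this section, and the wipefs utility is available in the provisioning environment.

```yaml
spec:
  devices:
  # Discard all data blocks; zeroes are written only if the device
  # does not support block discard (the documented default behavior).
  - device:
      serialNumber: S2RBNXAH116186E   # sample value, replace with your own
      wipeDevice:
        eraseDevice:
          blkdiscard:
            enabled: true
            zeroout: fallback
  # Hypothetical custom script; DEVICE_KNAME is the kernel path of the
  # target device, for example, /dev/sda.
  - device:
      wwn: "0x5002538d409aeeb4"       # sample value, replace with your own
      wipeDevice:
        eraseDevice:
          userDefined:
            enabled: true
            script: |
              wipefs --all "${DEVICE_KNAME}"
```

Because command and script are mutually exclusive, the sketch sets only script.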
  wipe (boolean, deprecated)
    Defines whether the device must be wiped of the data before being used.
    Note
    This field is deprecated since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0) in favor of wipeDevice and will be removed in one of the following releases.
    For backward compatibility, any existing wipe: true option is automatically converted to the following structure:
    wipeDevice:
      eraseMetadata:
        enabled: True
    Before Container Cloud 2.26.0, the wipe field is mandatory.
Each device in the list can have the following fields in its properties section that affect the selection of the specific device when the profile is applied to a host:
  type (optional, string)
    The device type. Possible values: hdd, ssd, nvme. This property is used to filter selected devices by type.
  partflags (optional, string)
    Extra partition flags to be applied on a partition. For example, bios_grub.
  minSizeGiB, maxSizeGiB (deprecated, optional, string)
    The lower and upper limit of the selected device size. Only the devices matching these criteria are considered for allocation. An omitted parameter means no upper or lower limit.
    The minSize and maxSize parameter names are also available for the same purpose.
    Caution
    Mirantis recommends using only one parameter name type and units throughout the configuration files. If both sizeGiB and size are used, sizeGiB is ignored during deployment and the suffix is adjusted accordingly. For example, 1.5Gi will be serialized as 1536Mi. The size without units is counted in bytes. For example, size: 120 means 120 bytes.
    Since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0), minSizeGiB and maxSizeGiB are deprecated. Instead of floats that define sizes in GiB for *GiB fields, use the <sizeNumber>Gi text notation (Ki, Mi, and so on). All newly created profiles are automatically migrated to the Gi syntax. In existing profiles, migrate the syntax manually.
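A minimal sketch of the manual migration described above, assuming an existing profile that filters a device with the deprecated *GiB fields. The size values are illustrative only.

```yaml
devices:
- device:
    # Before (deprecated, numeric values in GiB):
    #   minSizeGiB: 60
    #   maxSizeGiB: 120
    # After (text notation with an explicit unit suffix):
    minSize: 60Gi
    maxSize: 120Gi
```

Since 1 Gi equals 1024 Mi, a fractional value such as 1.5Gi corresponds to 1.5 x 1024 = 1536Mi, which matches the serialized form mentioned in the caution.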
  byName (forbidden in new profiles since 2.27.0, optional, string)
    The specific device name to be selected during provisioning, such as /dev/sda.
    Warning
    With NVMe devices and certain hardware disk controllers, you cannot reliably select such a device by the system name. Therefore, use a more specific byPath, serialNumber, or wwn selector.
    Caution
    Since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0), byName is deprecated. Since Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0), byName is blocked by admission-controller in new BareMetalHostProfile objects. As a replacement, use a more specific selector, such as byPath, serialNumber, or wwn.
  byPath (optional, string) Since 2.26.0 (17.1.0, 16.1.0)
    The specific device name with its path to be selected during provisioning, such as /dev/disk/by-path/pci-0000:00:07.0.
  serialNumber (optional, string) Since 2.26.0 (17.1.0, 16.1.0)
    The specific serial number of a physical disk to be selected during provisioning, such as S2RBNXAH116186E.
  wwn (optional, string) Since 2.26.0 (17.1.0, 16.1.0)
    The specific World Wide Name number of a physical disk to be selected during provisioning, such as 0x5002538d409aeeb4.
  Warning
  When using strict filters, such as byPath, serialNumber, or wwn, Mirantis strongly recommends not combining them with a soft filter, such as minSize / maxSize. Use only one approach.
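As an illustration of replacing byName with a strict selector, the following sketch selects one disk by its by-path link and omits the size filters, in line with the warning above. The path is the sample value from this section and the partition name data_part is hypothetical; substitute the values reported on your host.

```yaml
spec:
  devices:
  - device:
      # byName: /dev/sda   # deprecated; blocked in new profiles since 2.27.0
      byPath: /dev/disk/by-path/pci-0000:00:07.0
      workBy: by_id,by_path,by_wwn,by_name
      wipeDevice:
        eraseMetadata:
          enabled: true
    partitions:
    - name: data_part      # hypothetical partition name
      size: 0
```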
softRaidDevices
Tech Preview
List of definitions of a software-based Redundant Array of Independent Disks (RAID) created by mdadm. Use the following fields to describe an mdadm RAID device:
  name (mandatory, string)
    Name of a RAID device. Supports the following formats:
    dev path, for example, /dev/md0.
    simple name, for example, raid-name that will be created as /dev/md/raid-name on the target OS.
  devices (mandatory, list)
    List of partitions from the devices list. The list must resolve to at least two partitions.
  level (optional, string)
    Level of a RAID device, defaults to raid1. Possible values: raid1, raid0, raid10.
  metadata (optional, string)
    Metadata version of RAID, defaults to 1.0. Possible values: 1.0, 1.1, 1.2. For details about the differences in metadata, see man 8 mdadm.
  Warning
  The EFI system partition partflags: ['esp'] must be a physical partition in the main partition table of the disk, not under LVM or mdadm software RAID.
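For illustration, the following sketch defines an mdadm RAID 1 device with the simple name format and places a file system on it. It follows the structure of the full configuration examples on this page. The names md_data, md_data_part1, md_data_part2, and the mount point are hypothetical, and the softRaidDevice key in fileSystems is taken from the configuration example on this page rather than from the field list.

```yaml
spec:
  devices:
  - device:
      minSize: 60Gi
      workBy: by_id,by_path,by_wwn,by_name
    partitions:
    - name: md_data_part1
      partflags: ['raid']
      size: 0
  - device:
      minSize: 60Gi
      workBy: by_id,by_path,by_wwn,by_name
    partitions:
    - name: md_data_part2
      partflags: ['raid']
      size: 0
  softRaidDevices:
  - name: md_data            # created as /dev/md/md_data on the target OS
    level: raid1
    metadata: "1.2"
    devices:
    - partition: md_data_part1
    - partition: md_data_part2
  fileSystems:
  - fileSystem: ext4
    softRaidDevice: md_data
    mountPoint: /mnt/data    # hypothetical mount point
```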
fileSystems
List of file systems. Each file system can be created on top of a device, a partition, or a logical volume. If more file systems are required for additional devices, define them in this field. Each entry in the fileSystems list has the following fields:
  fileSystem (mandatory, string)
    Type of a file system to create on a partition. For example, ext4, vfat.
  mountOpts (optional, string)
    Comma-separated string of mount options. For example, rw,noatime,nodiratime,lazytime,nobarrier,commit=240,data=ordered.
  mountPoint (optional, string)
    Target mount point for a file system. For example, /mnt/local-volumes/.
  partition (optional, string)
    Partition name to be selected for creation from the list in the devices section. For example, uefi.
  logicalVolume (optional, string)
    LVM logical volume name if the file system is supposed to be created on an LVM volume defined in the logicalVolumes section. For example, lvp.
logicalVolumes
List of LVM logical volumes. Every logical volume belongs to a volume group from the volumeGroups list and has the size attribute for a size in the corresponding units.
You can also add a software-based RAID raid1 created by LVM using the following fields:
  name (mandatory, string)
    Name of a logical volume.
  vg (mandatory, string)
    Name of a volume group that must be a name from the volumeGroups list.
  sizeGiB or size (mandatory, string)
    Size of a logical volume in gigabytes. When set to 0, all available space on the corresponding volume group will be used. The 0 value equals -l 100%FREE in the lvcreate command.
  type (optional, string)
    Type of a logical volume. If you require a usual logical volume, you can omit this field.
    Possible values:
    linear
      Default. A usual logical volume. This value is implied for bare metal host profiles created using the Container Cloud release earlier than 2.12.0 where the type field is unavailable.
    raid1 (Tech Preview)
      Serves to build the raid1 type of LVM. Equivalent to the lvcreate --type raid1... command. For details, see man 8 lvcreate and man 7 lvmraid.
  Caution
  Mirantis recommends using only one parameter name type and units throughout the configuration files. If both sizeGiB and size are used, sizeGiB is ignored during deployment and the suffix is adjusted accordingly. For example, 1.5Gi will be serialized as 1536Mi. The size without units is counted in bytes. For example, size: 120 means 120 bytes.
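The configuration examples on this page use only linear logical volumes, so the following is a minimal sketch of an LVM raid1 volume. It assumes that the volume group spans two partitions on different physical disks so that each mirror leg can be placed on its own device. The names are reused from the examples on this page and the size value is illustrative.

```yaml
spec:
  volumeGroups:
  - name: lvm_lvp
    devices:
    - partition: lvm_lvp_part1
    - partition: lvm_lvp_part2
  logicalVolumes:
  - name: lvp
    vg: lvm_lvp
    size: 20Gi      # illustrative; use the Gi notation rather than sizeGiB
    type: raid1     # Tech Preview; omit or set to linear for a usual volume
```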
volumeGroups
List of definitions of LVM volume groups. Each volume group contains one or more devices or partitions from the devices list. Contains the following fields:
  devices (mandatory, list)
    List of partitions to be used in a volume group. For example:
    - partition: lvm_root_part1
    - partition: lvm_root_part2
  name (mandatory, string)
    Name of a volume group to be created. For example: lvm_root.
preDeployScript (optional, string)
Shell script that executes on a host before provisioning the target operating system inside the ramfs system.
postDeployScript (optional, string)
Shell script that executes on a host after deploying the operating system inside the ramfs system that is chrooted to the target operating system. To use a specific default gateway (for example, to have Internet access) at this stage, refer to Migration of DHCP configuration for existing management clusters.
grubConfig (optional, object)
Set of options for the Linux GRUB bootloader on the target operating system. Contains the following field:
  defaultGrubOptions (optional, array)
    Set of options passed to the Linux GRUB bootloader. Each string in the list defines one parameter. For example:
    defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=20
kernelParameters:sysctl (optional, object)
List of kernel sysctl options passed to /etc/sysctl.d/999-baremetal.conf during bmh provisioning. For example:
kernelParameters:
  sysctl:
    fs.aio-max-nr: "1048576"
    fs.file-max: "9223372036854775807"
For the list of options prohibited to change, refer to MKE documentation: Set up kernel default protections.
Note
If asymmetric traffic is expected on some of the managed cluster nodes, enable the loose mode for the corresponding interfaces on those nodes by setting the net.ipv4.conf.<interface-name>.rp_filter parameter to "2" in the kernelParameters.sysctl section. For example:
kernelParameters:
  sysctl:
    net.ipv4.conf.k8s-lcm.rp_filter: "2"
kernelParameters:modules (optional, object)
List of options for kernel modules to be passed to /etc/modprobe.d/{filename} during bare metal host provisioning. For example:
kernelParameters:
  modules:
  - content: |
      options kvm_intel nested=1
    filename: kvm_intel.conf
Configuration example with strict filtering for device - applies since 2.26.0 (17.1.0 and 16.1.0)
spec:
  devices:
  - device:
      wipe: true
      workBy: by_wwn,by_path,by_id,by_name
      wwn: "0x5002538d409aeeb4"
    partitions:
    - name: bios_grub
      partflags:
      - bios_grub
      size: 4Mi
      wipe: true
    - name: uefi
      partflags:
      - esp
      size: 200Mi
      wipe: true
    - name: config-2
      size: 64Mi
      wipe: true
    - name: lvm_root_part
      size: 0
      wipe: true
  - device:
      byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-1
      minSize: 30Gi
      wipe: true
      workBy: by_id,by_path,by_wwn,by_name
    partitions:
    - name: lvm_lvp_part1
      size: 0
      wipe: true
  - device:
      byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-3
      minSize: 30Gi
      wipe: true
      workBy: by_id,by_path,by_wwn,by_name
    partitions:
    - name: lvm_lvp_part2
      size: 0
      wipe: true
  - device:
      serialNumber: 'Z1X69DG6'
      wipe: true
      workBy: by_id,by_path,by_wwn,by_name
  fileSystems:
  - fileSystem: vfat
    partition: config-2
  - fileSystem: vfat
    mountPoint: /boot/efi
    partition: uefi
  - fileSystem: ext4
    logicalVolume: root
    mountPoint: /
  - fileSystem: ext4
    logicalVolume: lvp
    mountPoint: /mnt/local-volumes/
  grubConfig:
    defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=5
  ...
  logicalVolumes:
  - name: root
    size: 0
    type: linear
    vg: lvm_root
  - name: lvp
    size: 0
    type: linear
    vg: lvm_lvp
  ...
  volumeGroups:
  - devices:
    - partition: lvm_root_part
    name: lvm_root
  - devices:
    - partition: lvm_lvp_part1
    - partition: lvm_lvp_part2
    name: lvm_lvp
General configuration example with the wipeDevice option for devices - applies since 2.26.0 (17.1.0 and 16.1.0)
spec:
  devices:
  - device:
      wipeDevice:
        eraseMetadata:
          enabled: true
      workBy: by_wwn,by_path,by_id,by_name
    partitions:
    - name: bios_grub
      partflags:
      - bios_grub
      size: 4Mi
    - name: uefi
      partflags:
      - esp
      size: 200Mi
    - name: config-2
      size: 64Mi
    - name: lvm_root_part
      size: 0
  - device:
      minSize: 30Gi
      wipeDevice:
        eraseMetadata:
          enabled: true
      workBy: by_id,by_path,by_wwn,by_name
    partitions:
    - name: lvm_lvp_part1
      size: 0
      wipe: true
  - device:
      minSize: 30Gi
      wipeDevice:
        eraseMetadata:
          enabled: true
      workBy: by_id,by_path,by_wwn,by_name
    partitions:
    - name: lvm_lvp_part2
      size: 0
  - device:
      wipeDevice:
        eraseMetadata:
          enabled: true
      workBy: by_id,by_path,by_wwn,by_name
  fileSystems:
  - fileSystem: vfat
    partition: config-2
  - fileSystem: vfat
    mountPoint: /boot/efi
    partition: uefi
  - fileSystem: ext4
    logicalVolume: root
    mountPoint: /
  - fileSystem: ext4
    logicalVolume: lvp
    mountPoint: /mnt/local-volumes/
  grubConfig:
    defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=5
  ...
  logicalVolumes:
  - name: root
    size: 0
    type: linear
    vg: lvm_root
  - name: lvp
    size: 0
    type: linear
    vg: lvm_lvp
  ...
  volumeGroups:
  - devices:
    - partition: lvm_root_part
    name: lvm_root
  - devices:
    - partition: lvm_lvp_part1
    - partition: lvm_lvp_part2
    name: lvm_lvp
General configuration example with the deprecated wipe option for devices - applies before 2.26.0 (17.1.0 and 16.1.0)
spec:
  devices:
  - device:
      #byName: /dev/sda
      minSize: 61Gi
      wipe: true
      workBy: by_wwn,by_path,by_id,by_name
    partitions:
    - name: bios_grub
      partflags:
      - bios_grub
      size: 4Mi
      wipe: true
    - name: uefi
      partflags: ['esp']
      size: 200Mi
      wipe: true
    - name: config-2
      # limited to 64Mb
      size: 64Mi
      wipe: true
    - name: md_root_part1
      wipe: true
      partflags: ['raid']
      size: 60Gi
    - name: lvm_lvp_part1
      wipe: true
      partflags: ['raid']
      # 0 means all remaining space
      size: 0
  - device:
      #byName: /dev/sdb
      minSize: 61Gi
      wipe: true
      workBy: by_wwn,by_path,by_id,by_name
    partitions:
    - name: md_root_part2
      wipe: true
      partflags: ['raid']
      size: 60Gi
    - name: lvm_lvp_part2
      wipe: true
      # 0 means all remaining space
      size: 0
  - device:
      #byName: /dev/sdc
      minSize: 30Gi
      wipe: true
      workBy: by_wwn,by_path,by_id,by_name
  softRaidDevices:
  - name: md_root
    metadata: "1.2"
    devices:
    - partition: md_root_part1
    - partition: md_root_part2
  volumeGroups:
  - name: lvm_lvp
    devices:
    - partition: lvm_lvp_part1
    - partition: lvm_lvp_part2
  logicalVolumes:
  - name: lvp
    vg: lvm_lvp
    # 0 means all remaining space
    sizeGiB: 0
  postDeployScript: |
    #!/bin/bash -ex
    echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
  preDeployScript: |
    #!/bin/bash -ex
    echo 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"' > /etc/udev/rules.d/60-ssd-scheduler.rules
    echo $(date) 'pre_deploy_script done' >> /root/pre_deploy_done
  fileSystems:
  - fileSystem: vfat
    partition: config-2
  - fileSystem: vfat
    partition: uefi
    mountPoint: /boot/efi/
  - fileSystem: ext4
    softRaidDevice: md_root
    mountPoint: /
  - fileSystem: ext4
    logicalVolume: lvp
    mountPoint: /mnt/local-volumes/
  grubConfig:
    defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=20
  kernelParameters:
    sysctl:
      # For the list of options prohibited to change, refer to
      # https://docs.mirantis.com/mke/3.7/install/predeployment/set-up-kernel-default-protections.html
      kernel.dmesg_restrict: "1"
      kernel.core_uses_pid: "1"
      fs.file-max: "9223372036854775807"
      fs.aio-max-nr: "1048576"
      fs.inotify.max_user_instances: "4096"
      vm.max_map_count: "262144"
    modules:
    - filename: kvm_intel.conf
      content: |
        options kvm_intel nested=1
Mounting recommendations for the /var directory¶
During volume mounts, Mirantis strongly advises against mounting the entire /var directory to a separate disk or partition. Otherwise, the cloud-init service may fail to configure the target host system during the first boot.
This recommendation prevents the following cloud-init issue, which is caused by asynchronous mounts in systemd that ignore dependencies:
The system boots and mounts /.
The cloud-init service starts and processes data in /var/lib/cloud-init, which currently references [/]var/lib/cloud-init.
The systemd service mounts /var/lib/cloud-init and breaks the cloud-init service logic.
Recommended configuration example for /var/lib/nova
spec:
  devices:
  ...
  - device:
      serialNumber: BTWA516305VE480FGN
      type: ssd
      wipeDevice:
        eraseMetadata:
          enabled: true
    partitions:
    - name: var_lib_nova_part
      size: 0
  fileSystems:
  ....
  - fileSystem: ext4
    partition: var_lib_nova_part
    mountPoint: '/var/lib/nova'
    mountOpts: 'rw,noatime,nodiratime,lazytime'
Not recommended configuration example for /var
spec:
  devices:
  ...
  - device:
      serialNumber: BTWA516305VE480FGN
      type: ssd
      wipeDevice:
        eraseMetadata:
          enabled: true
    partitions:
    - name: var_part
      size: 0
  fileSystems:
  ....
  - fileSystem: ext4
    partition: var_part
    mountPoint: '/var' # NOT RECOMMENDED
    mountOpts: 'rw,noatime,nodiratime,lazytime'