Deploy a management cluster

This section provides an overview of the cluster-related objects and describes how to configure them during deployment of a management cluster using Bootstrap v2 through the Container Cloud API.

Deploy a management cluster using CLI

The following procedure describes how to prepare and deploy a management cluster using Bootstrap v2 by editing the YAML templates available in the kaas-bootstrap/templates/ folder.

To deploy a management cluster using CLI:

  1. Set up a bootstrap cluster.

  2. Export kubeconfig of the kind cluster:

    export KUBECONFIG=<pathToKindKubeconfig>
    

    By default, <pathToKindKubeconfig> is $HOME/.kube/kind-config-clusterapi.
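
    For example, to confirm that the bootstrap cluster is reachable:

    kubectl get nodes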

  3. Configure BIOS on a bare metal host.

  4. Navigate to kaas-bootstrap/templates/bm.

    Warning

    The kubectl apply command automatically saves the applied data as plain text into the kubectl.kubernetes.io/last-applied-configuration annotation of the corresponding object. This may result in revealing sensitive data in this annotation when creating or modifying objects containing credentials. Such Container Cloud objects include:

    • BareMetalHostCredential

    • ClusterOIDCConfiguration

    • License

    • Proxy

    • ServiceUser

    • TLSConfig

    Therefore, do not use kubectl apply on these objects. Use kubectl create, kubectl patch, or kubectl edit instead.

    If you used kubectl apply on these objects, you can remove the kubectl.kubernetes.io/last-applied-configuration annotation from the objects using kubectl edit.
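
    Alternatively, kubectl can remove a single annotation by appending a dash to the annotation name. For example:

    kubectl annotate <objectKind> <objectName> \
        kubectl.kubernetes.io/last-applied-configuration-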

  5. Create the BootstrapRegion object by modifying bootstrapregion.yaml.template.

    Configuration of bootstrapregion.yaml.template
    1. Set provider: baremetal and use the default <regionName>, which is region-one.

      apiVersion: kaas.mirantis.com/v1alpha1
      kind: BootstrapRegion
      metadata:
        name: region-one
        namespace: default
      spec:
        provider: baremetal
      
    2. Create the object:

      ./kaas-bootstrap/bin/kubectl create -f \
          kaas-bootstrap/templates/bm/bootstrapregion.yaml.template
      

    Note

    In the following steps, create the objects using the command below with the required template name:

    ./kaas-bootstrap/bin/kubectl create -f \
        kaas-bootstrap/templates/bm/<templateName>.yaml.template
    
  6. Create the ServiceUser object by modifying serviceusers.yaml.template.

    Configuration of serviceusers.yaml.template

    The service user is the initial user created in Keycloak for access to a newly deployed management cluster. By default, it has the global-admin, operator (namespaced), and bm-pool-operator (namespaced) roles.

    You can delete the service user after setting up other required users with specific roles or after integrating with an external identity provider, such as LDAP.

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: ServiceUserList
    items:
    - apiVersion: kaas.mirantis.com/v1alpha1
      kind: ServiceUser
      metadata:
        name: <USERNAME>
      spec:
        password:
          value: <PASSWORD>
    
  7. Optional. Prepare any number of additional SSH keys using the following example:

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: PublicKey
    metadata:
      name: <SSHKeyName>
      namespace: default
    spec:
      publicKey: |
        <insert your public key here>
    
  8. Optional. Add the Proxy object using the example below:

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: Proxy
    metadata:
      name: <proxyName>
      namespace: default
    spec:
      ...
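
    The spec fields depend on your environment. A minimal sketch, assuming the common httpProxy, httpsProxy, and noProxy fields (an assumption; verify the exact field names against the Proxy resource reference):

    spec:
      httpProxy: http://<proxyHost>:<proxyPort>
      httpsProxy: http://<proxyHost>:<proxyPort>
      noProxy: <comma-separated hosts and CIDRs that bypass the proxy>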
    
  9. Inspect the default bare metal host profile definition in baremetalhostprofiles.yaml.template and adjust it to fit your hardware configuration. For details, see Customize the default bare metal host profile.

    Warning

    All data will be wiped during cluster deployment on devices defined directly or indirectly in the fileSystems list of BareMetalHostProfile. For example:

    • A raw device partition with a file system on it

    • A device partition in a volume group with a logical volume that has a file system on it

    • An mdadm RAID device with a file system on it

    • An LVM RAID device with a file system on it

    The wipe field is always considered true for these devices. The false value is ignored.

    Therefore, to prevent data loss, move the necessary data from these file systems to another server beforehand, if required.
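
    For illustration, a schematic BareMetalHostProfile fragment that defines a root file system affected by the wipe described above (a sketch only; the exact fields and layout are assumptions, verify them against baremetalhostprofiles.yaml.template and Customize the default bare metal host profile):

    spec:
      devices:
        - device:
            minSize: 60Gi
            wipe: true        # wiped during deployment regardless of this value
          partitions:
            - name: root
              size: 0         # 0 means use the remaining space
      fileSystems:
        - fileSystem: ext4
          partition: root
          mountPoint: /       # data on this file system is wiped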

  10. In baremetalhosts.yaml.template, update the bare metal host definitions according to your environment configuration. Use the reference table below to manually set all parameters that start with SET_.

    Mandatory parameters for a bare metal host template

    SET_MACHINE_0_IPMI_USERNAME
      The IPMI user name to access the BMC.
      Example: user

    SET_MACHINE_0_IPMI_PASSWORD
      The IPMI password to access the BMC.
      Example: password

    SET_MACHINE_0_MAC
      The MAC address of the first master node in the PXE network.
      Example: ac:1f:6b:02:84:71

    SET_MACHINE_0_BMC_ADDRESS
      The IP address of the BMC endpoint for the first master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway.
      Example: 192.168.100.11

    SET_MACHINE_1_IPMI_USERNAME
      The IPMI user name to access the BMC.
      Example: user

    SET_MACHINE_1_IPMI_PASSWORD
      The IPMI password to access the BMC.
      Example: password

    SET_MACHINE_1_MAC
      The MAC address of the second master node in the PXE network.
      Example: ac:1f:6b:02:84:72

    SET_MACHINE_1_BMC_ADDRESS
      The IP address of the BMC endpoint for the second master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway.
      Example: 192.168.100.12

    SET_MACHINE_2_IPMI_USERNAME
      The IPMI user name to access the BMC.
      Example: user

    SET_MACHINE_2_IPMI_PASSWORD
      The IPMI password to access the BMC.
      Example: password

    SET_MACHINE_2_MAC
      The MAC address of the third master node in the PXE network.
      Example: ac:1f:6b:02:84:73

    SET_MACHINE_2_BMC_ADDRESS
      The IP address of the BMC endpoint for the third master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway.
      Example: 192.168.100.13

    Note

    Provide the IPMI user names and passwords in plain text.
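
    One way to fill in these values is a scripted substitution over the template. A sketch using the example values from the table above (hypothetical; replace them with the credentials and addresses of your own hardware):

    sed -i \
        -e 's|SET_MACHINE_0_IPMI_USERNAME|user|g' \
        -e 's|SET_MACHINE_0_IPMI_PASSWORD|password|g' \
        -e 's|SET_MACHINE_0_MAC|ac:1f:6b:02:84:71|g' \
        -e 's|SET_MACHINE_0_BMC_ADDRESS|192.168.100.11|g' \
        kaas-bootstrap/templates/bm/baremetalhosts.yaml.template

    Repeat the substitution for the SET_MACHINE_1_* and SET_MACHINE_2_* parameters.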

  11. Configure the cluster network:

    Important

    Bootstrap v2 supports only separate PXE and LCM networks.

    Note

    Since Container Cloud 2.30.0 (Cluster releases 21.0.0 and 20.0.0), the management cluster supports a full L3 networking topology in the Technology Preview scope. This enables deployment of management cluster nodes in dedicated racks without the need for L2 extension between them.

    Note

    When BGP mode is used for announcement of IP addresses of load-balanced services and for the cluster API VIP, three BGP sessions are created for every node of a management cluster:

    • Two sessions are created by MetalLB for public and provisioning services

    • One session is created by the bird BGP daemon for the cluster API VIP

    BGP allows only one session to be established per pair of endpoints. For details, see MetalLB documentation: Issues with Calico. Different methods can be used to resolve this issue. MOSK allows configuring three networks for a management cluster: provisioning, management, and external. In this case, configure MetalLB to use the provisioning and external networks, and BIRD to use the management network.

    If you configure BGP announcement of the load-balancer IP address for a management cluster API and for load-balanced services of the cluster using three networks, you can use the following example IPAM objects instead of the default ipam-objects.yaml.template file.

    Example template with IPAM objects for a management cluster that uses BGP announcements and 3 networks

    Substitute the default ipam-objects.yaml.template file with the one that you create using the following example template:

    ---
    # This template allows you to configure networking for servers of the management cluster.
    # Network configuration requires the following resources:
    
    # Management cluster occupies rack1, rack2, rack3
    # BGP announcement of cluster API LB & k8s services LBs is configured for management cluster.
    
    # MOSK cluster occupies rack1, rack2, rack3: one master node per rack, one or more workers in each rack.
    # BGP announcement of cluster API LB & k8s services LBs is configured for MOSK cluster.
    
    {# ==================== Define PXE subnet(s) ================================ #}
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: k8s-pxe-nics-rack1
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        ipam/SVC-pxe-nics: "rack1"
    spec:
      cidr: SET_PXE_RACK1_CIDR
      gateway: SET_PXE_RACK1_GW
      includeRanges:
        - SET_PXE_RACK1_RANGE
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: k8s-pxe-nics-rack2
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        ipam/SVC-pxe-nics: "rack2"
    spec:
      cidr: SET_PXE_RACK2_CIDR
      gateway: SET_PXE_RACK2_GW
      includeRanges:
        - SET_PXE_RACK2_RANGE
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: k8s-pxe-nics-rack3
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        ipam/SVC-pxe-nics: "rack3"
    spec:
      cidr: SET_PXE_RACK3_CIDR
      gateway: SET_PXE_RACK3_GW
      includeRanges:
        - SET_PXE_RACK3_RANGE
    {# ==================== End PXE subnet(s) =================================== #}
    
    {# ==================== Define DHCP subnet(s) =============================== #}
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: k8s-pxe-dhcp-rack1
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        ipam/SVC-dhcp-range: "rack1"
    spec:
      cidr: SET_DHCP_RACK1_CIDR
      gateway: SET_DHCP_RACK1_GW
      includeRanges:
        - SET_DHCP_RACK1_RANGE
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: k8s-pxe-dhcp-rack2
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        ipam/SVC-dhcp-range: "rack2"
    spec:
      cidr: SET_DHCP_RACK2_CIDR
      gateway: SET_DHCP_RACK2_GW
      includeRanges:
        - SET_DHCP_RACK2_RANGE
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: k8s-pxe-dhcp-rack3
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        ipam/SVC-dhcp-range: "rack3"
    spec:
      cidr: SET_DHCP_RACK3_CIDR
      gateway: SET_DHCP_RACK3_GW
      includeRanges:
        - SET_DHCP_RACK3_RANGE
    {# ==================== End DHCP subnet(s) ================================== #}
    
    {# ==================== Define LCM subnet(s) ================================ #}
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: k8s-lcm-rack1
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        ipam/SVC-k8s-lcm: "rack1"
    spec:
      cidr: SET_LCM_RACK1_CIDR
      gateway: SET_LCM_RACK1_GW
      includeRanges:
        - SET_LCM_RACK1_RANGE
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: k8s-lcm-rack2
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        ipam/SVC-k8s-lcm: "rack2"
    spec:
      cidr: SET_LCM_RACK2_CIDR
      gateway: SET_LCM_RACK2_GW
      includeRanges:
        - SET_LCM_RACK2_RANGE
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: k8s-lcm-rack3
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        ipam/SVC-k8s-lcm: "rack3"
    spec:
      cidr: SET_LCM_RACK3_CIDR
      gateway: SET_LCM_RACK3_GW
      includeRanges:
        - SET_LCM_RACK3_RANGE
    {# ==================== End LCM subnet(s) =================================== #}
    
    {# ==================== Define EXT subnet(s) ================================ #}
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: mgmt-k8s-ext-rack1
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        multirack-mgmt-ext-subnet: "rack1"
    spec:
      cidr: SET_EXT_RACK1_CIDR
      gateway: SET_EXT_RACK1_GW
      includeRanges:
        - SET_EXT_RACK1_RANGE
      nameservers:
        - SET_EXT_RACK1_NS
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: mgmt-k8s-ext-rack2
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        multirack-mgmt-ext-subnet: "rack2"
    spec:
      cidr: SET_EXT_RACK2_CIDR
      gateway: SET_EXT_RACK2_GW
      includeRanges:
        - SET_EXT_RACK2_RANGE
      nameservers:
        - SET_EXT_RACK2_NS
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: mgmt-k8s-ext-rack3
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        multirack-mgmt-ext-subnet: "rack3"
    spec:
      cidr: SET_EXT_RACK3_CIDR
      gateway: SET_EXT_RACK3_GW
      includeRanges:
        - SET_EXT_RACK3_RANGE
      nameservers:
        - SET_EXT_RACK3_NS
    {# ==================== End EXT subnet(s) =================================== #}
    
    {# ==================== Define LoadBalancer IP(s) =========================== #}
    ---
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: k8s-api-lb
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        ipam/SVC-LBhost: "presents"
    spec:
      cidr: SET_LB_HOST/32
      useWholeCidr: true
    {# ==================== End LoadBalancer IP(s) ============================== #}
    
    {# ========================= Define Rack/MultiRack ======================== #}
    ---
    apiVersion: ipam.mirantis.com/v1alpha1
    kind: MultiRackCluster
    metadata:
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
      name: kaas-mgmt
      namespace: default
    spec:
      bgpdConfigFilePath: "/etc/bird"
      bgpdConfigFileName: "bird.conf"
      bgpdConfigTemplate: "# Default BGP daemon configuration file for a cluster"
    ---
    apiVersion: ipam.mirantis.com/v1alpha1
    kind: Rack
    metadata:
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        rack-id: rack1
      name: rack1
      namespace: default
    {%- raw %}
    spec:
      bgpdConfigTemplate: |
        protocol device {
        }
        #
        protocol direct {
          interface "lo";
          ipv4;
        }
        #
        protocol kernel {
          ipv4 {
            export all;
          };
        }
        #
        protocol bgp bgp_lcm {
          local port SET_BGP_LCM_LOCAL_PORT as SET_BGP_LCM_LOCAL_ASN;
          neighbor SET_BGP_LCM_RACK1_PEER_IP as SET_BGP_LCM_RACK1_PEER_ASN;
          ipv4 {
            import none;
            export filter {
              if dest = RTD_UNREACHABLE then {
                reject;
              }
              accept;
            };
          };
        }
    {%- endraw %}
    ---
    apiVersion: ipam.mirantis.com/v1alpha1
    kind: Rack
    metadata:
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        rack-id: rack2
      name: rack2
      namespace: default
    {%- raw %}
    spec:
      bgpdConfigTemplate: |
        protocol device {
        }
        #
        protocol direct {
          interface "lo";
          ipv4;
        }
        #
        protocol kernel {
          ipv4 {
            export all;
          };
        }
        #
        protocol bgp bgp_lcm {
          local port SET_BGP_LCM_LOCAL_PORT as SET_BGP_LCM_LOCAL_ASN;
          neighbor SET_BGP_LCM_RACK2_PEER_IP as SET_BGP_LCM_RACK2_PEER_ASN;
          ipv4 {
            import none;
            export filter {
              if dest = RTD_UNREACHABLE then {
                reject;
              }
              accept;
            };
          };
        }
    {%- endraw %}
    ---
    apiVersion: ipam.mirantis.com/v1alpha1
    kind: Rack
    metadata:
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        rack-id: rack3
      name: rack3
      namespace: default
    {%- raw %}
    spec:
      bgpdConfigTemplate: |
        protocol device {
        }
        #
        protocol direct {
          interface "lo";
          ipv4;
        }
        #
        protocol kernel {
          ipv4 {
            export all;
          };
        }
        #
        protocol bgp bgp_lcm {
          local port SET_BGP_LCM_LOCAL_PORT as SET_BGP_LCM_LOCAL_ASN;
          neighbor SET_BGP_LCM_RACK3_PEER_IP as SET_BGP_LCM_RACK3_PEER_ASN;
          ipv4 {
            import none;
            export filter {
              if dest = RTD_UNREACHABLE then {
                reject;
              }
              accept;
            };
          };
        }
    {%- endraw %}
    {# ========================= End Rack / MultiRack ======================== #}
    
    {# ==================== Define L2Template object(s) ========================= #}
    ---
    apiVersion: ipam.mirantis.com/v1alpha1
    kind: L2Template
    metadata:
      name: "mgmt-rack1"
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        l2template-mgmt-rack1: "presents" # to be used in Machine's L2templateSelector
    spec:
      autoIfMappingPrio:
      - provision
      - eno
      - ens
      - enp
      l3Layout:
        - scope: namespace
          subnetName: k8s-pxe-nics-rack1
        - scope: namespace
          subnetName: k8s-pxe-nics-rack2
        - scope: namespace
          subnetName: k8s-pxe-nics-rack3
        - scope: namespace
          subnetName: k8s-lcm-rack1
        - scope: namespace
          subnetName: k8s-lcm-rack2
        - scope: namespace
          subnetName: k8s-lcm-rack3
        - scope: namespace
          subnetName: mgmt-k8s-ext-rack1
      npTemplate: |
    
        version: 2
        renderer: networkd
    {# protect go-template below from Jinja #}
    {% raw %}
        ethernets:
          lo:
            addresses:
              - {{ cluster_api_lb_ip }}
            dhcp4: false
            dhcp6: false
          {{ nic 0 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 0 }}
            set-name: {{ nic 0 }}
            mtu: 1500
          {{ nic 1 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 1 }}
            set-name: {{ nic 1 }}
            mtu: 1500
          {{ nic 2 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 2 }}
            set-name: {{ nic 2 }}
            mtu: 1500
          {{ nic 3 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 3 }}
            set-name: {{ nic 3 }}
            mtu: 1500
        bonds: # This example uses two bonds; your interface configuration may differ
          bond0:
            mtu: 1500
            parameters:
              mode: 802.3ad
              mii-monitor-interval: 100
            interfaces:
              - {{ nic 0 }}
              - {{ nic 1 }}
          bond1:
            mtu: 1500
            parameters:
              mode: 802.3ad
              mii-monitor-interval: 100
            interfaces:
              - {{ nic 2 }}
              - {{ nic 3 }}
        vlans:
          k8s-lcm:
            id: SET_LCM_VLAN_ID
            link: bond1
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "k8s-lcm:mgmt-lcm-rack1" }}
            routes:
              - to: {{ cidr_from_subnet "k8s-lcm-rack2" }}
                via: {{ gateway_from_subnet "k8s-lcm-rack1" }}
              - to: {{ cidr_from_subnet "k8s-lcm-rack3" }}
                via: {{ gateway_from_subnet "k8s-lcm-rack1" }}
          k8s-ext:
            id: SET_EXT_VLAN_ID
            link: bond1
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "k8s-ext:mgmt-k8s-ext-rack1" }}
            nameservers:
              addresses: {{ nameservers_from_subnet "mgmt-k8s-ext-rack1" }}
            routes:
              - to: 0.0.0.0/0
                via: {{ gateway_from_subnet "mgmt-k8s-ext-rack1" }}
        bridges:
          k8s-pxe:
            mtu: 1500
            interfaces:
              - bond0
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "k8s-pxe:k8s-pxe-nics-rack1" }}
            routes:
              - to: {{ cidr_from_subnet "k8s-pxe-nics-rack2" }}
                via: {{ gateway_from_subnet "k8s-pxe-nics-rack1" }}
              - to: {{ cidr_from_subnet "k8s-pxe-nics-rack3" }}
                via: {{ gateway_from_subnet "k8s-pxe-nics-rack1" }}
    {% endraw %}
    ---
    apiVersion: ipam.mirantis.com/v1alpha1
    kind: L2Template
    metadata:
      name: "mgmt-rack2"
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        l2template-mgmt-rack2: "presents" # to be used in L2templateSelector of the Machine object
    spec:
      autoIfMappingPrio:
      - provision
      - eno
      - ens
      - enp
      l3Layout:
        - scope: namespace
          subnetName: k8s-pxe-nics-rack1
        - scope: namespace
          subnetName: k8s-pxe-nics-rack2
        - scope: namespace
          subnetName: k8s-pxe-nics-rack3
        - scope: namespace
          subnetName: k8s-lcm-rack1
        - scope: namespace
          subnetName: k8s-lcm-rack2
        - scope: namespace
          subnetName: k8s-lcm-rack3
        - scope: namespace
          subnetName: mgmt-k8s-ext-rack2
      npTemplate: |
    
        version: 2
        renderer: networkd
    {# protect go-template below from Jinja #}
    {% raw %}
        ethernets:
          lo:
            addresses:
              - {{ cluster_api_lb_ip }}
            dhcp4: false
            dhcp6: false
          {{ nic 0 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 0 }}
            set-name: {{ nic 0 }}
            mtu: 1500
          {{ nic 1 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 1 }}
            set-name: {{ nic 1 }}
            mtu: 1500
          {{ nic 2 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 2 }}
            set-name: {{ nic 2 }}
            mtu: 1500
          {{ nic 3 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 3 }}
            set-name: {{ nic 3 }}
            mtu: 1500
        bonds: # This example uses two bonds; your interface configuration may differ
          bond0:
            mtu: 1500
            parameters:
              mode: 802.3ad
              mii-monitor-interval: 100
            interfaces:
              - {{ nic 0 }}
              - {{ nic 1 }}
          bond1:
            mtu: 1500
            parameters:
              mode: 802.3ad
              mii-monitor-interval: 100
            interfaces:
              - {{ nic 2 }}
              - {{ nic 3 }}
        vlans:
          k8s-lcm:
            id: SET_LCM_VLAN_ID
            link: bond1
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "k8s-lcm:mgmt-lcm-rack2" }}
            routes:
              - to: {{ cidr_from_subnet "k8s-lcm-rack1" }}
                via: {{ gateway_from_subnet "k8s-lcm-rack2" }}
              - to: {{ cidr_from_subnet "k8s-lcm-rack3" }}
                via: {{ gateway_from_subnet "k8s-lcm-rack2" }}
          k8s-ext:
            id: SET_EXT_VLAN_ID
            link: bond1
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "k8s-ext:mgmt-k8s-ext-rack2" }}
            nameservers:
              addresses: {{ nameservers_from_subnet "mgmt-k8s-ext-rack2" }}
            routes:
              - to: 0.0.0.0/0
                via: {{ gateway_from_subnet "mgmt-k8s-ext-rack2" }}
        bridges:
          k8s-pxe:
            mtu: 1500
            interfaces:
              - bond0
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "k8s-pxe:mgmt-pxe-nics-rack2" }}
            routes:
              - to: {{ cidr_from_subnet "k8s-pxe-nics-rack1" }}
                via: {{ gateway_from_subnet "k8s-pxe-nics-rack2" }}
              - to: {{ cidr_from_subnet "k8s-pxe-nics-rack3" }}
                via: {{ gateway_from_subnet "k8s-pxe-nics-rack2" }}
    {% endraw %}
    ---
    apiVersion: ipam.mirantis.com/v1alpha1
    kind: L2Template
    metadata:
      name: "mgmt-rack3"
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
        l2template-mgmt-rack3: "presents" # to be used in Machine's L2templateSelector
    spec:
      autoIfMappingPrio:
      - provision
      - eno
      - ens
      - enp
      l3Layout:
        - scope: namespace
          subnetName: k8s-pxe-nics-rack1
        - scope: namespace
          subnetName: k8s-pxe-nics-rack2
        - scope: namespace
          subnetName: k8s-pxe-nics-rack3
        - scope: namespace
          subnetName: k8s-lcm-rack1
        - scope: namespace
          subnetName: k8s-lcm-rack2
        - scope: namespace
          subnetName: k8s-lcm-rack3
        - scope: namespace
          subnetName: mgmt-k8s-ext-rack3
      npTemplate: |
    
        version: 2
        renderer: networkd
    {# protect go-template below from Jinja #}
    {% raw %}
        ethernets:
          lo:
            addresses:
              - {{ cluster_api_lb_ip }}
            dhcp4: false
            dhcp6: false
          {{ nic 0 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 0 }}
            set-name: {{ nic 0 }}
            mtu: 1500
          {{ nic 1 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 1 }}
            set-name: {{ nic 1 }}
            mtu: 1500
          {{ nic 2 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 2 }}
            set-name: {{ nic 2 }}
            mtu: 1500
          {{ nic 3 }}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{ mac 3 }}
            set-name: {{ nic 3 }}
            mtu: 1500
        bonds: # This example uses two bonds; your interface configuration may differ
          bond0:
            mtu: 1500
            parameters:
              mode: 802.3ad
              mii-monitor-interval: 100
            interfaces:
              - {{ nic 0 }}
              - {{ nic 1 }}
          bond1:
            mtu: 1500
            parameters:
              mode: 802.3ad
              mii-monitor-interval: 100
            interfaces:
              - {{ nic 2 }}
              - {{ nic 3 }}
        vlans:
          k8s-lcm:
            id: SET_LCM_VLAN_ID
            link: bond1
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "k8s-lcm:mgmt-lcm-rack3" }}
            routes:
              - to: {{ cidr_from_subnet "k8s-lcm-rack1" }}
                via: {{ gateway_from_subnet "k8s-lcm-rack3" }}
              - to: {{ cidr_from_subnet "k8s-lcm-rack2" }}
                via: {{ gateway_from_subnet "k8s-lcm-rack3" }}
          k8s-ext:
            id: SET_EXT_VLAN_ID
            link: bond1
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "k8s-ext:mgmt-k8s-ext-rack3" }}
            nameservers:
              addresses: {{ nameservers_from_subnet "mgmt-k8s-ext-rack3" }}
            routes:
              - to: 0.0.0.0/0
                via: {{ gateway_from_subnet "mgmt-k8s-ext-rack3" }}
        bridges:
          k8s-pxe:
            mtu: 1500
            interfaces:
              - bond0
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "k8s-pxe:mgmt-pxe-nics-rack3" }}
            routes:
              - to: {{ cidr_from_subnet "k8s-pxe-nics-rack1" }}
                via: {{ gateway_from_subnet "k8s-pxe-nics-rack3" }}
              - to: {{ cidr_from_subnet "k8s-pxe-nics-rack2" }}
                via: {{ gateway_from_subnet "k8s-pxe-nics-rack3" }}
    {% endraw %}
    {# ==================== End L2Template object(s) ============================ #}
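
    After creating these objects, you can verify that they were registered, for example (a quick sanity check; the resource names are the lowercase plurals of the kinds above and may vary by release):

    ./kaas-bootstrap/bin/kubectl get subnets,l2templates -n default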
    

    If you configure ARP announcement of the load-balancer IP address for a management cluster API and for load-balanced services of the cluster:

    • Update the network object definition in ipam-objects.yaml.template according to the environment configuration. By default, this template implies the use of separate PXE and life-cycle management (LCM) networks.

    • Manually set all parameters that start with SET_.

    • To ensure a successful bootstrap, allow asymmetric routing on the interfaces of the management cluster nodes. This is required because the seed node relies on one network by default, which can cause traffic asymmetry.

      In the kernelParameters section of baremetalhostprofiles.yaml.template, set rp_filter to 2. This enables the loose mode of reverse-path filtering as defined in RFC 3704.

      Example configuration of asymmetric routing
      ...
      kernelParameters:
        ...
        sysctl:
          # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
          net.ipv4.conf.k8s-lcm.rp_filter: "2"
          # Enables the "Loose mode" for the "bond0" interface (PXE network)
          net.ipv4.conf.bond0.rp_filter: "2"
          ...
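
      After deployment, you can verify the applied values directly on a node, for example:

      sysctl net.ipv4.conf.k8s-lcm.rp_filter
      sysctl net.ipv4.conf.bond0.rp_filter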
      

      Note

      More complicated solutions, which are not described in this manual, eliminate the traffic asymmetry itself, for example:

      • Configure source routing on management cluster nodes.

      • Plug the seed node into the same networks as the management cluster nodes, which requires custom configuration of the seed node.

    For configuration details of the bond network interface for the PXE and management networks, see Configure NIC bonding.

    Example of the default L2 template snippet for a management cluster
    bonds:
      bond0:
        interfaces:
          - {{ nic 0 }}
          - {{ nic 1 }}
        parameters:
          mode: active-backup
          primary: {{ nic 0 }}
        dhcp4: false
        dhcp6: false
        addresses:
          - {{ ip "bond0:mgmt-pxe" }}
    vlans:
      k8s-lcm:
        id: SET_VLAN_ID
        link: bond0
        addresses:
          - {{ ip "k8s-lcm:kaas-mgmt" }}
        nameservers:
          addresses: {{ nameservers_from_subnet "kaas-mgmt" }}
        routes:
          - to: 0.0.0.0/0
            via: {{ gateway_from_subnet "kaas-mgmt" }}
    

    In this example, the following configuration applies:

    • A bond of two NIC interfaces

    • A static address in the PXE network set on the bond

    • An isolated L2 segment for the LCM network is configured using the k8s-lcm VLAN with the static address in the LCM network

    • The default gateway address is in the LCM network

    For general concepts of configuring separate PXE and LCM networks for a management cluster, see Separate PXE and management networks. For current object templates and variable names to use, see the following tables.

    Network parameters mapping overview

    Parameters to update manually in ipam-objects.yaml.template:

    • SET_LB_HOST

    • SET_MGMT_ADDR_RANGE

    • SET_MGMT_CIDR

    • SET_MGMT_DNS

    • SET_MGMT_NW_GW

    • SET_MGMT_SVC_POOL

    • SET_PXE_ADDR_POOL

    • SET_PXE_ADDR_RANGE

    • SET_PXE_CIDR

    • SET_PXE_SVC_POOL

    • SET_VLAN_ID

    Parameters to update manually in bootstrap.env:

    • KAAS_BM_PXE_IP

    • KAAS_BM_PXE_MASK

    • KAAS_BM_PXE_BRIDGE
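
    For illustration, a bootstrap.env fragment with hypothetical values for a 172.16.59.0/24 PXE network (adjust the address, prefix length, and bridge name to your seed node):

    KAAS_BM_PXE_IP=172.16.59.5
    KAAS_BM_PXE_MASK=24
    KAAS_BM_PXE_BRIDGE=br0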

    Mandatory network parameters of the IPAM object template

    The following list contains examples of mandatory parameter values to set in ipam-objects.yaml.template for a network scheme that has the following networks:

    • 172.16.59.0/24 - PXE network

    • 172.16.61.0/25 - LCM network

    SET_PXE_CIDR
      The IP address of the PXE network in the CIDR notation. The minimum recommended network size is 256 addresses (/24 prefix length).
      Example: 172.16.59.0/24

    SET_PXE_SVC_POOL
      The IP address range to use for endpoints of load balancers in the PXE network for the Container Cloud services: Ironic API, DHCP server, HTTP server, and caching server. The minimum required range size is 5 addresses.
      Example: 172.16.59.6-172.16.59.15

    SET_PXE_ADDR_POOL
      The IP address range in the PXE network to use for dynamic address allocation for hosts during inspection and provisioning. The minimum recommended range size is 30 addresses for management cluster nodes if the management cluster is located in a separate PXE network segment. Otherwise, it depends on the number of managed cluster nodes to deploy in the same PXE network segment as the management cluster nodes.
      Example: 172.16.59.51-172.16.59.200

    SET_PXE_ADDR_RANGE
      The IP address range in the PXE network to use for static address allocation on each management cluster node. The minimum recommended range size is 6 addresses.
      Example: 172.16.59.41-172.16.59.50

    SET_MGMT_CIDR
      The IP address of the LCM network for the management cluster in the CIDR notation. If managed clusters will have their own separate LCM networks, those networks must be routable to the LCM network. The minimum recommended network size is 128 addresses (/25 prefix length).
      Example: 172.16.61.0/25

    SET_MGMT_NW_GW
      The default gateway address in the LCM network. This gateway must provide access to the OOB network of the Container Cloud cluster and to the Internet to download the Mirantis artifacts.
      Example: 172.16.61.1

    SET_LB_HOST
      The IP address of the externally accessible MKE API endpoint of the cluster in the CIDR notation. This address must be within the management network (SET_MGMT_CIDR) but must NOT overlap with any other addresses or address ranges within this network. External load balancers are not supported.
      Example: 172.16.61.5/32

    SET_MGMT_DNS
      An external (non-Kubernetes) DNS server accessible from the LCM network.
      Example: 8.8.8.8

    SET_MGMT_ADDR_RANGE
      The IP address range that includes addresses to be allocated to bare metal hosts in the LCM network for the management cluster. When this network is shared with managed clusters, the size of this range limits the number of hosts that can be deployed in all clusters sharing this network. When this network is used solely by a management cluster, the range must include at least 6 addresses for bare metal hosts of the management cluster.
      Example: 172.16.61.30-172.16.61.40

    SET_MGMT_SVC_POOL
      The IP address range to use for the externally accessible endpoints of load balancers in the LCM network for the Container Cloud services, such as Keycloak, web UI, and so on. The minimum required range size is 19 addresses.
      Example: 172.16.61.10-172.16.61.29

    SET_VLAN_ID
      The VLAN ID used for isolation of the LCM network. The bootstrap.sh process and the seed node must have routable access to the network in this VLAN.
      Example: 3975

    When using separate PXE and LCM networks, the management cluster services are exposed in different networks using two separate MetalLB address pools:

    • Services exposed through the PXE network are as follows:

      • Ironic API as a bare metal provisioning server

      • HTTP server that provides images for network boot and server provisioning

      • Caching server for accessing the Container Cloud artifacts deployed on hosts

    • Services exposed through the LCM network are all other Container Cloud services, such as Keycloak, web UI, and so on.

    The default MetalLB configuration described in the MetalLBConfig object template of metallbconfig.yaml.template uses two separate MetalLB address pools. Also, it uses the interfaces selector in its l2Advertisements template.

    Caution

    When you change the L2Template object template in ipam-objects.yaml.template, ensure that interfaces listed in the interfaces field of the MetalLBConfig.spec.l2Advertisements section match those used in your L2Template. For details about the interfaces selector, see MetalLBConfig spec.

    See Configure and verify MetalLB for details on MetalLB configuration.

    If you configure BGP announcement of the load-balancer IP address for a management cluster API and for load-balanced services of the cluster using three networks, you can use the following MetalLBConfig example instead of the default metallbconfig.yaml.template file.

    Example MetalLBConfig template for a management cluster that uses BGP announcements and 3 networks

    Substitute the default metallbconfig.yaml.template file with the one that you create using the following example template.

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: MetalLBConfig
    metadata:
      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: kaas-mgmt
      name: kaas-mgmt-metallb
      namespace: default
    spec:
      ipAddressPools:
      - name: default
        spec:
          addresses:
          - SET_MGMT_SVC_POOL
          autoAssign: true
          avoidBuggyIPs: false
      - name: services-pxe
        spec:
          addresses:
          - SET_PXE_SVC_POOL
          autoAssign: false
          avoidBuggyIPs: false
    
      # l2Advertisements are optional and can be used for the bootstrap cluster
      # (during provisioning of a management cluster) if the ARP announcement
      # from the bootstrap cluster is more suitable than BGP.
      l2Advertisements:
      - name: pxe-bootstrap
        spec:
          ipAddressPools:
          - services-pxe
          - dhcp-lb
          nodeSelectors:
          - matchLabels:
              kubernetes.io/hostname: clusterapi-control-plane # bootstrap cluster node
    
      bgpPeers:
        - name: bootstrap-bgp-peer-pxe # used for the bootstrap cluster only (during provisioning of a management cluster)
          spec:
            holdTime: 0s
            keepaliveTime: 0s
            myASN: SET_BGP_BOOTSTRAP_LOCAL_ASN
            nodeSelectors:
              - matchLabels:
                  kubernetes.io/hostname: clusterapi-control-plane # bootstrap cluster node
            peerASN: SET_BGP_PXE_RACKX_PEER_ASN # can be in the same (or a different) rack as one of the management cluster nodes
            peerAddress: SET_BGP_PXE_RACKX_PEER_IP # can be in the same (or a different) rack as one of the management cluster nodes
    
        - name: bgp-peer-rack1-pxe
          spec:
            holdTime: 0s
            keepaliveTime: 0s
            myASN: SET_BGP_PXE_LOCAL_ASN
            nodeSelectors:
              - matchLabels:
                  rack-id: rack1
            peerASN: SET_BGP_PXE_RACK1_PEER_ASN
            peerAddress: SET_BGP_PXE_RACK1_PEER_IP
    
        - name: bgp-peer-rack2-pxe
          spec:
            holdTime: 0s
            keepaliveTime: 0s
            myASN: SET_BGP_PXE_LOCAL_ASN
            nodeSelectors:
              - matchLabels:
                  rack-id: rack2
            peerASN: SET_BGP_PXE_RACK2_PEER_ASN
            peerAddress: SET_BGP_PXE_RACK2_PEER_IP
    
        - name: bgp-peer-rack3-pxe
          spec:
            holdTime: 0s
            keepaliveTime: 0s
            myASN: SET_BGP_PXE_LOCAL_ASN
            nodeSelectors:
              - matchLabels:
                  rack-id: rack3
            peerASN: SET_BGP_PXE_RACK3_PEER_ASN
            peerAddress: SET_BGP_PXE_RACK3_PEER_IP
    
        - name: bgp-peer-rack1-ext
          spec:
            holdTime: 0s
            keepaliveTime: 0s
            myASN: SET_BGP_EXT_LOCAL_ASN
            nodeSelectors:
              - matchLabels:
                  rack-id: rack1
            peerASN: SET_BGP_EXT_RACK1_PEER_ASN
            peerAddress: SET_BGP_EXT_RACK1_PEER_IP
    
        - name: bgp-peer-rack2-ext
          spec:
            holdTime: 0s
            keepaliveTime: 0s
            myASN: SET_BGP_EXT_LOCAL_ASN
            nodeSelectors:
              - matchLabels:
                  rack-id: rack2
            peerASN: SET_BGP_EXT_RACK2_PEER_ASN
            peerAddress: SET_BGP_EXT_RACK2_PEER_IP
    
        - name: bgp-peer-rack3-ext
          spec:
            holdTime: 0s
            keepaliveTime: 0s
            myASN: SET_BGP_EXT_LOCAL_ASN
            nodeSelectors:
              - matchLabels:
                  rack-id: rack3
            peerASN: SET_BGP_EXT_RACK3_PEER_ASN
            peerAddress: SET_BGP_EXT_RACK3_PEER_IP
    
      bgpAdvertisements:
        - name: default
          spec:
            aggregationLength: 32
            ipAddressPools:
              - default
            peers:
              - bgp-peer-rack1-ext
              - bgp-peer-rack2-ext
              - bgp-peer-rack3-ext
    
        - name: provision-pxe
          spec:
            aggregationLength: 32
            ipAddressPools:
              - services-pxe
            peers:
              - bootstrap-bgp-peer-pxe
              - bgp-peer-rack1-pxe
              - bgp-peer-rack2-pxe
              - bgp-peer-rack3-pxe
    
  12. In cluster.yaml.template:

    1. Set the mandatory label:

      labels:
        kaas.mirantis.com/provider: baremetal
      
    2. Available since Container Cloud 2.30.0 (Cluster releases 21.0.0 and 20.0.0) as Technology Preview. If you configured BGP of the load-balancer IP address for the management cluster API in previous steps, enable the useBGPAnnouncement parameter:

      spec:
        providerSpec:
          value:
            useBGPAnnouncement: true
      
    3. Update other cluster-related settings to fit your deployment.

  13. Optional. Technology Preview. Deprecated since Container Cloud 2.29.0 (Cluster release 16.4.0). Enable WireGuard for traffic encryption on the Kubernetes workloads network.

    WireGuard configuration
    1. Ensure that the Calico MTU size is at least 60 bytes smaller than the interface MTU size of the workload network. IPv4 WireGuard uses a 60-byte header. For details, see Set the MTU size for Calico.

    2. In cluster.yaml.template, enable WireGuard by adding the secureOverlay parameter:

      spec:
        ...
        providerSpec:
          value:
            ...
            secureOverlay: true
      

      Caution

      Changing this parameter on a running cluster causes a downtime that can vary depending on the cluster size.

    For more details about WireGuard, see Calico documentation: Encrypt in-cluster pod traffic.

  14. Configure StackLight. For parameters description, see StackLight configuration parameters.

  15. Optional. Configure additional cluster settings as described in Configure optional settings.

  16. In machines.yaml.template:

    1. Add the following mandatory machine labels:

      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: <clusterName>
        cluster.sigs.k8s.io/control-plane: "true"
      
    2. Adjust spec and labels sections of each entry according to your deployment.

    3. Adjust the spec.providerSpec.value.hostSelector values to match the BareMetalHostInventory object corresponding to each machine, as illustrated in the sketch below. For details, see spec:providerSpec for instance configuration.
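
      For example, a hostSelector sketch that binds a machine to a specific host by label (the label key below is an assumption based on common Container Cloud labeling conventions; verify it against your inventory objects):

      spec:
        providerSpec:
          value:
            hostSelector:
              matchLabels:
                kaas.mirantis.com/baremetalhost-id: <hostLabelValue>  # assumed label key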

    4. Available since Container Cloud 2.30.0 (Cluster releases 21.0.0 and 20.0.0). Optional. Configure the controlled provisioning workflow by adding the day1Provisioning and day1Deployment fields to BareMetalMachineProviderSpec:

      spec:
        providerSpec:
          value:
            day1Provisioning: manual  # Default, requires approval
            day1Deployment: manual    # Default, requires approval
      

      When using controlled provisioning during bootstrap:

      1. Machines switch to the AwaitsProvisioning state for hardware verification after bare metal host inspection.

      2. Approval of the BootstrapRegion object automatically sets day1Provisioning: auto for all machines.

      3. After provisioning, machines switch to the AwaitsDeployment state.

      4. An operator validates the configuration and manually sets day1Deployment: auto in BareMetalMachineProviderSpec.

      For details, see Controlled bare metal provisioning workflows.

      Note

      If not specified, both fields are set to manual by default. Set to auto either initially to automatically start the bootstrap, or later to continue after a manual inspection and approval.

    5. Adjust the spec.providerSpec.value.l2TemplateSelector values to match L2Template corresponding to each machine.

      This step is required if the nodes of the management cluster are deployed using different L2 templates, for example, if the nodes are deployed in a multi-rack environment.

  17. Monitor the inspection process of the bare metal hosts and wait until all hosts are in the available state:

    kubectl get bmh -o go-template='{{- range .items -}} {{.status.provisioning.state}}{{"\n"}} {{- end -}}'
    

    Example of system response:

    available
    available
    available
    
  18. Monitor the BootstrapRegion object status and wait until it is ready:

    kubectl get bootstrapregions -o go-template='{{(index .items 0).status.ready}}{{"\n"}}'
    

    To obtain more granular status details, monitor status.conditions:

    kubectl get bootstrapregions -o go-template='{{(index .items 0).status.conditions}}{{"\n"}}'
    

    For a more user-friendly system response, consider using dedicated tools such as jq or yq and adjust the -o flag to output in the json or yaml format accordingly.
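
    For example, with jq:

    kubectl get bootstrapregions -o json | jq '.items[0].status.conditions'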

  19. Change the directory to /kaas-bootstrap/.

  20. Approve the BootstrapRegion object to start the cluster deployment:

    ./container-cloud bootstrap approve all
    

    Caution

    Once you approve the BootstrapRegion object, no cluster modification is allowed.

    A machine modification is allowed only while a machine is in the AwaitsDeployment state. This state applies if day1Deployment: manual is set in BareMetalMachineProviderSpec.

    Warning

    Do not manually restart or power off any of the bare metal hosts during the bootstrap process.

  21. Monitor the deployment progress. For description of deployment stages, see Overview of the deployment workflow.

  22. Verify that network addresses used on your clusters do not overlap with the following default MKE network addresses for Swarm and MCR:

    • 10.0.0.0/16 is used for Swarm networks. IP addresses from this network are virtual.

    • 10.99.0.0/16 is used for MCR networks. IP addresses from this network are allocated on hosts.

    Verification of Swarm and MCR network addresses

    To verify Swarm and MCR network addresses, run on any master node:

    docker info
    

    Example of system response:

    Server:
     ...
     Swarm:
      ...
      Default Address Pool: 10.0.0.0/16
      SubnetSize: 24
      ...
     Default Address Pools:
       Base: 10.99.0.0/16, Size: 20
     ...
    

    Typically, not all Swarm and MCR addresses are in use. One Swarm Ingress network is created by default and occupies the 10.0.0.0/24 address block. Also, three MCR networks are created by default and occupy three address blocks: 10.99.0.0/20, 10.99.16.0/20, and 10.99.32.0/20.

    To verify the actual networks state and addresses in use, run:

    docker network ls
    docker network inspect <networkName>
    
  23. Optional. If you plan to use multiple L2 segments for provisioning of managed cluster nodes, consider the requirements specified in Configure multiple DHCP address ranges.