Deploy a management cluster

This section provides an overview of the cluster-related objects and describes how to configure them when deploying a management cluster using Bootstrap v2 through the Container Cloud API.

Deploy a management cluster using CLI

The following procedure describes how to prepare and deploy a management cluster using Bootstrap v2 by editing the YAML templates available in the kaas-bootstrap/templates/ folder.

To deploy a management cluster using CLI:

  1. Set up a bootstrap cluster.

  2. Export kubeconfig of the kind cluster:

    export KUBECONFIG=<pathToKindKubeconfig>
    

    By default, <pathToKindKubeconfig> is $HOME/.kube/kind-config-clusterapi.
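
    Optionally, you can run a quick sanity check to confirm access to the bootstrap cluster, for example, using the bundled kubectl binary:

    ./kaas-bootstrap/bin/kubectl get nodes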

  3. Configure BIOS on a bare metal host.

  4. Navigate to kaas-bootstrap/templates/bm.

    Warning

    The kubectl apply command automatically saves the applied data as plain text into the kubectl.kubernetes.io/last-applied-configuration annotation of the corresponding object. This may result in revealing sensitive data in this annotation when creating or modifying objects containing credentials. Such Container Cloud objects include:

    • BareMetalHostCredential

    • ClusterOIDCConfiguration

    • License

    • Proxy

    • ServiceUser

    • TLSConfig

    Therefore, do not use kubectl apply on these objects. Use kubectl create, kubectl patch, or kubectl edit instead.

    If you used kubectl apply on these objects, you can remove the kubectl.kubernetes.io/last-applied-configuration annotation from the objects using kubectl edit.
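
    For example, to strip the annotation non-interactively instead of using kubectl edit, the kubectl annotate trailing-minus syntax also works. Here, <objectKind> and <objectName> are placeholders for the affected object:

    ./kaas-bootstrap/bin/kubectl annotate <objectKind> <objectName> \
        kubectl.kubernetes.io/last-applied-configuration-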

  5. Create the BootstrapRegion object by modifying bootstrapregion.yaml.template.

    Configuration of bootstrapregion.yaml.template
    1. Set provider: baremetal and use the default <regionName>, which is region-one.

      apiVersion: kaas.mirantis.com/v1alpha1
      kind: BootstrapRegion
      metadata:
        name: region-one
        namespace: default
      spec:
        provider: baremetal
      
    2. Create the object:

      ./kaas-bootstrap/bin/kubectl create -f \
          kaas-bootstrap/templates/bm/bootstrapregion.yaml.template
      

    Note

    In the following steps, apply the changes to objects using the commands below with the required template name:

    ./kaas-bootstrap/bin/kubectl create -f \
        kaas-bootstrap/templates/bm/<templateName>.yaml.template
    
  6. Create the ServiceUser object by modifying serviceusers.yaml.template.

    Configuration of serviceusers.yaml.template

    The service user is the initial user created in Keycloak to provide access to a newly deployed management cluster. By default, it has the global-admin, operator (namespaced), and bm-pool-operator (namespaced) roles.

    You can delete the service user after setting up other required users with specific roles or after integrating with an external identity provider, such as LDAP. See the example after the template below.

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: ServiceUserList
    items:
    - apiVersion: kaas.mirantis.com/v1alpha1
      kind: ServiceUser
      metadata:
        name: <USERNAME>
      spec:
        password:
          value: <PASSWORD>
    
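
    For example, to delete the service user later, a minimal sketch assuming the object keeps the name set in <USERNAME>:

    ./kaas-bootstrap/bin/kubectl delete serviceuser <USERNAME>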
  7. Optional. Prepare any number of additional SSH keys using the following example:

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: PublicKey
    metadata:
      name: <SSHKeyName>
      namespace: default
    spec:
      publicKey: |
        <insert your public key here>
    
  8. Optional. Add the Proxy object using the example below:

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: Proxy
    metadata:
      name: <proxyName>
      namespace: default
    spec:
      ...
    
  9. Inspect the default bare metal host profile definition in baremetalhostprofiles.yaml.template and adjust it to fit your hardware configuration. For details, see Customize the default bare metal host profile.

    Warning

    During cluster deployment, all data is wiped on devices defined directly or indirectly in the fileSystems list of BareMetalHostProfile. For example:

    • A raw device partition with a file system on it

    • A device partition in a volume group with a logical volume that has a file system on it

    • An mdadm RAID device with a file system on it

    • An LVM RAID device with a file system on it

    The wipe field is always considered true for these devices. The false value is ignored.

    Therefore, to prevent data loss, move the necessary data from these file systems to another server beforehand, if required.
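
    For illustration only, the following hedged BareMetalHostProfile fragment shows a layout where such a wipe occurs: a partition carved from a device and formatted through the fileSystems list. Field names follow the default profile template; sizes and names are placeholders to verify against your template version.

    spec:
      devices:
      - device:
          minSize: 60Gi
          wipe: true
        partitions:
        - name: root
          size: 0              # 0 means use all remaining space on the device
      fileSystems:
      - fileSystem: ext4
        partition: root
        mountPoint: /          # data on this file system is wiped at deployment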

  10. In baremetalhostinventory.yaml.template, update the bare metal host definitions according to your environment configuration. Use the reference table below to manually set all parameters that start with SET_. A sketch of how these placeholders map to objects follows the table.

    Mandatory parameters for a bare metal host template

    • SET_MACHINE_0_IPMI_USERNAME - The IPMI user name to access the BMC. [1] Example: user

    • SET_MACHINE_0_IPMI_PASSWORD - The IPMI password to access the BMC. [1] Example: password

    • SET_MACHINE_0_MAC - The MAC address of the first master node in the PXE network. Example: ac:1f:6b:02:84:71

    • SET_MACHINE_0_BMC_ADDRESS - The IP address of the BMC endpoint for the first master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway. Example: 192.168.100.11

    • SET_MACHINE_1_IPMI_USERNAME - The IPMI user name to access the BMC. [1] Example: user

    • SET_MACHINE_1_IPMI_PASSWORD - The IPMI password to access the BMC. [1] Example: password

    • SET_MACHINE_1_MAC - The MAC address of the second master node in the PXE network. Example: ac:1f:6b:02:84:72

    • SET_MACHINE_1_BMC_ADDRESS - The IP address of the BMC endpoint for the second master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway. Example: 192.168.100.12

    • SET_MACHINE_2_IPMI_USERNAME - The IPMI user name to access the BMC. [1] Example: user

    • SET_MACHINE_2_IPMI_PASSWORD - The IPMI password to access the BMC. [1] Example: password

    • SET_MACHINE_2_MAC - The MAC address of the third master node in the PXE network. Example: ac:1f:6b:02:84:73

    • SET_MACHINE_2_BMC_ADDRESS - The IP address of the BMC endpoint for the third master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway. Example: 192.168.100.13

    [1] The parameter requires a user name and password in plain text.
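
    As a hedged sketch of where these placeholders typically land, the credentials and addresses of the first master node map to a BareMetalHostCredential and a BareMetalHostInventory object roughly as follows. Object names are illustrative, and field names must be verified against baremetalhostinventory.yaml.template:

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: BareMetalHostCredential
    metadata:
      name: master-0-credential      # illustrative name
      namespace: default
    spec:
      username: SET_MACHINE_0_IPMI_USERNAME
      password:
        value: SET_MACHINE_0_IPMI_PASSWORD
    ---
    apiVersion: kaas.mirantis.com/v1alpha1
    kind: BareMetalHostInventory
    metadata:
      name: master-0                 # illustrative name
      namespace: default
    spec:
      bmc:
        address: SET_MACHINE_0_BMC_ADDRESS
        bmhCredentialsName: master-0-credential
      bootMACAddress: SET_MACHINE_0_MAC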

  11. Configure cluster network:

    Important

    Bootstrap v2 supports only separated PXE and LCM networks.

    • Update the network object definition in ipam-objects.yaml.template according to the environment configuration. By default, this template implies the use of separate PXE and life-cycle management (LCM) networks.

    • Manually set all parameters that start with SET_.

    • To ensure successful bootstrap, enable asymmetric routing on the interfaces of the management cluster nodes. This is required because the seed node relies on one network by default, which can potentially cause traffic asymmetry.

      In the kernelParameters section of baremetalhostprofiles.yaml.template, set rp_filter to 2. This enables loose mode as defined in RFC3704.

      Example configuration of asymmetric routing
      ...
      kernelParameters:
        ...
        sysctl:
          # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
          net.ipv4.conf.k8s-lcm.rp_filter: "2"
          # Enables the "Loose mode" for the "bond0" interface (PXE network)
          net.ipv4.conf.bond0.rp_filter: "2"
          ...
      

      Note

      More complex solutions, which are not described in this manual, involve eliminating traffic asymmetry altogether. For example:

      • Configure source routing on management cluster nodes.

      • Plug the seed node into the same networks as the management cluster nodes, which requires custom configuration of the seed node.

    For configuration details of bond network interface for the PXE and management network, see Configure NIC bonding.

    Example of the default L2 template snippet for a management cluster
    bonds:
      bond0:
        interfaces:
          - {{ nic 0 }}
          - {{ nic 1 }}
        parameters:
          mode: active-backup
          primary: {{ nic 0 }}
        dhcp4: false
        dhcp6: false
        addresses:
          - {{ ip "bond0:mgmt-pxe" }}
    vlans:
      k8s-lcm:
        id: SET_VLAN_ID
        link: bond0
        addresses:
          - {{ ip "k8s-lcm:kaas-mgmt" }}
        nameservers:
          addresses: {{ nameservers_from_subnet "kaas-mgmt" }}
        routes:
          - to: 0.0.0.0/0
            via: {{ gateway_from_subnet "kaas-mgmt" }}
    

    In this example, the following configuration applies:

    • A bond of two NIC interfaces

    • A static address in the PXE network set on the bond

    • An isolated L2 segment for the LCM network is configured using the k8s-lcm VLAN with the static address in the LCM network

    • The default gateway address is in the LCM network

    For general concepts of configuring separate PXE and LCM networks for a management cluster, see Separate PXE and management networks. For current object templates and variable names to use, see the following tables.

    Network parameters mapping overview

    • ipam-objects.yaml.template - parameters to update manually:

      • SET_LB_HOST

      • SET_MGMT_ADDR_RANGE

      • SET_MGMT_CIDR

      • SET_MGMT_DNS

      • SET_MGMT_NW_GW

      • SET_MGMT_SVC_POOL

      • SET_PXE_ADDR_POOL

      • SET_PXE_ADDR_RANGE

      • SET_PXE_CIDR

      • SET_PXE_SVC_POOL

      • SET_VLAN_ID

    • bootstrap.env - parameters to update manually:

      • KAAS_BM_PXE_IP

      • KAAS_BM_PXE_MASK

      • KAAS_BM_PXE_BRIDGE

    Mandatory network parameters of the IPAM object template

    The following list contains examples of mandatory parameter values to set in ipam-objects.yaml.template for a network scheme that has the following networks (a sketch of a resulting Subnet object follows the list):

    • 172.16.59.0/24 - PXE network

    • 172.16.61.0/25 - LCM network

    • SET_PXE_CIDR - The IP address of the PXE network in the CIDR notation. The minimum recommended network size is 256 addresses (/24 prefix length). Example: 172.16.59.0/24

    • SET_PXE_SVC_POOL - The IP address range to use for endpoints of load balancers in the PXE network for the Container Cloud services: Ironic-API, DHCP server, HTTP server, and caching server. The minimum required range size is 5 addresses. Example: 172.16.59.6-172.16.59.15

    • SET_PXE_ADDR_POOL - The IP address range in the PXE network to use for dynamic address allocation for hosts during inspection and provisioning. The minimum recommended range size is 30 addresses if the management cluster nodes are located in a separate PXE network segment. Otherwise, it depends on the number of managed cluster nodes to deploy in the same PXE network segment as the management cluster nodes. Example: 172.16.59.51-172.16.59.200

    • SET_PXE_ADDR_RANGE - The IP address range in the PXE network to use for static address allocation on each management cluster node. The minimum recommended range size is 6 addresses. Example: 172.16.59.41-172.16.59.50

    • SET_MGMT_CIDR - The IP address of the LCM network for the management cluster in the CIDR notation. If managed clusters will have their own separate LCM networks, those networks must be routable to this LCM network. The minimum recommended network size is 128 addresses (/25 prefix length). Example: 172.16.61.0/25

    • SET_MGMT_NW_GW - The default gateway address in the LCM network. This gateway must provide access to the OOB network of the Container Cloud cluster and to the Internet to download the Mirantis artifacts. Example: 172.16.61.1

    • SET_LB_HOST - The IP address of the externally accessible MKE API endpoint of the cluster in the CIDR notation. This address must be within the management network (SET_MGMT_CIDR) but must not overlap with any other addresses or address ranges within this network. External load balancers are not supported. Example: 172.16.61.5/32

    • SET_MGMT_DNS - An external (non-Kubernetes) DNS server accessible from the LCM network. Example: 8.8.8.8

    • SET_MGMT_ADDR_RANGE - The IP address range that includes addresses to be allocated to bare metal hosts in the LCM network for the management cluster. When this network is shared with managed clusters, the size of this range limits the number of hosts that can be deployed in all clusters sharing this network. When this network is used solely by a management cluster, the range must include at least 6 addresses for bare metal hosts of the management cluster. Example: 172.16.61.30-172.16.61.40

    • SET_MGMT_SVC_POOL - The IP address range to use for the externally accessible endpoints of load balancers in the LCM network for the Container Cloud services, such as Keycloak, web UI, and so on. The minimum required range size is 19 addresses. Example: 172.16.61.10-172.16.61.29

    • SET_VLAN_ID - The VLAN ID used for isolation of the LCM network. The bootstrap.sh process and the seed node must have routable access to the network in this VLAN. Example: 3975
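
    As a hedged illustration of where some of these placeholders appear, an LCM Subnet object may look as follows. The authoritative layout is ipam-objects.yaml.template itself; verify the object and field names there:

    apiVersion: ipam.mirantis.com/v1alpha1
    kind: Subnet
    metadata:
      name: kaas-mgmt                # matches the "kaas-mgmt" reference in the L2 template
      namespace: default
    spec:
      cidr: SET_MGMT_CIDR
      gateway: SET_MGMT_NW_GW
      includeRanges:
      - SET_MGMT_ADDR_RANGE
      nameservers:
      - SET_MGMT_DNS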

    When using separate PXE and LCM networks, the management cluster services are exposed in different networks using two separate MetalLB address pools:

    • Services exposed through the PXE network are as follows:

      • Ironic API as a bare metal provisioning server

      • HTTP server that provides images for network boot and server provisioning

      • Caching server for accessing the Container Cloud artifacts deployed on hosts

    • Services exposed through the LCM network are all other Container Cloud services, such as Keycloak, web UI, and so on.

    The default MetalLB configuration described in the MetalLBConfig object template of metallbconfig.yaml.template uses two separate MetalLB address pools. Also, it uses the interfaces selector in its l2Advertisements template.

    Caution

    When you change the L2Template object template in ipam-objects.yaml.template, ensure that interfaces listed in the interfaces field of the MetalLBConfig.spec.l2Advertisements section match those used in your L2Template. For details about the interfaces selector, see Container Cloud API Reference: MetalLBConfig spec.

    See Configure and verify MetalLB for details on MetalLB configuration.
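
    For orientation, the following hedged fragment sketches the interfaces selector within a MetalLBConfig l2Advertisements entry. The nesting follows the default metallbconfig.yaml.template; verify it against your template version, and keep the interface names in sync with your L2Template:

    spec:
      ...
      l2Advertisements:
      - name: management-lcm         # illustrative name
        spec:
          ipAddressPools:
          - default
          interfaces:
          - k8s-lcm                  # must match an interface defined in your L2Template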

  12. In cluster.yaml.template:

    1. Set the mandatory label:

      labels:
        kaas.mirantis.com/provider: baremetal
      
    2. Update the cluster-related settings to fit your deployment.

  13. Optional. Technology Preview. Deprecated since Container Cloud 2.29.0 (Cluster release 16.4.0). Enable WireGuard for traffic encryption on the Kubernetes workloads network.

    WireGuard configuration
    1. Ensure that the Calico MTU size is at least 60 bytes smaller than the interface MTU size of the workload network. IPv4 WireGuard uses a 60-byte header. For details, see Set the MTU size for Calico.

    2. In cluster.yaml.template, enable WireGuard by adding the secureOverlay parameter:

      spec:
        ...
        providerSpec:
          value:
            ...
            secureOverlay: true
      

      Caution

      Changing this parameter on a running cluster causes a downtime that can vary depending on the cluster size.

    For more details about WireGuard, see Calico documentation: Encrypt in-cluster pod traffic.

  14. Configure StackLight. For parameters description, see StackLight configuration parameters.

  15. Optional. Configure additional cluster settings as described in Configure optional settings.

  16. In machines.yaml.template:

    1. Add the following mandatory machine labels:

      labels:
        kaas.mirantis.com/provider: baremetal
        cluster.sigs.k8s.io/cluster-name: <clusterName>
        cluster.sigs.k8s.io/control-plane: "true"
      
    2. Adjust spec and labels sections of each entry according to your deployment.

    3. Adjust the spec.providerSpec.value.hostSelector values to match BareMetalHostInventory corresponding to each machine. For details, see Container Cloud API Reference: Machine spec.
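
      For example, a hostSelector that pins a machine to a specific bare metal host typically matches a unique label on the corresponding BareMetalHostInventory object. This is a sketch; the label key and value below are placeholders to align with your inventory definitions:

      spec:
        providerSpec:
          value:
            ...
            hostSelector:
              matchLabels:
                kaas.mirantis.com/baremetalhost-id: <uniqueHostId>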

  17. Monitor the inspection process of the bare metal hosts and wait until all hosts are in the available state:

    kubectl get bmh -o go-template='{{- range .items -}} {{.status.provisioning.state}}{{"\n"}} {{- end -}}'
    

    Example of system response:

    available
    available
    available
    
  18. Monitor the BootstrapRegion object status and wait until it is ready.

    kubectl get bootstrapregions -o go-template='{{(index .items 0).status.ready}}{{"\n"}}'
    

    To obtain more granular status details, monitor status.conditions:

    kubectl get bootstrapregions -o go-template='{{(index .items 0).status.conditions}}{{"\n"}}'
    

    For a more user-friendly system response, consider using dedicated tools such as jq or yq and adjust the -o flag to output in the json or yaml format accordingly.
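
    For example, to pretty-print the conditions using jq:

    kubectl get bootstrapregions -o json | jq '.items[0].status.conditions'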

  19. Change the directory to /kaas-bootstrap/.

  20. Approve the BootstrapRegion object to start the cluster deployment:

    ./container-cloud bootstrap approve all
    

    Caution

    Once you approve the BootstrapRegion object, no cluster or machine modification is allowed.

    Warning

    Do not manually restart or power off any of the bare metal hosts during the bootstrap process.

  21. Monitor the deployment progress. For description of deployment stages, see Overview of the deployment workflow.

  22. Verify that network addresses used on your clusters do not overlap with the following default MKE network addresses for Swarm and MCR:

    • 10.0.0.0/16 is used for Swarm networks. IP addresses from this network are virtual.

    • 10.99.0.0/16 is used for MCR networks. IP addresses from this network are allocated on hosts.

    Verification of Swarm and MCR network addresses

    To verify Swarm and MCR network addresses, run on any master node:

    docker info
    

    Example of system response:

    Server:
     ...
     Swarm:
      ...
      Default Address Pool: 10.0.0.0/16
      SubnetSize: 24
      ...
     Default Address Pools:
       Base: 10.99.0.0/16, Size: 20
     ...
    

    Typically, not all Swarm and MCR addresses are in use. One Swarm Ingress network is created by default and occupies the 10.0.0.0/24 address block. Three MCR networks are also created by default and occupy the 10.99.0.0/20, 10.99.16.0/20, and 10.99.32.0/20 address blocks.

    To verify the actual networks state and addresses in use, run:

    docker network ls
    docker network inspect <networkName>
    
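
    For example, to print only the subnets of a particular network (<networkName> is a placeholder):

    docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}}{{"\n"}}{{end}}' <networkName>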
  23. Optional. If you plan to use multiple L2 segments for provisioning of managed cluster nodes, consider the requirements specified in Configure multiple DHCP address ranges.