Prepare metadata and deploy the management cluster since 2.24.0

This section describes how to prepare cluster metadata and deploy the management cluster since Container Cloud 2.24.0. For a description of the changes applied as compared to previous Container Cloud releases, see Release Notes: MetalLB configuration changes.

Using the example procedure below, replace the addresses and credentials in the configuration YAML files with the data corresponding to your environment. Keep everything else as is, including the file names and YAML structure.

To prepare metadata and deploy the management cluster:

  1. Log in to the seed node that you configured as described in Prepare the seed node.

  2. Change to your preferred work directory, for example:

    cd $HOME
    
  3. Prepare the bootstrap script:

    1. Download and run the Container Cloud bootstrap script:

      sudo apt-get update
      sudo apt-get install wget
      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      chmod 0755 get_container_cloud.sh
      ./get_container_cloud.sh
      
    2. Change the directory to the kaas-bootstrap folder created by the script.
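
      For example, assuming that the script created the folder in your current work directory:

      cd kaas-bootstrap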

  4. Obtain your license file that will be required during the bootstrap:

    1. Create a user account at www.mirantis.com.

    2. Log in to your account and download the mirantis.lic license file.

    3. Save the license file as mirantis.lic under the kaas-bootstrap directory on the bootstrap node.

    4. Verify that mirantis.lic contains the exact Container Cloud license previously downloaded from www.mirantis.com by decoding the license JWT token, for example, using jwt.io.

      Example of a valid decoded Container Cloud license data with the mandatory license field:

      {
          "exp": 1652304773,
          "iat": 1636669973,
          "sub": "demo",
          "license": {
              "dev": false,
              "limits": {
                  "clusters": 10,
                  "workers_per_cluster": 10
              },
              "openstack": null
          }
      }
      

      Warning

      The MKE license does not apply to mirantis.lic. For details about the MKE license, see MKE documentation.
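
      For example, a minimal local check of the license content, assuming that mirantis.lic contains the raw JWT token and that python3 is available on the seed node:

      # Decode the JWT payload of the license file (read-only check)
      python3 -c 'import base64, json; p = open("mirantis.lic").read().strip().split(".")[1]; p += "=" * (-len(p) % 4); print(json.dumps(json.loads(base64.urlsafe_b64decode(p)), indent=2))'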

  5. Prepare the deployment templates:

    1. Create a copy of the current templates directory for future reference:

      mkdir templates.backup
      cp -r templates/*  templates.backup/
      
    2. Inspect the default bare metal host profile definition in templates/bm/baremetalhostprofiles.yaml.template and adjust it to fit your hardware configuration. For details, see Customize the default bare metal host profile.

      Warning

      All data will be wiped during cluster deployment on devices defined directly or indirectly in the fileSystems list of BareMetalHostProfile. For example:

      • A raw device partition with a file system on it

      • A device partition in a volume group with a logical volume that has a file system on it

      • An mdadm RAID device with a file system on it

      • An LVM RAID device with a file system on it

      The wipe field is always considered true for these devices. The false value is ignored.

      Therefore, to prevent data loss, move any data that you need to keep from these file systems to another server beforehand.

    3. Update the bare metal hosts definition template in templates/bm/baremetalhosts.yaml.template according to your environment configuration. Use the table below for reference. Manually set all parameters that start with SET_, for example, using the substitution sketch that follows the table.

      Bare metal hosts template mandatory parameters

      Parameter

      Description

      Example value

      SET_MACHINE_0_IPMI_USERNAME

      The IPMI user name to access the BMC.

      user

      SET_MACHINE_0_IPMI_PASSWORD

      The IPMI password to access the BMC.

      password

      SET_MACHINE_0_MAC

      The MAC address of the first master node in the PXE network.

      ac:1f:6b:02:84:71

      SET_MACHINE_0_BMC_ADDRESS

      The IP address of the BMC endpoint for the first master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway.

      192.168.100.11

      SET_MACHINE_1_IPMI_USERNAME

      The IPMI user name to access the BMC.

      user

      SET_MACHINE_1_IPMI_PASSWORD

      The IPMI password to access the BMC.

      password

      SET_MACHINE_1_MAC

      The MAC address of the second master node in the PXE network.

      ac:1f:6b:02:84:72

      SET_MACHINE_1_BMC_ADDRESS

      The IP address of the BMC endpoint for the second master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway.

      192.168.100.12

      SET_MACHINE_2_IPMI_USERNAME

      The IPMI user name to access the BMC.

      user

      SET_MACHINE_2_IPMI_PASSWORD

      The IPMI password to access the BMC.

      password

      SET_MACHINE_2_MAC

      The MAC address of the third master node in the PXE network.

      ac:1f:6b:02:84:73

      SET_MACHINE_2_BMC_ADDRESS

      The IP address of the BMC endpoint for the third master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway.

      192.168.100.13

      Note

      The IPMI user name and password parameters require values in plain text.
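
      For example, a possible way to substitute the placeholders with sed, using the example values from the table (replace them with the values for your environment; if your values contain characters that are special to sed, edit the file in a text editor instead):

      sed -i \
        -e 's/SET_MACHINE_0_IPMI_USERNAME/user/g' \
        -e 's/SET_MACHINE_0_IPMI_PASSWORD/password/g' \
        -e 's/SET_MACHINE_0_MAC/ac:1f:6b:02:84:71/g' \
        -e 's/SET_MACHINE_0_BMC_ADDRESS/192.168.100.11/g' \
        templates/bm/baremetalhosts.yaml.template

      Repeat the same substitutions for the SET_MACHINE_1_* and SET_MACHINE_2_* parameters.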

    4. Configure cluster network:

      Important

      Mirantis recommends separating the PXE and management networks.

      • To ensure successful bootstrap, enable asymmetric routing on the interfaces of the management cluster nodes. This is required because the seed node relies on one network by default, which can potentially cause traffic asymmetry.

        In the kernelParameters section of bm/baremetalhostprofiles.yaml.template, set rp_filter to 2. This enables loose mode as defined in RFC3704.

        Example configuration of asymmetric routing
        ...
        kernelParameters:
          ...
          sysctl:
            # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
            net.ipv4.conf.k8s-lcm.rp_filter: "2"
            # Enables the "Loose mode" for the "bond0" interface (PXE network)
            net.ipv4.conf.bond0.rp_filter: "2"
            ...
        

        Note

        More complicated solutions that eliminate traffic asymmetry altogether are not described in this manual, for example:

        • Configure source routing on management cluster nodes.

        • Plug the seed node into the same networks as the management cluster nodes, which requires custom configuration of the seed node.

      • Update the network objects definition in templates/bm/ipam-objects.yaml.template according to the environment configuration. By default, this template implies the use of separate PXE and management networks.

      • Manually set all parameters that start with SET_.

      Example of the default L2 template snippet for a management cluster:

      bonds:
        bond0:
          interfaces:
            - {{ nic 0 }}
            - {{ nic 1 }}
          parameters:
            mode: active-backup
            primary: {{ nic 0 }}
          dhcp4: false
          dhcp6: false
          addresses:
            - {{ ip "bond0:mgmt-pxe" }}
      vlans:
        k8s-lcm:
          id: SET_VLAN_ID
          link: bond0
          addresses:
            - {{ ip "k8s-lcm:kaas-mgmt" }}
          nameservers:
            addresses: {{ nameservers_from_subnet "kaas-mgmt" }}
          routes:
            - to: 0.0.0.0/0
              via: {{ gateway_from_subnet "kaas-mgmt" }}
      

      In this example, the following configuration applies:

      • A bond of two NIC interfaces

      • A static address in the PXE network set on the bond

      • An isolated L2 segment for the management network is configured using the k8s-lcm VLAN with the static address in the management network

      • The default gateway address is in the management network

      For general concepts of configuring separate PXE and management networks for a management cluster, see Separate PXE and management networks. For the latest object templates and variable names to use since Container Cloud 2.24.0, use the following tables.

      Network parameters mapping overview

      Deployment file name

      Parameters list to update manually

      ipam-objects.yaml.template

      • SET_LB_HOST

      • SET_MGMT_ADDR_RANGE

      • SET_MGMT_CIDR

      • SET_MGMT_DNS

      • SET_MGMT_NW_GW

      • SET_MGMT_SVC_POOL

      • SET_PXE_ADDR_POOL

      • SET_PXE_ADDR_RANGE

      • SET_PXE_CIDR

      • SET_PXE_SVC_POOL

      • SET_VLAN_ID

      bootstrap.env

      • KAAS_BM_PXE_IP

      • KAAS_BM_PXE_MASK

      • KAAS_BM_PXE_BRIDGE

      The table below contains examples of mandatory parameter values to set in templates/bm/ipam-objects.yaml.template for the network scheme that has the following networks (a substitution sketch follows the table):

      • 172.16.59.0/24 - PXE network

      • 172.16.61.0/25 - management network

      Mandatory network parameters of the IPAM objects template

      Parameter

      Description

      Example value

      SET_PXE_CIDR

      The IP address of the PXE network in the CIDR notation. The minimum recommended network size is 256 addresses (/24 prefix length).

      172.16.59.0/24

      SET_PXE_SVC_POOL

      The IP address range to use for endpoints of load balancers in the PXE network for the Container Cloud services: Ironic-API, DHCP server, HTTP server, and caching server. The minimum required range size is 5 addresses.

      172.16.59.6-172.16.59.15

      SET_PXE_ADDR_POOL

      The IP address range in the PXE network to use for dynamic address allocation for hosts during inspection and provisioning.

      The minimum recommended range size is 30 addresses for management cluster nodes if it is located in a separate PXE network segment. Otherwise, it depends on the number of managed cluster nodes to deploy in the same PXE network segment as the management cluster nodes.

      172.16.59.51-172.16.59.200

      SET_PXE_ADDR_RANGE

      The IP address range in the PXE network to use for static address allocation on each management cluster node. The minimum recommended range size is 6 addresses.

      172.16.59.41-172.16.59.50

      SET_MGMT_CIDR

      The address of the management network for the management cluster in the CIDR notation. If managed clusters will have separate LCM networks, those networks must be routable to the management network. The minimum recommended network size is 128 addresses (/25 prefix length).

      172.16.61.0/25

      SET_MGMT_NW_GW

      The default gateway address in the management network. This gateway must provide access to the OOB network of the Container Cloud cluster and to the Internet to download the Mirantis artifacts.

      172.16.61.1

      SET_LB_HOST

      The IP address of the externally accessible MKE API endpoint of the cluster in the CIDR notation. This address must be within the management network (SET_MGMT_CIDR) but must NOT overlap with any other addresses or address ranges within this network. External load balancers are not supported.

      172.16.61.5/32

      SET_MGMT_DNS

      An external (non-Kubernetes) DNS server accessible from the management network.

      8.8.8.8

      SET_MGMT_ADDR_RANGE

      The IP address range that includes addresses to be allocated to bare metal hosts in the management network for the management cluster.

      When this network is shared with managed clusters, the size of this range limits the number of hosts that can be deployed in all clusters sharing this network.

      When this network is solely used by a management cluster, the range must include at least 6 addresses for bare metal hosts of the management cluster.

      172.16.61.30-172.16.61.40

      SET_MGMT_SVC_POOL

      The IP address range to use for the externally accessible endpoints of load balancers in the management network for the Container Cloud services, such as Keycloak, web UI, and so on. The minimum required range size is 19 addresses.

      172.16.61.10-172.16.61.29

      SET_VLAN_ID

      The VLAN ID used for isolation of the management network. The bootstrap.sh process and the seed node must have routable access to the network in this VLAN.

      3975
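
      For example, a sketch of substituting these placeholders in bulk with sed, using the example values from the table above (adjust them to your environment before running):

      sed -i \
        -e 's|SET_PXE_CIDR|172.16.59.0/24|g' \
        -e 's|SET_PXE_SVC_POOL|172.16.59.6-172.16.59.15|g' \
        -e 's|SET_PXE_ADDR_POOL|172.16.59.51-172.16.59.200|g' \
        -e 's|SET_PXE_ADDR_RANGE|172.16.59.41-172.16.59.50|g' \
        -e 's|SET_MGMT_CIDR|172.16.61.0/25|g' \
        -e 's|SET_MGMT_NW_GW|172.16.61.1|g' \
        -e 's|SET_LB_HOST|172.16.61.5/32|g' \
        -e 's|SET_MGMT_DNS|8.8.8.8|g' \
        -e 's|SET_MGMT_ADDR_RANGE|172.16.61.30-172.16.61.40|g' \
        -e 's|SET_MGMT_SVC_POOL|172.16.61.10-172.16.61.29|g' \
        -e 's|SET_VLAN_ID|3975|g' \
        templates/bm/ipam-objects.yaml.template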

      When using separate PXE and management networks, the management cluster services are exposed in different networks using two separate MetalLB address pools:

      • Services exposed through the PXE network are as follows:

        • Ironic API as a bare metal provisioning server

        • HTTP server that provides images for network boot and server provisioning

        • Caching server for accessing the Container Cloud artifacts deployed on hosts

      • Services exposed through the management network are all other Container Cloud services, such as Keycloak, web UI, and so on.

      The default MetalLB configuration described in the MetalLBConfigTemplate object template of templates/bm/ipam-objects.yaml.template uses two separate MetalLB address pools. Also, it uses the interfaces selector in its l2Advertisements template.

      Caution

      When you change the L2Template object template in templates/bm/ipam-objects.yaml.template, ensure that interfaces listed in the interfaces field of the MetalLBConfigTemplate.spec.templates.l2Advertisements section match those used in your L2Template. For details about the interfaces selector, see API Reference: MetalLBConfigTemplate spec.

      See Configure MetalLB for details on MetalLB configuration.
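
      As an illustration only, the following hypothetical fragment shows how an l2Advertisements entry ties an address pool to specific node interfaces. The advertisement and pool names here are made up; the actual template structure and names are defined in templates/bm/ipam-objects.yaml.template and described in API Reference: MetalLBConfigTemplate spec. The interfaces values must match the interface names from your L2Template, such as bond0 (PXE) and k8s-lcm (management) in the example above:

      l2Advertisements: |
        - name: pxe-lb                # hypothetical advertisement name
          spec:
            ipAddressPools:
              - services-pxe          # hypothetical pool for services in the PXE network
            interfaces:
              - bond0                 # PXE interface from L2Template
        - name: management-lb         # hypothetical advertisement name
          spec:
            ipAddressPools:
              - default               # hypothetical pool for services in the management network
            interfaces:
              - k8s-lcm               # management interface from L2Template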

    5. If you require all Internet traffic to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the cluster using proxy:

      • HTTP_PROXY

      • HTTPS_PROXY

      • NO_PROXY

      • PROXY_CA_CERTIFICATE_PATH

      Example snippet:

      export HTTP_PROXY=http://proxy.example.com:3128
      export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
      export NO_PROXY=172.18.10.0,registry.internal.lan
      export PROXY_CA_CERTIFICATE_PATH="/home/ubuntu/.mitmproxy/mitmproxy-ca-cert.cer"
      

      The following formats of variables are accepted:

      Proxy configuration data

      Variable

      Format

      HTTP_PROXY
      HTTPS_PROXY
      • http://proxy.example.com:port - for anonymous access.

      • http://user:password@proxy.example.com:port - for restricted access.

      NO_PROXY

      Comma-separated list of IP addresses or domain names.

      PROXY_CA_CERTIFICATE_PATH

      Optional. Absolute path to the proxy CA certificate for man-in-the-middle (MITM) proxies. Must be placed on the bootstrap node to be trusted. For details, see Install a CA certificate for a MITM proxy on a bootstrap node.

      Warning

      If you require Internet access to go through a MITM proxy, ensure that the proxy has streaming enabled as described in Enable streaming for MITM.

      Note

      For MOSK-based deployments, the parameter is generally available since MOSK 22.4.

      For implementation details, see Proxy and cache support.

      For the requirements on Mirantis resources and IP ranges that must be accessible from the bootstrap, management, and regional clusters, see Requirements for a baremetal-based cluster.

      For the default network addresses used by Swarm, see Default network addresses.

    6. In templates/bm/cluster.yaml.template, update the cluster-related settings to fit your deployment.

    7. Configure NTP server.

      You can optionally disable NTP, which is enabled by default. Disabling this option stops Container Cloud from managing the chrony configuration so that you can use your own system for chrony management. Otherwise, configure the regional NTP server parameters as described below.

      NTP configuration

      Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

      In templates/bm/cluster.yaml.template, add the ntp:servers section with the list of required server names:

      spec:
        ...
        providerSpec:
          value:
            ...
            ntpEnabled: true
            kaas:
              ...
              regional:
                - helmReleases:
                    - name: <providerName>-provider
                      values:
                        config:
                          lcm:
                            ...
                            ntp:
                              servers:
                              - 0.pool.ntp.org
                              ...
                  provider: <providerName>
                  ...
      

      To disable NTP:

      spec:
        ...
        providerSpec:
          value:
            ...
            ntpEnabled: false
            ...
      
    8. Verify that the kaas-bootstrap directory contains the following files:

      # tree ~/kaas-bootstrap
        ~/kaas-bootstrap/
        ...
        ├── bootstrap.sh
        ├── kaas
        ├── mirantis.lic
        ├── releases
        ...
        ├── templates
        │   ...
        │   ├── bm
        │   │   ├── baremetalhostprofiles.yaml.template
        │   │   ├── baremetalhosts.yaml.template
        │   │   ├── cluster.yaml.template
        │   │   ├── ipam-objects.yaml.template
        │   │   ├── machines.yaml.template
        │   │   └── metallbconfig.yaml.template
        │   ...
        ├── templates.backup
        │   ...
        ...
      
    9. Export the following mandatory parameters using the commands and table below:

      # Enable the bare metal provider
      export KAAS_BM_ENABLED="true"
      # Provisioning IP address, prefix length, and bridge name in the PXE network
      export KAAS_BM_PXE_IP="172.16.59.5"
      export KAAS_BM_PXE_MASK="24"
      export KAAS_BM_PXE_BRIDGE="br0"
      # Make sure the full preflight mode is not set
      unset KAAS_BM_FULL_PREFLIGHT
      
      Bare metal prerequisites data

      Parameter

      Description

      Example value

      KAAS_BM_PXE_IP

      The provisioning IP address in the PXE network. This address will be assigned on the seed node to the interface defined by the KAAS_BM_PXE_BRIDGE parameter described below. The PXE service of the bootstrap cluster uses this address to network boot bare metal hosts.

      172.16.59.5

      KAAS_BM_PXE_MASK

      The PXE network address prefix length to be used with the KAAS_BM_PXE_IP address when assigning it to the seed node interface.

      24

      KAAS_BM_PXE_BRIDGE

      The PXE network bridge name that must match the name of the bridge created on the seed node during the Prepare the seed node stage.

      br0
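
      Optionally, before running the bootstrap, verify that the bridge set in KAAS_BM_PXE_BRIDGE exists on the seed node, for example:

      ip link show br0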

    10. Optional. Enable the Linux Audit daemon auditd to monitor activity of cluster processes and prevent potential malicious activity.

      Configuration for auditd

      In templates/bm/cluster.yaml.template, add the auditd parameters:

      spec:
        providerSpec:
          value:
            audit:
              auditd:
                enabled: <bool>
                enabledAtBoot: <bool>
                backlogLimit: <int>
                maxLogFile: <int>
                maxLogFileAction: <string>
                maxLogFileKeep: <int>
                mayHaltSystem: <bool>
                presetRules: <string>
                customRules: <string>
                customRulesX32: <text>
                customRulesX64: <text>
      

      Configuration parameters for auditd:

      enabled

      Boolean, default - false. Enables the auditd role to install the auditd packages and configure rules. CIS rules: 4.1.1.1, 4.1.1.2.

      enabledAtBoot

      Boolean, default - false. Configures grub to audit processes that can be audited even if they start up prior to auditd startup. CIS rule: 4.1.1.3.

      backlogLimit

      Integer, default - none. Configures the backlog to hold records. If audit=1 is configured during boot, the backlog holds 64 records. If more than 64 records are created during boot, auditd records will be lost and potential malicious activity may go undetected. CIS rule: 4.1.1.4.

      maxLogFile

      Integer, default - none. Configures the maximum size of the audit log file. Once the log reaches the maximum size, it is rotated and a new log file is created. CIS rule: 4.1.2.1.

      maxLogFileAction

      String, default - none. Defines handling of the audit log file reaching the maximum file size. Allowed values:

      • keep_logs - rotate logs but never delete them.

      • rotate - add a cron job to compress rotated log files and keep a maximum of 5 compressed files.

      • compress - compress log files and keep them under the /var/log/auditd/ directory. Requires auditd_max_log_file_keep to be enabled.

      CIS rule: 4.1.2.2.

      maxLogFileKeep

      Integer, default - 5. Defines the number of compressed log files to keep under the /var/log/auditd/ directory. Requires auditd_max_log_file_action=compress. CIS rules - none.

      mayHaltSystem

      Boolean, default - false. Halts the system when the audit logs are full. Applies the following configuration:

      • space_left_action = email

      • action_mail_acct = root

      • admin_space_left_action = halt

      CIS rule: 4.1.2.3.

      customRules

      String, default - none. Base64-encoded content of the 60-custom.rules file for any architecture. CIS rules - none.

      customRulesX32

      String, default - none. Base64-encoded content of the 60-custom.rules file for the i386 architecture. CIS rules - none.

      customRulesX64

      String, default - none. Base64-encoded content of the 60-custom.rules file for the x86_64 architecture. CIS rules - none.

      presetRules

      String, default - none. Comma-separated list of the following built-in preset rules:

      • access

      • actions

      • delete

      • docker

      • identity

      • immutable

      • logins

      • mac-policy

      • modules

      • mounts

      • perm-mod

      • privileged

      • scope

      • session

      • system-locale

      • time-change

      You can use two keywords for these rules:

      • none - disables all built-in rules.

      • all - enables all built-in rules. With this keyword, you can exclude specific rules by adding the ! prefix to their names; the all keyword must come first in the list, and any rule with the ! prefix must follow it.

      Example configurations:

      • presetRules: none - disable all preset rules

      • presetRules: docker - enable only the docker rules

      • presetRules: access,actions,logins - enable only the access, actions, and logins rules

      • presetRules: all - enable all preset rules

      • presetRules: all,!immutable,!session - enable all preset rules except immutable and session


      The preset rules cover the following CIS controls: 4.1.3 (time-change), 4.1.4 (identity), 4.1.5 (system-locale), 4.1.6 (mac-policy), 4.1.7 (logins), 4.1.8 (session), 4.1.9 (perm-mod), 4.1.10 (access), 4.1.11 (privileged), 4.1.12 (mounts), 4.1.13 (delete), 4.1.14 (scope), 4.1.15 (actions), 4.1.16 (modules), and 4.1.17 (immutable).

      Docker CIS controls covered: 1.1.4, 1.1.8, 1.1.10, 1.1.12, 1.1.13, 1.1.15, 1.1.16, 1.1.17, 1.1.18, 1.2.3, 1.2.4, 1.2.5, 1.2.6, 1.2.7, 1.2.10, and 1.2.11.
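
      For illustration only, a hypothetical combination of these parameters that enables auditd, compresses rotated logs, and applies all preset rules except docker (the values are examples, not recommendations):

      spec:
        providerSpec:
          value:
            audit:
              auditd:
                enabled: true
                enabledAtBoot: true
                maxLogFile: 10
                maxLogFileAction: compress
                maxLogFileKeep: 5
                presetRules: all,!docker
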
    11. Optional. Enable WireGuard for traffic encryption on the Kubernetes workloads network.

      WireGuard configuration
      1. Ensure that the Calico MTU size is at least 60 bytes smaller than the interface MTU size of the workload network. IPv4 WireGuard uses a 60-byte header. For details, see Set the MTU size for Calico.

      2. In templates/bm/cluster.yaml.template, enable WireGuard by adding the secureOverlay parameter:

        spec:
          ...
          providerSpec:
            value:
              ...
              secureOverlay: true
        

        Caution

        Changing this parameter on a running cluster causes a downtime that can vary depending on the cluster size.

      For more details about WireGuard, see Calico documentation: Encrypt in-cluster pod traffic.

    12. Optional. Enable custom host names for cluster machines. When enabled, any machine host name in a particular region matches the related Machine object name. For example, instead of the default kaas-node-<UID>, a machine host name will be master-0. The custom naming format is more convenient and easier to work with.

      To enable the feature on the management or regional cluster and its future managed clusters, add the following environment variable:

      export CUSTOM_HOSTNAMES=true
      
    13. Optional. Configure external identity provider for IAM.

    14. Optional. Enable infinite timeout for all bootstrap stages by exporting the following environment variable or adding it to bootstrap.env:

      export KAAS_BOOTSTRAP_INFINITE_TIMEOUT=true
      

      Infinite timeout prevents a bootstrap failure due to timeout. This option is useful in the following cases:

      • The network speed is too slow to download the artifacts quickly

      • The infrastructure configuration does not allow fast booting

      • A bare metal node inspection involves more than two HDD/SATA disks attached to a machine

    15. Optional. Available since Container Cloud 2.23.0. Customize the cluster and region name by exporting the following environment variables or adding them to bootstrap.env:

      export REGION=<customRegionName>
      export CLUSTER_NAME=<customClusterName>
      

      By default, the system uses region-one for the region name and kaas-mgmt for the management cluster name.

  6. Run the bootstrap script:

    ./bootstrap.sh all
    
    • In case of deployment issues, refer to Troubleshooting and inspect logs.

    • If the script fails for an unknown reason:

      1. Run the cleanup script:

        ./bootstrap.sh cleanup
        
      2. Rerun the bootstrap script.

    Warning

    During the bootstrap process, do not manually restart or power off any of the bare metal hosts.

  7. When the bootstrap is complete, collect and save the following management cluster details in a secure location:

    • The kubeconfig file located in the same directory as the bootstrap script. This file contains the admin credentials for the management cluster (a usage sketch follows at the end of this step).

    • The ssh_key private key for access to the management cluster nodes, which is located in the same directory as the bootstrap script.

      Note

      If the initial version of your Container Cloud management cluster was earlier than 2.6.0, ssh_key is named openstack_tmp and is located at ~/.ssh/.

    • The URL for the Container Cloud web UI.

      To create users with permissions required for accessing the Container Cloud web UI, see Create initial users after a management cluster bootstrap.

    • The StackLight endpoints. For details, see Access StackLight web UIs.

    • The Keycloak URL that the system outputs when the bootstrap completes. The admin password for Keycloak is located in kaas-bootstrap/passwords.yml along with other IAM passwords.

    Note

    The Container Cloud web UI and StackLight endpoints are available through Transport Layer Security (TLS) and communicate with Keycloak to authenticate users. Keycloak is exposed using HTTPS and self-signed TLS certificates that are not trusted by web browsers.

    To use your own TLS certificates for Keycloak, refer to Configure TLS certificates for cluster applications.

    Note

    When the bootstrap is complete, the bootstrap cluster resources are freed up.
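
    For example, a quick check of the collected admin credentials, assuming that kubectl is installed on the seed node and that kaas-bootstrap is your work directory:

    export KUBECONFIG=~/kaas-bootstrap/kubeconfig
    kubectl get nodes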

  8. Verify that network addresses used on your clusters do not overlap with the following default MKE network addresses for Swarm and MCR:

    • 10.0.0.0/16 is used for Swarm networks. IP addresses from this network are virtual.

    • 10.99.0.0/16 is used for MCR networks. IP addresses from this network are allocated on hosts.

    Verification of Swarm and MCR network addresses

    To verify Swarm and MCR network addresses, run on any master node:

    docker info
    

    Example of system response:

    Server:
     ...
     Swarm:
      ...
      Default Address Pool: 10.0.0.0/16
      SubnetSize: 24
      ...
     Default Address Pools:
       Base: 10.99.0.0/16, Size: 20
     ...
    

    Typically, not all Swarm and MCR addresses are in use. One Swarm Ingress network is created by default and occupies the 10.0.0.0/24 address block. Also, three MCR networks are created by default and occupy three address blocks: 10.99.0.0/20, 10.99.16.0/20, and 10.99.32.0/20.

    To verify the actual networks state and addresses in use, run:

    docker network ls
    docker network inspect <networkName>
    
  9. Optional. If you plan to use multiple L2 segments for provisioning of managed cluster nodes, consider the requirements specified in Configure multiple DHCP ranges using Subnet resources.

  10. Optional. Deploy an additional regional cluster of a different provider type as described in Deploy an additional regional cluster (optional).