Configure multiple DHCP ranges using Subnet resources

Caution

Since Container Cloud 2.21.0, this section applies to existing management clusters only. If you configured multiple DHCP ranges before Container Cloud 2.21.0 during the management cluster bootstrap, the DHCP configuration automatically migrates to Subnet objects after the cluster upgrade to 2.21.0.

To facilitate multi-rack and other types of distributed bare metal datacenter topologies, the dnsmasq DHCP server used for host provisioning in Container Cloud supports working with multiple L2 segments through network routers that support DHCP relay.

Container Cloud has its own DHCP relay running on one of the management cluster nodes. That DHCP relay serves for proxying DHCP requests in the same L2 domain where the management cluster nodes are located.

Caution

Networks used for host provisioning of a managed cluster must have routes to the PXE network (when a dedicated PXE network is configured) or to the combined PXE/management network of the management cluster. This configuration gives hosts access to the management cluster services that are used during host provisioning.

Management cluster nodes must have routes through the PXE network to PXE network segments used on a managed cluster. The following example contains L2 template fragments for a management cluster node:

l3Layout:
  # PXE/static subnet for a management cluster
  - scope: namespace
    subnetName: kaas-mgmt-pxe
    labelSelector:
      kaas-mgmt-pxe-subnet: "1"
  # management (LCM) subnet for a management cluster
  - scope: namespace
    subnetName: kaas-mgmt-lcm
    labelSelector:
      kaas-mgmt-lcm-subnet: "1"
  # PXE/dhcp subnets for a managed cluster
  - scope: namespace
    subnetName: managed-dhcp-rack-1-region-one
  - scope: namespace
    subnetName: managed-dhcp-rack-2-region-one
  - scope: namespace
    subnetName: managed-dhcp-rack-3-region-one
  ...
npTemplate: |
  ...
  bonds:
    bond0:
      interfaces:
        - {{ nic 0 }}
        - {{ nic 1 }}
      parameters:
        mode: active-backup
        primary: {{ nic 0 }}
        mii-monitor-interval: 100
      dhcp4: false
      dhcp6: false
      addresses:
        # static address on management node in the PXE network
        - {{ ip "bond0:kaas-mgmt-pxe" }}
      routes:
        # routes to managed PXE network segments
        - to: {{ cidr_from_subnet "managed-dhcp-rack-1-region-one" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
        - to: {{ cidr_from_subnet "managed-dhcp-rack-2-region-one" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
        - to: {{ cidr_from_subnet "managed-dhcp-rack-3-region-one" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
        ...

To configure DHCP ranges for dnsmasq, create Subnet objects tagged with the ipam/SVC-dhcp-range label while setting up subnets for a managed cluster using the CLI.

For every dhcp-range record, Container Cloud also configures a dhcp-option record that passes the default route, through the default gateway of the corresponding subnet, to all hosts that obtain addresses from that DHCP range.
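
The pairing of dhcp-range and dhcp-option records can be pictured with a minimal Python sketch. It uses standard dnsmasq syntax (`set:<tag>` on the range, `tag:<tag>` and `option:router` on the option), but the function name, the tag naming scheme, and the sample addresses are assumptions made for illustration. The records that Container Cloud actually renders may look different.

```python
import ipaddress


def dnsmasq_records(name, cidr, gateway, include_ranges):
    """Build illustrative dnsmasq records for one subnet.

    Hypothetical helper: the tag format "<name>range<index>" is an
    assumption of this sketch, not the naming used by Container Cloud.
    """
    network = ipaddress.ip_network(cidr)
    lines = []
    for index, item in enumerate(include_ranges):
        start, end = (part.strip() for part in item.split("-"))
        tag = f"{name}range{index}"
        # One dhcp-range record per address range, marked with its own tag.
        lines.append(f"dhcp-range=set:{tag},{start},{end},{network.netmask}")
        # A matching dhcp-option record passes the subnet gateway
        # as the default route (DHCP option 3, "router").
        lines.append(f"dhcp-option=tag:{tag},option:router,{gateway}")
    return lines


records = dnsmasq_records(
    "rack1",
    "10.0.0.0/24",
    "10.0.0.1",
    ["10.0.0.121-10.0.0.125", "10.0.0.191-10.0.0.199"],
)
print("\n".join(records))
```

Tagging each range separately keeps the router option scoped to the hosts that obtain addresses from that range, which mirrors the one-option-per-range behavior described above.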

Caution

Support of multiple DHCP ranges has the following limitations:

  • Using custom DNS server addresses for servers that boot over PXE is not supported.

  • The Subnet objects for DHCP ranges should not reference any specific cluster, as DHCP server configuration is only applicable to the management or regional cluster. The kaas.mirantis.com/region label that specifies the region will be used to determine where to apply the DHCP ranges from the given Subnet object. The Cluster reference will be ignored.

Note

Usage of multiple and single DHCP ranges is as follows:

  • The baremetal-operator chart allows using multiple DHCP ranges in the generated dnsmasq.conf file. The chart iterates over a list of the dhcp-range parameters from its values and adds all items from the list to the dnsmasq configuration.

  • The baremetal-operator chart allows using a single DHCP range for backward compatibility. By default, the KAAS_BM_BM_DHCP_RANGE environment variable is still used to define the DHCP range for management or regional cluster nodes during provisioning.

Override the default dnsmasq settings

Caution

Since Container Cloud 2.24.0, you can only remove the deprecated dnsmasq.dhcp_range values from the cluster spec. The Admission Controller does not accept any other changes in these values. This configuration is completely superseded by the Subnet object usage.

The dnsmasq configuration options dhcp-option=3 and dhcp-option=6 are absent in the default configuration. Therefore, by default, dnsmasq sends the DNS server and default route to DHCP clients as defined in the official dnsmasq documentation:

  • The netmask and broadcast address are the same as on the host running dnsmasq.

  • The DNS server and default route are set to the address of the host running dnsmasq.

  • If the domain name option is set, this name is sent to DHCP clients.

If such default behavior is not desirable during deployment of managed clusters:

  1. Open the management cluster spec for editing.

  2. In the baremetal-operator release values, remove the dnsmasq.dhcp_range parameter:

    regional:
    - helmReleases:
      - name: baremetal-operator
        values:
          dnsmasq:
            dhcp_range: 10.204.1.0,10.204.5.255,255.255.255.0
    
  3. Set the desired DHCP ranges and options using the Subnet objects as described in Configure DHCP ranges for dnsmasq.

Caution

Since Container Cloud 2.21.0, the dnsmasq.dhcp_range parameter of the baremetal-operator Helm chart values in the Cluster spec is deprecated and will be removed in one of the following releases.

Therefore, migrate to the Subnet objects configuration and manually remove dnsmasq.dhcp_range from the cluster spec.

Configure DHCP ranges for dnsmasq

  1. Create the Subnet objects tagged with the ipam/SVC-dhcp-range label.

    Caution

    For cluster-specific subnets, create Subnet objects in the same namespace (project) as the related Cluster object. For shared subnets, create Subnet objects in the default namespace.

    To create the Subnet objects, refer to Create subnets.

    Use the following Subnet object example to specify DHCP ranges and DHCP options to pass the default route address:

    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: mgmt-dhcp-range
      namespace: default
      labels:
        ipam/SVC-dhcp-range: ""
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
    spec:
      cidr: 10.0.0.0/24
      gateway: 10.0.0.1
      includeRanges:
        - 10.0.0.121-10.0.0.125
        - 10.0.0.191-10.0.0.199
    

    Note

    Setting custom nameservers for the PXE subnet is not supported.

    After you create the above Subnet object, Container Cloud uses the provided data to render the Dnsmasq object that configures the dnsmasq deployment. You do not have to manually edit the Dnsmasq object.

  2. Verify that the changes are applied to the Dnsmasq object:

    kubectl --kubeconfig <pathToMgmtOrRegionalClusterKubeconfig> \
    -n kaas get dnsmasq dnsmasq-dynamic-config -o json
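
Before creating a Subnet object for DHCP ranges, it can help to sanity-check its spec offline. The following Python sketch checks that the gateway and every includeRanges entry fall inside the cidr and that no range contains the gateway. The function name and the specific checks are assumptions of this sketch, not the validation rules applied by the Container Cloud Admission Controller. The sketch also assumes that an entry without a dash denotes a single address.

```python
import ipaddress


def check_subnet_spec(spec):
    """Return a list of problems found in a Subnet spec dict (empty if none).

    Hypothetical helper for offline checks; it does not reproduce
    the product's own validation.
    """
    problems = []
    network = ipaddress.ip_network(spec["cidr"])
    gateway = ipaddress.ip_address(spec["gateway"])
    if gateway not in network:
        problems.append(f"gateway {gateway} is outside {network}")
    for item in spec.get("includeRanges", []):
        start_text, _, end_text = item.partition("-")
        start = ipaddress.ip_address(start_text.strip())
        # Assumption: an entry without "-" is a single address.
        end = ipaddress.ip_address((end_text or start_text).strip())
        if start > end:
            problems.append(f"range {item} starts after it ends")
        if start not in network or end not in network:
            problems.append(f"range {item} is outside {network}")
        if start <= gateway <= end:
            problems.append(f"range {item} contains the gateway {gateway}")
    return problems


spec = {
    "cidr": "10.0.0.0/24",
    "gateway": "10.0.0.1",
    "includeRanges": ["10.0.0.121-10.0.0.125", "10.0.0.191-10.0.0.199"],
}
print(check_subnet_spec(spec))
```

An empty list means that the sketch found nothing suspicious. It does not guarantee that the object will be accepted by the cluster.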
    

Configure DHCP relay on ToR switches

For servers to access the DHCP server across L2 segment boundaries, for example, from another rack with a different VLAN for the PXE network, you must configure the DHCP relay (agent) service on the border switch of the segment. Depending on the data center network topology, this can be, for example, a top-of-rack (ToR) or leaf (distribution) switch.

Warning

To ensure predictable routing for the relay of DHCP packets, Mirantis strongly advises against the use of chained DHCP relay configurations. This precaution limits the number of hops for DHCP packets, with an optimal scenario being a single hop.

This approach is justified by the unpredictable nature of chained relay configurations and potential incompatibilities between software and hardware relay implementations.

The dnsmasq server listens on the PXE network of the management cluster by using the dhcp-lb Kubernetes Service.

To configure the DHCP relay service, specify the external address of the dhcp-lb Kubernetes Service as the upstream address for the relayed DHCP requests, that is, as the IP helper address for DHCP. The dnsmasq deployment behind this service accepts only relayed DHCP requests.

Container Cloud has its own DHCP relay running on one of the management cluster nodes. That DHCP relay serves for proxying DHCP requests in the same L2 domain where the management cluster nodes are located.

To obtain the actual IP address issued to the dhcp-lb Kubernetes Service:

kubectl -n kaas get service dhcp-lb
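
If you prefer to extract the address programmatically, the same command can be run with `-o json` and the result parsed. The Python sketch below reads the external address from the standard `status.loadBalancer.ingress` field of a Kubernetes Service. The function name and the sample address 10.0.0.50 are hypothetical, and the sample JSON is trimmed to the fields that the sketch reads.

```python
import json


def external_addresses(service):
    """Return the load balancer IP addresses of a Service object (as a dict)."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    return [entry["ip"] for entry in ingress if "ip" in entry]


# Trimmed, hypothetical output of:
#   kubectl -n kaas get service dhcp-lb -o json
sample = json.loads("""
{
  "kind": "Service",
  "metadata": {"name": "dhcp-lb", "namespace": "kaas"},
  "status": {"loadBalancer": {"ingress": [{"ip": "10.0.0.50"}]}}
}
""")

print(external_addresses(sample))
```

The returned address is the value to configure as the IP helper address on the relay.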

The dnsmasq server listens on the PXE interface of one management cluster node.

To configure the DHCP relay service, specify the management cluster node addresses in the PXE network as upstream addresses for the relayed DHCP requests, that is, as the IP helper addresses for DHCP.

Depending on the PXE network setup, select from the following options:

  • If the PXE network is combined with the management network, identify LCM addresses of the management cluster nodes:

    kubectl -n default get lcmmachine -o wide
    

    In the output, select the addresses from the INTERNALIP column to use as the DHCP helper addresses.

  • If you use a dedicated PXE network, identify the addresses assigned to your nodes using the corresponding IpamHost objects:

    kubectl -n default get ipamhost -o yaml
    

    In status.netconfigV2 of each management cluster host, obtain the name of the interface used for the PXE network and collect the associated addresses to use as the DHCP helper addresses. For example:

    status:
      ...
      netconfigV2:
        ...
        bridges:
          ...
          k8s-pxe:
            addresses:
            - 10.0.1.4/24
            dhcp4: false
            dhcp6: false
            interfaces:
            - ens3
    

    In this example, k8s-pxe is the PXE interface name and 10.0.1.4 is the address to use as one of the DHCP helper addresses.

    Caution

    The following fields of the IpamHost status are renamed since Container Cloud 2.22.0 in the scope of the L2Template and IpamHost objects refactoring:

    • netconfigV2 to netconfigCandidate

    • netconfigV2state to netconfigCandidateState

    • netconfigFilesState to netconfigFilesStates (per file)

    No user actions are required after renaming.

    The format of netconfigFilesState changed after renaming. The netconfigFilesStates field contains a dictionary of statuses of network configuration files stored in netconfigFiles. The dictionary contains the keys that are file paths and values that have the same meaning for each file that netconfigFilesState had:

    • For a successfully rendered configuration file: OK: <timestamp> <sha256-hash-of-rendered-file>, where the timestamp is in the RFC 3339 format.

    • For a failed rendering: ERR: <error-message>.
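
Collecting the DHCP helper addresses from the IpamHost status, as described for a dedicated PXE network, can be sketched in Python as follows. The function name is hypothetical. The sketch assumes that the PXE interface is defined under bridges, as in the k8s-pxe example, and it reads netconfigCandidate first with a fallback to netconfigV2 to cover the field renaming. Other layouts, such as a PXE address on a bond or VLAN, would need the corresponding section instead.

```python
import ipaddress


def pxe_helper_addresses(status, interface="k8s-pxe"):
    """Return the addresses (without prefix length) of the PXE bridge."""
    # Field name depends on the release: netconfigCandidate replaced netconfigV2.
    netconfig = status.get("netconfigCandidate") or status.get("netconfigV2") or {}
    bridge = netconfig.get("bridges", {}).get(interface, {})
    # Strip the prefix length: "10.0.1.4/24" becomes "10.0.1.4".
    return [str(ipaddress.ip_interface(addr).ip) for addr in bridge.get("addresses", [])]


# Status fragment that mirrors the IpamHost example in this section.
status = {
    "netconfigV2": {
        "bridges": {
            "k8s-pxe": {
                "addresses": ["10.0.1.4/24"],
                "dhcp4": False,
                "dhcp6": False,
                "interfaces": ["ens3"],
            }
        }
    }
}

print(pxe_helper_addresses(status))
```

Run the function for each management cluster host and configure the combined list as the helper addresses on the relay.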