Add a machine

After you create a new managed cluster that is based on Equinix Metal as described in Create a managed cluster, proceed with adding machines to this cluster using the Container Cloud web UI.

You can also use the instruction below to scale up an existing managed cluster.

To add a machine to a managed cluster:

  1. Available since Container Cloud 2.22.0 as Technology Preview. If you enabled the custom parameter in the providerSpec.value.network section of the Cluster object, customize network configuration for the cluster machines:

  2. Verify that the servers of a particular type and data center combination are available for the machines deployment as described in Verify the capacity of the Equinix Metal facility.

  3. Log in to the Container Cloud web UI with the m:kaas:namespace@operator or m:kaas:namespace@writer permissions.

  4. Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.

  5. In the Clusters tab, click the required cluster name. The cluster page with Machines list opens.

  6. On the cluster page, click Create Machine.

  7. Fill out the form with the following parameters as required:

    Container Cloud machine configuration

    Parameter

    Description

    Create Machines Pool

    Select to create a set of machines with the same provider spec to manage them as a single unit. Enter the machine pool name in the Pool Name field.

    Count

    Specify the number of machines to create. If you create a machine pool, specify the replicas count of the pool.

    Select Manager or Worker to create a Kubernetes manager or worker node.

    Caution

    The required minimum number of manager machines is three for HA. A cluster can have more than three manager machines but only an odd number of machines.

    In an even-sized cluster, an additional machine remains in the Pending state until an extra manager machine is added. An even number of manager machines does not provide additional fault tolerance but increases the number of node required for etcd quorum.

    The required minimum number of worker machines for the Container Cloud workloads is two. If the multiserver mode is enabled for StackLight, add three worker nodes.

    Machine Type

    Machine type to provision the Equinix Metal server. From the drop-down list, select a server to provision for your project. Pay attention to the machine capacity:

    • Normal - the facility has a lot of available machines. Prioritize this machine type over others.

    • Limited - the facility has a limited number of machines. Do not request many machines of this type.

    • Unknown - Container Cloud cannot fetch information about the capacity level since the feature is disabled.

      Enable the feature with a user-level token in the Credentials object used for cluster deployment. To add a user-level token:

      1. Log in to the Equinix Metal console.

      2. Select the project used for the Container Cloud deployment.

      3. In Profile Settings > Personal API Keys, capture the existing API Key or create a new one:

        1. Click Add New Key.

        2. Fill in the Description and select the Read/Write permissions.

        3. Click Add Key.

      4. In the Credentials tab of the Container Cloud web UI, add the user-level token obtained in the previous step.

    Warning

    Mirantis highly recommends using the c3.small.x86 machine type for the control plane machines deployed with private network to prevent hardware issues with incorrect BIOS boot order.

    Hardware Reservation ID Technology Preview

    Optional. The ID of an Equinix Metal reserved hardware.

    • Fill out this field to use a reserved hardware on your Equinix Metal server.

    • Skip this field if you are deploying Equinix Metal servers on demand.

    Ceph Roles

    Available if Manual Ceph Configuration was selected during the cluster creation.

    • Select the Ceph machine role for Ceph Controller to automatically create the Ceph node based on the Equinix machine hardware storage:

      • Monitor and Manager to deploy Ceph Monitor and Ceph Manager

      • Storage to deploy Ceph OSD

    • To specify the Ceph node manually through the KaaSCephCluster resource, do not select the Ceph machine role.

    Recommended minimal number of Ceph node roles:

    Storage

    Manager and Monitor

    1-2

    1

    3-500

    3 (for HA)

    > 500

    5

    Upgrade Index

    Optional. A positive numeral value that defines the order of machine upgrade during a cluster update.

    Note

    You can change the upgrade order later on an existing cluster. For details, see Change the upgrade order of a machine or machine pool.

    Consider the following upgrade index specifics:

    • The first machine to upgrade is always one of the control plane machines with the lowest upgradeIndex. Other control plane machines are upgraded one by one according to their upgrade indexes.

    • If the Cluster spec dedicatedControlPlane field is false, worker machines are upgraded only after the upgrade of all control plane machines finishes. Otherwise, they are upgraded after the first control plane machine, concurrently with other control plane machines.

    • If several machines have the same upgrade index, they have the same priority during upgrade.

    • If the value is not set, the machine is automatically assigned a value of the upgrade index.

    Node Labels

    Select the required node labels for the worker machine to run certain components on a specific node. For example, for the StackLight nodes that run OpenSearch and require more resources than a standard node, select the StackLight label. The list of available node labels is obtained from allowedNodeLabels of your current Cluster release.

    If the value field is not defined in allowedNodeLabels, select the check box of the required label and define an appropriate custom value for this label to be set to the node. For example, the node-type label can have the storage-ssd value to meet the service scheduling logic on a particular machine.

    Note

    Due to the known issue 23002 fixed in Container Cloud 2.21.0, a custom value for a predefined node label cannot be set using the Container Cloud web UI. For a workaround, refer to the issue description.

    Caution

    If you deploy StackLight in the HA mode (recommended):

    • Add the StackLight label to minimum three worker nodes. Otherwise, StackLight will not be deployed until the required number of worker nodes is configured with the StackLight label.

    • Removal of the StackLight label from worker nodes along with removal of worker nodes with StackLight label can cause the StackLight components to become inaccessible. It is important to correctly maintain the worker nodes where the StackLight local volumes were provisioned. For details, see Delete a cluster machine.

      To obtain the list of nodes where StackLight is deployed, refer to Upgrade managed clusters with StackLight deployed in HA mode.

    If you move the StackLight label to a new worker machine on an existing cluster, manually deschedule all StackLight components from the old worker machine, which you remove the StackLight label from. For details, see Deschedule StackLight Pods from a worker machine.

    Note

    You can add node labels after deploying a worker machine. On the Machines page, click the More action icon in the last column of the required machine field and select Configure machine.

  8. Click Create.

  9. Repeat the steps above for the remaining machines.

    Monitor the deploy or update live status of the machine:

    • Quick status

      On the Clusters page, in the Managers or Workers column. The green status icon indicates that the machine is Ready, the orange status icon indicates that the machine is Updating.

    • Detailed status

      In the Machines section of a particular cluster page, in the Status column. Hover over a particular machine status icon to verify the deploy or update status of a specific machine component.

    You can monitor the status of the following machine components:

    Component

    Description

    Kubelet

    Readiness of a node in a Kubernetes cluster

    Swarm

    Health and readiness of a node in a Docker Swarm cluster

    LCM

    LCM readiness status of a node

    ProviderInstance

    Readiness of a node in the underlying infrastructure (virtual or bare metal, depending on the provider type)

    The machine creation starts with the Provision status. During provisioning, the machine is not expected to be accessible since its infrastructure (VM, network, and so on) is being created.

    Other machine statuses are the same as the LCMMachine object states:

    1. Uninitialized - the machine is not yet assigned to an LCMCluster.

    2. Pending - the agent reports a node IP address and host name.

    3. Prepare - the machine executes StateItems that correspond to the prepare phase. This phase usually involves downloading the necessary archives and packages.

    4. Deploy - the machine executes StateItems that correspond to the deploy phase that is becoming a Mirantis Kubernetes Engine (MKE) node.

    5. Ready - the machine is being deployed.

    6. Upgrade - the machine is being upgraded to the new MKE version.

    7. Reconfigure - the machine executes StateItems that correspond to the reconfigure phase. The machine configuration is being updated without affecting workloads running on the machine.

    Once the status changes to Ready, the deployment of the cluster components on this machine is complete.

    You can also monitor the live machine status using API:

    kubectl get machines <machineName> -o wide
    

    Example of system response since Container Cloud 2.23.0:

    NAME   READY LCMPHASE  NODENAME              UPGRADEINDEX  REBOOTREQUIRED  WARNINGS
    demo-0 true  Ready     kaas-node-c6aa8ad3    1             false
    

    For the history of a machine deployment or update, refer to Inspect the history of a cluster and machine deployment or update.

    Caution

    If a machine is stuck in the Provision state due to the exceeded machine quota, the Provider Instance field of a machine live status contains the Machine quota exceeded message. Delete such machine using the More menu located in the last column of the machine details.

    If the minimal machine requirement is not met as described in Requirements for an Equinix Metal based cluster, create a new machine with the Normal machine capacity label before you can delete the stuck machine to proceed with cluster deployment.

  10. Verify the status of the cluster nodes as described in Connect to a Mirantis Container Cloud cluster.

  11. Establish connection to the cluster private network.

Warning

An operational managed cluster must contain a minimum of 3 Kubernetes manager nodes to meet the etcd quorum and 2 Kubernetes worker nodes.

The deployment of the cluster does not start until the minimum number of nodes is created.

A machine with the manager node role is automatically deleted during the cluster deletion.

Deletion of the manager nodes is allowed for non-MOSK-based clusters within the Technology Preview features scope for the purpose of node replacement or recovery.