Mirantis Container Cloud Documentation

The documentation is intended to help operators understand the core concepts of the product.

The information provided in this documentation set is constantly improved and amended based on feedback and requests from our software consumers. This documentation set describes the features supported within the three latest Container Cloud minor releases and their supported Cluster releases, with a corresponding note Available since <release-version>.

The following table lists the guides included in the documentation set you are reading:

Guides list

Guide

Purpose

Reference Architecture

Learn the fundamentals of Container Cloud reference architecture to plan your deployment.

Deployment Guide

Deploy Container Cloud of a preferred configuration using supported deployment profiles tailored to the demands of specific business cases.

Operations Guide

Deploy and operate the Container Cloud managed clusters.

Release Compatibility Matrix

Deployment compatibility of the Container Cloud component versions for each product release.

Release Notes

Learn about new features and bug fixes in the current Container Cloud version as well as in the Container Cloud minor releases.

QuickStart Guides

Easy and lightweight instructions to get started with Container Cloud.

Intended audience

This documentation assumes that the reader is familiar with network and cloud concepts and is intended for the following users:

  • Infrastructure Operator

    • Is a member of the IT operations team

    • Has working knowledge of Linux, virtualization, Kubernetes API and CLI, and OpenStack to support the application development team

    • Accesses Mirantis Container Cloud and Kubernetes through a local machine or web UI

    • Provides verified artifacts through a central repository to the Tenant DevOps engineers

  • Tenant DevOps engineer

    • Is a member of the application development team and reports to the line of business (LOB)

    • Has working knowledge of Linux, virtualization, Kubernetes API and CLI to support application owners

    • Accesses Container Cloud and Kubernetes through a local machine or web UI

    • Consumes artifacts from a central repository approved by the Infrastructure Operator

Conventions

This documentation set uses the following conventions in the HTML format:

Documentation conventions

Convention

Description

boldface font

Inline CLI tools and commands, titles of the procedures and system response examples, table titles.

monospaced font

File names and paths, Helm chart parameters and their values, package names, node names and labels, and so on.

italic font

Information that distinguishes some concept or term.

Links

External links and cross-references, footnotes.

Main menu > menu item

GUI elements that include any part of the interactive user interface and menu navigation.

Superscript

Some extra, brief information. For example, if a feature is available from a specific release or if a feature is in the Technology Preview development stage.

Note

The Note block

Messages of a generic meaning that may be useful to the user.

Caution

The Caution block

Information that helps a user avoid mistakes and undesirable consequences when following the procedures.

Warning

The Warning block

Messages with details that can be easily missed but should not be ignored, as they are valuable before proceeding.

See also

The See also block

List of references that may be helpful for understanding related tools, concepts, and so on.

Learn more

The Learn more block

Used in the Release Notes to wrap a list of internal references to the reference architecture, deployment and operation procedures specific to a newly implemented product feature.

Technology Preview features

A Technology Preview feature provides early access to upcoming product innovations, allowing customers to experiment with the functionality and provide feedback.

Technology Preview features may be privately or publicly available but are not intended for production use. While Mirantis will provide assistance with such features through official channels, normal Service Level Agreements do not apply.

As Mirantis considers making future iterations of Technology Preview features generally available, we will do our best to resolve any issues that customers experience when using these features.

During the development of a Technology Preview feature, additional components may become available to the public for evaluation. Mirantis cannot guarantee the stability of such features. As a result, if you are using Technology Preview features, you may not be able to seamlessly update to subsequent product releases or upgrade or migrate to functionality that has not yet been announced as fully supported.

Mirantis makes no guarantees that Technology Preview features will graduate to generally available features.

Documentation history

The documentation set refers to Mirantis Container Cloud GA as the latest released GA version of the product. For details about the Container Cloud GA minor release dates, refer to Container Cloud releases.

Product Overview

Mirantis Container Cloud enables you to ship code faster by combining speed with choice, simplicity, and security. Through a single pane of glass, you can deploy, manage, and observe Kubernetes clusters on private clouds or bare metal infrastructure. Container Cloud enables you to leverage the following on-premises cloud infrastructure: OpenStack, VMware, and bare metal.

The list of the most common use cases includes:

Multi-cloud

Organizations are increasingly moving toward a multi-cloud strategy, with the goal of enabling the effective placement of workloads over multiple platform providers. Multi-cloud strategies can introduce a lot of complexity and management overhead. Mirantis Container Cloud enables you to effectively deploy and manage container clusters (Kubernetes and Swarm) across multiple cloud provider platforms.

Hybrid cloud

The challenges of consistently deploying, tracking, and managing hybrid workloads across multiple cloud platforms are compounded by not having a single point that provides information on all available resources. Mirantis Container Cloud enables hybrid cloud workloads by providing a central point of management and visibility of all your cloud resources.

Kubernetes cluster lifecycle management

The consistent lifecycle management of a single Kubernetes cluster is a complex task on its own that is made infinitely more difficult when you have to manage multiple clusters across different platforms spread across the globe. Mirantis Container Cloud provides a single, centralized point from which you can perform full lifecycle management of your container clusters, including automated updates and upgrades. Container Cloud also supports attachment of existing Mirantis Kubernetes Engine clusters that are not originally deployed by Container Cloud.

Highly regulated industries

Regulated industries need fine-grained access control, high security standards, and extensive reporting capabilities to ensure that they can meet and exceed security requirements. Mirantis Container Cloud provides a fine-grained Role-Based Access Control (RBAC) mechanism and easy integration and federation with existing identity management (IDM) systems.

Logging, monitoring, alerting

Complete operational visibility is required to identify and address issues in the shortest amount of time, before a problem becomes serious. Mirantis StackLight is the proactive monitoring, logging, and alerting solution designed for large-scale container and cloud observability with extensive collectors, dashboards, trend reporting, and alerts.

Storage

Cloud environments require a unified pool of storage that can be scaled up by simply adding storage server nodes. Ceph is a unified, distributed storage system designed for excellent performance, reliability, and scalability. Container Cloud deploys Ceph using Rook to provide and manage robust persistent storage for Kubernetes workloads on baremetal-based clusters.

Security

Security is a core concern for all enterprises, especially as more systems are exposed to the Internet by default. Mirantis Container Cloud provides a multi-layered security approach that includes effective identity management and role-based authentication, secure out-of-the-box defaults, and extensive security scanning and monitoring during the development process.

5G and Edge

The introduction of 5G technologies and the support of Edge workloads require an effective multi-tenant solution to manage the underlying container infrastructure. Mirantis Container Cloud provides a secure, full-stack, multi-cloud cluster management and Day 2 operations solution that supports both on-premises bare metal and cloud.

Reference Architecture

Overview

Mirantis Container Cloud is a set of microservices that are deployed using Helm charts and run in a Kubernetes cluster. Container Cloud is based on the Kubernetes Cluster API community initiative.

The following diagram illustrates an overview of Container Cloud and the clusters it manages:

_images/cluster-overview.png

All artifacts used by Kubernetes and workloads are stored on the Container Cloud content delivery network (CDN):

  • mirror.mirantis.com (Debian packages including the Ubuntu mirrors)

  • binary.mirantis.com (Helm charts and binary artifacts)

  • mirantis.azurecr.io (Docker image registry)

All Container Cloud components are deployed in the Kubernetes clusters. All Container Cloud APIs are implemented using Kubernetes Custom Resource Definitions (CRDs) that represent custom objects stored in Kubernetes and allow you to extend the Kubernetes API.

The Container Cloud logic is implemented using controllers. A controller handles the changes in custom resources defined in the controller CRD. A custom resource contains a spec that describes the desired state of the resource as provided by the user. On every change, the controller reconciles the external state of the custom resource with the user parameters and stores this external state in the status subresource of the custom resource.
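The following minimal sketch illustrates this spec and status pattern with a hypothetical custom resource. The kind and field names below are illustrative only and do not correspond to an actual Container Cloud CRD:

 apiVersion: example.mirantis.com/v1alpha1
 kind: ExampleResource
 metadata:
   name: demo
   namespace: default
 spec:
   # Desired state provided by the user
   size: 3
 status:
   # External state observed and written back by the controller
   readySize: 3
   phase: Ready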

Container Cloud cluster types

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

The types of the Container Cloud clusters include:

Bootstrap cluster
  • Contains the Bootstrap web UI for the OpenStack and vSphere providers. The Bootstrap web UI support for the bare metal provider will be added in one of the following Container Cloud releases.

  • Runs the bootstrap process on a seed node that can be reused after the management cluster deployment for other purposes. For the OpenStack or vSphere provider, it can be an operator desktop computer. For the bare metal provider, this is a data center node.

  • Requires access to one of the following provider backends: bare metal, OpenStack, or vSphere.

  • Initially, the bootstrap cluster is created with the following minimal set of components: Bootstrap Controller, public API charts, and the Bootstrap web UI.

  • The user can interact with the bootstrap cluster through the Bootstrap web UI or API to create the configuration for a management cluster and start its deployment. More specifically, the user performs the following operations:

    1. Select the provider, add provider credentials.

    2. Add proxy and SSH keys.

    3. Configure the cluster and machines.

    4. Deploy a management cluster.

  • The user can monitor the deployment progress of the cluster and machines.

  • After a successful deployment, the user can download the kubeconfig artifact of the provisioned cluster.

Management cluster

Comprises Container Cloud as a product and provides the following functionality:

  • Runs all public APIs and services including the web UIs of Container Cloud.

  • Does not require access to any provider backend.

  • Runs the provider-specific services and internal API including LCMMachine and LCMCluster. Also, it runs an LCM controller for orchestrating managed clusters and other controllers for handling different resources.

  • Requires two-way access to a provider backend. The provider connects to a backend to spawn managed cluster nodes, and the agent running on the nodes accesses the regional cluster to obtain the deployment information.

For deployment details of a management cluster, see Deployment Guide.

Managed cluster
  • A Mirantis Kubernetes Engine (MKE) cluster that an end user creates using the Container Cloud web UI.

  • Requires access to its management cluster. Each node of a managed cluster runs an LCM Agent that connects to the LCM machine of the management cluster to obtain the deployment details.

  • Since Container Cloud 2.25.2, can also be an attached MKE cluster that was not originally created using Container Cloud, which applies to vSphere-based clusters. In such a case, nodes of the attached cluster do not contain LCM Agent. For supported MKE versions that can be attached to Container Cloud, see Release Compatibility Matrix.

  • Baremetal-based managed clusters support the Mirantis OpenStack for Kubernetes (MOSK) product. For details, see MOSK documentation.

All types of the Container Cloud clusters except the bootstrap cluster are based on the MKE and Mirantis Container Runtime (MCR) architecture. For details, see MKE and MCR documentation.

The following diagram illustrates the distribution of services between each type of the Container Cloud clusters:

_images/cluster-types.png

Cloud provider

The Mirantis Container Cloud provider is the central component of Container Cloud that provisions a node of a management, regional, or managed cluster and runs the LCM Agent on this node. It runs in the management and regional clusters and requires a connection to a provider backend.

The Container Cloud provider interacts with the following types of public API objects:

Public API object name

Description

Container Cloud release object

Contains the following information about clusters:

  • Version of the supported Cluster release for the management and regional clusters

  • List of supported Cluster releases for the managed clusters and supported upgrade path

  • Description of Helm charts that are installed on the management and regional clusters depending on the selected provider

Cluster release object

  • Provides a specific version of a management, regional, or managed cluster. A Cluster release object, as well as a Container Cloud release object, never changes; only new releases can be added. Any change leads to a new release of a cluster.

  • Contains references to all components and their versions that are used to deploy all cluster types:

    • LCM components:

      • LCM Agent

      • Ansible playbooks

      • Scripts

      • Description of steps to execute during a cluster deployment and upgrade

      • Helm Controller image references

    • Supported Helm charts description:

      • Helm chart name and version

      • Helm release name

      • Helm values

Cluster object

  • References the Credentials, KaaSRelease and ClusterRelease objects.

  • Is tied to a specific Container Cloud region and provider.

  • Represents all cluster-level resources. For example, for the OpenStack-based clusters, it represents networks, load balancer for the Kubernetes API, and so on. It uses data from the Credentials object to create these resources and data from the KaaSRelease and ClusterRelease objects to ensure that all lower-level cluster objects are created.

Machine object

  • References the Cluster object.

  • Represents one node of a managed cluster, for example, an OpenStack VM, and contains all data to provision it.

Credentials object

  • Contains all information necessary to connect to a provider backend.

  • Is tied to a specific Container Cloud region and provider.

PublicKey object

Is provided to every machine to obtain SSH access.
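For illustration, a minimal PublicKey object may look as follows. Treat the exact schema as an assumption and refer to the Container Cloud API reference of your product version for the authoritative definition:

 apiVersion: kaas.mirantis.com/v1alpha1
 kind: PublicKey
 metadata:
   name: demo-key
   namespace: demo-project
 spec:
   # Public part of the SSH key that is distributed to every machine
   publicKey: |
     ssh-rsa AAAA... user@example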

The following diagram illustrates the Container Cloud provider data flow:

_images/provider-dataflow.png

The Container Cloud provider performs the following operations in Container Cloud:

  • Consumes the following types of data from the management and regional clusters:

    • Credentials to connect to a provider backend

    • Deployment instructions from the KaaSRelease and ClusterRelease objects

    • The cluster-level parameters from the Cluster objects

    • The machine-level parameters from the Machine objects

  • Prepares data for all Container Cloud components:

    • Creates the LCMCluster and LCMMachine custom resources for LCM Controller and LCM Agent. The LCMMachine custom resources are created empty to be later handled by the LCM Controller.

    • Creates the HelmBundle custom resources for the Helm Controller using data from the KaaSRelease and ClusterRelease objects.

    • Creates service accounts for these custom resources.

    • Creates a scope in Identity and Access Management (IAM) for user access to a managed cluster.

  • Provisions nodes for a managed cluster using the cloud-init script that downloads and runs the LCM Agent (see the abbreviated cloud-init sketch after this list).

  • Installs Helm Controller as a Helm v3 chart.
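The cloud-init provisioning step mentioned above can be pictured with the following abbreviated user-data sketch. The artifact URL, file paths, and unit name are hypothetical and only illustrate how a node fetches and starts LCM Agent:

 #cloud-config
 # Hypothetical sketch: the URL, paths, and unit name are placeholders.
 write_files:
 - path: /etc/systemd/system/lcm-agent.service
   content: |
     [Unit]
     Description=LCM Agent (illustrative unit definition)
     [Service]
     ExecStart=/usr/local/bin/lcm-agent
     Restart=always
     [Install]
     WantedBy=multi-user.target
 runcmd:
 - curl -o /usr/local/bin/lcm-agent https://binary.mirantis.com/<path-to-lcm-agent>
 - chmod +x /usr/local/bin/lcm-agent
 - systemctl daemon-reload
 - systemctl enable --now lcm-agent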

Release Controller

The Mirantis Container Cloud Release Controller is responsible for the following functionality:

  • Monitor and control the KaaSRelease and ClusterRelease objects present in a management cluster. If any release object is used in a cluster, the Release Controller prevents the deletion of such an object.

  • Sync the KaaSRelease and ClusterRelease objects published at https://binary.mirantis.com/releases/ with an existing management cluster.

  • Trigger the Container Cloud auto-update procedure if a new KaaSRelease object is found:

    1. Search for the managed clusters with old Cluster releases that are not supported by a new Container Cloud release. If any are detected, abort the auto-update and display a corresponding note about an old Cluster release in the Container Cloud web UI for the managed clusters. In this case, a user must update all managed clusters using the Container Cloud web UI. Once all managed clusters are updated to the Cluster releases supported by a new Container Cloud release, the Container Cloud auto-update is retriggered by the Release Controller.

    2. Trigger the Container Cloud release update of all Container Cloud components in a management cluster. The update itself is processed by the Container Cloud provider.

    3. Trigger the Cluster release update of a management cluster to the Cluster release version that is indicated in the updated Container Cloud release version. The LCMCluster components, such as MKE, are updated before the HelmBundle components, such as StackLight or Ceph.

      Once a management cluster is updated, an option to update a managed cluster becomes available in the Container Cloud web UI. During a managed cluster update, all cluster components including Kubernetes are automatically updated to newer versions if available. The LCMCluster components, such as MKE, are updated before the HelmBundle components, such as StackLight or Ceph.

The Operator can delay the Container Cloud automatic upgrade procedure for a limited amount of time or schedule the upgrade to run at desired hours or weekdays. For details, see Schedule Mirantis Container Cloud updates.

Container Cloud remains operational during the management cluster upgrade. Managed clusters are not affected during this upgrade. For the list of components that are updated during the Container Cloud upgrade, see the Components versions section of the corresponding Container Cloud release in Release Notes.

When Mirantis announces support of the newest versions of Mirantis Container Runtime (MCR) and Mirantis Kubernetes Engine (MKE), Container Cloud automatically upgrades these components as well. For the maintenance window best practices before upgrade of these components, see MKE Documentation.

See also

Patch releases

Web UI

The Mirantis Container Cloud web UI is mainly designed to create and update the managed clusters as well as add or remove machines to or from an existing managed cluster.

You can use the Container Cloud web UI to obtain the management cluster details including endpoints, release version, and so on. The management cluster update occurs automatically with a new release change log available through the Container Cloud web UI.

The Container Cloud web UI is a JavaScript application based on the React framework. It works on the client side only and therefore does not require a dedicated backend. It interacts with the Kubernetes and Keycloak APIs directly. The Container Cloud web UI uses a Keycloak token to interact with the Container Cloud API and to download kubeconfig for the management and managed clusters.

The Container Cloud web UI uses NGINX that runs on a management cluster and handles the Container Cloud web UI static files. NGINX proxies the Kubernetes and Keycloak APIs for the Container Cloud web UI.

Bare metal

The bare metal service provides discovery, deployment, and management of bare metal hosts.

The bare metal management in Mirantis Container Cloud is implemented as a set of modular microservices. Each microservice implements a certain requirement or function within the bare metal management system.

Bare metal components

The bare metal management solution for Mirantis Container Cloud includes the following components:

Bare metal components

Component

Description

OpenStack Ironic

The backend bare metal manager in a standalone mode with its auxiliary services that include httpd, dnsmasq, and mariadb.

OpenStack Ironic Inspector

Introspects and discovers the bare metal hosts inventory. Includes OpenStack Ironic Python Agent (IPA) that is used as a provision-time agent for managing bare metal hosts.

Ironic Operator

Monitors changes in the external IP addresses of httpd, ironic, and ironic-inspector and automatically reconciles the configuration for dnsmasq, ironic, baremetal-provider, and baremetal-operator.

Bare Metal Operator

Manages bare metal hosts through the Ironic API. The Container Cloud bare-metal operator implementation is based on the Metal³ project.

Bare metal resources manager

Ensures that the bare metal provisioning artifacts, such as the distribution image of the operating system, are available and up to date.

cluster-api-provider-baremetal

The plugin for the Kubernetes Cluster API integrated with Container Cloud. Container Cloud uses the Metal³ implementation of cluster-api-provider-baremetal for the Cluster API.

HAProxy

Load balancer for external access to the Kubernetes API endpoint.

LCM Agent

Used for management of physical and logical storage, physical and logical networking, and control over the life cycle of bare metal machine resources.

Ceph

Distributed shared storage is required by the Container Cloud services to create persistent volumes to store their data.

MetalLB

Load balancer for Kubernetes services on bare metal. 1

Keepalived

Monitoring service that ensures availability of the virtual IP for the external load balancer endpoint (HAProxy). 1

IPAM

IP address management services provide consistent IP address space to the machines in bare metal clusters. See details in IP Address Management.

1(1,2)

For details, see Built-in load balancing.

The diagram below summarizes the following components and resource kinds:

  • Metal³-based bare metal management in Container Cloud (white)

  • Internal APIs (yellow)

  • External dependency components (blue)

_images/bm-component-stack.png
Bare metal networking

This section provides an overview of the networking configuration and the IP address management in the Mirantis Container Cloud on bare metal.

IP Address Management

Mirantis Container Cloud on bare metal uses IP Address Management (IPAM) to keep track of the network addresses allocated to bare metal hosts. This is necessary to avoid IP address conflicts and expiration of address leases to machines through DHCP.

Note

Only IPv4 address family is currently supported by Container Cloud and IPAM. IPv6 is not supported and not used in Container Cloud.

IPAM is provided by the kaas-ipam controller. Its functions include:

  • Allocation of IP address ranges or subnets to newly created clusters using SubnetPool and Subnet resources.

  • Allocation of IP addresses to machines and cluster services at the request of baremetal-provider using the IpamHost and IPaddr resources.

  • Creation and maintenance of host networking configuration on the bare metal hosts using the IpamHost resources.

The IPAM service can support different networking topologies and network hardware configurations on the bare metal hosts.

In the most basic network configuration, IPAM uses a single L3 network to assign addresses to all bare metal hosts, as defined in Managed cluster networking.
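For example, a Subnet resource describing such a network may look as follows. This is a minimal sketch that assumes the ipam.mirantis.com/v1alpha1 API group and the field names shown below; consult the IPAM API reference of your product version for the exact schema:

 apiVersion: ipam.mirantis.com/v1alpha1
 kind: Subnet
 metadata:
   name: demo-lcm-subnet
   namespace: demo-project
 spec:
   cidr: 10.0.0.0/24            # L3 network used for host addresses
   gateway: 10.0.0.1
   nameservers:
   - 172.18.176.6
   includeRanges:
   - 10.0.0.100-10.0.0.200      # addresses available for allocation
   excludeRanges:
   - 10.0.0.150-10.0.0.160      # addresses reserved outside of IPAM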

You can apply complex networking configurations to a bare metal host using the L2 templates. The L2 templates imply multihomed host networking and enable you to create a managed cluster where nodes use separate host networks for different types of traffic. Multihoming is required to ensure the security and performance of a managed cluster.
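The following abbreviated L2Template sketch illustrates the idea of multihomed host networking. The structure, in particular the l3Layout list and the netplan-like npTemplate with its templating functions, is an assumption based on the IPAM API and is heavily shortened; see Create L2 templates for the authoritative format:

 apiVersion: ipam.mirantis.com/v1alpha1
 kind: L2Template
 metadata:
   name: demo-l2template
   namespace: demo-project
 spec:
   l3Layout:
   - subnetName: demo-lcm-subnet        # LCM traffic
     scope: namespace
   - subnetName: demo-storage-subnet    # dedicated storage traffic
     scope: namespace
   npTemplate: |
     version: 2
     ethernets:
       {{nic 0}}:
         dhcp4: false
         addresses:
         - {{ip "0:demo-lcm-subnet"}}
       {{nic 1}}:
         dhcp4: false
         addresses:
         - {{ip "1:demo-storage-subnet"}}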

Caution

Modification of L2 templates in use is allowed with a mandatory validation step from the Infrastructure Operator to prevent accidental cluster failures due to unsafe changes. The list of risks posed by modifying L2 templates includes:

  • Services running on hosts cannot reconfigure automatically to switch to the new IP addresses and/or interfaces.

  • Connections between services are interrupted unexpectedly, which can cause data loss.

  • Incorrect configurations on hosts can lead to irrevocable loss of connectivity between services and unexpected cluster partition or disassembly.

For details, see Modify network configuration on an existing machine.

Management cluster networking

The main purpose of networking in a Container Cloud management cluster is to provide access to the Container Cloud Management API that consists of the Kubernetes API of the Container Cloud management cluster and the Container Cloud LCM API. This API allows end users to provision and configure managed clusters and machines. Also, this API is used by LCM agents in managed clusters to obtain configuration and report status.

The following types of networks are supported for the management clusters in Container Cloud:

  • PXE network

    Enables PXE boot of all bare metal machines in the Container Cloud region.

    • PXE subnet

      Provides IP addresses for DHCP and network boot of the bare metal hosts for initial inspection and operating system provisioning. This network may not have the default gateway or a router connected to it. The PXE subnet is defined by the Container Cloud Operator during bootstrap.

      Provides IP addresses for the bare metal management services of Container Cloud, such as bare metal provisioning service (Ironic). These addresses are allocated and served by MetalLB.

  • Management network

    Connects LCM Agents running on the hosts to the Container Cloud LCM API. Serves the external connections to the Container Cloud Management API. The network is also used for communication between kubelet and the Kubernetes API server inside a Kubernetes cluster. The MKE components use this network for communication inside a swarm cluster.

    • LCM subnet

      Provides IP addresses for the Kubernetes nodes in the management cluster. This network also provides a Virtual IP (VIP) address for the load balancer that enables external access to the Kubernetes API of a management cluster. This VIP is also the endpoint to access the Container Cloud Management API in the management cluster.

      Provides IP addresses for the externally accessible services of Container Cloud, such as Keycloak, web UI, StackLight. These addresses are allocated and served by MetalLB.

  • Kubernetes workloads network

    Technology Preview

    Serves the internal traffic between workloads on the management cluster.

    • Kubernetes workloads subnet

      Provides IP addresses that are assigned to nodes and used by Calico.

  • Out-of-Band (OOB) network

    Connects to Baseboard Management Controllers of the servers that host the management cluster. The OOB subnet must be accessible from the management network through IP routing. The OOB network is not managed by Container Cloud and is not represented in the IPAM API.

Managed cluster networking

Kubernetes cluster networking is typically focused on connecting pods on different nodes. On bare metal, however, the cluster networking is more complex as it needs to facilitate many different types of traffic.

Kubernetes clusters managed by Mirantis Container Cloud have the following types of traffic:

  • PXE network

    Enables the PXE boot of all bare metal machines in Container Cloud. This network is not configured on the hosts in a managed cluster. It is used by the bare metal provider to provision additional hosts in managed clusters and is disabled on the hosts after provisioning is done.

  • Life-cycle management (LCM) network

    Connects LCM Agents running on the hosts to the Container Cloud LCM API. The LCM API is provided by the management cluster. The LCM network is also used for communication between kubelet and the Kubernetes API server inside a Kubernetes cluster. The MKE components use this network for communication inside a swarm cluster.

    When using the BGP announcement of the IP address for the cluster API load balancer, which is available as Technology Preview since Container Cloud 2.24.4, no segment stretching is required between Kubernetes master nodes. Also, in this scenario, the load balancer IP address is not required to match the LCM subnet CIDR address.

    • LCM subnet(s)

      Provides IP addresses that are statically allocated by the IPAM service to bare metal hosts. This network must be connected to the Kubernetes API endpoint of the management cluster through an IP router.

      LCM Agents running on managed clusters will connect to the management cluster API through this router. LCM subnets may be different per managed cluster as long as this connection requirement is satisfied.

      The Virtual IP (VIP) address for load balancer that enables access to the Kubernetes API of the managed cluster must be allocated from the LCM subnet.

    • Cluster API subnet

      Technology Preview

      Provides a load balancer IP address for external access to the cluster API. Mirantis recommends that this subnet stays unique per managed cluster.

  • Kubernetes workloads network

    Serves as an underlay network for traffic between pods in the managed cluster. Do not share this network between clusters.

    • Kubernetes workloads subnet(s)

      Provides IP addresses that are statically allocated by the IPAM service to all nodes and that are used by Calico for cross-node communication inside a cluster. By default, VXLAN overlay is used for Calico cross-node communication.

  • Kubernetes external network

    Serves ingress traffic to the managed cluster from the outside world. You can share this network between clusters, but with dedicated subnets per cluster. Several or all cluster nodes must be connected to this network. Traffic from external users to the externally available Kubernetes load-balanced services comes through the nodes that are connected to this network.

    • Services subnet(s)

      Provides IP addresses for externally available Kubernetes load-balanced services. The address ranges for MetalLB are assigned from this subnet. There can be several subnets per managed cluster that define the address ranges or address pools for MetalLB.

    • External subnet(s)

      Provides IP addresses that are statically allocated by the IPAM service to nodes. The IP gateway in this network is used as the default route on all nodes that are connected to this network. This network allows external users to connect to the cluster services exposed as Kubernetes load-balanced services. MetalLB speakers must run on the same nodes. For details, see Configure node selector for MetalLB speaker.

  • Storage network

    Serves storage access and replication traffic from and to Ceph OSD services. The storage network does not need to be connected to any IP routers and does not require external access, unless you want to use Ceph from outside of a Kubernetes cluster. To use a dedicated storage network, define and configure both subnets listed below.

    • Storage access subnet(s)

      Provides IP addresses that are statically allocated by the IPAM service to Ceph nodes. The Ceph OSD services bind to these addresses on their respective nodes. Serves Ceph access traffic from and to storage clients. This is a public network in Ceph terms. 1

    • Storage replication subnet(s)

      Provides IP addresses that are statically allocated by the IPAM service to Ceph nodes. The Ceph OSD services bind to these addresses on their respective nodes. Serves Ceph internal replication traffic. This is a cluster network in Ceph terms. 1

  • Out-of-Band (OOB) network

    Connects baseboard management controllers (BMCs) of the bare metal hosts. This network must not be accessible from the managed clusters.

The following diagram illustrates the networking schema of the Container Cloud deployment on bare metal with a managed cluster:

_images/bm-cluster-l3-networking-multihomed.png
1(1,2)

For more details about Ceph networks, see Ceph Network Configuration Reference.

Host networking

The following network roles are defined for all Mirantis Container Cloud cluster nodes on bare metal, including the bootstrap, management, and managed cluster nodes:

  • Out-of-band (OOB) network

    Connects the Baseboard Management Controllers (BMCs) of the hosts in the network to Ironic. This network is out of band for the host operating system.

  • PXE network

    Enables remote booting of servers through the PXE protocol. In management clusters, DHCP server listens on this network for hosts discovery and inspection. In managed clusters, hosts use this network for the initial PXE boot and provisioning.

  • LCM network

    Connects LCM Agents running on the node to the LCM API of the management cluster. It is also used for communication between kubelet and the Kubernetes API server inside a Kubernetes cluster. The MKE components use this network for communication inside a swarm cluster. In management clusters, it is replaced by the management network.

  • Kubernetes workloads (pods) network

    Technology Preview

    Serves connections between Kubernetes pods. Each host has an address on this network, and this address is used by Calico as an endpoint to the underlay network.

  • Kubernetes external network

    Technology Preview

    Serves external connection to the Kubernetes API and the user services exposed by the cluster. In management clusters, it is replaced by the management network.

  • Management network

    Serves external connections to the Container Cloud Management API and services of the management cluster. Not available in a managed cluster.

  • Storage access network

    Connects Ceph nodes to the storage clients. The Ceph OSD service is bound to the address on this network. This is a public network in Ceph terms. 0

  • Storage replication network

    Connects Ceph nodes to each other. Serves internal replication traffic. This is a cluster network in Ceph terms. 0

Each network is represented on the host by a virtual Linux bridge. Physical interfaces may be connected to one of the bridges directly, or through a logical VLAN subinterface, or combined into a bond interface that is in turn connected to a bridge.

The following table summarizes the default names used for the bridges connected to the networks listed above:

Management cluster

Network type

Bridge name

Assignment method TechPreview

OOB network

N/A

N/A

PXE network

bm-pxe

By a static interface name

Management network

k8s-lcm 2

By a subnet label ipam/SVC-k8s-lcm

Kubernetes workloads network

k8s-pods 1

By a static interface name

Managed cluster

Network type

Bridge name

Assignment method

OOB network

N/A

N/A

PXE network

N/A

N/A

LCM network

k8s-lcm 2

By a subnet label ipam/SVC-k8s-lcm

Kubernetes workloads network

k8s-pods 1

By a static interface name

Kubernetes external network

k8s-ext

By a static interface name

Storage access (public) network

ceph-public

By the subnet label ipam/SVC-ceph-public

Storage replication (cluster) network

ceph-cluster

By the subnet label ipam/SVC-ceph-cluster

0(1,2)

Ceph network configuration reference

1(1,2)

Interface name for this network role is static and cannot be changed.

2(1,2)

Use of this interface name (and network role) is mandatory for every cluster.

Storage

The baremetal-based Mirantis Container Cloud uses Ceph as a distributed storage system for file, block, and object storage. This section provides an overview of a Ceph cluster deployed by Container Cloud.

Overview

Mirantis Container Cloud deploys Ceph on baremetal-based managed clusters using Helm charts with the following components:

Rook Ceph Operator

A storage orchestrator that deploys Ceph on top of a Kubernetes cluster. Also known as Rook or Rook Operator. Rook operations include:

  • Deploying and managing a Ceph cluster based on provided Rook CRs such as CephCluster, CephBlockPool, CephObjectStore, and so on.

  • Orchestrating the state of the Ceph cluster and all its daemons.

KaaSCephCluster custom resource (CR)

Represents the customization of a Kubernetes installation and allows you to define the required Ceph configuration through the Container Cloud web UI before deployment. For example, you can define the failure domain, Ceph pools, Ceph node roles, number of Ceph components such as Ceph OSDs, and so on. The ceph-kcc-controller controller on the Container Cloud management cluster manages the KaaSCephCluster CR.

Ceph Controller

A Kubernetes controller that obtains the parameters from Container Cloud through a CR, creates CRs for Rook and updates its CR status based on the Ceph cluster deployment progress. It creates users, pools, and keys for OpenStack and Kubernetes and provides Ceph configurations and keys to access them. Also, Ceph Controller eventually obtains the data from the OpenStack Controller for the Keystone integration and updates the RADOS Gateway services configurations to use Kubernetes for user authentication. Ceph Controller operations include:

  • Transforming user parameters from the Container Cloud Ceph CR into Rook CRs and deploying a Ceph cluster using Rook.

  • Providing integration of the Ceph cluster with Kubernetes.

  • Providing data for OpenStack to integrate with the deployed Ceph cluster.

Ceph Status Controller

A Kubernetes controller that collects all valuable parameters from the current Ceph cluster, its daemons, and entities and exposes them into the KaaSCephCluster status. Ceph Status Controller operations include:

  • Collecting all statuses from a Ceph cluster and corresponding Rook CRs.

  • Collecting additional information on the health of Ceph daemons.

  • Providing information to the status section of the KaaSCephCluster CR.

Ceph Request Controller

A Kubernetes controller that obtains the parameters from Container Cloud through a CR and handles Ceph OSD lifecycle management (LCM) operations. It allows for a safe Ceph OSD removal from the Ceph cluster. Ceph Request Controller operations include:

  • Providing an ability to perform Ceph OSD LCM operations.

  • Obtaining specific CRs to remove Ceph OSDs and executing them.

  • Pausing the regular Ceph Controller reconcile until all requests are completed.

A typical Ceph cluster consists of the following components:

  • Ceph Monitors - three or, in rare cases, five Ceph Monitors.

  • Ceph Managers:

    • Before Container Cloud 2.22.0, one Ceph Manager.

    • Since Container Cloud 2.22.0, two Ceph Managers.

  • RADOS Gateway services - Mirantis recommends having three or more RADOS Gateway instances for HA.

  • Ceph OSDs - the number of Ceph OSDs may vary according to the deployment needs.

    Warning

    • A Ceph cluster with 3 Ceph nodes does not provide hardware fault tolerance and is not eligible for recovery operations, such as a disk or an entire Ceph node replacement.

    • A Ceph cluster uses the replication factor that equals 3. If the number of Ceph OSDs is less than 3, a Ceph cluster moves to the degraded state with the write operations restriction until the number of alive Ceph OSDs equals the replication factor again.

The placement of Ceph Monitors and Ceph Managers is defined in the KaaSCephCluster CR.

The following diagram illustrates the way a Ceph cluster is deployed in Container Cloud:

_images/ceph-deployment.png

The following diagram illustrates the processes within a deployed Ceph cluster:

_images/ceph-data-flow.png
Limitations

A Ceph cluster configuration in Mirantis Container Cloud has the following limitations, among others:

  • Only one Ceph Controller per managed cluster and only one Ceph cluster per Ceph Controller are supported.

  • The replication size for any Ceph pool must be set to more than 1.

  • All CRUSH rules must have the same failure_domain.

  • Only one CRUSH tree per cluster. The separation of devices per Ceph pool is supported through device classes with only one pool of each type for a device class.

  • Only the following types of CRUSH buckets are supported:

    • topology.kubernetes.io/region

    • topology.kubernetes.io/zone

    • topology.rook.io/datacenter

    • topology.rook.io/room

    • topology.rook.io/pod

    • topology.rook.io/pdu

    • topology.rook.io/row

    • topology.rook.io/rack

    • topology.rook.io/chassis

  • Only IPv4 is supported.

  • If two or more Ceph OSDs are located on the same device, there must be no dedicated WAL or DB for this class.

  • Only a full collocation or dedicated WAL and DB configurations are supported.

  • The minimum size of any defined Ceph OSD device is 5 GB.

  • Lifted since Container Cloud 2.24.2 (Cluster releases 14.0.1 and 15.0.1). Ceph cluster does not support removable devices (with hotplug enabled) for deploying Ceph OSDs.

  • Ceph OSDs support only raw disks as data devices, meaning that no dm or lvm devices are allowed.

  • When adding a Ceph node with the Ceph Monitor role, if any issues occur with the Ceph Monitor, rook-ceph removes it and adds a new Ceph Monitor instead, named using the next alphabetic character in order. Therefore, the Ceph Monitor names may not follow the alphabetical order. For example, a, b, d, instead of a, b, c.

  • Reducing the number of Ceph Monitors is not supported and causes the Ceph Monitor daemons removal from random nodes.

  • Removal of the mgr role in the nodes section of the KaaSCephCluster CR does not remove Ceph Managers. To remove a Ceph Manager from a node, remove it from the nodes spec and manually delete the mgr pod in the Rook namespace.

  • Lifted since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.10). Ceph does not support allocation of Ceph RGW pods on nodes where the Federal Information Processing Standard (FIPS) mode is enabled.

Addressing storage devices

There are several formats to use when specifying and addressing storage devices of a Ceph cluster. The default and recommended one is the /dev/disk/by-id format. This format is reliable and unaffected by the disk controller actions, such as device name shuffling or /dev/disk/by-path recalculating.

Difference between by-id, name, and by-path formats

In most cases, the storage device /dev/disk/by-id format is based on a disk serial number, which is unique for each disk. A by-id symlink is created by the udev rules in the following format, where <BusID> is the ID of the bus to which the disk is attached and <DiskSerialNumber> stands for the unique disk serial number:

/dev/disk/by-id/<BusID>-<DiskSerialNumber>

Typical by-id symlinks for storage devices look as follows:

/dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
/dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
/dev/disk/by-id/ata-WDC_WD4003FZEX-00Z4SA0_WD-WMC5D0D9DMEH

In the example above, symlinks contain the following IDs:

  • Bus IDs: nvme, scsi-SATA and ata

  • Disk serial numbers: SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543, HGST_HUS724040AL_PN1334PEHN18ZS and WDC_WD4003FZEX-00Z4SA0_WD-WMC5D0D9DMEH.

An exception to this rule is the wwn by-id symlinks, which are programmatically generated at boot. They are not solely based on disk serial numbers but also include other node information. This can lead to the wwn being recalculated when the node reboots. As a result, this symlink type cannot guarantee a persistent disk identifier and should not be used as a stable storage device symlink in a Ceph cluster.

The storage device name and by-path formats cannot be considered persistent because the sequence in which block devices are added during boot is semi-arbitrary. This means that block device names, for example, nvme0n1 and sdc, are assigned to physical disks during discovery, which may vary inconsistently from the previous node state. The same inconsistency applies to by-path symlinks, as they rely on the shortest physical path to the device at boot and may differ from the previous node state.

Therefore, Mirantis highly recommends using storage device by-id symlinks that contain disk serial numbers. This approach enables you to use a persistent device identifier addressed in the Ceph cluster specification.

Example KaaSCephCluster with device by-id identifiers

Below is an example KaaSCephCluster custom resource using the /dev/disk/by-id format for storage devices specification:

Note

Since Container Cloud 2.25.0, you can use the fullPath field for the by-id symlinks. For earlier product versions, use the name field instead.

 apiVersion: kaas.mirantis.com/v1alpha1
 kind: KaaSCephCluster
 metadata:
   name: ceph-cluster-managed-cluster
   namespace: managed-ns
 spec:
   cephClusterSpec:
     nodes:
       # Add the exact ``nodes`` names.
       # Obtain the name from the "get machine" list.
       cz812-managed-cluster-storage-worker-noefi-58spl:
         roles:
         - mgr
         - mon
       # All disk configuration must be reflected in ``status.providerStatus.hardware.storage`` of the ``Machine`` object
         storageDevices:
         - config:
             deviceClass: ssd
           fullPath: /dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231440912
       cz813-managed-cluster-storage-worker-noefi-lr4k4:
         roles:
         - mgr
         - mon
         storageDevices:
         - config:
             deviceClass: nvme
           fullPath: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
       cz814-managed-cluster-storage-worker-noefi-z2m67:
         roles:
         - mgr
         - mon
         storageDevices:
         - config:
             deviceClass: nvme
           fullPath: /dev/disk/by-id/nvme-SAMSUNG_ML1EB3T8HMLA-00007_S46FNY1R130423
     pools:
     - default: true
       deviceClass: ssd
       name: kubernetes
       replicated:
         size: 3
       role: kubernetes
   k8sCluster:
     name: managed-cluster
     namespace: managed-ns
Extended hardware configuration

Mirantis Container Cloud provides APIs that enable you to define hardware configurations that extend the reference architecture:

  • Bare Metal Host Profile API

    Enables quick configuration of host boot and storage devices and assignment of custom configuration profiles to individual machines. See Create a custom bare metal host profile.

  • IP Address Management API

    Enables quick configuration of host network interfaces and IP addresses as well as setting up IP address ranges for automatic allocation. See Create L2 templates.

Typically, operations with the extended hardware configurations are available through the API and CLI, but not the web UI.

Automatic upgrade of a host operating system

To keep the operating system on a bare metal host up to date with the latest security updates, it requires periodic software package upgrades that may or may not require a host reboot.

Mirantis Container Cloud uses life cycle management tools to update the operating system packages on the bare metal hosts. Container Cloud may also trigger a restart of bare metal hosts to apply the updates.

In the management cluster of Container Cloud, software package upgrades and host restarts are applied automatically when a new Container Cloud version with available kernel or software package upgrades is released.

In managed clusters, package upgrades and host restarts are applied as part of the usual cluster upgrade using the Update cluster option in the Container Cloud web UI.

Operating system upgrade and host restart are applied to cluster nodes one by one. If Ceph is installed in the cluster, the Container Cloud orchestration securely pauses the Ceph OSDs on the node before restart, which prevents degradation of the storage service.

Caution

  • Depending on the cluster configuration, applying security updates and host restart can increase the update time for each node to up to 1 hour.

  • Cluster nodes are updated one by one. Therefore, for large clusters, the update may take several days to complete.

Built-in load balancing

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

The Mirantis Container Cloud managed clusters that are based on vSphere or bare metal use MetalLB for load balancing of services, and HAProxy with a VIP managed by the Virtual Router Redundancy Protocol (VRRP) through Keepalived for the Kubernetes API load balancer.

Kubernetes API load balancing

Every control plane node of each Kubernetes cluster runs the kube-api service in a container. This service provides a Kubernetes API endpoint. Every control plane node also runs the haproxy server that provides load balancing with backend health checking for all kube-api endpoints as backends.

The default load balancing method is least_conn. With this method, a request is sent to the server with the least number of active connections. The default load balancing method cannot be changed using the Container Cloud API.

Only one of the control plane nodes at any given time serves as a front end for Kubernetes API. To ensure this, the Kubernetes clients use a virtual IP (VIP) address for accessing Kubernetes API. This VIP is assigned to one node at a time using VRRP. Keepalived running on each control plane node provides health checking and failover of the VIP.

Keepalived is configured in multicast mode.

Note

The use of a VIP address for load balancing of the Kubernetes API requires that all control plane nodes of a Kubernetes cluster are connected to a shared L2 segment. This limitation prevents installing full L3 topologies where control plane nodes are split between different L2 segments and L3 networks.

Caution

External load balancers for services are not supported by the current version of the Container Cloud vSphere provider. The built-in load balancing described in this section is the only supported option and cannot be disabled.

Services load balancing

The services provided by the Kubernetes clusters, including Container Cloud and user services, are balanced by MetalLB. The metallb-speaker service runs on every worker node in the cluster and handles connections to the service IP addresses.

MetalLB runs in the MAC-based (L2) mode. This means that all control plane nodes must be connected to a shared L2 segment. This limitation does not allow installing full L3 cluster topologies.
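To illustrate the L2 mode, the following sketch shows the upstream MetalLB resources that implement an address pool announced over L2 (ARP). In Container Cloud, the MetalLB configuration is managed through the product API rather than edited directly, so treat this as a generic MetalLB example only:

 apiVersion: metallb.io/v1beta1
 kind: IPAddressPool
 metadata:
   name: services-pool
   namespace: metallb-system
 spec:
   addresses:
   - 10.0.1.100-10.0.1.120      # range for LoadBalancer service IPs
 ---
 apiVersion: metallb.io/v1beta1
 kind: L2Advertisement
 metadata:
   name: services-l2
   namespace: metallb-system
 spec:
   ipAddressPools:
   - services-pool              # announce this pool over L2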

Caution

External load balancers for services are not supported by the current version of the Container Cloud vSphere provider. The built-in load balancing described in this section is the only supported option and cannot be disabled.

VMware vSphere network objects and IPAM recommendations

Warning

This section only applies to Container Cloud 2.27.2 (Cluster release 16.2.2) or earlier versions. Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

The VMware vSphere provider of Mirantis Container Cloud supports the following types of vSphere network objects:

  • Virtual network

    A network of virtual machines running on a hypervisor(s) that are logically connected to each other so that they can exchange data. Virtual machines can be connected to virtual networks that you create when you add a network.

  • Distributed port group

    A port group associated with a vSphere distributed switch that specifies port configuration options for each member port. Distributed port groups define how connection is established through the vSphere distributed switch to the network.

A Container Cloud cluster can be deployed using one of these network objects with or without a DHCP server in the network:

  • Non-DHCP

    Container Cloud uses the IPAM service to manage IP address assignment to machines. You must provide additional network parameters, such as CIDR, gateway, IP ranges, and nameservers. Container Cloud processes this data into the cloud-init metadata and passes it to machines during their bootstrap.

  • DHCP

    Container Cloud relies on a DHCP server to assign IP addresses to virtual machines.

Mirantis recommends using IP address management (IPAM) for cluster machines provided by Container Cloud. IPAM must be enabled for deployment in the non-DHCP vSphere networks. However, Mirantis recommends enabling IPAM in the DHCP-based networks as well. In this case, the dedicated IPAM range should not intersect with the IP range used in the DHCP server configuration for the provided vSphere network. Such a configuration prevents issues with accidental IP address changes for machines. For the issue details, see vSphere troubleshooting.

Note

To obtain IPAM parameters for the selected vSphere network, contact your vSphere administrator who provides you with IP ranges dedicated to your environment only.

The following parameters are required to enable IPAM:

  • Network CIDR.

  • Network gateway address.

  • Minimum 1 DNS server.

  • IP address include range to be allocated for cluster machines. Make sure that this range is not part of the DHCP range if the network has a DHCP server.

    Minimum number of addresses in the range:

    • 3 IPs for management cluster

    • 3+N IPs for a managed cluster, where N is the number of worker nodes

  • Optional. IP address exclude range that is the list of IPs not to be assigned to machines from the include ranges.

A dedicated Container Cloud network must not contain any virtual machines with the keepalived instance running inside them as this may lead to a vrouter_id conflict. By default, the Container Cloud management cluster is deployed with vrouter_id set to 1. Managed clusters are deployed with vrouter_id values starting from 2 and higher.

Kubernetes lifecycle management

The Kubernetes lifecycle management (LCM) engine in Mirantis Container Cloud consists of the following components:

LCM Controller

Responsible for all LCM operations. Consumes the LCMCluster object and orchestrates actions through LCM Agent.

LCM Agent

Runs on the target host. Executes Ansible playbooks in headless mode. Does not run on attached MKE clusters that are not originally deployed by Container Cloud.

Helm Controller

Responsible for the Helm charts life cycle. Helm Controller is installed by a cloud provider as a Helm v3 chart.

The Kubernetes LCM components handle the following custom resources:

  • LCMCluster

  • LCMMachine

  • HelmBundle

The following diagram illustrates handling of the LCM custom resources by the Kubernetes LCM components. On a managed cluster, apiserver handles multiple Kubernetes objects, for example, deployments, nodes, RBAC, and so on.

_images/lcm-components.png
LCM custom resources

The Kubernetes LCM components handle the following custom resources (CRs):

  • LCMMachine

  • LCMCluster

  • HelmBundle

LCMMachine

Describes a machine that is located on a cluster. It contains the machine type (control or worker) and StateItems that correspond to Ansible playbooks and miscellaneous actions, for example, downloading a file or executing a shell command. LCMMachine reflects the current state of the machine, for example, a node IP address, and each StateItem through its status. Multiple LCMMachine CRs can correspond to a single cluster.
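A heavily abbreviated, hypothetical LCMMachine of this shape may help visualize the relation between its spec and status; the field names are illustrative and differ between product versions, so do not use this as an API reference:

 apiVersion: lcm.mirantis.com/v1alpha1
 kind: LCMMachine
 metadata:
   name: demo-machine-0
   namespace: demo-project
 spec:
   clusterName: demo-cluster
   type: control                # control or worker
 status:
   # Reported by LCM Agent running on the machine (illustrative fields)
   addresses:
   - 10.0.0.101
   state: Ready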

LCMCluster

Describes a managed cluster. In its spec, LCMCluster contains a set of StateItems for each type of LCMMachine, which describe the actions that must be performed to deploy the cluster. LCMCluster is created by the provider, using machineTypes of the Release object. The status field of LCMCluster reflects the status of the cluster, for example, the number of ready or requested nodes.

HelmBundle

Wrapper for Helm charts that is handled by Helm Controller. HelmBundle tracks what Helm charts must be installed on a managed cluster.
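
The following simplified sketch illustrates how these resources relate to each other. The API version and field layout are assumptions derived from the descriptions above, not the authoritative schema; refer to the product API documentation for the exact format:

  # Illustrative only: simplified view of the LCM custom resources
  apiVersion: lcm.mirantis.com/v1alpha1   # assumed API version
  kind: LCMCluster
  metadata:
    name: demo-cluster
  spec:
    machineTypes:        # StateItems templates per machine type (assumed layout)
      control: []        # actions for manager nodes
      worker: []         # actions for worker nodes
  ---
  apiVersion: lcm.mirantis.com/v1alpha1   # assumed API version
  kind: LCMMachine
  metadata:
    name: demo-cluster-worker-0
  spec:
    type: worker         # control or worker
    stateItems: []       # Ansible playbooks and miscellaneous actions (assumed field name)
  status:
    addresses: []        # node IP address reported by LCM Agent (assumed field name)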

LCM Controller

LCM Controller runs on the management and regional cluster and orchestrates the LCMMachine objects according to their type and their LCMCluster object.

Once the LCMCluster and LCMMachine objects are created, LCM Controller starts monitoring them to modify the spec fields and update the status fields of the LCMMachine objects when required. The status field of LCMMachine is updated by LCM Agent running on a node of a management, regional, or managed cluster.

Each LCMMachine has the following lifecycle states:

  1. Uninitialized - the machine is not yet assigned to an LCMCluster.

  2. Pending - the agent reports a node IP address and host name.

  3. Prepare - the machine executes StateItems that correspond to the prepare phase. This phase usually involves downloading the necessary archives and packages.

  4. Deploy - the machine executes StateItems that correspond to the deploy phase, that is, becoming a Mirantis Kubernetes Engine (MKE) node.

  5. Ready - the machine is deployed.

  6. Upgrade - the machine is being upgraded to the new MKE version.

  7. Reconfigure - the machine executes StateItems that correspond to the reconfigure phase. The machine configuration is being updated without affecting workloads running on the machine.

The templates for StateItems are stored in the machineTypes field of an LCMCluster object, with separate lists for the MKE manager and worker nodes. Each StateItem has the execution phase field for a management, regional, and managed cluster:

  1. The prepare phase is executed for all machines for which it was not executed yet. This phase comprises downloading the files necessary for the cluster deployment, installing the required packages, and so on.

  2. During the deploy phase, a node is added to the cluster. LCM Controller applies the deploy phase to the nodes in the following order:

    1. The first manager node is deployed.

    2. The remaining manager nodes are deployed one by one and the worker nodes are deployed in batches (by default, up to 50 worker nodes at the same time).

LCM Controller deploys and upgrades a Mirantis Container Cloud cluster by setting the StateItems of LCMMachine objects following the corresponding StateItems phases described above. The Container Cloud cluster upgrade process follows the same logic as a new deployment, that is, applying a new set of StateItems to the LCMMachines after updating the LCMCluster object. However, if an existing worker node is being upgraded, LCM Controller performs draining and cordoning on this node honoring the Pod Disruption Budgets. This operation prevents unexpected disruptions of the workloads.

LCM Agent

LCM Agent handles a single machine that belongs to a management or managed cluster. It runs on the machine operating system but communicates with apiserver of the management cluster. LCM Agent is deployed as a systemd unit using cloud-init. LCM Agent has a built-in self-upgrade mechanism.

LCM Agent monitors the spec of a particular LCMMachine object to reconcile the machine state with the object StateItems and update the LCMMachine status accordingly. The actions that LCM Agent performs while handling the StateItems are as follows:

  • Download configuration files

  • Run shell commands

  • Run Ansible playbooks in headless mode

LCM Agent provides the IP address and host name of the machine for the LCMMachine status parameter.

Helm Controller

Helm Controller is used by Mirantis Container Cloud to handle the core add-ons of management and managed clusters, such as StackLight, as well as application add-ons, such as the OpenStack components.

Helm Controller is installed as a separate Helm v3 chart by the Container Cloud provider. Its Pods are created using a Kubernetes Deployment.

The Helm release information is stored in the KaaSRelease object for the management clusters and in the ClusterRelease object for all types of the Container Cloud clusters. These objects are used by the Container Cloud provider. The provider combines the information from the ClusterRelease object with the Container Cloud API Cluster spec, in which the operator can specify the Helm release name and charts to use. Based on the information from the Cluster providerSpec parameter and its ClusterRelease object, the cluster actuator generates the LCMCluster object, which is further handled by LCM Controller, and the HelmBundle object, which is handled by Helm Controller. HelmBundle must have the same name as the LCMCluster object for the cluster that HelmBundle applies to.

Although a cluster actuator can only create a single HelmBundle per cluster, Helm Controller can handle multiple HelmBundle objects per cluster.

Helm Controller handles the HelmBundle objects and reconciles them with the state of Helm in its cluster.

Helm Controller can also be used by the management cluster with corresponding HelmBundle objects created as part of the initial management cluster setup.
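
As an illustration of the naming relationship described above, a HelmBundle-like object could look as follows. The API version and field names are assumptions based on this description, not the authoritative HelmBundle schema:

  # Illustrative only: field names are assumptions, not the authoritative HelmBundle schema
  apiVersion: lcm.mirantis.com/v1alpha1   # assumed API version
  kind: HelmBundle
  metadata:
    name: demo-cluster      # must match the LCMCluster name of the target cluster
  spec:
    releases:               # Helm charts to install on the cluster (assumed field name)
      - name: stacklight
        values: {}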

Identity and access management

Identity and access management (IAM) provides a central point of users and permissions management of the Mirantis Container Cloud cluster resources in a granular and unified manner. Also, IAM provides infrastructure for single sign-on user experience across all Container Cloud web portals.

IAM for Container Cloud consists of the following components:

Keycloak
  • Provides the OpenID Connect endpoint

  • Integrates with an external identity provider (IdP), for example, existing LDAP or Google Open Authorization (OAuth)

  • Stores roles mapping for users

IAM Controller
  • Provides IAM API with data about Container Cloud projects

  • Handles all role-based access control (RBAC) components in Kubernetes API

IAM API

Provides an abstraction API for creating user scopes and roles

External identity provider integration

To be consistent and keep the integrity of a user database and user permissions, in Mirantis Container Cloud, IAM stores the user identity information internally. However, in real deployments, an identity provider usually already exists.

Out of the box, in Container Cloud, IAM supports integration with LDAP and Google Open Authorization (OAuth). If LDAP is configured as an external identity provider, IAM performs one-way synchronization by mapping attributes according to configuration.

In the case of the Google Open Authorization (OAuth) integration, the user is automatically registered and their credentials are stored in the internal database according to the user template configuration. The Google OAuth registration workflow is as follows:

  1. The user requests a Container Cloud web UI resource.

  2. The user is redirected to the IAM login page and logs in using the Log in with Google account option.

  3. IAM creates a new user with the default access rights that are defined in the user template configuration.

  4. The user can access the Container Cloud web UI resource.

The following diagram illustrates the external IdP integration to IAM:

_images/iam-ext-idp.png

You can configure simultaneous integration with both external IdPs with the user identity matching feature enabled.

Authentication and authorization

Mirantis IAM uses the OpenID Connect (OIDC) protocol for handling authentication.

Implementation flow

Mirantis IAM acts as an OpenID Connect (OIDC) provider: it issues tokens and exposes discovery endpoints.

The credentials can be handled by IAM itself or delegated to an external identity provider (IdP).

The issued JSON Web Token (JWT) is sufficient to perform operations across Mirantis Container Cloud according to the scope and role defined in it. Mirantis recommends using asymmetric cryptography for token signing (RS256) to minimize the dependency between IAM and managed components.

When Container Cloud calls Mirantis Kubernetes Engine (MKE), it uses a JWT issued by Keycloak on behalf of the end user. MKE, in its turn, verifies whether the JWT is issued by Keycloak. If the user retrieved from the token does not exist in the MKE database, the user is automatically created in the MKE database based on the information from the token.

The authorization implementation is out of the scope of IAM in Container Cloud. This functionality is delegated to the component level. IAM interacts with a Container Cloud component using the OIDC token, the content of which is processed by the component itself to enforce the required authorization. Such an approach enables any underlying authorization that does not depend on IAM while still providing a unified user experience across all Container Cloud components.

Kubernetes CLI authentication flow

The following diagram illustrates the Kubernetes CLI authentication flow. The authentication flow for Helm and other Kubernetes-oriented CLI utilities is identical to the Kubernetes CLI flow, but JSON Web Tokens (JWT) must be pre-provisioned.

_images/iam-authn-k8s.png
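
As an illustration, a pre-provisioned JWT can be consumed by kubectl through the standard Kubernetes OIDC authentication mechanism. The following kubeconfig fragment is a minimal sketch; the issuer URL, client ID, and tokens are placeholders, not Container Cloud defaults:

  # Illustrative kubeconfig fragment; all values are placeholders
  users:
  - name: container-cloud-user
    user:
      auth-provider:
        name: oidc
        config:
          idp-issuer-url: https://keycloak.example.com/auth/realms/iam   # Keycloak realm URL (placeholder)
          client-id: kubernetes                                          # OIDC client ID (placeholder)
          id-token: <JWT issued by Keycloak>
          refresh-token: <refresh token>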

See also

IAM resources

Monitoring

Mirantis Container Cloud uses StackLight, the logging, monitoring, and alerting solution that provides a single pane of glass for cloud maintenance and day-to-day operations. StackLight offers critical insights into cloud health, including operational information about the components deployed in management and managed clusters.

StackLight is based on Prometheus, an open-source monitoring solution and a time series database.

Deployment architecture

Mirantis Container Cloud deploys the StackLight stack as a release of a Helm chart that contains the helm-controller and helmbundles.lcm.mirantis.com (HelmBundle) custom resources. The StackLight HelmBundle consists of a set of Helm charts with the StackLight components that include:

StackLight components overview

StackLight component

Description

Alerta

Receives, consolidates, and deduplicates the alerts sent by Alertmanager and visually represents them through a simple web UI. Using the Alerta web UI, you can view the most recent or watched alerts, and group and filter them.

Alertmanager

Handles the alerts sent by client applications such as Prometheus, deduplicates, groups, and routes alerts to receiver integrations. Using the Alertmanager web UI, you can view the most recent fired alerts, silence them, or view the Alertmanager configuration.

Elasticsearch Curator

Maintains the data (indexes) in OpenSearch by performing such operations as creating, closing, or opening an index as well as deleting a snapshot. Also, manages the data retention policy in OpenSearch.

Elasticsearch Exporter Compatible with OpenSearch

The Prometheus exporter that gathers internal OpenSearch metrics.

Grafana

Builds and visually represents metric graphs based on time series databases. Grafana supports querying of Prometheus using the PromQL language.

Database backends

StackLight uses PostgreSQL for Alerta and Grafana. PostgreSQL reduces the data storage fragmentation while enabling high availability. High availability is achieved using Patroni, the PostgreSQL cluster manager that monitors for node failures and manages failover of the primary node. StackLight also uses Patroni to manage major version upgrades of PostgreSQL clusters, which allows leveraging the database engine functionality and improvements as they are introduced upstream in new releases, maintaining functional continuity without version lock-in.

Logging stack

Responsible for collecting, processing, and persisting logs and Kubernetes events. By default, when deploying through the Container Cloud web UI, only the metrics stack is enabled on managed clusters. To enable StackLight to gather managed cluster logs, enable the logging stack during deployment. On management clusters, the logging stack is enabled by default. The logging stack components include:

  • OpenSearch, which stores logs and notifications.

  • Fluentd-logs, which collects logs, sends them to OpenSearch, generates metrics based on analysis of incoming log entries, and exposes these metrics to Prometheus.

  • OpenSearch Dashboards, which provides real-time visualization of the data stored in OpenSearch and enables you to detect issues.

  • Metricbeat, which collects Kubernetes events and sends them to OpenSearch for storage.

  • Prometheus-es-exporter, which presents the OpenSearch data as Prometheus metrics by periodically sending configured queries to the OpenSearch cluster and exposing the results to a scrapable HTTP endpoint like other Prometheus targets.

Note

The logging mechanism performance depends on the cluster log load. In case of a high load, you may need to increase the default resource requests and limits for fluentdLogs. For details, see StackLight configuration parameters: Resource limits.

Metric collector

Collects telemetry data (CPU or memory usage, number of active alerts, and so on) from Prometheus and sends the data to centralized cloud storage for further processing and analysis. Metric collector runs on the management cluster.

Note

This component is designated for internal StackLight use only.

Prometheus

Gathers metrics. Automatically discovers and monitors the endpoints. Using the Prometheus web UI, you can view simple visualizations and debug. By default, the Prometheus database stores metrics for the past 15 days or up to 15 GB of data, whichever limit is reached first.

Prometheus Blackbox Exporter

Allows monitoring endpoints over HTTP, HTTPS, DNS, TCP, and ICMP.

Prometheus-es-exporter

Presents the OpenSearch data as Prometheus metrics by periodically sending configured queries to the OpenSearch cluster and exposing the results to a scrapable HTTP endpoint like other Prometheus targets.

Prometheus Node Exporter

Gathers hardware and operating system metrics exposed by the kernel.

Prometheus Relay

Adds a proxy layer to Prometheus to merge the results from underlay Prometheus servers and prevent gaps in case some data is missing on some servers. Prometheus Relay is available only in the HA StackLight mode.

Reference Application Removed in 2.28.3 (16.3.3)

Enables workload monitoring on non-MOSK managed clusters. Mimics a classical microservice application and provides metrics that describe the likely behavior of user workloads.

Note

For the feature support on MOSK deployments, refer to MOSK documentation: Deploy your first cloud application using automation.

Salesforce notifier

Enables sending Alertmanager notifications to Salesforce to allow creating Salesforce cases and closing them once the alerts are resolved. Disabled by default.

Salesforce reporter

Queries Prometheus for the data about the amount of vCPU, vRAM, and vStorage used and available, combines the data, and sends it to Salesforce daily. Mirantis uses the collected data for further analysis and reports to improve the quality of customer support. Disabled by default.

Telegraf

Collects metrics from the system. Telegraf is plugin-driven and has the concept of two distinct sets of plugins: input plugins collect metrics from the system, services, or third-party APIs; output plugins write and expose metrics to various destinations.

The Telegraf agents used in Container Cloud include:

  • telegraf-ds-smart monitors SMART disks, and runs on both management and managed clusters.

  • telegraf-ironic monitors Ironic on the baremetal-based management clusters. The ironic input plugin collects and processes data from the Ironic HTTP API, while the http_response input plugin checks the Ironic HTTP API availability. As an output plugin, Telegraf uses prometheus to expose the collected data as a Prometheus target.

  • telegraf-docker-swarm gathers metrics from the Mirantis Container Runtime API about the Docker nodes, networks, and Swarm services. This is a Docker Telegraf input plugin with downstream additions.

Telemeter

Enables a multi-cluster view through a Grafana dashboard of the management cluster. Telemeter includes a Prometheus federation push server and clients to enable isolated Prometheus instances, which cannot be scraped from a central Prometheus instance, to push metrics to the central location.

The Telemeter services are distributed between the management cluster that hosts the Telemeter server and managed clusters that host the Telemeter client. The metrics from managed clusters are aggregated on management clusters.

Note

This component is designated for internal StackLight use only.

Every Helm chart contains a default values.yaml file. These default values are partially overridden by custom values defined in the StackLight Helm chart.

Before deploying a managed cluster, you can select the HA or non-HA StackLight architecture type. The non-HA mode is set by default. On management clusters, StackLight is deployed in the HA mode only. The following table lists the differences between the HA and non-HA modes:

StackLight database modes

Non-HA StackLight mode default

HA StackLight mode

  • One Prometheus instance

  • One Alertmanager instance Since 2.24.0 and 2.24.2 for MOSK 23.2

  • One OpenSearch instance

  • One PostgreSQL instance

  • One iam-proxy instance

One persistent volume is provided for storing data. In case of a service or node failure, a new pod is redeployed and the volume is reattached to provide the existing data. Such a setup has a reduced hardware footprint but provides lower performance.

  • Two Prometheus instances

  • Two Alertmanager instances

  • Three OpenSearch instances

  • Three PostgreSQL instances

  • Two iam-proxy instances Since 2.23.0 and 2.23.1 for MOSK 23.1

Local Volume Provisioner is used to provide local host storage. In case of a service or node failure, the traffic is automatically redirected to any other running Prometheus or OpenSearch server. For better performance, Mirantis recommends that you deploy StackLight in the HA mode. Two iam-proxy instances ensure access to HA components if one iam-proxy node fails.

Note

Before Container Cloud 2.24.0, Alertmanager has 2 replicas in the non-HA mode.

Depending on the Container Cloud cluster type and selected StackLight database mode, StackLight is deployed on the following number of nodes:

StackLight target nodes per cluster type and database mode

Cluster

StackLight database mode

Target nodes

Management

HA mode

All Kubernetes master nodes

Managed

Non-HA mode

  • All nodes with the stacklight label.

  • If no nodes have the stacklight label, StackLight is spread across all worker nodes. The minimal requirement is at least 1 worker node.

HA mode

All nodes with the stacklight label. The minimal requirement is 3 nodes with the stacklight label. Otherwise, StackLight deployment does not start.

Authentication flow

StackLight provides five web UIs including Prometheus, Alertmanager, Alerta, OpenSearch Dashboards, and Grafana. Access to StackLight web UIs is protected by Keycloak-based Identity and access management (IAM). All web UIs except Alerta are exposed to IAM through the IAM proxy middleware. The Alerta configuration provides direct integration with IAM.

The following diagram illustrates accessing the IAM-proxied StackLight web UIs, for example, Prometheus web UI:

_images/sl-auth-iam-proxied.png

Authentication flow for the IAM-proxied StackLight web UIs:

  1. A user enters the public IP of a StackLight web UI, for example, Prometheus web UI.

  2. The public IP leads to IAM proxy, deployed as a Kubernetes LoadBalancer, which protects the Prometheus web UI.

  3. LoadBalancer routes the HTTP request to Kubernetes internal IAM proxy service endpoints, specified in the X-Forwarded-Proto or X-Forwarded-Host headers.

  4. The Keycloak login form opens (the login_url field in the IAM proxy configuration, which points to Keycloak realm) and the user enters the user name and password.

  5. Keycloak validates the user name and password.

  6. The user obtains access to the Prometheus web UI (the upstreams field in the IAM proxy configuration).

Note

  • The discovery URL is the URL of the IAM service.

  • The upstream URL is the hidden endpoint of a web UI (Prometheus web UI in the example above).
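
To illustrate the fields mentioned above, a simplified IAM proxy configuration fragment could look as follows. The layout and values are placeholders for illustration only and do not represent the exact configuration format:

  # Illustrative only: simplified IAM proxy configuration fragment
  discovery_url: https://keycloak.example.com/auth/realms/iam      # IAM (Keycloak) service
  login_url: https://keycloak.example.com/auth/realms/iam/protocol/openid-connect/auth
  upstreams:
    - https://prometheus-server.stacklight.svc                     # hidden endpoint of the web UI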

The following diagram illustrates accessing the Alerta web UI:

_images/sl-authentication-direct.png

Authentication flow for the Alerta web UI:

  1. A user enters the public IP of the Alerta web UI.

  2. The public IP leads to Alerta deployed as a Kubernetes LoadBalancer type.

  3. LoadBalancer routes the HTTP request to the Kubernetes internal Alerta service endpoint.

  4. The Keycloak login form opens (Alerta refers to the IAM realm) and the user enters the user name and password.

  5. Keycloak validates the user name and password.

  6. The user obtains access to the Alerta web UI.

Supported features

Using the Mirantis Container Cloud web UI, at the pre-deployment stage of a managed cluster, you can view, enable or disable, or tune the following available StackLight features:

  • StackLight HA mode.

  • Database retention size and time for Prometheus.

  • Tunable index retention period for OpenSearch.

  • Tunable PersistentVolumeClaim (PVC) size for Prometheus and OpenSearch set to 16 GB for Prometheus and 30 GB for OpenSearch by default. The PVC size must be logically aligned with the retention periods or sizes for these components.

  • Email and Slack receivers for the Alertmanager notifications.

  • Predefined set of dashboards.

  • Predefined set of alerts and capability to add new custom alerts for Prometheus in the following exemplary format:

    - alert: HighErrorRate
      expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
      for: 10m
      labels:
        severity: page
      annotations:
        summary: High request latency
    
Monitored components

StackLight measures, analyzes, and promptly reports failures that may occur in the following Mirantis Container Cloud components and their sub-components, if any:

  • Ceph

  • Ironic (Container Cloud bare-metal provider)

  • Kubernetes services:

    • Calico

    • etcd

    • Kubernetes cluster

    • Kubernetes containers

    • Kubernetes deployments

    • Kubernetes nodes

  • NGINX

  • Node hardware and operating system

  • PostgreSQL

  • StackLight:

    • Alertmanager

    • OpenSearch

    • Grafana

    • Prometheus

    • Prometheus Relay

    • Salesforce notifier

    • Telemeter

  • SSL certificates

  • Mirantis Kubernetes Engine (MKE)

    • Docker/Swarm metrics (through Telegraf)

    • Built-in MKE metrics

Storage-based log retention strategy

Available since 2.26.0 (17.1.0 and 16.1.0)

StackLight uses a storage-based log retention strategy that optimizes storage utilization and ensures effective data retention. The usable disk space is defined as 80% of the disk space allocated to the OpenSearch node and is split between the following data types:

  • 80% for system logs

  • 10% for audit logs

  • 5% for OpenStack notifications (applies only to MOSK clusters)

  • 5% for Kubernetes events

This approach ensures that storage resources are efficiently allocated based on the importance and volume of different data types.
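
For example, if 1000 GB of disk space is allocated to an OpenSearch node, about 800 GB is used for data retention: roughly 640 GB for system logs, 80 GB for audit logs, 40 GB for OpenStack notifications (on MOSK clusters), and 40 GB for Kubernetes events.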

The logging index management implies the following advantages:

  • Storage-based rollover mechanism

    The rollover mechanism for system and audit indices enforces shard size based on available storage, ensuring optimal resource utilization.

  • Consistent shard allocation

    The number of primary shards per index is set dynamically based on the cluster size, which improves search performance and facilitates ingestion for large clusters.

  • Minimal size of cluster state

    The logging-related part of the cluster state is minimal and uses static mappings, which are based on Elastic Common Schema (ECS) with slight deviations from the standard. Dynamic mapping in index templates is avoided to reduce overhead.

  • Storage compression

    The system and audit indices utilize the best_compression codec that minimizes the size of stored indices, resulting in significant storage savings of up to 50% on average.

  • No filter by logging level

    Because severity levels are not used evenly across Container Cloud components, logs of all severity levels are collected to avoid missing important low-severity logs while debugging a cluster. Filtering by tags is still available.

Outbound cluster metrics

The data collected and transmitted through an encrypted channel back to Mirantis provides our Customer Success Organization with information to better understand the operational usage patterns our customers are experiencing. It also provides product usage statistics that enable our product teams to enhance our products and services for our customers.

Mirantis collects the following statistics using configuration-collector:

Mirantis collects hardware information using the following metrics:

  • mcc_hw_machine_chassis

  • mcc_hw_machine_cpu_model

  • mcc_hw_machine_cpu_number

  • mcc_hw_machine_nics

  • mcc_hw_machine_ram

  • mcc_hw_machine_storage (storage devices and disk layout)

  • mcc_hw_machine_vendor

Mirantis collects the summary of all deployed Container Cloud configurations using the following objects, if any:

Note

The data is anonymized from all sensitive information, such as IDs, IP addresses, passwords, private keys, and so on.

  • Cluster

  • Machine

  • MachinePool

  • MCCUpgrade

  • BareMetalHost

  • BareMetalHostProfile

  • IPAMHost

  • IPAddr

  • KaaSCephCluster

  • L2Template

  • Subnet

Note

In the Cluster releases 17.0.0, 16.0.0, and 14.1.0, Mirantis does not collect any configuration summary in light of the configuration-collector refactoring.

The node-level resource data are broken down into three broad categories: Cluster, Node, and Namespace. The telemetry data tracks Allocatable, Capacity, Limits, Requests, and actual Usage of node-level resources.

Terms explanation

Term

Definition

Allocatable

On a Kubernetes Node, the amount of compute resources that are available for pods

Capacity

The total number of available resources regardless of current consumption

Limits

Constraints imposed by Administrators

Requests

The resources that a given container application is requesting

Usage

The actual usage or consumption of a given resource

The full list of the outbound data includes:

From bare metal management clusters only
  • hostos_module_usage Since 2.28.0 (17.3.0, 16.3.0)

From all Container Cloud managed clusters
  • cluster_alerts_firing Since 2.23.0 (11.7.0)

  • cluster_filesystem_size_bytes

  • cluster_filesystem_usage_bytes

  • cluster_filesystem_usage_ratio

  • cluster_master_nodes_total

  • cluster_nodes_total

  • cluster_persistentvolumeclaim_requests_storage_bytes

  • cluster_total_alerts_triggered

  • cluster_capacity_cpu_cores

  • cluster_capacity_memory_bytes

  • cluster_usage_cpu_cores

  • cluster_usage_memory_bytes

  • cluster_usage_per_capacity_cpu_ratio

  • cluster_usage_per_capacity_memory_ratio

  • cluster_worker_nodes_total

  • cluster_workload_pods_total Since 2.22.0 (11.6.0)

  • cluster_workload_containers_total Since 2.22.0 (11.6.0)

  • kaas_info

  • kaas_cluster_machines_ready_total

  • kaas_cluster_machines_requested_total

  • kaas_clusters

  • kaas_cluster_updating Since 2.21.0 (11.5.0, 7.11.0)

  • kaas_license_expiry

  • kaas_machines_ready

  • kaas_machines_requested

  • kubernetes_api_availability

  • mcc_cluster_update_plan_status Since 2.28.0 (17.3.0, 16.3.0) as TechPreview

  • mke_api_availability

  • mke_cluster_nodes_total

  • mke_cluster_containers_total

  • mke_cluster_vcpu_free

  • mke_cluster_vcpu_used

  • mke_cluster_vram_free

  • mke_cluster_vram_used

  • mke_cluster_vstorage_free

  • mke_cluster_vstorage_used

  • node_labels Since 2.24.0 (14.0.0)

From Mirantis OpenStack for Kubernetes (MOSK) clusters only
  • openstack_cinder_api_latency_90

  • openstack_cinder_api_latency_99

  • openstack_cinder_api_status Removed in MOSK 24.1

  • openstack_cinder_availability

  • openstack_cinder_volumes_total

  • openstack_glance_api_status

  • openstack_glance_availability

  • openstack_glance_images_total

  • openstack_glance_snapshots_total Removed in MOSK 24.1

  • openstack_heat_availability

  • openstack_heat_stacks_total

  • openstack_host_aggregate_instances Removed in MOSK 23.2

  • openstack_host_aggregate_memory_used_ratio Removed in MOSK 23.2

  • openstack_host_aggregate_memory_utilisation_ratio Removed in MOSK 23.2

  • openstack_host_aggregate_cpu_utilisation_ratio Removed in MOSK 23.2

  • openstack_host_aggregate_vcpu_used_ratio Removed in MOSK 23.2

  • openstack_instance_availability

  • openstack_instance_create_end

  • openstack_instance_create_error

  • openstack_instance_create_start

  • openstack_keystone_api_latency_90

  • openstack_keystone_api_latency_99

  • openstack_keystone_api_status Removed in MOSK 24.1

  • openstack_keystone_availability

  • openstack_keystone_tenants_total

  • openstack_keystone_users_total

  • openstack_kpi_provisioning

  • openstack_lbaas_availability

  • openstack_mysql_flow_control

  • openstack_neutron_api_latency_90

  • openstack_neutron_api_latency_99

  • openstack_neutron_api_status Removed in MOSK 24.1

  • openstack_neutron_availability

  • openstack_neutron_lbaas_loadbalancers_total

  • openstack_neutron_networks_total

  • openstack_neutron_ports_total

  • openstack_neutron_routers_total

  • openstack_neutron_subnets_total

  • openstack_nova_all_compute_cpu_utilisation

  • openstack_nova_all_compute_mem_utilisation

  • openstack_nova_all_computes_total

  • openstack_nova_all_vcpus_total

  • openstack_nova_all_used_vcpus_total

  • openstack_nova_all_ram_total_gb

  • openstack_nova_all_used_ram_total_gb

  • openstack_nova_all_disk_total_gb

  • openstack_nova_all_used_disk_total_gb

  • openstack_nova_api_status Removed in MOSK 24.1

  • openstack_nova_availability

  • openstack_nova_compute_cpu_utilisation

  • openstack_nova_compute_mem_utilisation

  • openstack_nova_computes_total

  • openstack_nova_disk_total_gb

  • openstack_nova_instances_active_total

  • openstack_nova_ram_total_gb

  • openstack_nova_used_disk_total_gb

  • openstack_nova_used_ram_total_gb

  • openstack_nova_used_vcpus_total

  • openstack_nova_vcpus_total

  • openstack_public_api_status Since MOSK 22.5

  • openstack_quota_instances

  • openstack_quota_ram_gb

  • openstack_quota_vcpus

  • openstack_quota_volume_storage_gb

  • openstack_rmq_message_deriv

  • openstack_usage_instances

  • openstack_usage_ram_gb

  • openstack_usage_vcpus

  • openstack_usage_volume_storage_gb

  • osdpl_aodh_alarms Since MOSK 23.3

  • osdpl_api_success Since MOSK 24.1

  • osdpl_cinder_zone_volumes Since MOSK 23.3

  • osdpl_manila_shares Since MOSK 24.2

  • osdpl_masakari_hosts Since MOSK 24.2

  • osdpl_neutron_availability_zone_info Since MOSK 23.3

  • osdpl_neutron_zone_routers Since MOSK 23.3

  • osdpl_nova_aggregate_hosts Since MOSK 23.3

  • osdpl_nova_audit_orphaned_allocations Since MOSK 24.3

  • osdpl_nova_availability_zone_info Since MOSK 23.3

  • osdpl_nova_availability_zone_instances Since MOSK 23.3

  • osdpl_nova_availability_zone_hosts Since MOSK 23.3

  • osdpl_version_info Since MOSK 23.3

  • tf_operator_info Since MOSK 23.3 for Tungsten Fabric

StackLight proxy

StackLight components, which require external access, automatically use the same proxy that is configured for Mirantis Container Cloud clusters. Therefore, you only need to configure proxy during deployment of your management or managed clusters. No additional actions are required to set up proxy for StackLight. For more details about implementation of proxy support in Container Cloud, see Proxy and cache support.

Note

Proxy handles only the HTTP and HTTPS traffic. Therefore, for clusters with limited or no Internet access, it is not possible to set up Alertmanager email notifications, which use SMTP, when proxy is used.

Proxy is used for the following StackLight components:

Component

Cluster type

Usage

Alertmanager

Any

As a default http_config for all HTTP-based receivers except the predefined HTTP-alerta and HTTP-salesforce. For these receivers, http_config is overridden on the receiver level.

Metric Collector

Management

To send outbound cluster metrics to Mirantis.

Salesforce notifier

Any

To send notifications to the Salesforce instance.

Salesforce reporter

Any

To send metric reports to the Salesforce instance.

Reference Application for workload monitoring

Unsupported and removed in 2.28.3 (16.3.3) Available since 2.21.0 (11.5.0) for non-MOSK managed clusters

Note

For the feature support on MOSK deployments, refer to MOSK documentation: Deploy your first cloud application using automation.

Reference Application is a small microservice application that enables workload monitoring on non-MOSK managed clusters. It mimics a classical microservice application and provides metrics that describe the likely behavior of user workloads.

The application consists of the following API and database services that allow putting simple records into the database through the API and retrieving them:

Reference Application API

Runs on StackLight nodes and provides API access to the database. Runs three API instances for high availability.

PostgreSQL Since Container Cloud 2.22.0

Runs on worker nodes and stores the data on an attached PersistentVolumeClaim (PVC). Runs three database instances for high availability.

Note

Before version 2.22.0, Container Cloud used MariaDB as the database management system instead of PostgreSQL.

StackLight queries the API, measuring the response time of each query. No caching is done, so each API request must go to the database, which allows verifying the availability of a stateful workload on the cluster.

Reference Application requires the following resources on top of the main product requirements:

  • Up to 1 GiB of RAM per cluster

  • Up to 3 GiB of storage per cluster

The feature is disabled by default and can be enabled using the StackLight configuration manifest as described in StackLight configuration parameters: Reference Application.

Hardware and system requirements

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

Using Mirantis Container Cloud, you can deploy a Mirantis Kubernetes Engine (MKE) cluster on bare metal, OpenStack, or VMware vSphere cloud providers. Each cloud provider requires corresponding resources.

Requirements for a bootstrap node

A bootstrap node is necessary only to deploy the management cluster. When the bootstrap is complete, the bootstrap node can be redeployed and its resources can be reused for the managed cluster workloads.

The minimum reference system requirements of a baremetal-based bootstrap seed node are described in System requirements for the seed node. The minimum reference system requirements of a bootstrap node for other supported Container Cloud providers are as follows:

  • Any local machine running Ubuntu 22.04 with access to the provider API and the following configuration:

    • 2 vCPUs

    • 4 GB of RAM

    • 5 GB of available storage

    • Docker version currently available for Ubuntu 22.04

  • Internet access for downloading all required artifacts

Note

For the vSphere cloud provider, you can also use RHEL 8.7 with the same system requirements as for Ubuntu.

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

Requirements for a baremetal-based cluster

If you use a firewall or proxy, make sure that the bootstrap and management clusters have access to the following IP ranges and domain names required for the Container Cloud content delivery network and alerting:

  • IP ranges:

  • Domain names:

    • mirror.mirantis.com and repos.mirantis.com for packages

    • binary.mirantis.com for binaries and Helm charts

    • mirantis.azurecr.io and *.blob.core.windows.net for Docker images

    • mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry (port 443 if proxy is enabled)

    • mirantis.my.salesforce.com and login.salesforce.com for Salesforce alerts

Note

  • Access to Salesforce is required from any Container Cloud cluster type.

  • If any additional Alertmanager notification receiver is enabled, for example, Slack, its endpoint must also be accessible from the cluster.

Caution

Regional clusters are unsupported since Container Cloud 2.25.0. Mirantis does not perform functional integration testing of the feature and the related code is removed in Container Cloud 2.26.0. If you still require this feature, contact Mirantis support for further information.

Reference hardware configuration

The following hardware configuration is used as a reference for deploying Mirantis Container Cloud management and managed clusters with Mirantis Kubernetes Engine on bare metal.

Reference hardware configuration for Container Cloud management and managed clusters on bare metal

Server role

Management cluster

Managed cluster

# of servers

3 1

6 2

CPU cores

Minimal: 16
Recommended: 32
Minimal: 16
Recommended: depends on workload

RAM, GB

Minimal: 64
Recommended: 256
Minimal: 64
Recommended: 128

System disk, GB 3

Minimal: SSD 1x 120
Recommended: NVME 1 x 960
Minimal: SSD 1 x 120
Recommended: NVME 1 x 960

SSD/HDD storage, GB

1 x 1900 4

2 x 1900

NICs 5

Minimal: 1 x 2-port
Recommended: 2 x 2-port
Minimal: 2 x 2-port
Recommended: depends on workload
1

Adding more than 3 nodes to a management cluster is not supported.

2

Three manager nodes for HA and three worker storage nodes for a minimal Ceph cluster.

3

A management cluster requires 2 volumes for Container Cloud (total 50 GB) and 5 volumes for StackLight (total 60 GB). A managed cluster requires 5 volumes for StackLight.

4

In total, at least 2 disks are required:

  • disk0 - minimum 120 GB for system

  • disk1 - minimum 120 GB for LocalVolumeProvisioner

For the default storage schema, see Default configuration of the host system storage

5

Only one PXE port per node is allowed. The out-of-band management (IPMI) port is not included.

System requirements for the seed node

The seed node is necessary only to deploy the management cluster. When the bootstrap is complete, the bootstrap node can be redeployed and its resources can be reused for the managed cluster workloads.

The minimum reference system requirements for a baremetal-based bootstrap seed node are as follows:

  • Basic server on Ubuntu 22.04 with the following configuration:

    • Kernel version 4.15.0-76.86 or later

    • 8 GB of RAM

    • 4 CPU

    • 10 GB of free disk space for the bootstrap cluster cache

  • No DHCP or TFTP servers on any NIC networks

  • Routable access to the IPMI network of the hardware servers. For more details, see Host networking.

  • Internet access for downloading all required artifacts

Network fabric

The following diagram illustrates the physical and virtual L2 underlay networking schema for the final state of the Mirantis Container Cloud bare metal deployment.

_images/bm-cluster-physical-and-l2-networking.png

The network fabric reference configuration is a spine/leaf with 2 leaf ToR switches and one out-of-band (OOB) switch per rack.

Reference configuration uses the following switches for ToR and OOB:

  • Cisco WS-C3560E-24TD with 24 x 1 GbE ports. Used in the OOB network segment.

  • Dell Force 10 S4810P with 48 x 1/10 GbE ports. Used as ToR in the Common/PXE network segment.

In the reference configuration, all odd interfaces from NIC0 are connected to ToR Switch 1, and all even interfaces from NIC0 are connected to ToR Switch 2. The Baseboard Management Controller (BMC) interfaces of the servers are connected to OOB Switch 1.

The following recommendations apply to all types of nodes:

  • Use the Link Aggregation Control Protocol (LACP) bonding mode with MC-LAG domains configured on leaf switches. This corresponds to the 802.3ad bond mode on hosts.

  • Use ports from different multi-port NICs when creating bonds. This makes network connections redundant if failure of a single NIC occurs.

  • Configure the ports that connect servers to the PXE network with the PXE VLAN as native or untagged. On these ports, configure LACP fallback to ensure that the servers can reach the DHCP server and boot over the network.
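
As a generic host-level illustration of the LACP recommendation above, and outside of any Container Cloud-specific configuration mechanism, an 802.3ad bond across ports of two different NICs can be described, for example, with netplan. Interface names and parameters below are placeholders:

  # Illustrative netplan fragment only; interface names are placeholders
  network:
    version: 2
    ethernets:
      enp1s0f0: {}      # port on the first multi-port NIC
      enp2s0f0: {}      # port on the second multi-port NIC
    bonds:
      bond0:
        interfaces: [enp1s0f0, enp2s0f0]
        parameters:
          mode: 802.3ad              # LACP, matching MC-LAG on the leaf switches
          lacp-rate: fast
          transmit-hash-policy: layer3+4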

DHCP range requirements for PXE

When setting up the network range for DHCP Preboot Execution Environment (PXE), keep in mind several considerations to ensure smooth server provisioning:

  • Determine the network size. For instance, if you target a concurrent provision of 50+ servers, a /24 network is recommended. This specific size is crucial as it provides sufficient scope for the DHCP server to provide unique IP addresses to each new Media Access Control (MAC) address, thereby minimizing the risk of collision.

    The concept of collision refers to the likelihood of two or more devices being assigned the same IP address. With a /24 network, the collision probability using the SDBM hash function, which is used by the DHCP server, is low. If a collision occurs, the DHCP server provides a free address using a linear lookup strategy.

  • In the context of PXE provisioning, technically, the IP address does not need to be consistent for every new DHCP request associated with the same MAC address. However, maintaining the same IP address can enhance user experience, making the /24 network size more of a recommendation than an absolute requirement.

  • For a minimal network size, it is sufficient to cover the number of concurrently provisioned servers plus one additional address (50 + 1). This calculation applies after covering any exclusions that exist in the range. You can define excludes in the corresponding field of the Subnet object. For details, see API Reference: Subnet resource.

  • When the available address space is less than the minimum described above, you will not be able to automatically provision all servers. However, you can manually provision them by combining manual IP assignment for each bare metal host with manual pauses. For these operations, use the host.dnsmasqs.metal3.io/address and baremetalhost.metal3.io/detached annotations in the BareMetalHost object. For details, see Operations Guide: Manually allocate IP addresses for bare metal hosts.

  • All addresses within the specified range must remain unused before provisioning. If an IP address that is already in use is issued by the DHCP server to a BOOTP client, that client cannot complete provisioning.
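
As a minimal sketch of these sizing considerations, a Subnet-like object for a /24 provisioning network could look as follows. The API group and field names are assumptions based on the description above; refer to API Reference: Subnet resource for the authoritative schema:

  # Illustrative only: field names are assumptions, see API Reference: Subnet resource
  apiVersion: ipam.mirantis.com/v1alpha1   # assumed API group and version
  kind: Subnet
  metadata:
    name: pxe-dhcp-range
  spec:
    cidr: 10.0.10.0/24                     # /24 recommended for 50+ concurrently provisioned servers
    includeRanges:
      - 10.0.10.100-10.0.10.156            # 57 addresses in total
    excludeRanges:
      - 10.0.10.150-10.0.10.155            # 6 excluded addresses, leaving 51 usable (50 servers + 1)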

Management cluster storage

The management cluster requires a minimum of two storage devices per node. Each device is used for a different type of storage.

  • The first device is always used for boot partitions and the root file system. SSD is recommended. RAID device is not supported.

  • One storage device per server is reserved for local persistent volumes. These volumes are served by the Local Storage Static Provisioner (local-volume-provisioner) and used by many services of Container Cloud.

You can configure host storage devices using the BareMetalHostProfile resources. For details, see Customize the default bare metal host profile.

Requirements for an OpenStack-based cluster

While planning the deployment of an OpenStack-based Mirantis Container Cloud cluster with Mirantis Kubernetes Engine (MKE), consider the following general requirements:

  • Kubernetes on OpenStack requires the availability of the Cinder API V3 and the Octavia API.

  • Mirantis supports deployments based on OpenStack Victoria or Yoga with Open vSwitch (OVS) or Tungsten Fabric (TF) on top of Mirantis OpenStack for Kubernetes (MOSK) Victoria or Yoga with TF.

For system requirements for a bootstrap node, see Requirements for a bootstrap node.

If you use a firewall or proxy, make sure that the bootstrap and management clusters have access to the following IP ranges and domain names required for the Container Cloud content delivery network and alerting:

  • IP ranges:

  • Domain names:

    • mirror.mirantis.com and repos.mirantis.com for packages

    • binary.mirantis.com for binaries and Helm charts

    • mirantis.azurecr.io and *.blob.core.windows.net for Docker images

    • mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry (port 443 if proxy is enabled)

    • mirantis.my.salesforce.com and login.salesforce.com for Salesforce alerts

Note

  • Access to Salesforce is required from any Container Cloud cluster type.

  • If any additional Alertmanager notification receiver is enabled, for example, Slack, its endpoint must also be accessible from the cluster.

Caution

Regional clusters are unsupported since Container Cloud 2.25.0. Mirantis does not perform functional integration testing of the feature and the related code is removed in Container Cloud 2.26.0. If you still require this feature, contact Mirantis support for further information.

Note

The requirements in this section apply to the latest supported Container Cloud release.

Requirements for an OpenStack-based Container Cloud cluster

Resource

Management cluster

Managed cluster

Comments

# of nodes

3 (HA) + 1 (Bastion)

5 (6 with StackLight HA)

  • A bootstrap cluster requires access to the OpenStack API.

  • Each management cluster requires 3 nodes for the manager nodes HA. Adding more than 3 nodes to a management cluster is not supported.

  • A managed cluster requires 3 manager nodes for HA and 2 worker nodes for the Container Cloud workloads. If the multiserver mode is enabled for StackLight, 3 worker nodes are required for workloads.

  • Each management cluster requires 1 node for the Bastion instance that is created with a public IP address to allow SSH access to instances.

# of vCPUs per node

8

8

  • The Bastion node requires 1 vCPU.

  • Refer to the RAM recommendations described below to plan resources for different types of nodes.

RAM in GB per node

24

16

To prevent issues with low RAM, Mirantis recommends the following types of instances for a managed cluster with 50-200 nodes:

  • 16 vCPUs and 32 GB of RAM - manager node

  • 16 vCPUs and 128 GB of RAM - nodes where the StackLight server components run

The Bastion node requires 1 GB of RAM.

Storage in GB per node

120

120

  • For the Bastion node, the default amount of storage is enough

  • To boot machines from a block storage volume, verify that disk performance matches the etcd requirements as described in the etcd documentation

  • To boot the Bastion node from a block storage volume, 80 GB is enough

Operating system

Ubuntu 22.04
CentOS 7.9 0
Ubuntu 22.04
CentOS 7.9 0

For management and managed clusters, a base Ubuntu 22.04 or CentOS 7.9 image must be present in Glance.

MCR

23.0.9 Since 16.1.0
23.0.7 Since 16.0.0
20.10.17 Since 14.0.0
20.10.13 Before 14.0.0
23.0.9 Since 16.1.0
23.0.7 Since 16.0.0
20.10.17 Since 14.0.0
20.10.13 Before 14.0.0

Mirantis Container Runtime (MCR) is deployed by Container Cloud as a Container Runtime Interface (CRI) instead of Docker Engine.

OpenStack version

Queens, Victoria, Yoga

Queens, Victoria, Yoga

OpenStack Victoria and Yoga are supported on top of MOSK clusters.

Obligatory OpenStack components

Octavia, Cinder, OVS/TF

Octavia, Cinder, OVS/TF

  • Tungsten Fabric is supported on OpenStack Victoria or Yoga.

  • Only Cinder API V3 is supported.

# of Cinder volumes

7 (total 110 GB)

5 (total 60 GB)

  • Each management cluster requires 2 volumes for Container Cloud (total 50 GB) and 5 volumes for StackLight (total 60 GB)

  • A managed cluster requires 5 volumes for StackLight

# of load balancers

10

6

  • LBs for a management cluster:

    • 1 for MKE

    • 1 for Container Cloud UI

    • 1 for Keycloak service

    • 1 for IAM service

    • 6 for StackLight

  • LBs for a managed cluster:

    • 1 for MKE

    • 5 for StackLight with enabled logging (or 4 without logging)

# of floating IPs

11

11

  • FIPs for a management cluster:

    • 1 for MKE

    • 1 for Container Cloud UI

    • 1 for Keycloak service

    • 1 for IAM service

    • 1 for the Bastion node (or 3 without Bastion: one FIP per manager node)

    • 6 for StackLight

  • FIPs for a managed cluster:

    • 1 for MKE

    • 3 for the manager nodes

    • 2 for the worker nodes

    • 5 for StackLight with enabled logging (4 without logging)

0(1,2)

A Container Cloud cluster based on both Ubuntu and CentOS operating systems is not supported.

Requirements for a VMware vSphere-based cluster

Warning

This section only applies to Container Cloud 2.27.2 (Cluster release 16.2.2) or earlier versions. Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

Note

Container Cloud is developed and tested on VMware vSphere 7.0 and 6.7.

For system requirements for a bootstrap node, see Requirements for a bootstrap node.

If you use a firewall or proxy, make sure that the bootstrap and management clusters have access to the following IP ranges and domain names required for the Container Cloud content delivery network and alerting:

  • IP ranges:

  • Domain names:

    • mirror.mirantis.com and repos.mirantis.com for packages

    • binary.mirantis.com for binaries and Helm charts

    • mirantis.azurecr.io and *.blob.core.windows.net for Docker images

    • mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry (port 443 if proxy is enabled)

    • mirantis.my.salesforce.com and login.salesforce.com for Salesforce alerts

Note

  • Access to Salesforce is required from any Container Cloud cluster type.

  • If any additional Alertmanager notification receiver is enabled, for example, Slack, its endpoint must also be accessible from the cluster.

Caution

Regional clusters are unsupported since Container Cloud 2.25.0. Mirantis does not perform functional integration testing of the feature and the related code is removed in Container Cloud 2.26.0. If you still require this feature, contact Mirantis support for further information.

Note

The requirements in this section apply to the latest supported Container Cloud release.

System requirements
Requirements for a vSphere-based Container Cloud cluster

Resource

Management cluster

Managed cluster

Comments

# of nodes

3 (HA)

5 (6 with StackLight HA)

  • A bootstrap cluster requires access to the vSphere API.

  • A management cluster requires 3 nodes for the manager nodes HA. Adding more than 3 nodes to a management cluster is not supported.

  • A managed cluster requires 3 manager nodes for HA and 2 worker nodes for the Container Cloud workloads. If the multiserver mode is enabled for StackLight, 3 worker nodes are required for workloads.

# of vCPUs per node

8

8

Refer to the RAM recommendations described below to plan resources for different types of nodes.

RAM in GB per node

32

16

To prevent issues with low RAM, Mirantis recommends the following VM templates for a managed cluster with 50-200 nodes:

  • 16 vCPUs and 40 GB of RAM - manager node

  • 16 vCPUs and 128 GB of RAM - nodes where the StackLight server components run

Storage in GB per node

120

120

The listed amount of disk space must be available as a shared datastore of any type, for example, NFS or vSAN, mounted on all hosts of the vCenter cluster.

Operating system

RHEL 8.7 1
Ubuntu 20.04
RHEL 8.7 1
Ubuntu 20.04

For a management and managed cluster, a base OS VM template must be present in the VMware VM templates folder available to Container Cloud. For details, see VsphereVMTemplate.

RHEL license
(for RHEL deployments only)

RHEL licenses for Virtual Datacenters

RHEL licenses for Virtual Datacenters

This license type allows running an unlimited number of guests on one hypervisor. The number of licenses must be equal to the number of hypervisors in vCenter Server that will be used to host RHEL-based machines. Container Cloud schedules machines according to the scheduling rules applied to vCenter Server. Therefore, make sure that your Red Hat Customer Portal account has enough licenses for the allowed hypervisors.

MCR

23.0.9 Since 16.1.0
23.0.7 Since 16.0.1
20.10.17 Since 14.0.0
23.0.9 Since 16.1.0
23.0.7 Since 16.0.1
20.10.17 Since 14.0.0

Mirantis Container Runtime (MCR) is deployed by Container Cloud as a Container Runtime Interface (CRI) instead of Docker Engine.

VMware vSphere version

7.0, 6.7

7.0, 6.7

cloud-init version

20.3 for RHEL

20.3 for RHEL

The minimal cloud-init package version built for the VsphereVMTemplate.

VMware Tools version

11.0.5

11.0.5

The minimal open-vm-tools package version built for the VsphereVMTemplate.

Obligatory vSphere capabilities

DRS,
Shared datastore
DRS,
Shared datastore

A shared datastore must be mounted on all hosts of the vCenter cluster. Combined with the Distributed Resource Scheduler (DRS), it ensures that the VMs are dynamically scheduled to the cluster hosts.

IP subnet size

/24

/24

Consider the supported VMware vSphere network objects and IPAM recommendations.

Minimal IP addresses distribution:

  • Management cluster:

    • 1 for the load balancer of Kubernetes API

    • 3 for manager nodes (one per node)

    • 6 for the Container Cloud services

    • 6 for StackLight

  • Managed cluster:

    • 1 for the load balancer of Kubernetes API

    • 3 for manager nodes

    • 2 for worker nodes

    • 6 for StackLight

1(1,2)
  • RHEL 8.7 is generally available since Cluster releases 16.0.0 and 14.1.0. Before these Cluster releases, it is supported as Technology Preview.

  • Container Cloud does not support mixed operating systems, RHEL combined with Ubuntu, in one cluster.

Requirements for deployment resources

The VMware vSphere provider of Mirantis Container Cloud requires the following resources to successfully create virtual machines for Container Cloud clusters:

  • Data center

    All resources below must be related to one data center.

  • Cluster

    All virtual machines must run on the hosts of one cluster.

  • Virtual Network or Distributed Port Group

    Network for virtual machines. For details, see VMware vSphere network objects and IPAM recommendations.

  • Datastore

    Storage for virtual machines disks and Kubernetes volumes.

  • Folder

    Placement of virtual machines.

  • Resource pool

    Pool of CPU and memory resources for virtual machines.

You must provide the data center and cluster resources by name. You can provide other resources by:

  • Name

    The resource name must be unique in the data center and cluster. Otherwise, the vSphere provider detects multiple resources with the same name and cannot determine which one to use.

  • Full path (recommended)

    Full path to a resource depends on its type. For example:

    • Network

      /<data_center>/network/<network_name>

    • Resource pool

      /<data_center>/host/<cluster>/Resources/<resource pool_name>

    • Folder

      /<data_center>/vm/<folder1>/<folder2>/.../<folder_name> or /<data_center>/vm/<folder_name>

    • Datastore

      /<data_center>/datastore/<datastore_name>

You can determine the proper resource name using the vSphere UI.

To obtain the full path to vSphere resources:

  1. Download the latest version of the govc utility for your operating system and unpack the govc binary into a directory included in PATH on your machine.

  2. Set the environment variables to access your vSphere cluster. For example:

    export GOVC_USERNAME=user
    export GOVC_PASSWORD=password
    export GOVC_URL=https://vcenter.example.com
    
  3. List the data center root using the govc ls command. Example output:

    /<data_center>/vm
    /<data_center>/network
    /<data_center>/host
    /<data_center>/datastore
    
  4. Obtain the full path to resources by name for:

    1. Network or Distributed Port Group (Distributed Virtual Port Group):

      govc find /<data_center> -type n -name <network_name>
      
    2. Datastore:

      govc find /<data_center> -type s -name <datastore_name>
      
    3. Folder:

      govc find /<data_center> -type f -name <folder_name>
      
    4. Resource pool:

      govc find /<data_center> -type p -name <resource_pool_name>
      
  5. Verify the resource type by full path:

    govc object.collect -json -o "<full_path_to_resource>" | jq .Self.Type
    
Setup of deployment users and permissions

To deploy Mirantis Container Cloud on the VMware vSphere-based environment, you need to prepare vSphere accounts for Container Cloud. Contact your vSphere administrator to set up the required users and permissions following the steps below:

  1. Log in to the vCenter Server Web Console.

  2. Create the cluster-api user with the following privileges:

    Note

    Container Cloud uses two separate vSphere accounts for:

    • Cluster API related operations, such as creating or deleting VMs, and for preparing the VM template using Packer

    • Storage operations, such as dynamic PVC provisioning

    You can also create one user that has all privilege sets mentioned above.

    The cluster-api user privileges

    Privilege

    Permission

    Content library

    • Download files

    • Read storage

    • Sync library item

    Datastore

    • Allocate space

    • Browse datastore

    • Low-level file operations

    • Update virtual machine metadata

    Distributed switch

    • Host operation

    • IPFIX operation

    • Modify

    • Network I/O control operation

    • Policy operation

    • Port configuration operation

    • Port setting operation

    • VSPAN operation

    Folder

    • Create folder

    • Rename folder

    Global

    Cancel task

    Host local operations

    • Create virtual machine

    • Delete virtual machine

    • Reconfigure virtual machine

    Network

    Assign network

    Resource

    Assign virtual machine to resource pool

    Scheduled task

    • Create tasks

    • Modify task

    • Remove task

    • Run task

    Sessions

    • Validate session

    • View and stop sessions

    Storage views

    View

    Tasks

    • Create task

    • Update task

    Virtual machine permissions

    Privilege

    Permission

    Change configuration

    • Acquire disk lease

    • Add existing disk

    • Add new disk

    • Add or remove device

    • Advanced configuration

    • Change CPU count

    • Change Memory

    • Change Settings

    • Change Swapfile placement

    • Change resource

    • Configure Host USB device

    • Configure Raw device

    • Configure managedBy

    • Display connection settings

    • Extend virtual disk

    • Modify device settings

    • Query Fault Tolerance compatibility

    • Query unowned files

    • Reload from path

    • Remove disk

    • Rename

    • Reset guest information

    • Set annotation

    • Toggle disk change tracking

    • Toggle fork parent

    • Upgrade virtual machine compatibility

    Interaction

    • Configure CD media

    • Configure floppy media

    • Console interaction

    • Device connection

    • Inject USB HID scan codes

    • Power off

    • Power on

    • Reset

    • Suspend

    Inventory

    • Create from existing

    • Create new

    • Move

    • Register

    • Remove

    • Unregister

    Provisioning

    • Allow disk access

    • Allow file access

    • Allow read-only disk access

    • Allow virtual machine download

    • Allow virtual machine files upload

    • Clone template

    • Clone virtual machine

    • Create template from virtual machine

    • Customize guest

    • Deploy template

    • Mark as template

    • Mark as virtual machine

    • Modify customization specification

    • Promote disks

    • Read customization specifications

    Snapshot management

    • Create snapshot

    • Remove snapshot

    • Rename snapshot

    • Revert to snapshot

    vSphere replication

    Monitor replication

  3. Create the storage user with the following privileges:

    Note

    For more details about all required privileges for the storage user, see vSphere Cloud Provider documentation.

    The storage user privileges

    Privilege

    Permission

    Cloud Native Storage

    Searchable

    Content library

    View configuration settings

    Datastore

    • Allocate space

    • Browse datastore

    • Low level file operations

    • Remove file

    Folder

    • Create folder

    Host configuration

    • Storage partition configuration

    Host local operations

    • Create virtual machine

    • Delete virtual machine

    • Reconfigure virtual machine

    Host profile

    View

    Profile-driven storage

    Profile-driven storage view

    Resource

    Assign virtual machine to resource pool

    Scheduled task

    • Create tasks

    • Modify task

    • Run task

    Sessions

    • Validate session

    • View and stop sessions

    Storage views

    View

    Virtual machine permissions

    Privilege

    Permission

    Change configuration

    • Add existing disk

    • Add new disk

    • Add or remove device

    • Advanced configuration

    • Change CPU count

    • Change Memory

    • Change Settings

    • Configure managedBy

    • Extend virtual disk

    • Remove disk

    • Rename

    Inventory

    • Create from existing

    • Create new

    • Remove

  4. For RHEL deployments, if you do not have a RHEL machine with the virt-who service configured to report the vSphere environment configuration and hypervisor information to the Red Hat Customer Portal or Red Hat Satellite server, set up the virt-who service inside the Container Cloud machines for proper RHEL license activation.

    Create a virt-who user with at least read-only access to all objects in the vCenter Data Center.

    The virt-who service on RHEL machines will be provided with the virt-who user credentials to properly manage RHEL subscriptions.

    For details on how to create the virt-who user, refer to the official Red Hat Customer Portal documentation.

StackLight requirements for an MKE attached cluster

Available since 2.25.2 Unsupported since 2.27.3

Warning

This section only applies to Container Cloud 2.27.2 (Cluster release 16.2.2) or earlier versions. Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

During attachment of a Mirantis Kubernetes Engine (MKE) cluster that is not deployed by Container Cloud to a vSphere-based management cluster, you can add StackLight as the logging, monitoring, and alerting solution. In this scenario, your cluster must satisfy several requirements that primarily involve alignment of cluster resources with specific StackLight settings.

General requirements

While planning the attachment of an existing MKE cluster that is not deployed by Container Cloud to a vSphere-based management cluster, consider the following general requirements for StackLight:

Note

Attachment of MKE clusters is tested on Ubuntu 20.04.

Requirements for cluster size

While planning the attachment of an existing MKE cluster that is not deployed by Container Cloud to a vSphere-based management cluster, consider the cluster size requirements for StackLight. Depending on the following specific StackLight HA and logging settings, use the example size guidelines below:

  • The non-HA mode - StackLight services are installed on a minimum of one node with the StackLight label (StackLight nodes) with no redundancy using Persistent Volumes (PVs) from the default storage class to store data. Metric collection agents are installed on each node (Other nodes).

  • The HA mode - StackLight services are installed on a minimum of three nodes with the StackLight label (StackLight nodes) with redundancy using PVs provided by Local Volume Provisioner to store data. Metric collection agents are installed on each node (Other nodes).

  • Logging enabled - the Enable logging option is turned on, which enables the OpenSearch cluster to store infrastructure logs.

  • Logging disabled - the Enable logging option is turned off. In this case, StackLight will not install OpenSearch and will not collect infrastructure logs.

LoadBalancer (LB) Services support is required to provide external access to StackLight web UIs.

StackLight requirements for an attached MKE cluster, with logging enabled:

StackLight nodes 1

Other nodes

Storage (PVs)

LBs

Non-HA (1-node example)

  • RAM requests: 11 GB

  • RAM limits: 33 GB

  • CPU requests: 4.5 cores

  • CPU limits: 12 cores

  • RAM requests: 0.25 GB

  • RAM limits: 1 GB

  • CPU requests: 0.5 cores

  • CPU limits: 1 core

  • 1 PV for Prometheus (size is configurable; 1x total)

  • 2 PVs for Alertmanager (2 Gi/volume; 4 Gi total)

  • 1 PV for Patroni (10 G; 10 G total)

  • 1 PV for OpenSearch (size is configurable; 1x total)

5

HA (3-nodes example)

  • RAM requests: 10 GB

  • RAM limits: 25 GB

  • CPU requests: 2.8 cores

  • CPU limits: 7.5 cores

  • RAM requests: 0.25 GB

  • RAM limits: 1 GB

  • CPU requests: 0.5 cores

  • CPU limits: 1 core

  • 2 PVs (1 per StackLight node) for Prometheus (size is configurable; 2x total)

  • 2 PVs (1 per StackLight node) for Alertmanager (2 Gi/volume; 4 Gi total)

  • 3 PVs (1 per StackLight node) for Patroni (10 G/volume; 30 G total)

  • 3 PVs (1 per StackLight node) for OpenSearch (size is configurable; 3x total)

5

StackLight requirements for an attached MKE cluster, with logging disabled

StackLight nodes 1

Other nodes

Storage (PVs)

LBs

Non-HA (1-node example)

  • RAM requests: 4 GB

  • RAM limits: 23 GB

  • CPU requests: 3 cores

  • CPU limits: 9 cores

  • RAM requests: 0.05 GB

  • RAM limits: 0.1 GB

  • CPU requests: 0.01 cores

  • CPU limits: 0 cores

  • 1 PV for Prometheus (size is configurable; 1x total)

  • 2 PVs for Alertmanager (2 Gi/volume; 4Gi total)

  • 1 PV for Patroni (10 G; 10 G total)

4

HA (3-nodes example)

  • RAM requests: 3 GB

  • RAM limits: 15 GB

  • CPU requests: 1.6 cores

  • CPU limits: 4.2 cores

  • RAM requests: 0.05 GB

  • RAM limits: 0.1 GB

  • CPU requests: 0.01 cores

  • CPU limits: 0 cores

  • 2 PVs (1 per StackLight node) for Prometheus (size is configurable; 2x total)

  • 2 PVs (1 per StackLight node) for Alertmanager (2 Gi/volume; 4 Gi total)

  • 3 PVs (1 per StackLight node) for Patroni (10 G/volume; 30 G total)

4

1(1,2)

In the non-HA mode, StackLight components are bound to the nodes labeled with the StackLight label. If no nodes are labeled, StackLight components are scheduled to all schedulable worker nodes until the StackLight label is added. The requirements presented in the table for the non-HA mode are the total requirements for all StackLight nodes.

Proxy and cache support

Proxy support

If you require all Internet access to go through a proxy server for security and audit purposes, you can bootstrap management clusters using proxy. The proxy server settings consist of three standard environment variables that are set prior to the bootstrap process:

  • HTTP_PROXY

  • HTTPS_PROXY

  • NO_PROXY

These settings are not propagated to managed clusters. However, you can enable separate proxy access on a managed cluster using the Container Cloud web UI. This proxy is intended for end user needs and is not used for managed cluster deployment or for access to Mirantis resources.

Caution

Since Container Cloud uses the OpenID Connect (OIDC) protocol for IAM authentication, management clusters require a direct non-proxy access from managed clusters.

StackLight components, which require external access, automatically use the same proxy that is configured for Container Cloud clusters.

On managed clusters with limited Internet access, a proxy is required for the StackLight components that use HTTP or HTTPS, are disabled by default, and need external access when enabled, for example, the Salesforce integration and external Alertmanager notification rules. For more details about proxy implementation in StackLight, see StackLight proxy.

For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Hardware and system requirements.

After enabling proxy support on managed clusters, proxy is used for:

  • Docker traffic on managed clusters

  • StackLight

  • OpenStack on MOSK-based clusters

Warning

Any modification to the Proxy object used in any cluster, for example, changing the proxy URL, NO_PROXY values, or certificate, leads to cordon-drain and Docker restart on the cluster machines.

Artifacts caching

The Container Cloud managed clusters are deployed without direct Internet access in order to consume less Internet traffic in your cloud. The Mirantis artifacts used during managed clusters deployment are downloaded through a cache running on a management cluster. The feature is enabled by default on new managed clusters and will be automatically enabled on existing clusters during upgrade to the latest version.

Caution

IAM operations require a direct non-proxy access of a managed cluster to a management cluster.

MKE API limitations

To ensure Mirantis Container Cloud stability in managing the Container Cloud-based Mirantis Kubernetes Engine (MKE) clusters, the following MKE API functionality is not available for Container Cloud-based MKE clusters as compared to MKE clusters that are not deployed by Container Cloud. Use the Container Cloud web UI or CLI for this functionality instead.

Public APIs limitations in a Container Cloud-based MKE cluster

API endpoint

Limitation

GET /swarm

Swarm Join Tokens are filtered out for all users, including admins.

PUT /api/ucp/config-toml

All requests are forbidden.

POST /nodes/{id}/update

Requests for the following changes are forbidden:

  • Change Role

  • Add or remove the com.docker.ucp.orchestrator.swarm and com.docker.ucp.orchestrator.kubernetes labels.

DELETE /nodes/{id}

All requests are forbidden.

MKE configuration management

This section describes configuration specifics of an MKE cluster deployed using Container Cloud.

MKE configuration managed by Container Cloud

Since 2.25.1 (Cluster releases 16.0.1 and 17.0.1), Container Cloud does not override changes in the MKE configuration except for the following list of parameters that are automatically managed by Container Cloud. These parameters are always overridden by the Container Cloud default values if modified directly using the MKE API. For details on configuration using the MKE API, see MKE configuration managed directly by the MKE API.

However, you can manually configure a few options from this list using the Cluster object of a Container Cloud cluster. They are labeled with the superscript and contain references to the respective configuration procedures in the Comments columns of the tables.

[audit_log_configuration]

MKE parameter name

Default value in Container Cloud

Comments

level

"metadata" 0
"" 1

You can configure this option either using MKE API with no Container Cloud overrides or using the Cluster object of a Container Cloud cluster. For details, see Configure Kubernetes auditing and profiling and MKE documentation: MKE audit logging.

If configured using the Cluster object, use the same object to disable the option. Otherwise, it will be overridden by Container Cloud.

support_bundle_include_audit_logs

false

For configuration procedure, see comments above.

0

For management clusters since 2.26.0 (Cluster release 16.1.0)

1

For management and managed clusters since 2.24.3 (Cluster releases 15.0.2 and 14.0.2)

[auth]

MKE parameter name

Default value in Container Cloud

default_new_user_role

"restrictedcontrol"

backend

"managed"

samlEnabled

false

managedPasswordDisabled

false

[auth.external_identity_provider]

MKE parameter name

Default value in Container Cloud

issuer

"https://<Keycloak-external-address>/auth/realms/iam"

userServiceId

"<userServiceId>"

clientId

"kaas"

wellKnownConfigUrl

"https://<Keycloak-external-address>/auth/realms/iam/.well-known/openid-configuration"

caBundle

"<caCert>"

usernameClaim

""

httpProxy

""

httpsProxy

""

[hardening_configuration]

MKE parameter name

Default value in Container Cloud

hardening_enabled

true

limit_kernel_capabilities

true

pids_limit_int

100000

pids_limit_k8s

100000

pids_limit_swarm

100000

[scheduling_configuration]

MKE parameter name

Default value in Container Cloud

enable_admin_ucp_scheduling

true

default_node_orchestrator

kubernetes

[tracking_configuration]

MKE parameter name

Default value in Container Cloud

cluster_label

"prod"

[cluster_config]

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

MKE parameter name

Default value in Container Cloud

Comments

calico_ip_auto_method

  • Bare metal: interface=k8s-pods

  • OpenStack, vSphere: ""

calico_mtu

"1440"

For configuration steps, see Set the MTU size for Calico.

calico_vxlan

true

calico_vxlan_mtu

"1440"

calico_vxlan_port

"4792"

cloud_provider

  • Bare metal: ""

  • OpenStack, vSphere: external

  • vSphere before 2.25.1: vsphere

Depends on the selected cloud provider.

controller_port

  • Bare metal, vSphere: 4443

  • OpenStack: 6443

custom_kube_api_server_flags

["--event-ttl=720h"]

Applies only to MKE on the management cluster.

custom_kube_controller_manager_flags

  • ["--leader-elect-lease-duration=120s", "--leader-elect-renew-deadline=60s"]

  • ["--feature-gates=CSIMigrationvSphere=true"] 2

custom_kube_scheduler_flags

["--leader-elect-lease-duration=120s", "--leader-elect-renew-deadline=60s"]

custom_kubelet_flags

  • ["--serialize-image-pulls=false"]

  • ["--feature-gates=CSIMigrationvSphere=true"] 2

etcd_storage_quota

""

For configuration steps, see Increase storage quota for etcd.

exclude_server_identity_headers

true

ipip_mtu

"1440"

kube_api_server_auditing

true 4
false 5

For configuration steps, see Configure Kubernetes auditing and profiling.

kube_api_server_audit_log_maxage 6

30

kube_api_server_audit_log_maxbackup 6

10

kube_api_server_audit_log_maxsize 6

10

kube_api_server_profiling_enabled

false

For configuration steps, see Configure Kubernetes auditing and profiling.

kube_apiserver_port

  • Bare metal, vSphere: 5443

  • OpenStack: 443

kube_protect_kernel_defaults

true

local_volume_collection_mapping

false

manager_kube_reserved_resources

"cpu=1000m,memory=2Gi,ephemeral-storage=4Gi"

metrics_retention_time

"24h"

metrics_scrape_interval

"1m"

nodeport_range

"30000-32768"

pod_cidr

"10.233.64.0/18"

You can override this value in spec::clusterNetwork::pods::cidrBlocks: of the Cluster object.

priv_attributes_allowed_for_service_accounts 3

["hostBindMounts", "hostIPC", "hostNetwork", "hostPID", "kernelCapabilities", "privileged"]

priv_attributes_service_accounts 3

["kube-system:helm-controller-sa", "kube-system:pod-garbage-collector", "stacklight:stacklight-helm-controller"]

profiling_enabled

false

prometheus_memory_limit

"4Gi"

prometheus_memory_request

"2Gi"

secure_overlay

true

service_cluster_ip_range

"10.233.0.0/18"

You can override this value in spec::clusterNetwork::services::cidrBlocks: of the Cluster object.

swarm_port

2376

swarm_strategy

"spread"

unmanaged_cni

false

vxlan_vni

10000

worker_kube_reserved_resources

"cpu=100m,memory=300Mi,ephemeral-storage=500Mi"

2(1,2)

The CSIMigrationvSphere flag applies only to the vSphere provider since 2.25.1.

3(1,2)

For priv_attributes parameters, you can add custom options on top of existing parameters using the MKE API.

4

For management clusters since 2.26.0 (Cluster release 16.1.0).

5

For management and managed clusters since 2.24.3 (Cluster releases 15.0.2 and 14.0.2).

6(1,2,3)

For management and managed clusters since 2.27.0 (Cluster releases 17.2.0 and 16.2.0). For configuration steps, see Configure Kubernetes auditing and profiling.

Note

All possible values for the parameters labeled with the superscript, which you can manually configure using the Cluster object, are described in MKE Operations Guide: Configuration options.
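
For example, based on the spec::clusterNetwork paths referenced in the table above, overriding the pod and service CIDR blocks in the Cluster object may look as follows (an illustrative snippet that reuses the default values from the table):

    spec:
      clusterNetwork:
        pods:
          cidrBlocks:
          - 10.233.64.0/18
        services:
          cidrBlocks:
          - 10.233.0.0/18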

MKE configuration managed directly by the MKE API

Since 2.25.1, aside from MKE parameters described in MKE configuration managed by Container Cloud, Container Cloud does not override changes in MKE configuration that are applied directly through the MKE API. For the configuration options and procedure, see MKE documentation:

  • MKE configuration options

  • Configure an existing MKE cluster

    While using this procedure, replace the command to upload the newly edited MKE configuration file with the following one:

    curl --silent --insecure -X PUT -H "X-UCP-Allow-Restricted-API: i-solemnly-swear-i-am-up-to-no-good" -H "accept: application/toml" -H "Authorization: Bearer $AUTHTOKEN" --upload-file 'mke-config.toml' https://$MKE_HOST/api/ucp/config-toml
    

Important

Mirantis cannot guarantee the expected behavior of the functionality configured using the MKE API because customer-specific configuration does not undergo testing within Container Cloud. Therefore, Mirantis recommends that you test custom MKE settings configured through the MKE API on a staging environment before applying them to production.
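
For reference, you can typically download the current MKE configuration for editing through the same endpoint before uploading changes (an illustrative sketch based on the config-toml endpoint used above; verify the exact call against the MKE documentation for your MKE version):

    curl --silent --insecure -X GET -H "accept: application/toml" \
        -H "Authorization: Bearer $AUTHTOKEN" \
        https://$MKE_HOST/api/ucp/config-toml > mke-config.toml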

Deployment Guide

Deploy a Container Cloud management cluster

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

Note

The deprecated bootstrap procedure using Bootstrap v1 was removed in favor of Bootstrap v2 in Container Cloud 2.26.0.

Introduction

Available since 2.25.0

Mirantis Container Cloud Bootstrap v2 provides the best user experience for setting up Container Cloud. Using Bootstrap v2, you can provision and operate management clusters using the required objects through the Container Cloud web UI.

Basic concepts and components of Bootstrap v2 include:

  • Bootstrap cluster

    Bootstrap cluster is any kind-based Kubernetes cluster that contains a minimal set of Container Cloud bootstrap components allowing the user to prepare the configuration for management cluster deployment and start the deployment. The list of these components includes:

    • Bootstrap Controller

      Controller that is responsible for:

      1. Configuration of a bootstrap cluster with provider-specific charts through the bootstrap Helm bundle.

      2. Configuration and deployment of a management cluster and its related objects.

    • Helm Controller

      Operator that manages Helm chart releases. It installs the Container Cloud bootstrap and provider-specific charts configured in the bootstrap Helm bundle.

    • Public API charts

      Helm charts that contain custom resource definitions for Container Cloud resources of supported providers.

    • Admission Controller

      Controller that performs mutations and validations for the Container Cloud resources including cluster and machines configuration.

    • Bootstrap web UI

      User-friendly web interface to prepare the configuration for a management cluster deployment.

    Currently, one bootstrap cluster can be used to deploy only one management cluster. For example, to add a new management cluster with different settings, a new bootstrap cluster must be created from scratch.

  • Bootstrap region

    BootstrapRegion is the first object to create in the bootstrap cluster for the Bootstrap Controller to identify and install the required provider components onto the bootstrap cluster. Afterwards, the user can prepare and deploy a management cluster with its related resources.

    The bootstrap region is a starting point for the cluster deployment. The user needs to approve the BootstrapRegion object. Otherwise, the Bootstrap Controller will not be triggered for the cluster deployment.

  • Bootstrap Helm bundle

    Helm bundle that contains charts configuration for the bootstrap cluster. This object is managed by the Bootstrap Controller that updates the bundle depending on a provider selected by the user in the BootstrapRegion object. The Bootstrap Controller always configures provider-related charts listed in the regional section of the Container Cloud release for the selected provider. Depending on the provider and cluster configuration, the Bootstrap Controller may update or reconfigure this bundle even after the cluster deployment starts. For example, the Bootstrap Controller enables the provider in the bootstrap cluster only after the bootstrap region is approved for the deployment.

Overview of the deployment workflow

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

Management cluster deployment consists of several sequential stages. Each stage finishes when a specific condition is met or specific configuration applies to a cluster or its machines.

In case of issues at any deployment stage, you can identify the problem and adjust the configuration on the fly. The cluster deployment does not abort on timeout before all stages complete because the infinite-timeout option is enabled by default in Bootstrap v2.

Infinite timeout prevents the bootstrap failure due to timeout. This option is useful in the following cases:

  • The network speed is slow for artifacts downloading

  • An infrastructure configuration does not allow booting fast

  • Bare metal node inspection involves more than two HDD/SATA disks attached to a machine

You can track the status of each stage in the bootstrapStatus section of the Cluster object that is updated by the Bootstrap Controller.
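
For example, a minimal way to inspect this section is to print the Cluster object from the bootstrap cluster and review the bootstrapStatus fields in the output (an illustrative command that assumes the default namespace and the kubectl binary shipped in the kaas-bootstrap folder):

    ./kaas-bootstrap/bin/kubectl get cluster <clusterName> -n default -o yaml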

The Bootstrap Controller starts deploying the cluster after you approve the BootstrapRegion configuration.

The following table describes deployment states of a management cluster that apply in the strict order.

Deployment states of a management cluster

Step

State

Description

1

ProxySettingsHandled

Verifies proxy configuration in the Cluster object. If the bootstrap cluster was created without a proxy, no actions are applied to the cluster.

2

ClusterSSHConfigured

Verifies SSH configuration for the cluster and machines.

You can provide any number of SSH public keys, which are added to cluster machines. But the Bootstrap Controller always adds the bootstrap-key SSH public key to the cluster configuration. The Bootstrap Controller uses this SSH key to manage the lcm-agent configuration on cluster machines.

The bootstrap-key SSH key is copied to a bootstrap-key-<clusterName> object containing the cluster name in its name.

3

ProviderUpdatedInBootstrap

Synchronizes the provider and settings of its components between the Cluster object and bootstrap Helm bundle. Settings provided in the cluster configuration have higher priority than the default settings of the bootstrap cluster, except CDN.

4

ProviderEnabledInBootstrap

Enables the provider and its components if any were disabled by the Bootstrap Controller during preparation of the bootstrap region. A cluster and machines deployment starts after the provider enablement.

5

Nodes readiness

Waits for the provider to complete node deployment, which comprises VM creation and MKE installation.

6

ObjectsCreated

Creates required namespaces and IAM secrets.

7

ProviderConfigured

Verifies the provider configuration in the provisioned cluster.

8

HelmBundleReady

Verifies the Helm bundle readiness for the provisioned cluster.

9

ControllersDisabledBeforePivot

Collects the list of deployment controllers and disables them to prepare for pivot.

10

PivotDone

Moves all cluster-related objects from the bootstrap cluster to the provisioned cluster. The copies of Cluster and Machine objects remain in the bootstrap cluster to provide the status information to the user. About every minute, the Bootstrap Controller reconciles the status of the Cluster and Machine objects of the provisioned cluster to the bootstrap cluster.

11

ControllersEnabledAfterPivot

Enables controllers in the provisioned cluster.

12

MachinesLCMAgentUpdated

Updates the lcm-agent configuration on machines to target LCM agents to the provisioned cluster.

13

HelmControllerDisabledBeforeConfig

Disables the Helm Controller before reconfiguration.

14

HelmControllerConfigUpdated

Updates the Helm Controller configuration for the provisioned cluster.

15

Cluster readiness

Contains information about the global cluster status. The Bootstrap Controller verifies that OIDC, Helm releases, and all Deployments are ready. Once the cluster is ready, the Bootstrap Controller stops managing the cluster.

Set up a bootstrap cluster

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

The setup of a bootstrap cluster comprises preparation of the seed node, configuration of environment variables, acquisition of the Container Cloud license file, and execution of the bootstrap script. The script eventually generates a link to the Bootstrap web UI for the management cluster deployment.

To set up a bootstrap cluster:

  1. Prepare the seed node:

    Bare metal
    1. Verify that the hardware allocated for the installation meets the minimal requirements described in Requirements for a baremetal-based cluster.

    2. Install basic Ubuntu 22.04 server using standard installation images of the operating system on the bare metal seed node.

    3. Log in to the seed node that is running Ubuntu 22.04.

    4. Prepare the system and network configuration:

      1. Establish a virtual bridge using an IP address of the PXE network on the seed node. Use the following netplan-based configuration file as an example:

        # cat /etc/netplan/config.yaml
        network:
          version: 2
          renderer: networkd
          ethernets:
            ens3:
                dhcp4: false
                dhcp6: false
          bridges:
              br0:
                  addresses:
                  # Replace with IP address from PXE network to create a virtual bridge
                  - 10.0.0.15/24
                  dhcp4: false
                  dhcp6: false
                  # Adjust for your environment
                  gateway4: 10.0.0.1
                  interfaces:
                  # Interface name may be different in your environment
                  - ens3
                  nameservers:
                      addresses:
                      # Adjust for your environment
                      - 8.8.8.8
                  parameters:
                      forward-delay: 4
                      stp: false
        
      2. Apply the new network configuration using netplan:

        sudo netplan apply
        
      3. Verify the new network configuration:

        sudo apt update && sudo apt install -y bridge-utils
        sudo brctl show
        

        Example of system response:

        bridge name     bridge id               STP enabled     interfaces
        br0             8000.fa163e72f146       no              ens3
        

        Verify that the interface connected to the PXE network belongs to the previously configured bridge.

      4. Install the current Docker version available for Ubuntu 22.04:

        sudo apt-get update
        sudo apt-get install docker.io
        
      5. Grant your logged USER access to the Docker daemon:

        sudo usermod -aG docker $USER
        
      6. Log out and log in again to the seed node to apply the changes.

      7. Verify that Docker is configured correctly and has access to Container Cloud CDN. For example:

        docker run --rm alpine sh -c "apk add --no-cache curl; \
        curl https://binary.mirantis.com"
        

        The system output must contain a JSON response with no error messages. In case of errors, follow the steps provided in Troubleshooting.

        Note

        If you require all Internet access to go through a proxy server for security and audit purposes, configure Docker proxy settings as described in the official Docker documentation.

        To verify that Docker is configured correctly and has access to Container Cloud CDN:

        docker run --rm alpine sh -c "export http_proxy=http://<proxy_ip:proxy_port>; \
        sed -i 's/https/http/g' /etc/apk/repositories; \
        apk add --no-cache wget; \
        wget http://binary.mirantis.com; \
        cat index.html"
        
    5. Verify that the seed node has direct access to the Baseboard Management Controller (BMC) of each bare metal host. All target hardware nodes must be in the power off state.

      For example, using the IPMI tool:

      apt install ipmitool
      ipmitool -I lanplus -H 'IPMI IP' -U 'IPMI Login' -P 'IPMI password' \
      chassis power status
      

      Example of system response:

      Chassis Power is off
      
    OpenStack
    1. Verify that the hardware allocated for installation meets minimal requirements described in Requirements for an OpenStack-based cluster.

    2. Configure Docker:

      1. Log in to any personal computer or VM running Ubuntu 22.04 that you will be using as the bootstrap node.

      2. If you use a newly created VM, run:

        sudo apt-get update
        
      3. Install the current Docker version available for Ubuntu 22.04:

        sudo apt install docker.io
        
      4. Grant your USER access to the Docker daemon:

        sudo usermod -aG docker $USER
        
      5. Log off and log in again to the bootstrap node to apply the changes.

      6. Verify that Docker is configured correctly and has access to Container Cloud CDN. For example:

        docker run --rm alpine sh -c "apk add --no-cache curl; \
        curl https://binary.mirantis.com"
        

        The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

  2. Prepare the bootstrap script:

    1. Download and run the Container Cloud bootstrap script:

      sudo apt-get update
      sudo apt-get install wget
      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      chmod 0755 get_container_cloud.sh
      ./get_container_cloud.sh
      
    2. Change the directory to the kaas-bootstrap folder created by the script.

  3. Obtain a Container Cloud license file required for the bootstrap:

    1. Select from the following options:

      • Open the email from support@mirantis.com with the subject Mirantis Container Cloud License File or Mirantis OpenStack License File

      • In the Mirantis CloudCare Portal, open the Account or Cloud page

    2. Download the License File and save it as mirantis.lic under the kaas-bootstrap directory on the bootstrap node.

    3. Verify that mirantis.lic contains the previously downloaded Container Cloud license by decoding the license JWT token, for example, using jwt.io.

      Example of a valid decoded Container Cloud license data with the mandatory license field:

      {
          "exp": 1652304773,
          "iat": 1636669973,
          "sub": "demo",
          "license": {
              "dev": false,
              "limits": {
                  "clusters": 10,
                  "workers_per_cluster": 10
              },
              "openstack": null
          }
      }
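
      Alternatively, to decode the license payload locally instead of using jwt.io, you can use a short Python snippet (a minimal sketch that assumes mirantis.lic is a standard three-part JWT and that you run the command from the kaas-bootstrap directory where the file is saved):

      # Decode the JWT payload (second dot-separated part) and pretty-print it
      python3 -c "import base64, json; p = open('mirantis.lic').read().split('.')[1]; p += '=' * (-len(p) % 4); print(json.dumps(json.loads(base64.urlsafe_b64decode(p)), indent=2))"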
      

      Warning

      The MKE license does not apply to mirantis.lic. For details about MKE license, see MKE documentation.

  4. For the bare metal provider, export mandatory parameters.

    Bare metal network mandatory parameters

    Export the following mandatory parameters using the commands and table below:

    export KAAS_BM_ENABLED="true"
    #
    export KAAS_BM_PXE_IP="172.16.59.5"
    export KAAS_BM_PXE_MASK="24"
    export KAAS_BM_PXE_BRIDGE="br0"
    
    Bare metal prerequisites data

    Parameter

    Description

    Example value

    KAAS_BM_PXE_IP

    The provisioning IP address in the PXE network. This address will be assigned on the seed node to the interface defined by the KAAS_BM_PXE_BRIDGE parameter described below. The PXE service of the bootstrap cluster uses this address to network boot bare metal hosts.

    172.16.59.5

    KAAS_BM_PXE_MASK

    The PXE network address prefix length to be used with the KAAS_BM_PXE_IP address when assigning it to the seed node interface.

    24

    KAAS_BM_PXE_BRIDGE

    The PXE network bridge name that must match the name of the bridge created on the seed node during the Set up a bootstrap cluster stage.

    br0

  5. Optional. Add the following environment variables to bootstrap the cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    • PROXY_CA_CERTIFICATE_PATH

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    export PROXY_CA_CERTIFICATE_PATH="/home/ubuntu/.mitmproxy/mitmproxy-ca-cert.cer"
    

    The following formats of variables are accepted:

    Proxy configuration data

    Variable

    Format

    HTTP_PROXY
    HTTPS_PROXY
    • http://proxy.example.com:port - for anonymous access.

    • http://user:password@proxy.example.com:port - for restricted access.

    NO_PROXY

    Comma-separated list of IP addresses or domain names.

    PROXY_CA_CERTIFICATE_PATH

    Optional. Absolute path to the proxy CA certificate for man-in-the-middle (MITM) proxies. Must be placed on the bootstrap node to be trusted. For details, see Install a CA certificate for a MITM proxy on a bootstrap node.

    Warning

    If you require Internet access to go through a MITM proxy, ensure that the proxy has streaming enabled as described in Enable streaming for MITM.

    For implementation details, see Proxy and cache support.

    After the bootstrap cluster is set up, the bootstrap-proxy object is created with the provided proxy settings. You can use this object later for the Cluster object configuration.
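
    To review the generated object, you can print it from the bootstrap cluster (an illustrative command; the fully qualified resource name is an assumption based on the Proxy kind and may differ in your environment):

    ./kaas-bootstrap/bin/kubectl get proxies.kaas.mirantis.com bootstrap-proxy -n default -o yaml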

  6. Deploy the bootstrap cluster:

    ./bootstrap.sh bootstrapv2
    

    When the bootstrap is complete, the system outputs a link to the Bootstrap web UI.

  7. Make sure that port 80 is open for localhost on the seed node to avoid issues caused by local security restrictions:

    Note

    Kind uses port mapping for the master node.

    telnet localhost 80
    

    Example of a positive system response:

    Connected to localhost.
    

    Example of a negative system response:

    telnet: connect to address ::1: Connection refused
    telnet: Unable to connect to remote host
    

    To open port 80:

    iptables -A INPUT -p tcp --dport 80 -j ACCEPT
    
  8. Access the Bootstrap web UI. It does not require any authorization.

    The bootstrap cluster setup automatically creates the following objects that you can view in the Bootstrap web UI:

    • Bootstrap SSH key

      The SSH key pair is automatically generated by the bootstrap script and the private key is added to the kaas-bootstrap folder. The public key is automatically created in the bootstrap cluster as the bootstrap-key object. It will be used later for setting up the cluster machines.

    • Bootstrap proxy

      If a bootstrap cluster is configured with proxy settings, the bootstrap-proxy object is created. It will be automatically used in the cluster configuration unless a custom proxy is specified.

    • Management kubeconfig

      If a bootstrap cluster is provided with the management cluster kubeconfig, it will be uploaded as a secret to the bootstrap cluster to the default and kaas projects as management-kubeconfig.

Deploy a management cluster using the Container Cloud API

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

This section contains an overview of the cluster-related objects along with the configuration procedure of these objects during deployment of a management cluster using Bootstrap v2 through the Container Cloud API.

Deploy a management cluster using CLI

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

The following procedure describes how to prepare and deploy a management cluster using Bootstrap v2 by operating YAML templates available in the kaas-bootstrap/templates/ folder.

To deploy a management cluster using CLI:

  1. Set up a bootstrap cluster.

  2. Export kubeconfig of the kind cluster:

    export KUBECONFIG=<pathToKindKubeconfig>
    

    By default, <pathToKindKubeconfig> is $HOME/.kube/kind-config-clusterapi.

  3. For the bare metal provider, configure BIOS on a bare metal host.

  4. For the OpenStack provider, prepare the OpenStack configuration.

    OpenStack configuration
    1. Log in to the OpenStack Horizon.

    2. In the Project section, select API Access.

    3. In the right-side drop-down menu Download OpenStack RC File, select OpenStack clouds.yaml File.

    4. Save the downloaded clouds.yaml file in the kaas-bootstrap folder created by the get_container_cloud.sh script.

    5. In clouds.yaml, add the password field with your OpenStack password under the clouds/openstack/auth section.

      Example:

      clouds:
        openstack:
          auth:
            auth_url: https://auth.openstack.example.com/v3
            username: your_username
            password: your_secret_password
            project_id: your_project_id
            user_domain_name: your_user_domain_name
          region_name: RegionOne
          interface: public
          identity_api_version: 3
      
    6. If you deploy Container Cloud on top of MOSK Victoria with Tungsten Fabric and use the default security group for newly created load balancers, add the following rules for the Kubernetes API server endpoint, the Container Cloud application endpoint, and the MKE web UI and API using the OpenStack CLI (see the example command after this list):

      • direction='ingress'

      • ethertype='IPv4'

      • protocol='tcp'

      • remote_ip_prefix='0.0.0.0/0'

      • port_range_max and port_range_min:

        • '443' for Kubernetes API and Container Cloud application endpoints

        • '6443' for MKE web UI and API
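
      For example, one of these rules can be added with the following command (an illustrative sketch; the security group name default and the port value depend on your environment):

      openstack security group rule create default \
          --ingress --ethertype IPv4 --protocol tcp \
          --remote-ip 0.0.0.0/0 --dst-port 443

      Repeat the command with --dst-port 6443 for the MKE web UI and API.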

    7. Verify access to the target cloud endpoint from Docker. For example:

      docker run --rm alpine sh -c "apk add --no-cache curl; \
      curl https://auth.openstack.example.com/v3"
      

      The system output must contain no error records.

  5. Depending on the selected provider, navigate to one of the following locations:

    • Bare metal: kaas-bootstrap/templates/bm

    • OpenStack: kaas-bootstrap/templates

    Warning

    The kubectl apply command automatically saves the applied data as plain text into the kubectl.kubernetes.io/last-applied-configuration annotation of the corresponding object. This may result in revealing sensitive data in this annotation when creating or modifying objects containing credentials. Such Container Cloud objects include:

    • BareMetalHostCredential

    • ClusterOIDCConfiguration

    • License

    • OpenstackCredential

    • Proxy

    • ServiceUser

    • TLSConfig

    Therefore, do not use kubectl apply on these objects. Use kubectl create, kubectl patch, or kubectl edit instead.

    If you used kubectl apply on these objects, you can remove the kubectl.kubernetes.io/last-applied-configuration annotation from the objects using kubectl edit.
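
    As an alternative to kubectl edit, you can remove the annotation non-interactively using the standard kubectl annotate syntax with a trailing dash (a generic sketch; substitute the object kind, name, and namespace):

    ./kaas-bootstrap/bin/kubectl annotate <objectKind> <objectName> -n default \
        kubectl.kubernetes.io/last-applied-configuration-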

  6. Create the BootstrapRegion object by modifying bootstrapregion.yaml.template.

    Configuration of bootstrapregion.yaml.template
    1. Select from the following options:

      • Since Container Cloud 2.26.0 (Cluster releases 16.1.0 and 17.1.0), set the required <providerName> and use the default <regionName>, which is region-one.

      • Before Container Cloud 2.26.0, set the required <providerName> and <regionName>.

      apiVersion: kaas.mirantis.com/v1alpha1
      kind: BootstrapRegion
      metadata:
        name: <regionName>
        namespace: default
      spec:
        provider: <providerName>
      
    2. Create the object:

      ./kaas-bootstrap/bin/kubectl create -f \
          kaas-bootstrap/templates/<providerName>/bootstrapregion.yaml.template
      

    Note

    In the following steps, apply the changes to objects using the commands below with the required template name:

    • For bare metal:

      ./kaas-bootstrap/bin/kubectl create -f \
          kaas-bootstrap/templates/bm/<templateName>.yaml.template
      
    • For OpenStack:

      ./kaas-bootstrap/bin/kubectl create -f \
          kaas-bootstrap/templates/<templateName>.yaml.template
      
  7. For the OpenStack provider only. Create the Credentials object by modifying <providerName>-config.yaml.template.

    Configuration for OpenStack credentials
    1. Add the provider-specific parameters:

      Parameter

      Description

      SET_OS_AUTH_URL

      Identity endpoint URL.

      SET_OS_USERNAME

      OpenStack user name.

      SET_OS_PASSWORD

      Value of the OpenStack password. This field is available only when the user creates or changes password. Once the controller detects this field, it updates the password in the secret and removes the value field from the OpenStackCredential object.

      SET_OS_PROJECT_ID

      Unique ID of the OpenStack project.

    2. Skip this step since Container Cloud 2.26.0. Before this release, set the kaas.mirantis.com/region: <regionName> label that must match the BootstrapRegion object name.

    3. Skip this step since Container Cloud 2.26.0. Before this release, set the kaas.mirantis.com/regional-credential label to "true" to use the credentials for the management cluster deployment. For example, for OpenStack:

      cat openstack-config.yaml.template
      ---
      apiVersion: kaas.mirantis.com/v1alpha1
      kind: OpenstackCredential
      metadata:
        name: cloud-config
        labels:
          kaas.mirantis.com/regional-credential: "true"
      spec:
        ...
      
    4. Verify that the credentials for the management cluster deployment are valid. For example, for OpenStack:

      ./kaas-bootstrap/bin/kubectl get openstackcredentials <credsName> \
          -o yaml -o jsonpath='{.status.valid}'
      

      The output of the command must be "true". Otherwise, fix the issue with credentials before proceeding to the next step.

  8. Create the ServiceUser object by modifying serviceusers.yaml.template.

    Configuration of serviceusers.yaml.template

    Service user is the initial user to create in Keycloak for access to a newly deployed management cluster. By default, it has the global-admin, operator (namespaced), and bm-pool-operator (namespaced) roles.

    You can delete the ServiceUser object after setting up other required users with specific roles or after integration with an external identity provider, such as LDAP.

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: ServiceUserList
    items:
    - apiVersion: kaas.mirantis.com/v1alpha1
      kind: ServiceUser
      metadata:
        name: SET_USERNAME
      spec:
        password:
          value: SET_PASSWORD
    
  9. Optional. Prepare any number of additional SSH keys using the following example:

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: PublicKey
    metadata:
      name: <SSHKeyName>
      namespace: default
    spec:
      publicKey: |
        <insert your public key here>
    
  10. Optional. Add the Proxy object using the example below:

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: Proxy
    metadata:
      labels:
        kaas.mirantis.com/region: <regionName>
      name: <proxyName>
      namespace: default
    spec:
      ...
    

    The region label must match the BootstrapRegion object name.

    Note

    The kaas.mirantis.com/region label is removed from all Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Therefore, do not add the label starting these releases. On existing clusters updated to these releases, or if manually added, this label will be ignored by Container Cloud.

  11. Configure and apply the cluster configuration using cluster deployment templates:

    1. In cluster.yaml.template, set mandatory cluster labels:

      labels:
        kaas.mirantis.com/provider: <providerName>
        kaas.mirantis.com/region: <regionName>
      

      Note

      The kaas.mirantis.com/region label is removed from all Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Therefore, do not add the label starting these releases. On existing clusters updated to these releases, or if manually added, this label will be ignored by Container Cloud.

    2. Configure provider-specific settings as required.

      Bare metal
      1. Inspect the default bare metal host profile definition in templates/bm/baremetalhostprofiles.yaml.template and adjust it to fit your hardware configuration. For details, see Customize the default bare metal host profile.

        Warning

        Any data stored on any device defined in the fileSystems list can be deleted or corrupted during cluster (re)deployment. It happens because each device from the fileSystems list is a part of the rootfs directory tree that is overwritten during (re)deployment.

        Examples of affected devices include:

        • A raw device partition with a file system on it

        • A device partition in a volume group with a logical volume that has a file system on it

        • An mdadm RAID device with a file system on it

        • An LVM RAID device with a file system on it

        The wipe field (deprecated) or the wipeDevice structure (recommended since Container Cloud 2.26.0) has no effect in this case and cannot protect data on these devices.

        Therefore, to prevent data loss, move the necessary data from these file systems to another server beforehand, if required.

      2. In templates/bm/baremetalhosts.yaml.template, update the bare metal host definitions according to your environment configuration. Use the reference table below to manually set all parameters that start with SET_.

        Note

        Before Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0), also set the name of the bootstrapRegion object from bootstrapregion.yaml.template for the kaas.mirantis.com/region label across all objects listed in templates/bm/baremetalhosts.yaml.template.

        Bare metal hosts template mandatory parameters

        Parameter

        Description

        Example value

        SET_MACHINE_0_IPMI_USERNAME

        The IPMI user name to access the BMC. 0

        user

        SET_MACHINE_0_IPMI_PASSWORD

        The IPMI password to access the BMC. 0

        password

        SET_MACHINE_0_MAC

        The MAC address of the first master node in the PXE network.

        ac:1f:6b:02:84:71

        SET_MACHINE_0_BMC_ADDRESS

        The IP address of the BMC endpoint for the first master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway.

        192.168.100.11

        SET_MACHINE_1_IPMI_USERNAME

        The IPMI user name to access the BMC. 0

        user

        SET_MACHINE_1_IPMI_PASSWORD

        The IPMI password to access the BMC. 0

        password

        SET_MACHINE_1_MAC

        The MAC address of the second master node in the PXE network.

        ac:1f:6b:02:84:72

        SET_MACHINE_1_BMC_ADDRESS

        The IP address of the BMC endpoint for the second master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway.

        192.168.100.12

        SET_MACHINE_2_IPMI_USERNAME

        The IPMI user name to access the BMC. 0

        user

        SET_MACHINE_2_IPMI_PASSWORD

        The IPMI password to access the BMC. 0

        password

        SET_MACHINE_2_MAC

        The MAC address of the third master node in the PXE network.

        ac:1f:6b:02:84:73

        SET_MACHINE_2_BMC_ADDRESS

        The IP address of the BMC endpoint for the third master node in the cluster. Must be an address from the OOB network that is accessible through the management network gateway.

        192.168.100.13

        0(1,2,3,4,5,6)

        The parameter requires a user name and password in plain text.

      3. Configure cluster network:

        Important

        Bootstrap V2 supports only separated PXE and LCM networks.

        • To ensure successful bootstrap, enable asymmetric routing on the interfaces of the management cluster nodes. This is required because the seed node relies on one network by default, which can potentially cause traffic asymmetry.

          In the kernelParameters section of bm/baremetalhostprofiles.yaml.template, set rp_filter to 2. This enables loose mode as defined in RFC3704.

          Example configuration of asymmetric routing
          ...
          kernelParameters:
            ...
            sysctl:
              # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
              net.ipv4.conf.k8s-lcm.rp_filter: "2"
              # Enables the "Loose mode" for the "bond0" interface (PXE network)
              net.ipv4.conf.bond0.rp_filter: "2"
              ...
          

          Note

          More complicated solutions that are not described in this manual involve eliminating traffic asymmetry, for example:

          • Configure source routing on management cluster nodes.

          • Plug the seed node into the same networks as the management cluster nodes, which requires custom configuration of the seed node.

        • Update the network objects definition in templates/bm/ipam-objects.yaml.template according to the environment configuration. By default, this template implies the use of separate PXE and life-cycle management (LCM) networks.

        • Manually set all parameters that start with SET_.

        For configuration details of bond network interface for the PXE and management network, see Configure NIC bonding.

        Example of the default L2 template snippet for a management cluster:

        bonds:
          bond0:
            interfaces:
              - {{ nic 0 }}
              - {{ nic 1 }}
            parameters:
              mode: active-backup
              primary: {{ nic 0 }}
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "bond0:mgmt-pxe" }}
        vlans:
          k8s-lcm:
            id: SET_VLAN_ID
            link: bond0
            addresses:
              - {{ ip "k8s-lcm:kaas-mgmt" }}
            nameservers:
              addresses: {{ nameservers_from_subnet "kaas-mgmt" }}
            routes:
              - to: 0.0.0.0/0
                via: {{ gateway_from_subnet "kaas-mgmt" }}
        

        In this example, the following configuration applies:

        • A bond of two NIC interfaces

        • A static address in the PXE network set on the bond

        • An isolated L2 segment for the LCM network is configured using the k8s-lcm VLAN with the static address in the LCM network

        • The default gateway address is in the LCM network

        For general concepts of configuring separate PXE and LCM networks for a management cluster, see Separate PXE and management networks. For the latest object templates and variable names to use, see the following tables.

        Network parameters mapping overview

        Deployment file name

        Parameters list to update manually

        ipam-objects.yaml.template

        • SET_LB_HOST

        • SET_MGMT_ADDR_RANGE

        • SET_MGMT_CIDR

        • SET_MGMT_DNS

        • SET_MGMT_NW_GW

        • SET_MGMT_SVC_POOL

        • SET_PXE_ADDR_POOL

        • SET_PXE_ADDR_RANGE

        • SET_PXE_CIDR

        • SET_PXE_SVC_POOL

        • SET_VLAN_ID

        bootstrap.env

        • KAAS_BM_PXE_IP

        • KAAS_BM_PXE_MASK

        • KAAS_BM_PXE_BRIDGE

        The table below contains examples of mandatory parameter values to set in templates/bm/ipam-objects.yaml.template for the network scheme that has the following networks:

        • 172.16.59.0/24 - PXE network

        • 172.16.61.0/25 - LCM network

        Mandatory network parameters of the IPAM objects template

        Parameter

        Description

        Example value

        SET_PXE_CIDR

        The IP address of the PXE network in the CIDR notation. The minimum recommended network size is 256 addresses (/24 prefix length).

        172.16.59.0/24

        SET_PXE_SVC_POOL

        The IP address range to use for endpoints of load balancers in the PXE network for the Container Cloud services: Ironic-API, DHCP server, HTTP server, and caching server. The minimum required range size is 5 addresses.

        172.16.59.6-172.16.59.15

        SET_PXE_ADDR_POOL

        The IP address range in the PXE network to use for dynamic address allocation for hosts during inspection and provisioning.

        The minimum recommended range size is 30 addresses for management cluster nodes if it is located in a separate PXE network segment. Otherwise, it depends on the number of managed cluster nodes to deploy in the same PXE network segment as the management cluster nodes.

        172.16.59.51-172.16.59.200

        SET_PXE_ADDR_RANGE

        The IP address range in the PXE network to use for static address allocation on each management cluster node. The minimum recommended range size is 6 addresses.

        172.16.59.41-172.16.59.50

        SET_MGMT_CIDR

        The IP address of the LCM network for the management cluster in the CIDR notation. If managed clusters will have their separate LCM networks, those networks must be routable to the LCM network. The minimum recommended network size is 128 addresses (/25 prefix length).

        172.16.61.0/25

        SET_MGMT_NW_GW

        The default gateway address in the LCM network. This gateway must provide access to the OOB network of the Container Cloud cluster and to the Internet to download the Mirantis artifacts.

        172.16.61.1

        SET_LB_HOST

        The IP address of the externally accessible MKE API endpoint of the cluster in the CIDR notation. This address must be within the LCM network defined by SET_MGMT_CIDR but must NOT overlap with any other addresses or address ranges within this network. External load balancers are not supported.

        172.16.61.5/32

        SET_MGMT_DNS

        An external (non-Kubernetes) DNS server accessible from the LCM network.

        8.8.8.8

        SET_MGMT_ADDR_RANGE

        The IP address range that includes addresses to be allocated to bare metal hosts in the LCM network for the management cluster.

        When this network is shared with managed clusters, the size of this range limits the number of hosts that can be deployed in all clusters sharing this network.

        When this network is solely used by a management cluster, the range must include at least 6 addresses for bare metal hosts of the management cluster.

        172.16.61.30-172.16.61.40

        SET_MGMT_SVC_POOL

        The IP address range to use for the externally accessible endpoints of load balancers in the LCM network for the Container Cloud services, such as Keycloak, web UI, and so on. The minimum required range size is 19 addresses.

        172.16.61.10-172.16.61.29

        SET_VLAN_ID

        The VLAN ID used for isolation of the LCM network. The bootstrap.sh process and the seed node must have routable access to the network in this VLAN.

        3975
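
        As an illustration only, the SET_ parameters can be substituted with a tool such as sed, using the example values from the table above; adjust the values to your environment before running the command:

        sed -i \
          -e 's|SET_PXE_CIDR|172.16.59.0/24|g' \
          -e 's|SET_MGMT_CIDR|172.16.61.0/25|g' \
          -e 's|SET_VLAN_ID|3975|g' \
          templates/bm/ipam-objects.yaml.template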

        When using separate PXE and LCM networks, the management cluster services are exposed in different networks using two separate MetalLB address pools:

        • Services exposed through the PXE network are as follows:

          • Ironic API as a bare metal provisioning server

          • HTTP server that provides images for network boot and server provisioning

          • Caching server for accessing the Container Cloud artifacts deployed on hosts

        • Services exposed through the LCM network are all other Container Cloud services, such as Keycloak, web UI, and so on.

        The default MetalLB configuration described in the MetalLBConfigTemplate object template of templates/bm/ipam-objects.yaml.template uses two separate MetalLB address pools. Also, it uses the interfaces selector in its l2Advertisements template.

        Caution

        When you change the L2Template object template in templates/bm/ipam-objects.yaml.template, ensure that interfaces listed in the interfaces field of the MetalLBConfigTemplate.spec.templates.l2Advertisements section match those used in your L2Template. For details about the interfaces selector, see API Reference: MetalLBConfigTemplate spec.

        See Configure MetalLB for details on MetalLB configuration.
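
        The following hypothetical fragment only illustrates the idea of keeping the interfaces selector in sync with the interface names of your L2Template; the exact structure and field names of MetalLBConfigTemplate may differ, so verify them against API Reference: MetalLBConfigTemplate spec:

        spec:
          templates:
            l2Advertisements: |
              - name: default
                spec:
                  ipAddressPools:
                    - default
                  # Must list the same interface names as defined in your L2Template,
                  # for example, the k8s-lcm VLAN interface from the snippet above
                  interfaces:
                    - k8s-lcm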

      4. In cluster.yaml.template, update the cluster-related settings to fit your deployment.

      5. Optional. Enable WireGuard for traffic encryption on the Kubernetes workloads network.

        WireGuard configuration
        1. Ensure that the Calico MTU size is at least 60 bytes smaller than the interface MTU size of the workload network. IPv4 WireGuard uses a 60-byte header. For details, see Set the MTU size for Calico.

        2. In templates/bm/cluster.yaml.template, enable WireGuard by adding the secureOverlay parameter:

          spec:
            ...
            providerSpec:
              value:
                ...
                secureOverlay: true
          

          Caution

          Changing this parameter on a running cluster causes a downtime that can vary depending on the cluster size.

        For more details about WireGuard, see Calico documentation: Encrypt in-cluster pod traffic.

      OpenStack

      Adjust the templates/cluster.yaml.template parameters to suit your deployment:

      1. In the spec::providerSpec::value section, add the mandatory ExternalNetworkID parameter, which is the ID of an external OpenStack network. It is required to provide public Internet access to virtual machines. See the example sketch after this list.

      2. In the spec::clusterNetwork::services section, add the corresponding values for cidrBlocks.

      3. Configure other parameters as required.
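
      A hypothetical sketch of the related cluster.yaml.template fragment; the key names follow the parameters mentioned above, while the exact casing, placement, and all values are placeholders to verify against your actual template:

        spec:
          clusterNetwork:
            services:
              cidrBlocks:
                - 10.233.0.0/18   # placeholder Services CIDR
          providerSpec:
            value:
              # ID of the external OpenStack network (placeholder value)
              ExternalNetworkID: <yourExternalNetworkID>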

    3. Configure StackLight. For parameters description, see StackLight configuration parameters.

    4. Optional. Configure additional cluster settings as described in Configure optional cluster settings.

  12. Apply configuration for machines using machines.yaml.template.

    Configuration of machines.yaml.template
    1. Add the following mandatory machine labels:

      labels:
        kaas.mirantis.com/provider: <providerName>
        cluster.sigs.k8s.io/cluster-name: <clusterName>
        kaas.mirantis.com/region: <regionName>
        cluster.sigs.k8s.io/control-plane: "true"
      

      Note

      The kaas.mirantis.com/region label is removed from all Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Therefore, do not add the label starting with these releases. On existing clusters updated to these releases, or if the label is added manually, Container Cloud ignores it.

    2. Configure the provider-specific settings:

      Bare metal

      Inspect the machines.yaml.template and adjust spec and labels of each entry according to your deployment. Adjust the spec.providerSpec.value.hostSelector values to match the BareMetalHost object corresponding to each machine. For details, see API Reference: Bare metal Machine spec.
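
      For example, a minimal hostSelector sketch that binds a machine to a specific BareMetalHost by label; the label key and value are illustrative and must match the labels set on your BareMetalHost objects:

        spec:
          providerSpec:
            value:
              hostSelector:
                matchLabels:
                  # Label identifying the target BareMetalHost object (illustrative)
                  kaas.mirantis.com/baremetalhost-id: <bareMetalHostName>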

      OpenStack
      1. In templates/machines.yaml.template, modify the spec:providerSpec:value section for 3 control plane nodes marked with the cluster.sigs.k8s.io/control-plane label by substituting the flavor and image parameters with the corresponding values of the control plane nodes in the related OpenStack cluster. For example:

        spec: &cp_spec
          providerSpec:
            value:
              apiVersion: "openstackproviderconfig.k8s.io/v1alpha1"
              kind: "OpenstackMachineProviderSpec"
              flavor: kaas.minimal
              image: jammy-server-cloudimg-amd64-20240823
        

        Note

        The flavor parameter value provided in the example above is cloud-specific and must meet the Container Cloud requirements.

      2. Optional. Available as TechPreview. To boot cluster machines from a block storage volume, define the following parameter in the spec:providerSpec section of templates/machines.yaml.template:

        bootFromVolume:
          enabled: true
          volumeSize: 120
        

        Note

        The minimal storage requirement is 120 GB per node. For details, see Requirements for an OpenStack-based cluster.

        To boot the Bastion node from a volume, add the same parameter to templates/cluster.yaml.template in the spec:providerSpec section for Bastion. The default storage amount of 80 GB is enough.

      Also, modify other parameters as required.

  13. For the bare metal provider, monitor the inspection process of the bare metal hosts and wait until all hosts are in the available state:

    kubectl get bmh -o go-template='{{- range .items -}} {{.status.provisioning.state}}{{"\n"}} {{- end -}}'
    

    Example of system response:

    available
    available
    available
    
  14. Monitor the BootstrapRegion object status and wait until it is ready.

    kubectl get bootstrapregions -o go-template='{{(index .items 0).status.ready}}{{"\n"}}'
    

    To obtain more granular status details, monitor status.conditions:

    kubectl get bootstrapregions -o go-template='{{(index .items 0).status.conditions}}{{"\n"}}'
    

    For a more convenient system response, consider using dedicated tools such as jq or yq and adjust the -o flag to output in json or yaml format accordingly.
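
    For example, assuming jq is installed on the seed node:

    kubectl get bootstrapregions -o json | jq '.items[0].status.conditions'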

    Note

    For the bare metal provider, before Container Cloud 2.26.0, the BareMetalObjectReferences condition is not mandatory and may remain in the not ready state with no effect on the BootstrapRegion object. Since Container Cloud 2.26.0, this condition is mandatory.

  15. Change the directory to /kaas-bootstrap/.

  16. Approve the BootstrapRegion object to start the cluster deployment using one of the following commands:

    ./container-cloud bootstrap approve all
    
    ./container-cloud bootstrap approve <bootstrapRegionName>
    

    Caution

    Once you approve the BootstrapRegion object, no cluster or machine modification is allowed.

    Warning

    For the bare metal provider, do not manually restart or power off any of the bare metal hosts during the bootstrap process.

  17. Monitor the deployment progress. For deployment stages description, see Overview of the deployment workflow.

  18. Verify that network addresses used on your clusters do not overlap with the following default MKE network addresses for Swarm and MCR:

    • 10.0.0.0/16 is used for Swarm networks. IP addresses from this network are virtual.

    • 10.99.0.0/16 is used for MCR networks. IP addresses from this network are allocated on hosts.

    Verification of Swarm and MCR network addresses

    To verify Swarm and MCR network addresses, run on any master node:

    docker info
    

    Example of system response:

    Server:
     ...
     Swarm:
      ...
      Default Address Pool: 10.0.0.0/16
      SubnetSize: 24
      ...
     Default Address Pools:
       Base: 10.99.0.0/16, Size: 20
     ...
    

    Typically, not all Swarm and MCR addresses are in use. One Swarm Ingress network is created by default and occupies the 10.0.0.0/24 address block. Also, three MCR networks are created by default and occupy three address blocks: 10.99.0.0/20, 10.99.16.0/20, 10.99.32.0/20.

    To verify the actual networks state and addresses in use, run:

    docker network ls
    docker network inspect <networkName>
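
    For example, to print only the subnets of a particular network using the built-in Go template formatting of docker network inspect:

    docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}} {{end}}' <networkName>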
    
  19. Optional for the bare metal provider. If you plan to use multiple L2 segments for provisioning of managed cluster nodes, consider the requirements specified in Configure multiple DHCP ranges using Subnet resources.

Deploy a management cluster using the Container Cloud Bootstrap web UI

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

This section describes how to configure the cluster-related objects and deploy a management cluster using Bootstrap v2 through the Container Cloud Bootstrap web UI.

Create a management cluster for the OpenStack provider

This section describes how to create an OpenStack-based management cluster using the Container Cloud Bootstrap web UI.

To create an OpenStack-based management cluster:

  1. If you deploy Container Cloud on top of MOSK Victoria with Tungsten Fabric and use the default security group for newly created load balancers, add the following rules for the Kubernetes API server endpoint, Container Cloud application endpoint, and the MKE web UI and API using the OpenStack CLI (see the example after this list):

    • direction='ingress'

    • ethertype='IPv4'

    • protocol='tcp'

    • remote_ip_prefix='0.0.0.0/0'

    • port_range_max and port_range_min:

      • '443' for Kubernetes API and Container Cloud application endpoints

      • '6443' for MKE web UI and API
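
    A hedged example of adding such rules with the OpenStack CLI, assuming the security group is named default; adjust the group name to your environment:

    openstack security group rule create --ingress --ethertype IPv4 \
      --protocol tcp --remote-ip 0.0.0.0/0 --dst-port 443 default
    openstack security group rule create --ingress --ethertype IPv4 \
      --protocol tcp --remote-ip 0.0.0.0/0 --dst-port 6443 default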

  2. Set up a bootstrap cluster.

  3. Open the Container Cloud Bootstrap web UI.

  4. Create a bootstrap object.

    Bootstrap object configuration

    In the Bootstrap tab, create a bootstrap object:

    1. Set the bootstrap object name.

    2. Select the required provider.

    3. Optional. Recommended. Leave the Guided Bootstrap configuration check box selected. It enables the cluster creation helper in the next window with a series of guided steps for a complete setup of a functional management cluster.

      The cluster creation helper contains the same configuration windows as the separate tabs of the left-side menu, but the helper enables you to configure the essential provider components one by one inside a single modal window.

      If you select this option, refer to the corresponding steps of this procedure below for the description of each tab in the Guided Bootstrap configuration.

    4. Click Save.

    5. In the Status column of the Bootstrap page, monitor the bootstrap region readiness by hovering over the status icon of the bootstrap region.

      Once the orange blinking status icon becomes green and Ready, the bootstrap region deployment is complete. If the cluster status is Error, refer to Troubleshooting.

      You can monitor live deployment status of the following bootstrap region components:

      Component

      Status description

      Helm

      Installation status of bootstrap Helm releases

      Provider

      Status of provider configuration and installation for related charts and Deployments

      Deployments

      Readiness of all Deployments in the bootstrap cluster

  5. Configure credentials for the new cluster.

    Credentials configuration

    In the Credentials tab:

    1. Click Add Credential to add your OpenStack credentials. You can either upload your OpenStack clouds.yaml configuration file or fill in the fields manually.

    2. Verify that the new credentials status is Ready. If the status is Error, hover over the status to determine the reason for the issue.

  6. Optional. In the SSH Keys tab, click Add SSH Key to upload the public SSH key(s) for VM creation.

  7. Optional. Enable proxy access to the cluster.

    Proxy configuration

    In the Proxies tab, configure proxy:

    1. Click Add Proxy.

    2. In the Add New Proxy wizard, fill out the form with the following parameters:

      Proxy configuration

      Parameter

      Description

      Proxy Name

      Name of the proxy server to use during cluster creation.

      Region Removed in 2.26.0 (16.1.0 and 17.1.0)

      From the drop-down list, select the required region.

      HTTP Proxy

      Add the HTTP proxy server domain name in the following format:

      • http://proxy.example.com:port - for anonymous access

      • http://user:password@proxy.example.com:port - for restricted access

      HTTPS Proxy

      Add the HTTPS proxy server domain name in the same format as for HTTP Proxy.

      No Proxy

      Comma-separated list of IP addresses or domain names.

      For implementation details, see Proxy and cache support.

    3. If your proxy requires a trusted CA certificate, select the CA Certificate check box and paste a CA certificate for a MITM proxy to the corresponding field or upload a certificate using Upload Certificate.

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for an OpenStack-based cluster.

  8. In the Clusters tab, click Create Cluster and fill out the form with the following parameters:

    Cluster configuration
    1. Add Cluster name.

    2. Set the provider Service User Name and Service User Password.

      The service user is the initial user created in Keycloak for access to a newly deployed management cluster. By default, it has the global-admin, operator (namespaced), and bm-pool-operator (namespaced) roles.

      You can delete serviceuser after setting up other required users with specific roles or after any integration with an external identity provider, such as LDAP.

    3. Configure general provider settings and Kubernetes parameters:

      Provider and Kubernetes configuration

      Section

      Parameter

      Description

      General Settings

      Provider

      Select OpenStack.

      Provider Credential

      From the drop-down list, select the OpenStack credentials name that you have previously created.

      Release Version

      The Container Cloud version.

      Proxy

      Optional. From the drop-down list, select the proxy server name that you have previously created.

      SSH Keys

      From the drop-down list, select the SSH key name(s) that you have previously added for SSH access to VMs.

      Container Registry

      From the drop-down list, select the Docker registry name that you have previously added using the Container Registries tab. For details, see Define a custom CA certificate for a private Docker registry.

      Provider

      External Network

      Type of the external network in the OpenStack cloud provider.

      DNS Name Servers

      Comma-separated list of DNS host IP addresses for the OpenStack VMs configuration.

      Configure Bastion

      Optional. Configuration parameters for the Bastion node:

      • Flavor

      • Image

      • Availability Zone

      • Server Metadata

      For the parameters description, see Add a machine.

      Technology Preview: select Boot From Volume to boot the Bastion node from a block storage volume and select the required amount of storage (80 GB is enough).

      Kubernetes

      Node CIDR

      The Kubernetes nodes CIDR block. For example, 10.10.10.0/24.

      Services CIDR Blocks

      The Kubernetes Services CIDR block. For example, 10.233.0.0/18.

      Pods CIDR Blocks

      The Kubernetes Pods CIDR block. For example, 10.233.64.0/18.

      Note

      The subnet size of the Kubernetes pods network influences the number of nodes that can be deployed in the cluster. The default /18 subnet size is enough to create a cluster with up to 256 nodes. Each node uses a /26 address block (64 addresses), and at least one address block is allocated per node. These addresses are used by the Kubernetes pods with hostNetwork: false. The cluster size may be limited further when some nodes use more than one address block.

    4. Configure StackLight:

      StackLight configuration
    5. Click Create.

  9. Add machines to the bootstrap cluster:

    Machines configuration
    1. In the Clusters tab, click the required cluster name. The cluster page with Machines list opens.

    2. On the cluster page, click Create Machine.

    3. Fill out the form with the following parameters:

      Container Cloud machine configuration

      Parameter

      Description

      Count

      Specify an odd number of machines to create. Only Manager machines are allowed.

      Caution

      The required minimum number of manager machines is three for HA. A cluster can have more than three manager machines but only an odd number of machines.

      In an even-sized cluster, an additional machine remains in the Pending state until an extra manager machine is added. An even number of manager machines does not provide additional fault tolerance but increases the number of nodes required for etcd quorum.

      Flavor

      From the drop-down list, select the required hardware configuration for the machine. The list of available flavors corresponds to the one in your OpenStack environment.

      For the hardware requirements, see Requirements for an OpenStack-based cluster.

      Image

      From the drop-down list, select the required cloud image:

      • CentOS 7.9

      • Ubuntu 22.04

      If you do not have the required image in the list, add it to your OpenStack environment using the Horizon web UI by downloading it from:

      Warning

      A Container Cloud cluster based on both Ubuntu and CentOS operating systems is not supported.

      Availability Zone

      From the drop-down list, select the availability zone from which the new machine will be launched.

      Configure Server Metadata

      Optional. Select Configure Server Metadata and add the required number of string key-value pairs for the machine meta_data configuration in cloud-init.

      The following keys are prohibited because they are used by Container Cloud: KaaS, cluster, clusterID, namespace.

      Boot From Volume

      Optional. Technology Preview. Select to boot a machine from a block storage volume. Use the Up and Down arrows in the Volume Size (GiB) field to define the required volume size.

      This option applies to clouds that do not have enough space on hypervisors. After enabling this option, the Cinder storage is used instead of the Nova storage.

    4. Click Create.

  10. Optional. Using the Container Cloud CLI, modify the provider-specific and other cluster settings as described in Configure optional cluster settings.

  11. Select from the following options to start cluster deployment:

    • Click Deploy.

    • Approve the previously created bootstrap region using one of the following Container Cloud CLI commands:

    ./kaas-bootstrap/container-cloud bootstrap approve all
    
    ./kaas-bootstrap/container-cloud bootstrap approve <bootstrapRegionName>
    

    Caution

    Once you approve the bootstrap region, no cluster or machine modification is allowed.

  12. Monitor the deployment progress of the cluster and machines.

    Monitoring of the cluster readiness

    To monitor the cluster readiness, hover over the status icon of a specific cluster in the Status column of the Clusters page.

    Once the orange blinking status icon becomes green and Ready, the cluster deployment or update is complete.

    You can monitor live deployment status of the following cluster components:

    Component

    Description

    Bastion

    For the OpenStack-based management clusters, the Bastion node IP address status that confirms the Bastion node creation

    Helm

    Installation or upgrade status of all Helm releases

    Kubelet

    Readiness of the node in a Kubernetes cluster, as reported by kubelet

    Kubernetes

    Readiness of all requested Kubernetes objects

    Nodes

    Equality of the number of requested nodes in the cluster to the number of nodes that have the Ready LCM status

    OIDC

    Readiness of the cluster OIDC configuration

    StackLight

    Health of all StackLight-related objects in a Kubernetes cluster

    Swarm

    Readiness of all nodes in a Docker Swarm cluster

    LoadBalancer

    Readiness of the Kubernetes API load balancer

    ProviderInstance

    Readiness of all machines in the underlying infrastructure (virtual or bare metal, depending on the provider type)

    Graceful Reboot

    Readiness of a cluster during a scheduled graceful reboot, available since Cluster releases 15.0.1 and 14.0.0.

    Infrastructure Status

    Available since Container Cloud 2.25.0 for bare metal and OpenStack providers. Readiness of the following cluster components:

    • Bare metal: the MetalLBConfig object along with MetalLB and DHCP subnets.

    • OpenStack: cluster network, routers, load balancers, and Bastion along with their ports and floating IPs.

    LCM Operation

    Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Health of all LCM operations on the cluster and its machines.

    LCM Agent

    Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0). Health of all LCM agents on cluster machines and the status of LCM agents update to the version from the current Cluster release.

    For the history of a cluster deployment or update, refer to Inspect the history of a cluster and machine deployment or update.

    Monitoring of machines readiness

    To monitor machines readiness, use the status icon of a specific machine on the Clusters page.

    • Quick status

      On the Clusters page, in the Managers column. The green status icon indicates that the machine is Ready; the orange status icon indicates that the machine is Updating.

    • Detailed status

      In the Machines section of a particular cluster page, in the Status column. Hover over a particular machine status icon to verify the deploy or update status of a specific machine component.

    You can monitor the status of the following machine components:

    Component

    Description

    Kubelet

    Readiness of a node in a Kubernetes cluster.

    Swarm

    Health and readiness of a node in a Docker Swarm cluster.

    LCM

    LCM readiness status of a node.

    ProviderInstance

    Readiness of a node in the underlying infrastructure (virtual or bare metal, depending on the provider type).

    Graceful Reboot

    Readiness of a machine during a scheduled graceful reboot of a cluster, available since Cluster releases 15.0.1 and 14.0.0.

    Infrastructure Status

    Available since Container Cloud 2.25.0 for the bare metal provider only. Readiness of the IPAMHost, L2Template, BareMetalHost, and BareMetalHostProfile objects associated with the machine.

    LCM Operation

    Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Health of all LCM operations on the machine.

    LCM Agent

    Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0). Health of the LCM Agent on the machine and the status of the LCM Agent update to the version from the current Cluster release.

    The machine creation starts with the Provision status. During provisioning, the machine is not expected to be accessible since its infrastructure (VM, network, and so on) is being created.

    Other machine statuses are the same as the LCMMachine object states:

    1. Uninitialized - the machine is not yet assigned to an LCMCluster.

    2. Pending - the agent reports a node IP address and host name.

    3. Prepare - the machine executes StateItems that correspond to the prepare phase. This phase usually involves downloading the necessary archives and packages.

    4. Deploy - the machine executes StateItems that correspond to the deploy phase, that is, becoming a Mirantis Kubernetes Engine (MKE) node.

    5. Ready - the machine is deployed.

    6. Upgrade - the machine is being upgraded to the new MKE version.

    7. Reconfigure - the machine executes StateItems that correspond to the reconfigure phase. The machine configuration is being updated without affecting workloads running on the machine.

    Once the status changes to Ready, the deployment of the cluster components on this machine is complete.

    You can also monitor the live machine status using API:

    kubectl get machines <machineName> -o wide
    

    Example of system response since Container Cloud 2.23.0:

    NAME   READY LCMPHASE  NODENAME              UPGRADEINDEX  REBOOTREQUIRED  WARNINGS
    demo-0 true  Ready     kaas-node-c6aa8ad3    1             false
    

    For the history of a machine deployment or update, refer to Inspect the history of a cluster and machine deployment or update.

    Alternatively, verify machine statuses from the seed node on which the bootstrap cluster is deployed:

    1. Log in to the seed node.

    2. Export KUBECONFIG to connect to the bootstrap cluster:

      export KUBECONFIG=~/.kube/kind-config-clusterapi
      
    3. Verify the statuses of available LCMMachine objects:

      kubectl get lcmmachines -o wide
      
    4. Verify the statuses of available cluster machines:

      kubectl get machines -o wide
      
  13. Verify that network addresses used on your clusters do not overlap with the following default MKE network addresses for Swarm and MCR:

    • 10.0.0.0/16 is used for Swarm networks. IP addresses from this network are virtual.

    • 10.99.0.0/16 is used for MCR networks. IP addresses from this network are allocated on hosts.

    Verification of Swarm and MCR network addresses

    To verify Swarm and MCR network addresses, run on any master node:

    docker info
    

    Example of system response:

    Server:
     ...
     Swarm:
      ...
      Default Address Pool: 10.0.0.0/16
      SubnetSize: 24
      ...
     Default Address Pools:
       Base: 10.99.0.0/16, Size: 20
     ...
    

    Typically, not all Swarm and MCR addresses are in use. One Swarm Ingress network is created by default and occupies the 10.0.0.0/24 address block. Also, three MCR networks are created by default and occupy three address blocks: 10.99.0.0/20, 10.99.16.0/20, 10.99.32.0/20.

    To verify the actual networks state and addresses in use, run:

    docker network ls
    docker network inspect <networkName>
    

Note

The Bootstrap web UI support for the bare metal provider will be added in one of the following Container Cloud releases.

Configure a bare metal deployment

During creation of a bare metal management cluster using Bootstrap v2, configure several cluster settings to fit your deployment.

Configure BIOS on a bare metal host

Before adding new BareMetalHost objects, configure hardware hosts to correctly load them over the PXE network.

Important

Consider the following common requirements for hardware hosts configuration:

  • Update firmware for BIOS and Baseboard Management Controller (BMC) to the latest available version, especially if you are going to apply the UEFI configuration.

    Container Cloud uses the ipxe.efi binary loader, which might not be compatible with old firmware and might have vendor-related issues with UEFI booting. For example, the Supermicro issue. In this case, we recommend using the legacy booting format.

  • Configure all or at least the PXE NIC on switches.

    If the hardware host has more than one PXE NIC to boot, we strongly recommend setting up only one in the boot order. It speeds up the provisioning phase significantly.

    Some hardware vendors require a host to be rebooted during BIOS configuration changes from legacy to UEFI or vice versa for the extra option with NIC settings to appear in the menu.

  • Connect only one Ethernet port on a host to the PXE network at any given time. Collect the physical address (MAC) of this interface and use it to configure the BareMetalHost object describing the host.

To configure BIOS on a bare metal host:

  Legacy boot mode

  1. Enable the global BIOS mode using BIOS > Boot > boot mode select > legacy. Reboot the host if required.

  2. Enable the LAN-PXE-OPROM support using the following menus:

    • BIOS > Advanced > PCI/PCIe Configuration > LAN OPROM TYPE > legacy

    • BIOS > Advanced > PCI/PCIe Configuration > Network Stack > enabled

    • BIOS > Advanced > PCI/PCIe Configuration > IPv4 PXE Support > enabled

  3. Set up the configured boot order:

    1. BIOS > Boot > Legacy-Boot-Order#1 > Hard Disk

    2. BIOS > Boot > Legacy-Boot-Order#2 > NIC

  4. Save changes and power off the host.

  UEFI boot mode

  1. Enable the global BIOS mode using BIOS > Boot > boot mode select > UEFI. Reboot the host if required.

  2. Enable the LAN-PXE-OPROM support using the following menus:

    • BIOS > Advanced > PCI/PCIe Configuration > LAN OPROM TYPE > uefi

    • BIOS > Advanced > PCI/PCIe Configuration > Network Stack > enabled

    • BIOS > Advanced > PCI/PCIe Configuration > IPv4 PXE Support > enabled

    Note

    UEFI support might not apply to all NICs, but at least the built-in network interfaces should support it.

  3. Set up the configured boot order:

    1. BIOS > Boot > UEFI-Boot-Order#1 > UEFI Hard Disk

    2. BIOS > Boot > UEFI-Boot-Order#2 > UEFI Network

  4. Save changes and power off the host.

Customize the default bare metal host profile

This section describes the bare metal host profile settings and instructs how to configure this profile before deploying Mirantis Container Cloud on physical servers.

The bare metal host profile is a Kubernetes custom resource. It allows the Infrastructure Operator to define how the storage devices and the operating system are provisioned and configured.

The bootstrap templates for a bare metal deployment include the template for the default BareMetalHostProfile object in the following file that defines the default bare metal host profile:

templates/bm/baremetalhostprofiles.yaml.template

Note

Using BareMetalHostProfile, you can configure LVM or mdadm-based software RAID support during a management or managed cluster creation. For details, see Configure RAID support.

This feature is available as Technology Preview. Use such configuration for testing and evaluation purposes only. For the Technology Preview feature definition, refer to Technology Preview features.

Warning

Any data stored on any device defined in the fileSystems list can be deleted or corrupted during cluster (re)deployment. It happens because each device from the fileSystems list is a part of the rootfs directory tree that is overwritten during (re)deployment.

Examples of affected devices include:

  • A raw device partition with a file system on it

  • A device partition in a volume group with a logical volume that has a file system on it

  • An mdadm RAID device with a file system on it

  • An LVM RAID device with a file system on it

Neither the wipe field (deprecated) nor the wipeDevice structure (recommended since Container Cloud 2.26.0) has any effect in this case; they cannot protect data on these devices.

Therefore, to prevent data loss, move the necessary data from these file systems to another server beforehand, if required.

The customization procedure of BareMetalHostProfile is almost the same for the management and managed clusters, with the following differences:

  • For a management cluster, the customization automatically applies to machines during bootstrap. For a managed cluster, you apply the changes using kubectl before creating the cluster.

  • For a management cluster, you edit the default baremetalhostprofiles.yaml.template. For a managed cluster, you create a new BareMetalHostProfile with the necessary configuration.

For the procedure details, see Create a custom bare metal host profile. Use this procedure for both types of clusters considering the differences described above.

Configure NIC bonding

You can configure L2 templates for the management cluster to set up a bond network interface for the PXE and management network.

This configuration must be applied to the bootstrap templates before you run the bootstrap script to deploy the management cluster.

Configuration requirements for NIC bonding

  • Add at least two physical interfaces to each host in your management cluster.

  • Connect at least two interfaces per host to an Ethernet switch that supports Link Aggregation Control Protocol (LACP) port groups and LACP fallback.

  • Configure an LACP group on the ports connected to the NICs of a host.

  • Configure the LACP fallback on the port group to ensure that the host can boot over the PXE network before the bond interface is set up on the host operating system.

  • Configure server BIOS for both NICs of a bond to be PXE-enabled.

  • If the server does not support booting from multiple NICs, configure the port of the LACP group that is connected to the PXE-enabled NIC of a server to be the primary port. With this setting, the port becomes active in the fallback mode.

  • Configure the ports that connect servers to the PXE network with the PXE VLAN as native or untagged.

For reference configuration of network fabric in a baremetal-based cluster, see Network fabric.

To configure a bond interface that aggregates two interfaces for the PXE and management network:

  1. In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:

    1. Verify that only the following parameters for the declaration of {{nic 0}} and {{nic 1}} are set, as shown in the example below:

      • dhcp4

      • dhcp6

      • match

      • set-name

      Remove other parameters.

    2. Verify that the declaration of the bond interface bond0 has the interfaces parameter listing both Ethernet interfaces.

    3. Verify that the node address in the PXE network (ip "bond0:mgmt-pxe" in the below example) is bound to the bond interface or to the virtual bridge interface tied to that bond.

      Caution

      No VLAN ID must be configured for the PXE network from the host side.

    4. Configure bonding options using the parameters field. The only mandatory option is mode. See the example below for details.

      Note

      You can set any mode supported by netplan and your hardware.

      Important

      Bond monitoring is disabled in Ubuntu by default. However, Mirantis highly recommends enabling it using Media Independent Interface (MII) monitoring by setting the mii-monitor-interval parameter to a non-zero value. For details, see Linux documentation: bond monitoring.

  2. Verify your configuration using the following example:

    kind: L2Template
    metadata:
      name: kaas-mgmt
      ...
    spec:
      ...
      l3Layout:
        - subnetName: kaas-mgmt
          scope:      namespace
      npTemplate: |
        version: 2
        ethernets:
          {{nic 0}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 0}}
            set-name: {{nic 0}}
          {{nic 1}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 1}}
            set-name: {{nic 1}}
        bonds:
          bond0:
            interfaces:
              - {{nic 0}}
              - {{nic 1}}
            parameters:
              mode: 802.3ad
              mii-monitor-interval: 100
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "bond0:mgmt-pxe" }}
        vlans:
          k8s-lcm:
            id: SET_VLAN_ID
            link: bond0
            addresses:
              - {{ ip "k8s-lcm:kaas-mgmt" }}
            nameservers:
              addresses: {{ nameservers_from_subnet "kaas-mgmt" }}
            routes:
              - to: 0.0.0.0/0
                via: {{ gateway_from_subnet "kaas-mgmt" }}
        ...
    
  3. Proceed to bootstrap your management cluster as described in Deploy a management cluster using CLI.

Separate PXE and management networks

This section describes how to configure a dedicated PXE network for a management bare metal cluster. A separate PXE network allows isolating the sensitive bare metal provisioning process from end users. The users still have access to Container Cloud services, such as Keycloak, to authenticate workloads in managed clusters, such as Horizon in a Mirantis OpenStack for Kubernetes cluster.

Note

This additional configuration procedure must be completed as part of the Deploy a management cluster using CLI steps. It substitutes or appends some configuration parameters and templates used in Deploy a management cluster using CLI so that the management cluster uses two networks, PXE and management, instead of one combined PXE/management network. We recommend reviewing the Deploy a management cluster using CLI procedure first.

The following table describes the overall network mapping scheme with all L2/L3 parameters, for example, for two networks, PXE (CIDR 10.0.0.0/24) and management (CIDR 10.0.11.0/24):

Network mapping overview

Deployment file name

Network

Parameters and values

cluster.yaml

Management

  • SET_LB_HOST=10.0.11.90

  • SET_METALLB_ADDR_POOL=10.0.11.61-10.0.11.80

ipam-objects.yaml

PXE

  • SET_IPAM_CIDR=10.0.0.0/24

  • SET_PXE_NW_GW=10.0.0.1

  • SET_PXE_NW_DNS=8.8.8.8

  • SET_IPAM_POOL_RANGE=10.0.0.100-10.0.0.109

  • SET_METALLB_PXE_ADDR_POOL=10.0.0.61-10.0.0.70

ipam-objects.yaml

Management

  • SET_LCM_CIDR=10.0.11.0/24

  • SET_LCM_RANGE=10.0.11.100-10.0.11.199

  • SET_LB_HOST=10.0.11.90

  • SET_METALLB_ADDR_POOL=10.0.11.61-10.0.11.80

bootstrap.sh

PXE

  • KAAS_BM_PXE_IP=10.0.0.20

  • KAAS_BM_PXE_MASK=24

  • KAAS_BM_PXE_BRIDGE=br0

  • KAAS_BM_BM_DHCP_RANGE=10.0.0.30,10.0.0.59,255.255.255.0

  • BOOTSTRAP_METALLB_ADDRESS_POOL=10.0.0.61-10.0.0.80


When using separate PXE and management networks, the management cluster services are exposed in different networks using two separate MetalLB address pools:

  • Services exposed through the PXE network are as follows:

    • Ironic API as a bare metal provisioning server

    • HTTP server that provides images for network boot and server provisioning

    • Caching server for accessing the Container Cloud artifacts deployed on hosts

  • Services exposed through the management network are all other Container Cloud services, such as Keycloak, web UI, and so on.

To configure separate PXE and management networks:

  1. Inspect the guidelines to follow during configuration of the Subnet object as a MetalLB address pool as described in MetalLB configuration guidelines for subnets.

  2. To ensure successful bootstrap, enable asymmetric routing on the interfaces of the management cluster nodes. This is required because the seed node relies on one network by default, which can potentially cause traffic asymmetry.

    In the kernelParameters section of bm/baremetalhostprofiles.yaml.template, set rp_filter to 2. This enables loose mode as defined in RFC3704.

    Example configuration of asymmetric routing
    ...
    kernelParameters:
      ...
      sysctl:
        # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
        net.ipv4.conf.k8s-lcm.rp_filter: "2"
        # Enables the "Loose mode" for the "bond0" interface (PXE network)
        net.ipv4.conf.bond0.rp_filter: "2"
        ...
    

    Note

    More complicated solutions that are not described in this manual involve eliminating traffic asymmetry, for example:

    • Configure source routing on management cluster nodes.

    • Plug the seed node into the same networks as the management cluster nodes, which requires custom configuration of the seed node.

  3. In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:

    • Substitute all the Subnet object templates with the new ones as described in the example template below

    • Update the L2 template spec.l3Layout and spec.npTemplate fields as described in the example template below

    Example of the Subnet object templates
    # Subnet object that provides IP addresses for bare metal hosts of
    # management cluster in the PXE network.
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: mgmt-pxe
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        kaas-mgmt-pxe-subnet: ""
    spec:
      cidr: SET_IPAM_CIDR
      gateway: SET_PXE_NW_GW
      nameservers:
        - SET_PXE_NW_DNS
      includeRanges:
        - SET_IPAM_POOL_RANGE
      excludeRanges:
        - SET_METALLB_PXE_ADDR_POOL
    ---
    # Subnet object that provides IP addresses for bare metal hosts of
    # management cluster in the management network.
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: mgmt-lcm
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        kaas-mgmt-lcm-subnet: ""
        ipam/SVC-k8s-lcm: "1"
        ipam/SVC-ceph-cluster: "1"
        ipam/SVC-ceph-public: "1"
        cluster.sigs.k8s.io/cluster-name: CLUSTER_NAME
    spec:
      cidr: {{ SET_LCM_CIDR }}
      includeRanges:
        - {{ SET_LCM_RANGE }}
      excludeRanges:
        - SET_LB_HOST
        - SET_METALLB_ADDR_POOL
    ---
    # Deprecated since 2.27.0. Subnet object that provides configuration
    # for "services-pxe" MetalLB address pool that will be used to expose
    # services LB endpoints in the PXE network.
    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: mgmt-pxe-lb
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        metallb/address-pool-name: services-pxe
        metallb/address-pool-protocol: layer2
        metallb/address-pool-auto-assign: "false"
        cluster.sigs.k8s.io/cluster-name: CLUSTER_NAME
    spec:
      cidr: SET_IPAM_CIDR
      includeRanges:
        - SET_METALLB_PXE_ADDR_POOL
    
    Example of the L2 template spec
    kind: L2Template
    ...
    spec:
      ...
      l3Layout:
        - scope: namespace
          subnetName: kaas-mgmt-pxe
          labelSelector:
            kaas.mirantis.com/provider: baremetal
            kaas-mgmt-pxe-subnet: ""
        - scope: namespace
          subnetName: kaas-mgmt-lcm
          labelSelector:
            kaas.mirantis.com/provider: baremetal
            kaas-mgmt-lcm-subnet: ""
      npTemplate: |
        version: 2
        renderer: networkd
        ethernets:
          {{nic 0}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 0}}
            set-name: {{nic 0}}
          {{nic 1}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 1}}
            set-name: {{nic 1}}
        bridges:
          bm-pxe:
            interfaces:
             - {{ nic 0 }}
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "bm-pxe:kaas-mgmt-pxe" }}
            nameservers:
              addresses: {{ nameservers_from_subnet "kaas-mgmt-pxe" }}
            routes:
              - to: 0.0.0.0/0
                via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
          k8s-lcm:
            interfaces:
             - {{ nic 1 }}
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ ip "k8s-lcm:kaas-mgmt-lcm" }}
            nameservers:
              addresses: {{ nameservers_from_subnet "kaas-mgmt-lcm" }}
    

    Deprecated since Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0): the last Subnet template named mgmt-pxe-lb in the example above will be used to configure the MetalLB address pool in the PXE network. The bare metal provider will automatically configure MetalLB with address pools using the Subnet objects identified by specific labels.

    Warning

    The bm-pxe address must have a separate interface with only one address on this interface.

  4. Verify the current MetalLB configuration that is stored in MetalLB objects:

    kubectl -n metallb-system get ipaddresspools,l2advertisements
    

    For the example configuration described above, the system outputs a similar content:

    NAME                                    AGE
    ipaddresspool.metallb.io/default        129m
    ipaddresspool.metallb.io/services-pxe   129m
    
    NAME                                      AGE
    l2advertisement.metallb.io/default        129m
    l2advertisement.metallb.io/services-pxe   129m
    

    To verify the MetalLB objects:

    kubectl -n metallb-system get <object> -o json | jq '.spec'
    

    For the example configuration described above, the system outputs a similar content for ipaddresspool objects:

    {
      "addresses": [
        "10.0.11.61-10.0.11.80"
      ],
      "autoAssign": true,
      "avoidBuggyIPs": false
    }
    $ kubectl -n metallb-system get ipaddresspool.metallb.io/services-pxe -o json | jq '.spec'
    {
      "addresses": [
        "10.0.0.61-10.0.0.70"
      ],
      "autoAssign": false,
      "avoidBuggyIPs": false
    }
    

    The auto-assign parameter will be set to false for all address pools except the default one. So, a particular service will get an address from such an address pool only if the Service object has a special metallb.universe.tf/address-pool annotation that points to the specific address pool name.

    Note

    Every Container Cloud service on a management cluster is expected to be assigned to one of the address pools. The current design uses two MetalLB address pools:

    • services-pxe is a reserved address pool name to use for the Container Cloud services in the PXE network (Ironic API, HTTP server, caching server).

      The bootstrap cluster also uses the services-pxe address pool for its provisioning services so that management cluster nodes can be provisioned from the bootstrap cluster. After the management cluster is deployed, the bootstrap cluster is deleted and that address pool is solely used by the newly deployed cluster.

    • default is an address pool to use for all other Container Cloud services in the management network. No annotation is required on the Service objects in this case.

  5. Select from the following options for configuration of the dedicatedMetallbPools flag:

    • Skip this step because the flag is hardcoded to true.

    • Verify that the flag is set to the default true value.

    The flag enables splitting of LB endpoints for the Container Cloud services. The metallb.universe.tf/address-pool annotations on the Service objects are configured by the bare metal provider automatically when the dedicatedMetallbPools flag is set to true.

    Example Service object configured by the baremetal-operator Helm release:

    apiVersion: v1
    kind: Service
    metadata:
      name: ironic-api
      annotations:
        metallb.universe.tf/address-pool: services-pxe
    spec:
      ports:
      - port: 443
        targetPort: 443
      type: LoadBalancer
    

    The metallb.universe.tf/address-pool annotation on the Service object is set to services-pxe by the baremetal provider, so the ironic-api service will be assigned an LB address from the corresponding MetalLB address pool.

  6. In addition to the network parameters defined in Deploy a management cluster using CLI, configure the following ones by replacing them in templates/bm/ipam-objects.yaml.template:

    New subnet template parameters

    Parameter

    Description

    Example value

    SET_LCM_CIDR

    Address of a management network for the management cluster in the CIDR notation. You can later share this network with managed clusters where it will act as the LCM network. If managed clusters have their separate LCM networks, those networks must be routable to the management network.

    10.0.11.0/24

    SET_LCM_RANGE

    Address range that includes addresses to be allocated to bare metal hosts in the management network for the management cluster. When this network is shared with managed clusters, the size of this range limits the number of hosts that can be deployed in all clusters that share this network. When this network is solely used by a management cluster, the range should include at least 3 IP addresses for bare metal hosts of the management cluster.

    10.0.11.100-10.0.11.109

    SET_METALLB_PXE_ADDR_POOL

    Address range to be used for LB endpoints of the Container Cloud services: Ironic-API, HTTP server, and caching server. This range must be within the PXE network. The minimum required range is 5 IP addresses.

    10.0.0.61-10.0.0.70

    The following parameters will now be tied to the management network while their meaning remains the same as described in Deploy a management cluster using CLI:

    Subnet template parameters migrated to management network

    Parameter

    Description

    Example value

    SET_LB_HOST

    IP address of the externally accessible API endpoint of the management cluster. This address must NOT be within the SET_METALLB_ADDR_POOL range but within the management network. External load balancers are not supported.

    10.0.11.90

    SET_METALLB_ADDR_POOL

    The address range to be used for the externally accessible LB endpoints of the Container Cloud services, such as Keycloak, web UI, and so on. This range must be within the management network. The minimum required range is 19 IP addresses.

    10.0.11.61-10.0.11.80

  7. Proceed to further steps in Deploy a management cluster using CLI.

Configure multiple DHCP ranges using Subnet resources

To facilitate multi-rack and other types of distributed bare metal datacenter topologies, the dnsmasq DHCP server used for host provisioning in Container Cloud supports working with multiple L2 segments through network routers that support DHCP relay.

Container Cloud has its own DHCP relay running on one of the management cluster nodes. That DHCP relay proxies DHCP requests in the same L2 domain where the management cluster nodes are located.

Caution

Networks used for host provisioning of a managed cluster must have routes to the PXE network (when a dedicated PXE network is configured) or to the combined PXE/management network of the management cluster. This configuration enables hosts to access the management cluster services that are used during host provisioning.

Management cluster nodes must have routes through the PXE network to PXE network segments used on a managed cluster. The following example contains L2 template fragments for a management cluster node:

l3Layout:
  # PXE/static subnet for a management cluster
  - scope: namespace
    subnetName: kaas-mgmt-pxe
    labelSelector:
      kaas-mgmt-pxe-subnet: "1"
  # management (LCM) subnet for a management cluster
  - scope: namespace
    subnetName: kaas-mgmt-lcm
    labelSelector:
      kaas-mgmt-lcm-subnet: "1"
  # PXE/dhcp subnets for a managed cluster
  - scope: namespace
    subnetName: managed-dhcp-rack-1
  - scope: namespace
    subnetName: managed-dhcp-rack-2
  - scope: namespace
    subnetName: managed-dhcp-rack-3
  ...
npTemplate: |
  ...
  bonds:
    bond0:
      interfaces:
        - {{ nic 0 }}
        - {{ nic 1 }}
      parameters:
        mode: active-backup
        primary: {{ nic 0 }}
        mii-monitor-interval: 100
      dhcp4: false
      dhcp6: false
      addresses:
        # static address on management node in the PXE network
        - {{ ip "bond0:kaas-mgmt-pxe" }}
      routes:
        # routes to managed PXE network segments
        - to: {{ cidr_from_subnet "managed-dhcp-rack-1" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
        - to: {{ cidr_from_subnet "managed-dhcp-rack-2" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
        - to: {{ cidr_from_subnet "managed-dhcp-rack-3" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
        ...

To configure DHCP ranges for dnsmasq, create the Subnet objects tagged with the ipam/SVC-dhcp-range label while setting up subnets for a managed cluster using CLI.

Caution

Support of multiple DHCP ranges has the following limitations:

  • Using custom DNS server addresses for servers that boot over PXE is not supported.

  • The Subnet objects for DHCP ranges cannot be associated with any specific cluster because the DHCP server configuration is only applicable to the management cluster where the DHCP server is running. The cluster.sigs.k8s.io/cluster-name label will be ignored.

    Note

    Before the Cluster release 16.1.0, the Subnet object contains the kaas.mirantis.com/region label that specifies the region where the DHCP ranges will be applied.

Migration of DHCP configuration for existing management clusters

Note

This section applies only to existing management clusters that were created before Container Cloud 2.24.0.

Caution

Since Container Cloud 2.24.0, you can only remove the deprecated dnsmasq.dhcp_range, dnsmasq.dhcp_ranges, dnsmasq.dhcp_routers, and dnsmasq.dhcp_dns_servers values from the cluster spec.

The Admission Controller does not accept any other changes in these values. This configuration is completely superseded by the Subnet object.

The DHCP configuration was automatically migrated from the cluster spec to Subnet objects after the cluster upgrade to 2.21.0.
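
For example, to review the Subnet objects that carry DHCP ranges, you can list the objects marked with the ipam/SVC-dhcp-range label; the fully qualified resource name is used here to avoid ambiguity:

kubectl get subnets.ipam.mirantis.com -A -l ipam/SVC-dhcp-range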

To remove the deprecated dnsmasq parameters from the cluster spec:

  1. Open the management cluster spec for editing.

  2. In the baremetal-operator release values, remove the dnsmasq.dhcp_range, dnsmasq.dhcp_ranges, dnsmasq.dhcp_routers, and dnsmasq.dhcp_dns_servers parameters. For example:

    regional:
    - helmReleases:
      - name: baremetal-operator
        values:
          dnsmasq:
            dhcp_range: 10.204.1.0,10.204.5.255,255.255.255.0
    

    Caution

    The dnsmasq.dhcp_<name> parameters of the baremetal-operator Helm chart values in the Cluster spec are deprecated since the Cluster release 11.5.0 and removed in the Cluster release 14.0.0.

  3. Ensure that the required DHCP ranges and options are set in the Subnet objects. For configuration details, see Configure DHCP ranges for dnsmasq.

The dnsmasq configuration options dhcp-option=3 and dhcp-option=6 are absent in the default configuration. Therefore, by default, dnsmasq sends the DNS server and default route to DHCP clients as defined in the official dnsmasq documentation:

  • The netmask and broadcast address are the same as on the host running dnsmasq.

  • The DNS server and default route are set to the address of the host running dnsmasq.

  • If the domain name option is set, this name is sent to DHCP clients.

Configure DHCP ranges for dnsmasq

  1. Create the Subnet objects tagged with the ipam/SVC-dhcp-range label.

    Caution

    For cluster-specific subnets, create Subnet objects in the same project (namespace) as the related Cluster object. For shared subnets, create Subnet objects in the default namespace.

    To create the Subnet objects, refer to Create subnets.

    Use the following Subnet object example to specify DHCP ranges and DHCP options to pass the default route address:

    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: mgmt-dhcp-range
      namespace: default
      labels:
        ipam/SVC-dhcp-range: ""
        kaas.mirantis.com/provider: baremetal
    spec:
      cidr: 10.11.0.0/24
      gateway: 10.11.0.1
      includeRanges:
        - 10.11.0.121-10.11.0.125
        - 10.11.0.191-10.11.0.199
    

    Note

    Setting of custom nameservers in the DHCP subnet is not supported.

    After you create the above Subnet object, the provided data is used to render the Dnsmasq object that configures the dnsmasq deployment. You do not have to manually edit the Dnsmasq object.

  2. Verify that the changes are applied to the Dnsmasq object:

    kubectl --kubeconfig <pathToMgmtClusterKubeconfig> \
    -n kaas get dnsmasq dnsmasq-dynamic-config -o json
    
Configure DHCP relay on ToR switches

For servers to access the DHCP server across L2 segment boundaries, for example, from another rack with a different VLAN for the PXE network, you must configure the DHCP relay (agent) service on the border switch of the segment, for example, on a top-of-rack (ToR) or leaf (distribution) switch, depending on the data center network topology.

Warning

To ensure predictable routing for the relay of DHCP packets, Mirantis strongly advises against the use of chained DHCP relay configurations. This precaution limits the number of hops for DHCP packets, with an optimal scenario being a single hop.

This approach is justified by the unpredictable nature of chained relay configurations and potential incompatibilities between software and hardware relay implementations.

The dnsmasq server listens on the PXE network of the management cluster by using the dhcp-lb Kubernetes Service.

To configure the DHCP relay service, specify the external address of the dhcp-lb Kubernetes Service as an upstream address for the relayed DHCP requests, which is the IP helper address for DHCP. The dnsmasq deployment behind this Service accepts only relayed DHCP requests.

Container Cloud has its own DHCP relay running on one of the management cluster nodes. That DHCP relay proxies DHCP requests in the same L2 domain where the management cluster nodes are located.

To obtain the actual IP address issued to the dhcp-lb Kubernetes Service:

kubectl -n kaas get service dhcp-lb
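
To extract only the external IP address of the Service, you can run, for example, the following command (a sketch that assumes the Service publishes its address in status.loadBalancer):

kubectl -n kaas get service dhcp-lb \
-o jsonpath='{.status.loadBalancer.ingress[0].ip}{"\n"}'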
Enable dynamic IP allocation

Available since the Cluster release 16.1.0

This section instructs you on how to enable the dynamic IP allocation feature to increase the number of bare metal hosts provisioned in parallel on managed clusters.

Using this feature, you can effortlessly deploy a large managed cluster by provisioning up to 100 hosts simultaneously. In addition to dynamic IP allocation, this feature disables the ping check in the DHCP server. Therefore, if you plan to deploy large managed clusters, enable this feature during the management cluster bootstrap.

Caution

Before using this feature, familiarize yourself with DHCP range requirements for PXE.

To enable dynamic IP allocation for large managed clusters:

In the Cluster object of the management cluster, modify the configuration of baremetal-operator by setting dynamic_bootp to true:

spec:
  ...
  providerSpec:
    value:
      kaas:
        ...
        regional:
          - helmReleases:
            - name: baremetal-operator
              values:
                dnsmasq:
                  dynamic_bootp: true
            provider: baremetal
          ...
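
For example, you can open the Cluster object of the management cluster for editing as follows (a generic sketch; adjust the kubeconfig path and cluster name to your environment):

kubectl --kubeconfig <pathToMgmtClusterKubeconfig> \
edit cluster <mgmtClusterName>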
Configure optional cluster settings

Note

Consider this section as part of the Bootstrap v2 CLI or web UI procedure.

During creation of a management cluster using Bootstrap v2, you can configure optional cluster settings using the Container Cloud API by modifying the Cluster object or cluster.yaml.template of the required provider.

To configure optional cluster settings:

  1. Select from the following options:

    • If you create a management cluster using the Container Cloud API, proceed to the next step and configure cluster.yaml.template of the required provider instead of the Cluster object while following the below procedure.

    • If you create a management cluster using the Container Cloud Bootstrap web UI:

      1. Log in to the seed node where the bootstrap cluster is located.

      2. Navigate to the kaas-bootstrap folder.

      3. Export KUBECONFIG to connect to the bootstrap cluster:

        export KUBECONFIG=<pathToKindKubeconfig>
        
      4. Obtain the cluster name and open its Cluster object for editing:

        kubectl get clusters
        
        kubectl edit cluster <clusterName>
        
  2. Technology Preview. Enable custom host names for cluster machines. When enabled, any machine host name in a particular region matches the related Machine object name. For example, instead of the default kaas-node-<UID>, a machine host name will be master-0. The custom naming format is more convenient and easier to operate with.

    To enable the feature on the management cluster and its future managed clusters:

    1. In the Cluster object, find the spec.providerSpec.value.kaas.regional.helmReleases.name: <provider-name> section.

    2. Under values.config, add customHostnamesEnabled: true.

      For example, for the bare metal provider:

      regional:
       - helmReleases:
         - name: baremetal-provider
           values:
             config:
               allInOneAllowed: false
               customHostnamesEnabled: true
               internalLoadBalancers: false
         provider: baremetal-provider
      

    Add the following environment variable:

    export CUSTOM_HOSTNAMES=true
    
  3. Technology Preview. Enable the Linux Audit daemon auditd to monitor activity of cluster processes and prevent potential malicious activity.

    Configuration for auditd

    In the Cluster object, add the auditd parameters:

    spec:
      providerSpec:
        value:
          audit:
            auditd:
              enabled: <bool>
              enabledAtBoot: <bool>
              backlogLimit: <int>
              maxLogFile: <int>
              maxLogFileAction: <string>
              maxLogFileKeep: <int>
              mayHaltSystem: <bool>
              presetRules: <string>
              customRules: <string>
              customRulesX32: <text>
              customRulesX64: <text>
    

    Configuration parameters for auditd:

    enabled

    Boolean, default - false. Enables the auditd role to install the auditd packages and configure rules. CIS rules: 4.1.1.1, 4.1.1.2.

    enabledAtBoot

    Boolean, default - false. Configures grub to audit processes that can be audited even if they start up prior to auditd startup. CIS rule: 4.1.1.3.

    backlogLimit

    Integer, default - none. Configures the backlog to hold records. If audit=1 is configured during boot, the backlog holds 64 records. If more than 64 records are created during boot, auditd records will be lost, and potential malicious activity may go undetected. CIS rule: 4.1.1.4.

    maxLogFile

    Integer, default - none. Configures the maximum size of the audit log file. Once the log reaches the maximum size, it is rotated and a new log file is created. CIS rule: 4.1.2.1.

    maxLogFileAction

    String, default - none. Defines handling of the audit log file reaching the maximum file size. Allowed values:

    • keep_logs - rotate logs but never delete them

    • rotate - add a cron job to compress rotated log files and keep maximum 5 compressed files.

    • compress - compress log files and keep them under the /var/log/auditd/ directory. Requires auditd_max_log_file_keep to be enabled.

    CIS rule: 4.1.2.2.

    maxLogFileKeep

    Integer, default - 5. Defines the number of compressed log files to keep under the /var/log/auditd/ directory. Requires auditd_max_log_file_action=compress. CIS rules - none.

    mayHaltSystem

    Boolean, default - false. Halts the system when the audit logs are full. Applies the following configuration:

    • space_left_action = email

    • action_mail_acct = root

    • admin_space_left_action = halt

    CIS rule: 4.1.2.3.

    customRules

    String, default - none. Base64-encoded content of the 60-custom.rules file for any architecture. CIS rules - none.

    customRulesX32

    String, default - none. Base64-encoded content of the 60-custom.rules file for the i386 architecture. CIS rules - none.

    customRulesX64

    String, default - none. Base64-encoded content of the 60-custom.rules file for the x86_64 architecture. CIS rules - none.

    presetRules

    String, default - none. Comma-separated list of the following built-in preset rules:

    • access

    • actions

    • delete

    • docker

    • identity

    • immutable

    • logins

    • mac-policy

    • modules

    • mounts

    • perm-mod

    • privileged

    • scope

    • session

    • system-locale

    • time-change

    Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0), in the Technology Preview scope, you can use the following groups that combine some of the preset rules indicated above and specify them in presetRules:

    • ubuntu-cis-rules - this group contains rules to comply with the Ubuntu CIS Benchmark recommendations, including the following CIS Ubuntu 20.04 v2.0.1 rules:

      • scope - 5.2.3.1

      • actions - same as 5.2.3.2

      • time-change - 5.2.3.4

      • system-locale - 5.2.3.5

      • privileged - 5.2.3.6

      • access - 5.2.3.7

      • identity - 5.2.3.8

      • perm-mod - 5.2.3.9

      • mounts - 5.2.3.10

      • session - 5.2.3.11

      • logins - 5.2.3.12

      • delete - 5.2.3.13

      • mac-policy - 5.2.3.14

      • modules - 5.2.3.19

    • docker-cis-rules - this group contains rules to comply with the Docker CIS Benchmark recommendations, including the Docker CIS v1.6.0 rules 1.1.3 - 1.1.18.

    You can also use two additional keywords inside presetRules:

    • none - select no built-in rules.

    • all - select all built-in rules. When using this keyword, you can add the ! prefix to a rule name to exclude some rules. You can use the ! prefix for rules only if you add the all keyword as the first rule. Place a rule with the ! prefix only after the all keyword.

    Example configurations:

    • presetRules: none - disable all preset rules

    • presetRules: docker - enable only the docker rules

    • presetRules: access,actions,logins - enable only the access, actions, and logins rules

    • presetRules: ubuntu-cis-rules - enable all rules from the ubuntu-cis-rules group

    • presetRules: docker-cis-rules,actions - enable all rules from the docker-cis-rules group and the actions rule

    • presetRules: all - enable all preset rules

    • presetRules: all,!immutable,!session - enable all preset rules except immutable and session


    CIS controls:

    • 4.1.3 (time-change)
    • 4.1.4 (identity)
    • 4.1.5 (system-locale)
    • 4.1.6 (mac-policy)
    • 4.1.7 (logins)
    • 4.1.8 (session)
    • 4.1.9 (perm-mod)
    • 4.1.10 (access)
    • 4.1.11 (privileged)
    • 4.1.12 (mounts)
    • 4.1.13 (delete)
    • 4.1.14 (scope)
    • 4.1.15 (actions)
    • 4.1.16 (modules)
    • 4.1.17 (immutable)

    Docker CIS controls:

    • 1.1.4
    • 1.1.8
    • 1.1.10
    • 1.1.12
    • 1.1.13
    • 1.1.15
    • 1.1.16
    • 1.1.17
    • 1.1.18
    • 1.2.3
    • 1.2.4
    • 1.2.5
    • 1.2.6
    • 1.2.7
    • 1.2.10
    • 1.2.11
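
    For example, a minimal auditd configuration that enables the daemon at boot and applies a few preset rules may look as follows (the values below are illustrative only):

    spec:
      providerSpec:
        value:
          audit:
            auditd:
              enabled: true
              enabledAtBoot: true
              maxLogFile: 10
              maxLogFileAction: rotate
              presetRules: access,docker,logins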
  4. Configure OIDC integration:

    LDAP configuration

    Example configuration:

    spec:
      providerSpec:
        value:
          kaas:
            management:
              helmReleases:
              - name: iam
                values:
                  keycloak:
                    userFederation:
                      providers:
                        - displayName: "<LDAP_NAME>"
                          providerName: "ldap"
                          priority: 1
                          fullSyncPeriod: -1
                          changedSyncPeriod: -1
                          config:
                            pagination: "true"
                            debug: "false"
                            searchScope: "1"
                            connectionPooling: "true"
                            usersDn: "<DN>" # "ou=People, o=<ORGANIZATION>, dc=<DOMAIN_COMPONENT>"
                            userObjectClasses: "inetOrgPerson,organizationalPerson"
                            usernameLDAPAttribute: "uid"
                            rdnLDAPAttribute: "uid"
                            vendor: "ad"
                            editMode: "READ_ONLY"
                            uuidLDAPAttribute: "uid"
                            connectionUrl: "ldap://<LDAP_DNS>"
                            syncRegistrations: "false"
                            authType: "simple"
                            bindCredential: ""
                            bindDn: ""
                      mappers:
                        - name: "username"
                          federationMapperType: "user-attribute-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.attribute: "uid"
                            user.model.attribute: "username"
                            is.mandatory.in.ldap: "true"
                            read.only: "true"
                            always.read.value.from.ldap: "false"
                        - name: "full name"
                          federationMapperType: "full-name-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.full.name.attribute: "cn"
                            read.only: "true"
                            write.only: "false"
                        - name: "last name"
                          federationMapperType: "user-attribute-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.attribute: "sn"
                            user.model.attribute: "lastName"
                            is.mandatory.in.ldap: "true"
                            read.only: "true"
                            always.read.value.from.ldap: "true"
                        - name: "email"
                          federationMapperType: "user-attribute-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.attribute: "mail"
                            user.model.attribute: "email"
                            is.mandatory.in.ldap: "false"
                            read.only: "true"
                            always.read.value.from.ldap: "true"
    

    Note

    • Verify that the userFederation section is located on the same level as the initUsers section.

    • Verify that all attributes set in the mappers section are defined for users in the specified LDAP system. Missing attributes may cause authorization issues.

    For details, see Configure LDAP for IAM.

    Google OAuth configuration

    Example configuration:

    keycloak:
      externalIdP:
        google:
          enabled: true
          config:
            clientId: <Google_OAuth_client_ID>
            clientSecret: <Google_OAuth_client_secret>
    

    For details, see Configure Google OAuth IdP for IAM.

  5. Optionally disable NTP, which is enabled by default. This option disables the management of the chrony configuration by Container Cloud so that you can use your own system for chrony management. Otherwise, configure the regional NTP server parameters as described below.

    NTP configuration

    Configure the regional NTP server parameters to be applied to all machines of managed clusters.

    In the Cluster object, add the ntp:servers section with the list of required server names:

    spec:
      ...
      providerSpec:
        value:
          ...
          ntpEnabled: true
          kaas:
            ...
            regional:
              - helmReleases:
                - name: <providerName>-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: <providerName>
              ...
    

    To disable NTP:

    spec:
      ...
      providerSpec:
        value:
          ...
          ntpEnabled: false
          ...
    
  6. Applies only to the bare metal provider since the Cluster release 16.1.0. If you plan to deploy large managed clusters, enable dynamic IP allocation to increase the number of bare metal hosts provisioned in parallel. For details, see Enable dynamic IP allocation.

  7. Applies to the OpenStack provider only:

    1. Configure periodic backups of MariaDB. For more details, see Configure periodic backups of MariaDB.

      Example configuration:

      spec:
        providerSpec:
          value:
            kaas:
              management:
                helmReleases:
                ...
                - name: iam
                  values:
                    keycloak:
                      mariadb:
                        conf:
                          phy_backup:
                            enabled: true
                            backup_timeout: 30000
                            allow_unsafe_backup: true
                            backups_to_keep: 3
                            backup_pvc_name: mariadb-phy-backup-data
                            full_backup_cycle: 70000
                            backup_required_space_ratio: 1.4
                            schedule_time: '30 2 * * *'
      
    2. Technology Preview. Create all load balancers of the cluster with a specific Octavia flavor by defining the following parameter in the spec:providerSpec section of templates/cluster.yaml.template:

      serviceAnnotations:
        loadbalancer.openstack.org/flavor-id: <octaviaFlavorID>
      

      For details, see OpenStack documentation: Octavia Flavors.

      Note

      This feature is not supported by OpenStack Queens.

Now, proceed with completing the bootstrap process using the Container Cloud Bootstrap web UI or API depending on the selected provider as described in Deploy a Container Cloud management cluster.

Post-deployment steps

After bootstrapping the management cluster, collect and save the following cluster details in a secure location:

  1. Obtain the management cluster kubeconfig:

    ./container-cloud get cluster-kubeconfig \
    --kubeconfig <pathToKindKubeconfig> \
    --cluster-name <clusterName>
    

    By default, pathToKindKubeconfig is $HOME/.kube/kind-config-clusterapi.

  2. Obtain the Keycloak credentials as described in Access the Keycloak Admin Console.

  3. Obtain MariaDB credentials for IAM.

  4. Remove the kind cluster:

    ./bin/kind delete cluster -n <kindClusterName>
    

    By default, kindClusterName is clusterapi.

Now, you can proceed with operating your management cluster through the Container Cloud web UI and deploying managed clusters as described in Operations Guide.

Troubleshooting

This section provides solutions to the issues that may occur while deploying a cluster with Container Cloud Bootstrap v2.

Troubleshoot the bootstrap region creation

If the BootstrapRegion object is in the Error state, find the error type in the Status field of the object for the following components to resolve the issue:

Field name

Troubleshooting steps

Helm

If the bootstrap HelmBundle is not ready for a long time, for example, for more than 15 minutes with an average network bandwidth, verify the statuses of non-ready releases and resolve the issue depending on the error message of a particular release:

kubectl --kubeconfig <pathToKindKubeconfig> \
get helmbundle bootstrap -o json | \
jq '.status.releaseStatuses[] | select(.ready == false) | {name: .chart, message: .message}'

If fixing the issues with Helm releases does not help, collect the Helm Controller logs and filter them by error to find the root cause:

kubectl --kubeconfig <pathToKindKubeconfig> -n kube-system \
logs -lapp=helm-controller | grep "ERROR"

Deployments

If some of the deployments are not ready for a long time while the bootstrap HelmBundle is ready, restart the affected deployments:

kubectl --kubeconfig <pathToKindKubeconfig> \
-n kaas rollout restart deploy <notReadyDeploymentName>

If restarting the affected deployments does not help, collect and assess the logs of non-ready deployments:

kubectl --kubeconfig <pathToKindKubeconfig> \
-n kaas logs -lapp.kubernetes.io/name=<notReadyDeploymentName>

Provider

The status of this field becomes Ready when all provider-related HelmBundle charts are configured and in the Ready status.
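
For example, to inspect the Status field of the BootstrapRegion object from the bootstrap cluster, you can run the following command (a sketch; the output is raw JSON):

kubectl --kubeconfig <pathToKindKubeconfig> \
get bootstrapregions -o jsonpath='{.items[*].status}{"\n"}'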

Troubleshoot credentials creation

If the Credentials object is in the Error or Invalid state, verify whether the provided credentials are valid and adjust them accordingly.

Warning

The kubectl apply command automatically saves the applied data as plain text into the kubectl.kubernetes.io/last-applied-configuration annotation of the corresponding object. This may result in revealing sensitive data in this annotation when creating or modifying the object.

Therefore, do not use kubectl apply on this object. Use kubectl create, kubectl patch, or kubectl edit instead.

If you used kubectl apply on this object, you can remove the kubectl.kubernetes.io/last-applied-configuration annotation from the object using kubectl edit.
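
Alternatively, you can remove the annotation directly, for example, using the standard kubectl syntax where a trailing dash removes an annotation:

kubectl --kubeconfig <pathToKindKubeconfig> \
annotate <providerName>credentials <credentialsObjectName> \
kubectl.kubernetes.io/last-applied-configuration-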

To adjust the Credentials object:

  1. Verify the Credentials object status:

    kubectl --kubeconfig <pathToKindKubeconfig> \
    get <providerName>credentials <credentialsObjectName> -o jsonpath='{.status.valid}{"\n"}'
    

    Replace <providerName> with the name of the selected provider. For example, openstackcredentials.

  2. Open the Credentials object for editing:

    kubectl --kubeconfig <pathToKindKubeconfig> \
    edit <providerName>credentials <credentialsObjectName>
    
  3. Adjust the credentials password:

    1. In password.secret.name of the Credentials object spec section, obtain the related Secret object.

    2. Replace the existing base64-encoded string of the related secret with a new one containing the adjusted password:

      apiVersion: v1
      kind: Secret
      data:
        value: Zm9vYmFyCg==
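
      To generate the base64-encoded string from the new password, you can use, for example:

      echo -n '<newPassword>' | base64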
      
Troubleshoot machines creation

If a Machine object is stuck in the same status for a long time, identify the status phase of the affected machine and proceed as described below.

To verify the status of the created Machine objects:

kubectl --kubeconfig <pathToKindKubeconfig> \
get machines -o jsonpath='{.items[*].status.phase}'

The deployment statuses of a Machine object are the same as the LCMMachine object states:

  1. Uninitialized - the machine is not yet assigned to an LCMCluster.

  2. Pending - the agent reports a node IP address and host name.

  3. Prepare - the machine executes StateItems that correspond to the prepare phase. This phase usually involves downloading the necessary archives and packages.

  4. Deploy - the machine executes StateItems that correspond to the deploy phase, during which the machine becomes a Mirantis Kubernetes Engine (MKE) node.

  5. Ready - the machine is deployed.

  6. Upgrade - the machine is being upgraded to the new MKE version.

  7. Reconfigure - the machine executes StateItems that correspond to the reconfigure phase. The machine configuration is being updated without affecting workloads running on the machine.

If the system response is empty, approve the BootstrapRegion object:

  • Using the Container Cloud web UI, navigate to the Bootstrap tab and approve the related BootstrapRegion object

  • Using the Container Cloud CLI:

    ./container-cloud bootstrap approve all
    

If the system response is not empty and the status remains the same for a while, the issue may relate to machine misconfiguration. Therefore, verify and adjust the parameters of the affected Machine object. For provider-related issues, refer to the Troubleshooting section.
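
For example, to open the affected Machine object for editing from the bootstrap cluster:

kubectl --kubeconfig <pathToKindKubeconfig> \
edit machine <machineName>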

Troubleshoot deployment stages

If the cluster deployment is stuck on the same stage for a long time, it may be related to configuration issues in the Machine or other deployment objects.

To troubleshoot cluster deployment:

  1. Identify the current deployment stage that got stuck:

    kubectl --kubeconfig <pathToKindKubeconfig> \
    get cluster <cluster-name> -o jsonpath='{.status.bootstrapStatus}{"\n"}'
    

    For the deployment stages description, see Overview of the deployment workflow.

  2. Collect the bootstrap-provider logs and identify a repetitive error that relates to the stuck deployment stage:

    kubectl --kubeconfig <pathToKindKubeconfig> \
    -n kaas logs -lapp.kubernetes.io/name=bootstrap-provider
    
    Examples of repetitive errors

    Error name

    Solution

    Cluster nodes are not yet ready

    Verify the Machine objects configuration.

    Starting pivot

    Contact Mirantis support for further issue assessment.

    Some objects in cluster are not ready with the same deployment names

    Verify the related deployment configuration.

Collect the bootstrap logs

If the bootstrap process is stuck or fails, collect and inspect the bootstrap and management cluster logs.

To collect the bootstrap logs:

If the Cluster object is not created yet
  1. List all available deployments:

    kubectl --kubeconfig <pathToKindKubeconfig> \
    -n kaas get deploy
    
  2. Collect the logs of the required deployment:

    kubectl --kubeconfig <pathToKindKubeconfig> \
    -n kaas logs -lapp.kubernetes.io/name=<deploymentName>
    
If the Cluster object is created

Select from the following options:

  • If a management cluster is not deployed yet:

    CLUSTER_NAME=<clusterName> ./bootstrap.sh collect_logs
    
  • If a management cluster is deployed or pivoting is done:

    1. Obtain the cluster kubeconfig:

      ./container-cloud get cluster-kubeconfig \
      --kubeconfig <pathToKindKubeconfig> \
      --cluster-name <clusterName> \
      --kubeconfig-output <pathToMgmtClusterKubeconfig>
      
    2. Collect the logs:

      CLUSTER_NAME=<cluster-name> \
      KUBECONFIG=<pathToMgmtClusterKubeconfig> \
      ./bootstrap.sh collect_logs
      
    3. Technology Preview. For bare metal clusters, assess the Ironic pod logs:

      • Extract the content of the 'message' fields from every log message:

        kubectl -n kaas logs <ironicPodName> -c syslog | jq -rRM 'fromjson? | .message'
        
      • Extract the content of the 'message' fields from the ironic_conductor source log messages:

        kubectl -n kaas logs <ironicPodName> -c syslog | jq -rRM 'fromjson? | select(.source == "ironic_conductor") | .message'
        

      The syslog container collects logs generated by Ansible during the node deployment and cleanup and outputs them in the JSON format.

Note

Add COLLECT_EXTENDED_LOGS=true before the collect_logs command to output the extended version of logs that contains system and MKE logs, logs from LCM Ansible and LCM Agent, as well as cluster events, Kubernetes resource descriptions, and logs.
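
For example, to collect the extended logs from a deployed management cluster, you can combine the variables shown earlier in this section:

COLLECT_EXTENDED_LOGS=true CLUSTER_NAME=<clusterName> \
KUBECONFIG=<pathToMgmtClusterKubeconfig> \
./bootstrap.sh collect_logs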

Without this setting, the basic version of logs is collected, which is sufficient for most use cases. The basic version of logs contains all events, Kubernetes custom resources, and logs from all Container Cloud components. This version does not require passing --key-file.

The logs are collected in the directory where the bootstrap script is located.

Logs structure

The Container Cloud logs structure in <output_dir>/<cluster_name>/ is as follows:

  • /events.log

    Human-readable table that contains information about the cluster events.

  • /system

    System logs.

  • /system/mke (or /system/MachineName/mke)

    Mirantis Kubernetes Engine (MKE) logs.

  • /objects/cluster

    Logs of the non-namespaced Kubernetes objects.

  • /objects/namespaced

    Logs of the namespaced Kubernetes objects.

  • /objects/namespaced/<namespaceName>/core/pods

    Logs of the pods from a specific Kubernetes namespace. For example, logs of the pods from the kaas namespace contain logs of Container Cloud controllers, including bootstrap-cluster-controller since Container Cloud 2.25.0.

  • /objects/namespaced/<namespaceName>/core/pods/<containerName>.prev.log

    Logs of the pods from a specific Kubernetes namespace that were previously removed or failed.

  • /objects/namespaced/<namespaceName>/core/pods/<ironicPodName>/syslog.log

    Technology Preview. Ironic pod logs of the bare metal clusters.

    Note

    Logs collected by the syslog container during the bootstrap phase are not transferred to the management cluster during pivoting. These logs are located in /volume/log/ironic/ansible_conductor.log inside the Ironic pod.

Each log entry of the management cluster logs contains a request ID that identifies chronology of actions performed on a cluster or machine. The format of the log entry is as follows:

<process ID>.[<subprocess ID>...<subprocess ID N>].req:<requestID>: <logMessage>

For example, os.machine.req:28 contains information about the task 28 applied to an OpenStack machine.

Since Container Cloud 2.22.0, the logging format has the following extended structure for the admission-controller, storage-discovery, and all supported <providerName>-provider services of a management cluster:

level:<debug,info,warn,error,panic>,
ts:<YYYY-MM-DDTHH:mm:ssZ>,
logger:<processID>.<subProcessID(s)>.req:<requestID>,
caller:<lineOfCode>,
msg:<message>,
error:<errorMessage>,
stacktrace:<codeInfo>

Since Container Cloud 2.23.0, this structure also applies to the <name>-controller services of a management cluster.

Example of a log extract for openstack-provider since 2.22.0
{"level":"error","ts":"2022-11-14T21:37:18Z","logger":"os.cluster.req:318","caller":"lcm/machine.go:808","msg":"","error":"could not determine machine demo-46880-bastion host name","stacktrace":"sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm.GetMachineConditions\n\t/go/src/sigs.k8s.io/cluster-api-provider-openstack/pkg/lcm/machine.go:808\nsigs.k8s.io/cluster-api-provider-openstack/pkg...."}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"service/reconcile.go:128","msg":"request: default/demo-46880-2"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/machine_controller.go:201","msg":"Reconciling Machine \"default/demo-46880-2\""}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/actuator.go:454","msg":"Checking if machine exists: \"default/demo-46880-2\" (cluster: \"default/demo-46880\")"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/machine_controller.go:327","msg":"Reconciling machine \"default/demo-46880-2\" triggers idempotent update"}
{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"os.machine.req:476","caller":"machine/actuator.go:290","msg":"Updating machine: \"default/demo-46880-2\" (cluster: \"default/demo-46880\")"}
{"level":"info","ts":"2022-11-14T21:37:24Z","logger":"os.machine.req:476","caller":"lcm/machine.go:73","msg":"Machine in LCM cluster, reconciling LCM objects"}
{"level":"info","ts":"2022-11-14T21:37:26Z","logger":"os.machine.req:476","caller":"lcm/machine.go:902","msg":"Updating Machine default/demo-46880-2 conditions"}
  • level

    Informational level. Possible values: debug, info, warn, error, panic.

  • ts

    Time stamp in the <YYYY-MM-DDTHH:mm:ssZ> format. For example: 2022-11-14T21:37:23Z.

  • logger

    Details on the process ID being logged:

    • <processID>

      Primary process identifier. The list of possible values includes bm, os, iam, license, and bootstrap.

      Note

      The iam and license values are available since Container Cloud 2.23.0. The bootstrap value is available since Container Cloud 2.25.0.

    • <subProcessID(s)>

      One or more secondary process identifiers. The list of possible values includes cluster, machine, controller, and cluster-ctrl.

      Note

      The controller value is available since Container Cloud 2.23.0. The cluster-ctrl value is available since Container Cloud 2.25.0 for the bootstrap process identifier.

    • req

      Request ID number that increases when a service performs the following actions:

      • Receives a request from Kubernetes about creating, updating, or deleting an object

      • Receives an HTTP request

      • Runs a background process

      The request ID allows combining all operations performed with an object within one request. For example, the result of a Machine object creation, update of its statuses, and so on has the same request ID.

  • caller

    Code line used to apply the corresponding action to an object.

  • msg

    Description of a deployment or update phase. If empty, it contains the "error" key with a message followed by the "stacktrace" key with stack trace details. For example:

    "msg"="" "error"="Cluster nodes are not yet ready" "stacktrace": "<stack-trace-info>"
    

    The log format of the following Container Cloud components does not contain the "stacktrace" key for easier log handling: baremetal-provider, bootstrap-provider, and host-os-modules-controller.

Note

Logs may also include a number of informational key-value pairs containing additional cluster details. For example, "name": "object-name", "foobar": "baz".

Depending on the type of issue found in logs, apply the corresponding fixes. For example, if you detect the LoadBalancer ERROR state errors during the bootstrap of an OpenStack-based management cluster, contact your system administrator to fix the issue.
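
For example, to trace all operations performed within a single request, you can filter the provider logs by the request ID from the log entries above (an illustrative command; substitute the provider deployment label and the request number with your values):

kubectl --kubeconfig <pathToMgmtClusterKubeconfig> \
-n kaas logs -lapp.kubernetes.io/name=<providerName>-provider | grep 'req:318'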

Requirements for a MITM proxy

Note

For MOSK, the feature is generally available since MOSK 23.1.

While bootstrapping a Container Cloud management cluster using a proxy, Internet access may need to go through a man-in-the-middle (MITM) proxy. Such a configuration requires that you enable streaming and install a CA certificate on the bootstrap node.

Enable streaming for MITM

Ensure that the MITM proxy is configured with enabled streaming. For example, if you use mitmproxy, enable the stream_large_bodies=1 option:

./mitmdump --set stream_large_bodies=1
Install a CA certificate for a MITM proxy on a bootstrap node
  1. Log in to the bootstrap node.

  2. Install ca-certificates:

    apt install ca-certificates
    
  3. Copy your CA certificate to the /usr/local/share/ca-certificates/ directory. For example:

    sudo cp ~/.mitmproxy/mitmproxy-ca-cert.cer /usr/local/share/ca-certificates/mitmproxy-ca-cert.crt
    

    Replace ~/.mitmproxy/mitmproxy-ca-cert.cer with the path to your CA certificate.

    Caution

    The target CA certificate file must be in the PEM format with the .crt extension.

  4. Apply the changes:

    sudo update-ca-certificates
    

Now, proceed with bootstrapping your management cluster.

Create initial users after a management cluster bootstrap

Once you bootstrap your management cluster, create Keycloak users for access to the Container Cloud web UI. Use the created credentials to log in to the Container Cloud web UI.

Mirantis recommends creating at least two users, user and operator, that are required for a typical Container Cloud deployment.

To create the user for access to the Container Cloud web UI, use:

./container-cloud bootstrap user add \
    --username <userName> \
    --roles <roleName> \
    --kubeconfig <pathToMgmtKubeconfig>

Note

You will be asked for the user password interactively.

User creation parameters

Flag

Description

--username

Required. Name of the user to create.

--roles

Required. Comma-separated list of roles to assign to the user.

  • If you run the command without the --namespace flag, you can assign the following roles:

    • global-admin - read and write access for global role bindings

    • writer - read and write access

    • reader - view access

    • operator - create and manage access to the BaremetalHost objects (required for bare metal clusters only)

    • management-admin - full access to the management cluster, available since Container Cloud 2.25.0 (Cluster releases 17.0.0, 16.0.0, 14.1.0)

  • If you run the command for a specific project using the --namespace flag, you can assign the following roles:

    • operator or writer - read and write access

    • user or reader - view access

    • member - read and write access (excluding IAM objects)

    • bm-pool-operator - create and manage access to the BaremetalHost objects (required for bare metal clusters only)

--kubeconfig

Required. Path to the management cluster kubeconfig generated during the management cluster bootstrap.

--namespace

Optional. Name of the Container Cloud project where the user will be created. If not set, a global user will be created for all Container Cloud projects with the corresponding role access to view or manage all Container Cloud public objects.

--password-stdin

Optional. Flag to provide the user password through stdin:

echo '$PASSWORD' | ./container-cloud bootstrap user add \
    --username <userName> \
    --roles <roleName> \
    --kubeconfig <pathToMgmtKubeconfig> \
    --password-stdin
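
For example, to create a project-scoped user with the operator role (the values are illustrative):

./container-cloud bootstrap user add \
    --username <userName> \
    --roles operator \
    --namespace <projectName> \
    --kubeconfig <pathToMgmtKubeconfig>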

To delete the user, run:

./container-cloud bootstrap user delete --username <userName> --kubeconfig <pathToMgmtKubeconfig>

Troubleshooting

This section provides solutions to the issues that may occur while deploying a management cluster.

Collect the bootstrap logs

If the bootstrap script fails during the deployment process, collect and inspect the bootstrap and management cluster logs.

Note

The below procedure applies to Bootstrap v1. For the Bootstrap v2 procedure, refer to Collect the bootstrap logs.

Collect the bootstrap cluster logs
  1. Log in to your local machine where the bootstrap script was executed.

  2. If you bootstrapped the cluster a while ago, verify that the bootstrap directory is updated.

    Select from the following options:

    • For clusters deployed using Container Cloud 2.11.0 or later:

      ./container-cloud bootstrap download --management-kubeconfig <pathToMgmtKubeconfig> \
      --target-dir <pathToBootstrapDirectory>
      
    • For clusters deployed using the Container Cloud release earlier than 2.11.0 or if you deleted the kaas-bootstrap folder, download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      
      chmod 0755 get_container_cloud.sh
      
      ./get_container_cloud.sh
      
  3. Run the following command:

    ./bootstrap.sh collect_logs
    

    Add COLLECT_EXTENDED_LOGS=true before the command to output the extended version of logs that contains system and MKE logs, logs from LCM Ansible and LCM Agent, as well as cluster events, Kubernetes resource descriptions, and logs.

    Without this setting, the basic version of logs is collected, which is sufficient for most use cases. The basic version of logs contains all events, Kubernetes custom resources, and logs from all Container Cloud components. This version does not require passing --key-file.

    The logs are collected in the directory where the bootstrap script is located.

  4. Technology Preview. For bare metal clusters, assess the Ironic pod logs:

    • Extract the content of the 'message' fields from every log message:

      kubectl -n kaas logs <ironicPodName> -c syslog | jq -rRM 'fromjson? | .message'
      
    • Extract the content of the 'message' fields from the ironic_conductor source log messages:

      kubectl -n kaas logs <ironicPodName> -c syslog | jq -rRM 'fromjson? | select(.source == "ironic_conductor") | .message'
      

    The syslog container collects logs generated by Ansible during the node deployment and cleanup and outputs them in the JSON format.

See also

Logs structure

Troubleshoot the bootstrap node configuration

This section provides solutions to the issues that may occur while configuring the bootstrap node.

DNS settings

If you have issues related to the DNS settings, the following error message may occur:

curl: (6) Could not resolve host

The issue may occur if a VPN is used to connect to the cloud or a local DNS forwarder is set up.

The workaround is to change the default DNS settings for Docker:

  1. Log in to your local machine.

  2. Identify your internal or corporate DNS server address:

    systemd-resolve --status
    
  3. Create or edit /etc/docker/daemon.json by specifying your DNS address:

    {
      "dns": ["<YOUR_DNS_ADDRESS>"]
    }
    
  4. Restart the Docker daemon:

    sudo systemctl restart docker
    
Default network addresses

If you have issues related to the default network address configuration, curl either hangs or the following error occurs:

curl: (7) Failed to connect to xxx.xxx.xxx.xxx port xxxx: Host is unreachable

The issue may occur because the default Docker network address 172.17.0.0/16 and/or the Docker network used by kind overlaps with your cloud address or other addresses of the network configuration.

Workaround:

  1. Log in to your local machine.

  2. Verify routing to the IP addresses of the target cloud endpoints:

    1. Obtain the IP address of your target cloud. For example:

      nslookup auth.openstack.example.com
      

      Example of system response:

      Name:   auth.openstack.example.com
      Address: 172.17.246.119
      
    2. Verify that this IP address is not routed through docker0 but through any other interface, for example, ens3:

      ip r get 172.17.246.119
      

      Example of the system response if the routing is configured correctly:

      172.17.246.119 via 172.18.194.1 dev ens3 src 172.18.1.1 uid 1000
        cache
      

      Example of the system response if the routing is configured incorrectly:

      172.17.246.119 via 172.18.194.1 dev docker0 src 172.18.1.1 uid 1000
        cache
      
  3. If the routing is incorrect, change the IP address of the default Docker bridge:

    1. Create or edit /etc/docker/daemon.json by adding the "bip" option:

      {
        "bip": "192.168.91.1/24"
      }
      
    2. Restart the Docker daemon:

      sudo systemctl restart docker
      
  4. If required, customize addresses for your kind Docker network or any other additional Docker networks:

    1. Remove the kind network:

      docker network rm 'kind'
      
    2. Choose from the following options:

      • Configure /etc/docker/daemon.json:

        Note

        The following steps apply to customizing addresses for the kind Docker network. Use these steps as an example for any other additional Docker networks.

        1. Add the following section to /etc/docker/daemon.json:

          {
           "default-address-pools":
           [
             {"base":"192.169.0.0/16","size":24}
           ]
          }
          
        2. Restart the Docker daemon:

          sudo systemctl restart docker
          

          After Docker restart, the newly created local or global scope networks, including 'kind', will be dynamically assigned a subnet from the defined pool.

      • Recreate the 'kind' Docker network manually with a subnet that is not in use in your network. For example:

        docker network create -o com.docker.network.bridge.enable_ip_masquerade=true -d bridge --subnet 192.168.0.0/24 'kind'
        

        Caution

        Docker pruning removes the user-defined networks, including 'kind'. Therefore, every time after running the Docker pruning commands, re-create the 'kind' network using the command above.

Troubleshoot OpenStack-based deployments

This section provides solutions to the issues that may occur while deploying an OpenStack-based management cluster. To troubleshoot a managed cluster, see Operations Guide: Troubleshooting.

TLS handshake timeout

If you execute the bootstrap.sh script from an OpenStack VM that is running in the OpenStack environment used for bootstrapping the management cluster, the following error messages may occur, which can be related to an MTU settings discrepancy:

curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to server:port

Failed to check if machine "<machine_name>" exists:
failed to create provider client ... TLS handshake timeout

To identify whether the issue is MTU-related:

  1. Log in to the OpenStack VM in question.

  2. Compare the MTU outputs for the docker0 and ens3 interfaces:

    ip addr
    

    Example of system response:

    3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500...
    ...
    2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450...
    

    If the MTU output values differ for docker0 and ens3, proceed with the workaround below. Otherwise, inspect the logs further to identify the root cause of the error messages.

Workaround:

  1. In your OpenStack environment used for Mirantis Container Cloud, log in to any machine with CLI access to OpenStack. For example, you can create a new Ubuntu VM (separate from the bootstrap VM) and install the python-openstackclient package on it.

  2. Change the VXLAN MTU size of the network used by the VM to the required value depending on your network infrastructure and considering your physical network configuration, such as jumbo frames.

    openstack network set --mtu <YOUR_MTU_SIZE> <network-name>
    
  3. Stop and start the VM in Nova.

  4. Log in to the bootstrap VM dedicated for the management cluster.

  5. Re-execute the bootstrap.sh script.

Configure external identity provider for IAM

This section describes how to configure authentication for Mirantis Container Cloud depending on the external identity provider type integrated to your deployment.

Configure LDAP for IAM

If you integrate LDAP for IAM to Mirantis Container Cloud, add the required LDAP configuration to cluster.yaml.template during the bootstrap of the management cluster.

Note

The example below defines the recommended non-anonymous authentication type. If you require anonymous authentication, replace the following parameters with authType: "none":

authType: "simple"
bindCredential: ""
bindDn: ""

To configure LDAP for IAM:

  1. Open cluster.yaml.template stored in the following locations depending on the cloud provider type:

    • Bare metal: templates/bm/cluster.yaml.template

    • OpenStack: templates/cluster.yaml.template

    • vSphere: templates/vsphere/cluster.yaml.template

  2. Configure the keycloak:userFederation:providers: and keycloak:userFederation:mappers: sections as required:

    spec:
      providerSpec:
        value:
          kaas:
            management:
              helmReleases:
              - name: iam
                values:
                  keycloak:
                    userFederation:
                      providers:
                        - displayName: "<LDAP_NAME>"
                          providerName: "ldap"
                          priority: 1
                          fullSyncPeriod: -1
                          changedSyncPeriod: -1
                          config:
                            pagination: "true"
                            debug: "false"
                            searchScope: "1"
                            connectionPooling: "true"
                            usersDn: "<DN>" # "ou=People, o=<ORGANIZATION>, dc=<DOMAIN_COMPONENT>"
                            userObjectClasses: "inetOrgPerson,organizationalPerson"
                            usernameLDAPAttribute: "uid"
                            rdnLDAPAttribute: "uid"
                            vendor: "ad"
                            editMode: "READ_ONLY"
                            uuidLDAPAttribute: "uid"
                            connectionUrl: "ldap://<LDAP_DNS>"
                            syncRegistrations: "false"
                            authType: "simple"
                            bindCredential: ""
                            bindDn: ""
                      mappers:
                        - name: "username"
                          federationMapperType: "user-attribute-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.attribute: "uid"
                            user.model.attribute: "username"
                            is.mandatory.in.ldap: "true"
                            read.only: "true"
                            always.read.value.from.ldap: "false"
                        - name: "full name"
                          federationMapperType: "full-name-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.full.name.attribute: "cn"
                            read.only: "true"
                            write.only: "false"
                        - name: "last name"
                          federationMapperType: "user-attribute-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.attribute: "sn"
                            user.model.attribute: "lastName"
                            is.mandatory.in.ldap: "true"
                            read.only: "true"
                            always.read.value.from.ldap: "true"
                        - name: "email"
                          federationMapperType: "user-attribute-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.attribute: "mail"
                            user.model.attribute: "email"
                            is.mandatory.in.ldap: "false"
                            read.only: "true"
                            always.read.value.from.ldap: "true"
    

    Note

    • Verify that the userFederation section is located on the same level as the initUsers section.

    • Verify that all attributes set in the mappers section are defined for users in the specified LDAP system. Missing attributes may cause authorization issues.

Now, return to the bootstrap instruction depending on the provider type of your management cluster.

Configure Google OAuth IdP for IAM

Caution

The instruction below applies to the DNS-based management clusters. If you bootstrap a non-DNS-based management cluster, configure Google OAuth IdP for Keycloak after bootstrap using the official Keycloak documentation.

If you integrate Google OAuth external identity provider for IAM to Mirantis Container Cloud, create the authorization credentials for IAM in your Google OAuth account and configure cluster.yaml.template during the bootstrap of the management cluster.

To configure Google OAuth IdP for IAM:

  1. Create Google OAuth credentials for IAM:

    1. Log in to your Google account at https://console.developers.google.com.

    2. Navigate to Credentials.

    3. In the APIs Credentials menu, select OAuth client ID.

    4. In the window that opens:

      1. In the Application type menu, select Web application.

      2. In the Authorized redirect URIs field, type in <keycloak-url>/auth/realms/iam/broker/google/endpoint, where <keycloak-url> is the corresponding DNS address.

      3. Press Enter to add the URI.

      4. Click Create.

      A page with your client ID and client secret opens. Save these credentials for further usage.

  2. Log in to the bootstrap node.

  3. Open cluster.yaml.template stored in the following locations depending on the cloud provider type:

    • Bare metal: templates/bm/cluster.yaml.template

    • OpenStack: templates/cluster.yaml.template

    • vSphere: templates/vsphere/cluster.yaml.template

  4. In the keycloak:externalIdP: section, add the following snippet with your credentials created in previous steps:

    keycloak:
      externalIdP:
        google:
          enabled: true
          config:
            clientId: <Google_OAuth_client_ID>
            clientSecret: <Google_OAuth_client_secret>
    

Now, return to the bootstrap instruction depending on the provider type of your management cluster.

Operations Guide

Mirantis Container Cloud CLI

This section was moved to MOSK documentation: Container Cloud CLI.

Create and operate managed clusters

Note

This tutorial applies only to the Container Cloud web UI users with the m:kaas:namespace@operator or m:kaas:namespace@writer access role assigned by the Infrastructure Operator. To add a bare metal host, the m:kaas@operator or m:kaas:namespace@bm-pool-operator role is required.

After you deploy the Mirantis Container Cloud management cluster, you can start creating managed clusters that will be based on the same cloud provider type that you have for the management cluster: OpenStack, bare metal, or vSphere.

Caution

Since Container Cloud 2.27.3 (Cluster release 16.2.3), support for vSphere-based clusters is suspended. For details, see Deprecation notes.

The deployment procedure is performed using the Container Cloud web UI and comprises the following steps:

  1. Create a dedicated non-default project for managed clusters.

  2. For a baremetal-based managed cluster, create and configure bare metal hosts with corresponding labels for machines such as worker, manager, or storage.

  3. Create an initial cluster configuration depending on the provider type.

  4. Add the required amount of machines with the corresponding configuration to the managed cluster.

  5. For a baremetal-based managed cluster, add a Ceph cluster.

Note

The Container Cloud web UI communicates with Keycloak to authenticate users. Keycloak is exposed using HTTPS with self-signed TLS certificates that are not trusted by web browsers.

To use your own TLS certificates for Keycloak, refer to Configure TLS certificates for cluster applications.

Create a project for managed clusters

Note

The procedure below applies only to the Container Cloud web UI users with the m:kaas@global-admin or m:kaas@writer access role assigned by the Infrastructure Operator.

The default project (Kubernetes namespace) in Container Cloud is dedicated for management clusters only. Managed clusters require a separate project. You can create as many projects as required by your company infrastructure.

To create a project for managed clusters using the Container Cloud web UI:

  1. Log in to the Container Cloud web UI as m:kaas@global-admin or m:kaas@writer.

  2. In the Projects tab, click Create.

  3. Type the new project name.

  4. Click Create.

Generate a kubeconfig for a managed cluster using API

This section was moved to Mirantis OpenStack for Kubernetes documentation: Getting access - Generate a kubeconfig for a cluster using API.

Create and operate a baremetal-based managed cluster

After bootstrapping your baremetal-based Mirantis Container Cloud management cluster as described in Deploy a Container Cloud management cluster, you can start creating the baremetal-based managed clusters.

Add a bare metal host

Before creating a bare metal managed cluster, add the required number of bare metal hosts either using the Container Cloud web UI for a default configuration or using CLI for an advanced configuration.

Add a bare metal host using web UI

This section describes how to add bare metal hosts using the Container Cloud web UI during a managed cluster creation.

Before you proceed with adding a bare metal host:

To add a bare metal host to a baremetal-based managed cluster:

  1. Optional. Create a custom bare metal host profile depending on your needs as described in Create a custom bare metal host profile.

    Note

    You can view the created profiles in the BM Host Profiles tab of the Container Cloud web UI.

  2. Log in to the Container Cloud web UI with the m:kaas@operator or m:kaas:namespace@bm-pool-operator permissions.

  3. Switch to the required non-default project using the Switch Project action icon located on top of the main left-side navigation panel.

    To create a project, refer to Create a project for managed clusters.

  4. Optional. Available since Container Cloud 2.24.0. In the Credentials tab, click Add Credential and add the IPMI user name and password of the bare metal host to access the Baseboard Management Controller (BMC).

  5. Select one of the following options:

    1. In the Baremetal tab, click Create Host.

    2. Fill out the Create baremetal host form as required:

      • Name

        Specify the name of the new bare metal host.

      • Boot Mode

        Specify the BIOS boot mode. Available options: Legacy, UEFI, or UEFISecureBoot.

      • MAC Address

        Specify the MAC address of the PXE network interface.

      • Baseboard Management Controller (BMC)

        Specify the following BMC details:

        • IP Address

          Specify the IP address to access the BMC.

        • Credential Name

          Specify the name of the previously added bare metal host credentials to associate with the current host.

        • Cert Validation

          Enable validation of the BMC API certificate. Applies only to the redfish+https BMC protocol. Disabled by default.

        • Power off host after creation

          Experimental. Select to power off the bare metal host after creation.

          Caution

          This option is experimental and intended only for testing and evaluation purposes. Do not use it for production deployments.

    1. In the Baremetal tab, click Add BM host.

    2. Fill out the Add new BM host form as required:

      • Baremetal host name

        Specify the name of the new bare metal host.

      • Provider Credential

        Optional. Available since Container Cloud 2.24.0. Specify the name of the previously added bare metal host credentials to associate with the current host.

      • Add New Credential

        Optional. Available since Container Cloud 2.24.0. Applies if you did not add bare metal host credentials using the Credentials tab. Add the bare metal host credentials:

        • Username

          Specify the name of the IPMI user to access the BMC.

        • Password

          Specify the IPMI password of the user to access the BMC.

      • Boot MAC address

        Specify the MAC address of the PXE network interface.

      • IP Address

        Specify the IP address to access the BMC.

      • Label

        Assign a machine label to the new host. The label defines which type of machine can be deployed on this bare metal host. Only one label can be assigned to a host. The supported labels include:

        • Manager

          This label is selected and set by default. Assign this label to the bare metal hosts that can be used to deploy machines with the manager type. These hosts must match the CPU and RAM requirements described in Reference hardware configuration.

        • Worker

          The host with this label may be used to deploy the worker machine type. Assign this label to the bare metal hosts that have sufficient CPU and RAM resources, as described in Reference hardware configuration.

        • Storage

          Assign this label to the bare metal hosts that have sufficient storage devices to match Reference hardware configuration. Hosts with this label will be used to deploy machines with the storage type that run Ceph OSDs.

  6. Click Create.

    While adding the bare metal host, Container Cloud discovers and inspects its hardware and adds the discovered data to BareMetalHost.status for future reference.

    During provisioning, baremetal-operator inspects the bare metal host and moves it to the Preparing state. The host becomes ready to be linked to a bare metal machine.

  7. Verify the results of the hardware inspection to avoid unexpected errors during the host usage:

    1. Select one of the following options:

      In the left sidebar, click Baremetal. The Hosts page opens.

      In the left sidebar, click BM Hosts.

    2. Verify that the bare metal host is registered and switched to one of the following statuses:

      • Preparing for a newly added host

      • Ready for a previously used host or for a host that is already linked to a machine

    3. Select one of the following options:

      On the Hosts page, click the host kebab menu and select Host info.

      On the BM Hosts page, click the name of the newly added bare metal host.

    4. In the window with the host details, scroll down to the Hardware section.

    5. Review the section and make sure that the number and models of disks, network interface cards, and CPUs match the hardware specification of the server.

      • If the hardware details are consistent with the physical server specifications for all your hosts, proceed to Add a managed baremetal cluster.

      • If you find any discrepancies in the hardware inspection results, it might indicate that the server has hardware issues or is not compatible with Container Cloud.

Add a bare metal host using CLI

This section describes how to add bare metal hosts using the Container Cloud CLI during a managed cluster creation.

To add a bare metal host using CLI:

  1. Create a project for a managed cluster as described in Create a project for managed clusters.

  2. Verify that you configured each bare metal host as described in Configure BIOS on a bare metal host.

  3. Optional. Create a custom bare metal host profile depending on your needs as described in Create a custom bare metal host profile.

  4. Log in to the host where your management cluster kubeconfig is located and where kubectl is installed.

  5. Select from the following options:

    Create a YAML file that describes the unique credentials of the new bare metal host as a BareMetalHostCredential object.

    Example of BareMetalHostCredential:

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: BareMetalHostCredential
    metadata:
      labels:
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
      name: <bareMetalHostCredentialUniqueName>
      namespace: <managedClusterProjectName>
    spec:
      username: <ipmiUserName>
      password:
        value: <ipmiPassword>
    
    • In the metadata section, add a unique credentials name and the name of the non-default project (namespace) dedicated for the managed cluster being created.

    • In the spec section, add the IPMI user name and password in plain text to access the Baseboard Management Controller (BMC). The password is not stored in the BareMetalHostCredential object: after the object is created, the password is erased from it and saved in an underlying Secret object.

      Caution

      Each bare metal host must have a unique BareMetalHostCredential.

    Note

    The kaas.mirantis.com/region label is removed from all Container Cloud objects in Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0). Therefore, do not add this label starting from these releases. On existing clusters updated to these releases, or if the label was added manually, Container Cloud ignores it.

    Create a secret YAML file that describes the unique credentials of the new bare metal host.

    Example of the bare metal host secret:

    apiVersion: v1
    data:
      password: <credentialsPassword>
      username: <credentialsUserName>
    kind: Secret
    metadata:
      labels:
        kaas.mirantis.com/credentials: "true"
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
      name: <credentialsName>
      namespace: <managedClusterProjectName>
    type: Opaque
    
    • In the data section, add the IPMI user name and password in base64 encoding to access the BMC. To obtain the base64-encoded credentials, you can use the following command in your Linux console (a filled-in example follows this list):

      echo -n <username|password> | base64
      

      Caution

      Each bare metal host must have a unique Secret.

    • In the metadata section, add the unique name of credentials and the name of the non-default project (namespace) dedicated for the managed cluster being created. To create a project, refer to Create a project for managed clusters.
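
    For illustration only, a Secret filled in with the hypothetical credentials admin/secret (base64-encoded using the command above) and hypothetical object names might look as follows. The labels follow the template above; the kaas.mirantis.com/region label is omitted as described in the note earlier in this step:

    apiVersion: v1
    data:
      password: c2VjcmV0
      username: YWRtaW4=
    kind: Secret
    metadata:
      labels:
        kaas.mirantis.com/credentials: "true"
        kaas.mirantis.com/provider: baremetal
      name: worker-1-bmc-credentials
      namespace: managed-ns
    type: Opaque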

  6. Apply the created YAML file with credentials to your deployment:

    Warning

    The kubectl apply command automatically saves the applied data as plain text into the kubectl.kubernetes.io/last-applied-configuration annotation of the corresponding object. This may result in revealing sensitive data in this annotation when creating or modifying the object.

    Therefore, do not use kubectl apply on this object. Use kubectl create, kubectl patch, or kubectl edit instead.

    If you used kubectl apply on this object, you can remove the kubectl.kubernetes.io/last-applied-configuration annotation from the object using kubectl edit.

    kubectl create -n <managedClusterProjectName> -f ${<BareMetalHostCredsFileName>}.yaml
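
    If you created a BareMetalHostCredential object, you can additionally verify that it exists and that the password was moved to an underlying Secret object. A minimal check, assuming that the lowercase kind name resolves as a resource name in your cluster:

    kubectl -n <managedClusterProjectName> get baremetalhostcredential
    kubectl -n <managedClusterProjectName> get secrets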
    
  7. Create a YAML file that contains a description of the new bare metal host.

    Example of the bare metal host configuration file with the worker role:

    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      annotations:
        kaas.mirantis.com/baremetalhost-credentials-name: <bareMetalHostCredentialUniqueName>
      labels:
        kaas.mirantis.com/baremetalhost-id: <uniqueBareMetalHostHardwareNodeId>
        hostlabel.bm.kaas.mirantis.com/worker: "true"
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
      name: <BareMetalHostUniqueName>
      namespace: <managedClusterProjectName>
    spec:
      bmc:
        address: <ipAddressForIpmiAccess>
        credentialsName: ''
      bootMACAddress: <BareMetalHostBootMacAddress>
      online: true
    

    Note

    If you have a limited amount of free and unused IP addresses for server provisioning, you can add the baremetalhost.metal3.io/detached annotation that pauses automatic host management to manually allocate an IP address for the host. For details, see Manually allocate IP addresses for bare metal hosts.

    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      labels:
        kaas.mirantis.com/baremetalhost-id: <uniqueBareMetalHostHardwareNodeId>
        hostlabel.bm.kaas.mirantis.com/worker: "true"
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
      name: <BareMetalHostUniqueName>
      namespace: <managedClusterProjectName>
    spec:
      bmc:
        address: <ipAddressForBmcAccess>
        credentialsName: <credentialsSecretName>
      bootMACAddress: <BareMetalHostBootMacAddress>
      online: true
    

    For a detailed description of the fields, see BareMetalHost.
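
    The baremetalhost.metal3.io/detached annotation mentioned in the note above is added to the metadata of the BareMetalHost object. A minimal illustration with a placeholder value; for the exact value and the full procedure, see Manually allocate IP addresses for bare metal hosts:

    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      annotations:
        baremetalhost.metal3.io/detached: "true"
      name: <BareMetalHostUniqueName>
      namespace: <managedClusterProjectName>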

  8. Apply this configuration YAML file to your deployment:

    kubectl create -n <managedClusterProjectName> -f ${<BareMetalHostConfigFileName>}.yaml
    

    During provisioning, baremetal-operator inspects the bare metal host and moves it to the Preparing state. The host becomes ready to be linked to a bare metal machine.

    Caution

    If you need to change or add DHCP subnets to bootstrap new nodes, first apply the DHCP subnet changes, wait until the dnsmasq pod becomes ready, and only then create the BareMetalHost objects.

    For details about the related known issue, refer to Release Notes: Inspection error on bare metal hosts after dnsmasq restart.

  9. Verify the new BareMetalHost object status:

    kubectl -n <managedClusterProjectName> get bmh -o wide <BareMetalHostUniqueName>
    

    Example of system response:

    NAMESPACE    NAME   STATUS   STATE      CONSUMER  BMC                        BOOTMODE  ONLINE  ERROR  REGION
    my-project   bmh1   OK       preparing            ip_address_for-bmc-access  legacy    true           region-one
    

    During provisioning, the status changes as follows:

    1. registering

    2. inspecting

    3. preparing
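
    To follow these state transitions as they happen, you can watch the object using the standard kubectl watch flag:

    kubectl -n <managedClusterProjectName> get bmh -o wide -w <BareMetalHostUniqueName>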

  10. After BareMetalHost switches to the preparing stage, the inspecting phase finishes and you can verify hardware information available in the object status. For example:

    • Verify the status of hardware NICs:

      kubectl -n <managedClusterProjectName> get bmh <BareMetalHostUniqueName> -o json | jq -r '[.status.hardware.nics]'
      

      Example of system response:

      [
        [
          {
            "ip": "172.18.171.32",
            "mac": "ac:1f:6b:02:81:1a",
            "model": "0x8086 0x1521",
            "name": "eno1",
            "pxe": true
          },
          {
            "ip": "fe80::225:90ff:fe33:d5ac%ens1f0",
            "mac": "00:25:90:33:d5:ac",
            "model": "0x8086 0x10fb",
            "name": "ens1f0"
          },
       ...
      
    • Verify the status of RAM:

      kubectl -n <managedClusterProjectName> get bmh <BareMetalHostUniqueName> -o json | jq -r '[.status.hardware.ramMebibytes]'
      

      Example of system response:

      [
        98304
      ]
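
    • Optionally, verify the discovered storage devices. The example below assumes that the hardware inventory exposes a storage list with the name and sizeBytes fields, as in standard Metal3 BareMetalHost objects:

      kubectl -n <managedClusterProjectName> get bmh <BareMetalHostUniqueName> -o json | jq -r '[.status.hardware.storage[] | {name, sizeBytes}]'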
      
Create a custom bare metal host profile

The bare metal host profile is a Kubernetes custom resource. It allows the operator to define how the storage devices and the operating system are provisioned and configured.

This section describes the bare metal host profile default settings and configuration of custom profiles for managed clusters using Mirantis Container Cloud API. This procedure also applies to a management cluster with a few differences described in Customize the default bare metal host profile.

Note

You can view the created profiles in the BM Host Profiles tab of the Container Cloud web UI.

Note

Using BareMetalHostProfile, you can configure LVM or mdadm-based software RAID support during a management or managed cluster creation. For details, see Configure RAID support.

This feature is available as Technology Preview. Use such configuration for testing and evaluation purposes only. For the Technology Preview feature definition, refer to Technology Preview features.

Default configuration of the host system storage

The default host profile requires three storage devices in the following strict order:

  1. Boot device and operating system storage

    This device contains boot data and operating system data. It is partitioned using the GUID Partition Table (GPT) labels. The root file system is an ext4 file system created on top of an LVM logical volume. For a detailed layout, refer to the table below.

  2. Local volumes device

    This device contains an ext4 file system with directories mounted as persistent volumes to Kubernetes. These volumes are used by the Mirantis Container Cloud services to store their data, including monitoring and identity databases.

  3. Ceph storage device

    This device is used as a Ceph datastore or Ceph OSD on managed clusters.

Warning

Any data stored on any device defined in the fileSystems list can be deleted or corrupted during cluster (re)deployment. It happens because each device from the fileSystems list is a part of the rootfs directory tree that is overwritten during (re)deployment.

Examples of affected devices include:

  • A raw device partition with a file system on it

  • A device partition in a volume group with a logical volume that has a file system on it

  • An mdadm RAID device with a file system on it

  • An LVM RAID device with a file system on it

Neither the wipe field (deprecated) nor the wipeDevice structure (recommended since Container Cloud 2.26.0) has any effect in this case, and neither can protect data on these devices.

Therefore, to prevent data loss, move the necessary data from these file systems to another server beforehand, if required.

The following table summarizes the default configuration of the host system storage set up by the Container Cloud bare metal management.

Default configuration of the bare metal host storage

Device/partition

Name/Mount point

Recommended size

Description

/dev/sda1

bios_grub

4 MiB

The mandatory GRUB boot partition required for non-UEFI systems.

/dev/sda2

UEFI -> /boot/efi

0.2 GiB

The boot partition required for the UEFI boot mode.

/dev/sda3

config-2

64 MiB

The mandatory partition for the cloud-init configuration. Used during the first host boot for initial configuration.

/dev/sda4

lvm_root_part

100% of the remaining free space in the LVM volume group

The main LVM physical volume that is used to create the root file system.

/dev/sdb

lvm_lvp_part -> /mnt/local-volumes

100% of the remaining free space in the LVM volume group

The LVM physical volume that is used to create the file system for LocalVolumeProvisioner.

/dev/sdc

-

100% of the device

Clean raw disk that is used for the Ceph storage backend on managed clusters.

If required, you can customize the default host storage configuration. For details, see Create a custom host profile.

Wipe a device or partition

Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0)

Before deploying a cluster, you may need to erase existing data from hardware devices to be used for deployment. You can either erase an existing partition or remove all existing partitions from a physical device. For this purpose, use the wipeDevice structure that configures cleanup behavior during configuration of a custom bare metal host profile described in Create a custom host profile.

The wipeDevice structure contains the following options:

  • eraseMetadata

    Configures metadata cleanup of a device

  • eraseDevice

    Configures a complete cleanup of a device

Erase metadata from a device

When you enable the eraseMetadata option, which is disabled by default, the Ansible provisioner attempts to clean up the existing metadata from the target device. Examples of metadata include:

  • Existing file system

  • Logical Volume Manager (LVM) or Redundant Array of Independent Disks (RAID) configuration

The behavior of metadata erasure varies depending on the target device:

  • If a device is part of other logical devices, for example, a partition, logical volume, or MD RAID volume, such logical device is disassembled and its file system metadata is erased. On the final erasure step, the file system metadata of the target device is erased as well.

  • If a device is a physical disk, then all its nested partitions along with their nested logical devices, if any, are erased and disassembled. On the final erasure step, all partitions and metadata of the target device are removed.

Caution

None of the eraseMetadata actions include overwriting the target device with data patterns. For this purpose, use the eraseDevice option as described in Erase a device.

To enable the eraseMetadata option, use the wipeDevice field in the spec:devices section of the BareMetalHostProfile object. For a detailed description of the option, see API Reference: BareMetalHostProfile.
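
For illustration, a minimal sketch of enabling metadata cleanup for a single device. The enabled flag and the exact placement of wipeDevice shown below are assumptions, so verify the field names against API Reference: BareMetalHostProfile:

  spec:
    devices:
    - device:
        byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2
        wipeDevice:
          eraseMetadata:
            enabled: true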

Erase a device

If you require not only disassembling existing logical volumes but also removing all data ever written to the target device, configure the eraseDevice option, which is disabled by default. This option is not applicable to partitions, LVM, or MD RAID logical volumes because such volumes may use caching that prevents a physical device from being erased properly.

Important

The eraseDevice option does not replace the secure erase.

To configure the eraseDevice option, use the wipeDevice field in the spec:devices section of the BareMetalHostProfile object. For a detailed description of the option, see API Reference: BareMetalHostProfile.

Create a custom host profile

In addition to the default BareMetalHostProfile object installed with Mirantis Container Cloud, you can create custom profiles for managed clusters using Container Cloud API.

Note

The procedure below also applies to the Container Cloud management clusters.

Warning

Any data stored on any device defined in the fileSystems list can be deleted or corrupted during cluster (re)deployment. It happens because each device from the fileSystems list is a part of the rootfs directory tree that is overwritten during (re)deployment.

Examples of affected devices include:

  • A raw device partition with a file system on it

  • A device partition in a volume group with a logical volume that has a file system on it

  • An mdadm RAID device with a file system on it

  • An LVM RAID device with a file system on it

Neither the wipe field (deprecated) nor the wipeDevice structure (recommended since Container Cloud 2.26.0) has any effect in this case, and neither can protect data on these devices.

Therefore, to prevent data loss, move the necessary data from these file systems to another server beforehand, if required.

To create a custom bare metal host profile:

  1. Select from the following options:

    • For a management cluster, log in to the bare metal seed node that will be used to bootstrap the management cluster.

    • For a managed cluster, log in to the local machine where your management cluster kubeconfig is located and where kubectl is installed.

      Note

      The management cluster kubeconfig is created automatically during the last stage of the management cluster bootstrap.

  2. Select from the following options:

    • For a management cluster, open templates/bm/baremetalhostprofiles.yaml.template for editing.

    • For a managed cluster, create a new bare metal host profile under the templates/bm/ directory.

  3. Edit the host profile using the example template below to meet your hardware configuration requirements:

    Example template of a bare metal host profile
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHostProfile
    metadata:
      name: <profileName>
      namespace: <ManagedClusterProjectName>
      # Add the name of the non-default project for the managed cluster
      # being created.
    spec:
      devices:
      # From the HW node, obtain the first device whose size is at least 120 GiB.
      - device:
          minSize: 120Gi
          wipe: true
        partitions:
        - name: bios_grub
          partflags:
          - bios_grub
          size: 4Mi
          wipe: true
        - name: uefi
          partflags:
          - esp
          size: 200Mi
          wipe: true
        - name: config-2
          size: 64Mi
          wipe: true
        - name: lvm_root_part
          size: 0
          wipe: true
      # From the HW node, obtain the second device whose size is at least 120 GiB.
      # If a device exists but does not fit the size,
      # the BareMetalHostProfile will not be applied to the node.
      - device:
          minSize: 120Gi
          wipe: true
      # From the HW node, obtain the disk device with the exact device path.
      - device:
          byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:1
          minSize: 120Gi
          wipe: true
        partitions:
        - name: lvm_lvp_part
          size: 0
          wipe: true
      # Example of wiping a device without partitioning it.
      # Mandatory when the disk is to be used as the Ceph backend later.
      - device:
          byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2
          wipe: true
      fileSystems:
      - fileSystem: vfat
        partition: config-2
      - fileSystem: vfat
        mountPoint: /boot/efi
        partition: uefi
      - fileSystem: ext4
        logicalVolume: root
        mountPoint: /
      - fileSystem: ext4
        logicalVolume: lvp
        mountPoint: /mnt/local-volumes/
      logicalVolumes:
      - name: root
        size: 0
        vg: lvm_root
      - name: lvp
        size: 0
        vg: lvm_lvp
      postDeployScript: |
        #!/bin/bash -ex
        echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
      preDeployScript: |
        #!/bin/bash -ex
        echo $(date) 'pre_deploy_script done' >> /root/pre_deploy_done
      volumeGroups:
      - devices:
        - partition: lvm_root_part
        name: lvm_root
      - devices:
        - partition: lvm_lvp_part
        name: lvm_lvp
      grubConfig:
        defaultGrubOptions:
        - GRUB_DISABLE_RECOVERY="true"
        - GRUB_PRELOAD_MODULES=lvm
        - GRUB_TIMEOUT=20
      kernelParameters:
        sysctl:
        # For the list of options prohibited to change, refer to
        # https://docs.mirantis.com/mke/3.7/install/predeployment/set-up-kernel-default-protections.html
          kernel.dmesg_restrict: "1"
          kernel.core_uses_pid: "1"
          fs.file-max: "9223372036854775807"
          fs.aio-max-nr: "1048576"
          fs.inotify.max_user_instances: "4096"
          vm.max_map_count: "262144"
    
  4. Optional. Configure wiping of the target device or partition to be used for cluster deployment as described in Wipe a device or partition.

  5. Optional. Configure multiple devices for LVM volume using the example template extract below for reference.

    Caution

    The following template extract contains only sections relevant to LVM configuration with multiple PVs. Expand the main template described in the previous step with the configuration below if required.

    spec:
      devices:
        ...
        - device:
          ...
          partitions:
            - name: lvm_lvp_part1
              size: 0
              wipe: true
        - device:
          ...
          partitions:
            - name: lvm_lvp_part2
              size: 0
              wipe: true
    volumeGroups:
      ...
      - devices:
        - partition: lvm_lvp_part1
        - partition: lvm_lvp_part2
        name: lvm_lvp
    logicalVolumes:
      ...
      - name: root
        size: 0
        vg: lvm_lvp
    fileSystems:
      ...
      - fileSystem: ext4
        logicalVolume: root
        mountPoint: /
    
  6. For a managed cluster, configure required disks for the Ceph cluster as described in Configure Ceph disks in a host profile.

  7. Optional. Technology Preview. Configure support for the Redundant Array of Independent Disks (RAID), which allows, for example, installing the cluster operating system on a RAID device. For details, refer to Configure RAID support.

  8. Optional. Configure the RX/TX buffer size for physical network interfaces and txqueuelen for any network interfaces.

    This configuration can greatly benefit high-load and high-performance network interfaces. You can configure these parameters using the udev rules. For example:

    postDeployScript: |
      #!/bin/bash -ex
      ...
      echo 'ACTION=="add|change", SUBSYSTEM=="net", KERNEL=="eth*|en*", RUN+="/sbin/ethtool -G $name rx 4096 tx 4096"' > /etc/udev/rules.d/59-net.ring.rules
    
      echo 'ACTION=="add|change", SUBSYSTEM=="net", KERNEL=="eth*|en*|bond*|k8s-*|v*" ATTR{tx_queue_len}="10000"' > /etc/udev/rules.d/58-net.txqueue.rules
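
    After the node is deployed, you can verify on the host that the settings took effect, for example (eno1 is a placeholder interface name):

      ethtool -g eno1
      cat /sys/class/net/eno1/tx_queue_len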
    
  9. Add or edit the mandatory parameters in the new BareMetalHostProfile object. For the parameters description, see API: BareMetalHostProfile spec.

    Note

    If asymmetric traffic is expected on some of the managed cluster nodes, enable the loose mode for the corresponding interfaces on those nodes by setting the net.ipv4.conf.<interface-name>.rp_filter parameter to "2" in the kernelParameters.sysctl section. For example:

    kernelParameters:
      sysctl:
        net.ipv4.conf.k8s-lcm.rp_filter: "2"
    
  10. Select from the following options:

    • For a management cluster, proceed with the cluster bootstrap procedure as described in Deploy a management cluster using CLI.

    • For a managed cluster, select from the following options:

      Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0)

      1. Log in to the Container Cloud web UI with the operator permissions.

      2. Switch to the required non-default project using the Switch Project action icon located on top of the main left-side navigation panel.

        To create a project, refer to Create a project for managed clusters.

      3. In the left sidebar, navigate to Baremetal and click the Host Profiles tab.

      4. Click Create Host Profile.

      5. Fill out the Create host profile form:

        • Name

          Name of the bare metal host profile.

        • Specification

          BareMetalHostProfile object specification in the YAML format that you have previously created. Click Edit to edit the BareMetalHostProfile object if required.

          Note

          Before Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0), the field name is YAML file, and you can upload the required YAML file instead of inserting and editing it.

        • Labels

          Available since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0). Key-value pairs attached to BareMetalHostProfile.

      1. Add the bare metal host profile to your management cluster:

        kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <managedClusterProjectName> apply -f <pathToBareMetalHostProfileFile>
        
      2. If required, further modify the host profile:

        kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <managedClusterProjectName> edit baremetalhostprofile <hostProfileName>
        
      3. Proceed with Add a bare metal host either using web UI or CLI.
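
      To confirm at any time which host profiles exist in the project, list them using the same resource name as in the commands above:

        kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <managedClusterProjectName> get baremetalhostprofile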

Configure Ceph disks in a host profile

This section describes how to configure devices for the Ceph cluster in the BareMetalHostProfile object of a managed cluster.

To configure disks for a Ceph cluster:

  1. Open the BareMetalHostProfile object of a managed cluster for editing.

  2. In the spec.devices section, add each disk intended for use as a Ceph OSD data device with size: 0 and wipe: true.

    Example configuration for sde-sdh disks to use as Ceph OSDs:

    spec:
      devices:
      ...
      - device:
          byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:1
          size: 0
          wipe: true
      - device:
          byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2
          size: 0
          wipe: true
      - device:
          byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:3
          size: 0
          wipe: true
      - device:
          byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:4
          size: 0
          wipe: true
    
  3. Since Container Cloud 2.24.0, if you plan to use a separate metadata device for Ceph OSD, configure the spec.devices section as described below.

    Important

    Mirantis highly recommends configuring disk partitions for Ceph OSD metadata using BareMetalHostProfile.

    Configuration of a separate metadata device for Ceph OSD
    1. Add the device to spec.devices with a single partition that will use the entire disk size.

      For example, if you plan to use four Ceph OSDs with a separate metadata device for each Ceph OSD, configure the spec.devices section as follows:

      spec:
        devices:
        ...
        - device:
            byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:5
            wipe: true
          partitions:
          - name: ceph_meta
            size: 0
            wipe: true
      
    2. Create a volume group on top of the defined partition and create the required number of logical volumes (LVs) on top of the created volume group (VG). Add one logical volume per Ceph OSD on the node.

      Example snippet of an LVM configuration for a Ceph metadata disk:

      spec:
        ...
        volumeGroups: