The documentation is intended to help operators understand the core concepts
of the product.
The information provided in this documentation set is constantly
improved and amended based on feedback and requests from our
software consumers. This documentation set describes
the features supported within the three latest Container Cloud minor releases and
their supported Cluster releases, with a corresponding note
Available since <release-version>.
The following table lists the guides included in the documentation set you
are reading:
GUI elements that include any part of interactive user interface and
menu navigation.
Superscript
Some extra, brief information. For example, if a feature is
available from a specific release or if a feature is in the
Technology Preview development stage.
Note
The Note block
Messages of a generic meaning that may be useful to the user.
Caution
The Caution block
Information that prevents a user from mistakes and undesirable
consequences when following the procedures.
Warning
The Warning block
Messages that include details that can be easily missed, but should not
be ignored by the user and are valuable before proceeding.
See also
The See also block
List of references that may be helpful for understanding of some related
tools, concepts, and so on.
Learn more
The Learn more block
Used in the Release Notes to wrap a list of internal references to
the reference architecture, deployment and operation procedures specific
to a newly implemented product feature.
A Technology Preview feature provides early access to upcoming product
innovations, allowing customers to experiment with the functionality and
provide feedback.
Technology Preview features may be privately or publicly available, but in
neither case are they intended for production use. While Mirantis will provide
assistance with such features through official channels, normal Service
Level Agreements do not apply.
As Mirantis considers making future iterations of Technology Preview features
generally available, we will do our best to resolve any issues that customers
experience when using these features.
During the development of a Technology Preview feature, additional components
may become available to the public for evaluation. Mirantis cannot guarantee
the stability of such features. As a result, if you are using Technology
Preview features, you may not be able to seamlessly update to subsequent
product releases, or to upgrade or migrate to functionality that
has not yet been announced as fully supported.
Mirantis makes no guarantees that Technology Preview features will graduate
to generally available features.
The documentation set uses the term Mirantis Container Cloud GA to refer to the latest
released GA version of the product. For details about the release dates of the
Container Cloud GA minor releases, refer to
Container Cloud releases.
Mirantis Container Cloud enables you to ship code faster by enabling speed
with choice, simplicity, and security. Through a single pane of glass you can
deploy, manage, and observe Kubernetes clusters on private clouds or bare metal
infrastructure. Container Cloud supports the following
on-premises cloud infrastructure: OpenStack, VMware, and bare metal.
The list of the most common use cases includes:
Multi-cloud
Organizations are increasingly moving toward a multi-cloud strategy,
with the goal of enabling the effective placement of workloads over
multiple platform providers. Multi-cloud strategies can introduce
a lot of complexity and management overhead. Mirantis Container Cloud
enables you to effectively deploy and manage container clusters
(Kubernetes and Swarm) across multiple cloud provider platforms.
Hybrid cloud
The challenges of consistently deploying, tracking, and managing hybrid
workloads across multiple cloud platforms are compounded by not having
a single point that provides information on all available resources.
Mirantis Container Cloud enables hybrid cloud workloads by providing
a central point of management and visibility of all your cloud resources.
Kubernetes cluster lifecycle management
The consistent lifecycle management of a single Kubernetes cluster
is a complex task on its own that becomes considerably more difficult
when you have to manage multiple clusters across different platforms
spread across the globe. Mirantis Container Cloud provides a single,
centralized point from which you can perform full lifecycle management
of your container clusters, including automated updates and upgrades.
Container Cloud also supports attachment of existing Mirantis Kubernetes
Engine clusters that are not originally deployed by Container Cloud.
Highly regulated industries
Regulated industries need fine-grained access control,
high security standards, and extensive reporting capabilities to ensure
that they can meet and exceed the applicable security standards and requirements.
Mirantis Container Cloud provides a fine-grained role-based access
control (RBAC) mechanism and easy integration and federation with existing
identity management (IDM) systems.
Logging, monitoring, alerting
Complete operational visibility is required to identify and address issues
in the shortest amount of time, before the problem becomes serious.
Mirantis StackLight is the proactive monitoring, logging, and alerting
solution designed for large-scale container and cloud observability with
extensive collectors, dashboards, trend reporting and alerts.
Storage
Cloud environments require a unified pool of storage that can be scaled up by
simply adding storage server nodes. Ceph is a unified, distributed storage
system designed for excellent performance, reliability, and scalability.
Deploy Ceph using Rook to provide and manage robust persistent storage
that can be used by Kubernetes workloads on the baremetal-based clusters.
Security
Security is a core concern for all enterprises, especially with more
systems being exposed to the Internet as a norm. Mirantis
Container Cloud provides a multi-layered security approach that
includes effective identity management and role-based authentication,
secure out-of-the-box defaults, and extensive security scanning and
monitoring during the development process.
5G and Edge
The introduction of 5G technologies and the support of Edge workloads
require an effective multi-tenant solution to manage the underlying
container infrastructure. Mirantis Container Cloud provides a full-stack,
secure, multi-cloud cluster management and Day-2 operations
solution that supports both on-premises bare metal and cloud.
Mirantis Container Cloud is a set of microservices
that are deployed using Helm charts and run in a Kubernetes cluster.
Container Cloud is based on the Kubernetes Cluster API community initiative.
The following diagram illustrates an overview of Container Cloud
and the clusters it manages:
All artifacts used by Kubernetes and workloads are stored
on the Container Cloud content delivery network (CDN):
mirror.mirantis.com (Debian packages including the Ubuntu mirrors)
binary.mirantis.com (Helm charts and binary artifacts)
mirantis.azurecr.io (Docker image registry)
All Container Cloud components are deployed in the Kubernetes clusters.
All Container Cloud APIs are implemented using Kubernetes
Custom Resource Definitions (CRDs), which represent custom objects
stored in Kubernetes and allow you to extend the Kubernetes API.
The Container Cloud logic is implemented using controllers.
A controller handles the changes in custom resources defined
in the controller CRD.
A custom resource consists of a spec that describes the desired state
of a resource provided by a user.
During every change, a controller reconciles the external state of a custom
resource with the user parameters and stores this external state in the
status subresource of its custom resource.
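The reconciliation described above can be sketched as follows. This is a minimal illustration only, not the Container Cloud implementation: the CustomResource class, the reconcile function, and the sample spec keys are hypothetical names, and a plain dictionary stands in for the external state that a real controller would change through a backend API.

```python
from dataclasses import dataclass, field


@dataclass
class CustomResource:
    """Hypothetical in-memory stand-in for a Kubernetes custom resource."""
    name: str
    spec: dict                                   # desired state provided by the user
    status: dict = field(default_factory=dict)   # external state recorded by the controller


def reconcile(resource: CustomResource, external: dict) -> list:
    """Bring the external state in line with resource.spec.

    Returns the keys that had to be changed and records the resulting
    external state in resource.status.
    """
    changed = []
    for key, desired in resource.spec.items():
        if external.get(key) != desired:
            external[key] = desired   # a real controller would call the backend here
            changed.append(key)
    resource.status = dict(external)
    return changed


# Illustrative values only.
cluster = CustomResource(name="demo", spec={"replicas": 3, "version": "1.2"})
backend_state = {"replicas": 1}

first_pass = reconcile(cluster, backend_state)    # changes are applied
second_pass = reconcile(cluster, backend_state)   # nothing left to do
```

Running reconcile a second time without a spec change returns an empty list, which reflects the idempotent nature of a reconciliation loop.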
Since Container Cloud 2.27.3 (Cluster release 16.2.3), support
for vSphere-based clusters is suspended. For details, see
Deprecation notes.
The types of the Container Cloud clusters include:
Bootstrap cluster
Contains the Bootstrap web UI for the OpenStack and vSphere providers.
The Bootstrap web UI support for the bare metal provider will be added
in one of the following Container Cloud releases.
Runs the bootstrap process on a seed node that can be reused after the
management cluster deployment for other purposes. For the OpenStack or
vSphere provider, it can be an operator desktop computer. For the bare
metal provider, this is a data center node.
Requires access to one of the following provider backends: bare metal,
OpenStack, or vSphere.
Initially, the bootstrap cluster is created with the following minimal set
of components: Bootstrap Controller, public API charts, and the Bootstrap
web UI.
The user can interact with the bootstrap cluster through the Bootstrap web
UI or API to create the configuration for a management cluster and start
its deployment. More specifically, the user performs the following
operations:
Select the provider, add provider credentials.
Add proxy and SSH keys.
Configure the cluster and machines.
Deploy a management cluster.
The user can monitor the deployment progress of the cluster and machines.
After a successful deployment, the user can download the kubeconfig
artifact of the provisioned cluster.
Management cluster
Comprises Container Cloud as a product and provides the following functionality:
Runs all public APIs and services including the web UIs
of Container Cloud.
Does not require access to any provider backend.
Runs the provider-specific services and internal API including
LCMMachine and LCMCluster. Also, it runs an LCM controller for
orchestrating managed clusters and other controllers for handling
different resources.
Requires two-way access to a provider backend. The provider connects
to a backend to spawn managed cluster nodes,
and the agent running on the nodes accesses the regional cluster
to obtain the deployment information.
For deployment details of a management cluster, see Deployment Guide.
Managed cluster
A Mirantis Kubernetes Engine (MKE) cluster that an end user
creates using the Container Cloud web UI.
Requires access to its management cluster. Each node of a managed
cluster runs an LCM Agent that connects to the LCM machine of the
management cluster to obtain the deployment details.
Since 2.25.2, an attached MKE cluster that is not created using
Container Cloud for vSphere-based clusters. In such a case, nodes of the
attached cluster do not contain LCM Agent. For supported MKE versions that
can be attached to Container Cloud, see Release Compatibility Matrix.
Baremetal-based managed clusters support the Mirantis OpenStack for Kubernetes
(MOSK) product. For details, see
MOSK documentation.
All types of the Container Cloud clusters except the bootstrap cluster
are based on the MKE and Mirantis Container Runtime (MCR) architecture.
For details, see MKE and
MCR documentation.
The following diagram illustrates the distribution of services
between each type of the Container Cloud clusters:
The Mirantis Container Cloud provider is the central component
of Container Cloud that provisions a node of a management, regional,
or managed cluster and runs the LCM Agent on this node.
It runs in the management and regional clusters and requires a connection
to a provider backend.
The Container Cloud provider interacts with the following
types of public API objects:
Public API object name
Description
Container Cloud release object
Contains the following information about clusters:
Version of the supported Cluster release for a management and
regional clusters
List of supported Cluster releases for the managed clusters
and supported upgrade path
Description of Helm charts that are installed
on the management and regional clusters
depending on the selected provider
Cluster release object
Provides a specific version of a management, regional, or
managed cluster.
A Cluster release object, like a Container Cloud release
object, never changes. Only new releases can be added.
Any change leads to a new release of a cluster.
Contains references to all components and their versions
that are used to deploy all cluster types:
LCM components:
LCM Agent
Ansible playbooks
Scripts
Description of steps to execute during a cluster deployment
and upgrade
Helm Controller image references
Supported Helm charts description:
Helm chart name and version
Helm release name
Helm values
Cluster object
References the Credentials, KaaSRelease and ClusterRelease
objects.
Is tied to a specific Container Cloud region and provider.
Represents all cluster-level resources. For example, for the
OpenStack-based clusters, it represents networks, load balancer for
the Kubernetes API, and so on. It uses data from the Credentials
object to create these resources and data from the KaaSRelease
and ClusterRelease objects to ensure that all lower-level cluster
objects are created.
Machine object
References the Cluster object.
Represents one node of a managed cluster, for example, an OpenStack VM,
and contains all data to provision it.
Credentials object
Contains all information necessary to connect to a provider backend.
Is tied to a specific Container Cloud region and provider.
PublicKey object
Is provided to every machine to enable SSH access.
The following diagram illustrates the Container Cloud provider data flow:
The Container Cloud provider performs the following operations
in Container Cloud:
Consumes the following types of data from the management and regional clusters:
Credentials to connect to a provider backend
Deployment instructions from the KaaSRelease and ClusterRelease
objects
The cluster-level parameters from the Cluster objects
The machine-level parameters from the Machine objects
Prepares data for all Container Cloud components:
Creates the LCMCluster and LCMMachine custom resources
for LCM Controller and LCM Agent. The LCMMachine custom resources
are created empty to be later handled by the LCM Controller.
Creates the HelmBundle custom resources for the Helm Controller
using data from the KaaSRelease and ClusterRelease objects.
Creates service accounts for these custom resources.
Creates a scope in Identity and access management (IAM)
for a user access to a managed cluster.
Provisions nodes for a managed cluster using the cloud-init script
that downloads and runs the LCM Agent.
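The data preparation step above can be illustrated with a small sketch. The object names Cluster, Machine, and LCMMachine come from this section, but every field name, the prepare_lcm_machines helper, and the sample values are assumptions made for illustration and do not reflect the real Container Cloud API schema.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    credentials: str       # name of the referenced Credentials object
    kaas_release: str      # name of the referenced KaaSRelease object
    cluster_release: str   # name of the referenced ClusterRelease object


@dataclass
class Machine:
    name: str
    cluster: str           # name of the Cluster object that the machine references


@dataclass
class LCMMachine:
    name: str
    cluster: str
    spec: dict = field(default_factory=dict)  # created empty, handled later by LCM Controller


def prepare_lcm_machines(cluster, machines):
    """Create one empty LCMMachine per Machine that references the cluster."""
    return [
        LCMMachine(name=m.name, cluster=cluster.name)
        for m in machines
        if m.cluster == cluster.name
    ]


# Illustrative object names only.
demo = Cluster("demo", "demo-credentials", "kaas-release-example", "cluster-release-example")
machines = [Machine("demo-node-0", "demo"), Machine("other-node-0", "other")]
lcm_machines = prepare_lcm_machines(demo, machines)
```

Only the machine that references the demo cluster produces an LCMMachine, and its spec stays empty until a later controller fills it in.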
The Mirantis Container Cloud Release Controller is responsible
for the following functionality:
Monitor and control the KaaSRelease and ClusterRelease objects
present in a management cluster. If any release object is used
in a cluster, the Release Controller prevents the deletion
of such an object.
Trigger the Container Cloud auto-upgrade procedure if a new
KaaSRelease object is found:
Search for the managed clusters with old Cluster releases
that are not supported by a new Container Cloud release.
If any are detected, abort the auto-upgrade and display
a corresponding note about an old Cluster release in the Container
Cloud web UI for the managed clusters. In this case, a user must update
all managed clusters using the Container Cloud web UI.
Once all managed clusters are upgraded to the Cluster releases
supported by a new Container Cloud release,
the Container Cloud auto-upgrade is retriggered
by the Release Controller.
Trigger the Container Cloud release upgrade of all Container Cloud
components in a management cluster.
The upgrade itself is processed by the Container Cloud provider.
Trigger the Cluster release upgrade of a management cluster
to the Cluster release version that is indicated
in the upgraded Container Cloud release version.
The LCMCluster components, such as MKE, are upgraded before
the HelmBundle components, such as StackLight or Ceph.
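The auto-upgrade gating and the component ordering described above can be sketched as follows. This is a simplified model under stated assumptions: the ManagedCluster class, the plan_auto_upgrade and upgrade_order functions, and the release names are hypothetical and are not part of the Release Controller code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManagedCluster:
    name: str
    cluster_release: str


def plan_auto_upgrade(managed, supported_releases):
    """Return (proceed, blockers) for a newly found Container Cloud release.

    supported_releases is the set of Cluster releases that the new
    Container Cloud release supports (assumed input shape).
    """
    blockers = [c.name for c in managed if c.cluster_release not in supported_releases]
    return (not blockers, blockers)


def upgrade_order(components):
    """Order (name, kind) pairs so that LCMCluster components precede HelmBundle ones."""
    rank = {"LCMCluster": 0, "HelmBundle": 1}
    return sorted(components, key=lambda c: rank[c[1]])


# Illustrative cluster and release names only.
clusters = [
    ManagedCluster("managed-1", "release-c"),
    ManagedCluster("managed-2", "release-a"),
]
proceed, blockers = plan_auto_upgrade(clusters, {"release-b", "release-c"})
order = upgrade_order(
    [("StackLight", "HelmBundle"), ("MKE", "LCMCluster"), ("Ceph", "HelmBundle")]
)
```

In this sketch, managed-2 still runs a Cluster release that the new Container Cloud release does not support, so the upgrade does not proceed until that cluster is updated.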
Once a management cluster is upgraded, an option to update
a managed cluster becomes available in the Container Cloud web UI.
During a managed cluster update, all cluster components including
Kubernetes are automatically upgraded to newer versions if available.
The LCMCluster components, such as MKE, are upgraded before
the HelmBundle components, such as StackLight or Ceph.
The Operator can delay the Container Cloud automatic upgrade procedure for a
limited amount of time or schedule the upgrade to run at the desired hours or on the desired weekdays.
For details, see Schedule Mirantis Container Cloud upgrades.
Container Cloud remains operational during the management cluster upgrade.
Managed clusters are not affected during this upgrade. For the list of
components that are updated during the Container Cloud upgrade, see the
Components versions section of the corresponding Container Cloud release in
Release Notes.
When Mirantis announces support of the newest versions of
Mirantis Container Runtime (MCR) and Mirantis Kubernetes Engine
(MKE), Container Cloud automatically upgrades these components as well.
For the maintenance window best practices before upgrade of these
components, see
MKE Documentation.
The Mirantis Container Cloud web UI is mainly designed
to create and update the managed clusters as well as add or remove machines
to or from an existing managed cluster.
You can use the Container Cloud web UI
to obtain the management cluster details including endpoints, release version,
and so on.
The management cluster update occurs automatically
with a new release change log available through the Container Cloud web UI.
The Container Cloud web UI is a JavaScript application that is based
on the React framework. The Container Cloud web UI is designed to work
on a client side only. Therefore, it does not require a special backend.
It interacts with the Kubernetes and Keycloak APIs directly.
The Container Cloud web UI uses a Keycloak token
to interact with Container Cloud API and download kubeconfig
for the management and managed clusters.
The Container Cloud web UI uses NGINX that runs on a management cluster
and handles the Container Cloud web UI static files.
NGINX proxies the Kubernetes and Keycloak APIs
for the Container Cloud web UI.
The bare metal service provides for the discovery, deployment, and management
of bare metal hosts.
The bare metal management in Mirantis Container Cloud
is implemented as a set of modular microservices.
Each microservice implements a certain requirement or function
within the bare metal management system.
OpenStack Ironic
The backend bare metal manager in a standalone mode with its auxiliary
services that include httpd, dnsmasq, and mariadb.
OpenStack Ironic Inspector
Introspects and discovers the bare metal hosts inventory.
Includes OpenStack Ironic Python Agent (IPA) that is used
as a provision-time agent for managing bare metal hosts.
Ironic Operator
Monitors changes in the external IP addresses of httpd, ironic,
and ironic-inspector and automatically reconciles the configuration
for dnsmasq, ironic, baremetal-provider,
and baremetal-operator.
Bare Metal Operator
Manages bare metal hosts through the Ironic API. The Container Cloud
bare-metal operator implementation is based on the Metal³ project.
Bare metal resources manager
Ensures that the bare metal provisioning artifacts, such as the
distribution image of the operating system, are available and up to date.
cluster-api-provider-baremetal
The plugin for the Kubernetes Cluster API integrated with Container Cloud.
Container Cloud uses the Metal³ implementation of
cluster-api-provider-baremetal for the Cluster API.
HAProxy
Load balancer for external access to the Kubernetes API endpoint.
LCM Agent
Used for physical and logical storage, physical and logical networking,
and control over the life cycle of bare metal machine resources.
Ceph
Distributed shared storage is required by the Container Cloud services
to create persistent volumes to store their data.
MetalLB
Load balancer for Kubernetes services on bare metal.
Keepalived
Monitoring service that ensures availability of the virtual IP for
the external load balancer endpoint (HAProxy).
IPAM
IP address management services provide consistent IP address space
to the machines in bare metal clusters. See details in
IP Address Management.
Mirantis Container Cloud on bare metal uses IP Address Management (IPAM)
to keep track of the network addresses allocated to bare metal hosts.
This is necessary to avoid IP address conflicts
and expiration of address leases to machines through DHCP.
Note
Only the IPv4 address family is currently supported by Container Cloud
and IPAM. IPv6 is neither supported nor used in Container Cloud.
IPAM is provided by the kaas-ipam controller. Its functions
include:
Allocation of IP address ranges or subnets to newly created clusters using
SubnetPool and Subnet resources.
Allocation of IP addresses to machines and cluster services at the request
of baremetal-provider using the IpamHost and IPaddr resources.
Creation and maintenance of host networking configuration
on the bare metal hosts using the IpamHost resources.
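The idea of static address allocation from a subnet can be illustrated with the Python standard ipaddress module. This is a toy sketch, not the kaas-ipam logic: the SubnetAllocator class, its methods, and the sample addresses are hypothetical, and the real service works with the Subnet, IpamHost, and IPaddr resources instead of an in-memory dictionary.

```python
import ipaddress


class SubnetAllocator:
    """Toy allocator that hands out host addresses from one IPv4 subnet."""

    def __init__(self, cidr, reserved=()):
        self.network = ipaddress.IPv4Network(cidr)
        self.reserved = {ipaddress.IPv4Address(a) for a in reserved}
        self.leases = {}  # machine name -> IPv4Address

    def allocate(self, machine):
        # A repeated request for the same machine returns the same address,
        # so an assignment never changes or expires the way a DHCP lease can.
        if machine in self.leases:
            return self.leases[machine]
        used = set(self.leases.values()) | self.reserved
        for addr in self.network.hosts():
            if addr not in used:
                self.leases[machine] = addr
                return addr
        raise RuntimeError(f"subnet {self.network} is exhausted")


# Illustrative subnet; 10.0.0.1 is kept free for a gateway.
ipam = SubnetAllocator("10.0.0.0/29", reserved=["10.0.0.1"])
first = ipam.allocate("machine-0")
second = ipam.allocate("machine-1")
repeat = ipam.allocate("machine-0")
```

Tracking every assignment in one place is what prevents two hosts from receiving the same address.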
The IPAM service can support different networking topologies and network
hardware configurations on the bare metal hosts.
In the most basic network configuration, IPAM uses a single L3 network
to assign addresses to all bare metal hosts, as defined in
Managed cluster networking.
You can apply complex networking configurations to a bare metal host
using the L2 templates. The L2 templates imply multihomed host networking
and enable you to create a managed cluster where nodes use separate host
networks for different types of traffic. Multihoming is required
to ensure the security and performance of a managed cluster.
Caution
Modification of L2 templates in use is allowed with a mandatory
validation step from the Infrastructure Operator to prevent accidental
cluster failures due to unsafe changes. The list of risks posed by modifying
L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
The main purpose of networking in a Container Cloud management cluster is to
provide access to the Container Cloud Management API that consists of the
Kubernetes API of the Container Cloud management cluster and the Container
Cloud LCM API. This API allows end users to provision and configure managed
clusters and machines. Also, this API is used by LCM agents in managed
clusters to obtain configuration and report status.
The following types of networks are supported for the management clusters in
Container Cloud:
PXE network
Enables PXE boot of all bare metal machines in the Container Cloud region.
PXE subnet
Provides IP addresses for DHCP and network boot of the bare metal hosts
for initial inspection and operating system provisioning.
This network may not have the default gateway or a router connected
to it. The PXE subnet is defined by the Container Cloud Operator
during bootstrap.
Provides IP addresses for the bare metal management services of
Container Cloud, such as bare metal provisioning service (Ironic).
These addresses are allocated and served by MetalLB.
Management network
Connects LCM Agents running on the hosts to the Container Cloud LCM API.
Serves the external connections to the Container Cloud Management API.
The network is also used for communication between kubelet
and the Kubernetes API server inside a Kubernetes cluster. The MKE
components use this network for communication inside a swarm cluster.
LCM subnet
Provides IP addresses for the Kubernetes nodes in the management cluster.
This network also provides a Virtual IP (VIP) address for the load
balancer that enables external access to the Kubernetes API
of a management cluster. This VIP is also the endpoint to access
the Container Cloud Management API in the management cluster.
Provides IP addresses for the externally accessible services of
Container Cloud, such as Keycloak, web UI, StackLight.
These addresses are allocated and served by MetalLB.
Kubernetes workloads network
Technology Preview
Serves the internal traffic between workloads on the management cluster.
Kubernetes workloads subnet
Provides IP addresses that are assigned to nodes and used by Calico.
Out-of-Band (OOB) network
Connects to Baseboard Management Controllers of the servers that host
the management cluster. The OOB subnet must be accessible from the
management network through IP routing. The OOB network
is not managed by Container Cloud and is not represented in the IPAM API.
Kubernetes cluster networking typically focuses on connecting pods on
different nodes. On bare metal, however, the cluster networking is more
complex as it needs to facilitate many different types of traffic.
Kubernetes clusters managed by Mirantis Container Cloud
have the following types of traffic:
PXE network
Enables the PXE boot of all bare metal machines in Container Cloud.
This network is not configured on the hosts in a managed cluster.
It is used by the bare metal provider to provision additional
hosts in managed clusters and is disabled on the hosts after
provisioning is done.
Life-cycle management (LCM) network
Connects LCM Agents running on the hosts to the Container Cloud LCM API.
The LCM API is provided by the management cluster.
The LCM network is also used for communication between kubelet
and the Kubernetes API server inside a Kubernetes cluster. The MKE
components use this network for communication inside a swarm cluster.
When using the BGP announcement of the IP address for the cluster API
load balancer, which is available as Technology Preview since
Container Cloud 2.24.4, no segment stretching is required
between Kubernetes master nodes. Also, in this scenario, the load
balancer IP address is not required to match the LCM subnet CIDR address.
LCM subnet(s)
Provides IP addresses that are statically allocated by the IPAM service
to bare metal hosts. This network must be connected to the Kubernetes API
endpoint of the management cluster through an IP router.
LCM Agents running on managed clusters will connect to the management
cluster API through this router. LCM subnets may be different
per managed cluster as long as this connection requirement is satisfied.
The Virtual IP (VIP) address for the load balancer that enables access to
the Kubernetes API of the managed cluster must be allocated from the LCM
subnet.
Cluster API subnet
Technology Preview
Provides a load balancer IP address for external access to the cluster
API. Mirantis recommends that this subnet stays unique per managed
cluster.
Kubernetes workloads network
Serves as an underlay network for traffic between pods in
the managed cluster. Do not share this network between clusters.
Kubernetes workloads subnet(s)
Provides IP addresses that are statically allocated by the IPAM service
to all nodes and that are used by Calico for cross-node communication
inside a cluster. By default, VXLAN overlay is used for Calico
cross-node communication.
Kubernetes external network
Serves ingress traffic to the managed cluster from the outside world.
You can share this network between clusters, but with dedicated subnets
per cluster. Several or all cluster nodes must be connected to
this network. Traffic from external users to the externally available
Kubernetes load-balanced services comes through the nodes that
are connected to this network.
Services subnet(s)
Provides IP addresses for externally available Kubernetes load-balanced
services. The address ranges for MetalLB are assigned from this subnet.
There can be several subnets per managed cluster that define
the address ranges or address pools for MetalLB.
External subnet(s)
Provides IP addresses that are statically allocated by the IPAM service
to nodes. The IP gateway in this network is used as the default route
on all nodes that are connected to this network. This network
allows external users to connect to the cluster services exposed as
Kubernetes load-balanced services. MetalLB speakers must run on the same
nodes. For details, see Configure node selector for MetalLB speaker.
Storage network
Serves storage access and replication traffic from and to Ceph OSD services.
The storage network does not need to be connected to any IP routers
and does not require external access, unless you want to use Ceph
from outside of a Kubernetes cluster.
To use a dedicated storage network, define and configure
both subnets listed below.
Storage access subnet(s)
Provides IP addresses that are statically allocated by the IPAM service
to Ceph nodes.
The Ceph OSD services bind to these addresses on their respective
nodes. Serves Ceph access traffic from and to storage clients.
This is a public network in Ceph terms.
Storage replication subnet(s)
Provides IP addresses that are statically allocated by the IPAM service
to Ceph nodes.
The Ceph OSD services bind to these addresses on their respective
nodes. Serves Ceph internal replication traffic. This is a
cluster network in Ceph terms.
Out-of-Band (OOB) network
Connects baseboard management controllers (BMCs) of the bare metal hosts.
This network must not be accessible from the managed clusters.
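Two of the constraints listed above lend themselves to a quick check before deployment: the Kubernetes API VIP must be allocated from the LCM subnet (in the default case, without the BGP announcement of the load balancer address), and the Kubernetes workloads network must not be shared between clusters. The following sketch uses the Python standard ipaddress module. The function names and sample addresses are hypothetical, and this is not a Container Cloud tool.

```python
import ipaddress


def vip_in_lcm_subnet(vip, lcm_cidr):
    """Return True if the Kubernetes API VIP belongs to the LCM subnet."""
    return ipaddress.ip_address(vip) in ipaddress.ip_network(lcm_cidr)


def shared_workload_subnets(clusters):
    """Return pairs of cluster names whose workloads subnets overlap.

    clusters maps a cluster name to the CIDR of its workloads subnet.
    """
    names = sorted(clusters)
    conflicts = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            net_a = ipaddress.ip_network(clusters[first])
            net_b = ipaddress.ip_network(clusters[second])
            if net_a.overlaps(net_b):
                conflicts.append((first, second))
    return conflicts


# Illustrative addresses only.
vip_ok = vip_in_lcm_subnet("10.10.0.10", "10.10.0.0/24")
vip_bad = vip_in_lcm_subnet("10.20.0.10", "10.10.0.0/24")
conflicts = shared_workload_subnets(
    {"a": "10.30.0.0/24", "b": "10.30.0.128/25", "c": "10.31.0.0/24"}
)
```

Here the workloads subnets of clusters a and b overlap, so the pair is reported as a conflict.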
The following diagram illustrates the networking schema of the Container Cloud
deployment on bare metal with a managed cluster:
The following network roles are defined for all Mirantis Container Cloud
cluster nodes on bare metal, including the bootstrap, management, and managed
cluster nodes:
Out-of-band (OOB) network
Connects the Baseboard Management Controllers (BMCs) of the hosts
in the network to Ironic. This network is out of band for the
host operating system.
PXE network
Enables remote booting of servers through the PXE protocol. In management
clusters, the DHCP server listens on this network for host discovery and
inspection. In managed clusters, hosts use this network for the initial
PXE boot and provisioning.
LCM network
Connects LCM Agents running on the node to the LCM API of the management
cluster. It is also used for communication between kubelet and the
Kubernetes API server inside a Kubernetes cluster. The MKE components use
this network for communication inside a swarm cluster.
In management clusters, it is replaced by the management network.
Kubernetes workloads (pods) network
Technology Preview
Serves connections between Kubernetes pods.
Each host has an address on this network, and this address is used
by Calico as an endpoint to the underlay network.
Kubernetes external network
Technology Preview
Serves external connection to the Kubernetes API
and the user services exposed by the cluster. In management clusters,
it is replaced by the management network.
Management network
Serves external connections to the Container Cloud Management API and
services of the management cluster. Not available in a managed cluster.
Storage access network
Connects Ceph nodes to the storage clients. The Ceph OSD service is
bound to the address on this network. This is a public network in
Ceph terms.
Storage replication network
Connects Ceph nodes to each other. Serves internal replication traffic.
This is a cluster network in Ceph terms.
Each network is represented on the host by a virtual Linux bridge. Physical
interfaces may be connected to one of the bridges directly, or through a
logical VLAN subinterface, or combined into a bond interface that is in
turn connected to a bridge.
The following table summarizes the default names used for the bridges
connected to the networks listed above:
The baremetal-based Mirantis Container Cloud uses Ceph as a distributed
storage system for file, block, and object storage. This section provides an
overview of a Ceph cluster deployed by Container Cloud.
Mirantis Container Cloud deploys Ceph on baremetal-based managed clusters
using Helm charts with the following components:
Rook Ceph Operator
A storage orchestrator that deploys Ceph on top of a Kubernetes cluster. Also
known as Rook or RookOperator. Rook operations include:
Deploying and managing a Ceph cluster based on provided Rook CRs such as
CephCluster, CephBlockPool, CephObjectStore, and so on.
Orchestrating the state of the Ceph cluster and all its daemons.
KaaSCephCluster custom resource (CR)
Represents the customization of a Kubernetes installation and allows you to
define the required Ceph configuration through the Container Cloud web UI
before deployment. For example, you can define the failure domain, Ceph pools,
Ceph node roles, number of Ceph components such as Ceph OSDs, and so on.
The ceph-kcc-controller controller on the Container Cloud management
cluster manages the KaaSCephCluster CR.
Ceph Controller
A Kubernetes controller that obtains the parameters from Container Cloud
through a CR, creates CRs for Rook and updates its CR status based on the Ceph
cluster deployment progress. It creates users, pools, and keys for OpenStack
and Kubernetes and provides Ceph configurations and keys to access them. Also,
Ceph Controller eventually obtains the data from the OpenStack Controller for
the Keystone integration and updates the RADOS Gateway services configurations
to use Kubernetes for user authentication. Ceph Controller operations include:
Transforming user parameters from the Container Cloud Ceph CR into Rook CRs
and deploying a Ceph cluster using Rook.
Providing integration of the Ceph cluster with Kubernetes.
Providing data for OpenStack to integrate with the deployed Ceph cluster.
Ceph Status Controller
A Kubernetes controller that collects all valuable parameters from the current
Ceph cluster, its daemons, and entities and exposes them into the
KaaSCephCluster status. Ceph Status Controller operations include:
Collecting all statuses from a Ceph cluster and corresponding Rook CRs.
Collecting additional information on the health of Ceph daemons.
Providing information to the status section of the KaaSCephCluster
CR.
Ceph Request Controller
A Kubernetes controller that obtains the parameters from Container Cloud
through a CR and handles Ceph OSD lifecycle management (LCM) operations. It
allows for a safe Ceph OSD removal from the Ceph cluster. Ceph Request
Controller operations include:
Providing an ability to perform Ceph OSD LCM operations.
Obtaining specific CRs to remove Ceph OSDs and executing them.
Pausing the regular Ceph Controller reconcile until all requests are
completed.
A typical Ceph cluster consists of the following components:
Ceph Monitors - three or, in rare cases, five Ceph Monitors.
Ceph Managers:
Before Container Cloud 2.22.0, one Ceph Manager.
Since Container Cloud 2.22.0, two Ceph Managers.
RADOS Gateway services - Mirantis recommends having three or more RADOS
Gateway instances for HA.
Ceph OSDs - the number of Ceph OSDs may vary according to the deployment
needs.
Warning
A Ceph cluster with 3 Ceph nodes does not provide
hardware fault tolerance and is not eligible
for recovery operations,
such as a disk or an entire Ceph node replacement.
A Ceph cluster uses the replication factor that equals 3.
If the number of Ceph OSDs is less than 3, a Ceph cluster
moves to the degraded state with the write operations
restriction until the number of alive Ceph OSDs
equals the replication factor again.
The placement of Ceph Monitors and Ceph Managers is defined in the
KaaSCephCluster CR.
The following diagram illustrates the way a Ceph cluster is deployed in
Container Cloud:
The following diagram illustrates the processes within a deployed Ceph cluster:
A Ceph cluster configuration in Mirantis Container Cloud
has limitations that include but are not limited to the following:
Only one Ceph Controller per managed cluster and only one Ceph cluster per
Ceph Controller are supported.
The replication size for any Ceph pool must be set to more than 1.
All CRUSH rules must have the same failure_domain.
Only one CRUSH tree per cluster. The separation of devices per Ceph pool is
supported through device classes
with only one pool of each type for a device class.
Only the following types of CRUSH buckets are supported:
topology.kubernetes.io/region
topology.kubernetes.io/zone
topology.rook.io/datacenter
topology.rook.io/room
topology.rook.io/pod
topology.rook.io/pdu
topology.rook.io/row
topology.rook.io/rack
topology.rook.io/chassis
Only IPv4 is supported.
If two or more Ceph OSDs are located on the same device, there must be no
dedicated WAL or DB for this class.
Only a full collocation or dedicated WAL and DB configurations are supported.
The minimum size of any defined Ceph OSD device is 5 GB.
Lifted since Container Cloud 2.24.2 (Cluster releases 14.0.1 and 15.0.1).
Ceph cluster does not support removable devices (with hotplug enabled) for
deploying Ceph OSDs.
Ceph OSDs support only raw disks as data devices, meaning that no dm or
lvm devices are allowed.
When adding a Ceph node with the Ceph Monitor role, if any issues occur with
the Ceph Monitor, rook-ceph removes it and adds a new Ceph Monitor instead,
named using the next alphabetic character in order. Therefore, the Ceph Monitor
names may not follow the alphabetical order. For example, a, b, d,
instead of a, b, c.
Reducing the number of Ceph Monitors is not supported and causes the Ceph
Monitor daemons removal from random nodes.
Removal of the mgr role in the nodes section of the
KaaSCephCluster CR does not remove Ceph Managers. To remove a Ceph
Manager from a node, remove it from the nodes spec and manually delete
the mgr pod in the Rook namespace.
Lifted since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Ceph does not support allocation of Ceph RGW pods on nodes where the
Federal Information Processing Standard (FIPS) mode is enabled.
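Some of the limitations above can be checked mechanically before a cluster is deployed. The following is a minimal sketch only: the function name and the plain-dictionary input shape are hypothetical and do not reflect the KaaSCephCluster schema. It covers three of the listed rules: the pool replication size, the supported CRUSH bucket types, and the minimum Ceph OSD device size.

```python
# Hypothetical pre-flight check for a few of the limitations listed above.
# The input shapes (plain dicts and lists) are assumptions for illustration,
# not the KaaSCephCluster schema.

SUPPORTED_CRUSH_BUCKETS = {
    "topology.kubernetes.io/region",
    "topology.kubernetes.io/zone",
    "topology.rook.io/datacenter",
    "topology.rook.io/room",
    "topology.rook.io/pod",
    "topology.rook.io/pdu",
    "topology.rook.io/row",
    "topology.rook.io/rack",
    "topology.rook.io/chassis",
}
MIN_OSD_DEVICE_SIZE_GB = 5


def check_limitations(pools, crush_labels, device_sizes_gb):
    """Return a list of human-readable violations (empty if none are found)."""
    problems = []
    for pool in pools:
        # The replication size of any pool must be more than 1.
        if pool["size"] <= 1:
            problems.append(f"pool {pool['name']}: replication size must be more than 1")
    for label in crush_labels:
        # Only the bucket types listed in the documentation are accepted.
        if label not in SUPPORTED_CRUSH_BUCKETS:
            problems.append(f"unsupported CRUSH bucket type: {label}")
    for device, size in device_sizes_gb.items():
        # Any defined Ceph OSD device must be at least 5 GB.
        if size < MIN_OSD_DEVICE_SIZE_GB:
            problems.append(f"{device}: smaller than {MIN_OSD_DEVICE_SIZE_GB} GB")
    return problems
```

An empty result means that none of the three checked rules is violated; the remaining limitations still need a manual review.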
There are several formats to use when specifying and addressing storage devices
of a Ceph cluster. The default and recommended one is the /dev/disk/by-id
format. This format is reliable and unaffected by the disk controller actions,
such as device name shuffling or /dev/disk/by-path recalculating.
Difference between by-id, name, and by-path formats
The storage device /dev/disk/by-id format is in most cases based on
a disk serial number, which is unique for each disk. A by-id symlink
is created by the udev rules in the following format, where <BusID>
is an ID of the bus to which the disk is attached and <DiskSerialNumber>
stands for a unique disk serial number:
/dev/disk/by-id/<BusID>-<DiskSerialNumber>
Typical by-id symlinks for storage devices look as follows:
In the example above, symlinks contain the following IDs:
Bus IDs: nvme, scsi-SATA and ata
Disk serial numbers: SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543,
HGST_HUS724040AL_PN1334PEHN18ZS and
WDC_WD4003FZEX-00Z4SA0_WD-WMC5D0D9DMEH.
An exception to this rule is the wwn by-id symlinks, which are
programmatically generated at boot. They are not solely based on disk
serial numbers but also include other node information. This can lead
to the wwn being recalculated when the node reboots. As a result,
this symlink type cannot guarantee a persistent disk identifier and should
not be used as a stable storage device symlink in a Ceph cluster.
The storage device name and by-path formats cannot be considered
persistent because the sequence in which block devices are added during boot
is semi-arbitrary. This means that block device names, for example, nvme0n1
and sdc, are assigned to physical disks during discovery, which may vary
inconsistently from the previous node state. The same inconsistency applies
to by-path symlinks, as they rely on the shortest physical path
to the device at boot and may differ from the previous node state.
Therefore, Mirantis highly recommends using storage device by-id symlinks
that contain disk serial numbers. This approach enables you to use a persistent
device identifier addressed in the Ceph cluster specification.
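The distinction between the identifier formats can be expressed as a small classifier. This is a sketch under stated assumptions: the function names are hypothetical, and the check for wwn symlinks assumes that they start with the wwn- prefix under /dev/disk/by-id.

```python
def classify_device_identifier(path: str) -> str:
    """Classify a storage device identifier by its format.

    Returns one of: "by-id", "wwn", "by-path", "name".
    """
    by_id_prefix = "/dev/disk/by-id/"
    if path.startswith(by_id_prefix):
        link = path[len(by_id_prefix):]
        # Assumption: wwn symlinks are named wwn-<value>. They may be
        # recalculated on reboot, so they are reported separately.
        return "wwn" if link.startswith("wwn-") else "by-id"
    if path.startswith("/dev/disk/by-path/"):
        return "by-path"
    # Anything else is treated as a plain device name such as sdc or nvme0n1.
    return "name"


def is_recommended(path: str) -> bool:
    """Only serial-number-based by-id symlinks are treated as persistent."""
    return classify_device_identifier(path) == "by-id"
```

For example, is_recommended("/dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543") evaluates to True, while a device name such as sdc or any by-path symlink evaluates to False.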
Example KaaSCephCluster with device by-id identifiers
Below is an example KaaSCephCluster custom resource using the
/dev/disk/by-id format for storage devices specification:
Note
Container Cloud enables you to use fullPath for the by-id
symlinks since 2.25.0. For the earlier product versions, use the name
field instead.
apiVersion: kaas.mirantis.com/v1alpha1
kind: KaaSCephCluster
metadata:
  name: ceph-cluster-managed-cluster
  namespace: managed-ns
spec:
  cephClusterSpec:
    nodes:
      # Add the exact ``nodes`` names.
      # Obtain the name from the "get machine" list.
      cz812-managed-cluster-storage-worker-noefi-58spl:
        roles:
        - mgr
        - mon
        # All disk configuration must be reflected in ``status.providerStatus.hardware.storage`` of the ``Machine`` object
        storageDevices:
        - config:
            deviceClass: ssd
          fullPath: /dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231440912
      cz813-managed-cluster-storage-worker-noefi-lr4k4:
        roles:
        - mgr
        - mon
        storageDevices:
        - config:
            deviceClass: nvme
          fullPath: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
      cz814-managed-cluster-storage-worker-noefi-z2m67:
        roles:
        - mgr
        - mon
        storageDevices:
        - config:
            deviceClass: nvme
          fullPath: /dev/disk/by-id/nvme-SAMSUNG_ML1EB3T8HMLA-00007_S46FNY1R130423
    pools:
    - default: true
      deviceClass: ssd
      name: kubernetes
      replicated:
        size: 3
      role: kubernetes
  k8sCluster:
    name: managed-cluster
    namespace: managed-ns
Migrating device names used in KaaSCephCluster to device by-id symlinks
Most existing clusters use device names as storage device
identifiers in the spec.cephClusterSpec.nodes section of
the KaaSCephCluster custom resource. Therefore, they are prone
to the issue of inconsistent storage device identifiers during cluster
update. Refer to Migrate Ceph cluster to address storage devices using by-id to mitigate possible
risks.
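Before planning such a migration, it may help to list the devices that are not yet addressed through by-id symlinks. The sketch below is illustrative only: it assumes that the nodes section has already been loaded into a plain Python dictionary (for example, from the YAML of the custom resource), and the function name is hypothetical. The name and fullPath fields are the ones mentioned in this section.

```python
def find_non_persistent_devices(nodes: dict) -> list:
    """Return (node, identifier) pairs for devices not addressed by a by-id fullPath.

    ``nodes`` is assumed to mirror spec.cephClusterSpec.nodes loaded as a dict.
    """
    findings = []
    for node_name, node in nodes.items():
        for device in node.get("storageDevices", []):
            full_path = device.get("fullPath", "")
            if full_path.startswith("/dev/disk/by-id/"):
                continue  # already uses the recommended format
            # Fall back to the device name when no fullPath is set.
            findings.append((node_name, full_path or device.get("name", "<unknown>")))
    return findings
```

Each returned pair identifies a node and a device that is a candidate for migration. Note that wwn symlinks also reside under /dev/disk/by-id and are not flagged by this simple prefix check.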
Mirantis Container Cloud provides APIs that enable you
to define hardware configurations that extend the reference architecture:
Bare Metal Host Profile API
Enables quick configuration of host boot and storage devices
and assignment of custom configuration profiles to individual machines.
See Create a custom bare metal host profile.
IP Address Management API
Enables quick configuration of host network interfaces and IP addresses
and setup of IP address ranges for automatic allocation.
See Create L2 templates.
Typically, operations with the extended hardware configurations are available
through the API and CLI, but not the web UI.
To keep the operating system on a bare metal host up to date with the latest
security updates, the operating system requires periodic upgrades of software
packages, which may or may not require a host reboot.
Mirantis Container Cloud uses life cycle management tools to update
the operating system packages on the bare metal hosts. Container Cloud
may also trigger restart of bare metal hosts to apply the updates.
In the management cluster of Container Cloud, software package upgrade and
host restart are applied automatically when a new Container Cloud version
with available kernel or software packages upgrade is released.
In managed clusters, package upgrade and host restart are applied
as part of the usual cluster upgrade using the Update cluster option
in the Container Cloud web UI.
Operating system upgrade and host restart are applied to cluster
nodes one by one. If Ceph is installed in the cluster, the Container
Cloud orchestration safely pauses the Ceph OSDs on the node before
restart. This prevents degradation of the storage service.
Caution
Depending on the cluster configuration, applying security
updates and host restart can increase the update time for each node to up to
1 hour.
Cluster nodes are updated one by one. Therefore, for large clusters,
the update may take several days to complete.
Since Container Cloud 2.27.3 (Cluster release 16.2.3), support
for vSphere-based clusters is suspended. For details, see
Deprecation notes.
The Mirantis Container Cloud managed clusters that are based on vSphere or
bare metal use MetalLB for load balancing of services and HAProxy with VIP
managed by Virtual Router Redundancy Protocol (VRRP) with Keepalived for the
Kubernetes API load balancer.
Every control plane node of each Kubernetes cluster runs the kube-api
service in a container. This service provides a Kubernetes API endpoint.
Every control plane node also runs the haproxy server that provides
load balancing with backend health checking for all kube-api endpoints as
backends.
The default load balancing method is least_conn. With this method,
a request is sent to the server with the least number of active
connections. The default load balancing method cannot be changed
using the Container Cloud API.
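The selection rule of this method can be illustrated with a short simulation. This is a simplified sketch and not the HAProxy implementation: the function names and the backend names are hypothetical, and ties are resolved by taking the first backend in order.

```python
def pick_backend(active_connections: dict) -> str:
    """Return the backend with the fewest active connections.

    On a tie, the first backend in dictionary order is chosen.
    """
    return min(active_connections, key=active_connections.get)


def dispatch(active_connections: dict, requests: int) -> list:
    """Assign ``requests`` new connections one by one and return the chosen backends."""
    assignments = []
    for _ in range(requests):
        backend = pick_backend(active_connections)
        active_connections[backend] += 1  # the new connection is now active
        assignments.append(backend)
    return assignments
```

Starting from {"cp-1": 2, "cp-2": 0, "cp-3": 1}, three new requests go to cp-2, cp-2, and cp-3, after which all three backends hold two active connections.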
Only one of the control plane nodes at any given time serves as a
front end for Kubernetes API. To ensure this, the Kubernetes clients
use a virtual IP (VIP) address for accessing Kubernetes API.
This VIP is assigned to one node at a time using VRRP. Keepalived running on
each control plane node provides health checking and failover of the VIP.
Keepalived is configured in multicast mode.
Note
The use of VIP address for load balancing of Kubernetes API requires
that all control plane nodes of a Kubernetes cluster are connected
to a shared L2 segment. This limitation prevents the installation of
full L3 topologies where control plane nodes are split
between different L2 segments and L3 networks.
Caution
External load balancers for services are not supported by
the current version of the Container Cloud vSphere provider.
The built-in load balancing described in this section is the only supported
option and cannot be disabled.
The services provided by the Kubernetes clusters, including
Container Cloud and user services, are balanced by MetalLB.
The metallb-speaker service runs on every worker node in
the cluster and handles connections to the service IP addresses.
MetalLB runs in the MAC-based (L2) mode. It means that all
control plane nodes must be connected to a shared L2 segment.
This is a limitation that does not allow installing full L3
cluster topologies.
VMware vSphere network objects and IPAM recommendations
Warning
This section only applies to Container Cloud 2.27.2
(Cluster release 16.2.2) or earlier versions. Since Container Cloud 2.27.3
(Cluster release 16.2.3), support for vSphere-based clusters is suspended.
For details, see Deprecation notes.
The VMware vSphere provider of Mirantis Container Cloud supports
the following types of vSphere network objects:
Virtual network
A network of virtual machines running on one or more hypervisors that are logically
connected to each other so that they can exchange data. Virtual machines
can be connected to virtual networks that you create when you add a network.
Distributed port group
A port group associated with a vSphere distributed switch that specifies
port configuration options for each member port. Distributed port groups
define how connection is established through the vSphere distributed switch
to the network.
A Container Cloud cluster can be deployed using one of these network objects
with or without a DHCP server in the network:
Non-DHCP
Container Cloud uses the IPAM service to manage IP address assignment to
machines. You must provide additional network parameters, such as
CIDR, gateway, IP ranges, and nameservers.
Container Cloud converts this data into the cloud-init metadata and
passes the data to machines during their bootstrap.
DHCP
Container Cloud relies on a DHCP server to assign IP addresses
to virtual machines.
Mirantis recommends using IP address management (IPAM) for cluster
machines provided by Container Cloud. IPAM must be enabled
for deployment in the non-DHCP vSphere networks. However, Mirantis
recommends enabling IPAM in the DHCP-based networks as well. In this case,
the dedicated IPAM range should not intersect with the IP range used in the
DHCP server configuration for the provided vSphere network.
Such configuration prevents issues with accidental IP address change
for machines. For the issue details, see
vSphere troubleshooting.
Note
To obtain IPAM parameters for the selected vSphere network, contact
your vSphere administrator who provides you with IP ranges dedicated to your
environment only.
The following parameters are required to enable IPAM:
Network CIDR.
Network gateway address.
At least one DNS server.
IP address include range to be allocated for cluster machines.
Make sure that this range is not part of the DHCP range if the network has
a DHCP server.
Minimal number of addresses in the range:
3 IPs for management cluster
3+N IPs for a managed cluster, where N is the number of worker nodes
Optional. IP address exclude range that is the list of IPs not to be
assigned to machines from the include ranges.
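The IPAM requirements above can be verified with the Python standard ipaddress module before the values are submitted. This is a minimal sketch: the function name, argument names, and error messages are hypothetical, and exclude ranges are not taken into account.

```python
import ipaddress


def validate_ipam(cidr, gateway, include_range, worker_count=0,
                  dhcp_range=None, management=False):
    """Return a list of problems found in the IPAM parameters (empty if none).

    include_range and dhcp_range are (first_address, last_address) tuples.
    """
    network = ipaddress.ip_network(cidr)
    first, last = (ipaddress.ip_address(a) for a in include_range)
    errors = []
    if ipaddress.ip_address(gateway) not in network:
        errors.append("gateway is outside the network CIDR")
    if first not in network or last not in network or first > last:
        errors.append("include range is invalid or outside the network CIDR")
    # 3 addresses for a management cluster, 3+N for a managed cluster.
    required = 3 if management else 3 + worker_count
    available = int(last) - int(first) + 1
    if available < required:
        errors.append(f"include range has {available} addresses, {required} required")
    if dhcp_range is not None:
        dhcp_first, dhcp_last = (ipaddress.ip_address(a) for a in dhcp_range)
        # Two closed ranges overlap when each one starts before the other ends.
        if first <= dhcp_last and dhcp_first <= last:
            errors.append("include range overlaps the DHCP range")
    return errors
```

For a managed cluster with five worker nodes, an include range of ten addresses passes the size check because only eight addresses are required.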
A dedicated Container Cloud network must not contain any virtual machines
with the keepalived instance running inside them as this may lead to the
vrouter_id conflict. By default, the Container Cloud management cluster
is deployed with vrouter_id set to 1.
Managed clusters are deployed with vrouter_id values starting
from 2.
The Kubernetes lifecycle management (LCM) engine in Mirantis Container Cloud
consists of the following components:
LCM Controller
Responsible for all LCM operations. Consumes the LCMCluster object
and orchestrates actions through LCM Agent.
LCM Agent
Runs on the target host. Executes Ansible playbooks in headless mode. Does
not run on attached MKE clusters that are not originally deployed by
Container Cloud.
Helm Controller
Responsible for the Helm chart life cycle. It is installed by a cloud provider
as a Helm v3 chart.
The Kubernetes LCM components handle the following custom resources:
LCMCluster
LCMMachine
HelmBundle
The following diagram illustrates handling of the LCM custom resources by
the Kubernetes LCM components. On a managed cluster,
apiserver handles multiple Kubernetes objects,
for example, deployments, nodes, RBAC, and so on.
The Kubernetes LCM components handle the following custom resources (CRs):
LCMMachine
LCMCluster
HelmBundle
LCMMachine
Describes a machine that is located on a cluster.
It contains the machine type (control or worker) and
StateItems that correspond to Ansible playbooks and miscellaneous actions,
for example, downloading a file or executing a shell command.
LCMMachine reflects the current state of the machine, for example,
a node IP address, and each StateItem through its status.
Multiple LCMMachine CRs can correspond to a single cluster.
LCMCluster
Describes a managed cluster. In its spec,
LCMCluster contains a set of StateItems for each type of LCMMachine,
which describe the actions that must be performed to deploy the cluster.
LCMCluster is created by the provider, using machineTypes
of the Release object. The status field of LCMCluster
reflects the status of the cluster,
for example, the number of ready or requested nodes.
HelmBundle
Wrapper for Helm charts that is handled by Helm Controller.
HelmBundle tracks what Helm charts must be installed
on a managed cluster.
LCM Controller runs on the management and regional cluster and orchestrates
the LCMMachine objects according to their type and their LCMCluster
object.
Once the LCMCluster and LCMMachine objects are created, LCM Controller
starts monitoring them to modify the spec fields and update
the status fields of the LCMMachine objects when required.
The status field of LCMMachine is updated by LCM Agent
running on a node of a management, regional, or managed cluster.
Each LCMMachine has the following lifecycle states:
Uninitialized - the machine is not yet assigned to an LCMCluster.
Pending - the agent reports a node IP address and host name.
Prepare - the machine executes StateItems that correspond
to the prepare phase. This phase usually involves downloading
the necessary archives and packages.
Deploy - the machine executes StateItems that correspond
to the deploy phase, during which the machine becomes
a Mirantis Kubernetes Engine (MKE) node.
Ready - the machine is deployed.
Upgrade - the machine is being upgraded to the new MKE version.
Reconfigure - the machine executes StateItems that correspond
to the reconfigure phase. The machine configuration is being updated
without affecting workloads running on the machine.
The templates for StateItems are stored in the machineTypes
field of an LCMCluster object, with separate lists
for the MKE manager and worker nodes.
Each StateItem has the execution phase field for a management,
regional, and managed cluster:
The prepare phase is executed for all machines for which
it was not executed yet. This phase comprises downloading the files
necessary for the cluster deployment, installing the required packages,
and so on.
During the deploy phase, a node is added to the cluster.
LCM Controller applies the deploy phase to the nodes
in the following order:
The first manager node is deployed.
The remaining manager nodes are deployed one by one
and the worker nodes are deployed in batches (by default,
up to 50 worker nodes at the same time).
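The deploy order above can be sketched as a function that turns node lists into sequential steps. This is an illustration only and not the LCM Controller code: the function name is hypothetical, and the sketch assumes that worker batches start after all manager nodes are deployed, which the description above does not state explicitly.

```python
def deploy_plan(managers, workers, worker_batch_size=50):
    """Return a list of steps; each step lists the nodes deployed together."""
    if not managers:
        raise ValueError("at least one manager node is required")
    # The first manager goes first, the remaining managers follow one by one.
    steps = [[manager] for manager in managers]
    # Workers are deployed in batches of up to worker_batch_size nodes.
    for start in range(0, len(workers), worker_batch_size):
        steps.append(workers[start:start + worker_batch_size])
    return steps
```

With 3 manager nodes and 120 worker nodes, the plan contains six steps: three single-manager steps followed by worker batches of 50, 50, and 20 nodes.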
LCM Controller deploys and upgrades a Mirantis Container Cloud cluster
by setting StateItems of LCMMachine objects following the corresponding
StateItems phases described above. The Container Cloud cluster upgrade
process follows the same logic that is used for a new deployment,
that is, applying a new set of StateItems to the LCMMachines after
updating the LCMCluster object. However, if an existing worker node is being
upgraded, LCM Controller cordons and drains this node, honoring
the Pod Disruption Budgets.
This operation prevents unexpected disruptions of the workloads.
LCM Agent handles a single machine that belongs to a management or managed
cluster. It runs on the machine operating system but communicates with
apiserver of the management cluster. LCM Agent is deployed as a systemd
unit using cloud-init. LCM Agent has a built-in self-upgrade mechanism.
LCM Agent monitors the spec of a particular LCMMachine object
to reconcile the machine state with the object StateItems and update
the LCMMachine status accordingly. The actions that LCM Agent performs
while handling the StateItems are as follows:
Download configuration files
Run shell commands
Run Ansible playbooks in headless mode
LCM Agent provides the IP address and host name of the machine for the
LCMMachine status parameter.
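The reconciliation behavior described above can be pictured as a loop over StateItems that skips the items already completed. The following sketch is hypothetical throughout: the dictionary-based item and status shapes, the "done" and "failed" markers, and the run callback are assumptions for illustration and do not mirror the LCMMachine schema or the LCM Agent implementation.

```python
def reconcile(state_items, status, run):
    """Execute state items that are not completed yet and record the outcome.

    state_items: list of dicts with a "name" key (hypothetical shape).
    status: dict mapping item name to "done" or "failed: <reason>".
    run: callable that performs the action for one item, for example,
         downloading a file, running a shell command, or running a playbook.
    """
    for item in state_items:
        name = item["name"]
        if status.get(name) == "done":
            continue  # already applied on a previous pass
        try:
            run(item)
            status[name] = "done"
        except Exception as exc:
            status[name] = f"failed: {exc}"
            break  # stop here; later items may depend on this one
    return status
```

Calling reconcile a second time with the returned status does not re-run completed items, which is the property that makes repeated reconciliation safe in this sketch.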
Helm Controller is used by Mirantis Container Cloud to handle management and
managed clusters core addons such as StackLight and the application addons
such as the OpenStack components.
Helm Controller is installed as a separate Helm v3 chart by the Container
Cloud provider. Its Pods are created using Deployment.
The Helm release information is stored in the KaaSRelease object for
the management clusters and in the ClusterRelease object for all types of
the Container Cloud clusters.
These objects are used by the Container Cloud provider.
The Container Cloud provider uses the information from the
ClusterRelease object together with the Container Cloud API
Cluster spec. In the Cluster spec, the operator can specify
the Helm release name and charts to use.
By combining the information from the Cluster providerSpec parameter
and its ClusterRelease object, the cluster actuator generates
the LCMCluster objects, which are further handled by LCM Controller,
and the HelmBundle object, which is handled by Helm Controller.
HelmBundle must have the same name as the LCMCluster object
for the cluster that HelmBundle applies to.
Although a cluster actuator can only create a single HelmBundle
per cluster, Helm Controller can handle multiple HelmBundle objects
per cluster.
Helm Controller handles the HelmBundle objects and reconciles them with the
state of Helm in its cluster.
Helm Controller can also be used by the management cluster with corresponding
HelmBundle objects created as part of the initial management cluster setup.
Identity and access management (IAM) provides a central point
of users and permissions management of the Mirantis Container
Cloud cluster resources in a granular and unified manner.
Also, IAM provides infrastructure for single sign-on user experience
across all Container Cloud web portals.
IAM for Container Cloud consists of the following components:
Keycloak
Provides the OpenID Connect endpoint
Integrates with an external identity provider (IdP), for example,
existing LDAP or Google Open Authorization (OAuth)
Stores roles mapping for users
IAM Controller
Provides IAM API with data about Container Cloud projects
Handles all role-based access control (RBAC) components in Kubernetes API
IAM API
Provides an abstraction API for creating user scopes and roles
To be consistent and keep the integrity of a user database
and user permissions, in Mirantis Container Cloud,
IAM stores the user identity information internally.
However, in real deployments, the identity provider usually already exists.
Out of the box, in Container Cloud, IAM supports
integration with LDAP and Google Open Authorization (OAuth).
If LDAP is configured as an external identity provider,
IAM performs one-way synchronization by mapping attributes according
to configuration.
In the case of the Google Open Authorization (OAuth) integration,
the user is automatically registered and their credentials are stored
in the internal database according to the user template configuration.
The Google OAuth registration workflow is as follows:
The user requests a Container Cloud web UI resource.
The user is redirected to the IAM login page and logs in using
the Log in with Google account option.
IAM creates a new user with the default access rights that are defined
in the user template configuration.
The user can access the Container Cloud web UI resource.
The following diagram illustrates the external IdP integration to IAM:
You can configure simultaneous integration with both external IdPs
with the user identity matching feature enabled.
Mirantis IAM acts as an OpenID Connect (OIDC) provider:
it issues a token and exposes discovery endpoints.
The credentials can be handled by IAM itself or delegated
to an external identity provider (IdP).
The issued JSON Web Token (JWT) is sufficient to perform operations across
Mirantis Container Cloud according to the scope and role defined
in it. Mirantis recommends using asymmetric cryptography for token signing
(RS256) to minimize the dependency between IAM and managed components.
When Container Cloud calls Mirantis Kubernetes Engine (MKE),
the user in Keycloak is created automatically with a JWT issued by Keycloak
on behalf of the end user.
MKE, in its turn, verifies whether the JWT is issued by Keycloak. If
the user retrieved from the token does not exist in the MKE database,
the user is automatically created in the MKE database based on the
information from the token.
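The check-and-create step described above can be outlined as follows. This is a sketch under loud assumptions and not the MKE implementation: the function names and the in-memory user database are hypothetical, the preferred_username and email claims are assumed to be present in the token, and signature verification is deliberately omitted. A real consumer must verify the token signature (for example, RS256 against the issuer public key) before trusting any claim.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url-encoded with the padding stripped.
    padding = "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))


def ensure_user(token: str, trusted_issuer: str, user_db: dict) -> dict:
    """Check the issuer claim and create the user record if it is missing."""
    claims = decode_jwt_payload(token)
    if claims.get("iss") != trusted_issuer:
        raise PermissionError("token was not issued by the trusted issuer")
    username = claims["preferred_username"]  # assumed claim name
    # Create the user from the token data only if it does not exist yet.
    user_db.setdefault(username, {"email": claims.get("email")})
    return user_db[username]
```

An existing user record is left untouched, so repeated calls with tokens for the same user do not overwrite locally stored data.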
The authorization implementation is out of the scope of IAM in Container
Cloud. This functionality is delegated to the component level.
IAM interacts with a Container Cloud component using the OIDC token
content. The component itself processes the token and enforces the required
authorization. Such an approach enables you to have any underlying authorization
that does not depend on IAM and still provide a unified user experience
across all Container Cloud components.
The following diagram illustrates the Kubernetes CLI authentication flow.
The authentication flow for Helm and other Kubernetes-oriented CLI utilities
is identical to the Kubernetes CLI flow,
but JSON Web Tokens (JWT) must be pre-provisioned.
Mirantis Container Cloud uses StackLight, the logging, monitoring, and
alerting solution that provides a single pane of glass for cloud maintenance
and day-to-day operations as well as offers critical insights into cloud
health including operational information about the components deployed in
management and managed clusters.
StackLight is based on Prometheus, an open-source monitoring solution and a
time series database.
Mirantis Container Cloud deploys the StackLight stack
as a release of a Helm chart that contains the helm-controller
and helmbundles.lcm.mirantis.com (HelmBundle) custom resources.
The StackLight HelmBundle consists of a set of Helm charts
with the StackLight components that include:
Alerta
Receives, consolidates, and deduplicates the alerts sent by Alertmanager
and visually represents them through a simple web UI. Using the Alerta
web UI, you can view the most recent or watched alerts, group, and
filter alerts.
Alertmanager
Handles the alerts sent by client applications such as Prometheus,
deduplicates, groups, and routes alerts to receiver integrations.
Using the Alertmanager web UI, you can view the most recent fired
alerts, silence them, or view the Alertmanager configuration.
Elasticsearch Curator
Maintains the data (indexes) in OpenSearch by performing
such operations as creating, closing, or opening an index as well as
deleting a snapshot. Also, manages the data retention policy in
OpenSearch.
Elasticsearch Exporter (compatible with OpenSearch)
The Prometheus exporter that gathers internal OpenSearch metrics.
Grafana
Builds and visually represents metric graphs based on time series
databases. Grafana supports querying of Prometheus using the PromQL
language.
Database backends
StackLight uses PostgreSQL for Alerta and Grafana. PostgreSQL reduces
the data storage fragmentation while enabling high availability.
High availability is achieved using Patroni, the PostgreSQL cluster
manager that monitors for node failures and manages failover
of the primary node. StackLight also uses Patroni to manage major
version upgrades of PostgreSQL clusters, which allows leveraging
the database engine functionality and improvements
as they are introduced upstream in new releases,
maintaining functional continuity without version lock-in.
Logging stack
Responsible for collecting, processing, and persisting logs and
Kubernetes events. By default, when deploying through the Container
Cloud web UI, only the metrics stack is enabled on managed clusters. To
enable StackLight to gather managed cluster logs, enable the logging
stack during deployment. On management clusters, the logging stack is
enabled by default. The logging stack components include:
OpenSearch, which stores logs and notifications.
Fluentd-logs, which collects logs, sends them to OpenSearch, generates
metrics based on analysis of incoming log entries, and exposes these
metrics to Prometheus.
OpenSearch Dashboards, which provides real-time visualization of
the data stored in OpenSearch and enables you to detect issues.
Metricbeat, which collects Kubernetes events and sends them to
OpenSearch for storage.
Prometheus-es-exporter, which presents the OpenSearch data
as Prometheus metrics by periodically sending configured queries to
the OpenSearch cluster and exposing the results to a scrapable HTTP
endpoint like other Prometheus targets.
Note
The logging mechanism performance depends on the cluster log load. In
case of a high load, you may need to increase the default resource requests
and limits for fluentdLogs. For details, see
StackLight configuration parameters: Resource limits.
Metric collector
Collects telemetry data (CPU or memory usage, number of active alerts,
and so on) from Prometheus and sends the data to centralized cloud
storage for further processing and analysis. Metric collector runs on
the management cluster.
Note
This component is designated for internal StackLight use only.
Prometheus
Gathers metrics. Automatically discovers and monitors the endpoints.
Using the Prometheus web UI, you can view simple visualizations and
debug. By default, the Prometheus database stores metrics of the past 15
days or up to 15 GB of data depending on the limit that is reached
first.
Prometheus Blackbox Exporter
Allows monitoring endpoints over HTTP, HTTPS, DNS, TCP, and ICMP.
Prometheus-es-exporter
Presents the OpenSearch data as Prometheus metrics by periodically
sending configured queries to the OpenSearch cluster and exposing the
results to a scrapable HTTP endpoint like other Prometheus targets.
Prometheus Node Exporter
Gathers hardware and operating system metrics exposed by kernel.
Prometheus Relay
Adds a proxy layer to Prometheus to merge the results from underlay
Prometheus servers to prevent gaps in case some data is missing on
some servers. Available only in the HA StackLight mode.
Reference Application (available since 2.21.0)
Enables workload monitoring on non-MOSK managed clusters.
Mimics a classical microservice application and provides metrics that
describe the likely behavior of user workloads.
Salesforce notifier
Enables sending Alertmanager notifications to Salesforce to allow
creating Salesforce cases and closing them once the alerts are resolved.
Disabled by default.
Salesforce reporter
Queries Prometheus for the data about the amount of vCPU, vRAM, and
vStorage used and available, combines the data, and sends it to
Salesforce daily. Mirantis uses the collected data for further analysis
and reports to improve the quality of customer support. Disabled by
default.
Telegraf
Collects metrics from the system. Telegraf is plugin-driven and has
the concept of two distinct sets of plugins: input plugins collect
metrics from the system, services, or third-party APIs; output plugins
write and expose metrics to various destinations.
The Telegraf agents used in Container Cloud include:
telegraf-ds-smart monitors SMART disks, and runs on both
management and managed clusters.
telegraf-ironic monitors Ironic on the baremetal-based
management clusters. The ironic input plugin collects and
processes data from the Ironic HTTP API, while the http_response
input plugin checks the Ironic HTTP API availability. To expose the
collected data as a Prometheus target, Telegraf uses prometheus
as the output plugin.
telegraf-docker-swarm gathers metrics from the Mirantis Container
Runtime API about the Docker nodes, networks, and Swarm services. This
is a Docker Telegraf input plugin with downstream additions.
Telemeter
Enables a multi-cluster view through a Grafana dashboard of the
management cluster. Telemeter includes a Prometheus federation push
server and clients to enable isolated Prometheus instances, which
cannot be scraped from a central Prometheus instance, to push metrics
to the central location.
The Telemeter services are distributed between the management cluster
that hosts the Telemeter server and managed clusters that host the
Telemeter client. The metrics from managed clusters are aggregated
on management clusters.
Note
This component is designated for internal StackLight use only.
Every Helm chart contains a default values.yml file. These default values
are partially overridden by custom values defined in the StackLight Helm chart.
Before deploying a managed cluster, you can select the HA or non-HA StackLight
architecture type. The non-HA mode is set by default. On management clusters,
StackLight is deployed in the HA mode only.
The following table lists the differences between the HA and non-HA modes:
Non-HA StackLight mode
One Alertmanager instance (since 2.24.0 and 2.24.2 for MOSK 23.2)
One OpenSearch instance
One PostgreSQL instance
One iam-proxy instance
One persistent volume is provided for storing data. In case of a service
or node failure, a new pod is redeployed and the volume is reattached to
provide the existing data. Such a setup has a reduced hardware footprint
but provides less performance.
HA StackLight mode
Two Prometheus instances
Two Alertmanager instances
Three OpenSearch instances
Three PostgreSQL instances
Two iam-proxy instances (since 2.23.0 and 2.23.1 for MOSK 23.1)
Local Volume Provisioner is used to provide local host storage. In case
of a service or node failure, the traffic is automatically redirected to
any other running Prometheus or OpenSearch server. For better
performance, Mirantis recommends that you deploy StackLight in the HA
mode. Two iam-proxy instances ensure access to HA components if one
iam-proxy node fails.
Note
Before Container Cloud 2.24.0, Alertmanager had 2 replicas in the
non-HA mode.
Depending on the Container Cloud cluster type and selected StackLight database
mode, StackLight is deployed on the following number of nodes:
StackLight provides five web UIs including Prometheus, Alertmanager, Alerta,
OpenSearch Dashboards, and Grafana. Access to StackLight web UIs is protected
by Keycloak-based Identity and access management (IAM). All web UIs except
Alerta are exposed to IAM through the IAM proxy middleware. The Alerta
configuration provides direct integration with IAM.
The following diagram illustrates accessing the IAM-proxied StackLight web UIs,
for example, Prometheus web UI:
Authentication flow for the IAM-proxied StackLight web UIs:
A user enters the public IP of a StackLight web UI, for example, Prometheus
web UI.
The public IP leads to IAM proxy, deployed as a Kubernetes LoadBalancer,
which protects the Prometheus web UI.
LoadBalancer routes the HTTP request to Kubernetes internal IAM proxy
service endpoints, specified in the X-Forwarded-Proto or X-Forwarded-Host
headers.
The Keycloak login form opens (the login_url field in the IAM proxy
configuration, which points to Keycloak realm) and the user enters
the user name and password.
Keycloak validates the user name and password.
The user obtains access to the Prometheus web UI (the upstreams field
in the IAM proxy configuration).
Note
The discovery URL is the URL of the IAM service.
The upstream URL is the hidden endpoint of a web UI (Prometheus web UI in
the example above).
The following diagram illustrates accessing the Alerta web UI:
Authentication flow for the Alerta web UI:
A user enters the public IP of the Alerta web UI.
The public IP leads to Alerta deployed as a Kubernetes LoadBalancer type.
LoadBalancer routes the HTTP request to the Kubernetes internal Alerta
service endpoint.
The Keycloak login form opens (Alerta refers to the IAM realm) and
the user enters the user name and password.
Using the Mirantis Container Cloud web UI,
at the pre-deployment stage of a managed cluster,
you can view, enable or disable, or tune the following StackLight
features:
StackLight HA mode.
Database retention size and time for Prometheus.
Tunable index retention period for OpenSearch.
Tunable PersistentVolumeClaim (PVC) size for Prometheus and OpenSearch,
set by default to 16 GB for Prometheus and 30 GB for OpenSearch.
The PVC size must be logically aligned with the retention periods or
sizes for these components.
Email and Slack receivers for the Alertmanager notifications.
Predefined set of dashboards.
Predefined set of alerts and capability to add
new custom alerts for Prometheus in the following exemplary format:
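The example format itself is not reproduced in this section. As a rough illustration only, the following Python sketch assembles a rule using the field names of a standard Prometheus alerting rule (alert, expr, for, labels, annotations). The helper make_alert_rule, the alert name, the expression, and the threshold are hypothetical, and the sketch does not show how StackLight expects custom alerts to be supplied.

```python
import json

# Fields that a Prometheus alerting rule cannot work without.
REQUIRED_FIELDS = ("alert", "expr")


def make_alert_rule(name, expr, duration="5m", severity="warning", summary=""):
    """Return a dict shaped like a standard Prometheus alerting rule."""
    rule = {
        "alert": name,
        "expr": expr,
        "for": duration,
        "labels": {"severity": severity},
        "annotations": {"summary": summary},
    }
    missing = [field for field in REQUIRED_FIELDS if not rule[field]]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return rule


# Hypothetical rule: the name, expression, and threshold are placeholders.
rule = make_alert_rule(
    name="ExampleHighRequestLatency",
    expr='job:request_latency_seconds:mean5m{job="myjob"} > 0.5',
    duration="10m",
    severity="warning",
    summary="High request latency",
)
print(json.dumps(rule, indent=2))
```

The printed structure mirrors the keys that Prometheus expects in a rule definition. Adjust the expression and labels to the metrics that exist in your cluster.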
StackLight measures, analyzes, and reports in a timely manner about failures
that may occur in the following Mirantis Container Cloud
components and their sub-components, if any:
StackLight uses a storage-based log retention strategy that optimizes storage
utilization and ensures effective data retention.
The available disk space is defined as 80% of the disk space allocated
for the OpenSearch node and is distributed among the following data types:
80% for system logs
10% for audit logs
5% for OpenStack notifications (applies only to MOSK clusters)
5% for Kubernetes events
This approach ensures that storage resources are efficiently allocated based
on the importance and volume of different data types.
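The arithmetic behind this split can be sketched as follows. The sketch assumes that the per-type percentages apply to the usable 80% rather than to the whole disk, and it uses the default 30 GB OpenSearch PVC size as the input. The function name storage_budget_gb is hypothetical and is not part of StackLight.

```python
# Shares taken from the text above; their interpretation as fractions of
# the usable space (80% of the allocated disk) is an assumption.
USABLE_FRACTION = 0.80
SHARES = {
    "system logs": 0.80,
    "audit logs": 0.10,
    "OpenStack notifications": 0.05,
    "Kubernetes events": 0.05,
}


def storage_budget_gb(disk_gb):
    """Split the usable part of an OpenSearch disk among the data types."""
    usable = disk_gb * USABLE_FRACTION
    return {name: round(usable * share, 2) for name, share in SHARES.items()}


budget = storage_budget_gb(30)
for name, gigabytes in budget.items():
    print(f"{name}: {gigabytes} GB")
```

Under these assumptions, a 30 GB disk yields 24 GB of usable space, of which 19.2 GB goes to system logs, 2.4 GB to audit logs, and 1.2 GB each to OpenStack notifications and Kubernetes events.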
The logging index management provides the following advantages:
Storage-based rollover mechanism
The rollover mechanism for system and audit indices enforces shard size
based on available storage, ensuring optimal resource utilization.
Consistent shard allocation
The number of primary shards per index is dynamically set based on cluster
size, which boosts search and facilitates ingestion for large clusters.
Minimal size of cluster state
The size of the cluster state is kept minimal by using static mappings,
which are based on Elastic Common Schema (ECS) with slight deviations
from the standard. Dynamic mapping in index templates is avoided to reduce
overhead.
Storage compression
The system and audit indices utilize the best_compression codec that
minimizes the size of stored indices, resulting in significant storage
savings of up to 50% on average.
No filter by logging level
Because severity levels are not used consistently across the Container
Cloud components, logs of all severity levels are collected so that
important low-severity logs are not overlooked while debugging a cluster.
Filtering by tags is still available.
The data is collected and transmitted to Mirantis through an encrypted
channel. It helps the Mirantis Customer Success Organization better
understand the operational usage patterns that customers experience and
provides product usage statistics that enable the product teams to
enhance Mirantis products and services.
Mirantis collects the following statistics using configuration-collector:
Since the Cluster releases 17.1.0 and 16.1.0
Mirantis collects hardware information using the following metrics:
mcc_hw_machine_chassis
mcc_hw_machine_cpu_model
mcc_hw_machine_cpu_number
mcc_hw_machine_nics
mcc_hw_machine_ram
mcc_hw_machine_storage (storage devices and disk layout)
mcc_hw_machine_vendor
Before the Cluster releases 17.0.0, 16.0.0, and 14.1.0
Mirantis collects the summary of all deployed Container Cloud configurations
using the following objects, if any:
Note
The data is anonymized: all sensitive information, such as IDs,
IP addresses, passwords, private keys, and so on, is removed.
Cluster
Machine
MachinePool
MCCUpgrade
BareMetalHost
BareMetalHostProfile
IPAMHost
IPAddr
KaaSCephCluster
L2Template
Subnet
Note
In the Cluster releases 17.0.0, 16.0.0, and 14.1.0, Mirantis does
not collect any configuration summary in light of the
configuration-collector refactoring.
The node-level resource data are broken down into three broad categories:
Cluster, Node, and Namespace. The telemetry data tracks Allocatable,
Capacity, Limits, Requests, and actual Usage of node-level resources.
StackLight components, which require external access, automatically use the
same proxy that is configured for Mirantis Container Cloud clusters. Therefore,
you only need to configure proxy during deployment of your management or
managed clusters. No additional actions are required to set up proxy for
StackLight. For more details about implementation of proxy support in
Container Cloud, see Proxy and cache support.
Note
Proxy handles only the HTTP and HTTPS traffic. Therefore, for
clusters with limited or no Internet access, it is not possible to set up
Alertmanager email notifications, which use SMTP, when proxy is used.
Proxy is used for the following StackLight components:
Component: Alertmanager
Cluster type: Any
Usage: As a default http_config for all HTTP-based receivers except
the predefined HTTP-alerta and HTTP-salesforce. For these receivers,
http_config is overridden on the receiver level.
Component: Metric Collector
Cluster type: Management
Usage: To send outbound cluster metrics to Mirantis.
Component: Salesforce notifier
Cluster type: Any
Usage: To send notifications to the Salesforce instance.
Component: Salesforce reporter
Cluster type: Any
Usage: To send metric reports to the Salesforce instance.
Reference Application is a small microservice application that enables
workload monitoring on non-MOSK managed clusters. It mimics a
classical microservice application and provides metrics that describe the
likely behavior of user workloads.
The application consists of the following API and database services that allow
putting simple records into the database through the API and retrieving them:
Reference Application API
Runs on StackLight nodes and provides API access to the database.
Runs three API instances for high availability.
PostgreSQL (since Container Cloud 2.22.0)
Runs on worker nodes and stores the data on an attached PersistentVolumeClaim
(PVC). Runs three database instances for high availability.
Note
Before version 2.22.0, Container Cloud used MariaDB as the
database management system instead of PostgreSQL.
StackLight queries the API and measures the response time of each query.
No caching is done, so each API request must go to the database,
which allows verifying the availability of a stateful workload on the cluster.
Reference Application requires the following resources on top of the main
product requirements:
Since Container Cloud 2.27.3 (Cluster release 16.2.3), support
for vSphere-based clusters is suspended. For details, see
Deprecation notes.
Using Mirantis Container Cloud, you can deploy a Mirantis Kubernetes Engine
(MKE) cluster on bare metal, OpenStack, or VMware vSphere cloud providers.
Each cloud provider requires corresponding resources.
A bootstrap node is necessary only to deploy the management cluster.
When the bootstrap is complete, the bootstrap node can be
redeployed and its resources can be reused
for the managed cluster workloads.
The minimum reference system requirements of a baremetal-based bootstrap
seed node are described in System requirements for the seed node.
The minimum reference system requirements of a bootstrap node for other supported
Container Cloud providers are as follows:
Any local machine on Ubuntu 20.04 that requires access to the provider API
with the following configuration:
2 vCPUs
4 GB of RAM
5 GB of available storage
Docker version currently available for Ubuntu 20.04
Internet access for downloading of all required artifacts
Note
For the vSphere cloud provider, you can also use RHEL 8.7 with the
same system requirements as for Ubuntu.
Caution
Since Container Cloud 2.27.3 (Cluster release 16.2.3), support
for vSphere-based clusters is suspended. For details, see
Deprecation notes.
If you use a firewall or proxy, make sure that the bootstrap and management
clusters have access to the following IP ranges and domain names
required for the Container Cloud content delivery network and alerting:
mirror.mirantis.com and repos.mirantis.com for packages
binary.mirantis.com for binaries and Helm charts
mirantis.azurecr.io and *.blob.core.windows.net for Docker images
mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry
(port 443 if proxy is enabled)
mirantis.my.salesforce.com and login.salesforce.com
for Salesforce alerts
Note
Access to Salesforce is required from any Container Cloud
cluster type.
If any additional Alertmanager notification receiver is enabled,
for example, Slack, its endpoint must also be accessible
from the cluster.
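Before bootstrapping, it can be useful to verify that the listed domain names are reachable from the node. The following sketch is one possible way to do that. It assumes port 443 for every endpoint except Telemetry, which the list above specifies as 9093 without proxy and 443 with proxy. The wildcard *.blob.core.windows.net entry is omitted because a wildcard cannot be probed directly. The function names are hypothetical, and the direct TCP check does not go through a proxy, so it is meaningful only for nodes with direct access.

```python
import socket


def required_endpoints(proxy_enabled=False):
    """Return (host, port) pairs derived from the list above."""
    telemetry_port = 443 if proxy_enabled else 9093
    return [
        ("mirror.mirantis.com", 443),   # packages (port is an assumption)
        ("repos.mirantis.com", 443),    # packages (port is an assumption)
        ("binary.mirantis.com", 443),   # binaries and Helm charts
        ("mirantis.azurecr.io", 443),   # Docker images
        ("mcc-metrics-prod-ns.servicebus.windows.net", telemetry_port),
        ("mirantis.my.salesforce.com", 443),  # Salesforce alerts
        ("login.salesforce.com", 443),        # Salesforce alerts
    ]


def is_reachable(host, port, timeout=3.0):
    """Try a plain TCP connection; return False on any network error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for host, port in required_endpoints(proxy_enabled=False):
        status = "ok" if is_reachable(host, port) else "UNREACHABLE"
        print(f"{host}:{port} {status}")
```

If an additional Alertmanager receiver such as Slack is enabled, append its endpoint to the returned list before running the check.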
Caution
Regional clusters are unsupported since Container Cloud 2.25.0.
Mirantis does not perform functional integration testing of the feature and
the related code is removed in Container Cloud 2.26.0. If you still
require this feature, contact Mirantis support for further information.
The following hardware configuration is used as a reference to deploy
Mirantis Container Cloud with bare metal Container Cloud clusters with
Mirantis Kubernetes Engine.
Reference hardware configuration for Container Cloud
management and managed clusters on bare metal
A management cluster requires 2 volumes for Container Cloud
(total 50 GB) and 5 volumes for StackLight (total 60 GB).
A managed cluster requires 5 volumes for StackLight.
The seed node is necessary only to deploy the management cluster.
When the bootstrap is complete, the bootstrap node can be
redeployed and its resources can be reused
for the managed cluster workloads.
The minimum reference system requirements for a baremetal-based bootstrap
seed node are as follows:
Basic server on Ubuntu 20.04 with the following configuration:
Kernel version 4.15.0-76.86 or later
8 GB of RAM
4 CPU
10 GB of free disk space for the bootstrap cluster cache
No DHCP or TFTP servers on any NIC networks
Routable access to the IPMI network of the hardware servers. For more details, see
Host networking.
Internet access for downloading of all required artifacts
The following diagram illustrates the physical and virtual L2 underlay
networking schema for the final state of the Mirantis Container Cloud
bare metal deployment.
The network fabric reference configuration is a spine/leaf with 2 leaf ToR
switches and one out-of-band (OOB) switch per rack.
Reference configuration uses the following switches for ToR and OOB:
Cisco WS-C3560E-24TD with 24 ports of 1 GbE. Used in the OOB network
segment.
Dell Force 10 S4810P with 48 ports of 1/10 GbE. Used as ToR in the Common/PXE
network segment.
In the reference configuration, all odd interfaces from NIC0 are connected
to TORSwitch1, and all even interfaces from NIC0 are connected
to TORSwitch2. The Baseboard Management Controller (BMC) interfaces
of the servers are connected to OOBSwitch1.
The following recommendations apply to all types of nodes:
Use the Link Aggregation Control Protocol (LACP) bonding mode
with MC-LAG domains configured on leaf switches. This corresponds to
the 802.3ad bond mode on hosts.
Use ports from different multi-port NICs when creating bonds. This keeps
network connections redundant if a single NIC fails.
Configure the ports that connect servers to the PXE network with PXE VLAN
as native or untagged. On these ports, configure LACP fallback to ensure
that the servers can reach the DHCP server and boot over the network.
When setting up the network range for DHCP Preboot Execution Environment
(PXE), keep in mind several considerations to ensure smooth server
provisioning:
Determine the network size. For instance, if you target a concurrent
provision of 50+ servers, a /24 network is recommended. This specific size
is crucial as it provides sufficient scope for the DHCP server to provide
unique IP addresses to each new Media Access Control (MAC) address,
thereby minimizing the risk of collision.
The concept of collision refers to the likelihood of two or more devices
being assigned the same IP address. With a /24 network, the collision
probability using the SDBM hash function, which is used by the DHCP server,
is low. If a collision occurs, the DHCP server
provides a free address using a linear lookup strategy.
In the context of PXE provisioning, technically, the IP address does not
need to be consistent for every new DHCP request associated with the same
MAC address. However, maintaining the same IP address can enhance user
experience, making the /24 network size more of a recommendation
than an absolute requirement.
For a minimal network size, it is sufficient to cover the number of
concurrently provisioned servers plus one additional address (50 + 1).
This calculation applies after covering any exclusions that exist in the
range. You can define excludes in the corresponding field of the Subnet
object. For details, see API Reference: Subnet resource.
When the available address space is less than the minimum described above,
you will not be able to automatically provision all servers. However, you
can manually provision them by combining manual IP assignment for each
bare metal host with manual pauses. For these operations, use the
host.dnsmasqs.metal3.io/address and baremetalhost.metal3.io/detached
annotations in the BareMetalHost object. For details, see
Operations Guide: Manually allocate IP addresses for bare metal hosts.
All addresses within the specified range must remain unused before
provisioning. If an IP address that is already in use is issued by the
DHCP server to a BOOTP client, that specific client cannot complete
provisioning.
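The sizing rule and the collision handling described above can be illustrated with a short sketch. It is not the code of the DHCP server. It only shows the minimal-size check (concurrently provisioned servers plus one, after exclusions) and how a classic SDBM hash with a linear lookup can map MAC addresses to pool offsets. All function names, the address range, and the MAC addresses are hypothetical, and exclusions are modeled as a plain list of addresses, which may differ from the actual format of the Subnet object.

```python
import ipaddress


def usable_addresses(first, last, excludes=()):
    """Count addresses in the inclusive range first..last minus exclusions."""
    lo = int(ipaddress.ip_address(first))
    hi = int(ipaddress.ip_address(last))
    excluded = {int(ipaddress.ip_address(a)) for a in excludes}
    return (hi - lo + 1) - sum(1 for a in excluded if lo <= a <= hi)


def range_is_sufficient(first, last, concurrent_servers, excludes=()):
    """Apply the minimal-size rule: concurrent servers plus one address."""
    return usable_addresses(first, last, excludes) >= concurrent_servers + 1


def sdbm(data):
    """Classic SDBM hash over bytes, truncated to 32 bits."""
    h = 0
    for byte in data:
        h = (byte + (h << 6) + (h << 16) - h) & 0xFFFFFFFF
    return h


def pick_offset(mac, pool_size, leases):
    """Hash the MAC to a pool offset; on collision, probe linearly."""
    start = sdbm(bytes.fromhex(mac.replace(":", ""))) % pool_size
    for step in range(pool_size):
        offset = (start + step) % pool_size
        if leases.get(offset) in (None, mac):
            leases[offset] = mac
            return offset
    raise RuntimeError("no free address left in the pool")


print(usable_addresses("10.0.0.100", "10.0.0.150"))
print(range_is_sufficient("10.0.0.100", "10.0.0.150", 50))
print(range_is_sufficient("10.0.0.100", "10.0.0.150", 50,
                          excludes=["10.0.0.120"]))

leases = {}
for i in range(3):
    mac = f"52:54:00:00:00:{i:02x}"
    print(mac, "->", pick_offset(mac, 51, leases))
```

In this example, the range 10.0.0.100-10.0.0.150 holds 51 addresses, which is just enough for 50 concurrently provisioned servers. Excluding a single address from the range drops it below the minimum. Repeated requests from the same MAC address resolve to the same offset as long as its lease is kept, which is the consistency that a larger network makes more likely.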
The management cluster requires a minimum of two storage devices per node.
Each device is used for a different type of storage.
The first device is always used for boot partitions and the root
file system. SSD is recommended. A RAID device is not supported.
One storage device per server is reserved for local persistent
volumes. These volumes are served by the Local Storage Static Provisioner
(local-volume-provisioner) and used by many services of Container Cloud.
While planning the deployment of an OpenStack-based Mirantis Container Cloud
cluster with Mirantis Kubernetes Engine (MKE), consider the following general
requirements:
Kubernetes on OpenStack requires the Cinder API V3 and Octavia API
availability.
Mirantis supports deployments based on OpenStack Victoria or Yoga with
Open vSwitch (OVS) or Tungsten Fabric (TF) on top of Mirantis OpenStack for Kubernetes
(MOSK) Victoria or Yoga with TF.
If you use a firewall or proxy, make sure that the bootstrap and management
clusters have access to the following IP ranges and domain names
required for the Container Cloud content delivery network and alerting:
mirror.mirantis.com and repos.mirantis.com for packages
binary.mirantis.com for binaries and Helm charts
mirantis.azurecr.io and *.blob.core.windows.net for Docker images
mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry
(port 443 if proxy is enabled)
mirantis.my.salesforce.com and login.salesforce.com
for Salesforce alerts
Note
Access to Salesforce is required from any Container Cloud
cluster type.
If any additional Alertmanager notification receiver is enabled,
for example, Slack, its endpoint must also be accessible
from the cluster.
Caution
Regional clusters are unsupported since Container Cloud 2.25.0.
Mirantis does not perform functional integration testing of the feature and
the related code is removed in Container Cloud 2.26.0. If you still
require this feature, contact Mirantis support for further information.
Note
The requirements in this section apply to the latest supported
Container Cloud release.
Requirements for an OpenStack-based Container Cloud cluster
# of nodes
Management cluster: 3 (HA) + 1 (Bastion)
Managed cluster: 5 (6 with StackLight HA)
Comments:
A bootstrap cluster requires access to the OpenStack API.
Each management cluster requires 3 nodes for the manager nodes HA.
Adding more than 3 nodes to a management cluster is not supported.
A managed cluster requires 3 manager nodes for HA and 2 worker nodes for the
Container Cloud workloads. If the multiserver mode is enabled for StackLight,
3 worker nodes are required for workloads.
Each management cluster requires 1 node for the Bastion instance
that is created with a public IP address to allow SSH access to instances.
# of vCPUs per node
Management cluster: 8
Managed cluster: 8
Comments:
The Bastion node requires 1 vCPU.
Refer to the RAM recommendations described below to plan resources
for different types of nodes.
RAM in GB per node
Management cluster: 24
Managed cluster: 16
Comments:
To prevent issues with low RAM, Mirantis recommends the following types
of instances for a managed cluster with 50-200 nodes:
16 vCPUs and 32 GB of RAM - manager node
16 vCPUs and 128 GB of RAM - nodes where the StackLight server components run
The Bastion node requires 1 GB of RAM.
Storage in GB per node
Management cluster: 120
Managed cluster: 120
Comments:
For the Bastion node, the default amount of storage is enough.
To boot machines from a block storage volume, verify that disk
performance matches the etcd requirements as described in
etcd documentation.
To boot the Bastion node from a block storage volume, 80 GB is enough.
This section only applies to Container Cloud 2.27.2
(Cluster release 16.2.2) or earlier versions. Since Container Cloud 2.27.3
(Cluster release 16.2.3), support for vSphere-based clusters is suspended.
For details, see Deprecation notes.
Note
Container Cloud is developed and tested on VMware vSphere 7.0 and
6.7.
If you use a firewall or proxy, make sure that the bootstrap and management
clusters have access to the following IP ranges and domain names
required for the Container Cloud content delivery network and alerting:
mirror.mirantis.com and repos.mirantis.com for packages
binary.mirantis.com for binaries and Helm charts
mirantis.azurecr.io and *.blob.core.windows.net for Docker images
mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry
(port 443 if proxy is enabled)
mirantis.my.salesforce.com and login.salesforce.com
for Salesforce alerts
Note
Access to Salesforce is required from any Container Cloud
cluster type.
If any additional Alertmanager notification receiver is enabled,
for example, Slack, its endpoint must also be accessible
from the cluster.
Caution
Regional clusters are unsupported since Container Cloud 2.25.0.
Mirantis does not perform functional integration testing of the feature and
the related code is removed in Container Cloud 2.26.0. If you still
require this feature, contact Mirantis support for further information.
Note
The requirements in this section apply to the latest supported
Container Cloud release.
Requirements for a vSphere-based Container Cloud cluster
# of nodes
Management cluster: 3 (HA)
Managed cluster: 5 (6 with StackLight HA)
Comments:
A bootstrap cluster requires access to the vSphere API.
A management cluster requires 3 nodes for the manager nodes HA. Adding
more than 3 nodes to a management cluster is not supported.
A managed cluster requires 3 manager nodes for HA and 2 worker nodes for the
Container Cloud workloads. If the multiserver mode is enabled for StackLight,
3 worker nodes are required for workloads.
# of vCPUs per node
Management cluster: 8
Managed cluster: 8
Comments:
Refer to the RAM recommendations described below to plan resources
for different types of nodes.
RAM in GB per node
Management cluster: 32
Managed cluster: 16
Comments:
To prevent issues with low RAM, Mirantis recommends the following VM
templates for a managed cluster with 50-200 nodes:
16 vCPUs and 40 GB of RAM - manager node
16 vCPUs and 128 GB of RAM - nodes where the StackLight server components run
Storage in GB per node
Management cluster: 120
Managed cluster: 120
Comments:
The listed amount of disk space must be available as a shared
datastore of any type, for example, NFS or vSAN, mounted on all
hosts of the vCenter cluster.
For a management and managed cluster, a base OS VM template
must be present in the VMware VM templates folder
available to Container Cloud. For details, see VsphereVMTemplate.
This license type allows running unlimited guests inside one hypervisor.
The number of licenses is equal to the number of hypervisors in
vCenter Server that will be used to host RHEL-based machines.
Container Cloud will schedule machines according to scheduling rules
applied to vCenter Server. Therefore, make sure that your
Red Hat Customer Portal account has enough licenses for allowed
hypervisors.
MCR
Management cluster: 23.0.9 (since 16.1.0), 23.0.7 (since 16.0.1), 20.10.17 (since 14.0.0)
Managed cluster: 23.0.9 (since 16.1.0), 23.0.7 (since 16.0.1), 20.10.17 (since 14.0.0)
Comments:
Mirantis Container Runtime (MCR) is deployed by Container Cloud as a
Container Runtime Interface (CRI) instead of Docker Engine.
VMware vSphere version
Management cluster: 7.0, 6.7
Managed cluster: 7.0, 6.7
cloud-init version
Management cluster: 20.3 for RHEL
Managed cluster: 20.3 for RHEL
Comments:
The minimal cloud-init package version built for the
VsphereVMTemplate.
VMware Tools version
Management cluster: 11.0.5
Managed cluster: 11.0.5
Comments:
The minimal open-vm-tools package version built for the
VsphereVMTemplate.
Obligatory vSphere capabilities
Management cluster: DRS, Shared datastore
Managed cluster: DRS, Shared datastore
Comments:
A shared datastore must be mounted on all hosts of the
vCenter cluster. Combined with Distributed Resources Scheduler (DRS),
it ensures that the VMs are dynamically scheduled to the cluster hosts.
The VMware vSphere provider of Mirantis Container Cloud requires the following
resources to successfully create virtual machines for Container Cloud clusters:
Data center
All resources below must be related to one data center.
Cluster
All virtual machines must run on the hosts of one cluster.
Datastore
Storage for virtual machine disks and Kubernetes volumes.
Folder
Placement of virtual machines.
Resource pool
Pool of CPU and memory resources for virtual machines.
You must provide the data center and cluster resources by name.
You can provide other resources by:
Name
The resource name must be unique in the data center and cluster.
Otherwise, the vSphere provider detects multiple resources with the same
name and cannot determine which one to use.
Full path (recommended)
Full path to a resource depends on its type. For example:
To deploy Mirantis Container Cloud on the VMware vSphere-based environment,
you need to prepare vSphere accounts for Container Cloud. Contact your vSphere
administrator to set up the required users and permissions following the steps
below:
Log in to the vCenter Server Web Console.
Create the cluster-api user with the following privileges:
Note
Container Cloud uses two separate vSphere accounts for:
Cluster API related operations, such as create or delete VMs, and for
preparation of the VM template using Packer
Storage operations, such as dynamic PVC provisioning
You can also create one user that has all privilege sets
mentioned above.
For RHEL deployments, if you do not have a RHEL machine with the
virt-who service configured to report the vSphere environment
configuration and hypervisors information to Red Hat Customer Portal
or Red Hat Satellite server, set up the virt-who service
inside the Container Cloud machines for a proper RHEL license activation.
Create a virt-who user with at least read-only access
to all objects in the vCenter Data Center.
The virt-who service on RHEL machines will be provided with the
virt-who user credentials to properly manage RHEL subscriptions.
StackLight requirements for an MKE attached cluster
Available since 2.25.2. Unsupported since 2.27.3.
Warning
This section only applies to Container Cloud 2.27.2
(Cluster release 16.2.2) or earlier versions. Since Container Cloud 2.27.3
(Cluster release 16.2.3), support for vSphere-based clusters is suspended.
For details, see Deprecation notes.
During attachment of a Mirantis Kubernetes Engine (MKE) cluster that is not
deployed by Container Cloud to a vSphere-based management cluster, you can
add StackLight as the logging, monitoring, and alerting solution.
In this scenario, your cluster must satisfy several requirements that primarily
involve alignment of cluster resources with specific StackLight settings.
While planning the attachment of an existing MKE cluster that is not deployed
by Container Cloud to a vSphere-based management cluster, consider
the following general requirements for StackLight:
For StackLight in non-HA mode, make sure that you have the default storage
class configured on the MKE cluster being attached. To select and configure
a persistent storage for StackLight, refer to MKE documentation: Persistent
Kubernetes storage.
While planning the attachment of an existing MKE cluster that is not deployed
by Container Cloud to a vSphere-based management cluster, consider the cluster
size requirements for StackLight. Depending on the following specific
StackLight HA and logging settings, use the example size guidelines below:
The non-HA mode - StackLight services are installed on a minimum of one node
with the StackLight label (StackLight nodes) with no redundancy
using Persistent Volumes (PVs) from the default storage class to store data.
Metric collection agents are installed on each node (Other nodes).
The HA mode - StackLight services are installed on a minimum of three nodes
with the StackLight label (StackLight nodes) with redundancy
using PVs provided by Local Volume Provisioner to store data. Metric
collection agents are installed on each node (Other nodes).
Logging enabled - the Enable logging option is turned on, which
enables the OpenSearch cluster to store infrastructure logs.
Logging disabled - the Enable logging option is turned off. In
this case, StackLight will not install OpenSearch and will not collect
infrastructure logs.
LoadBalancer (LB) Services support is required to provide external access
to StackLight web UIs.
StackLight requirements for an attached MKE cluster, with logging enabled:
In the non-HA mode, StackLight components are bound to the nodes labeled
with the StackLight label. If there are no nodes labeled, StackLight
components will be scheduled to all schedulable worker nodes until the
StackLight label(s) are added. The requirements presented in the
table for the non-HA mode are summarized requirements for all StackLight
nodes.
If you require all Internet access to go through a proxy server
for security and audit purposes, you can bootstrap management clusters using
proxy. The proxy server settings consist of three standard environment
variables that are set prior to the bootstrap process:
HTTP_PROXY
HTTPS_PROXY
NO_PROXY
These settings are not propagated to managed clusters. However, you can enable
a separate proxy access on a managed cluster using the Container Cloud web UI.
This proxy is intended for the end user needs and is not used for a managed
cluster deployment or for access to the Mirantis resources.
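As a pre-bootstrap sanity check, the three variables can be validated before the bootstrap process starts. The sketch below is only an illustration. The function check_proxy_environment, the proxy URL, and the NO_PROXY entries are hypothetical, and the check assumes that proxy URLs use the http or https scheme.

```python
from urllib.parse import urlsplit

PROXY_VARIABLES = ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY")


def check_proxy_environment(env):
    """Return a list of problems found in the proxy-related variables."""
    problems = []
    for name in PROXY_VARIABLES:
        if not env.get(name):
            problems.append(f"{name} is not set")
    for name in ("HTTP_PROXY", "HTTPS_PROXY"):
        value = env.get(name)
        if value:
            parts = urlsplit(value)
            # Assumption: a usable proxy URL has an http(s) scheme and a host.
            if parts.scheme not in ("http", "https") or not parts.hostname:
                problems.append(f"{name} is not a valid proxy URL: {value}")
    return problems


# Hypothetical values; replace them with the settings of your environment.
example_env = {
    "HTTP_PROXY": "http://proxy.example.com:3128",
    "HTTPS_PROXY": "http://proxy.example.com:3128",
    "NO_PROXY": "localhost,127.0.0.1,.example.internal",
}
print(check_proxy_environment(example_env) or "proxy variables look consistent")
```

To check the actual shell environment, pass os.environ to check_proxy_environment instead of the example dictionary.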
Caution
Since Container Cloud uses the OpenID Connect (OIDC) protocol
for IAM authentication, management clusters require
direct non-proxy access from managed clusters.
StackLight components, which require external access, automatically use the
same proxy that is configured for Container Cloud clusters.
On managed clusters with limited Internet access, a proxy is required for
StackLight components that communicate over HTTP or HTTPS and need external
access once enabled. Such components are disabled by default, for example,
the Salesforce integration and external rules for Alertmanager notifications.
For more details about proxy implementation in StackLight, see StackLight proxy.
For the list of Mirantis resources and IP addresses to be accessible
from the Container Cloud clusters, see Hardware and system requirements.
After enabling proxy support on managed clusters, proxy is used for:
Docker traffic on managed clusters
StackLight
OpenStack on MOSK-based clusters
Warning
Any modification to the Proxy object used in any cluster, for
example, changing the proxy URL, NO_PROXY values, or
certificate, leads to cordon-drain and Docker
restart on the cluster machines.
The Container Cloud managed clusters are deployed without direct Internet
access in order to consume less Internet traffic in your cloud.
The Mirantis artifacts used during managed clusters deployment are downloaded
through a cache running on a management cluster.
The feature is enabled by default on new managed clusters
and will be automatically enabled on existing clusters during upgrade
to the latest version.
Caution
IAM operations require direct non-proxy access
from a managed cluster to a management cluster.
To ensure the stability of Mirantis Container Cloud in managing Container
Cloud-based Mirantis Kubernetes Engine (MKE) clusters, the following MKE API
functionality is not available for Container Cloud-based MKE clusters, unlike
MKE clusters that are not deployed by Container Cloud.
Use the Container Cloud web UI or CLI for this functionality instead.
Public APIs limitations in a Container Cloud-based MKE cluster
API endpoint
Limitation
GET /swarm
Swarm Join Tokens are filtered out for all users, including admins.
PUT /api/ucp/config-toml
All requests are forbidden.
POST /nodes/{id}/update
Requests for the following changes are forbidden:
Change Role
Add or remove the com.docker.ucp.orchestrator.swarm and
com.docker.ucp.orchestrator.kubernetes labels.
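The limitations above can be kept as a small lookup table, for example, to warn automation authors before they call a restricted endpoint. The following Python sketch is illustrative only: the `limitation_for` helper and the table structure are assumptions, and the descriptions are shortened from the table above.

```python
# Limitations taken from the table above; the lookup structure is illustrative.
LIMITATIONS = {
    ("GET", "/swarm"): "Swarm Join Tokens are filtered out for all users",
    ("PUT", "/api/ucp/config-toml"): "All requests are forbidden",
    ("POST", "/nodes/{id}/update"): "Role and orchestrator label changes are forbidden",
}


def limitation_for(method, path):
    """Return the documented limitation for an MKE API call, or None."""
    parts = path.strip("/").split("/")
    # Map /nodes/<any id>/update to the templated form used in the table.
    if len(parts) == 3 and parts[0] == "nodes" and parts[2] == "update":
        normalized = "/nodes/{id}/update"
    else:
        normalized = "/" + "/".join(parts)
    return LIMITATIONS.get((method.upper(), normalized))


print(limitation_for("PUT", "/api/ucp/config-toml"))
```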
Since 2.25.1 (Cluster releases 16.0.1 and 17.0.1), Container Cloud does not
override changes in MKE configuration except the following list of parameters
that are automatically managed by Container Cloud. These parameters are always
overridden by the Container Cloud default values if modified directly using
the MKE API. For details on configuration using the MKE API, see
MKE configuration managed directly by the MKE API.
However, you can manually configure a few options from this list using the
Cluster object of a Container Cloud cluster. They are labeled with the
superscript and contain references to the
respective configuration procedures in the Comments columns of the tables.
All possible values for parameters labeled with the
superscript, which you can manually
configure using the Cluster object are described in
MKE Operations Guide: Configuration options.
MKE configuration managed directly by the MKE API
Since 2.25.1, aside from MKE parameters described in MKE configuration managed by Container Cloud,
Container Cloud does not override changes in MKE configuration that are applied
directly through the MKE API. For the configuration options and procedure, see
MKE documentation:
Mirantis cannot guarantee the expected behavior of the
functionality configured using the MKE API as long as customer-specific
configuration does not undergo testing within Container Cloud. Therefore,
Mirantis recommends that you test custom MKE settings configured through
the MKE API on a staging environment before applying them to production.
Mirantis Container Cloud Bootstrap v2 provides the best user experience to set up
Container Cloud. Using Bootstrap v2, you can provision and operate management
clusters using required objects through the Container Cloud web UI.
Basic concepts and components of Bootstrap v2 include:
Bootstrap cluster
Bootstrap cluster is any kind-based Kubernetes cluster that contains a
minimal set of Container Cloud bootstrap components allowing the user to
prepare the configuration for management cluster deployment and start the
deployment. The list of these components includes:
Bootstrap Controller
Controller that is responsible for:
Configuration of a bootstrap cluster with provider-specific charts
through the bootstrap Helm bundle.
Configuration and deployment of a management cluster and
its related objects.
Helm Controller
Operator that manages Helm chart releases. It installs the Container
Cloud bootstrap and provider-specific charts configured in the bootstrap
Helm bundle.
Public API charts
Helm charts that contain custom resource definitions for Container Cloud
resources of supported providers.
Admission Controller
Controller that performs mutations and validations for the Container
Cloud resources including cluster and machines configuration.
Bootstrap web UI
User-friendly web interface to prepare the configuration for a
management cluster deployment.
Currently, one bootstrap cluster can be used to deploy only one
management cluster. For example, to add a new management cluster with
different settings, you must create a new bootstrap cluster from scratch.
Bootstrap region
BootstrapRegion is the first object to create in the bootstrap cluster
for the Bootstrap Controller to identify and install required provider
components onto the bootstrap cluster. Afterward, the user can prepare and
deploy a management cluster with related resources.
The bootstrap region is a starting point for the cluster deployment. The
user needs to approve the BootstrapRegion object. Otherwise, the
Bootstrap Controller will not be triggered for the cluster deployment.
Bootstrap Helm bundle
Helm bundle that contains charts configuration for the bootstrap cluster.
This object is managed by the Bootstrap Controller that updates the bundle
depending on a provider selected by the user in the BootstrapRegion
object. The Bootstrap Controller always configures provider-related charts
listed in the regional section of the Container Cloud release for the
selected provider. Depending on the provider and cluster configuration,
the Bootstrap Controller may update or reconfigure this bundle even after
the cluster deployment starts. For example, the Bootstrap Controller
enables the provider in the bootstrap cluster only after the bootstrap
region is approved for the deployment.
Since Container Cloud 2.27.3 (Cluster release 16.2.3), support
for vSphere-based clusters is suspended. For details, see
Deprecation notes.
Management cluster deployment consists of several sequential stages.
Each stage finishes when a specific condition is met or specific configuration
applies to a cluster or its machines.
In case of issues at any deployment stage, you can identify the problem
and fix it on the fly. Because the infinite-timeout option is enabled
by default in Bootstrap v2, the cluster deployment does not abort before all
stages complete.
Infinite timeout prevents the bootstrap failure due to timeout. This option
is useful in the following cases:
The network is slow when downloading artifacts
The infrastructure configuration does not allow fast booting
Inspection of a bare metal node involves more than two HDD/SATA disks
attached to a machine
You can track the status of each stage in the bootstrapStatus section of
the Cluster object that is updated by the Bootstrap Controller.
The Bootstrap Controller starts deploying the cluster after you approve the
BootstrapRegion configuration.
The following table describes deployment states of a management cluster that
apply in the strict order.
Verifies proxy configuration in the Cluster object.
If the bootstrap cluster was created without a proxy, no actions are
applied to the cluster.
2
ClusterSSHConfigured
Verifies SSH configuration for the cluster and machines.
You can provide any number of SSH public keys, which are added to
cluster machines. But the Bootstrap Controller always adds the
bootstrap-key SSH public key to the cluster configuration. The
Bootstrap Controller uses this SSH key to manage the lcm-agent
configuration on cluster machines.
The bootstrap-key SSH key is copied to a
bootstrap-key-<clusterName> object containing the cluster name in
its name.
3
ProviderUpdatedInBootstrap
Synchronizes the provider and settings of its components between the
Cluster object and bootstrap Helm bundle. Settings provided in
the cluster configuration have higher priority than the default
settings of the bootstrap cluster, except CDN.
4
ProviderEnabledInBootstrap
Enables the provider and its components if any were disabled by the
Bootstrap Controller during preparation of the bootstrap region.
The deployment of the cluster and machines starts after the provider is enabled.
5
Nodes readiness
Waits for the provider to complete node deployment, which comprises VM
creation and MKE installation.
6
ObjectsCreated
Creates required namespaces and IAM secrets.
7
ProviderConfigured
Verifies the provider configuration in the provisioned cluster.
8
HelmBundleReady
Verifies the Helm bundle readiness for the provisioned cluster.
9
ControllersDisabledBeforePivot
Collects the list of deployment controllers and disables them to
prepare for pivot.
10
PivotDone
Moves all cluster-related objects from the bootstrap cluster to the
provisioned cluster. The copies of Cluster and Machine objects
remain in the bootstrap cluster to provide the status information to the
user. About every minute, the Bootstrap Controller reconciles the status
of the Cluster and Machine objects of the provisioned cluster
to the bootstrap cluster.
11
ControllersEnabledAfterPivot
Enables controllers in the provisioned cluster.
12
MachinesLCMAgentUpdated
Updates the lcm-agent configuration on machines to target LCM
agents to the provisioned cluster.
13
HelmControllerDisabledBeforeConfig
Disables the Helm Controller before reconfiguration.
14
HelmControllerConfigUpdated
Updates the Helm Controller configuration for the provisioned cluster.
15
Cluster readiness
Contains information about the global cluster status. The Bootstrap
Controller verifies that OIDC, Helm releases, and all Deployments are
ready. Once the cluster is ready, the Bootstrap Controller stops
managing the cluster.
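Because the stages apply in strict order, the first stage that has not completed tells you where a deployment currently stands. The following Python sketch shows that idea. It assumes you have already collected the names of completed stages into a set. The actual layout of the bootstrapStatus section is not shown in this guide, so the input format here is hypothetical. The name of the first stage (proxy verification) is not shown in the table above, so a placeholder stands in for it.

```python
# Stage names in deployment order, as listed in the table above.
# The first entry is a placeholder because its name is not shown in the table.
STAGES = [
    "<stage 1: proxy verification>",
    "ClusterSSHConfigured",
    "ProviderUpdatedInBootstrap",
    "ProviderEnabledInBootstrap",
    "Nodes readiness",
    "ObjectsCreated",
    "ProviderConfigured",
    "HelmBundleReady",
    "ControllersDisabledBeforePivot",
    "PivotDone",
    "ControllersEnabledAfterPivot",
    "MachinesLCMAgentUpdated",
    "HelmControllerDisabledBeforeConfig",
    "HelmControllerConfigUpdated",
    "Cluster readiness",
]


def current_stage(completed):
    """Return the first stage that is not in the completed set, or None."""
    for stage in STAGES:
        if stage not in completed:
            return stage
    return None


# Example: stages 1 through 9 are done, so the deployment is at the pivot.
print(current_stage(set(STAGES[:9])))
```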
The setup of a bootstrap cluster comprises preparation of the seed node,
configuration of environment variables, acquisition of the Container Cloud
license file, and execution of the bootstrap script. The script eventually
generates a link to the Bootstrap web UI for the management cluster deployment.
Install basic Ubuntu 20.04 server using standard installation images
of the operating system on the bare metal seed node.
Log in to the seed node that is running Ubuntu 20.04.
Prepare the system and network configuration:
Establish a virtual bridge using an IP address of the PXE network on the
seed node. Use the following netplan-based configuration file
as an example:
# cat /etc/netplan/config.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: false
      dhcp6: false
  bridges:
    br0:
      addresses:
      # Replace with IP address from PXE network to create a virtual bridge
      - 10.0.0.15/24
      dhcp4: false
      dhcp6: false
      # Adjust for your environment
      gateway4: 10.0.0.1
      interfaces:
      # Interface name may be different in your environment
      - ens3
      nameservers:
        addresses:
        # Adjust for your environment
        - 8.8.8.8
      parameters:
        forward-delay: 4
        stp: false
Apply the new network configuration using netplan:
The system output must contain a JSON file with no error messages.
In case of errors, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
Verify that the seed node has direct access to the Baseboard
Management Controller (BMC) of each bare metal host. All target
hardware nodes must be in the poweroff state.
The provisioning IP address in the PXE network. This address will be
assigned on the seed node to the interface defined by the
KAAS_BM_PXE_BRIDGE parameter described below. The PXE service
of the bootstrap cluster uses this address to network boot
bare metal hosts.
172.16.59.5
KAAS_BM_PXE_MASK
The PXE network address prefix length to be used with the
KAAS_BM_PXE_IP address when assigning it to the seed node
interface.
24
KAAS_BM_PXE_BRIDGE
The PXE network bridge name that must match the name of the bridge
created on the seed node during the Set up a bootstrap cluster stage.
br0
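The KAAS_BM_PXE_IP and KAAS_BM_PXE_MASK values together define one interface address. As a quick sanity check, the following Python sketch combines the two example values and derives the network they belong to, using only the standard ipaddress module. The helper name `pxe_interface` is hypothetical.

```python
import ipaddress


def pxe_interface(ip, prefix_len):
    """Combine a PXE IP address and a prefix length into one interface address."""
    return ipaddress.ip_interface(f"{ip}/{prefix_len}")


# Example values from the table above.
iface = pxe_interface("172.16.59.5", 24)
print(iface.network)
```

If the derived network differs from the PXE network you planned, one of the two variables is most likely wrong.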
Optional. Add the following environment variables to bootstrap the cluster
using proxy:
After the bootstrap cluster is set up, the bootstrap-proxy object is
created with the provided proxy settings. You can use this object later for
the Cluster object configuration.
Deploy the bootstrap cluster:
./bootstrap.sh bootstrapv2
When the bootstrap is complete, the system outputs a link to the Bootstrap
web UI.
Make sure that port 80 is open for localhost to prevent security
requirements for the seed node:
Access the Bootstrap web UI. It does not require any authorization.
The bootstrap cluster setup automatically creates the following objects
that you can view in the Bootstrap web UI:
Bootstrap SSH key
The SSH key pair is automatically generated by the bootstrap script and
the private key is added to the kaas-bootstrap folder. The public
key is automatically created in the bootstrap cluster as the
bootstrap-key object. It will be used later for setting up the
cluster machines.
Bootstrap proxy
If a bootstrap cluster is configured with proxy settings, the
bootstrap-proxy object is created. It will be automatically used in
the cluster configuration unless a custom proxy is specified.
Management kubeconfig
If a bootstrap cluster is provided with the management cluster
kubeconfig, it will be uploaded as a secret to the bootstrap cluster
to the default and kaas projects as management-kubeconfig.
Deploy a management cluster using the Container Cloud API
This section contains an overview of the cluster-related objects along with
the configuration procedure of these objects during deployment of a
management cluster using Bootstrap v2 through the Container Cloud API.
Overview of the cluster-related objects in the Container Cloud API/CLI
The following cluster-related objects are available through the Container
Cloud API. Use these objects to deploy a management cluster using the
Container Cloud API.
Region and provider names for a management cluster and all related
objects. First object to create in the bootstrap cluster. For
the bootstrap region definition, see Introduction.
ProviderCredentials
Provider credentials to access cloud infrastructure where the
Container Cloud machines are deployed. Before Container Cloud 2.26.0
(Cluster releases 17.1.0 and 16.1.0), requires the region name label.
SSHKey
Optional. SSH configuration with any number of SSH public keys to be
added to cluster machines.
By default, any bootstrap cluster has a pregenerated bootstrap-key
object to use for the cluster configuration. This is the service SSH key
used by the Bootstrap Controller to access machines for their
deployment. The private part of bootstrap-key is always saved to
kaas-bootstrap/ssh_key.
Proxy
Proxy configuration. Mandatory for offline environments with no direct
access to the Internet. Such configuration usually contains proxy for
the bootstrap cluster and already has the bootstrap-proxy object
to use in the cluster configuration by default.
Provider-specific configuration for a management cluster. Before
Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0), requires
the region name label with the name of the BootstrapRegion object.
Machine
Machine configuration that must fit the following requirements:
Role - only manager
Number - odd for the management cluster HA
Mandatory labels - provider, cluster-name, and region
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
ServiceUser
Service user is the initial user to create in Keycloak for
access to a newly deployed management cluster. By default, it has the
global-admin, operator (namespaced), and bm-pool-operator
(namespaced) roles.
You can delete serviceuser after setting up other required users with
specific roles or after any integration with an external identity provider,
such as LDAP.
BareMetalHost
For the bare metal provider only. Information about hardware
configuration of a machine. Required for further machine selection
during bootstrap. For details, see API Reference: BareMetalHost.
BareMetalHostCredential
For the bare metal provider only. The object is created for each
BareMetalHost and contains information about the Baseboard
Management Controller (BMC) credentials. For details,
see API Reference: BareMetalHostCredential.
BareMetalHostProfile
For the bare metal provider only. Provisioning and configuration
settings of the storage devices and the operating system. For details,
see API Reference: BareMetalHostProfile.
L2Template
For the bare metal provider only. Advanced host networking configuration
for clusters, which enables, for example, creation of bond interfaces
on top of physical interfaces on the host or the use of multiple subnets
to separate different types of network traffic. For details, see
API Reference: L2Template.
MetalLBConfig
For the bare metal provider only. Default and mandatory object for the
MetalLB configuration. For details, see
API Reference: MetalLBConfig.
Before Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0),
the object contains a reference to the MetalLBConfigTemplate object, which is
deprecated in 2.27.0.
MetalLBConfigTemplate
For the bare metal provider only. Deprecated in Container Cloud 2.27.0
(17.2.0 and 16.2.0). Before Container Cloud 2.27.0, default object for
the MetalLB configuration, which enables the use of Subnet objects
to define MetalLB IP address pools. For details, see
API Reference: MetalLBConfigTemplate.
Subnet
For the bare metal provider only. Configuration for IP address
allocation for cluster nodes. For details, see
API Reference: Subnet.
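The Machine requirements listed in the overview above (the manager role only, an odd number of machines, and mandatory labels) are easy to check before you apply any templates. The following Python sketch works on plain dictionaries with hypothetical `role` and `labels` keys and uses the short label names from the overview. It does not parse real Machine objects, whose exact field and label names are not shown in this section. The region label is checked only on request because it is removed starting with Container Cloud 2.26.0.

```python
def check_machines(machines, require_region=False):
    """Return a list of problems found in simplified machine descriptions.

    Each machine is a plain dict with the hypothetical keys "role" and "labels".
    Set require_region=True only for releases before Container Cloud 2.26.0.
    """
    problems = []
    if len(machines) % 2 == 0:
        problems.append("the number of machines must be odd")
    required = ["provider", "cluster-name"]
    if require_region:
        required.append("region")
    for index, machine in enumerate(machines):
        if machine.get("role") != "manager":
            problems.append(f"machine {index}: role must be manager")
        for label in required:
            if label not in machine.get("labels", {}):
                problems.append(f"machine {index}: missing label {label}")
    return problems


example = [
    {"role": "manager", "labels": {"provider": "baremetal", "cluster-name": "mgmt"}}
    for _ in range(3)
]
print(check_machines(example))
```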
The following procedure describes how to prepare and deploy a management
cluster using Bootstrap v2 by operating YAML templates available in the
kaas-bootstrap/templates/ folder.
If you deploy Container Cloud on top of MOSK Victoria with Tungsten Fabric
and use the default security group for newly created load balancers, add the
following rules for the Kubernetes API server endpoint, Container Cloud
application endpoint, and for the MKE web UI and API using the OpenStack CLI:
direction='ingress'
ethertype='IPv4'
protocol='tcp'
remote_ip_prefix='0.0.0.0/0'
port_range_max and port_range_min:
'443' for Kubernetes API and Container Cloud application endpoints
'6443' for MKE web UI and API
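To keep the two rules consistent, you can generate them from one definition and then pass the values to the tool of your choice. The following Python sketch only builds the rule parameters listed above as dictionaries. It does not call the OpenStack CLI or API, and the `purpose` key is an illustrative addition.

```python
def ingress_rules():
    """Build the two security group rules described above as plain dicts."""
    ports = {
        "443": "Kubernetes API and Container Cloud application endpoints",
        "6443": "MKE web UI and API",
    }
    return [
        {
            "direction": "ingress",
            "ethertype": "IPv4",
            "protocol": "tcp",
            "remote_ip_prefix": "0.0.0.0/0",
            "port_range_min": port,
            "port_range_max": port,
            "purpose": purpose,  # illustrative field, not a rule parameter
        }
        for port, purpose in ports.items()
    ]


for rule in ingress_rules():
    print(rule["port_range_min"], rule["purpose"])
```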
Verify access to the target cloud endpoint from Docker. For example:
Depending on the selected provider, navigate to one of the following
locations:
Bare metal: kaas-bootstrap/templates/bm
OpenStack: kaas-bootstrap/templates
Warning
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying objects containing credentials.
Such Container Cloud objects include:
BareMetalHostCredential
ClusterOIDCConfiguration
License
OpenstackCredential
Proxy
ServiceUser
TLSConfig
Therefore, do not use kubectl apply on these objects.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on these objects, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the objects using kubectl edit.
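To illustrate what removing the annotation means, the following Python sketch strips it from an object that has already been loaded into a dictionary, for example, from exported JSON. The helper name and the sample object are hypothetical. Writing the cleaned object back to the cluster is out of scope here.

```python
import copy

LAST_APPLIED = "kubectl.kubernetes.io/last-applied-configuration"


def strip_last_applied(obj):
    """Return a copy of the object without the last-applied-configuration annotation."""
    cleaned = copy.deepcopy(obj)
    annotations = cleaned.get("metadata", {}).get("annotations")
    if annotations:
        annotations.pop(LAST_APPLIED, None)
    return cleaned


# Hypothetical object with a leaked configuration in the annotation.
sample = {
    "kind": "Proxy",
    "metadata": {
        "name": "example-proxy",
        "annotations": {LAST_APPLIED: "{...}", "other": "kept"},
    },
}
print(strip_last_applied(sample)["metadata"]["annotations"])
```

The function copies the input first, so the original dictionary stays available for comparison.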
Create the BootstrapRegion object by modifying
bootstrapregion.yaml.template.
Configuration of bootstrapregion.yaml.template
Select from the following options:
Since Container Cloud 2.26.0 (Cluster releases 16.1.0 and 17.1.0),
set the required <providerName> and use the default
<regionName>, which is region-one.
Before Container Cloud 2.26.0, set the required <providerName>
and <regionName>.
For the OpenStack provider only. Create the Credentials object by
modifying <providerName>-config.yaml.template.
Configuration for OpenStack credentials
Add the provider-specific parameters:
Parameter
Description
SET_OS_AUTH_URL
Identity endpoint URL.
SET_OS_USERNAME
OpenStack user name.
SET_OS_PASSWORD
Value of the OpenStack password. This field is available only
when the user creates or changes the password. Once the controller
detects this field, it updates the password in the secret and
removes the value field from the OpenStackCredential
object.
SET_OS_PROJECT_ID
Unique ID of the OpenStack project.
Skip this step since Container Cloud 2.26.0. Before this release, set
the kaas.mirantis.com/region:<regionName> label that must match
the BootstrapRegion object name.
Skip this step since Container Cloud 2.26.0. Before this release, set
the kaas.mirantis.com/regional-credential label to "true"
to use the credentials for the management cluster deployment. For
example, for OpenStack:
The output of the command must be "true". Otherwise, fix the issue
with credentials before proceeding to the next step.
Create the ServiceUser object by modifying
serviceusers.yaml.template.
Configuration of serviceusers.yaml.template
Service user is the initial user to create in Keycloak for
access to a newly deployed management cluster. By default, it has the
global-admin, operator (namespaced), and bm-pool-operator
(namespaced) roles.
You can delete serviceuser after setting up other required users with
specific roles or after any integration with an external identity provider,
such as LDAP.
The region label must match the BootstrapRegion object name.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Configure and apply the cluster configuration using cluster deployment
templates:
In cluster.yaml.template, set mandatory cluster labels:
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Configure provider-specific settings as required.
Bare metal
Inspect the default bare metal host profile definition in
templates/bm/baremetalhostprofiles.yaml.template and adjust it to fit
your hardware configuration. For details, see Customize the default bare metal host profile.
Warning
Any data stored on any device defined in the fileSystems
list can be deleted or corrupted during cluster (re)deployment. It happens
because each device from the fileSystems list is a part of the
rootfs directory tree that is overwritten during (re)deployment.
Examples of affected devices include:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field (deprecated) and the wipeDevice structure (recommended
since Container Cloud 2.26.0) have no effect in this case and cannot
protect data on these devices.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
In templates/bm/baremetalhosts.yaml.template, update the bare metal host
definitions according to your environment configuration. Use the reference
table below to manually set all parameters that start with SET_.
Note
Before Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0),
also set the name of the bootstrapRegion object from
bootstrapregion.yaml.template for the kaas.mirantis.com/region label
across all objects listed in templates/bm/baremetalhosts.yaml.template.
The MAC address of the first master node in the PXE network.
ac:1f:6b:02:84:71
SET_MACHINE_0_BMC_ADDRESS
The IP address of the BMC endpoint for the first master node in
the cluster. Must be an address from the OOB network
that is accessible through the management network gateway.
The MAC address of the second master node in the PXE network.
ac:1f:6b:02:84:72
SET_MACHINE_1_BMC_ADDRESS
The IP address of the BMC endpoint for the second master node in
the cluster. Must be an address from the OOB network
that is accessible through the management network gateway.
The MAC address of the third master node in the PXE network.
ac:1f:6b:02:84:73
SET_MACHINE_2_BMC_ADDRESS
The IP address of the BMC endpoint for the third master node in
the cluster. Must be an address from the OOB network
that is accessible through the management network gateway.
The parameter requires a user name and password in plain text.
Configure cluster network:
Important
Bootstrap v2 supports only separate
PXE and LCM networks.
To ensure successful bootstrap, enable asymmetric routing on the interfaces
of the management cluster nodes. This is required because the seed node
relies on one network by default, which can potentially cause
traffic asymmetry.
In the kernelParameters section of
bm/baremetalhostprofiles.yaml.template, set rp_filter to 2.
This enables loose mode as defined in
RFC 3704.
Example configuration of asymmetric routing
...
kernelParameters:
  ...
  sysctl:
    # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
    net.ipv4.conf.k8s-lcm.rp_filter: "2"
    # Enables the "Loose mode" for the "bond0" interface (PXE network)
    net.ipv4.conf.bond0.rp_filter: "2"
...
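If your hosts use other interface names, the sysctl keys follow the same pattern for each interface. The following Python sketch generates the key and value pairs for a list of interface names. The helper name `loose_rp_filter` is hypothetical, and the interface names are the ones from the example above.

```python
def loose_rp_filter(interfaces):
    """Map each interface name to its rp_filter sysctl key set to "2" (loose mode)."""
    return {f"net.ipv4.conf.{name}.rp_filter": "2" for name in interfaces}


for key, value in loose_rp_filter(["k8s-lcm", "bond0"]).items():
    print(f'{key}: "{value}"')
```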
Note
More complicated solutions, which are not described in this manual,
eliminate traffic asymmetry, for example:
Configure source routing on management cluster nodes.
Plug the seed node into the same networks as the management cluster nodes,
which requires custom configuration of the seed node.
Update the network objects definition in
templates/bm/ipam-objects.yaml.template according to the environment
configuration. By default, this template implies the use of separate PXE
and life-cycle management (LCM) networks.
Manually set all parameters that start with SET_.
For configuration details of bond network interface for the PXE and management
network, see Configure NIC bonding.
Example of the default L2 template snippet for a management cluster:
In this example, the following configuration applies:
A bond of two NIC interfaces
A static address in the PXE network set on the bond
An isolated L2 segment for the LCM network is configured using
the k8s-lcm VLAN with the static address in the LCM network
The default gateway address is in the LCM network
For general concepts of configuring separate PXE and LCM networks for a
management cluster, see Separate PXE and management networks. For the latest object
templates and variable names to use, see the following tables.
The following table contains examples of mandatory parameter values to set
in templates/bm/ipam-objects.yaml.template for the network scheme that
has the following networks:
172.16.59.0/24 - PXE network
172.16.61.0/25 - LCM network
Mandatory network parameters of the IPAM objects template
Parameter
Description
Example value
SET_PXE_CIDR
The IP address of the PXE network in the CIDR notation. The minimum
recommended network size is 256 addresses (/24 prefix length).
172.16.59.0/24
SET_PXE_SVC_POOL
The IP address range to use for endpoints of load balancers in the PXE
network for the Container Cloud services: Ironic-API, DHCP server,
HTTP server, and caching server. The minimum required range size is
5 addresses.
172.16.59.6-172.16.59.15
SET_PXE_ADDR_POOL
The IP address range in the PXE network to use for dynamic address
allocation for hosts during inspection and provisioning.
The minimum recommended range size is 30 addresses for management
cluster nodes if it is located in a separate PXE network segment.
Otherwise, it depends on the number of managed cluster nodes to
deploy in the same PXE network segment as the management cluster nodes.
172.16.59.51-172.16.59.200
SET_PXE_ADDR_RANGE
The IP address range in the PXE network to use for static address
allocation on each management cluster node. The minimum recommended
range size is 6 addresses.
172.16.59.41-172.16.59.50
SET_MGMT_CIDR
The IP address of the LCM network for the management cluster
in the CIDR notation.
If managed clusters will have their separate LCM networks, those
networks must be routable to the LCM network. The minimum
recommended network size is 128 addresses (/25 prefix length).
172.16.61.0/25
SET_MGMT_NW_GW
The default gateway address in the LCM network. This gateway
must provide access to the OOB network of the Container Cloud cluster
and to the Internet to download the Mirantis artifacts.
172.16.61.1
SET_LB_HOST
The IP address of the externally accessible MKE API endpoint
of the cluster in the CIDR notation. This address must be within
the management SET_MGMT_CIDR network but must NOT overlap
with any other addresses or address ranges within this network.
External load balancers are not supported.
172.16.61.5/32
SET_MGMT_DNS
An external (non-Kubernetes) DNS server accessible from the
LCM network.
8.8.8.8
SET_MGMT_ADDR_RANGE
The IP address range that includes addresses to be allocated to
bare metal hosts in the LCM network for the management cluster.
When this network is shared with managed clusters, the size of this
range limits the number of hosts that can be deployed in all clusters
sharing this network.
When this network is solely used by a management cluster, the range
must include at least 6 addresses for bare metal hosts of the
management cluster.
172.16.61.30-172.16.61.40
SET_MGMT_SVC_POOL
The IP address range to use for the externally accessible endpoints
of load balancers in the LCM network for the Container Cloud
services, such as Keycloak, web UI, and so on. The minimum required
range size is 19 addresses.
172.16.61.10-172.16.61.29
SET_VLAN_ID
The VLAN ID used for isolation of the LCM network. The
bootstrap.sh process and the seed node must have routable
access to the network in this VLAN.
3975
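The size and containment constraints from the table above can be checked before you fill in the template. The following Python sketch validates the example values with the standard ipaddress module. The function names and the structure of the plan dictionary are assumptions made for illustration, and the minimum sizes are the ones stated in the table. Adjust them if your PXE or LCM networks are shared with managed clusters.

```python
import ipaddress

# Example values from the table above.
PLAN = {
    "SET_PXE_CIDR": "172.16.59.0/24",
    "SET_PXE_SVC_POOL": "172.16.59.6-172.16.59.15",
    "SET_PXE_ADDR_POOL": "172.16.59.51-172.16.59.200",
    "SET_PXE_ADDR_RANGE": "172.16.59.41-172.16.59.50",
    "SET_MGMT_CIDR": "172.16.61.0/25",
    "SET_LB_HOST": "172.16.61.5/32",
    "SET_MGMT_ADDR_RANGE": "172.16.61.30-172.16.61.40",
    "SET_MGMT_SVC_POOL": "172.16.61.10-172.16.61.29",
}

# Minimum range sizes stated in the table above.
MINIMUM_SIZES = {
    "SET_PXE_SVC_POOL": 5,
    "SET_PXE_ADDR_POOL": 30,
    "SET_PXE_ADDR_RANGE": 6,
    "SET_MGMT_ADDR_RANGE": 6,
    "SET_MGMT_SVC_POOL": 19,
}

PXE_RANGES = ("SET_PXE_SVC_POOL", "SET_PXE_ADDR_POOL", "SET_PXE_ADDR_RANGE")
MGMT_RANGES = ("SET_MGMT_ADDR_RANGE", "SET_MGMT_SVC_POOL")


def parse_range(text):
    """Split 'first-last' into two IP address objects."""
    first, last = (ipaddress.ip_address(part.strip()) for part in text.split("-"))
    return first, last


def range_size(text):
    """Return the number of addresses in an inclusive 'first-last' range."""
    first, last = parse_range(text)
    return int(last) - int(first) + 1


def check_plan(plan):
    """Return a list of problems found in the address plan."""
    problems = []
    for name, minimum in MINIMUM_SIZES.items():
        if range_size(plan[name]) < minimum:
            problems.append(f"{name} has fewer than {minimum} addresses")
    for names, cidr_key in ((PXE_RANGES, "SET_PXE_CIDR"), (MGMT_RANGES, "SET_MGMT_CIDR")):
        network = ipaddress.ip_network(plan[cidr_key])
        for name in names:
            first, last = parse_range(plan[name])
            if first not in network or last not in network:
                problems.append(f"{name} is outside {cidr_key}")
    # The load balancer host must be inside the LCM network but outside its ranges.
    lb_host = ipaddress.ip_interface(plan["SET_LB_HOST"]).ip
    if lb_host not in ipaddress.ip_network(plan["SET_MGMT_CIDR"]):
        problems.append("SET_LB_HOST is outside SET_MGMT_CIDR")
    for name in MGMT_RANGES:
        first, last = parse_range(plan[name])
        if first <= lb_host <= last:
            problems.append(f"SET_LB_HOST overlaps {name}")
    return problems


print(check_plan(PLAN))
```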
When using separate PXE and LCM networks, the management cluster
services are exposed in different networks using two separate MetalLB
address pools:
Services exposed through the PXE network are as follows:
Ironic API as a bare metal provisioning server
HTTP server that provides images for network boot and server
provisioning
Caching server for accessing the Container Cloud artifacts deployed
on hosts
Services exposed through the LCM network are all other
Container Cloud services, such as Keycloak, web UI, and so on.
The default MetalLB configuration described in the MetalLBConfigTemplate
object template of templates/bm/ipam-objects.yaml.template uses two
separate MetalLB address pools. Also, it uses the interfaces selector
in its l2Advertisements template.
Caution
When you change the L2Template object template in
templates/bm/ipam-objects.yaml.template, ensure that interfaces
listed in the interfaces field of the
MetalLBConfigTemplate.spec.templates.l2Advertisements section
match those used in your L2Template. For details about the
interfaces selector, see API Reference:
MetalLBConfigTemplate spec.
In cluster.yaml.template, update the cluster-related
settings to fit your deployment.
Optional. Enable WireGuard for traffic encryption on the Kubernetes workloads
network.
WireGuard configuration
Ensure that the Calico MTU size is at least 60 bytes smaller than the
interface MTU size of the workload network. IPv4 WireGuard uses a
60-byte header. For details, see Set the MTU size for Calico.
In templates/bm/cluster.yaml.template, enable WireGuard by adding
the secureOverlay parameter:
spec:
  ...
  providerSpec:
    value:
      ...
      secureOverlay: true
Caution
Changing this parameter on a running cluster causes a
downtime that can vary depending on the cluster size.
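The MTU requirement for WireGuard reduces to a subtraction. The following Python sketch shows the arithmetic. The function names are hypothetical and the 60-byte figure is the IPv4 WireGuard header size stated above.

```python
WIREGUARD_IPV4_OVERHEAD = 60  # bytes, as stated in this section


def max_calico_mtu(interface_mtu):
    """Largest Calico MTU allowed for a given workload interface MTU."""
    if interface_mtu <= WIREGUARD_IPV4_OVERHEAD:
        raise ValueError("interface MTU is too small for WireGuard")
    return interface_mtu - WIREGUARD_IPV4_OVERHEAD


def calico_mtu_is_valid(calico_mtu, interface_mtu):
    """True if the Calico MTU leaves room for the WireGuard header."""
    return calico_mtu <= max_calico_mtu(interface_mtu)


print(max_calico_mtu(1500))             # standard Ethernet MTU
print(calico_mtu_is_valid(1450, 1500))  # too large for WireGuard
```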
Adjust the templates/cluster.yaml.template parameters to suit your
deployment:
In the spec::providerSpec::value section, add the mandatory
ExternalNetworkID parameter, which is the ID of an external
OpenStack network. The network is required for virtual machines
to have public Internet access.
In the spec::clusterNetwork::services section, add the
corresponding values for cidrBlocks.
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if added manually, Container Cloud
ignores this label.
Configure the provider-specific settings:
Bare metal
Inspect the machines.yaml.template and adjust spec and labels of
each entry according to your deployment. Adjust
spec.providerSpec.value.hostSelector values to match BareMetalHost
corresponding to each machine. For details, see API Reference:
Bare metal Machine spec.
OpenStack
In templates/machines.yaml.template, modify the
spec:providerSpec:value section for 3 control plane nodes marked with
the cluster.sigs.k8s.io/control-plane label by substituting the
flavor and image parameters with the corresponding values
of the control plane nodes in the related OpenStack cluster. For example:
The flavor parameter value provided in the example above
is cloud-specific and must meet the Container Cloud
requirements.
Optional. Available as TechPreview. To boot cluster machines from a block
storage volume, define the following parameter in the spec:providerSpec
section of templates/machines.yaml.template:
To boot the Bastion node from a volume, add the same parameter to
templates/cluster.yaml.template in the spec:providerSpec section
for Bastion. The default amount of storage, 80, is enough.
Also, modify other parameters as required.
For the bare metal provider, monitor the inspection of the
bare metal hosts and wait until all hosts are in the available state:
kubectl get bmh -o go-template='{{- range .items -}} {{.status.provisioning.state}}{{"\n"}} {{- end -}}'
Example of system response:
available
available
available
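When the host list is long, the wait condition above can be evaluated by a script instead of by eye. The following Python sketch assumes that the command output has already been captured as text, one provisioning state per line. The function name is hypothetical.

```python
def all_hosts_available(output):
    """True if the output lists at least one host and every state is 'available'."""
    states = [line.strip() for line in output.splitlines() if line.strip()]
    return bool(states) and all(state == "available" for state in states)


sample = "available\navailable\navailable\n"
print(all_hosts_available(sample))
```

An empty output is treated as not ready, because it usually means that no BareMetalHost objects have been created yet.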
Monitor the BootstrapRegion object status and wait until it is ready.
For a more convenient system response, consider using dedicated tools such
as jq or yq and adjust the -o flag to output in
json or yaml format accordingly.
Note
For the bare metal provider, before Container Cloud 2.26.0,
the BareMetalObjectReferences condition is not mandatory and may
remain in the notready state with no effect on the
BootstrapRegion object. Since Container Cloud 2.26.0, this condition
is mandatory.
Change the directory to /kaas-bootstrap/.
Approve the BootstrapRegion object to start the cluster deployment:
Not all of Swarm and MCR addresses are usually in use. One Swarm Ingress
network is created by default and occupies the 10.0.0.0/24 address
block. Also, three MCR networks are created by default and occupy
three address blocks: 10.99.0.0/20, 10.99.16.0/20,
10.99.32.0/20.
To verify the actual networks state and addresses in use, run:
docker network ls
docker network inspect <networkName>
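To see whether a planned network collides with the default Swarm and MCR blocks listed above, compare the CIDRs. The following Python sketch uses only the four default blocks named in this section. The labels and the function name are hypothetical, and whether an overlap causes a problem in practice depends on which addresses are actually in use.

```python
import ipaddress

# Default blocks as listed in this section; the labels are illustrative.
DEFAULT_BLOCKS = {
    "Swarm Ingress": "10.0.0.0/24",
    "MCR 1": "10.99.0.0/20",
    "MCR 2": "10.99.16.0/20",
    "MCR 3": "10.99.32.0/20",
}


def conflicting_blocks(cidr):
    """Return the labels of default blocks that overlap the given CIDR."""
    network = ipaddress.ip_network(cidr)
    return [
        label
        for label, block in DEFAULT_BLOCKS.items()
        if network.overlaps(ipaddress.ip_network(block))
    ]


print(conflicting_blocks("172.16.61.0/24"))
print(conflicting_blocks("10.99.16.0/24"))
```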
Optional for the bare metal provider. If you plan to use multiple L2 segments for provisioning of managed
cluster nodes, consider the requirements specified in Configure multiple DHCP ranges using Subnet resources.
Deploy a management cluster using the Container Cloud Bootstrap web UI
Caution
Since Container Cloud 2.27.3 (Cluster release 16.2.3), support
for vSphere-based clusters is suspended. For details, see
Deprecation notes.
This section describes how to configure the cluster-related objects and
deploy a management cluster using Bootstrap v2 through the Container Cloud
Bootstrap web UI.
Create a management cluster for the OpenStack provider
This section describes how to create an OpenStack-based management cluster
using the Container Cloud Bootstrap web UI.
To create an OpenStack-based management cluster:
If you deploy Container Cloud on top of MOSK Victoria with Tungsten Fabric
and use the default security group for newly created load balancers, add the
following rules for the Kubernetes API server endpoint, Container Cloud
application endpoint, and for the MKE web UI and API using the OpenStack CLI:
direction='ingress'
ethertype='IPv4'
protocol='tcp'
remote_ip_prefix='0.0.0.0/0'
port_range_max and port_range_min:
'443' for Kubernetes API and Container Cloud application endpoints
Optional. Recommended. Leave the
Guided Bootstrap configuration check box selected.
It enables the cluster creation helper in the next window with a series
of guided steps for a complete setup of a functional management cluster.
The cluster creation helper contains the same configuration windows as in
separate tabs of the left-side menu, but the helper enables the
configuration of essential provider components one-by-one inside one modal
window.
If you select this option, refer to the corresponding steps of this
procedure below for a description of each tab in
Guided Bootstrap configuration.
Click Save.
In the Status column of the Bootstrap page,
monitor the bootstrap region readiness by hovering over the status icon of
the bootstrap region.
Once the orange blinking status icon becomes green and Ready,
the bootstrap region deployment is complete. If the cluster status
is Error, refer to Troubleshooting.
You can monitor live deployment status of the following bootstrap region
components:
Component
Status description
Helm
Installation status of bootstrap Helm releases
Provider
Status of provider configuration and installation for related charts
and Deployments
Deployments
Readiness of all Deployments in the bootstrap cluster
Configure credentials for the new cluster.
Credentials configuration
In the Credentials tab:
Click Add Credential to add your OpenStack credentials.
You can either upload your OpenStack clouds.yaml configuration
file or fill in the fields manually.
Verify that the new credentials status is Ready.
If the status is Error, hover over the status to determine
the cause of the issue.
Optional. In the SSH Keys tab, click Add SSH Key
to upload the public SSH key(s) for VM creation.
Optional. Enable proxy access to the cluster.
Proxy configuration
In the Proxies tab, configure proxy:
Click Add Proxy.
In the Add New Proxy wizard, fill out the form
with the following parameters:
If your proxy requires a trusted CA certificate, select the
CA Certificate check box and paste a CA certificate for a MITM
proxy to the corresponding field or upload a certificate using
Upload Certificate.
In the Clusters tab, click Create Cluster
and fill out the form with the following parameters:
Cluster configuration
Add Cluster name.
Set the provider Service User Name and
Service User Password.
Service user is the initial user to create in Keycloak for
access to a newly deployed management cluster. By default, it has the
global-admin, operator (namespaced), and bm-pool-operator
(namespaced) roles.
You can delete serviceuser after setting up other required users with
specific roles or after any integration with an external identity provider,
such as LDAP.
Configure general provider settings and Kubernetes parameters:
Technology Preview: select Boot From Volume to boot the
Bastion node from a block storage volume and select the required amount
of storage (80 GB is enough).
Kubernetes
Node CIDR
The Kubernetes nodes CIDR block. For example, 10.10.10.0/24.
Services CIDR Blocks
The Kubernetes Services CIDR block. For example, 10.233.0.0/18.
Pods CIDR Blocks
The Kubernetes Pods CIDR block. For example, 10.233.64.0/18.
Note
The network subnet size of Kubernetes pods influences the number of
nodes that can be deployed in the cluster.
The default subnet size /18 is enough to create a cluster with
up to 256 nodes. Each node uses the /26 address blocks
(64 addresses), at least one address block is allocated per node.
These addresses are used by the Kubernetes pods with
hostNetwork:false. The cluster size may be limited
further when some nodes use more than one address block.
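The node limit in the note above follows from the prefix lengths. The following Python sketch reproduces the arithmetic for one /26 block per node. The function name is hypothetical, and the result is an upper bound because nodes that use more than one block lower it.

```python
import ipaddress


def max_nodes(pods_cidr, block_prefix=26):
    """Upper bound on nodes when each node takes one address block."""
    network = ipaddress.ip_network(pods_cidr)
    if block_prefix < network.prefixlen:
        raise ValueError("address block is larger than the pods network")
    return 2 ** (block_prefix - network.prefixlen)


def addresses_per_block(block_prefix=26):
    """Number of IPv4 addresses in one per-node block."""
    return 2 ** (32 - block_prefix)


print(max_nodes("10.233.64.0/18"))  # default Pods CIDR Blocks example
print(addresses_per_block())
```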
Configure StackLight:
StackLight configuration
Click Create.
Add machines to the bootstrap cluster:
Machines configuration
In the Clusters tab, click the required cluster name.
The cluster page with Machines list opens.
Specify an odd number of machines to create. Only
Manager machines are allowed.
Caution
The required minimum number of manager machines is three for HA.
A cluster can have more than three manager machines but only an odd number of
machines.
In an even-sized cluster, an additional machine remains in the Pending
state until an extra manager machine is added.
An even number of manager machines does not provide additional fault
tolerance but increases the number of nodes required for etcd quorum.
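The reasoning behind the odd-number requirement can be verified with majority arithmetic. The following Python sketch assumes the standard majority quorum used by etcd. The function names are hypothetical.

```python
def quorum(members):
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def fault_tolerance(members):
    """Number of members that can fail while a majority remains."""
    return members - quorum(members)


for n in (3, 4, 5):
    print(n, quorum(n), fault_tolerance(n))
```

Going from three to four managers raises the quorum from 2 to 3 while the number of tolerated failures stays at 1.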
Flavor
From the drop-down list, select the required hardware
configuration for the machine. The list of available flavors
corresponds to the one in your OpenStack environment.
A Container Cloud cluster based on both Ubuntu and
CentOS operating systems is not supported.
Availability Zone
From the drop-down list, select the availability zone from which
the new machine will be launched.
Configure Server Metadata
Optional. Select Configure Server Metadata and add
the required number of string key-value pairs for the machine
meta_data configuration in cloud-init.
Prohibited keys are: KaaS, cluster, clusterID,
namespace as they are used by Container Cloud.
Boot From Volume
Optional. Technology Preview. Select to boot a machine from a block
storage volume. Use the Up and Down arrows
in the Volume Size (GiB) field to define the required
volume size.
This option applies to clouds that do not have enough space on
hypervisors. After enabling this option, the Cinder storage
is used instead of the Nova storage.
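For the Configure Server Metadata field described above, key-value pairs can be validated before you enter them. The following Python sketch checks the prohibited keys listed in this section and the string-only requirement. The function name is hypothetical, and the exact-case key comparison is an assumption.

```python
# Keys reserved by Container Cloud, as listed in this section.
PROHIBITED_KEYS = frozenset({"KaaS", "cluster", "clusterID", "namespace"})


def metadata_problems(metadata):
    """Return a list of problems found in a server metadata mapping."""
    problems = []
    for key, value in metadata.items():
        if key in PROHIBITED_KEYS:
            problems.append(f"key {key!r} is reserved by Container Cloud")
        if not isinstance(key, str) or not isinstance(value, str):
            problems.append(f"pair {key!r} must consist of strings")
    return problems


print(metadata_problems({"rack": "r1", "cluster": "demo"}))
```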
Click Create.
Optional. Using the Container Cloud CLI, modify the provider-specific and
other cluster settings as described in Configure optional cluster settings.
Select from the following options to start cluster deployment:
If you use the Guided Bootstrap configuration
Click Deploy.
If you use the left-side web UI menu
Approve the previously created bootstrap region using the Container Cloud
CLI:
Once you approve the bootstrap region, no cluster or
machine modification is allowed.
Monitor the deployment progress of the cluster and machines.
Monitoring of the cluster readiness
To monitor the cluster readiness, hover over the status icon of a specific
cluster in the Status column of the Clusters page.
Once the orange blinking status icon becomes green and Ready,
the cluster deployment or update is complete.
You can monitor live deployment status of the following cluster components:
Component
Description
Bastion
For the OpenStack-based management clusters, the Bastion node
IP address status that confirms the Bastion node creation
Helm
Installation or upgrade status of all Helm releases
Kubelet
Readiness of the node in a Kubernetes cluster, as reported by kubelet
Kubernetes
Readiness of all requested Kubernetes objects
Nodes
Equality of the requested nodes number in the cluster to the number
of nodes having the Ready LCM status
OIDC
Readiness of the cluster OIDC configuration
StackLight
Health of all StackLight-related objects in a Kubernetes cluster
Swarm
Readiness of all nodes in a Docker Swarm cluster
LoadBalancer
Readiness of the Kubernetes API load balancer
ProviderInstance
Readiness of all machines in the underlying infrastructure
(virtual or bare metal, depending on the provider type)
Graceful Reboot
Readiness of a cluster during a scheduled graceful reboot,
available since Cluster releases 15.0.1 and 14.0.0.
Infrastructure Status
Available since Container Cloud 2.25.0 for bare metal and OpenStack
providers. Readiness of the following cluster components:
Bare metal: the MetalLBConfig object along with MetalLB and DHCP
subnets.
OpenStack: cluster network, routers, load balancers, and Bastion
along with their ports and floating IPs.
LCM Operation
Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0). Health of all LCM operations on the cluster and its machines.
LCM Agent
Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0). Health of all LCM agents on cluster machines and the status of
LCM agents update to the version from the current Cluster release.
To monitor machines readiness, use the status icon of a specific machine on
the Clusters page.
Quick status
On the Clusters page, in the Managers column.
The green status icon indicates that the machine is Ready,
the orange status icon indicates that the machine is Updating.
Detailed status
In the Machines section of a particular cluster
page, in the Status column. Hover over a particular machine
status icon to verify the deploy or update status of a
specific machine component.
You can monitor the status of the following machine components:
Component
Description
Kubelet
Readiness of a node in a Kubernetes cluster.
Swarm
Health and readiness of a node in a Docker Swarm cluster.
LCM
LCM readiness status of a node.
ProviderInstance
Readiness of a node in the underlying infrastructure
(virtual or bare metal, depending on the provider type).
Graceful Reboot
Readiness of a machine during a scheduled graceful reboot of a cluster,
available since Cluster releases 15.0.1 and 14.0.0.
Infrastructure Status
Available since Container Cloud 2.25.0 for the bare metal provider only.
Readiness of the IPAMHost, L2Template, BareMetalHost, and
BareMetalHostProfile objects associated with the machine.
LCM Operation
Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0). Health of all LCM operations on the machine.
LCM Agent
Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0). Health of the LCM Agent on the machine and the status of the
LCM Agent update to the version from the current Cluster release.
The machine creation starts with the Provision status.
During provisioning, the machine is not expected to be accessible
since its infrastructure (VM, network, and so on) is being created.
Other machine statuses are the same as the LCMMachine object states:
Uninitialized - the machine is not yet assigned to an LCMCluster.
Pending - the agent reports a node IP address and host name.
Prepare - the machine executes StateItems that correspond
to the prepare phase. This phase usually involves downloading
the necessary archives and packages.
Deploy - the machine executes StateItems that correspond
to the deploy phase, during which the machine becomes a Mirantis
Kubernetes Engine (MKE) node.
Ready - the machine is deployed.
Upgrade - the machine is being upgraded to the new MKE version.
Reconfigure - the machine executes StateItems that correspond
to the reconfigure phase. The machine configuration is being updated
without affecting workloads running on the machine.
Once the status changes to Ready, the deployment of the cluster
components on this machine is complete.
You can also monitor the live machine status using API:
kubectl get machines <machineName> -o wide
Example of system response since Container Cloud 2.23.0:
Not all of Swarm and MCR addresses are usually in use. One Swarm Ingress
network is created by default and occupies the 10.0.0.0/24 address
block. Also, three MCR networks are created by default and occupy
three address blocks: 10.99.0.0/20, 10.99.16.0/20,
10.99.32.0/20.
To verify the actual networks state and addresses in use, run:
docker network ls
docker network inspect <networkName>
Note
The Bootstrap web UI support for the bare metal provider will be
added in one of the following Container Cloud releases.
Before adding new BareMetalHost objects, configure hardware hosts to
correctly load them over the PXE network.
Important
Consider the following common requirements for hardware hosts
configuration:
Update firmware for BIOS and Baseboard Management Controller (BMC) to the
latest available version, especially if you are going to apply the UEFI
configuration.
Container Cloud uses the ipxe.efi binary loader that might not be
compatible with old firmware and might have vendor-related issues with UEFI
booting. For example, the Supermicro issue.
In this case, we recommend using the legacy booting format.
Configure all or at least the PXE NIC on switches.
If the hardware host has more than one PXE NIC to boot, we strongly
recommend setting up only one in the boot order. It speeds up the
provisioning phase significantly.
Some hardware vendors require a host to be rebooted during BIOS
configuration changes from legacy to UEFI or vice versa for the
extra option with NIC settings to appear in the menu.
Connect only one Ethernet port on a host to the PXE network at any given
time. Collect the physical address (MAC) of this interface and use it to
configure the BareMetalHost object describing the host.
To configure BIOS on a bare metal host:
Legacy hardware host configuration
Enable the global BIOS mode using
BIOS > Boot > boot mode select > legacy. Reboot the host
if required.
Enable the LAN-PXE-OPROM support using the following menus:
This section describes the bare metal host profile settings and
explains how to configure this profile before deploying
Mirantis Container Cloud on physical servers.
The bare metal host profile is a Kubernetes custom resource.
It allows the Infrastructure Operator to define how the storage devices
and the operating system are provisioned and configured.
The bootstrap templates for a bare metal deployment include the template for
the default BareMetalHostProfile object in the following file
that defines the default bare metal host profile:
templates/bm/baremetalhostprofiles.yaml.template
Note
Using BareMetalHostProfile, you can configure LVM or mdadm-based
software RAID support during a management or managed cluster
creation. For details, see Configure RAID support.
This feature is available as Technology Preview. Use such
configuration for testing and evaluation purposes only. For the
Technology Preview feature definition, refer to Technology Preview features.
Warning
Any data stored on any device defined in the fileSystems
list can be deleted or corrupted during cluster (re)deployment. It happens
because each device from the fileSystems list is a part of the
rootfs directory tree that is overwritten during (re)deployment.
Examples of affected devices include:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
Neither the wipe field (deprecated) nor the wipeDevice structure
(recommended since Container Cloud 2.26.0) has any effect in this case,
and neither can protect data on these devices.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
The customization procedure of BareMetalHostProfile is almost the same for
the management and managed clusters, with the following differences:
For a management cluster, the customization automatically applies
to machines during bootstrap. For a managed cluster, you apply
the changes using kubectl before creating the managed cluster.
For a management cluster, you edit the default
baremetalhostprofiles.yaml.template. For a managed cluster, you
create a new BareMetalHostProfile with the necessary configuration.
For the procedure details, see Create a custom bare metal host profile.
Use this procedure for both types of clusters considering the differences
described above.
You can configure L2 templates for the management cluster to set up
a bond network interface for the PXE and management network.
This configuration must be applied to the bootstrap templates,
before you run the bootstrap script to deploy the management
cluster.
Configuration requirements for NIC bonding
Add at least two physical interfaces to each host in your management
cluster.
Connect at least two interfaces per host to an Ethernet switch
that supports Link Aggregation Control Protocol (LACP)
port groups and LACP fallback.
Configure an LACP group on the ports connected
to the NICs of a host.
Configure the LACP fallback on the port group to ensure that
the host can boot over the PXE network before the bond interface
is set up on the host operating system.
Configure server BIOS for both NICs of a bond to be PXE-enabled.
If the server does not support booting from multiple NICs,
configure the port of the LACP group that is connected to the
PXE-enabled NIC of a server to be the primary port.
With this setting, the port becomes active in the fallback mode.
Configure the ports that connect servers to the PXE network with the
PXE VLAN as native or untagged.
For reference configuration of network fabric in a baremetal-based cluster,
see Network fabric.
To configure a bond interface that aggregates two interfaces
for the PXE and management network:
In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:
Verify that only the following parameters for the declaration
of {{nic0}} and {{nic1}} are set, as shown in the example
below:
dhcp4
dhcp6
match
set-name
Remove other parameters.
Verify that the declaration of the bond interface bond0 has the
interfaces parameter listing both Ethernet interfaces.
Verify that the node address in the PXE network (ip "bond0:mgmt-pxe"
in the example below) is bound to the bond interface or to the virtual
bridge interface tied to that bond.
Caution
No VLAN ID must be configured for the PXE network
from the host side.
Configure bonding options using the parameters field. The only
mandatory option is mode. See the example below for details.
Note
You can set any mode supported by
netplan
and your hardware.
Important
Bond monitoring is disabled in Ubuntu by default. However,
Mirantis highly recommends enabling it using Media Independent Interface
(MII) monitoring by setting the mii-monitor-interval parameter to a
non-zero value. For details, see Linux documentation: bond monitoring.
Verify your configuration using the following example:
This section describes how to configure a dedicated PXE network for a
management bare metal cluster.
A separate PXE network allows isolating the sensitive bare metal provisioning
process from the end users. The users still have access to Container Cloud
services, such as Keycloak, to authenticate workloads in managed clusters,
such as Horizon in a Mirantis OpenStack for Kubernetes cluster.
The following table describes the overall network mapping scheme with all
L2/L3 parameters, for example, for two networks, PXE (CIDR 10.0.0.0/24)
and management (CIDR 10.0.11.0/24):
When using separate PXE and management networks, the management cluster
services are exposed in different networks using two separate MetalLB
address pools:
Services exposed through the PXE network are as follows:
Ironic API as a bare metal provisioning server
HTTP server that provides images for network boot and server
provisioning
Caching server for accessing the Container Cloud artifacts deployed
on hosts
Services exposed through the management network are all other Container Cloud
services, such as Keycloak, web UI, and so on.
To configure separate PXE and management networks:
To ensure successful bootstrap, enable asymmetric routing on the interfaces
of the management cluster nodes. This is required because the seed node
relies on one network by default, which can potentially cause
traffic asymmetry.
In the kernelParameters section of
bm/baremetalhostprofiles.yaml.template, set rp_filter to 2.
This enables loose mode as defined in
RFC3704.
Example configuration of asymmetric routing
...
kernelParameters:
  ...
  sysctl:
    # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
    net.ipv4.conf.k8s-lcm.rp_filter: "2"
    # Enables the "Loose mode" for the "bond0" interface (PXE network)
    net.ipv4.conf.bond0.rp_filter: "2"
...
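If your L2 template uses interface names other than k8s-lcm and bond0, the sysctl keys follow the same pattern. The following Python sketch generates the entries for a list of interface names. The function name is hypothetical and the output is meant to be copied into the sysctl section manually.

```python
def loose_rp_filter_sysctl(interfaces):
    """Build sysctl entries that set rp_filter to loose mode ("2")."""
    return {f"net.ipv4.conf.{name}.rp_filter": "2" for name in interfaces}


for key, value in loose_rp_filter_sysctl(["k8s-lcm", "bond0"]).items():
    print(f'{key}: "{value}"')
```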
Note
More complex solutions, which are not described in this manual,
eliminate traffic asymmetry altogether. For example:
Configure source routing on management cluster nodes.
Plug the seed node into the same networks as the management cluster nodes,
which requires custom configuration of the seed node.
In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:
Substitute all the Subnet object templates with the new ones
as described in the example template below.
Update the L2 template spec.l3Layout and spec.npTemplate fields
as described in the example template below.
Example of the Subnet object templates
# Subnet object that provides IP addresses for bare metal hosts of
# management cluster in the PXE network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-pxe
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas-mgmt-pxe-subnet: ""
spec:
  cidr: SET_IPAM_CIDR
  gateway: SET_PXE_NW_GW
  nameservers:
    - SET_PXE_NW_DNS
  includeRanges:
    - SET_IPAM_POOL_RANGE
  excludeRanges:
    - SET_METALLB_PXE_ADDR_POOL
---
# Subnet object that provides IP addresses for bare metal hosts of
# management cluster in the management network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-lcm
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas-mgmt-lcm-subnet: ""
    ipam/SVC-k8s-lcm: "1"
    ipam/SVC-ceph-cluster: "1"
    ipam/SVC-ceph-public: "1"
    cluster.sigs.k8s.io/cluster-name: CLUSTER_NAME
spec:
  cidr: {{SET_LCM_CIDR}}
  includeRanges:
    - {{SET_LCM_RANGE}}
  excludeRanges:
    - SET_LB_HOST
    - SET_METALLB_ADDR_POOL
---
# Deprecated since 2.27.0. Subnet object that provides configuration
# for "services-pxe" MetalLB address pool that will be used to expose
# services LB endpoints in the PXE network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-pxe-lb
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    metallb/address-pool-name: services-pxe
    metallb/address-pool-protocol: layer2
    metallb/address-pool-auto-assign: "false"
    cluster.sigs.k8s.io/cluster-name: CLUSTER_NAME
spec:
  cidr: SET_IPAM_CIDR
  includeRanges:
    - SET_METALLB_PXE_ADDR_POOL
Deprecated since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0): the last Subnet template named mgmt-pxe-lb in the example
above will be used to configure the MetalLB address pool in the PXE network.
The bare metal provider will automatically configure MetalLB
with address pools using the Subnet objects identified by specific
labels.
Warning
The bm-pxe address must have a separate interface
with only one address on this interface.
Verify the current MetalLB configuration that is stored in MetalLB
objects:
The auto-assign parameter will be set to false for all address
pools except the default one. So, a particular service will get an
address from such an address pool only if the Service object has a
special metallb.universe.tf/address-pool annotation that points to
the specific address pool name.
Note
It is expected that every Container Cloud service on a management
cluster will be assigned to one of the address pools.
The current approach is to have two MetalLB address pools:
services-pxe is a reserved address pool name to use for
the Container Cloud services in the PXE network (Ironic API,
HTTP server, caching server).
The bootstrap cluster also uses the services-pxe address
pool for its provision services for management cluster nodes
to be provisioned from the bootstrap cluster. After the
management cluster is deployed, the bootstrap cluster is
deleted and that address pool is solely used by the newly
deployed cluster.
default is an address pool to use for all other Container
Cloud services in the management network. No annotation
is required on the Service objects in this case.
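The pool selection rule described above can be modeled in a few lines. The following Python sketch is a simplified illustration of that rule, not MetalLB code. It takes the annotations of a Service object as a plain dictionary, and the function name is hypothetical.

```python
ADDRESS_POOL_ANNOTATION = "metallb.universe.tf/address-pool"


def address_pool_for(service_annotations):
    """Return the pool named in the annotation, or 'default' if it is absent."""
    return service_annotations.get(ADDRESS_POOL_ANNOTATION, "default")


# A service pinned to the PXE pool and a service without the annotation.
print(address_pool_for({ADDRESS_POOL_ANNOTATION: "services-pxe"}))
print(address_pool_for({}))
```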
Select from the following options for configuration of the
dedicatedMetallbPools flag:
Since Container Cloud 2.25.0
Skip this step because the flag is hardcoded to true.
Since Container Cloud 2.24.0
Verify that the flag is set to the default true value.
The flag enables splitting of LB endpoints for the Container
Cloud services. The metallb.universe.tf/address-pool annotations on
the Service objects are configured by the bare metal provider
automatically when the dedicatedMetallbPools flag is set to true.
Example Service object configured by the baremetal-operator Helm
release:
The metallb.universe.tf/address-pool annotation on the Service
object is set to services-pxe by the baremetal provider, so the
ironic-api service will be assigned an LB address from the
corresponding MetalLB address pool.
In addition to the network parameters defined in Deploy a management cluster using CLI,
configure the following ones by replacing them in
templates/bm/ipam-objects.yaml.template:
Address of a management network for the management cluster
in the CIDR notation. You can later share this network with managed
clusters where it will act as the LCM network.
If managed clusters have their separate LCM networks,
those networks must be routable to the management network.
10.0.11.0/24
SET_LCM_RANGE
Address range that includes addresses to be allocated to
bare metal hosts in the management network for the management
cluster. When this network is shared with managed clusters,
the size of this range limits the number of hosts that can be
deployed in all clusters that share this network.
When this network is solely used by a management cluster,
the range should include at least 3 IP addresses
for bare metal hosts of the management cluster.
10.0.11.100-10.0.11.109
SET_METALLB_PXE_ADDR_POOL
Address range to be used for LB endpoints of the Container Cloud
services: Ironic-API, HTTP server, and caching server.
This range must be within the PXE network.
The minimum required range is 5 IP addresses.
10.0.0.61-10.0.0.70
The following parameters will now be tied to the management network
while their meaning remains the same as described in
Deploy a management cluster using CLI:
Subnet template parameters migrated to management network
Parameter
Description
Example value
SET_LB_HOST
IP address of the externally accessible API endpoint
of the management cluster. This address must NOT be
within the SET_METALLB_ADDR_POOL range but within the
management network. External load balancers are not supported.
10.0.11.90
SET_METALLB_ADDR_POOL
The address range to be used for the externally accessible LB
endpoints of the Container Cloud services, such as Keycloak, web UI,
and so on. This range must be within the management network.
The minimum required range is 19 IP addresses.
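The placement rules for SET_LB_HOST and the two MetalLB pools can be checked together. The following Python sketch returns a list of violations for a given layout. The function names are hypothetical, and the MetalLB pool value used in the example (10.0.11.61-10.0.11.80) is an assumption because this section does not provide an example value for SET_METALLB_ADDR_POOL.

```python
import ipaddress


def parse_range(text):
    """Parse 'A-B' into a (first, last) pair of IP address objects."""
    first, last = (ipaddress.ip_address(p.strip()) for p in text.split("-"))
    if last < first:
        raise ValueError(f"range {text!r} ends before it starts")
    return first, last


def check_layout(mgmt_cidr, pxe_cidr, lb_host, metallb_pool, pxe_pool):
    """Return a list of violations; an empty list means none were found."""
    mgmt = ipaddress.ip_network(mgmt_cidr)
    pxe = ipaddress.ip_network(pxe_cidr)
    host = ipaddress.ip_address(lb_host)
    m_first, m_last = parse_range(metallb_pool)
    p_first, p_last = parse_range(pxe_pool)
    problems = []
    if host not in mgmt:
        problems.append("SET_LB_HOST is outside the management network")
    if m_first <= host <= m_last:
        problems.append("SET_LB_HOST is inside SET_METALLB_ADDR_POOL")
    if m_first not in mgmt or m_last not in mgmt:
        problems.append("SET_METALLB_ADDR_POOL is outside the management network")
    if int(m_last) - int(m_first) + 1 < 19:
        problems.append("SET_METALLB_ADDR_POOL has fewer than 19 addresses")
    if p_first not in pxe or p_last not in pxe:
        problems.append("SET_METALLB_PXE_ADDR_POOL is outside the PXE network")
    if int(p_last) - int(p_first) + 1 < 5:
        problems.append("SET_METALLB_PXE_ADDR_POOL has fewer than 5 addresses")
    return problems


print(check_layout(
    mgmt_cidr="10.0.11.0/24",
    pxe_cidr="10.0.0.0/24",
    lb_host="10.0.11.90",
    metallb_pool="10.0.11.61-10.0.11.80",  # assumed value, see the lead-in
    pxe_pool="10.0.0.61-10.0.0.70",
))
```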
Configure multiple DHCP ranges using Subnet resources
To facilitate multi-rack and other types of distributed bare metal datacenter
topologies, the dnsmasq DHCP server used for host provisioning in Container
Cloud supports working with multiple L2 segments through network routers that
support DHCP relay.
Container Cloud has its own DHCP relay running on one of the management
cluster nodes. That DHCP relay serves for proxying DHCP requests in the
same L2 domain where the management cluster nodes are located.
Caution
Networks used for hosts provisioning of a managed cluster
must have routes to the PXE network (when a dedicated PXE network
is configured) or to the combined PXE/management network
of the management cluster. This configuration enables hosts to
have access to the management cluster services that are used
during host provisioning.
Management cluster nodes must have routes through the PXE network
to PXE network segments used on a managed cluster.
The following example contains L2 template fragments for a
management cluster node:
    l3Layout:
      # PXE/static subnet for a management cluster
      - scope: namespace
        subnetName: kaas-mgmt-pxe
        labelSelector:
          kaas-mgmt-pxe-subnet: "1"
      # management (LCM) subnet for a management cluster
      - scope: namespace
        subnetName: kaas-mgmt-lcm
        labelSelector:
          kaas-mgmt-lcm-subnet: "1"
      # PXE/dhcp subnets for a managed cluster
      - scope: namespace
        subnetName: managed-dhcp-rack-1
      - scope: namespace
        subnetName: managed-dhcp-rack-2
      - scope: namespace
        subnetName: managed-dhcp-rack-3
      ...
    npTemplate: |
      ...
      bonds:
        bond0:
          interfaces:
            - {{ nic 0 }}
            - {{ nic 1 }}
          parameters:
            mode: active-backup
            primary: {{ nic 0 }}
            mii-monitor-interval: 100
          dhcp4: false
          dhcp6: false
          addresses:
            # static address on management node in the PXE network
            - {{ ip "bond0:kaas-mgmt-pxe" }}
          routes:
            # routes to managed PXE network segments
            - to: {{ cidr_from_subnet "managed-dhcp-rack-1" }}
              via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
            - to: {{ cidr_from_subnet "managed-dhcp-rack-2" }}
              via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
            - to: {{ cidr_from_subnet "managed-dhcp-rack-3" }}
              via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
            ...
To configure DHCP ranges for dnsmasq, create the Subnet objects
tagged with the ipam/SVC-dhcp-range label while setting up subnets
for a managed cluster using CLI.
Caution
Support of multiple DHCP ranges has the following limitations:
Using custom DNS server addresses for servers that boot over PXE
is not supported.
The Subnet objects for DHCP ranges cannot be associated with any
specific cluster, as DHCP server configuration is only applicable to the
management cluster where DHCP server is running.
The cluster.sigs.k8s.io/cluster-name label will be ignored.
Note
Before the Cluster release 16.1.0, the Subnet object contains
the kaas.mirantis.com/region label that specifies the region
where the DHCP ranges will be applied.
Migration of DHCP configuration for existing management clusters
Note
This section applies only to existing management clusters that
were created before Container Cloud 2.24.0.
Caution
Since Container Cloud 2.24.0, you can only remove the deprecated
dnsmasq.dhcp_range, dnsmasq.dhcp_ranges, dnsmasq.dhcp_routers,
and dnsmasq.dhcp_dns_servers values from the cluster spec.
The Admission Controller does not accept any other changes in these values.
This configuration is completely superseded by the Subnet object.
The DHCP configuration was automatically migrated from the cluster spec to
Subnet objects after the cluster upgrade to Container Cloud 2.21.0.
To remove the deprecated dnsmasq parameters from the cluster spec:
Open the management cluster spec for editing.
In the baremetal-operator release values, remove the
dnsmasq.dhcp_range, dnsmasq.dhcp_ranges, dnsmasq.dhcp_routers,
and dnsmasq.dhcp_dns_servers parameters. For example:
The dnsmasq.dhcp_<name> parameters of the
baremetal-operator Helm chart values in the Cluster spec are
deprecated since the Cluster release 11.5.0 and removed in the
Cluster release 14.0.0.
Ensure that the required DHCP ranges and options are set in the Subnet
objects. For configuration details, see Configure DHCP ranges for dnsmasq.
The dnsmasq configuration options dhcp-option=3 and dhcp-option=6
are absent in the default configuration. So, by default, dnsmasq
will send the DNS server and default route to DHCP clients as defined in the
dnsmasq official documentation:
The netmask and broadcast address are the same as on the host
running dnsmasq.
The DNS server and default route are set to the address of the host
running dnsmasq.
If the domain name option is set, this name is sent to DHCP clients.
Create the Subnet objects tagged with the ipam/SVC-dhcp-range label.
Caution
For cluster-specific subnets, create Subnet objects in the
same namespace as the related Cluster object project. For shared
subnets, create Subnet objects in the default namespace.
Setting of custom nameservers in the DHCP subnet is not supported.
After creation of the above Subnet object, the provided data will be
utilized to render the Dnsmasq object used for configuration of the
dnsmasq deployment. You do not have to manually edit the Dnsmasq object.
Verify that the changes are applied to the Dnsmasq object:
For servers to access the DHCP server across the L2 segment boundaries,
for example, from another rack with a different VLAN for PXE network,
you must configure DHCP relay (agent) service on the border switch
of the segment. For example, on a top-of-rack (ToR) or leaf (distribution)
switch, depending on the data center network topology.
Warning
To ensure predictable routing for the relay of DHCP packets,
Mirantis strongly advises against the use of chained DHCP relay
configurations. This precaution limits the number of hops for DHCP packets,
with an optimal scenario being a single hop.
This approach is justified by the unpredictable nature of chained relay
configurations and potential incompatibilities between software and
hardware relay implementations.
The dnsmasq server listens on the PXE network of the management
cluster by using the dhcp-lb Kubernetes Service.
To configure the DHCP relay service, specify the external address of the
dhcp-lb Kubernetes Service as an upstream address for the relayed DHCP
requests, which is the IP helper address for DHCP. There is the dnsmasq
deployment behind this service that can only accept relayed DHCP requests.
Container Cloud has its own DHCP relay running on one of the management
cluster nodes. That DHCP relay serves for proxying DHCP requests in the
same L2 domain where the management cluster nodes are located.
To obtain the actual IP address issued to the dhcp-lb Kubernetes
Service:
This section describes how to enable the dynamic IP allocation feature
to increase the number of bare metal hosts that can be provisioned in
parallel on managed clusters.
Using this feature, you can effortlessly deploy a large managed cluster by
provisioning up to 100 hosts simultaneously. In addition to dynamic
IP allocation, this feature disables the ping check in the DHCP server.
Therefore, if you plan to deploy large managed clusters, enable this feature
during the management cluster bootstrap.
Consider this section as part of the Bootstrap v2
CLI or web UI procedure.
During creation of a management cluster using Bootstrap v2, you can configure
optional cluster settings using the Container Cloud API by modifying the
Cluster object or cluster.yaml.template of the required provider.
To configure optional cluster settings:
Select from the following options:
If you create a management cluster using the Container Cloud API,
proceed to the next step and configure cluster.yaml.template of
the required provider instead of the Cluster object while following
the below procedure.
If you create a management cluster using the Container Cloud Bootstrap
web UI:
Log in to the seed node where the bootstrap cluster is located.
Navigate to the kaas-bootstrap folder.
Export KUBECONFIG to connect to the bootstrap cluster:
export KUBECONFIG=<pathToKindKubeconfig>
Obtain the cluster name and open its Cluster object for editing:
Technology Preview. Enable custom host names for cluster machines.
When enabled, any machine host name in a particular region matches the related
Machine object name. For example, instead of the default
kaas-node-<UID>, a machine host name will be master-0. The custom
naming format is more convenient and easier to operate with.
To enable the feature on the management cluster and its future managed clusters:
Since 2.26.0
In the Cluster object, find the
spec.providerSpec.value.kaas.regional.helmReleases.name:<provider-name> section.
Under values.config, add customHostnamesEnabled: true.
Boolean, default - false. Enables the auditd role to install the
auditd packages and configure rules. CIS rules: 4.1.1.1, 4.1.1.2.
enabledAtBoot
Boolean, default - false. Configures grub to audit processes that can
be audited even if they start up prior to auditd startup. CIS rule:
4.1.1.3.
backlogLimit
Integer, default - none. Configures the backlog to hold records. If during
boot audit=1 is configured, the backlog holds 64 records. If more than
64 records are created during boot, auditd records are lost and
potentially malicious activity may go undetected. CIS rule: 4.1.1.4.
maxLogFile
Integer, default - none. Configures the maximum size of the audit log file.
Once the log reaches the maximum size, it is rotated and a new log file is
created. CIS rule: 4.1.2.1.
maxLogFileAction
String, default - none. Defines handling of the audit log file reaching the
maximum file size. Allowed values:
keep_logs - rotate logs but never delete them
rotate - add a cron job to compress rotated log files and keep
maximum 5 compressed files.
compress - compress log files and keep them under the
/var/log/auditd/ directory. Requires
auditd_max_log_file_keep to be enabled.
CIS rule: 4.1.2.2.
maxLogFileKeep
Integer, default - 5. Defines the number of compressed log files to keep
under the /var/log/auditd/ directory. Requires
auditd_max_log_file_action=compress. CIS rules - none.
mayHaltSystem
Boolean, default - false. Halts the system when the audit logs are
full. Applies the following configuration:
space_left_action=email
action_mail_acct=root
admin_space_left_action=halt
CIS rule: 4.1.2.3.
customRules
String, default - none. Base64-encoded content of the 60-custom.rules
file for any architecture. CIS rules - none.
customRulesX32
String, default - none. Base64-encoded content of the 60-custom.rules
file for the i386 architecture. CIS rules - none.
customRulesX64
String, default - none. Base64-encoded content of the 60-custom.rules
file for the x86_64 architecture. CIS rules - none.
presetRules
String, default - none. Comma-separated list of the following built-in
preset rules:
access
actions
delete
docker
identity
immutable
logins
mac-policy
modules
mounts
perm-mod
privileged
scope
session
system-locale
time-change
You can use two keywords for these rules:
none - disables all built-in rules.
all - enables all built-in rules. With this key, you can add the
! prefix to a rule name to exclude some rules. You can use the
! prefix for rules only if you add the all keyword as the
first rule. Place a rule with the ! prefix only after
the all keyword.
Example configurations:
presetRules: none - disable all preset rules
presetRules: docker - enable only the docker rules
presetRules: access,actions,logins - enable only the
access, actions, and logins rules
presetRules: all - enable all preset rules
presetRules: all,!immutable,!sessions - enable all preset
rules except immutable and sessions
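The keyword and prefix rules for presetRules can be expressed as a short function. The following Python sketch is a minimal illustration and not the code that the auditd role runs. The function name is hypothetical, the rule names are copied from the list above, and the way malformed values are rejected is an assumption of the sketch because the source does not describe that behavior.

```python
# Built-in preset rule names, in the order listed in this section.
PRESET_RULES = (
    "access", "actions", "delete", "docker", "identity", "immutable",
    "logins", "mac-policy", "modules", "mounts", "perm-mod", "privileged",
    "scope", "session", "system-locale", "time-change",
)


def expand_preset_rules(value):
    """Expand a comma-separated presetRules string into a list of rule names."""
    items = [item.strip() for item in value.split(",") if item.strip()]
    # Assumption: an empty value is treated like 'none'.
    if not items or items == ["none"]:
        return []
    if items[0] == "all":
        excluded = set()
        for item in items[1:]:
            # After 'all', only '!'-prefixed exclusions are accepted here.
            if not item.startswith("!"):
                raise ValueError(f"expected an exclusion after 'all': {item}")
            name = item[1:]
            if name not in PRESET_RULES:
                raise ValueError(f"unknown preset rule: {name}")
            excluded.add(name)
        return [rule for rule in PRESET_RULES if rule not in excluded]
    enabled = []
    for item in items:
        if item.startswith("!"):
            raise ValueError("the '!' prefix requires 'all' as the first rule")
        if item not in PRESET_RULES:
            raise ValueError(f"unknown preset rule or misplaced keyword: {item}")
        enabled.append(item)
    return enabled


if __name__ == "__main__":
    print(expand_preset_rules("access,actions,logins"))
    print(expand_preset_rules("all,!immutable,!session"))
```

The sketch accepts only the names from the list above, so an exclusion must be spelled exactly like the listed rule name.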
Verify that the userFederation section is located
on the same level as the initUsers section.
Verify that all attributes set in the mappers section
are defined for users in the specified LDAP system.
Missing attributes may cause authorization issues.
Disable NTP that is enabled by default. This option disables the
management of chrony configuration by Container Cloud to use your own
system for chrony management. Otherwise, configure the regional NTP server
parameters as described below.
NTP configuration
Configure the regional NTP server parameters to be applied to all machines
of managed clusters.
In the Cluster object, add the ntp:servers section
with the list of required server names:
Applies only to the bare metal provider since the Cluster release 16.1.0.
If you plan to deploy large managed clusters, enable dynamic IP allocation
to increase the number of bare metal hosts to be provisioned in parallel.
For details, see Enable dynamic IP allocation.
Technology Preview. Create all load balancers of the cluster with a specific
Octavia flavor by defining the following parameter in the
spec:providerSpec section of templates/cluster.yaml.template:
This feature is not supported by OpenStack Queens.
Now, proceed with completing the bootstrap process using the Container Cloud
Bootstrap web UI or API depending on the selected provider as described in
Deploy a Container Cloud management cluster.
Now, you can proceed with operating your management cluster through the
Container Cloud web UI and deploying managed clusters as described in
Operations Guide.
If the BootstrapRegion object is in the Error state, find the error
type in the Status field of the object for the following components to
resolve the issue:
Field name
Troubleshooting steps
Helm
If the bootstrap HelmBundle is not ready for a long time, for example,
for 15 minutes with an average network bandwidth, verify
statuses of non-ready releases and resolve the issue depending
on the error message of a particular release:
If the Credentials object is in the Error or Invalid state,
verify whether the provided credentials are valid and adjust them accordingly.
Warning
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying the object.
Therefore, do not use kubectl apply on this object.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on this object, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the object using kubectl edit.
The deployment statuses of a Machine object are the same as the
LCMMachine object states:
Uninitialized - the machine is not yet assigned to an LCMCluster.
Pending - the agent reports a node IP address and host name.
Prepare - the machine executes StateItems that correspond
to the prepare phase. This phase usually involves downloading
the necessary archives and packages.
Deploy - the machine executes StateItems that correspond
to the deploy phase, in which the machine becomes a Mirantis Kubernetes
Engine (MKE) node.
Ready - the machine is being deployed.
Upgrade - the machine is being upgraded to the new MKE version.
Reconfigure - the machine executes StateItems that correspond
to the reconfigure phase. The machine configuration is being updated
without affecting workloads running on the machine.
If the system response is empty, approve the BootstrapRegion object:
Using the Container Cloud web UI, navigate to the
Bootstrap tab and approve the related BootstrapRegion object
Using the Container Cloud CLI:
./container-cloud bootstrap approve all
If the system response is not empty and the status remains the same for a
while, the issue may relate to machine misconfiguration. Therefore, verify
and adjust the parameters of the affected Machine object.
For provider-related issues, refer to the Troubleshooting section.
If the cluster deployment is stuck on the same stage for a long time, it may
be related to configuration issues in the Machine or other deployment
objects.
To troubleshoot cluster deployment:
Identify the current deployment stage that got stuck:
The syslog container collects logs generated by Ansible during the node
deployment and cleanup and outputs them in the JSON format.
Note
Add COLLECT_EXTENDED_LOGS=true before the
collect_logs command to output the extended version of logs
that contains system and MKE logs, logs from LCM Ansible and LCM Agent
along with cluster events and Kubernetes resources description and logs.
Without the --extended flag, the basic version of logs is collected, which
is sufficient for most use cases. The basic version of logs contains all
events, Kubernetes custom resources, and logs from all Container Cloud
components. This version does not require passing --key-file.
The logs are collected in the directory where the bootstrap script
is located.
The Container Cloud logs structure in <output_dir>/<cluster_name>/
is as follows:
/events.log
Human-readable table that contains information about the cluster events.
/system
System logs.
/system/mke (or /system/MachineName/mke)
Mirantis Kubernetes Engine (MKE) logs.
/objects/cluster
Logs of the non-namespaced Kubernetes objects.
/objects/namespaced
Logs of the namespaced Kubernetes objects.
/objects/namespaced/<namespaceName>/core/pods
Logs of the pods from a specific Kubernetes namespace. For example, logs
of the pods from the kaas namespace contain logs of Container Cloud
controllers, including bootstrap-cluster-controller
since Container Cloud 2.25.0.
Logs of the pods from a specific Kubernetes namespace that were previously
removed or failed.
/objects/namespaced/<namespaceName>/core/pods/<ironicPodName>/syslog.log
Technology Preview. Ironic pod logs of the bare metal clusters.
Note
Logs collected by the syslog container during the bootstrap phase
are not transferred to the management cluster during pivoting.
These logs are located in
/volume/log/ironic/ansible_conductor.log inside the Ironic
pod.
Each log entry of the management cluster logs contains a request ID that
identifies chronology of actions performed on a cluster or machine.
The format of the log entry is as follows:
<process ID>.[<subprocess ID>...<subprocess ID N>].req:<requestID>: <logMessage>
For example, os.machine.req:28 contains information about the task 28
applied to an OpenStack machine.
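The log entry format above can be split into its parts with a regular expression. The following Python sketch is a minimal illustration. The function name and the sample message text are hypothetical, and the pattern assumes that identifiers consist of letters, digits, hyphens, and underscores and that the request ID is a decimal number, as in the os.machine.req:28 example.

```python
import re

# <process ID>.[<subprocess ID>...].req:<requestID>: <logMessage>
LOG_PREFIX = re.compile(
    r"^(?P<ids>[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)*)"  # process and subprocess IDs
    r"\.req:(?P<req>\d+)"                             # request ID
    r":?\s*(?P<msg>.*)$"                              # optional message
)


def parse_log_prefix(line):
    """Return the parts of a log entry, or None if the line does not match."""
    match = LOG_PREFIX.match(line)
    if match is None:
        return None
    ids = match.group("ids").split(".")
    return {
        "process": ids[0],
        "subprocesses": ids[1:],
        "request_id": int(match.group("req")),
        "message": match.group("msg"),
    }


if __name__ == "__main__":
    # The message text is a made-up sample.
    print(parse_log_prefix("os.machine.req:28: sample message"))
```

Collecting entries with the same request_id value reconstructs the chronology of actions for one cluster or machine.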
Since Container Cloud 2.22.0, the logging format has the following extended
structure for the admission-controller, storage-discovery, and all
supported <providerName>-provider services of a management cluster:
Informational level. Possible values: debug, info, warn,
error, panic.
ts
Time stamp in the <YYYY-MM-DDTHH:mm:ssZ> format. For example:
2022-11-14T21:37:23Z.
logger
Details on the process ID being logged:
<processID>
Primary process identifier. The list of possible values includes
bm, os, iam, license, and bootstrap.
Note
The iam and license values are available since
Container Cloud 2.23.0. The bootstrap value is available since
Container Cloud 2.25.0.
<subProcessID(s)>
One or more secondary process identifiers. The list of possible values
includes cluster, machine, controller, and cluster-ctrl.
Note
The controller value is available since Container Cloud
2.23.0. The cluster-ctrl value is available since Container Cloud
2.25.0 for the bootstrap process identifier.
req
Request ID number that increases when a service performs the following
actions:
Receives a request from Kubernetes about creating, updating,
or deleting an object
Receives an HTTP request
Runs a background process
The request ID allows combining all operations performed with an object
within one request. For example, the result of a Machine object
creation, update of its statuses, and so on has the same request ID.
caller
Code line used to apply the corresponding action to an object.
msg
Description of a deployment or update phase. If empty, it contains the
"error" key with a message followed by the "stacktrace" key with
stack trace details. For example:
"msg"="" "error"="Cluster nodes are not yet ready" "stacktrace": "<stack-trace-info>"
The log format of the following Container Cloud components does
not contain the "stacktrace" key for easier log handling:
baremetal-provider, bootstrap-provider, and
host-os-modules-controller.
Note
Logs may also include a number of informational key-value pairs
containing additional cluster details. For example,
"name":"object-name","foobar":"baz".
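Because all operations performed within one request share the same req value, grouping records by that key gives a per-request view of the log. The following Python sketch is an illustration under a loud assumption: it expects every record on its own line as a JSON object with the keys described above (level, ts, logger, req, caller, msg). The source does not confirm that exact serialization, and the function name and sample records are hypothetical.

```python
import json
from collections import defaultdict


def group_by_request(lines):
    """Group JSON log records by their 'req' value, skipping unparsable lines."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Assumption: lines that are not JSON are simply ignored.
            continue
        if isinstance(record, dict) and "req" in record:
            groups[record["req"]].append(record)
    return dict(groups)


if __name__ == "__main__":
    # Made-up records that follow the field list in this section.
    sample = [
        '{"level":"info","ts":"2022-11-14T21:37:23Z","logger":"bm.machine","req":28,"msg":"first"}',
        '{"level":"error","ts":"2022-11-14T21:37:24Z","logger":"bm.machine","req":28,"msg":""}',
        '{"level":"info","ts":"2022-11-14T21:37:25Z","logger":"bm.cluster","req":29,"msg":"other"}',
    ]
    for request_id, records in group_by_request(sample).items():
        print(request_id, [record["level"] for record in records])
```

If the actual log lines use a different serialization, only the parsing step inside the loop needs to change.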
Depending on the type of issue found in logs, apply the corresponding fixes.
For example, if you detect the LoadBalancer ERROR state errors
during the bootstrap of an OpenStack-based management cluster,
contact your system administrator to fix the issue.
For MOSK, the feature is generally available since
MOSK 23.1.
While bootstrapping a Container Cloud management cluster using a proxy, you may
require Internet access to go through a man-in-the-middle (MITM) proxy. Such
configuration requires that you enable streaming and install a CA certificate
on a bootstrap node.
Replace ~/.mitmproxy/mitmproxy-ca-cert.cer with the path to your CA
certificate.
Caution
The target CA certificate file must be in the PEM format
with the .crt extension.
Apply the changes:
sudo update-ca-certificates
Now, proceed with bootstrapping your management cluster.
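The Caution above requires a PEM-formatted certificate file with the .crt extension. The following Python sketch shows a superficial pre-check of both conditions. The function names and the my-ca.crt file name are hypothetical, and the check only looks for the PEM certificate markers. It does not verify that the certificate itself is valid.

```python
from pathlib import PurePosixPath

PEM_BEGIN = "-----BEGIN CERTIFICATE-----"
PEM_END = "-----END CERTIFICATE-----"


def has_crt_extension(path):
    """True if the file name ends with the .crt extension."""
    return PurePosixPath(path).suffix == ".crt"


def is_pem_certificate(text):
    """True if the text contains a PEM certificate block (markers only)."""
    begin = text.find(PEM_BEGIN)
    end = text.find(PEM_END)
    return begin != -1 and end > begin


if __name__ == "__main__":
    # Hypothetical file name; the .cer path is the default one named above.
    print(has_crt_extension("my-ca.crt"))
    print(has_crt_extension("~/.mitmproxy/mitmproxy-ca-cert.cer"))
```

To check a real file, read its content as text and pass it to is_pem_certificate. A DER-encoded file is binary and fails this check.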
Create initial users after a management cluster bootstrap
Once you bootstrap your management cluster, create Keycloak users for access
to the Container Cloud web UI. Use the created credentials to log in to the
Container Cloud web UI.
Mirantis recommends creating at least two users, user and operator,
that are required for a typical Container Cloud deployment.
To create the user for access to the Container Cloud web UI, use:
Required. Comma-separated list of roles to assign to the user.
If you run the command without the --namespace flag,
you can assign the following roles:
global-admin - read and write access for global role bindings
writer - read and write access
reader - view access
operator - create and manage access to the BaremetalHost
objects (required for bare metal clusters only)
management-admin - full access to the management cluster,
available since Container Cloud 2.25.0 (Cluster releases
17.0.0, 16.0.0, 14.1.0)
If you run the command for a specific project using the
--namespace flag, you can assign the following roles:
operator or writer - read and write access
user or reader - view access
member - read and write access (excluding IAM objects)
bm-pool-operator - create and manage access to the
BaremetalHost objects (required for bare metal clusters only)
--kubeconfig
Required. Path to the management cluster kubeconfig generated during
the management cluster bootstrap.
--namespace
Optional. Name of the Container Cloud project where the user will be
created. If not set, a global user will be created for all Container
Cloud projects with the corresponding role access to view or manage
all Container Cloud public objects.
--password-stdin
Optional. Flag to provide the user password through stdin:
For clusters deployed using the Container Cloud release earlier than 2.11.0
or if you deleted the kaas-bootstrap folder, download and run
the Container Cloud bootstrap script:
Add COLLECT_EXTENDED_LOGS=true before the command to output the
extended version of logs that contains system and MKE logs, logs from
LCM Ansible and LCM Agent along with cluster events and Kubernetes
resources description and logs.
Without the --extended flag, the basic version of logs is collected, which
is sufficient for most use cases. The basic version of logs contains all
events, Kubernetes custom resources, and logs from all Container Cloud
components. This version does not require passing --key-file.
The logs are collected in the directory where the bootstrap script
is located.
Technology Preview. For bare metal clusters, assess the Ironic pod logs:
Extract the content of the 'message' fields from every log message:
The issue may occur because the default Docker network address
172.17.0.0/16 and/or the kind Docker network, which is used by
kind, overlap with your cloud address or other addresses
of the network configuration.
Workaround:
Log in to your local machine.
Verify routing to the IP addresses of the target cloud endpoints:
Obtain the IP address of your target cloud. For example:
If the routing is incorrect, change the IP address
of the default Docker bridge:
Create or edit /etc/docker/daemon.json by adding the "bip"
option:
{"bip":"192.168.91.1/24"}
Restart the Docker daemon:
sudo systemctl restart docker
If required, customize addresses for your kind Docker network
or any other additional Docker networks:
Remove the kind network:
docker network rm 'kind'
Choose from the following options:
Configure /etc/docker/daemon.json:
Note
The following steps customize addresses for the kind Docker
network. Use these steps as an example for any other additional
Docker networks.
Add the following section to /etc/docker/daemon.json:
Docker pruning removes user-defined networks,
including 'kind'. Therefore,
after every run of the Docker pruning commands,
re-create the 'kind' network
using the command above.
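The address overlap that causes this issue can be detected before changing the Docker configuration. The following Python sketch is a minimal illustration. The function name and the 172.17.20.5 cloud endpoint address are assumptions made for the example, while 172.17.0.0/16 and 192.168.91.1/24 are the values mentioned in this section.

```python
import ipaddress


def find_overlaps(docker_cidrs, cloud_addresses):
    """Return (Docker network, cloud address) pairs that collide."""
    hits = []
    for cidr in docker_cidrs:
        # strict=False accepts a host address such as the "bip" value
        # and reduces it to the network it belongs to.
        network = ipaddress.ip_network(cidr, strict=False)
        for address in cloud_addresses:
            if ipaddress.ip_address(address) in network:
                hits.append((str(network), address))
    return hits


if __name__ == "__main__":
    cloud = ["172.17.20.5"]  # hypothetical cloud endpoint address
    print(find_overlaps(["172.17.0.0/16"], cloud))    # default Docker bridge
    print(find_overlaps(["192.168.91.1/24"], cloud))  # bridge after the "bip" change
```

An empty result for every Docker network on the host, including 'kind', suggests that routing to the cloud endpoints is not shadowed by a local bridge.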
This section provides solutions to the issues that may occur while deploying
an OpenStack-based management cluster. To troubleshoot a managed cluster, see
Operations Guide: Troubleshooting.
If you execute the bootstrap.sh script from an OpenStack VM
that is running on the OpenStack environment used for bootstrapping
the management cluster, the following error messages may occur
that can be related to the MTU settings discrepancy:
If the MTU output values differ for docker0 and ens3, proceed
with the workaround below. Otherwise, inspect the logs further
to identify the root cause of the error messages.
Workaround:
In your OpenStack environment used for Mirantis Container
Cloud, log in to any machine with CLI access to OpenStack.
For example, you can create a new Ubuntu VM (separate from the bootstrap VM)
and install the python-openstackclient package on it.
Change the vXLAN MTU size for the VM to the required value
depending on your network infrastructure and considering your
physical network configuration, such as Jumbo frames, and so on.
This section describes how to configure authentication for Mirantis
Container Cloud depending on the external identity provider type
integrated into your deployment.
If you integrate LDAP for IAM with Mirantis Container Cloud,
add the required LDAP configuration to cluster.yaml.template
during the bootstrap of the management cluster.
Note
The example below defines the recommended non-anonymous
authentication type. If you require anonymous authentication,
replace the following parameters with authType: "none":
    authType: "simple"
    bindCredential: ""
    bindDn: ""
To configure LDAP for IAM:
Open cluster.yaml.template stored in the following locations depending
on the cloud provider type:
Bare metal: templates/bm/cluster.yaml.template
OpenStack: templates/cluster.yaml.template
vSphere: templates/vsphere/cluster.yaml.template
Configure the keycloak:userFederation:providers:
and keycloak:userFederation:mappers: sections as required:
Verify that the userFederation section is located
on the same level as the initUsers section.
Verify that all attributes set in the mappers section
are defined for users in the specified LDAP system.
Missing attributes may cause authorization issues.
Now, return to the bootstrap instruction depending on the provider type
of your management cluster.
The instruction below applies to the DNS-based management
clusters. If you bootstrap a non-DNS-based management cluster,
configure Google OAuth IdP for Keycloak after bootstrap using the
official Keycloak documentation.
If you integrate the Google OAuth external identity provider for IAM with
Mirantis Container Cloud, create the authorization credentials for IAM
in your Google OAuth account and configure cluster.yaml.template
during the bootstrap of the management cluster.
In the APIs Credentials menu, select
OAuth client ID.
In the window that opens:
In the Application type menu, select
Web application.
In the Authorized redirect URIs field, type in
<keycloak-url>/auth/realms/iam/broker/google/endpoint,
where <keycloak-url> is the corresponding DNS address.
Press Enter to add the URI.
Click Create.
A page with your client ID and client secret opens. Save these
credentials for further usage.
Log in to the bootstrap node.
Open cluster.yaml.template stored in the following locations depending
on the cloud provider type:
Bare metal: templates/bm/cluster.yaml.template
OpenStack: templates/cluster.yaml.template
vSphere: templates/vsphere/cluster.yaml.template
In the keycloak:externalIdP: section, add the following snippet
with your credentials created in previous steps:
The Mirantis Container Cloud APIs are implemented using the Kubernetes
CustomResourceDefinitions (CRDs) that enable you to expand the Kubernetes API.
For details, see API Reference.
You can operate Container Cloud using the kubectl
command-line tool that is based on the Kubernetes API.
For the kubectl reference, see the official
Kubernetes documentation.
The Container Cloud Operations Guide mostly contains manuals that describe
the Container Cloud web UI that is intuitive and easy to get started with.
Some sections are divided into a web UI instruction and an analogous
but more advanced CLI one.
Certain Container Cloud operations can be performed only using CLI
with the corresponding steps described in dedicated sections.
For details, refer to the required component section of this guide.
This tutorial applies only to the Container Cloud web UI users
with the m:kaas:namespace@operator or m:kaas:namespace@writer
access role assigned by the Infrastructure Operator.
To add a bare metal host, the m:kaas@operator or
m:kaas:namespace@bm-pool-operator role is required.
After you deploy the Mirantis Container Cloud management cluster,
you can start creating managed clusters that will be based on the same cloud
provider type that you have for the management cluster: OpenStack, bare metal,
or vSphere.
Caution
Since Container Cloud 2.27.3 (Cluster release 16.2.3), support
for vSphere-based clusters is suspended. For details, see
Deprecation notes.
The deployment procedure is performed using the Container Cloud web UI
and comprises the following steps:
Create a dedicated non-default project for managed clusters.
For a baremetal-based managed cluster, create and configure bare metal
hosts with corresponding labels for machines such as worker,
manager, or storage.
Create an initial cluster configuration depending on the provider type.
Add the required number of machines with the corresponding configuration
to the managed cluster.
For a baremetal-based managed cluster, add a Ceph cluster.
Note
The Container Cloud web UI communicates with Keycloak
to authenticate users. Keycloak is exposed using HTTPS with
self-signed TLS certificates that are not trusted by web browsers.
The procedure below applies only to the Container Cloud web UI
users with the m:kaas@global-admin or m:kaas@writer access role
assigned by the Infrastructure Operator.
The default project (Kubernetes namespace) in Container Cloud is dedicated
for management clusters only. Managed clusters require a separate project.
You can create as many projects as required by your company infrastructure.
To create a project for managed clusters using the Container Cloud web UI:
Log in to the Container Cloud web UI as m:kaas@global-admin or
m:kaas@writer.
In the Projects tab, click Create.
Type the new project name.
Click Create.
Generate a kubeconfig for a managed cluster using API
This section describes how to generate a managed cluster kubeconfig using
the Container Cloud API. You can also download a managed cluster kubeconfig
using the Download Kubeconfig option in the Container Cloud web
UI. For details, see Connect to a Mirantis Container Cloud cluster.
To generate a managed cluster kubeconfig using API:
The kubeconfig of your <username> that you can download through
the Container Cloud web UI using Download Kubeconfig located
under your <username> on the top-left of the page.
Obtain the <cluster> object of the <cluster_name> managed cluster:
Generate the managed cluster kubeconfig using the data from
<cluster.status> and <token> obtained in the previous steps.
Use the following template as an example:
Create and operate a baremetal-based managed cluster
After bootstrapping your baremetal-based Mirantis Container Cloud
management cluster as described in Deploy a Container Cloud management cluster,
you can start creating the baremetal-based managed clusters.
Before creating a bare metal managed cluster, add the required number
of bare metal hosts either using the Container Cloud web UI for a default
configuration or using CLI for an advanced configuration.
Optional. Available since Container Cloud 2.24.0. In the
Credentials tab, click Add Credential and add the
IPMI user name and password of the bare metal host to access the Baseboard
Management Controller (BMC).
Select one of the following options:
Since 2.26.0 (17.1.0 and 16.1.0)
In the Baremetal tab, click Create Host.
Fill out the Create baremetal host form as required:
Name
Specify the name of the new bare metal host.
Boot Mode
Specify the BIOS boot mode. Available options: Legacy,
UEFI, or UEFISecureBoot.
MAC Address
Specify the MAC address of the PXE network interface.
Baseboard Management Controller (BMC)
Specify the following BMC details:
IP Address
Specify the IP address to access the BMC.
Credential Name
Specify the name of the previously added bare metal host
credentials to associate with the current host.
Cert Validation
Enable validation of the BMC API certificate. Applies only to the
redfish+http BMC protocol. Disabled by default.
Power off host after creation
Experimental. Select to power off the bare metal host after
creation.
Caution
This option is experimental and intended only for
testing and evaluation purposes. Do not use it for
production deployments.
Before 2.26.0 (17.1.0 and 16.1.0)
In the Baremetal tab, click Add BM host.
Fill out the Add new BM host form as required:
Baremetal host name
Specify the name of the new bare metal host.
Provider Credential
Optional. Available since Container Cloud 2.24.0. Specify the name
of the previously added bare metal host credentials to associate
with the current host.
Add New Credential
Optional. Available since Container Cloud 2.24.0. Applies if you
did not add bare metal host credentials using the
Credentials tab. Add the bare metal host credentials:
Username
Specify the name of the IPMI user to access the BMC.
Password
Specify the IPMI password of the user to access the BMC.
Boot MAC address
Specify the MAC address of the PXE network interface.
IP Address
Specify the IP address to access the BMC.
Label
Assign the machine label to the new host that defines which type of
machine may be deployed on this bare metal host. Only one label can
be assigned to a host. The supported labels include:
Manager
This label is selected and set by default.
Assign this label to the bare metal hosts that can be used
to deploy machines with the manager type. These hosts
must match the CPU and RAM requirements described in
Reference hardware configuration.
Worker
The host with this label may be used to deploy
the worker machine type. Assign this label to the bare metal
hosts that have sufficient CPU and RAM resources, as described in
Reference hardware configuration.
Storage
Assign this label to the bare metal hosts that have sufficient
storage devices to match Reference hardware configuration.
Hosts with this label will be used to deploy machines
with the storage type that run Ceph OSDs.
Click Create.
While adding the bare metal host, Container Cloud discovers and inspects
the hardware of the bare metal host and adds it to BareMetalHost.status
for future reference.
During provisioning, baremetal-operator inspects the bare metal host
and moves it to the Preparing state. The host becomes ready to be linked
to a bare metal machine.
Verify the results of the hardware inspection to avoid unexpected errors
during the host usage:
Select one of the following options:
Since 2.26.0 (17.1.0 and 16.1.0)
In the left sidebar, click Baremetal. The Hosts
page opens.
Before 2.26.0 (17.1.0 and 16.1.0)
In the left sidebar, click BM Hosts.
Verify that the bare metal host is registered and switched to one of the
following statuses:
Preparing for a newly added host
Ready for a previously used host or for a host that is
already linked to a machine
Select one of the following options:
Since 2.26.0 (17.1.0 and 16.1.0)
On the Hosts page, click the host kebab menu and select
Host info.
Before 2.26.0 (17.1.0 and 16.1.0)
On the BM Hosts page, click the name of the newly added
bare metal host.
In the window with the host details, scroll down to the
Hardware section.
Review the section and make sure that the number and models
of disks, network interface cards, and CPUs match the hardware
specification of the server.
If the hardware details are consistent with the physical server
specifications for all your hosts, proceed to
Add a managed baremetal cluster.
If you find any discrepancies in the hardware inspection results,
it might indicate that the server has hardware issues or
is not compatible with Container Cloud.
In the metadata section, add a unique credentials name and the
name of the non-default project (namespace) dedicated for the
managed cluster being created.
In the spec section, add the IPMI user name and password in plain
text to access the Baseboard Management Controller (BMC). The password
will not be stored in the BareMetalHostCredential object but will
be erased and saved in an underlying Secret object.
Caution
Each bare metal host must have a unique
BareMetalHostCredential.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Before Cluster releases 12.5.0 and 11.5.0
Create a secret YAML file that describes the unique credentials of the
new bare metal host.
In the data section, add the IPMI user name and password in the
base64 encoding to access the BMC. To obtain the base64-encoded
credentials, you can use the following command in your Linux console:
echo -n <username|password> | base64
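The encoding step can be sketched as follows. This is a minimal illustration, assuming a `base64` utility that supports `--decode`; the helper name `b64_encode` and the sample values `admin` and `s3cret` are hypothetical placeholders, not values from this guide. The sketch uses `printf '%s'` instead of `echo -n` because `printf` never appends a newline, whichever shell runs it.

```shell
#!/usr/bin/env bash
# Encode a value to base64 without a trailing newline in the input.
# b64_encode is a hypothetical helper name used only for illustration.
b64_encode() {
  # tr removes line wrapping that some base64 builds add for long input.
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Placeholder credentials; replace with the real IPMI user name and password.
ipmi_user_b64="$(b64_encode 'admin')"
ipmi_pass_b64="$(b64_encode 's3cret')"

echo "username: ${ipmi_user_b64}"
echo "password: ${ipmi_pass_b64}"

# Round-trip check: decoding must return the original value.
decoded_user="$(printf '%s' "${ipmi_user_b64}" | base64 --decode)"
echo "decoded username: ${decoded_user}"
```

A stray newline in the encoded value is a common cause of rejected BMC credentials, so the round-trip check is worth keeping while preparing the file.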
Caution
Each bare metal host must have a unique Secret.
In the metadata section, add the unique name of credentials and
the name of the non-default project (namespace) dedicated for
the managed cluster being created. To create a project, refer to
Create a project for managed clusters.
Apply the created YAML file with credentials to your deployment:
Warning
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying the object.
Therefore, do not use kubectl apply on this object.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on this object, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the object using kubectl edit.
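The warning above can be sketched as two small shell helpers. This is an illustration under stated assumptions, not the documented procedure: the function names, the project name, and the file name are hypothetical. The second helper uses `kubectl annotate` with a trailing `-` after the annotation key, which removes the annotation non-interactively; it is offered as an alternative to the `kubectl edit` approach mentioned above.

```shell
#!/usr/bin/env bash
# Create the object without recording its content in an annotation.
# create_credentials is a hypothetical helper; arguments are the project
# (namespace) and the path to the credentials YAML file.
create_credentials() {
  local project="$1" file="$2"
  kubectl create -n "${project}" -f "${file}"
}

# Remove the last-applied-configuration annotation if kubectl apply was
# already used. The trailing "-" after the key tells kubectl annotate
# to delete that annotation.
strip_last_applied() {
  local project="$1" kind="$2" name="$3"
  kubectl annotate -n "${project}" "${kind}" "${name}" \
    kubectl.kubernetes.io/last-applied-configuration-
}
```

For example, `create_credentials <project> <file>.yaml` creates the object, and `strip_last_applied <project> secret <name>` cleans up an object that was previously applied.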
If you have a limited number of free and unused IP addresses
for server provisioning, you can add the
baremetalhost.metal3.io/detached annotation that pauses automatic
host management to manually allocate an IP address for the host. For
details, see Manually allocate IP addresses for bare metal hosts.
During provisioning, baremetal-operator inspects the bare metal host
and moves it to the Preparing state. The host becomes ready to be linked
to a bare metal machine.
During provisioning, the status changes as follows:
registering
inspecting
preparing
After BareMetalHost switches to the preparing stage, the
inspecting phase finishes and you can verify hardware information
available in the object status. For example:
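One way to read this information from the command line is sketched below. The helper names are hypothetical, and the field paths `.status.provisioning.state` and `.status.hardware` are assumptions based on the upstream Metal3 BareMetalHost schema; verify them against the resource definition deployed in your management cluster.

```shell
#!/usr/bin/env bash
# Print the provisioning state of a BareMetalHost, for example "preparing".
# Assumption: the state is exposed at .status.provisioning.state.
bmh_state() {
  local project="$1" host="$2"
  kubectl get baremetalhost -n "${project}" "${host}" \
    -o jsonpath='{.status.provisioning.state}'
}

# Print the hardware details collected during inspection.
# Assumption: the details are exposed at .status.hardware.
bmh_hardware() {
  local project="$1" host="$2"
  kubectl get baremetalhost -n "${project}" "${host}" \
    -o jsonpath='{.status.hardware}'
}
```

Calling `bmh_state <project> <host>` before `bmh_hardware <project> <host>` avoids reading hardware data while the inspection is still in progress.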
The bare metal host profile is a Kubernetes custom resource.
It allows the operator to define how the storage devices
and the operating system are provisioned and configured.
This section describes the bare metal host profile default settings
and configuration of custom profiles for managed clusters
using Mirantis Container Cloud API. This procedure also applies
to a management cluster with a few differences described in
Customize the default bare metal host profile.
Note
You can view the created profiles in the
BM Host Profiles tab of the Container Cloud web UI.
Note
Using BareMetalHostProfile, you can configure LVM or mdadm-based
software RAID support during a management or managed cluster
creation. For details, see Configure RAID support.
This feature is available as Technology Preview. Use such
configuration for testing and evaluation purposes only. For the
Technology Preview feature definition, refer to Technology Preview features.
The default host profile requires three storage devices in the following
strict order:
Boot device and operating system storage
This device contains boot data and operating system data. It
is partitioned using the GUID Partition Table (GPT) labels.
The root file system is an ext4 file system
created on top of an LVM logical volume.
For a detailed layout, refer to the table below.
Local volumes device
This device contains an ext4 file system with directories mounted
as persistent volumes to Kubernetes. These volumes are used by
the Mirantis Container Cloud services to store their data,
including monitoring and identity databases.
Ceph storage device
This device is used as a Ceph datastore or Ceph OSD on managed clusters.
Warning
Any data stored on any device defined in the fileSystems
list can be deleted or corrupted during cluster (re)deployment. It happens
because each device from the fileSystems list is a part of the
rootfs directory tree that is overwritten during (re)deployment.
Examples of affected devices include:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
Neither the wipe field (deprecated) nor the wipeDevice structure (recommended
since Container Cloud 2.26.0) has any effect in this case, so neither can
protect data on these devices.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
The following table summarizes the default configuration of the host system
storage set up by the Container Cloud bare metal management.
Default configuration of the bare metal host storage¶
Device/partition | Name/Mount point | Recommended size | Description
/dev/sda1 | bios_grub | 4 MiB | The mandatory GRUB boot partition required for non-UEFI systems.
/dev/sda2 | UEFI -> /boot/efi | 0.2 GiB | The boot partition required for the UEFI boot mode.
/dev/sda3 | config-2 | 64 MiB | The mandatory partition for the cloud-init configuration. Used during the first host boot for initial configuration.
/dev/sda4 | lvm_root_part | 100% of the remaining free space in the LVM volume group | The main LVM physical volume that is used to create the root file system.
/dev/sdb | lvm_lvp_part -> /mnt/local-volumes | 100% of the remaining free space in the LVM volume group | The LVM physical volume that is used to create the file system for LocalVolumeProvisioner.
/dev/sdc | - | 100% of the remaining free space in the LVM volume group | Clean raw disk that is used for the Ceph storage backend on managed clusters.
If required, you can customize the default host storage configuration.
For details, see Create a custom host profile.
Before deploying a cluster, you may need to erase existing data from hardware
devices to be used for deployment. You can either erase an existing partition
or remove all existing partitions from a physical device. For this purpose,
use the wipeDevice structure that configures cleanup behavior during
configuration of a custom bare metal host profile described in
Create a custom host profile.
The wipeDevice structure contains the following options:
When you enable the eraseMetadata option, which is disabled by default,
the Ansible provisioner attempts to clean up the existing metadata from
the target device. Examples of metadata include:
Existing file system
Logical Volume Manager (LVM) or Redundant Array of Independent Disks (RAID)
configuration
The behavior of metadata erasure varies depending on the target device:
If a device is part of other logical devices, for example, a partition,
logical volume, or MD RAID volume, such logical device is disassembled and
its file system metadata is erased. On the final erasure step,
the file system metadata of the target device is erased as well.
If a device is a physical disk, then all its nested partitions along with
their nested logical devices, if any, are erased and disassembled.
On the final erasure step, all partitions and metadata of the target device
are removed.
Caution
None of the eraseMetadata actions include overwriting the
target device with data patterns. For this purpose, use the eraseDevice
option as described in Erase a device.
To enable the eraseMetadata option, use the wipeDevice field in the
spec:devices section of the BareMetalHostProfile object. For a
detailed description of the option, see API Reference:
BareMetalHostProfile.
If you require not only disassembling existing logical volumes but also
removing all data ever written to the target device, configure the
eraseDevice option, which is disabled by default. This option is not
applicable to partitions, LVM, or MD RAID logical volumes because such volumes
may use caching that prevents a physical device from being erased properly.
Important
The eraseDevice option does not replace a secure erase.
To configure the eraseDevice option, use the wipeDevice field in the
spec:devices section of the BareMetalHostProfile object. For a
detailed description of the option, see API Reference:
BareMetalHostProfile.
In addition to the default BareMetalHostProfile object installed
with Mirantis Container Cloud, you can create custom profiles
for managed clusters using Container Cloud API.
Note
The procedure below also applies to the Container Cloud
management clusters.
Warning
Any data stored on any device defined in the fileSystems
list can be deleted or corrupted during cluster (re)deployment. It happens
because each device from the fileSystems list is a part of the
rootfs directory tree that is overwritten during (re)deployment.
Examples of affected devices include:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
Neither the wipe field (deprecated) nor the wipeDevice structure (recommended
since Container Cloud 2.26.0) has any effect in this case, so neither can
protect data on these devices.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
To create a custom bare metal host profile:
Select from the following options:
For a management cluster, log in to the bare metal seed node that will be
used to bootstrap the management cluster.
For a managed cluster, log in to the local machine where your management
cluster kubeconfig is located and where kubectl is installed.
Note
The management cluster kubeconfig is created automatically
during the last stage of the management cluster bootstrap.
Select from the following options:
For a management cluster, open
templates/bm/baremetalhostprofiles.yaml.template for editing.
For a managed cluster, create a new bare metal host profile
under the templates/bm/ directory.
Edit the host profile using the example template below to meet
your hardware configuration requirements:
Example template of a bare metal host profile
apiVersion: metal3.io/v1alpha1
kind: BareMetalHostProfile
metadata:
  name: <profileName>
  namespace: <ManagedClusterProjectName>
  # Add the name of the non-default project for the managed cluster
  # being created.
spec:
  devices:
  # From the HW node, obtain the first device, which size is at least 120Gib.
  - device:
      minSize: 120Gi
      wipe: true
    partitions:
    - name: bios_grub
      partflags:
      - bios_grub
      size: 4Mi
      wipe: true
    - name: uefi
      partflags:
      - esp
      size: 200Mi
      wipe: true
    - name: config-2
      size: 64Mi
      wipe: true
    - name: lvm_root_part
      size: 0
      wipe: true
  # From the HW node, obtain the second device, which size is at least 120Gib.
  # If a device exists but does not fit the size,
  # the BareMetalHostProfile will not be applied to the node.
  - device:
      minSize: 120Gi
      wipe: true
  # From the HW node, obtain the disk device with the exact device path.
  - device:
      byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:1
      minSize: 120Gi
      wipe: true
    partitions:
    - name: lvm_lvp_part
      size: 0
      wipe: true
  # Example of wiping a device w\o partitioning it.
  # Mandatory for the case when a disk is supposed to be used for Ceph backend
  # later.
  - device:
      byPath: /dev/disk/by-path/pci-0000:00:05.0-scsi-0:0:0:2
      wipe: true
  fileSystems:
  - fileSystem: vfat
    partition: config-2
  - fileSystem: vfat
    mountPoint: /boot/efi
    partition: uefi
  - fileSystem: ext4
    logicalVolume: root
    mountPoint: /
  - fileSystem: ext4
    logicalVolume: lvp
    mountPoint: /mnt/local-volumes/
  logicalVolumes:
  - name: root
    size: 0
    vg: lvm_root
  - name: lvp
    size: 0
    vg: lvm_lvp
  postDeployScript: |
    #!/bin/bash -ex
    echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
  preDeployScript: |
    #!/bin/bash -ex
    echo $(date) 'pre_deploy_script done' >> /root/pre_deploy_done
  volumeGroups:
  - devices:
    - partition: lvm_root_part
    name: lvm_root
  - devices:
    - partition: lvm_lvp_part
    name: lvm_lvp
  grubConfig:
    defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=20
  kernelParameters:
    sysctl:
      # For the list of options prohibited to change, refer to
      # https://docs.mirantis.com/mke/3.7/install/predeployment/set-up-kernel-default-protections.html
      kernel.dmesg_restrict: "1"
      kernel.core_uses_pid: "1"
      fs.file-max: "9223372036854775807"
      fs.aio-max-nr: "1048576"
      fs.inotify.max_user_instances: "4096"
      vm.max_map_count: "262144"
Optional. Configure wiping of the target device or partition to be used
for cluster deployment as described in Wipe a device or partition.
Optional. Configure multiple devices for LVM volume using the example
template extract below for reference.
Caution
The following template extract contains only sections relevant
to LVM configuration with multiple PVs.
Expand the main template described in the previous step
with the configuration below if required.
Optional. Technology Preview. Configure support of the Redundant Array of
Independent Disks (RAID), which allows, for example, installing a cluster
operating system on a RAID device. For details, refer to Configure RAID support.
Optional. Configure the RX/TX buffer size for physical network interfaces
and txqueuelen for any network interfaces.
This configuration can greatly benefit high-load and high-performance
network interfaces. You can configure these parameters using the
udev rules. For example:
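A minimal sketch of such rules, written as a shell script that could run on the host, for instance from a post-deployment script. Everything specific here is an assumption for illustration: the rule file names, the interface name patterns, the `/sbin/ethtool` path, and the values `4096` and `10000`. The script writes to a scratch directory by default so that it can be reviewed safely; on a real host, the target would be `/etc/udev/rules.d`.

```shell
#!/usr/bin/env bash
set -eu

# Directory for the generated rules. Defaults to a scratch directory;
# set RULES_DIR=/etc/udev/rules.d to write the rules on a real host.
RULES_DIR="${RULES_DIR:-$(mktemp -d)}"

# Illustrative values only; tune them for the actual NIC and workload.
RING_SIZE=4096        # RX/TX ring buffer size passed to ethtool -G
TX_QUEUE_LEN=10000    # queue length written to the tx_queue_len attribute

# Rule 1: set RX/TX ring sizes on physical interfaces
# (assumed here to be named eth* or en*).
cat > "${RULES_DIR}/59-net.ring.rules" <<EOF
ACTION=="add|change", SUBSYSTEM=="net", KERNEL=="eth*|en*", RUN+="/sbin/ethtool -G \$name rx ${RING_SIZE} tx ${RING_SIZE}"
EOF

# Rule 2: set txqueuelen on physical and bond interfaces.
cat > "${RULES_DIR}/58-net.txqueue.rules" <<EOF
ACTION=="add|change", SUBSYSTEM=="net", KERNEL=="eth*|en*|bond*", ATTR{tx_queue_len}="${TX_QUEUE_LEN}"
EOF

echo "udev rules written to ${RULES_DIR}"
```

The supported ring sizes depend on the network adapter, so check the hardware limits with `ethtool -g <interface>` before choosing a value.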
Add or edit the mandatory parameters in the new BareMetalHostProfile
object. For the parameters description, see
API: BareMetalHostProfile spec.
Note
If asymmetric traffic is expected on some of the managed cluster
nodes, enable the loose mode for the corresponding interfaces on those
nodes by setting the net.ipv4.conf.<interface-name>.rp_filter
parameter to "2" in the kernelParameters.sysctl section.
For example:
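The entries for the kernelParameters.sysctl section can be generated with a short helper, sketched below. The helper name `rp_filter_entries` and the interface names are hypothetical placeholders. In the Linux kernel, the `rp_filter` value `2` selects loose reverse-path filtering.

```shell
#!/usr/bin/env bash
# Print one sysctl entry per interface, enabling loose reverse-path filtering.
# rp_filter_entries is a hypothetical helper name.
rp_filter_entries() {
  local iface
  for iface in "$@"; do
    printf 'net.ipv4.conf.%s.rp_filter: "2"\n' "${iface}"
  done
}

# Interface names below are placeholders; use the names from your nodes.
rp_filter_entries k8s-pods k8s-ext
# Prints:
# net.ipv4.conf.k8s-pods.rp_filter: "2"
# net.ipv4.conf.k8s-ext.rp_filter: "2"
```

The printed lines can be pasted under kernelParameters.sysctl with the indentation used by the rest of the profile.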
Create a volume group on top of the defined partition and create the
required number of logical volumes (LVs) on top of the created volume
group (VG). Add one logical volume per Ceph OSD on the node.
Example snippet of an LVM configuration for a Ceph metadata disk:
Plan LVs of a separate metadata device thoroughly.
Any logical volume misconfiguration causes redeployment of all
Ceph OSDs that use this disk as metadata devices.
Note
The general Ceph recommendation is to size a metadata device at
1% to 4% of the Ceph OSD data size.
Mirantis highly recommends at least 4% of the Ceph OSD data size.
If you plan to use a disk as a separate metadata device for 10 Ceph
OSDs, define the size of the LV for each Ceph OSD as 1% to 4%
of the corresponding Ceph OSD data size. If RADOS Gateway is enabled,
the metadata size must be at least 4%. For details, see Ceph documentation:
Bluestore config reference.
For example, if the total data size of 10 Ceph OSDs equals 1 TB
with 100 GB each, assign a metadata disk of at least 10 GB with
1 GB per LV. The recommended size is 40 GB with 4 GB
per LV.
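The arithmetic behind this example can be checked with a few lines of shell. The variable and function names are illustrative only; the numbers are the ones used in the note above.

```shell
#!/usr/bin/env bash
# Size metadata LVs as a percentage of the Ceph OSD data size.
# All sizes are in GB; integer arithmetic is enough for this example.
osd_count=10
osd_data_gb=100

# lv_size_gb <osd_data_gb> <percent> prints the LV size for one Ceph OSD.
lv_size_gb() {
  echo $(( $1 * $2 / 100 ))
}

for percent in 1 4; do
  per_lv="$(lv_size_gb "${osd_data_gb}" "${percent}")"
  total=$(( per_lv * osd_count ))
  echo "${percent}%: ${per_lv} GB per LV, ${total} GB for ${osd_count} OSDs"
done
# Prints:
# 1%: 1 GB per LV, 10 GB for 10 OSDs
# 4%: 4 GB per LV, 40 GB for 10 OSDs
```

Changing `osd_count` or `osd_data_gb` gives the minimum and recommended metadata disk sizes for other layouts.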
After applying BareMetalHostProfile, the bare metal provider
creates an LVM partitioning for the metadata disk and places
these volumes as /dev paths, for example, /dev/bluedb/meta_1 or
/dev/bluedb/meta_3.
Example template of a host profile configuration for Ceph