The documentation is intended to help operators understand the core concepts
of the product.
The information provided in this documentation set is constantly being
improved and amended based on feedback and requests from our
software consumers. This documentation set describes
the features supported within the three latest Container Cloud minor releases and
their supported Cluster releases, with a corresponding note
Available since <release-version>.
The following table lists the guides included in the documentation set you
are reading:
GUI elements that include any part of the interactive user interface and
menu navigation.
Superscript
Some extra, brief information. For example, if a feature is
available from a specific release or if a feature is in the
Technology Preview development stage.
Note
The Note block
Messages of a generic meaning that may be useful to the user.
Caution
The Caution block
Information that helps a user avoid mistakes and undesirable
consequences when following the procedures.
Warning
The Warning block
Messages that include details that can be easily missed but should not
be ignored by the user, as they are valuable before proceeding.
See also
The See also block
List of references that may be helpful for understanding some related
tools, concepts, and so on.
Learn more
The Learn more block
Used in the Release Notes to wrap a list of internal references to
the reference architecture, deployment and operation procedures specific
to a newly implemented product feature.
A Technology Preview feature provides early access to upcoming product
innovations, allowing customers to experiment with the functionality and
provide feedback.
Technology Preview features may be privately or publicly available, but
neither is intended for production use. While Mirantis will provide
assistance with such features through official channels, normal Service
Level Agreements do not apply.
As Mirantis considers making future iterations of Technology Preview features
generally available, we will do our best to resolve any issues that customers
experience when using these features.
During the development of a Technology Preview feature, additional components
may become available to the public for evaluation. Mirantis cannot guarantee
the stability of such features. As a result, if you are using Technology
Preview features, you may not be able to seamlessly update to subsequent
product releases, or to upgrade or migrate to functionality that has not
yet been announced as fully supported.
Mirantis makes no guarantees that Technology Preview features will graduate
to generally available features.
The documentation set refers to Mirantis Container Cloud GA as the latest
released GA version of the product. For details about the Container Cloud
GA minor release dates, refer to
Container Cloud releases.
Mirantis Container Cloud enables you to ship code faster by enabling speed
with choice, simplicity, and security. Through a single pane of glass you can
deploy, manage, and observe Kubernetes clusters on bare metal infrastructure.
The list of the most common use cases includes:
Kubernetes cluster lifecycle management
The consistent lifecycle management of a single Kubernetes cluster
is a complex task on its own that is made infinitely more difficult
when you have to manage multiple clusters across different platforms
spread across the globe. Mirantis Container Cloud provides a single,
centralized point from which you can perform full lifecycle management
of your container clusters, including automated updates and upgrades.
Highly regulated industries
Regulated industries need fine-grained access control, high security
standards, and extensive reporting capabilities to ensure that they can
meet and exceed security requirements.
Mirantis Container Cloud provides for a fine-grained Role-Based Access
Control (RBAC) mechanism and easy integration and federation with existing
identity management (IDM) systems.
Logging, monitoring, alerting
Complete operational visibility is required to identify and address issues
in the shortest amount of time – before the problem becomes serious.
Mirantis StackLight is the proactive monitoring, logging, and alerting
solution designed for large-scale container and cloud observability with
extensive collectors, dashboards, trend reporting and alerts.
Storage
Cloud environments require a unified pool of storage that can be scaled up by
simply adding storage server nodes. Ceph is a unified, distributed storage
system designed for excellent performance, reliability, and scalability.
Deploy Ceph utilizing Rook to provide and manage robust persistent storage
that can be used by Kubernetes workloads on the baremetal-based clusters.
Security
Security is a core concern for all enterprises, especially with more
of our systems being exposed to the Internet as a norm. Mirantis
Container Cloud provides for a multi-layered security approach that
includes effective identity management and role-based authentication,
secure out-of-the-box defaults, and extensive security scanning and
monitoring during the development process.
5G and Edge
The introduction of 5G technologies and the support of Edge workloads
requires an effective multi-tenant solution to manage the underlying
container infrastructure. Mirantis Container Cloud provides for a
full-stack, secure, multi-cloud cluster management and Day-2 operations
solution.
Mirantis Container Cloud is a set of microservices
that are deployed using Helm charts and run in a Kubernetes cluster.
Container Cloud is based on the Kubernetes Cluster API community initiative.
The following diagram illustrates an overview of Container Cloud
and the clusters it manages:
All artifacts used by Kubernetes and workloads are stored
on the Container Cloud content delivery network (CDN):
mirror.mirantis.com (Debian packages including the Ubuntu mirrors)
binary.mirantis.com (Helm charts and binary artifacts)
mirantis.azurecr.io (Docker image registry)
All Container Cloud components are deployed in the Kubernetes clusters.
All Container Cloud APIs are implemented using Kubernetes
Custom Resource Definitions (CRDs) that represent custom objects
stored in Kubernetes and allow you to expand the Kubernetes API.
The Container Cloud logic is implemented using controllers.
A controller handles the changes in custom resources defined
in the controller CRD.
A custom resource consists of a spec that describes the desired state
of a resource provided by a user.
During every change, a controller reconciles the external state of a custom
resource with the user parameters and stores this external state in the
status subresource of its custom resource.
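To illustrate this pattern, the following generic custom resource sketch shows
a user-provided spec alongside a controller-managed status. The object kind and
fields are hypothetical and serve illustration only; they do not represent a
specific Container Cloud resource:
apiVersion: example.mirantis.com/v1alpha1   # hypothetical API group for illustration
kind: ExampleResource
metadata:
  name: demo
spec:
  # Desired state provided by the user
  replicas: 3
status:
  # External state observed and written back by the controller during reconciliation
  readyReplicas: 3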
The types of the Container Cloud clusters include:
Bootstrap cluster
Runs the bootstrap process on a seed data center bare metal node that can be
reused after the management cluster deployment for other purposes.
Requires access to the bare metal provider backend.
Initially, the bootstrap cluster is created with the following minimal set
of components: Bootstrap Controller, public API charts, and the Bootstrap
API.
The user can interact with the bootstrap cluster through the Bootstrap API
to create the configuration for a management cluster and start its
deployment. More specifically, the user performs the following operations:
Create required deployment objects.
Optionally add proxy and SSH keys.
Configure the cluster and machines.
Deploy a management cluster.
The user can monitor the deployment progress of the cluster and machines.
After a successful deployment, the user can download the kubeconfig
artifact of the provisioned cluster.
Management cluster
Comprises Container Cloud as a product and provides the following functionality:
Runs all public APIs and services including the web UIs
of Container Cloud.
Does not require access to any provider backend.
Runs the provider-specific services and internal API including
LCMMachine and LCMCluster. Also, it runs an LCM controller for
orchestrating managed clusters and other controllers for handling
different resources.
Requires two-way access to a provider backend. The provider connects
to a backend to spawn managed cluster nodes,
and the agent running on the nodes accesses the regional cluster
to obtain the deployment information.
For deployment details of a management cluster, see Deployment Guide.
Managed cluster
A Mirantis Kubernetes Engine (MKE) cluster that an end user
creates using the Container Cloud web UI.
Requires access to its management cluster. Each node of a managed
cluster runs an LCM Agent that connects to the LCM machine of the
management cluster to obtain the deployment details.
Supports Mirantis OpenStack for Kubernetes (MOSK). For details, see
MOSK documentation.
All types of the Container Cloud clusters except the bootstrap cluster
are based on the MKE and Mirantis Container Runtime (MCR) architecture.
For details, see MKE and
MCR documentation.
The following diagram illustrates the distribution of services
between each type of the Container Cloud clusters:
The Mirantis Container Cloud provider is the central component of Container
Cloud that provisions a node of a management or managed cluster and runs the
LCM Agent on this node. It runs in a management cluster and requires connection
to a provider backend.
The Container Cloud provider interacts with the following types of public API
objects:
Public API object name
Description
Container Cloud release object
Contains the following information about clusters:
Version of the supported Cluster release for a management cluster
List of supported Cluster releases for the managed clusters
and supported upgrade path
Description of Helm charts that are installed on the management cluster
Cluster release object
Provides a specific version of a management or managed cluster.
Any Cluster release object, as well as a Container Cloud release
object, never changes; only new releases can be added.
Any change leads to a new release of a cluster.
Contains references to all components and their versions
that are used to deploy all cluster types:
LCM components:
LCM Agent
Ansible playbooks
Scripts
Description of steps to execute during a cluster deployment
and upgrade
Helm Controller image references
Supported Helm charts description:
Helm chart name and version
Helm release name
Helm values
Cluster object
References the Credentials, KaaSRelease and ClusterRelease
objects.
Represents all cluster-level resources, for example, networks, load
balancer for the Kubernetes API, and so on. It uses data from the
Credentials object to create these resources and data from the
KaaSRelease and ClusterRelease objects to ensure that all
lower-level cluster objects are created.
Machine object
References the Cluster object.
Represents one node of a managed cluster and contains all data to
provision it.
Credentials object
Contains all information necessary to connect to a provider backend.
PublicKey object
Is provided to every machine to obtain SSH access.
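For illustration, the sketch below shows how a Cluster object might reference
the Credentials object and a Cluster release. The field names and layout are
simplified assumptions for this overview, not the authoritative Container Cloud
schema:
apiVersion: cluster.k8s.io/v1alpha1
kind: Cluster
metadata:
  name: example-managed-cluster
  namespace: example-project
spec:
  providerSpec:
    value:
      credentials: example-credentials   # reference to the Credentials object (illustrative)
      release: example-cluster-release   # reference to the ClusterRelease object (illustrative)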
The following diagram illustrates the Container Cloud provider data flow:
The Container Cloud provider performs the following operations
in Container Cloud:
Consumes the following types of data from a management cluster:
Credentials to connect to a provider backend
Deployment instructions from the KaaSRelease and ClusterRelease
objects
The cluster-level parameters from the Cluster objects
The machine-level parameters from the Machine objects
Prepares data for all Container Cloud components:
Creates the LCMCluster and LCMMachine custom resources
for LCM Controller and LCM Agent. The LCMMachine custom resources
are created empty to be later handled by the LCM Controller.
Creates the HelmBundle custom resources for the Helm Controller
using data from the KaaSRelease and ClusterRelease objects.
Creates service accounts for these custom resources.
Creates a scope in Identity and access management (IAM)
for user access to a managed cluster.
Provisions nodes for a managed cluster using the cloud-init script
that downloads and runs the LCM Agent.
The Mirantis Container Cloud Release Controller is responsible
for the following functionality:
Monitor and control the KaaSRelease and ClusterRelease objects
present in a management cluster. If any release object is used
in a cluster, the Release Controller prevents the deletion
of such an object.
Trigger the Container Cloud auto-update procedure if a new
KaaSRelease object is found:
Search for the managed clusters with old Cluster releases
that are not supported by a new Container Cloud release.
If any are detected, abort the auto-update and display
a corresponding note about an old Cluster release in the Container
Cloud web UI for the managed clusters. In this case, a user must update
all managed clusters using the Container Cloud web UI.
Once all managed clusters are updated to the Cluster releases
supported by a new Container Cloud release,
the Container Cloud auto-update is retriggered
by the Release Controller.
Trigger the Container Cloud release update of all Container Cloud
components in a management cluster.
The update itself is processed by the Container Cloud provider.
Trigger the Cluster release update of a management cluster
to the Cluster release version that is indicated
in the updated Container Cloud release version.
The LCMCluster components, such as MKE, are updated before
the HelmBundle components, such as StackLight or Ceph.
Once a management cluster is updated, an option to update
a managed cluster becomes available in the Container Cloud web UI.
During a managed cluster update, all cluster components including
Kubernetes are automatically updated to newer versions if available.
The LCMCluster components, such as MKE, are updated before
the HelmBundle components, such as StackLight or Ceph.
The Operator can delay the Container Cloud automatic upgrade procedure for a
limited amount of time or schedule the upgrade to run at specific hours or on
specific weekdays. For details, see Schedule Mirantis Container Cloud updates.
Container Cloud remains operational during the management cluster upgrade.
Managed clusters are not affected during this upgrade. For the list of
components that are updated during the Container Cloud upgrade, see the
Components versions section of the corresponding Container Cloud release in
Release Notes.
When Mirantis announces support of the newest versions of
Mirantis Container Runtime (MCR) and Mirantis Kubernetes Engine
(MKE), Container Cloud automatically upgrades these components as well.
For the maintenance window best practices before upgrade of these
components, see
MKE Documentation.
The Mirantis Container Cloud web UI is mainly designed
to create and update the managed clusters as well as add or remove machines
to or from an existing managed cluster.
You can use the Container Cloud web UI
to obtain the management cluster details including endpoints, release version,
and so on.
The management cluster update occurs automatically,
with the change log of a new release available through the Container Cloud web UI.
The Container Cloud web UI is a JavaScript application that is based
on the React framework. The Container Cloud web UI is designed to work
on the client side only. Therefore, it does not require a special backend.
It interacts with the Kubernetes and Keycloak APIs directly.
The Container Cloud web UI uses a Keycloak token
to interact with Container Cloud API and download kubeconfig
for the management and managed clusters.
The Container Cloud web UI uses NGINX that runs on a management cluster
and handles the Container Cloud web UI static files.
NGINX proxies the Kubernetes and Keycloak APIs
for the Container Cloud web UI.
The bare metal service provides for the discovery, deployment, and management
of bare metal hosts.
The bare metal management in Mirantis Container Cloud
is implemented as a set of modular microservices.
Each microservice implements a certain requirement or function
within the bare metal management system.
OpenStack Ironic
The backend bare metal manager in a standalone mode with its auxiliary
services that include httpd, dnsmasq, and mariadb.
OpenStack Ironic Inspector
Introspects and discovers the bare metal hosts inventory.
Includes OpenStack Ironic Python Agent (IPA) that is used
as a provision-time agent for managing bare metal hosts.
Ironic Operator
Monitors changes in the external IP addresses of httpd, ironic,
and ironic-inspector and automatically reconciles the configuration
for dnsmasq, ironic, baremetal-provider,
and baremetal-operator.
Bare Metal Operator
Manages bare metal hosts through the Ironic API. The Container Cloud
bare-metal operator implementation is based on the Metal³ project.
Bare metal resources manager
Ensures that the bare metal provisioning artifacts, such as the
distribution image of the operating system, are available and up to date.
cluster-api-provider-baremetal
The plugin for the Kubernetes Cluster API integrated with Container Cloud.
Container Cloud uses the Metal³ implementation of
cluster-api-provider-baremetal for the Cluster API.
HAProxy
Load balancer for external access to the Kubernetes API endpoint.
LCM Agent
Used for physical and logical storage, physical and logical network,
and control over the life cycle of the bare metal machine resources.
Ceph
Distributed shared storage is required by the Container Cloud services
to create persistent volumes to store their data.
MetalLB
Load balancer for Kubernetes services on bare metal.
Keepalived
Monitoring service that ensures availability of the virtual IP for
the external load balancer endpoint (HAProxy).
IPAM
IP address management services provide consistent IP address space
to the machines in bare metal clusters. See details in
IP Address Management.
Mirantis Container Cloud on bare metal uses IP Address Management (IPAM)
to keep track of the network addresses allocated to bare metal hosts.
This is necessary to avoid IP address conflicts
and expiration of address leases to machines through DHCP.
Note
Only IPv4 address family is currently supported by Container Cloud
and IPAM. IPv6 is not supported and not used in Container Cloud.
IPAM is provided by the kaas-ipam controller. Its functions include:
Allocation of IP address ranges or subnets to newly created clusters using
the Subnet resource.
Allocation of IP addresses to machines and cluster services at the request
of baremetal-provider using the IpamHost and IPaddr resources.
Creation and maintenance of host networking configuration
on the bare metal hosts using the IpamHost resources.
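For example, a Subnet resource defines the address space that kaas-ipam
allocates addresses from. The following sketch uses illustrative values, and
the field set is a simplified assumption based on the resources described
above; refer to the IPAM API reference for the exact schema:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: example-lcm-subnet
  namespace: example-project
spec:
  cidr: 10.0.0.0/24
  gateway: 10.0.0.1
  includeRanges:
  - 10.0.0.100-10.0.0.200   # addresses available for allocation to machines
  nameservers:
  - 10.0.0.10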
The IPAM service can support different networking topologies and network
hardware configurations on the bare metal hosts.
In the most basic network configuration, IPAM uses a single L3 network
to assign addresses to all bare metal hosts, as defined in
Managed cluster networking.
You can apply complex networking configurations to a bare metal host
using the L2 templates. The L2 templates imply multihomed host networking
and enable you to create a managed cluster where nodes use separate host
networks for different types of traffic. Multihoming is required
to ensure the security and performance of a managed cluster.
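A heavily simplified L2 template sketch is shown below to illustrate the idea
of mapping host NICs to dedicated networks. The field names, template macros,
and values are assumptions for this overview; see Create L2 templates for the
exact schema:
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  name: example-l2-template
  namespace: example-project
spec:
  l3Layout:
  - subnetName: example-lcm-subnet   # subnet referenced by the netplan template below
    scope: namespace
  npTemplate: |
    version: 2
    ethernets:
      {{ nic 0 }}:
        dhcp4: false
        dhcp6: false
    bridges:
      k8s-lcm:
        interfaces:
        - {{ nic 0 }}
        addresses:
        - {{ ip "k8s-lcm:example-lcm-subnet" }}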
Caution
Modification of L2 templates in use is allowed with a mandatory
validation step from the Infrastructure Operator to prevent accidental
cluster failures due to unsafe changes. The list of risks posed by modifying
L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
The main purpose of networking in a Container Cloud management cluster is to
provide access to the Container Cloud Management API that consists of:
Container Cloud Public API
Used by end users to provision and configure managed clusters and machines.
Includes the Container Cloud web UI.
Container Cloud LCM API
Used by LCM agents in managed clusters to obtain configuration and report
status. Contains provider-specific services and internal API including
LCMMachine and LCMCluster objects.
The following types of networks are supported for the management clusters in
Container Cloud:
PXE network
Enables PXE boot of all bare metal machines in the Container Cloud region.
PXE subnet
Provides IP addresses for DHCP and network boot of the bare metal hosts
for initial inspection and operating system provisioning.
This network may not have the default gateway or a router connected
to it. The PXE subnet is defined by the Container Cloud Operator
during bootstrap.
Provides IP addresses for the bare metal management services of
Container Cloud, such as bare metal provisioning service (Ironic).
These addresses are allocated and served by MetalLB.
Management network
Connects LCM Agents running on the hosts to the Container Cloud LCM API.
Serves the external connections to the Container Cloud Management API.
The network is also used for communication between kubelet
and the Kubernetes API server inside a Kubernetes cluster. The MKE
components use this network for communication inside a swarm cluster.
LCM subnet
Provides IP addresses for the Kubernetes nodes in the management cluster.
This network also provides a Virtual IP (VIP) address for the load
balancer that enables external access to the Kubernetes API
of a management cluster. This VIP is also the endpoint to access
the Container Cloud Management API in the management cluster.
Provides IP addresses for the externally accessible services of
Container Cloud, such as Keycloak, web UI, StackLight.
These addresses are allocated and served by MetalLB.
Kubernetes workloads network
Technology Preview
Serves the internal traffic between workloads on the management cluster.
Kubernetes workloads subnet
Provides IP addresses that are assigned to nodes and used by Calico.
Out-of-Band (OOB) network
Connects to Baseboard Management Controllers of the servers that host
the management cluster. The OOB subnet must be accessible from the
management network through IP routing. The OOB network
is not managed by Container Cloud and is not represented in the IPAM API.
Kubernetes cluster networking is typically focused on connecting pods on
different nodes. On bare metal, however, cluster networking is more
complex because it needs to facilitate many different types of traffic.
Kubernetes clusters managed by Mirantis Container Cloud
have the following types of traffic:
PXE network
Enables the PXE boot of all bare metal machines in Container Cloud.
This network is not configured on the hosts in a managed cluster.
It is used by the bare metal provider to provision additional
hosts in managed clusters and is disabled on the hosts after
provisioning is done.
Life-cycle management (LCM) network
Connects LCM Agents running on the hosts to the Container Cloud LCM API.
The LCM API is provided by the management cluster.
The LCM network is also used for communication between kubelet
and the Kubernetes API server inside a Kubernetes cluster. The MKE
components use this network for communication inside a swarm cluster.
When using the BGP announcement of the IP address for the cluster API
load balancer, which is available as Technology Preview since
Container Cloud 2.24.4, no segment stretching is required
between Kubernetes master nodes. Also, in this scenario, the load
balancer IP address is not required to match the LCM subnet CIDR address.
LCM subnet(s)
Provides IP addresses that are statically allocated by the IPAM service
to bare metal hosts. This network must be connected to the Kubernetes API
endpoint of the management cluster through an IP router.
LCM Agents running on managed clusters will connect to the management
cluster API through this router. LCM subnets may be different
per managed cluster as long as this connection requirement is satisfied.
The Virtual IP (VIP) address for load balancer that enables access to
the Kubernetes API of the managed cluster must be allocated from the LCM
subnet.
Cluster API subnet
Technology Preview
Provides a load balancer IP address for external access to the cluster
API. Mirantis recommends that this subnet stays unique per managed
cluster.
Kubernetes workloads network
Serves as an underlay network for traffic between pods in
the managed cluster. Do not share this network between clusters.
Kubernetes workloads subnet(s)
Provides IP addresses that are statically allocated by the IPAM service
to all nodes and that are used by Calico for cross-node communication
inside a cluster. By default, VXLAN overlay is used for Calico
cross-node communication.
Kubernetes external network
Serves ingress traffic to the managed cluster from the outside world.
You can share this network between clusters, but with dedicated subnets
per cluster. Several or all cluster nodes must be connected to
this network. Traffic from external users to the externally available
Kubernetes load-balanced services comes through the nodes that
are connected to this network.
Services subnet(s)
Provides IP addresses for externally available Kubernetes load-balanced
services. The address ranges for MetalLB are assigned from this subnet.
There can be several subnets per managed cluster that define
the address ranges or address pools for MetalLB.
External subnet(s)
Provides IP addresses that are statically allocated by the IPAM service
to nodes. The IP gateway in this network is used as the default route
on all nodes that are connected to this network. This network
allows external users to connect to the cluster services exposed as
Kubernetes load-balanced services. MetalLB speakers must run on the same
nodes. For details, see Configure node selector for MetalLB speaker.
Storage network
Serves storage access and replication traffic from and to Ceph OSD services.
The storage network does not need to be connected to any IP routers
and does not require external access, unless you want to use Ceph
from outside of a Kubernetes cluster.
To use a dedicated storage network, define and configure
both subnets listed below.
Storage access subnet(s)
Provides IP addresses that are statically allocated by the IPAM service
to Ceph nodes.
The Ceph OSD services bind to these addresses on their respective
nodes. Serves Ceph access traffic from and to storage clients.
This is a public network in Ceph terms.
Storage replication subnet(s)
Provides IP addresses that are statically allocated by the IPAM service
to Ceph nodes.
The Ceph OSD services bind to these addresses on their respective
nodes. Serves Ceph internal replication traffic. This is a
cluster network in Ceph terms.
Out-of-Band (OOB) network
Connects baseboard management controllers (BMCs) of the bare metal hosts.
This network must not be accessible from the managed clusters.
The following diagram illustrates the networking schema of the Container Cloud
deployment on bare metal with a managed cluster:
The following network roles are defined for all Mirantis Container Cloud
cluster nodes on bare metal, including the bootstrap, management, and managed
cluster nodes:
Out-of-band (OOB) network
Connects the Baseboard Management Controllers (BMCs) of the hosts
in the network to Ironic. This network is out of band for the
host operating system.
PXE network
Enables remote booting of servers through the PXE protocol. In management
clusters, the DHCP server listens on this network for host discovery and
inspection. In managed clusters, hosts use this network for the initial
PXE boot and provisioning.
LCM network
Connects LCM Agents running on the node to the LCM API of the management
cluster. It is also used for communication between kubelet and the
Kubernetes API server inside a Kubernetes cluster. The MKE components use
this network for communication inside a swarm cluster.
In management clusters, it is replaced by the management network.
Kubernetes workloads (pods) network
Technology Preview
Serves connections between Kubernetes pods.
Each host has an address on this network, and this address is used
by Calico as an endpoint to the underlay network.
Kubernetes external network
Technology Preview
Serves external connection to the Kubernetes API
and the user services exposed by the cluster. In management clusters,
it is replaced by the management network.
Management network
Serves external connections to the Container Cloud Management API and
services of the management cluster. Not available in a managed cluster.
Storage access network
Connects Ceph nodes to the storage clients. The Ceph OSD service is
bound to the address on this network. This is a public network in
Ceph terms.
Storage replication network
Connects Ceph nodes to each other. Serves internal replication traffic.
This is a cluster network in Ceph terms.
Each network is represented on the host by a virtual Linux bridge. Physical
interfaces may be connected to one of the bridges directly, or through a
logical VLAN subinterface, or combined into a bond interface that is in
turn connected to a bridge.
The following table summarizes the default names used for the bridges
connected to the networks listed above:
The baremetal-based Mirantis Container Cloud uses Ceph as a distributed
storage system for file, block, and object storage. This section provides an
overview of a Ceph cluster deployed by Container Cloud.
Mirantis Container Cloud deploys Ceph on baremetal-based managed clusters
using Helm charts with the following components:
Rook Ceph Operator
A storage orchestrator that deploys Ceph on top of a Kubernetes cluster. Also
known as Rook or RookOperator. Rook operations include:
Deploying and managing a Ceph cluster based on provided Rook CRs such as
CephCluster, CephBlockPool, CephObjectStore, and so on.
Orchestrating the state of the Ceph cluster and all its daemons.
KaaSCephCluster custom resource (CR)
Represents the customization of a Kubernetes installation and allows you to
define the required Ceph configuration through the Container Cloud web UI
before deployment. For example, you can define the failure domain, Ceph pools,
Ceph node roles, number of Ceph components such as Ceph OSDs, and so on.
The ceph-kcc-controller controller on the Container Cloud management
cluster manages the KaaSCephCluster CR.
Ceph Controller
A Kubernetes controller that obtains the parameters from Container Cloud
through a CR, creates CRs for Rook and updates its CR status based on the Ceph
cluster deployment progress. It creates users, pools, and keys for OpenStack
and Kubernetes and provides Ceph configurations and keys to access them. Also,
Ceph Controller eventually obtains the data from the OpenStack Controller for
the Keystone integration and updates the RADOS Gateway services configurations
to use Kubernetes for user authentication. Ceph Controller operations include:
Transforming user parameters from the Container Cloud Ceph CR into Rook CRs
and deploying a Ceph cluster using Rook.
Providing integration of the Ceph cluster with Kubernetes.
Providing data for OpenStack to integrate with the deployed Ceph cluster.
Ceph Status Controller
A Kubernetes controller that collects all valuable parameters from the current
Ceph cluster, its daemons, and entities and exposes them into the
KaaSCephCluster status. Ceph Status Controller operations include:
Collecting all statuses from a Ceph cluster and corresponding Rook CRs.
Collecting additional information on the health of Ceph daemons.
Providing information to the status section of the KaaSCephCluster
CR.
Ceph Request Controller
A Kubernetes controller that obtains the parameters from Container Cloud
through a CR and manages Ceph OSD lifecycle management (LCM) operations. It
allows for a safe Ceph OSD removal from the Ceph cluster. Ceph Request
Controller operations include:
Providing an ability to perform Ceph OSD LCM operations.
Obtaining specific CRs to remove Ceph OSDs and executing them.
Pausing the regular Ceph Controller reconcile until all requests are
completed.
A typical Ceph cluster consists of the following components:
Ceph Monitors - three or, in rare cases, five Ceph Monitors.
Ceph Managers:
Before Container Cloud 2.22.0, one Ceph Manager.
Since Container Cloud 2.22.0, two Ceph Managers.
RADOS Gateway services - Mirantis recommends having three or more RADOS
Gateway instances for HA.
Ceph OSDs - the number of Ceph OSDs may vary according to the deployment
needs.
Warning
A Ceph cluster with 3 Ceph nodes does not provide
hardware fault tolerance and is not eligible
for recovery operations,
such as a disk or an entire Ceph node replacement.
A Ceph cluster uses the replication factor that equals 3.
If the number of Ceph OSDs is less than 3, a Ceph cluster
moves to the degraded state with the write operations
restriction until the number of alive Ceph OSDs
equals the replication factor again.
The placement of Ceph Monitors and Ceph Managers is defined in the
KaaSCephCluster CR.
The following diagram illustrates the way a Ceph cluster is deployed in
Container Cloud:
The following diagram illustrates the processes within a deployed Ceph cluster:
A Ceph cluster configuration in Mirantis Container Cloud
includes but is not limited to the following limitations:
Only one Ceph Controller per managed cluster and only one Ceph cluster per
Ceph Controller are supported.
The replication size for any Ceph pool must be set to more than 1.
All CRUSH rules must have the same failure_domain.
Only one CRUSH tree per cluster. The separation of devices per Ceph pool is
supported through device classes
with only one pool of each type for a device class.
Only the following types of CRUSH buckets are supported:
topology.kubernetes.io/region
topology.kubernetes.io/zone
topology.rook.io/datacenter
topology.rook.io/room
topology.rook.io/pod
topology.rook.io/pdu
topology.rook.io/row
topology.rook.io/rack
topology.rook.io/chassis
Only IPv4 is supported.
If two or more Ceph OSDs are located on the same device, there must be no
dedicated WAL or DB for this class.
Only a full collocation or dedicated WAL and DB configurations are supported.
The minimum size of any defined Ceph OSD device is 5 GB.
Lifted since Container Cloud 2.24.2 (Cluster releases 14.0.1 and 15.0.1).
Ceph cluster does not support removable devices (with hotplug enabled) for
deploying Ceph OSDs.
Ceph OSDs support only raw disks as data devices meaning that no dm or
lvm devices are allowed.
When adding a Ceph node with the Ceph Monitor role, if any issues occur with
the Ceph Monitor, rook-ceph removes it and adds a new Ceph Monitor instead,
named using the next alphabetic character in order. Therefore, the Ceph Monitor
names may not follow the alphabetical order. For example, a, b, d,
instead of a, b, c.
Reducing the number of Ceph Monitors is not supported and causes the Ceph
Monitor daemons removal from random nodes.
Removal of the mgr role in the nodes section of the
KaaSCephCluster CR does not remove Ceph Managers. To remove a Ceph
Manager from a node, remove it from the nodes spec and manually delete
the mgr pod in the Rook namespace.
Lifted since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.10).
Ceph does not support allocation of Ceph RGW pods on nodes where the
Federal Information Processing Standard (FIPS) mode is enabled.
There are several formats to use when specifying and addressing storage devices
of a Ceph cluster. The default and recommended one is the /dev/disk/by-id
format. This format is reliable and unaffected by the disk controller actions,
such as device name shuffling or /dev/disk/by-path recalculating.
Difference between by-id, name, and by-path formats
The storage device /dev/disk/by-id format in most cases is based on
a disk serial number, which is unique for each disk. A by-id symlink
is created by the udev rules in the following format, where <BusID>
is an ID of the bus to which the disk is attached and <DiskSerialNumber>
stands for a unique disk serial number:
/dev/disk/by-id/<BusID>-<DiskSerialNumber>
Typical by-id symlinks for storage devices look as follows:
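/dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
/dev/disk/by-id/scsi-SATA_HGST_HUS724040AL_PN1334PEHN18ZS
/dev/disk/by-id/ata-WDC_WD4003FZEX-00Z4SA0_WD-WMC5D0D9DMEH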
In the example above, symlinks contain the following IDs:
Bus IDs: nvme, scsi-SATA and ata
Disk serial numbers: SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543,
HGST_HUS724040AL_PN1334PEHN18ZS and
WDC_WD4003FZEX-00Z4SA0_WD-WMC5D0D9DMEH.
An exception to this rule is the wwn by-id symlinks, which are
programmatically generated at boot. They are not solely based on disk
serial numbers but also include other node information. This can lead
to the wwn being recalculated when the node reboots. As a result,
this symlink type cannot guarantee a persistent disk identifier and should
not be used as a stable storage device symlink in a Ceph cluster.
The storage device name and by-path formats cannot be considered
persistent because the sequence in which block devices are added during boot
is semi-arbitrary. This means that block device names, for example, nvme0n1
and sdc, are assigned to physical disks during discovery, which may vary
inconsistently from the previous node state. The same inconsistency applies
to by-path symlinks, as they rely on the shortest physical path
to the device at boot and may differ from the previous node state.
Therefore, Mirantis highly recommends using storage device by-id symlinks
that contain disk serial numbers. This approach enables you to use a persistent
device identifier addressed in the Ceph cluster specification.
Example KaaSCephCluster with device by-id identifiers
Below is an example KaaSCephCluster custom resource using the
/dev/disk/by-id format for storage devices specification:
Note
Container Cloud enables you to use fullPath for the by-id
symlinks since 2.25.0. For the earlier product versions, use the name
field instead.
apiVersion: kaas.mirantis.com/v1alpha1
kind: KaaSCephCluster
metadata:
  name: ceph-cluster-managed-cluster
  namespace: managed-ns
spec:
  cephClusterSpec:
    nodes:
      # Add the exact ``nodes`` names.
      # Obtain the name from the "get machine" list.
      cz812-managed-cluster-storage-worker-noefi-58spl:
        roles:
        - mgr
        - mon
        # All disk configuration must be reflected in ``status.providerStatus.hardware.storage`` of the ``Machine`` object
        storageDevices:
        - config:
            deviceClass: ssd
          fullPath: /dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231440912
      cz813-managed-cluster-storage-worker-noefi-lr4k4:
        roles:
        - mgr
        - mon
        storageDevices:
        - config:
            deviceClass: nvme
          fullPath: /dev/disk/by-id/nvme-SAMSUNG_MZ1LB3T8HMLA-00007_S46FNY0R394543
      cz814-managed-cluster-storage-worker-noefi-z2m67:
        roles:
        - mgr
        - mon
        storageDevices:
        - config:
            deviceClass: nvme
          fullPath: /dev/disk/by-id/nvme-SAMSUNG_ML1EB3T8HMLA-00007_S46FNY1R130423
    pools:
    - default: true
      deviceClass: ssd
      name: kubernetes
      replicated:
        size: 3
      role: kubernetes
  k8sCluster:
    name: managed-cluster
    namespace: managed-ns
Migrating device names used in KaaSCephCluster to device by-id symlinks
The majority of existing clusters use device names as storage device
identifiers in the spec.cephClusterSpec.nodes section of
the KaaSCephCluster custom resource. Therefore, they are prone
to the issue of inconsistent storage device identifiers during cluster
update. Refer to Migrate Ceph cluster to address storage devices using by-id to mitigate possible
risks.
Mirantis Container Cloud provides APIs that enable you
to define hardware configurations that extend the reference architecture:
Bare Metal Host Profile API
Enables quick configuration of host boot and storage devices
and assigning of custom configuration profiles to individual machines.
See Create a custom bare metal host profile.
IP Address Management API
Enables quick configuration of host network interfaces and IP addresses
and setting up of IP addresses ranges for automatic allocation.
See Create L2 templates.
Typically, operations with the extended hardware configurations are available
through the API and CLI, but not the web UI.
To keep the operating system on a bare metal host up to date with the latest
security updates, the operating system requires periodic software
package upgrades that may or may not require a host reboot.
Mirantis Container Cloud uses life cycle management tools to update
the operating system packages on the bare metal hosts. Container Cloud
may also trigger restart of bare metal hosts to apply the updates.
In the management cluster of Container Cloud, software package upgrades and
host restarts are applied automatically when a new Container Cloud version
with available kernel or software package upgrades is released.
In managed clusters, package upgrades and host restarts are applied
as part of the usual cluster upgrade using the Update cluster option
in the Container Cloud web UI.
Operating system upgrade and host restart are applied to cluster
nodes one by one. If Ceph is installed in the cluster, the Container
Cloud orchestration securely pauses the Ceph OSDs on the node before
restart. This avoids degradation of the storage service.
Caution
Depending on the cluster configuration, applying security
updates and host restart can increase the update time for each node to up to
1 hour.
Cluster nodes are updated one by one. Therefore, for large clusters,
the update may take several days to complete.
The Mirantis Container Cloud managed clusters use MetalLB for load balancing
of services and HAProxy with VIP managed by Virtual Router Redundancy Protocol
(VRRP) with Keepalived for the Kubernetes API load balancer.
Every control plane node of each Kubernetes cluster runs the kube-api
service in a container. This service provides a Kubernetes API endpoint.
Every control plane node also runs the haproxy server that provides
load balancing with backend health checking for all kube-api endpoints as
backends.
The default load balancing method is least_conn. With this method,
a request is sent to the server with the least number of active
connections. The default load balancing method cannot be changed
using the Container Cloud API.
Only one of the control plane nodes at any given time serves as a
front end for Kubernetes API. To ensure this, the Kubernetes clients
use a virtual IP (VIP) address for accessing Kubernetes API.
This VIP is assigned to one node at a time using VRRP. Keepalived running on
each control plane node provides health checking and failover of the VIP.
Keepalived is configured in multicast mode.
Note
The use of a VIP address for load balancing of the Kubernetes API requires
that all control plane nodes of a Kubernetes cluster be connected to a
shared L2 segment. This limitation prevents installing full L3
topologies where control plane nodes are split between different L2 segments
and L3 networks.
The services provided by the Kubernetes clusters, including Container Cloud and
user services, are balanced by MetalLB. The metallb-speaker service runs on
every worker node in the cluster and handles connections to the service IP
addresses.
MetalLB runs in the MAC-based (L2) mode. This means that all control plane nodes
must be connected to a shared L2 segment. This limitation does not
allow installing full L3 cluster topologies.
The Kubernetes lifecycle management (LCM) engine in Mirantis Container Cloud
consists of the following components:
LCM Controller
Responsible for all LCM operations. Consumes the LCMCluster object
and orchestrates actions through LCM Agent.
LCM Agent
Runs on the target host. Executes Ansible playbooks in headless mode.
Helm Controller
Responsible for the life cycle of Helm charts. It is installed by the provider
as a Helm v3 chart.
The Kubernetes LCM components handle the following custom resources:
LCMCluster
LCMMachine
HelmBundle
The following diagram illustrates handling of the LCM custom resources by the
Kubernetes LCM components. On a managed cluster, apiserver handles multiple
Kubernetes objects, for example, deployments, nodes, RBAC, and so on.
LCMMachine
Describes a machine that is located on a cluster.
It contains the machine type, control or worker,
StateItems that correspond to Ansible playbooks and miscellaneous actions,
for example, downloading a file or executing a shell command.
LCMMachine reflects the current state of the machine, for example,
a node IP address, and each StateItem through its status.
Multiple LCMMachine CRs can correspond to a single cluster.
LCMCluster
Describes a managed cluster. In its spec,
LCMCluster contains a set of StateItems for each type of LCMMachine,
which describe the actions that must be performed to deploy the cluster.
LCMCluster is created by the provider, using machineTypes
of the Release object. The status field of LCMCluster
reflects the status of the cluster,
for example, the number of ready or requested nodes.
HelmBundle
Wrapper for Helm charts that is handled by Helm Controller.
HelmBundle tracks what Helm charts must be installed
on a managed cluster.
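To make the relationship between these resources concrete, a heavily
simplified LCMMachine sketch follows. The field names are illustrative
assumptions for this overview rather than the exact schema:
apiVersion: lcm.mirantis.com/v1alpha1
kind: LCMMachine
metadata:
  name: example-machine-0
  namespace: example-project
spec:
  cluster: example-lcmcluster   # parent LCMCluster (illustrative field name)
  type: worker                  # control or worker
status:
  hostname: example-node-0      # reported by LCM Agent
  state: Ready                  # current lifecycle state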
LCM Controller runs on the management cluster and orchestrates the
LCMMachine objects according to their type and their LCMCluster object.
Once the LCMCluster and LCMMachine objects are created, LCM Controller
starts monitoring them to modify the spec fields and update
the status fields of the LCMMachine objects when required.
The status field of LCMMachine is updated by LCM Agent
running on a node of a management or managed cluster.
Each LCMMachine has the following lifecycle states:
Uninitialized - the machine is not yet assigned to an LCMCluster.
Pending - the agent reports a node IP address and host name.
Prepare - the machine executes StateItems that correspond
to the prepare phase. This phase usually involves downloading
the necessary archives and packages.
Deploy - the machine executes StateItems that correspond
to the deploy phase that is becoming a Mirantis Kubernetes Engine (MKE)
node.
Ready - the machine is deployed.
Upgrade - the machine is being upgraded to the new MKE version.
Reconfigure - the machine executes StateItems that correspond
to the reconfigure phase. The machine configuration is being updated
without affecting workloads running on the machine.
The templates for StateItems are stored in the machineTypes
field of an LCMCluster object, with separate lists
for the MKE manager and worker nodes.
Each StateItem has the execution phase field for a management and
managed cluster:
The prepare phase is executed for all machines for which
it was not executed yet. This phase comprises downloading the files
necessary for the cluster deployment, installing the required packages,
and so on.
During the deploy phase, a node is added to the cluster.
LCM Controller applies the deploy phase to the nodes
in the following order:
First manager node is deployed.
The remaining manager nodes are deployed one by one
and the worker nodes are deployed in batches (by default,
up to 50 worker nodes at the same time).
LCM Controller deploys and upgrades a Mirantis Container Cloud cluster
by setting StateItems of LCMMachine objects following the corresponding
StateItems phases described above. The Container Cloud cluster upgrade
process follows the same logic that is used for a new deployment,
that is, applying a new set of StateItems to the LCMMachine objects after
updating the LCMCluster object. However, if an existing worker node is being
upgraded, LCM Controller performs cordoning and draining on this node, honoring
the Pod Disruption Budgets.
This operation prevents unexpected disruptions of the workloads.
LCM Agent handles a single machine that belongs to a management or managed
cluster. It runs on the machine operating system but communicates with
apiserver of the management cluster. LCM Agent is deployed as a systemd
unit using cloud-init. LCM Agent has a built-in self-upgrade mechanism.
LCM Agent monitors the spec of a particular LCMMachine object
to reconcile the machine state with the object StateItems and update
the LCMMachine status accordingly. The actions that LCM Agent performs
while handling the StateItems are as follows:
Download configuration files
Run shell commands
Run Ansible playbooks in headless mode
LCM Agent provides the IP address and host name of the machine for the
LCMMachine status parameter.
Helm Controller is used by Mirantis Container Cloud to handle the core addons
of management and managed clusters, such as StackLight, and the application
addons, such as the OpenStack components.
Helm Controller is installed as a separate Helm v3 chart by the Container
Cloud provider. Its Pods are created using Deployment.
The Helm release information is stored in the KaaSRelease object for
the management clusters and in the ClusterRelease object for all types of
the Container Cloud clusters.
These objects are used by the Container Cloud provider.
The Container Cloud provider uses the information from the
ClusterRelease object together with the Container Cloud API
Cluster spec. In the Cluster spec, the operator can specify
the Helm release name and charts to use.
By combining the information from the Cluster providerSpec parameter
and its ClusterRelease object, the cluster actuator generates
the LCMCluster objects. These objects are further handled by LCM Controller
and the HelmBundle object handled by Helm Controller.
HelmBundle must have the same name as the LCMCluster object
for the cluster that HelmBundle applies to.
Although a cluster actuator can only create a single HelmBundle
per cluster, Helm Controller can handle multiple HelmBundle objects
per cluster.
Helm Controller handles the HelmBundle objects and reconciles them with the
state of Helm in its cluster.
Helm Controller can also be used by the management cluster with corresponding
HelmBundle objects created as part of the initial management cluster setup.
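A minimal HelmBundle sketch is shown below for illustration. The release
fields are assumptions for this overview; in practice, the provider generates
the chart references from the release objects:
apiVersion: lcm.mirantis.com/v1alpha1
kind: HelmBundle
metadata:
  name: example-managed-cluster   # matches the LCMCluster name of the target cluster
  namespace: example-project
spec:
  releases:
  - name: stacklight                 # Helm release name
    chart: example-repo/stacklight   # chart reference (illustrative)
    version: 1.0.0
    values:
      logging:
        enabled: true                # example of an overridden chart value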
Identity and access management (IAM) provides a central point
of user and permission management for the Mirantis Container
Cloud cluster resources in a granular and unified manner.
Also, IAM provides infrastructure for single sign-on user experience
across all Container Cloud web portals.
IAM for Container Cloud consists of the following components:
Keycloak
Provides the OpenID Connect endpoint
Integrates with an external identity provider (IdP), for example,
existing LDAP or Google Open Authorization (OAuth)
Stores roles mapping for users
IAM Controller
Provides IAM API with data about Container Cloud projects
Handles all role-based access control (RBAC) components in Kubernetes API
IAM API
Provides an abstraction API for creating user scopes and roles
To be consistent and keep the integrity of a user database
and user permissions, in Mirantis Container Cloud,
IAM stores the user identity information internally.
However, in real deployments, the identity provider usually already exists.
Out of the box, in Container Cloud, IAM supports
integration with LDAP and Google Open Authorization (OAuth).
If LDAP is configured as an external identity provider,
IAM performs one-way synchronization by mapping attributes according
to configuration.
In the case of the Google Open Authorization (OAuth) integration,
the user is automatically registered and their credentials are stored
in the internal database according to the user template configuration.
The Google OAuth registration workflow is as follows:
The user requests a Container Cloud web UI resource.
The user is redirected to the IAM login page and logs in using
the Log in with Google account option.
IAM creates a new user with the default access rights that are defined
in the user template configuration.
The user can access the Container Cloud web UI resource.
The following diagram illustrates the external IdP integration to IAM:
You can configure simultaneous integration with both external IdPs
with the user identity matching feature enabled.
Mirantis IAM acts as an OpenID Connect (OIDC) provider:
it issues tokens and exposes discovery endpoints.
The credentials can be handled by IAM itself or delegated
to an external identity provider (IdP).
The issued JSON Web Token (JWT) is sufficient to perform operations across
Mirantis Container Cloud according to the scope and role defined
in it. Mirantis recommends using asymmetric cryptography for token signing
(RS256) to minimize the dependency between IAM and managed components.
When Container Cloud calls Mirantis Kubernetes Engine (MKE),
the user in Keycloak is created automatically with a JWT issued by Keycloak
on behalf of the end user.
MKE, in its turn, verifies whether the JWT is issued by Keycloak. If
the user retrieved from the token does not exist in the MKE database,
the user is automatically created in the MKE database based on the
information from the token.
The authorization implementation is out of the scope of IAM in Container
Cloud. This functionality is delegated to the component level.
IAM interacts with a Container Cloud component using the OIDC token
content, which is processed by the component itself to enforce the required
authorization. Such an approach enables you to have any underlying authorization
that is not dependent on IAM and still provide a unified user experience
across all Container Cloud components.
The following diagram illustrates the Kubernetes CLI authentication flow.
The authentication flow for Helm and other Kubernetes-oriented CLI utilities
is identical to the Kubernetes CLI flow,
but JSON Web Tokens (JWT) must be pre-provisioned.
Mirantis Container Cloud uses StackLight, the logging, monitoring, and
alerting solution that provides a single pane of glass for cloud maintenance
and day-to-day operations as well as offers critical insights into cloud
health including operational information about the components deployed in
management and managed clusters.
StackLight is based on Prometheus, an open-source monitoring solution and a
time series database.
Mirantis Container Cloud deploys the StackLight stack
as a release of a Helm chart that contains the helm-controller
and helmbundles.lcm.mirantis.com (HelmBundle) custom resources.
The StackLight HelmBundle consists of a set of Helm charts
with the StackLight components that include:
Alerta
Receives, consolidates, and deduplicates the alerts sent by Alertmanager
and visually represents them through a simple web UI. Using the Alerta
web UI, you can view the most recent or watched alerts, group, and
filter alerts.
Alertmanager
Handles the alerts sent by client applications such as Prometheus,
deduplicates, groups, and routes alerts to receiver integrations.
Using the Alertmanager web UI, you can view the most recent fired
alerts, silence them, or view the Alertmanager configuration.
Elasticsearch Curator
Maintains the data (indexes) in OpenSearch by performing
such operations as creating, closing, or opening an index as well as
deleting a snapshot. Also, manages the data retention policy in
OpenSearch.
Elasticsearch Exporter Compatible with OpenSearch
The Prometheus exporter that gathers internal OpenSearch metrics.
Grafana
Builds and visually represents metric graphs based on time series
databases. Grafana supports querying of Prometheus using the PromQL
language.
Database backends
StackLight uses PostgreSQL for Alerta and Grafana. PostgreSQL reduces
the data storage fragmentation while enabling high availability.
High availability is achieved using Patroni, the PostgreSQL cluster
manager that monitors for node failures and manages failover
of the primary node. StackLight also uses Patroni to manage major
version upgrades of PostgreSQL clusters, which allows leveraging
the database engine functionality and improvements
as they are introduced upstream in new releases,
maintaining functional continuity without version lock-in.
Logging stack
Responsible for collecting, processing, and persisting logs and
Kubernetes events. By default, when deploying through the Container
Cloud web UI, only the metrics stack is enabled on managed clusters. To
enable StackLight to gather managed cluster logs, enable the logging
stack during deployment. On management clusters, the logging stack is
enabled by default. The logging stack components include:
OpenSearch, which stores logs and notifications.
Fluentd-logs, which collects logs, sends them to OpenSearch, generates
metrics based on analysis of incoming log entries, and exposes these
metrics to Prometheus.
OpenSearch Dashboards, which provides real-time visualization of
the data stored in OpenSearch and enables you to detect issues.
Metricbeat, which collects Kubernetes events and sends them to
OpenSearch for storage.
Prometheus-es-exporter, which presents the OpenSearch data
as Prometheus metrics by periodically sending configured queries to
the OpenSearch cluster and exposing the results to a scrapable HTTP
endpoint like other Prometheus targets.
Note
Logging performance depends on the cluster log load. In case of a high load,
you may need to increase the default resource requests and limits for
fluentdLogs. For details, see
StackLight configuration parameters: Resource limits.
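For illustration only, a minimal sketch of such an override in the StackLight
Helm chart values, assuming that the resources parameter is keyed by component
name as described in StackLight configuration parameters: Resource limits. The
numbers and exact nesting are placeholders to adjust for your cluster:
# Hypothetical StackLight values override that raises the requests and
# limits of the fluentdLogs component under a high log load.
resources:
  fluentdLogs:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "1000m"
      memory: "2Gi"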
Metric collector
Collects telemetry data (CPU or memory usage, number of active alerts,
and so on) from Prometheus and sends the data to centralized cloud
storage for further processing and analysis. Metric collector runs on
the management cluster.
Note
This component is designated for internal StackLight use only.
Prometheus
Gathers metrics. Automatically discovers and monitors the endpoints.
Using the Prometheus web UI, you can view simple visualizations and
debug. By default, the Prometheus database stores metrics of the past 15
days or up to 15 GB of data depending on the limit that is reached
first.
Prometheus Blackbox Exporter
Allows monitoring endpoints over HTTP, HTTPS, DNS, TCP, and ICMP.
Prometheus-es-exporter
Presents the OpenSearch data as Prometheus metrics by periodically
sending configured queries to the OpenSearch cluster and exposing the
results to a scrapable HTTP endpoint like other Prometheus targets.
Prometheus Node Exporter
Gathers hardware and operating system metrics exposed by the kernel.
Prometheus Relay
Adds a proxy layer to Prometheus to merge the results from underlay
Prometheus servers to prevent gaps in case some data is missing on
some servers. Is available only in the HA StackLight mode.
Salesforce notifier
Enables sending Alertmanager notifications to Salesforce to allow
creating Salesforce cases and closing them once the alerts are resolved.
Disabled by default.
Salesforce reporter
Queries Prometheus for the data about the amount of vCPU, vRAM, and
vStorage used and available, combines the data, and sends it to
Salesforce daily. Mirantis uses the collected data for further analysis
and reports to improve the quality of customer support. Disabled by
default.
Telegraf
Collects metrics from the system. Telegraf is plugin-driven and has
the concept of two distinct sets of plugins: input plugins collect
metrics from the system, services, or third-party APIs; output plugins
write and expose metrics to various destinations.
The Telegraf agents used in Container Cloud include:
telegraf-ds-smart monitors SMART disks, and runs on both
management and managed clusters.
telegraf-ironic monitors Ironic on the baremetal-based
management clusters. The ironic input plugin collects and
processes data from Ironic HTTP API, while the http_response
input plugin checks Ironic HTTP API availability. As an output plugin,
to expose collected data as Prometheus target, Telegraf uses
prometheus.
telegraf-docker-swarm gathers metrics from the Mirantis Container
Runtime API about the Docker nodes, networks, and Swarm services. This
is a Docker Telegraf input plugin with downstream additions.
Telemeter
Enables a multi-cluster view through a Grafana dashboard of the
management cluster. Telemeter includes a Prometheus federation push
server and clients to enable isolated Prometheus instances, which
cannot be scraped from a central Prometheus instance, to push metrics
to the central location.
The Telemeter services are distributed between the management cluster
that hosts the Telemeter server and managed clusters that host the
Telemeter client. The metrics from managed clusters are aggregated
on management clusters.
Note
This component is designated for internal StackLight use only.
Every Helm chart contains a default values.yaml file. These default values
are partially overridden by custom values defined in the StackLight Helm chart.
Before deploying a managed cluster, you can select the HA or non-HA StackLight
architecture type. The non-HA mode is set by default on managed clusters. On
management clusters, StackLight is deployed in the HA mode only.
The following list summarizes the differences between the HA and non-HA modes:
Non-HA mode:
One Alertmanager instance (since 2.24.0 and 2.24.2 for MOSK 23.2)
One OpenSearch instance
One PostgreSQL instance
One iam-proxy instance
One persistent volume is provided for storing data. In case of a service
or node failure, a new pod is redeployed and the volume is reattached to
provide the existing data. Such a setup has a reduced hardware footprint
but provides less performance.
HA mode:
Two Prometheus instances
Two Alertmanager instances
Three OpenSearch instances
Three PostgreSQL instances
Two iam-proxy instances (since 2.23.0 and 2.23.1 for MOSK 23.1)
Local Volume Provisioner is used to provide local host storage. In case
of a service or node failure, the traffic is automatically redirected to
any other running Prometheus or OpenSearch server. For better
performance, Mirantis recommends that you deploy StackLight in the HA
mode. Two iam-proxy instances ensure access to HA components if one
iam-proxy node fails.
Note
Before Container Cloud 2.24.0, Alertmanager had two replicas in the
non-HA mode.
Caution
Non-HA StackLight requires a backend storage provider,
for example, a Ceph cluster. For details, see Storage.
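For reference, a minimal sketch of selecting the HA mode through the StackLight
Helm chart values in the Cluster object. The highAvailabilityEnabled key is an
assumption to verify against the StackLight configuration parameters, and the
surrounding nesting is illustrative:
# Illustrative Cluster object fragment that enables the StackLight HA mode.
spec:
  providerSpec:
    value:
      helmReleases:
        - name: stacklight
          values:
            highAvailabilityEnabled: true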
Depending on the Container Cloud cluster type and selected StackLight database
mode, StackLight is deployed on the following number of nodes:
StackLight provides five web UIs including Prometheus, Alertmanager, Alerta,
OpenSearch Dashboards, and Grafana. Access to StackLight web UIs is protected
by Keycloak-based Identity and access management (IAM). All web UIs except
Alerta are exposed to IAM through the IAM proxy middleware. The Alerta
configuration provides direct integration with IAM.
The following diagram illustrates accessing the IAM-proxied StackLight web UIs,
for example, Prometheus web UI:
Authentication flow for the IAM-proxied StackLight web UIs:
A user enters the public IP of a StackLight web UI, for example, Prometheus
web UI.
The public IP leads to IAM proxy, deployed as a Kubernetes LoadBalancer,
which protects the Prometheus web UI.
LoadBalancer routes the HTTP request to Kubernetes internal IAM proxy
service endpoints, specified in the X-Forwarded-Proto or X-Forwarded-Host
headers.
The Keycloak login form opens (the login_url field in the IAM proxy
configuration, which points to Keycloak realm) and the user enters
the user name and password.
Keycloak validates the user name and password.
The user obtains access to the Prometheus web UI (the upstreams field
in the IAM proxy configuration).
Note
The discovery URL is the URL of the IAM service.
The upstream URL is the hidden endpoint of a web UI (Prometheus web UI in
the example above).
The following diagram illustrates accessing the Alerta web UI:
Authentication flow for the Alerta web UI:
A user enters the public IP of the Alerta web UI.
The public IP leads to Alerta deployed as a Kubernetes LoadBalancer type.
LoadBalancer routes the HTTP request to the Kubernetes internal Alerta
service endpoint.
The Keycloak login form opens (Alerta refers to the IAM realm) and
the user enters the user name and password.
Using the Mirantis Container Cloud web UI,
at the pre-deployment stage of a managed cluster,
you can view, enable or disable, or tune the following StackLight features:
StackLight HA mode.
Database retention size and time for Prometheus.
Tunable index retention period for OpenSearch.
Tunable PersistentVolumeClaim (PVC) size for Prometheus and OpenSearch
set to 16 GB for Prometheus and 30 GB for OpenSearch by
default. The PVC size must be logically aligned with the retention periods or
sizes for these components.
Email and Slack receivers for the Alertmanager notifications.
Predefined set of dashboards.
Predefined set of alerts and capability to add
new custom alerts for Prometheus in the following exemplary format:
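For illustration, a minimal sketch of a custom alert, assuming that custom
alerts are defined under the prometheusServer.customAlerts key of the
StackLight Helm chart values; the key name, alert name, and threshold are
placeholders to verify against the StackLight configuration parameters:
# Hypothetical custom alert definition in the StackLight values.
prometheusServer:
  customAlerts:
    - alert: ExampleTargetDown
      annotations:
        description: "The {{ $labels.job }} target is down."
      expr: up == 0
      for: 5m
      labels:
        severity: warning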
StackLight measures, analyzes, and reports failures in a timely manner
for the following Mirantis Container Cloud
components and their sub-components, if any:
StackLight uses a storage-based log retention strategy that optimizes storage
utilization and ensures effective data retention.
A proportion of available disk space, defined as 80% of the disk space
allocated to the OpenSearch node, is distributed among the following data types:
80% for system logs
10% for audit logs
5% for OpenStack notifications (applies only to MOSK clusters)
5% for Kubernetes events
This approach ensures that storage resources are efficiently allocated based
on the importance and volume of different data types.
The logging index management provides the following advantages:
Storage-based rollover mechanism
The rollover mechanism for system and audit indices enforces shard size
based on available storage, ensuring optimal resource utilization.
Consistent shard allocation
The number of primary shards per index is dynamically set based on cluster
size, which boosts search and facilitates ingestion for large clusters.
Minimal size of cluster state
The logging size of the cluster state is minimal and uses static mappings,
which are based on the Elastic Common Schema (ECS) with slight deviations
from the standard. Dynamic mapping in index templates is avoided to reduce
overhead.
Storage compression
The system and audit indices utilize the best_compression codec that
minimizes the size of stored indices, resulting in significant storage
savings of up to 50% on average.
No filter by logging level
Because severity levels are not uniform across Container Cloud components,
logs of all severity levels are collected to avoid missing important
low-severity logs while debugging a cluster. Filtering by tags is still
available.
The data, collected and transmitted to Mirantis through an encrypted channel,
provides our Customer Success Organization with information to better
understand the operational usage patterns that our customers experience.
It also supplies product usage statistics that enable our product teams
to enhance our products and services for our customers.
Mirantis collects the following statistics using configuration-collector:
Since the Cluster releases 17.1.0 and 16.1.0
Mirantis collects hardware information using the following metrics:
mcc_hw_machine_chassis
mcc_hw_machine_cpu_model
mcc_hw_machine_cpu_number
mcc_hw_machine_nics
mcc_hw_machine_ram
mcc_hw_machine_storage (storage devices and disk layout)
mcc_hw_machine_vendor
Before the Cluster releases 17.0.0, 16.0.0, and 14.1.0
Mirantis collects the summary of all deployed Container Cloud configurations
using the following objects, if any:
Note
The data is anonymized from all sensitive information, such as IDs,
IP addresses, passwords, private keys, and so on.
Cluster
Machine
MCCUpgrade
BareMetalHost
BareMetalHostProfile
IPAMHost
IPAddr
KaaSCephCluster
L2Template
Subnet
Note
In the Cluster releases 17.0.0, 16.0.0, and 14.1.0, Mirantis does
not collect any configuration summary in light of the
configuration-collector refactoring.
The node-level resource data are broken down into three broad categories:
Cluster, Node, and Namespace. The telemetry data tracks Allocatable,
Capacity, Limits, Requests, and actual Usage of node-level resources.
StackLight components, which require external access, automatically use the
same proxy that is configured for Mirantis Container Cloud clusters. Therefore,
you only need to configure proxy during deployment of your management or
managed clusters. No additional actions are required to set up proxy for
StackLight. For more details about implementation of proxy support in
Container Cloud, see Proxy and cache support.
Note
Proxy handles only the HTTP and HTTPS traffic. Therefore, for
clusters with limited or no Internet access, it is not possible to set up
Alertmanager email notifications, which use SMTP, when proxy is used.
Proxy is used for the following StackLight components:
Alertmanager (any cluster type)
Used as the default http_config
for all HTTP-based receivers except the predefined HTTP-alerta and
HTTP-salesforce. For these receivers, http_config is overridden on
the receiver level.
Metric Collector (management clusters)
Used to send outbound cluster metrics to Mirantis.
Salesforce notifier (any cluster type)
Used to send notifications to the Salesforce instance.
Salesforce reporter (any cluster type)
Used to send metric reports to the Salesforce instance.
Using Mirantis Container Cloud, you can deploy a Mirantis Kubernetes Engine
(MKE) cluster on bare metal. Such a deployment requires the resources
described in this section.
If you use a firewall or proxy, make sure that the bootstrap and management
clusters have access to the following IP ranges and domain names
required for the Container Cloud content delivery network and alerting:
mirror.mirantis.com and repos.mirantis.com for packages
binary.mirantis.com for binaries and Helm charts
mirantis.azurecr.io and *.blob.core.windows.net for Docker images
mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry
(port 9093 if proxy is disabled, or port 443 if proxy is enabled)
mirantis.my.salesforce.com and login.salesforce.com
for Salesforce alerts
Note
Access to Salesforce is required from any Container Cloud
cluster type.
If any additional Alertmanager notification receiver is enabled,
for example, Slack, its endpoint must also be accessible
from the cluster.
Caution
Regional clusters are unsupported since Container Cloud 2.25.0.
Mirantis does not perform functional integration testing of the feature and
the related code is removed in Container Cloud 2.26.0. If you still
require this feature, contact Mirantis support for further information.
The following hardware configuration is used as a reference to deploy
Mirantis Container Cloud with bare metal Container Cloud clusters with
Mirantis Kubernetes Engine.
Reference hardware configuration for Container Cloud
management and managed clusters on bare metal
A management cluster requires 2 volumes for Container Cloud
(total 50 GB) and 5 volumes for StackLight (total 60 GB).
A managed cluster requires 5 volumes for StackLight.
The seed node is necessary only to deploy the management cluster.
When the bootstrap is complete, the bootstrap node can be
redeployed and its resources can be reused
for the managed cluster workloads.
The minimum reference system requirements for a baremetal-based bootstrap
seed node are as follows:
Basic server on Ubuntu 22.04 with the following configuration:
Kernel version 4.15.0-76.86 or later
8 GB of RAM
4 CPU
10 GB of free disk space for the bootstrap cluster cache
No DHCP or TFTP servers on any NIC networks
Routable access to the IPMI network of the hardware servers. For more details, see
Host networking.
Internet access for downloading all required artifacts
The following diagram illustrates the physical and virtual L2 underlay
networking schema for the final state of the Mirantis Container Cloud
bare metal deployment.
The network fabric reference configuration is a spine/leaf with 2 leaf ToR
switches and one out-of-band (OOB) switch per rack.
Reference configuration uses the following switches for ToR and OOB:
Cisco WS-C3560E-24TD has 24 x 1 GbE ports. Used in the OOB network
segment.
Dell Force 10 S4810P has 48 x 1/10 GbE ports. Used as ToR in the Common/PXE
network segment.
In the reference configuration, all odd interfaces from NIC0 are connected
to TORSwitch1, and all even interfaces from NIC0 are connected
to TORSwitch2. The Baseboard Management Controller (BMC) interfaces
of the servers are connected to OOBSwitch1.
The following recommendations apply to all types of nodes:
Use the Link Aggregation Control Protocol (LACP) bonding mode
with MC-LAG domains configured on leaf switches. This corresponds to
the 802.3ad bond mode on hosts.
Use ports from different multi-port NICs when creating bonds. This keeps
network connections redundant if a single NIC fails.
Configure the ports that connect servers to the PXE network with PXE VLAN
as native or untagged. On these ports, configure LACP fallback to ensure
that the servers can reach DHCP server and boot over network.
When setting up the network range for DHCP Preboot Execution Environment
(PXE), keep in mind several considerations to ensure smooth server
provisioning:
Determine the network size. For instance, if you target a concurrent
provision of 50+ servers, a /24 network is recommended. This specific size
is crucial as it provides sufficient scope for the DHCP server to provide
unique IP addresses to each new Media Access Control (MAC) address,
thereby minimizing the risk of collision.
The concept of collision refers to the likelihood of two or more devices
being assigned the same IP address. With a /24 network, the collision
probability using the SDBM hash function, which is used by the DHCP server,
is low. If a collision occurs, the DHCP server
provides a free address using a linear lookup strategy.
In the context of PXE provisioning, technically, the IP address does not
need to be consistent for every new DHCP request associated with the same
MAC address. However, maintaining the same IP address can enhance user
experience, making the /24 network size more of a recommendation
than an absolute requirement.
For a minimal network size, it is sufficient to cover the number of
concurrently provisioned servers plus one additional address (50 + 1).
This calculation applies after covering any exclusions that exist in the
range. You can define exclusions in the excludeRanges field of the Subnet
object, as shown in the sketch after this list. For details, see
API Reference: Subnet resource.
When the available address space is less than the minimum described above,
you will not be able to automatically provision all servers. However, you
can manually provision them by combining manual IP assignment for each
bare metal host with manual pauses. For these operations, use the
host.dnsmasqs.metal3.io/address and baremetalhost.metal3.io/detached
annotations in the BareMetalHostInventory object. For details, see
Operations Guide: Manually allocate IP addresses for bare metal hosts.
All addresses within the specified range must remain unused before
provisioning. If an IP address in-use is issued by the DHCP server to a
BOOTP client, that specific client cannot complete provisioning.
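As mentioned above, exclusions are defined in the Subnet object. A minimal
sketch with placeholder addresses follows; the cidr, includeRanges, and
excludeRanges fields follow the Subnet templates used later in this guide,
while the metadata values are illustrative:
# Hypothetical Subnet fragment: addresses listed in excludeRanges are never
# issued to hosts even though they fall within includeRanges.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: pxe-provisioning-example
  namespace: default
spec:
  cidr: 10.0.10.0/24
  includeRanges:
    - 10.0.10.51-10.0.10.200
  excludeRanges:
    - 10.0.10.60-10.0.10.69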
The management cluster requires a minimum of two storage devices per node.
Each device is used for a different type of storage.
The first device is always used for boot partitions and the root
file system. SSD is recommended. RAID device is not supported.
One storage device per server is reserved for local persistent
volumes. These volumes are served by the Local Storage Static Provisioner
(local-volume-provisioner) and used by many services of Container Cloud.
If you require all Internet access to go through a proxy server
for security and audit purposes, you can bootstrap management clusters using
proxy. The proxy server settings consist of three standard environment
variables that are set prior to the bootstrap process:
HTTP_PROXY
HTTPS_PROXY
NO_PROXY
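For example, a sketch of exporting these variables in the shell that runs the
bootstrap script; the proxy host, port, and exclusion list are placeholders
for your environment:
# Placeholders only: adjust the proxy endpoint and the NO_PROXY list.
export HTTP_PROXY=http://proxy.example.com:3128
export HTTPS_PROXY=http://proxy.example.com:3128
export NO_PROXY=127.0.0.1,localhost,10.0.0.0/8,.example.com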
These settings are not propagated to managed clusters. However, you can enable
a separate proxy access on a managed cluster using the Container Cloud web UI.
This proxy is intended for the end user needs and is not used for a managed
cluster deployment or for access to the Mirantis resources.
Caution
Since Container Cloud uses the OpenID Connect (OIDC) protocol
for IAM authentication, management clusters require
a direct non-proxy access from managed clusters.
StackLight components, which require external access, automatically use the
same proxy that is configured for Container Cloud clusters.
On managed clusters with limited Internet access, a proxy is required for
StackLight components that use HTTP and HTTPS, are disabled by default, and
need external access if enabled, for example, the Salesforce integration
and external Alertmanager notification rules.
For more details about proxy implementation in StackLight, see StackLight proxy.
For the list of Mirantis resources and IP addresses to be accessible
from the Container Cloud clusters, see Requirements.
After enabling proxy support on managed clusters, proxy is used for:
Docker traffic on managed clusters
StackLight
OpenStack on MOSK-based clusters
Warning
Any modification to the Proxy object used in any cluster, for
example, changing the proxy URL, NO_PROXY values, or
certificate, leads to cordon-drain and Docker
restart on the cluster machines.
The Container Cloud managed clusters are deployed without direct Internet
access in order to consume less Internet traffic in your cloud.
The Mirantis artifacts used during managed clusters deployment are downloaded
through a cache running on a management cluster.
The feature is enabled by default on new managed clusters
and will be automatically enabled on existing clusters during upgrade
to the latest version.
Caution
IAM operations require a direct non-proxy access
of a managed cluster to a management cluster.
To ensure the Mirantis Container Cloud stability in managing the Container
Cloud-based Mirantis Kubernetes Engine (MKE) clusters, the following MKE API
functionality is not available for the Container Cloud-based MKE clusters as
compared to the MKE clusters that are not deployed by Container Cloud.
Use the Container Cloud web UI or CLI for this functionality instead.
Public API limitations in a Container Cloud-based MKE cluster
GET /swarm
Swarm Join Tokens are filtered out for all users, including admins.
PUT /api/ucp/config-toml
All requests are forbidden.
POST /nodes/{id}/update
Requests for the following changes are forbidden:
Change Role
Add or remove the com.docker.ucp.orchestrator.swarm and
com.docker.ucp.orchestrator.kubernetes labels.
Since 2.25.1 (Cluster releases 16.0.1 and 17.0.1), Container Cloud does not
override changes in MKE configuration except the following list of parameters
that are automatically managed by Container Cloud. These parameters are always
overridden by the Container Cloud default values if modified directly using
the MKE API. For details on configuration using the MKE API, see
MKE configuration managed directly by the MKE API.
However, you can manually configure a few options from this list using the
Cluster object of a Container Cloud cluster. They are labeled with the
superscript and contain references to the
respective configuration procedures in the Comments columns of the tables.
All possible values for parameters labeled with the
superscript, which you can manually
configure using the Cluster object are described in
MKE Operations Guide: Configuration options.
MKE configuration managed directly by the MKE API
Since 2.25.1, aside from MKE parameters described in MKE configuration managed by Container Cloud,
Container Cloud does not override changes in MKE configuration that are applied
directly through the MKE API. For the configuration options and procedure, see
MKE documentation:
Mirantis cannot guarantee the expected behavior of the
functionality configured using the MKE API because customer-specific
configuration does not undergo testing within Container Cloud. Therefore,
Mirantis recommends that you test custom MKE settings configured through
the MKE API on a staging environment before applying them to production.
Mirantis Container Cloud Bootstrap v2 provides the best user experience for
setting up Container Cloud. Using Bootstrap v2, you can provision and operate
management clusters using the required objects through the Container Cloud API.
Basic concepts and components of Bootstrap v2 include:
Bootstrap cluster
Bootstrap cluster is any kind-based Kubernetes cluster that contains a
minimal set of Container Cloud bootstrap components allowing the user to
prepare the configuration for management cluster deployment and start the
deployment. The list of these components includes:
Bootstrap Controller
Controller that is responsible for:
Configuration of a bootstrap cluster with provider charts through the
bootstrap Helm bundle.
Configuration and deployment of a management cluster and
its related objects.
Helm Controller
Operator that manages Helm chart releases. It installs the Container
Cloud bootstrap and provider charts configured in the bootstrap Helm
bundle.
Public API charts
Helm charts that contain custom resource definitions for Container Cloud
resources.
Admission Controller
Controller that performs mutations and validations for the Container
Cloud resources including cluster and machines configuration.
Currently, one bootstrap cluster can be used to deploy only one
management cluster. For example, to add a new management cluster with
different settings, a new bootstrap cluster must be created from scratch.
Bootstrap region
BootstrapRegion is the first object to create in the bootstrap cluster
for the Bootstrap Controller to identify and install provider components
onto the bootstrap cluster. After that, the user can prepare and deploy a
management cluster with related resources.
The bootstrap region is a starting point for the cluster deployment. The
user needs to approve the BootstrapRegion object. Otherwise, the
Bootstrap Controller will not be triggered for the cluster deployment.
Bootstrap Helm bundle
Helm bundle that contains charts configuration for the bootstrap cluster.
This object is managed by the Bootstrap Controller that updates the provider
bundle in the BootstrapRegion object. The Bootstrap Controller always
configures provider charts listed in the regional section of the
Container Cloud release for the provider. Depending on the cluster
configuration, the Bootstrap Controller may update or reconfigure this
bundle even after the cluster deployment starts. For example, the Bootstrap
Controller enables the provider in the bootstrap cluster only after the
bootstrap region is approved for the deployment.
Management cluster deployment consists of several sequential stages.
Each stage finishes when a specific condition is met or specific configuration
applies to a cluster or its machines.
In case of issues at any deployment stage, you can identify the problem
and adjust it on the fly. The cluster deployment does not abort until all
stages complete by means of the infinite-timeout option enabled
by default in Bootstrap v2.
Infinite timeout prevents the bootstrap failure due to timeout. This option
is useful in the following cases:
The network speed is slow for downloading artifacts
The infrastructure configuration does not allow fast booting
A bare metal node inspection takes a long time when more than two HDD/SATA
disks are attached to a machine
You can track the status of each stage in the bootstrapStatus section of
the Cluster object that is updated by the Bootstrap Controller.
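For example, a sketch of inspecting this section from the bootstrap cluster;
the kubeconfig path, project, and cluster names are placeholders:
# Print the bootstrapStatus section of the Cluster object.
kubectl --kubeconfig <bootstrap-cluster-kubeconfig> \
  get cluster <mgmt-cluster-name> -n <project-name> -o yaml | grep -A 20 bootstrapStatus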
The Bootstrap Controller starts deploying the cluster after you approve the
BootstrapRegion configuration.
The following table describes the deployment states of a management cluster
that apply in strict order.
Verifies proxy configuration in the Cluster object.
If the bootstrap cluster was created without a proxy, no actions are
applied to the cluster.
2
ClusterSSHConfigured
Verifies SSH configuration for the cluster and machines.
You can provide any number of SSH public keys, which are added to
cluster machines. But the Bootstrap Controller always adds the
bootstrap-key SSH public key to the cluster configuration. The
Bootstrap Controller uses this SSH key to manage the lcm-agent
configuration on cluster machines.
The bootstrap-key SSH key is copied to a
bootstrap-key-<clusterName> object containing the cluster name in
its name.
3
ProviderUpdatedInBootstrap
Synchronizes the provider and settings of its components between the
Cluster object and bootstrap Helm bundle. Settings provided in
the cluster configuration have higher priority than the default
settings of the bootstrap cluster, except CDN.
4
ProviderEnabledInBootstrap
Enables the provider and its components if any were disabled by the
Bootstrap Controller during preparation of the bootstrap region.
A cluster and machines deployment starts after the provider enablement.
5
Nodes readiness
Waits for the provider to complete nodes deployment that comprises VMs
creation and MKE installation.
6
ObjectsCreated
Creates required namespaces and IAM secrets.
7
ProviderConfigured
Verifies the provider configuration in the provisioned cluster.
8
HelmBundleReady
Verifies the Helm bundle readiness for the provisioned cluster.
9
ControllersDisabledBeforePivot
Collects the list of deployment controllers and disables them to
prepare for pivot.
10
PivotDone
Moves all cluster-related objects from the bootstrap cluster to the
provisioned cluster. The copies of Cluster and Machine objects
remain in the bootstrap cluster to provide the status information to the
user. About every minute, the Bootstrap Controller reconciles the status
of the Cluster and Machine objects of the provisioned cluster
to the bootstrap cluster.
11
ControllersEnabledAfterPivot
Enables controllers in the provisioned cluster.
12
MachinesLCMAgentUpdated
Updates the lcm-agent configuration on machines to target LCM
agents to the provisioned cluster.
13
HelmControllerDisabledBeforeConfig
Disables the Helm Controller before reconfiguration.
14
HelmControllerConfigUpdated
Updates the Helm Controller configuration for the provisioned cluster.
15
Cluster readiness
Contains information about the global cluster status. The Bootstrap
Controller verifies that OIDC, Helm releases, and all Deployments are
ready. Once the cluster is ready, the Bootstrap Controller stops
managing the cluster.
The setup of a bootstrap cluster comprises preparation of the seed node,
configuration of environment variables, acquisition of the Container Cloud
license file, and execution of the bootstrap script.
To set up a bootstrap cluster:
Prepare the seed node:
Verify that the hardware allocated for the installation meets the
minimal requirements described in Requirements.
Install basic Ubuntu 22.04 server using standard installation images
of the operating system on the bare metal seed node.
Log in to the seed node that is running Ubuntu 22.04.
Prepare the system and network configuration:
Establish a virtual bridge using an IP address of the PXE network on the
seed node. Use the following netplan-based configuration file
as an example:
# cat /etc/netplan/config.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: false
      dhcp6: false
  bridges:
    br0:
      addresses:
        # Replace with IP address from PXE network to create a virtual bridge
        - 10.0.0.15/24
      dhcp4: false
      dhcp6: false
      # Adjust for your environment
      gateway4: 10.0.0.1
      interfaces:
        # Interface name may be different in your environment
        - ens3
      nameservers:
        addresses:
          # Adjust for your environment
          - 8.8.8.8
      parameters:
        forward-delay: 4
        stp: false
Apply the new network configuration using netplan:
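For example, assuming the standard netplan workflow on Ubuntu:
sudo netplan apply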
The system output must contain a json file with no error messages.
In case of errors, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
To verify that Docker is configured correctly and has access to Container
Cloud CDN:
Verify that the seed node has direct access to the Baseboard
Management Controller (BMC) of each bare metal host. All target
hardware nodes must be in the poweroff state.
KAAS_BM_PXE_IP
The provisioning IP address in the PXE network. This address will be
assigned on the seed node to the interface defined by the
KAAS_BM_PXE_BRIDGE parameter described below. The PXE service
of the bootstrap cluster uses this address to network boot
bare metal hosts.
172.16.59.5
KAAS_BM_PXE_MASK
The PXE network address prefix length to be used with the
KAAS_BM_PXE_IP address when assigning it to the seed node
interface.
24
KAAS_BM_PXE_BRIDGE
The PXE network bridge name that must match the name of the bridge
created on the seed node during the Set up a bootstrap cluster stage.
br0
Optional. Configure proxy settings to bootstrap the cluster using proxy:
After the bootstrap cluster is set up, the bootstrap-proxy object is
created with the provided proxy settings. You can use this object later for
the Cluster object configuration.
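For illustration, a minimal sketch of a Proxy object similar to the generated
bootstrap-proxy; the spec field names are assumptions to verify against the
Proxy resource reference, and all values are placeholders:
# Hypothetical Proxy object with placeholder values.
apiVersion: kaas.mirantis.com/v1alpha1
kind: Proxy
metadata:
  name: bootstrap-proxy
  namespace: default
spec:
  httpProxy: http://proxy.example.com:3128
  httpsProxy: http://proxy.example.com:3128
  noProxy: 127.0.0.1,localhost,10.0.0.0/8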
Deploy the bootstrap cluster:
./bootstrap.sh bootstrapv2
Make sure that port 80 is open for localhost to prevent security
requirements for the seed node:
Deploy a management cluster using the Container Cloud API
This section contains an overview of the cluster-related objects along with
the configuration procedure of these objects during deployment of a
management cluster using Bootstrap v2 through the Container Cloud API.
Overview of the cluster-related objects in the Container Cloud API/CLI
The following cluster-related objects are available through the Container
Cloud API. Use these objects to deploy a management cluster using the
Container Cloud API.
BootstrapRegion
Region and provider names for a management cluster and all related
objects. First object to create in the bootstrap cluster. For
the bootstrap region definition, see Introduction.
SSHKey
Optional. SSH configuration with any number of SSH public keys to be
added to cluster machines.
By default, any bootstrap cluster has a pregenerated bootstrap-key
object to use for the cluster configuration. This is the service SSH key
used by the Bootstrap Controller to access machines for their
deployment. The private part of bootstrap-key is always saved to
kaas-bootstrap/ssh_key.
Proxy
Proxy configuration. Mandatory for offline environments with no direct
access to the Internet. Such configuration usually contains proxy for
the bootstrap cluster and already has the bootstrap-proxy object
to use in the cluster configuration by default.
Cluster
Provider configuration for a management cluster. Before Container Cloud
2.26.0 (Cluster releases 17.1.0 and 16.1.0), requires the region name
label with the name of the BootstrapRegion object.
Machine
Machine configuration that must fit the following requirements:
Role - only manager
Number - odd for the management cluster HA
Mandatory labels - provider, cluster-name, and region
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
ServiceUser
Service user is the initial user to create in Keycloak for
access to a newly deployed management cluster. By default, it has the
global-admin, operator (namespaced), and bm-pool-operator
(namespaced) roles.
You can delete serviceuser after setting up other required users with
specific roles or after any integration with an external identity provider,
such as LDAP.
BareMetalHost (Private API since 2.29.0 (16.4.0))
Information about hardware configuration of a machine. Required for
further machine selection during bootstrap. For details, see
API Reference: BareMetalHost.
BareMetalHostInventory (Available since 2.29.0 (16.4.0))
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
BareMetalHostCredential
The object is created for each BareMetalHostInventory and contains
information about the Baseboard Management Controller (bmc)
credentials. For details, see API Reference: BareMetalHostCredential.
BareMetalHostProfile
Provisioning and configuration settings of the storage devices and the
operating system. For details, see API Reference: BareMetalHostProfile.
L2Template
Advanced host networking configuration for clusters, which enables, for
example, creation of bond interfaces on top of physical interfaces on the
host or the use of multiple subnets to separate different types of network
traffic. For details, see API Reference: L2Template.
MetalLBConfig
Before Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0),
contains a reference to the MetalLBConfigTemplate object, which is
deprecated in 2.27.0 and unsupported since Container Cloud 2.28.0 (Cluster
releases 17.3.0 and 16.3.0).
MetalLBConfigTemplate
Deprecated in Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0)
and unsupported since Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0). Before Container Cloud 2.27.0, the default object for the MetalLB
configuration, which enables the use of Subnet objects to define MetalLB
IP address pools. For details, see API Reference: MetalLBConfigTemplate.
Subnet
Configuration for IP address allocation for cluster nodes. For details, see
API Reference: Subnet.
The following procedure describes how to prepare and deploy a management
cluster using Bootstrap v2 by operating YAML templates available in the
kaas-bootstrap/templates/ folder.
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying objects containing credentials.
Such Container Cloud objects include:
BareMetalHostCredential
ClusterOIDCConfiguration
License
Proxy
ServiceUser
TLSConfig
Therefore, do not use kubectl apply on these objects.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on these objects, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the objects using kubectl edit.
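For example, a sketch of the recommended workflow for such objects; the file
and object names are placeholders:
# Create the object without storing its manifest in the
# last-applied-configuration annotation.
kubectl create -f <path-to-proxy-manifest>.yaml
# For later changes, patch or edit the object instead of re-applying it.
kubectl edit proxy <proxy-name> -n <project-name>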
Create the BootstrapRegion object by modifying
bootstrapregion.yaml.template.
Configuration of bootstrapregion.yaml.template
Select from the following options:
Since Container Cloud 2.26.0 (Cluster releases 16.1.0 and 17.1.0),
set provider:baremetal and use the default <regionName>,
which is region-one.
Before Container Cloud 2.26.0, set the required <providerName>
and <regionName>.
Create the ServiceUser object by modifying
serviceusers.yaml.template.
Configuration of serviceusers.yaml.template
Service user is the initial user to create in Keycloak for
access to a newly deployed management cluster. By default, it has the
global-admin, operator (namespaced), and bm-pool-operator
(namespaced) roles.
You can delete serviceuser after setting up other required users with
specific roles or after any integration with an external identity provider,
such as LDAP.
The region label must match the BootstrapRegion object name.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Configure and apply the cluster configuration using cluster deployment
templates:
In cluster.yaml.template, set mandatory cluster labels:
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Configure provider settings as required:
Inspect the default bare metal host profile definition in
templates/bm/baremetalhostprofiles.yaml.template and adjust it to fit
your hardware configuration. For details, see Customize the default bare metal host profile.
Warning
Any data stored on any device defined in the fileSystems
list can be deleted or corrupted during cluster (re)deployment. It happens
because each device from the fileSystems list is a part of the
rootfs directory tree that is overwritten during (re)deployment.
Examples of affected devices include:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field (deprecated) or wipeDevice structure (recommended
since Container Cloud 2.26.0) have no effect in this case and cannot
protect data on these devices.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
In templates/bm/baremetalhostinventory.yaml.template, update the bare metal
host definitions according to your environment configuration. Use the reference
table below to manually set all parameters that start with SET_.
Note
Before Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0),
also set the name of the bootstrapRegion object from
bootstrapregion.yaml.template for the kaas.mirantis.com/region label
across all objects listed in templates/bm/baremetalhosts.yaml.template.
The MAC address of the first master node in the PXE network.
ac:1f:6b:02:84:71
SET_MACHINE_0_BMC_ADDRESS
The IP address of the BMC endpoint for the first master node in
the cluster. Must be an address from the OOB network
that is accessible through the management network gateway.
The MAC address of the second master node in the PXE network.
ac:1f:6b:02:84:72
SET_MACHINE_1_BMC_ADDRESS
The IP address of the BMC endpoint for the second master node in
the cluster. Must be an address from the OOB network
that is accessible through the management network gateway.
The MAC address of the third master node in the PXE network.
ac:1f:6b:02:84:73
SET_MACHINE_2_BMC_ADDRESS
The IP address of the BMC endpoint for the third master node in
the cluster. Must be an address from the OOB network
that is accessible through the management network gateway.
The parameter requires a user name and password in plain text.
Configure cluster network:
Important
Bootstrap V2 supports only separated
PXE and LCM networks.
To ensure successful bootstrap, enable asymmetric routing on the interfaces
of the management cluster nodes. This is required because the seed node
relies on one network by default, which can potentially cause
traffic asymmetry.
In the kernelParameters section of
bm/baremetalhostprofiles.yaml.template, set rp_filter to 2.
This enables loose mode as defined in
RFC3704.
Example configuration of asymmetric routing
...
kernelParameters:
  ...
  sysctl:
    # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
    net.ipv4.conf.k8s-lcm.rp_filter: "2"
    # Enables the "Loose mode" for the "bond0" interface (PXE network)
    net.ipv4.conf.bond0.rp_filter: "2"
...
Note
More complicated solutions that are not described in this manual
include getting rid of traffic asymmetry, for example:
Configure source routing on management cluster nodes.
Plug the seed node into the same networks as the management cluster nodes,
which requires custom configuration of the seed node.
Update the network objects definition in
templates/bm/ipam-objects.yaml.template according to the environment
configuration. By default, this template implies the use of separate PXE
and life-cycle management (LCM) networks.
Manually set all parameters that start with SET_.
For configuration details of bond network interface for the PXE and management
network, see Configure NIC bonding.
Example of the default L2 template snippet for a management cluster:
In this example, the following configuration applies:
A bond of two NIC interfaces
A static address in the PXE network set on the bond
An isolated L2 segment for the LCM network is configured using
the k8s-lcm VLAN with the static address in the LCM network
The default gateway address is in the LCM network
For general concepts of configuring separate PXE and LCM networks for a
management cluster, see Separate PXE and management networks. For the latest object
templates and variable names to use, see the following tables.
The below table contains examples of mandatory parameter values to set
in templates/bm/ipam-objects.yaml.template for the network scheme that
has the following networks:
172.16.59.0/24 - PXE network
172.16.61.0/25 - LCM network
Mandatory network parameters of the IPAM objects template
Parameter
Description
Example value
SET_PXE_CIDR
The IP address of the PXE network in the CIDR notation. The minimum
recommended network size is 256 addresses (/24 prefix length).
172.16.59.0/24
SET_PXE_SVC_POOL
The IP address range to use for endpoints of load balancers in the PXE
network for the Container Cloud services: Ironic-API, DHCP server,
HTTP server, and caching server. The minimum required range size is
5 addresses.
172.16.59.6-172.16.59.15
SET_PXE_ADDR_POOL
The IP address range in the PXE network to use for dynamic address
allocation for hosts during inspection and provisioning.
The minimum recommended range size is 30 addresses for management
cluster nodes if it is located in a separate PXE network segment.
Otherwise, it depends on the number of managed cluster nodes to
deploy in the same PXE network segment as the management cluster nodes.
172.16.59.51-172.16.59.200
SET_PXE_ADDR_RANGE
The IP address range in the PXE network to use for static address
allocation on each management cluster node. The minimum recommended
range size is 6 addresses.
172.16.59.41-172.16.59.50
SET_MGMT_CIDR
The IP address of the LCM network for the management cluster
in the CIDR notation.
If managed clusters will have their separate LCM networks, those
networks must be routable to the LCM network. The minimum
recommended network size is 128 addresses (/25 prefix length).
172.16.61.0/25
SET_MGMT_NW_GW
The default gateway address in the LCM network. This gateway
must provide access to the OOB network of the Container Cloud cluster
and to the Internet to download the Mirantis artifacts.
172.16.61.1
SET_LB_HOST
The IP address of the externally accessible MKE API endpoint
of the cluster in the CIDR notation. This address must be within
the management SET_MGMT_CIDR network but must NOT overlap
with any other addresses or address ranges within this network.
External load balancers are not supported.
172.16.61.5/32
SET_MGMT_DNS
An external (non-Kubernetes) DNS server accessible from the
LCM network.
8.8.8.8
SET_MGMT_ADDR_RANGE
The IP address range that includes addresses to be allocated to
bare metal hosts in the LCM network for the management cluster.
When this network is shared with managed clusters, the size of this
range limits the number of hosts that can be deployed in all clusters
sharing this network.
When this network is solely used by a management cluster, the range
must include at least 6 addresses for bare metal hosts of the
management cluster.
172.16.61.30-172.16.61.40
SET_MGMT_SVC_POOL
The IP address range to use for the externally accessible endpoints
of load balancers in the LCM network for the Container Cloud
services, such as Keycloak, web UI, and so on. The minimum required
range size is 19 addresses.
172.16.61.10-172.16.61.29
SET_VLAN_ID
The VLAN ID used for isolation of LCM network. The
bootstrap.sh process and the seed node must have routable
access to the network in this VLAN.
3975
When using separate PXE and LCM networks, the management cluster
services are exposed in different networks using two separate MetalLB
address pools:
Services exposed through the PXE network are as follows:
Ironic API as a bare metal provisioning server
HTTP server that provides images for network boot and server
provisioning
Caching server for accessing the Container Cloud artifacts deployed
on hosts
Services exposed through the LCM network are all other
Container Cloud services, such as Keycloak, web UI, and so on.
The default MetalLB configuration described in the MetalLBConfig
object template of templates/bm/metallbconfig.yaml.template uses two
separate MetalLB address pools. Also, it uses the interfaces selector
in its l2Advertisements template.
Caution
When you change the L2Template object template in
templates/bm/ipam-objects.yaml.template, ensure that interfaces
listed in the interfaces field of the
MetalLBConfig.spec.l2Advertisements section match those used in your
L2Template. For details about the interfaces selector, see
API Reference: MetalLBConfig spec.
In cluster.yaml.template, update the cluster-related
settings to fit your deployment.
Optional. Technology Preview. Deprecated since Container Cloud 2.29.0 (Cluster
releases 17.4.0 and 16.4.0). Available since Container Cloud 2.24.0 (Cluster
release 14.0.0). Enable WireGuard for traffic encryption on the Kubernetes
workloads network.
WireGuard configuration
Ensure that the Calico MTU size is at least 60 bytes smaller than the
interface MTU size of the workload network. IPv4 WireGuard uses a
60-byte header. For details, see Set the MTU size for Calico.
In templates/bm/cluster.yaml.template, enable WireGuard by adding
the secureOverlay parameter:
spec:
  ...
  providerSpec:
    value:
      ...
      secureOverlay: true
Caution
Changing this parameter on a running cluster causes a
downtime that can vary depending on the cluster size.
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Configure the provider-specific settings:
Inspect the machines.yaml.template and adjust spec and labels of
each entry according to your deployment. Adjust
spec.providerSpec.value.hostSelector values to match
BareMetalHostInventory corresponding to each machine. For details, see
API Reference: Bare metal Machine spec.
Monitor the inspection process of the bare metal hosts and wait until all
hosts are in the available state:
kubectl get bmh -o go-template='{{- range .items -}} {{.status.provisioning.state}}{{"\n"}} {{- end -}}'
Example of system response:
available
available
available
Monitor the BootstrapRegion object status and wait until it is ready.
For a more convenient system response, consider using dedicated tools such
as jq or yq and adjust the -o flag to output in
json or yaml format accordingly.
Note
Before Container Cloud 2.26.0 (Cluster release 16.1.0),
the BareMetalObjectReferences condition is not mandatory and may
remain in the notready state with no effect on the
BootstrapRegion object. Since Container Cloud 2.26.0, this condition
is mandatory.
Change the directory to /kaas-bootstrap/.
Approve the BootstrapRegion object to start the cluster deployment:
Since Container Cloud 2.26.0 (Cluster release 16.1.0)
./container-cloud bootstrap approve all
Before Container Cloud 2.26.0 (Cluster release 16.0.x or earlier)
Not all Swarm and MCR addresses are usually in use. One Swarm Ingress
network is created by default and occupies the 10.0.0.0/24 address
block. Also, three MCR networks are created by default and occupy
three address blocks: 10.99.0.0/20, 10.99.16.0/20,
10.99.32.0/20.
To verify the actual networks state and addresses in use, run:
docker network ls
docker network inspect <networkName>
Optional. If you plan to use multiple L2 segments for provisioning of
managed cluster nodes, consider the requirements specified in
Configure multiple DHCP address ranges.
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
Before adding new BareMetalHostInventory objects, configure hardware hosts
to correctly boot them over the PXE network.
Important
Consider the following common requirements for hardware hosts
configuration:
Update firmware for BIOS and Baseboard Management Controller (BMC) to the
latest available version, especially if you are going to apply the UEFI
configuration.
Container Cloud uses the ipxe.efi binary loader, which might not be
compatible with old firmware and can have vendor-related issues with UEFI
booting. For example, the Supermicro issue.
In this case, we recommend using the legacy booting format.
Configure all or at least the PXE NIC on switches.
If the hardware host has more than one PXE NIC to boot, we strongly
recommend setting up only one in the boot order. It speeds up the
provisioning phase significantly.
Some hardware vendors require a host to be rebooted during BIOS
configuration changes from legacy to UEFI or vice versa for the
extra option with NIC settings to appear in the menu.
Connect only one Ethernet port on a host to the PXE network at any given
time. Collect the physical address (MAC) of this interface and use it to
configure the BareMetalHostInventory object describing the host.
To configure BIOS on a bare metal host:
Legacy hardware host configuration
Enable the global BIOS mode using
BIOS > Boot > boot mode select > legacy. Reboot the host
if required.
Enable the LAN-PXE-OPROM support using the following menus:
This section describes the bare metal host profile settings and
instructs how to configure this profile before deploying
Mirantis Container Cloud on physical servers.
The bare metal host profile is a Kubernetes custom resource.
It allows the Infrastructure Operator to define how the storage devices
and the operating system are provisioned and configured.
The bootstrap templates for a bare metal deployment include the template for
the default BareMetalHostProfile object in the following file
that defines the default bare metal host profile:
templates/bm/baremetalhostprofiles.yaml.template
Note
Using BareMetalHostProfile, you can configure LVM or mdadm-based
software RAID support during a management or managed cluster
creation. For details, see Configure RAID support.
This feature is available as Technology Preview. Use such
configuration for testing and evaluation purposes only. For the
Technology Preview feature definition, refer to Technology Preview features.
Warning
Any data stored on any device defined in the fileSystems
list can be deleted or corrupted during cluster (re)deployment. It happens
because each device from the fileSystems list is a part of the
rootfs directory tree that is overwritten during (re)deployment.
Examples of affected devices include:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field (deprecated) or wipeDevice structure (recommended
since Container Cloud 2.26.0) have no effect in this case and cannot
protect data on these devices.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
The customization procedure of BareMetalHostProfile is almost the same for
the management and managed clusters, with the following differences:
For a management cluster, the customization automatically applies
to machines during bootstrap. And for a managed cluster, you apply
the changes using kubectl before creating a managed cluster.
For a management cluster, you edit the default
baremetalhostprofiles.yaml.template. And for a managed cluster, you
create a new BareMetalHostProfile with the necessary configuration.
For the procedure details, see Create a custom bare metal host profile.
Use this procedure for both types of clusters considering the differences
described above.
You can configure L2 templates for the management cluster to set up
a bond network interface for the PXE and management network.
This configuration must be applied to the bootstrap templates,
before you run the bootstrap script to deploy the management
cluster.
Configuration requirements for NIC bonding
Add at least two physical interfaces to each host in your management
cluster.
Connect at least two interfaces per host to an Ethernet switch
that supports Link Aggregation Control Protocol (LACP)
port groups and LACP fallback.
Configure an LACP group on the ports connected
to the NICs of a host.
Configure the LACP fallback on the port group to ensure that
the host can boot over the PXE network before the bond interface
is set up on the host operating system.
Configure server BIOS for both NICs of a bond to be PXE-enabled.
If the server does not support booting from multiple NICs,
configure the port of the LACP group that is connected to the
PXE-enabled NIC of a server to be the primary port.
With this setting, the port becomes active in the fallback mode.
Configure the ports that connect servers to the PXE network with the
PXE VLAN as native or untagged.
For reference configuration of network fabric in a baremetal-based cluster,
see Network fabric.
To configure a bond interface that aggregates two interfaces
for the PXE and management network:
In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:
Verify that only the following parameters for the declaration
of {{nic0}} and {{nic1}} are set, as shown in the example
below:
dhcp4
dhcp6
match
set-name
Remove other parameters.
Verify that the declaration of the bond interface bond0 has the
interfaces parameter listing both Ethernet interfaces.
Verify that the node address in the PXE network (ip "bond0:mgmt-pxe"
in the example below) is bound to the bond interface or to the virtual
bridge interface tied to that bond.
Caution
No VLAN ID must be configured for the PXE network
from the host side.
Configure bonding options using the parameters field. The only
mandatory option is mode. See the example below for details.
Note
You can set any mode supported by
netplan
and your hardware.
Important
Bond monitoring is disabled in Ubuntu by default. However,
Mirantis highly recommends enabling it using Media Independent Interface
(MII) monitoring by setting the mii-monitor-interval parameter to a
non-zero value. For details, see Linux documentation: bond monitoring.
Verify your configuration using the following example:
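The example below is a minimal sketch of such a bond declaration in kaas-bootstrap/templates/bm/ipam-objects.yaml.template. The 802.3ad mode follows the LACP requirements above, while the monitoring interval value and the address macro name are illustrative assumptions:
bonds:
  bond0:
    interfaces:
      - {{ nic 0 }}
      - {{ nic 1 }}
    parameters:
      mode: 802.3ad              # any mode supported by netplan and your hardware
      mii-monitor-interval: 100  # recommended: enable MII monitoring (assumed value)
    dhcp4: false
    dhcp6: false
    addresses:
      # node address in the PXE/management network, bound to the bond interface
      - {{ ip "bond0:mgmt-pxe" }}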
This section describes how to configure a dedicated PXE network for a
management bare metal cluster.
A separate PXE network allows isolating the sensitive bare metal provisioning
process from end users. The users still have access to Container Cloud
services, such as Keycloak, to authenticate workloads in managed clusters,
such as Horizon in a Mirantis OpenStack for Kubernetes cluster.
The following table describes the overall network mapping scheme with all
L2/L3 parameters, for example, for two networks, PXE (CIDR 10.0.0.0/24)
and management (CIDR 10.0.11.0/24):
When using separate PXE and management networks, the management cluster
services are exposed in different networks using two separate MetalLB
address pools:
Services exposed through the PXE network are as follows:
Ironic API as a bare metal provisioning server
HTTP server that provides images for network boot and server
provisioning
Caching server for accessing the Container Cloud artifacts deployed
on hosts
Services exposed through the management network are all other Container Cloud
services, such as Keycloak, web UI, and so on.
To configure separate PXE and management networks:
To ensure successful bootstrap, enable asymmetric routing on the interfaces
of the management cluster nodes. This is required because the seed node
relies on one network by default, which can potentially cause
traffic asymmetry.
In the kernelParameters section of
bm/baremetalhostprofiles.yaml.template, set rp_filter to 2.
This enables loose mode as defined in
RFC3704.
Example configuration of asymmetric routing
...
kernelParameters:
  ...
  sysctl:
    # Enables the "Loose mode" for the "k8s-lcm" interface (management network)
    net.ipv4.conf.k8s-lcm.rp_filter: "2"
    # Enables the "Loose mode" for the "bond0" interface (PXE network)
    net.ipv4.conf.bond0.rp_filter: "2"
...
Note
More complicated solutions that are not described in this manual
include getting rid of traffic asymmetry, for example:
Configure source routing on management cluster nodes.
Plug the seed node into the same networks as the management cluster nodes,
which requires custom configuration of the seed node.
In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:
Substitute all the Subnet object templates with the new ones
as described in the example template below
Update the L2 template spec.l3Layout and spec.npTemplate fields
as described in the example template below
Example of the Subnet object templates
# Subnet object that provides IP addresses for bare metal hosts of
# management cluster in the PXE network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-pxe
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas-mgmt-pxe-subnet: ""
spec:
  cidr: SET_IPAM_CIDR
  gateway: SET_PXE_NW_GW
  nameservers:
    - SET_PXE_NW_DNS
  includeRanges:
    - SET_IPAM_POOL_RANGE
  excludeRanges:
    - SET_METALLB_PXE_ADDR_POOL
---
# Subnet object that provides IP addresses for bare metal hosts of
# management cluster in the management network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-lcm
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas-mgmt-lcm-subnet: ""
    ipam/SVC-k8s-lcm: "1"
    ipam/SVC-ceph-cluster: "1"
    ipam/SVC-ceph-public: "1"
    cluster.sigs.k8s.io/cluster-name: CLUSTER_NAME
spec:
  cidr: {{SET_LCM_CIDR}}
  includeRanges:
    - {{SET_LCM_RANGE}}
  excludeRanges:
    - SET_LB_HOST
    - SET_METALLB_ADDR_POOL
---
# Deprecated since 2.27.0. Subnet object that provides configuration
# for "services-pxe" MetalLB address pool that will be used to expose
# services LB endpoints in the PXE network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-pxe-lb
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    metallb/address-pool-name: services-pxe
    metallb/address-pool-protocol: layer2
    metallb/address-pool-auto-assign: "false"
    cluster.sigs.k8s.io/cluster-name: CLUSTER_NAME
spec:
  cidr: SET_IPAM_CIDR
  includeRanges:
    - SET_METALLB_PXE_ADDR_POOL
Deprecated since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0): the last Subnet template named mgmt-pxe-lb in the example
above will be used to configure the MetalLB address pool in the PXE network.
The bare metal provider will automatically configure MetalLB
with address pools using the Subnet objects identified by specific
labels.
Warning
The bm-pxe address must have a separate interface
with only one address on this interface.
Verify the current MetalLB configuration that is stored in MetalLB
objects:
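For example, assuming MetalLB runs in the standard metallb-system namespace of the management cluster:
kubectl --kubeconfig <management-kubeconfig> -n metallb-system get ipaddresspools.metallb.io
kubectl --kubeconfig <management-kubeconfig> -n metallb-system get l2advertisements.metallb.io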
The auto-assign parameter will be set to false for all address
pools except the default one. So, a particular service will get an
address from such an address pool only if the Service object has a
special metallb.universe.tf/address-pool annotation that points to
the specific address pool name.
Note
It is expected that every Container Cloud service on a management
cluster will be assigned to one of the address pools.
Currently, two MetalLB address pools are used:
services-pxe is a reserved address pool name to use for
the Container Cloud services in the PXE network (Ironic API,
HTTP server, caching server).
The bootstrap cluster also uses the services-pxe address
pool for its provisioning services so that management cluster nodes
can be provisioned from the bootstrap cluster. After the
management cluster is deployed, the bootstrap cluster is
deleted and that address pool is solely used by the newly
deployed cluster.
default is an address pool to use for all other Container
Cloud services in the management network. No annotation
is required on the Service objects in this case.
Select from the following options for configuration of the
dedicatedMetallbPools flag:
Since Container Cloud 2.25.0
Skip this step because the flag is hardcoded to true.
Since Container Cloud 2.24.0
Verify that the flag is set to the default true value.
The flag enables splitting of LB endpoints for the Container
Cloud services. The metallb.universe.tf/address-pool annotations on
the Service objects are configured by the bare metal provider
automatically when the dedicatedMetallbPools flag is set to true.
Example Service object configured by the baremetal-operator Helm
release:
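A minimal sketch of such a Service is shown below. The kaas namespace and all fields other than the annotation are illustrative assumptions; only the annotation is essential to this step:
apiVersion: v1
kind: Service
metadata:
  name: ironic-api
  namespace: kaas            # assumed namespace of the Container Cloud services
  annotations:
    # set automatically by the bare metal provider when dedicatedMetallbPools is true
    metallb.universe.tf/address-pool: services-pxe
spec:
  type: LoadBalancer
  ports:
    - port: 6385             # illustrative Ironic API port
      targetPort: 6385
  selector:
    app: ironic              # illustrative selector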
The metallb.universe.tf/address-pool annotation on the Service
object is set to services-pxe by the baremetal provider, so the
ironic-api service will be assigned an LB address from the
corresponding MetalLB address pool.
In addition to the network parameters defined in Deploy a management cluster using CLI,
configure the following ones by replacing them in
templates/bm/ipam-objects.yaml.template:
SET_LCM_CIDR
Address of a management network for the management cluster
in the CIDR notation. You can later share this network with managed
clusters where it will act as the LCM network.
If managed clusters have their separate LCM networks,
those networks must be routable to the management network.
10.0.11.0/24
SET_LCM_RANGE
Address range that includes addresses to be allocated to
bare metal hosts in the management network for the management
cluster. When this network is shared with managed clusters,
the size of this range limits the number of hosts that can be
deployed in all clusters that share this network.
When this network is solely used by a management cluster,
the range should include at least 3 IP addresses
for bare metal hosts of the management cluster.
10.0.11.100-10.0.11.109
SET_METALLB_PXE_ADDR_POOL
Address range to be used for LB endpoints of the Container Cloud
services: Ironic-API, HTTP server, and caching server.
This range must be within the PXE network.
The minimum required range is 5 IP addresses.
10.0.0.61-10.0.0.70
The following parameters will now be tied to the management network
while their meaning remains the same as described in
Deploy a management cluster using CLI:
Subnet template parameters migrated to management network
Parameter
Description
Example value
SET_LB_HOST
IP address of the externally accessible API endpoint
of the management cluster. This address must NOT be
within the SET_METALLB_ADDR_POOL range but within the
management network. External load balancers are not supported.
10.0.11.90
SET_METALLB_ADDR_POOL
The address range to be used for the externally accessible LB
endpoints of the Container Cloud services, such as Keycloak, web UI,
and so on. This range must be within the management network.
The minimum required range is 19 IP addresses.
To facilitate multi-rack and other types of distributed bare metal datacenter
topologies, the dnsmasq DHCP server used for host provisioning in Container
Cloud supports working with multiple L2 segments through network routers that
support DHCP relay.
Container Cloud has its own DHCP relay running on one of the management
cluster nodes. That DHCP relay proxies DHCP requests in the
same L2 domain where the management cluster nodes are located.
Caution
Networks used for hosts provisioning of a managed cluster
must have routes to the PXE network (when a dedicated PXE network
is configured) or to the combined PXE/management network
of the management cluster. This configuration enables hosts to
have access to the management cluster services that are used
during host provisioning.
Management cluster nodes must have routes through the PXE network
to PXE network segments used on a managed cluster.
The following example contains L2 template fragments for a
management cluster node:
l3Layout:
  # PXE/static subnet for a management cluster
  - scope: namespace
    subnetName: kaas-mgmt-pxe
    labelSelector:
      kaas-mgmt-pxe-subnet: "1"
  # management (LCM) subnet for a management cluster
  - scope: namespace
    subnetName: kaas-mgmt-lcm
    labelSelector:
      kaas-mgmt-lcm-subnet: "1"
  # PXE/dhcp subnets for a managed cluster
  - scope: namespace
    subnetName: managed-dhcp-rack-1
  - scope: namespace
    subnetName: managed-dhcp-rack-2
  - scope: namespace
    subnetName: managed-dhcp-rack-3
  ...
npTemplate: |
  ...
  bonds:
    bond0:
      interfaces:
        - {{ nic 0 }}
        - {{ nic 1 }}
      parameters:
        mode: active-backup
        primary: {{ nic 0 }}
        mii-monitor-interval: 100
      dhcp4: false
      dhcp6: false
      addresses:
        # static address on management node in the PXE network
        - {{ ip "bond0:kaas-mgmt-pxe" }}
      routes:
        # routes to managed PXE network segments
        - to: {{ cidr_from_subnet "managed-dhcp-rack-1" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
        - to: {{ cidr_from_subnet "managed-dhcp-rack-2" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
        - to: {{ cidr_from_subnet "managed-dhcp-rack-3" }}
          via: {{ gateway_from_subnet "kaas-mgmt-pxe" }}
  ...
To configure DHCP ranges for dnsmasq, create the Subnet objects
tagged with the ipam/SVC-dhcp-range label while setting up subnets
for a managed cluster using CLI.
Caution
Support of multiple DHCP ranges has the following limitations:
Use of custom DNS server addresses for servers that boot over PXE
is not supported.
The Subnet objects for DHCP ranges cannot be associated with any
specific cluster, as DHCP server configuration is only applicable to the
management cluster where DHCP server is running.
The cluster.sigs.k8s.io/cluster-name label will be ignored.
Note
Before the Cluster release 16.1.0, the Subnet object contains
the kaas.mirantis.com/region label that specifies the region
where the DHCP ranges will be applied.
Migration of DHCP configuration for existing management clusters
Note
This section applies only to existing management clusters that
are created before Container Cloud 2.24.0.
Caution
Since Container Cloud 2.24.0, you can only remove the deprecated
dnsmasq.dhcp_range, dnsmasq.dhcp_ranges, dnsmasq.dhcp_routers,
and dnsmasq.dhcp_dns_servers values from the cluster spec.
The Admission Controller does not accept any other changes in these values.
This configuration is completely superseded by the Subnet object.
The DHCP configuration automatically migrated from the cluster spec to
Subnet objects after cluster upgrade to 2.21.0.
To remove the deprecated dnsmasq parameters from the cluster spec:
Open the management cluster spec for editing.
In the baremetal-operator release values, remove the
dnsmasq.dhcp_range, dnsmasq.dhcp_ranges, dnsmasq.dhcp_routers,
and dnsmasq.dhcp_dns_servers parameters. For example:
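The following sketch shows the relevant fragment of the baremetal-operator release values; the exact nesting in the Cluster spec and the address values are illustrative:
# Remove the deprecated keys under the baremetal-operator release values:
- name: baremetal-operator
  values:
    dnsmasq:
      dhcp_range: 10.0.0.100,10.0.0.254,255.255.255.0   # remove
      dhcp_routers: 10.0.0.1                             # remove
      dhcp_dns_servers: 10.0.0.1                         # remove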
The dnsmasq.dhcp_<name> parameters of the
baremetal-operator Helm chart values in the Cluster spec are
deprecated since the Cluster release 11.5.0 and removed in the
Cluster release 14.0.0.
Ensure that the required DHCP ranges and options are set in the Subnet
objects. For configuration details, see Configure DHCP ranges for dnsmasq.
The dnsmasq configuration options dhcp-option=3 and dhcp-option=6
are absent in the default configuration. So, by default, dnsmasq
will send the DNS server and default route to DHCP clients as defined in the
dnsmasq official documentation:
The netmask and broadcast address are the same as on the host
running dnsmasq.
The DNS server and default route are set to the address of the host
running dnsmasq.
If the domain name option is set, this name is sent to DHCP clients.
Create the Subnet objects tagged with the ipam/SVC-dhcp-range label.
Caution
For cluster-specific subnets, create Subnet objects in the
same namespace as the related Cluster object project. For shared
subnets, create Subnet objects in the default namespace.
Setting of custom nameservers in the DHCP subnet is not supported.
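A minimal sketch of such a Subnet object; the object name and the address values are illustrative:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: dhcp-range-rack-1          # example name
  namespace: default               # default for shared subnets, project namespace otherwise
  labels:
    kaas.mirantis.com/provider: baremetal
    ipam/SVC-dhcp-range: "1"
spec:
  cidr: 10.0.50.0/24
  includeRanges:
    - 10.0.50.100-10.0.50.200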
After creation of the above Subnet object, the provided data will be
utilized to render the Dnsmasq object used for configuration of the
dnsmasq deployment. You do not have to manually edit the Dnsmasq object.
Verify that the changes are applied to the Dnsmasq object:
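For example, assuming the Dnsmasq object resides in the kaas namespace of the management cluster:
kubectl --kubeconfig <management-kubeconfig> -n kaas get dnsmasq -o yaml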
For servers to access the DHCP server across the L2 segment boundaries,
for example, from another rack with a different VLAN for the PXE network,
you must configure the DHCP relay (agent) service on the border switch
of the segment, for example, on a top-of-rack (ToR) or leaf (distribution)
switch, depending on the data center network topology.
Warning
To ensure predictable routing for the relay of DHCP packets,
Mirantis strongly advises against the use of chained DHCP relay
configurations. This precaution limits the number of hops for DHCP packets,
with an optimal scenario being a single hop.
This approach is justified by the unpredictable nature of chained relay
configurations and potential incompatibilities between software and
hardware relay implementations.
The dnsmasq server listens on the PXE network of the management
cluster by using the dhcp-lb Kubernetes Service.
To configure the DHCP relay service, specify the external address of the
dhcp-lb Kubernetes Service as an upstream address for the relayed DHCP
requests, which is the IP helper address for DHCP. There is the dnsmasq
deployment behind this service that can only accept relayed DHCP requests.
Container Cloud has its own DHCP relay running on one of the management
cluster nodes. That DHCP relay proxies DHCP requests in the
same L2 domain where the management cluster nodes are located.
To obtain the actual IP address issued to the dhcp-lb Kubernetes
Service:
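For example, assuming the dhcp-lb Service resides in the kaas namespace of the management cluster:
kubectl --kubeconfig <management-kubeconfig> -n kaas get svc dhcp-lb \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}'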
This section instructs you on how to enable the dynamic IP allocation feature
to increase the number of bare metal hosts to be provisioned in parallel on
managed clusters.
Using this feature, you can effortlessly deploy a large managed cluster by
provisioning up to 100 hosts simultaneously. In addition to dynamic
IP allocation, this feature disables the ping check in the DHCP server.
Therefore, if you plan to deploy large managed clusters, enable this feature
during the management cluster bootstrap.
Set a custom external IP address for the DHCP service
Available since Container Cloud 2.25.0 (Cluster release 16.0.0)
This section instructs you on how to set a custom external IP address for
the dhcp-lb service so that it remains the same during management cluster
upgrades and other LCM operations.
A change of the dhcp-lb service address may require reconfiguring
DHCP relays on ToR switches.
The procedure described below allows you to avoid such unwanted changes.
This configuration is useful when you use multiple DHCP address ranges
in your deployment. See Configure multiple DHCP address ranges for details.
To set a custom external IP address for the dhcp-lb service:
In the Cluster object of the management cluster, modify the
configuration of the baremetal-operator release by setting
dnsmasq.dedicated_udp_service_address_pool to true:
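A sketch of the relevant fragment; the exact nesting of the baremetal-operator Helm release values in the Cluster spec may differ in your deployment:
- name: baremetal-operator
  values:
    dnsmasq:
      dedicated_udp_service_address_pool: true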
In the MetalLBConfig object of the management cluster, modify the
ipAddressPools object list by adding the dhcp-lb object and the
serviceAllocation parameters for the default object:
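A sketch of the resulting ipAddressPools list; the address ranges and the serviceAllocation selector are illustrative assumptions to adapt to your networks:
ipAddressPools:
  - name: default
    spec:
      addresses:
        - 10.0.11.61-10.0.11.80        # example range in the management network
      autoAssign: true
      serviceAllocation:
        priority: 100
        serviceSelectors:
          # assumption: keep the default pool away from the dhcp-lb Service
          - matchExpressions:
              - key: app.kubernetes.io/name
                operator: NotIn
                values:
                  - dhcp-lb
  - name: services-pxe
    spec:
      addresses:
        - 10.0.0.61-10.0.0.70          # example range in the PXE network
      autoAssign: false
  - name: dhcp-lb
    spec:
      addresses:
        - 10.0.0.71/32                 # example: a dedicated address for the dhcp-lb Service
      autoAssign: false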
Select non-overlapping IP addresses for all the ipAddressPools that
you use: default, services-pxe, and dhcp-lb.
In the MetalLBConfig object of the management cluster, modify the
l2Advertisements object list by adding dhcp-lb to the
ipAddressPools section in the pxe object spec:
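A sketch of the resulting l2Advertisements entry; the object name and the interface are examples:
l2Advertisements:
  - name: pxe                          # your cluster may use a different name
    spec:
      ipAddressPools:
        - services-pxe
        - dhcp-lb                      # added
      interfaces:
        - bond0                        # example PXE interface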
Note
A cluster may have a different L2Advertisement object name
instead of pxe.
Consider this section as part of the Bootstrap v2
CLI procedure.
During creation of a management cluster using Bootstrap v2, you can configure
optional cluster settings using the Container Cloud API by modifying
cluster.yaml.template.
To configure optional cluster settings:
Technology Preview. Enable custom host names for cluster machines.
When enabled, any machine host name in a particular region matches the related
Machine object name. For example, instead of the default
kaas-node-<UID>, a machine host name will be master-0. The custom
naming format is more convenient and easier to operate with.
Configuration for custom host names on the management and its future
managed clusters
Since Container Cloud 2.26.0 (16.1.0)
In cluster.yaml.template, find the
spec.providerSpec.value.kaas.regional.helmReleases.name:baremetal-provider section.
Under values.config, add customHostnamesEnabled: true:
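A sketch of the resulting fragment of cluster.yaml.template; the surrounding keys are abbreviated and may differ slightly in your template:
spec:
  providerSpec:
    value:
      kaas:
        regional:
          - provider: baremetal
            helmReleases:
              - name: baremetal-provider
                values:
                  config:
                    customHostnamesEnabled: true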
Boolean, default - false. Enables the auditd role to install the
auditd packages and configure rules. CIS rules: 4.1.1.1, 4.1.1.2.
enabledAtBoot
Boolean, default - false. Configures grub to audit processes that can
be audited even if they start up prior to auditd startup. CIS rule:
4.1.1.3.
backlogLimit
Integer, default - none. Configures the backlog to hold records. If during
boot audit=1 is configured, the backlog holds 64 records. If more than
64 records are created during boot, auditd records will be lost with a
potential malicious activity being undetected. CIS rule: 4.1.1.4.
maxLogFile
Integer, default - none. Configures the maximum size of the audit log file.
Once the log reaches the maximum size, it is rotated and a new log file is
created. CIS rule: 4.1.2.1.
maxLogFileAction
String, default - none. Defines handling of the audit log file reaching the
maximum file size. Allowed values:
keep_logs - rotate logs but never delete them
rotate - add a cron job to compress rotated log files and keep
maximum 5 compressed files.
compress - compress log files and keep them under the
/var/log/auditd/ directory. Requires
auditd_max_log_file_keep to be enabled.
CIS rule: 4.1.2.2.
maxLogFileKeep
Integer, default - 5. Defines the number of compressed log files to keep
under the /var/log/auditd/ directory. Requires
auditd_max_log_file_action=compress. CIS rules - none.
mayHaltSystem
Boolean, default - false. Halts the system when the audit logs are
full. Applies the following configuration:
space_left_action=email
action_mail_acct=root
admin_space_left_action=halt
CIS rule: 4.1.2.3.
customRules
String, default - none. Base64-encoded content of the 60-custom.rules
file for any architecture. CIS rules - none.
customRulesX32
String, default - none. Base64-encoded content of the 60-custom.rules
file for the i386 architecture. CIS rules - none.
customRulesX64
String, default - none. Base64-encoded content of the 60-custom.rules
file for the x86_64 architecture. CIS rules - none.
presetRules
String, default - none. Comma-separated list of the following built-in
preset rules:
access
actions
delete
docker
identity
immutable
logins
mac-policy
modules
mounts
perm-mod
privileged
scope
session
system-locale
time-change
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0) in the
Technology Preview scope, you can collect some of the preset rules indicated
above as groups and use them in presetRules:
ubuntu-cis-rules - this group contains rules to comply with the Ubuntu
CIS Benchmark recommendations, including the following CIS Ubuntu 20.04
v2.0.1 rules:
scope - 5.2.3.1
actions - same as 5.2.3.2
time-change - 5.2.3.4
system-locale - 5.2.3.5
privileged - 5.2.3.6
access - 5.2.3.7
identity - 5.2.3.8
perm-mod - 5.2.3.9
mounts - 5.2.3.10
session - 5.2.3.11
logins - 5.2.3.12
delete - 5.2.3.13
mac-policy - 5.2.3.14
modules - 5.2.3.19
docker-cis-rules - this group contains rules to comply with
Docker CIS Benchmark recommendations, including the docker Docker CIS
v1.6.0 rules 1.1.3 - 1.1.18.
You can also use two additional keywords inside presetRules:
none - select no built-in rules.
all - select all built-in rules. When using this keyword, you can add
the ! prefix to a rule name to exclude some rules. You can use the
! prefix for rules only if you add the all keyword as the
first rule. Place a rule with the ! prefix only after
the all keyword.
Example configurations:
presetRules:none - disable all preset rules
presetRules:docker - enable only the docker rules
presetRules:access,actions,logins - enable only the
access, actions, and logins rules
presetRules:ubuntu-cis-rules - enable all rules from the
ubuntu-cis-rules group
presetRules:docker-cis-rules,actions - enable all rules from
the docker-cis-rules group and the actions rule
presetRules:all - enable all preset rules
presetRules:all,!immutable,!sessions - enable all preset
rules except immutable and sessions
Verify that the userFederation section is located
on the same level as the initUsers section.
Verify that all attributes set in the mappers section
are defined for users in the specified LDAP system.
Missing attributes may cause authorization issues.
Disable NTP that is enabled by default. This option disables the
management of chrony configuration by Container Cloud to use your own
system for chrony management. Otherwise, configure the regional NTP server
parameters as described below.
NTP configuration
Configure the regional NTP server parameters to be applied to all machines
of managed clusters.
In cluster.yaml.template, add the ntp:servers section
with the list of required server names:
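A sketch, assuming the ntp section resides under spec.providerSpec.value in cluster.yaml.template; the server names are placeholders:
spec:
  providerSpec:
    value:
      ntp:
        servers:
          - 0.pool.ntp.org
          - 1.pool.ntp.org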
Applies since Container Cloud 2.26.0 (Cluster release 16.1.0). If you plan
to deploy large managed clusters, enable dynamic IP allocation to increase
the number of bare metal hosts to be provisioned in parallel.
For details, see Enable dynamic IP allocation.
Now, you can proceed with operating your management cluster through the
Container Cloud web UI and deploying managed clusters as described in
Operations Guide.
If the BootstrapRegion object is in the Error state, find the error
type in the Status field of the object for the following components to
resolve the issue:
Field name
Troubleshooting steps
Helm
If the bootstrap HelmBundle is not ready for a long time, for example,
for 15 minutes with an average network bandwidth, verify the
statuses of non-ready releases and resolve the issue depending
on the error message of a particular release:
The deployment statuses of a Machine object are the same as the
LCMMachine object states:
Uninitialized - the machine is not yet assigned to an LCMCluster.
Pending - the agent reports a node IP address and host name.
Prepare - the machine executes StateItems that correspond
to the prepare phase. This phase usually involves downloading
the necessary archives and packages.
Deploy - the machine executes StateItems that correspond
to the deploy phase that is becoming a Mirantis Kubernetes Engine (MKE)
node.
Ready - the machine is deployed.
Upgrade - the machine is being upgraded to the new MKE version.
Reconfigure - the machine executes StateItems that correspond
to the reconfigure phase. The machine configuration is being updated
without affecting workloads running on the machine.
If the system response is empty, approve the BootstrapRegion object:
Using the Container Cloud web UI, navigate to the
Bootstrap tab and approve the related BootstrapRegion object
Using the Container Cloud CLI:
./container-cloud bootstrap approve all
If the system response is not empty and the status remains the same for a
while, the issue may relate to machine misconfiguration. Therefore, verify
and adjust the parameters of the affected Machine object.
For provider-related issues, refer to the Troubleshooting section.
If the cluster deployment is stuck on the same stage for a long time, it may
be related to configuration issues in the Machine or other deployment
objects.
To troubleshoot cluster deployment:
Identify the current deployment stage that got stuck:
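For example, you can inspect the LCM state of the cluster machines (a sketch; adjust the kubeconfig and project namespace to your deployment):
kubectl --kubeconfig <kubeconfig> -n <project-name> get lcmclusters
kubectl --kubeconfig <kubeconfig> -n <project-name> get lcmmachines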
The syslog container collects logs generated by Ansible during the node
deployment and cleanup and outputs them in the JSON format.
Note
Add COLLECT_EXTENDED_LOGS=true before the
collect_logs command to output the extended version of logs
that contains system and MKE logs, logs from LCM Ansible and LCM Agent
along with cluster events and Kubernetes resources description and logs.
Without the --extended flag, the basic version of logs is collected, which
is sufficient for most use cases. The basic version of logs contains all
events, Kubernetes custom resources, and logs from all Container Cloud
components. This version does not require passing --key-file.
The logs are collected in the directory where the bootstrap script
is located.
The Container Cloud logs structure in <output_dir>/<cluster_name>/
is as follows:
/events.log
Human-readable table that contains information about the cluster events.
/system
System logs.
/system/mke (or /system/MachineName/mke)
Mirantis Kubernetes Engine (MKE) logs.
/objects/cluster
Logs of the non-namespaced Kubernetes objects.
/objects/namespaced
Logs of the namespaced Kubernetes objects.
/objects/namespaced/<namespaceName>/core/pods
Logs of the pods from a specific Kubernetes namespace. For example, logs
of the pods from the kaas namespace contain logs of Container Cloud
controllers, including bootstrap-cluster-controller
since Container Cloud 2.25.0.
Logs of the pods from a specific Kubernetes namespace that were previously
removed or failed.
/objects/namespaced/<namespaceName>/core/pods/<ironicPodName>/syslog.log
Technology Preview. Ironic pod logs.
Note
Logs collected by the syslog container during the bootstrap phase
are not transferred to the management cluster during pivoting.
These logs are located in /volume/log/ironic/ansible_conductor.log
inside the Ironic pod.
Each log entry of the management cluster logs contains a request ID that
identifies chronology of actions performed on a cluster or machine.
The format of the log entry is as follows:
<process ID>.[<subprocess ID>...<subprocess ID N>].req:<requestID>: <logMessage>
For example, bm.machine.req:28 contains information about the task 28
applied to a bare metal machine.
Since Container Cloud 2.22.0, the logging format has the following extended
structure for the admission-controller, storage-discovery, and all
supported baremetal-provider services of a management cluster:
Informational level. Possible values: debug, info, warn,
error, panic.
ts
Time stamp in the <YYYY-MM-DDTHH:mm:ssZ> format. For example:
2022-11-14T21:37:23Z.
logger
Details on the process ID being logged:
<processID>
Primary process identifier. The list of possible values includes
bm, os, iam, license, and bootstrap.
Note
The iam and license values are available since
Container Cloud 2.23.0. The bootstrap value is available since
Container Cloud 2.25.0.
<subProcessID(s)>
One or more secondary process identifiers. The list of possible values
includes cluster, machine, controller, and cluster-ctrl.
Note
The controller value is available since Container Cloud
2.23.0. The cluster-ctrl value is available since Container Cloud
2.25.0 for the bootstrap process identifier.
req
Request ID number that increases when a service performs the following
actions:
Receives a request from Kubernetes about creating, updating,
or deleting an object
Receives an HTTP request
Runs a background process
The request ID allows combining all operations performed with an object
within one request. For example, the result of a Machine object
creation, update of its statuses, and so on has the same request ID.
caller
Code line used to apply the corresponding action to an object.
msg
Description of a deployment or update phase. If empty, it contains the
"error" key with a message followed by the "stacktrace" key with
stack trace details. For example:
"msg"="" "error"="Cluster nodes are not yet ready" "stacktrace": "<stack-trace-info>"
The log format of the following Container Cloud components does
not contain the "stacktrace" key for easier log handling:
baremetal-provider, bootstrap-provider, and
host-os-modules-controller.
Note
Logs may also include a number of informational key-value pairs
containing additional cluster details. For example,
"name":"object-name","foobar":"baz".
Depending on the type of issue found in logs, apply the corresponding fixes.
For example, if you detect the LoadBalancer ERROR state errors
during the bootstrap of an OpenStack-based management cluster,
contact your system administrator to fix the issue.
For MOSK, the feature is generally available since
MOSK 23.1.
While bootstrapping a Container Cloud management cluster using proxy, you may
require Internet access to go through a man-in-the-middle (MITM) proxy. Such
configuration requires that you enable streaming and install a CA certificate
on a bootstrap node.
Replace ~/.mitmproxy/mitmproxy-ca-cert.cer with the path to your CA
certificate.
Caution
The target CA certificate file must be in the PEM format
with the .crt extension.
Apply the changes:
sudo update-ca-certificates
Now, proceed with bootstrapping your management cluster.
Create initial users after a management cluster bootstrap
Once you bootstrap your management cluster, create Keycloak users for access
to the Container Cloud web UI. Use the created credentials to log in to the
Container Cloud web UI.
Mirantis recommends creating at least two users, user and operator,
that are required for a typical Container Cloud deployment.
To create the user for access to the Container Cloud web UI, use:
Required. Comma-separated list of roles to assign to the user.
If you run the command without the --namespace flag,
you can assign the following roles:
global-admin - read and write access for global role bindings
writer - read and write access
reader - view access
operator - create and manage access to the BareMetalHost
objects
management-admin - full access to the management cluster,
available since Container Cloud 2.25.0 (Cluster releases
17.0.0, 16.0.0, 14.1.0)
If you run the command for a specific project using the
--namespace flag, you can assign the following roles:
operator or writer - read and write access
user or reader - view access
member - read and write access (excluding IAM objects)
bm-pool-operator - create and manage access to the
BareMetalHost objects
--kubeconfig
Required. Path to the management cluster kubeconfig generated during
the management cluster bootstrap.
--namespace
Optional. Name of the Container Cloud project where the user will be
created. If not set, a global user will be created for all Container
Cloud projects with the corresponding role access to view or manage
all Container Cloud public objects.
--password-stdin
Optional. Flag to provide the user password through stdin:
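A usage sketch; the user creation subcommand and the user details below are assumptions to adapt to your Container Cloud CLI version:
echo '<user-password>' | ./container-cloud bootstrap user add \
  --username operator \
  --roles operator \
  --kubeconfig <path-to-management-kubeconfig> \
  --password-stdin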
The issue may occur because the default Docker network address
172.17.0.0/16 and/or the Docker network used by
kind overlap with your cloud address or other addresses
of the network configuration.
Workaround:
Log in to your local machine.
Verify routing to the IP addresses of the target cloud endpoints:
Obtain the IP address of your target cloud. For example:
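A sketch using generic Linux tools; replace the placeholders with your actual cloud endpoint:
host <cloud-endpoint-fqdn>          # obtain the endpoint IP address
ip route get <cloud-endpoint-ip>    # check which local route and source address are used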
If the routing is incorrect, change the IP address
of the default Docker bridge:
Create or edit /etc/docker/daemon.json by adding the "bip"
option:
{"bip":"192.168.91.1/24"}
Restart the Docker daemon:
sudo systemctl restart docker
If required, customize addresses for your kind Docker network
or any other additional Docker networks:
Remove the kind network:
docker network rm 'kind'
Choose from the following options:
Configure /etc/docker/daemon.json:
Note
The following steps customize addresses
for the kind Docker network. Use these steps as an
example for any other additional Docker networks.
Add the following section to /etc/docker/daemon.json:
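A sketch using the standard Docker default-address-pools option; the bip value repeats the earlier example, and the pool range is an illustrative assumption that must not overlap with your cloud or host networks:
{
  "bip": "192.168.91.1/24",
  "default-address-pools": [
    { "base": "192.168.92.0/22", "size": 24 }
  ]
}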
Docker pruning removes user-defined networks,
including 'kind'. Therefore, every time you run the Docker pruning
commands, re-create the 'kind' network using the command above.
This section describes how to configure authentication for Mirantis
Container Cloud depending on the external identity provider type
integrated to your deployment.
If you integrate LDAP for IAM to Mirantis Container Cloud,
add the required LDAP configuration to cluster.yaml.template
during the bootstrap of the management cluster.
Note
The example below defines the recommended non-anonymous
authentication type. If you require anonymous authentication,
replace the following parameters with authType: "none":
authType:"simple"bindCredential:""bindDn:""
To configure LDAP for IAM:
Open templates/bm/cluster.yaml.template.
Configure the keycloak:userFederation:providers:
and keycloak:userFederation:mappers: sections as required:
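The following sketch shows the general shape of these sections; the field names follow the standard Keycloak LDAP provider configuration, and all values, including the attribute names, are placeholders to adjust to your LDAP system:
keycloak:
  userFederation:
    providers:
      - displayName: "<LDAP name>"
        providerName: "ldap"
        config:
          authType: "simple"
          bindDn: "<bind user DN>"
          bindCredential: "<bind user password>"
          connectionUrl: "ldap://<LDAP host>:389"
          usersDn: "<users DN>"
          usernameLDAPAttribute: "uid"
    mappers:
      - name: "username"
        federationMapperType: "user-attribute-ldap-mapper"
        config:
          ldap.attribute: "uid"
          user.model.attribute: "username"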
Verify that the userFederation section is located
on the same level as the initUsers section.
Verify that all attributes set in the mappers section
are defined for users in the specified LDAP system.
Missing attributes may cause authorization issues.
Now, return to the bootstrap instruction of your management cluster.
The instruction below applies to the DNS-based management
clusters. If you bootstrap a non-DNS-based management cluster,
configure Google OAuth IdP for Keycloak after bootstrap using the
official Keycloak documentation.
If you integrate Google OAuth external identity provider for IAM to
Mirantis Container Cloud, create the authorization credentials for IAM
in your Google OAuth account and configure cluster.yaml.template
during the bootstrap of the management cluster.
In the APIs Credentials menu, select
OAuth client ID.
In the window that opens:
In the Application type menu, select
Web application.
In the Authorized redirect URIs field, type in
<keycloak-url>/auth/realms/iam/broker/google/endpoint,
where <keycloak-url> is the corresponding DNS address.
Press Enter to add the URI.
Click Create.
A page with your client ID and client secret opens. Save these
credentials for further usage.
Log in to the bootstrap node.
Open templates/bm/cluster.yaml.template.
In the keycloak:externalIdP: section, add the following snippet
with your credentials created in previous steps:
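A sketch of the snippet; the exact key names under externalIdP may differ between Container Cloud releases, so treat the structure below as an assumption and substitute your own client ID and secret:
keycloak:
  externalIdP:
    google:
      enabled: true
      config:
        clientId: "<your client ID>.apps.googleusercontent.com"
        clientSecret: "<your client secret>"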
This tutorial applies only to the Container Cloud web UI users
with the m:kaas:namespace@operator or m:kaas:namespace@writer
access role assigned by the Infrastructure Operator.
To add a bare metal host, the m:kaas@operator or
m:kaas:namespace@bm-pool-operator role is required.
After you deploy the Mirantis Container Cloud management cluster,
you can start creating managed clusters depending on your cloud needs.
The deployment procedure is performed using the Container Cloud web UI
and comprises the following steps:
Create a dedicated non-default project for managed clusters.
Create and configure bare metal hosts with corresponding labels for machines
such as worker, manager, or storage.
Create an initial cluster configuration.
Add the required amount of machines with the corresponding configuration
to the managed cluster.
Add a Ceph cluster.
Note
The Container Cloud web UI communicates with Keycloak
to authenticate users. Keycloak is exposed using HTTPS with
self-signed TLS certificates that are not trusted by web browsers.
The procedure below applies only to the Container Cloud web UI
users with the m:kaas@global-admin or m:kaas@writer access role
assigned by the Infrastructure Operator.
The default project (Kubernetes namespace) in Container Cloud is dedicated
for management clusters only. Managed clusters require a separate project.
You can create as many projects as required by your company infrastructure.
To create a project for managed clusters using the Container Cloud web UI:
Log in to the Container Cloud web UI as m:kaas@global-admin or
m:kaas@writer.
In the Projects tab, click Create.
Type the new project name.
Click Create.
Note
Due to the known issue 50168, access to the
newly created project becomes available in five minutes after project
creation.
Generate a kubeconfig for a managed cluster using API
Create and operate a baremetal-based managed cluster
After bootstrapping your baremetal-based Mirantis Container Cloud
management cluster as described in Deploy a Container Cloud management cluster,
you can start creating the baremetal-based managed clusters.
This feature is available as Technology Preview. Use such
configuration for testing and evaluation purposes only.
For the Technology Preview feature definition, refer to Technology Preview features.
This section instructs you on how to configure and deploy a managed cluster
that is based on the baremetal-based management cluster.
By default, Mirantis Container Cloud configures a single
interface on the cluster nodes, leaving all other physical interfaces intact.
With L2 networking templates, you can create advanced host networking
configurations for your clusters. For example, you can create bond
interfaces on top of physical interfaces on the host or use multiple subnets
to separate different types of network traffic.
You can use several host-specific L2 templates per one cluster
to support different hardware configurations. For example, you can create
L2 templates with different number and layout of NICs to be applied
to the specific machines of one cluster.
Caution
Modification of L2 templates in use is allowed with a mandatory
validation step from the Infrastructure Operator to prevent accidental
cluster failures due to unsafe changes. The list of risks posed by modifying
L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
Since Container Cloud 2.24.4, in the Technology Preview scope, you can create
a managed cluster with a multi-rack topology, where cluster nodes including
Kubernetes masters are distributed across multiple racks without L2 layer
extension between them, and use BGP for announcement of the cluster API
load balancer address and external addresses of Kubernetes load-balanced
services.
Implementation of the multi-rack topology implies the use of Rack and
MultiRackCluster objects that support configuration of BGP announcement
of the cluster API load balancer address. For the configuration procedure,
refer to Configure BGP announcement for cluster API LB address. For configuring the BGP announcement of
external addresses of Kubernetes load-balanced services, refer to
Configure MetalLB.
Follow the procedures described in the below subsections to configure initial
settings and advanced network objects for your managed clusters.
This section instructs you on how to create initial configuration of a managed
cluster that is based on the baremetal-based management cluster through the
Mirantis Container Cloud web UI.
Note
Due to the known issue 50181, creation of a
compact managed cluster or addition of any labels to the control plane nodes
is not available through the Container Cloud web UI.
To create a managed cluster on bare metal:
Available since the Cluster release 16.1.0 on the management cluster.
If you plan to deploy a large managed cluster, enable dynamic IP allocation
to increase the number of bare metal hosts to be provisioned in parallel.
For details, see Enable dynamic IP allocation.
Available since Container Cloud 2.24.0. Optional.
Technology Preview. Enable custom host names for cluster machines.
When enabled, any machine host name in a particular region matches the related
Machine object name. For example, instead of the default
kaas-node-<UID>, a machine host name will be master-0. The custom
naming format is more convenient and easier to operate with.
If your proxy requires a trusted CA certificate, select the
CA Certificate check box and paste a CA certificate for a MITM
proxy to the corresponding field or upload a certificate using
Upload Certificate.
For MOSK-based deployments, the possibility to use a MITM
proxy with a CA certificate is available since MOSK 23.1.
For the list of Mirantis resources and IP addresses to be accessible
from the Container Cloud clusters, see Requirements.
In the Clusters tab, click Create Cluster.
Configure the new cluster in the Create New Cluster wizard
that opens:
Define general and Kubernetes parameters:
Create new cluster: General, Provider, and Kubernetes
Section
Parameter name
Description
General settings
Cluster name
The cluster name.
Provider
Select Baremetal.
Region Removed in 2.26.0 (17.1.0 and 16.1.0)
From the drop-down list, select Baremetal.
Release version
The Container Cloud version.
Proxy
Optional. From the drop-down list,
select the proxy server name that you have previously created.
SSH keys
From the drop-down list, select the SSH key name(s) that you have
previously added for SSH access to the bare metal hosts.
For MOSK-based deployments, the feature
support is available since MOSK 22.5.
Enable WireGuard
Optional. Technology Preview. Deprecated since Container Cloud 2.29.0 (Cluster
releases 17.4.0 and 16.4.0). Available since Container Cloud 2.24.0 (Cluster
release 14.0.0).
Enable WireGuard for traffic encryption on the Kubernetes workloads network.
WireGuard configuration
Ensure that the Calico MTU size is at least 60 bytes smaller than the
interface MTU size of the workload network. IPv4 WireGuard uses a 60-byte
header. For details, see Set the MTU size for Calico.
Enable WireGuard by selecting the Enable WireGuard check box.
Caution
Changing this parameter on a running cluster causes a
downtime that can vary depending on the cluster size.
This parameter was renamed from
Enable Secure Overlay to Enable WireGuard
in Cluster releases 17.0.0 and 16.0.0.
Parallel Upgrade Of Worker Machines
Optional. Available since Cluster releases 17.0.0 and 16.0.0.
The maximum number of the worker nodes to update simultaneously. It serves as
an upper limit on the number of machines that are drained at a given moment
of time. Defaults to 1.
You can also configure this option after deployment before
the cluster update.
Parallel Preparation For Upgrade Of Worker Machines
Optional. Available since Cluster releases 17.0.0 and 16.0.0.
The maximum number of worker nodes being prepared at a given moment of time,
which includes downloading of new artifacts. It serves as a limit for the
network load that can occur when downloading the files to the nodes.
Defaults to 50.
You can also configure this option after deployment before
the cluster update.
Provider
LB host IP
The IP address of the load balancer endpoint that will be used to
access the Kubernetes API of the new cluster. This IP address
must be in the LCM network if a separate LCM network is in use and
if L2 (ARP) announcement of cluster API load balancer IP is in use.
LB address range
Removed in Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0).
The range of IP addresses that can be assigned to load balancers for
Kubernetes Services by MetalLB. For a more flexible MetalLB configuration,
refer to Configure MetalLB.
Note
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0),
MetalLB configuration must be added after cluster creation.
Kubernetes
Services CIDR blocks
The Kubernetes Services CIDR blocks.
For example, 10.233.0.0/18.
Pods CIDR blocks
The Kubernetes pods CIDR blocks.
For example, 10.233.64.0/18.
Note
The network subnet size of Kubernetes pods influences the number of
nodes that can be deployed in the cluster.
The default subnet size /18 is enough to create a cluster with
up to 256 nodes. Each node uses the /26 address blocks
(64 addresses), at least one address block is allocated per node.
These addresses are used by the Kubernetes pods with
hostNetwork:false. The cluster size may be limited
further when some nodes use more than one address block.
Configure StackLight:
Note
If StackLight is enabled in non-HA mode but Ceph is not
deployed yet, StackLight will not be installed and will be stuck in
the Yellow state waiting for a successful Ceph installation. Once
the Ceph cluster is deployed, the StackLight installation resumes.
To deploy a Ceph cluster, refer to Add a Ceph cluster.
Section
Parameter name
Description
StackLight
Enable Monitoring
Selected by default. Deselect to skip StackLight deployment. You can also
enable, disable, or configure StackLight parameters after deploying a
managed cluster. For details, see Change a cluster configuration or
Configure StackLight.
Enable Logging
Select to deploy the StackLight logging stack. For details about the
logging components, see Deployment architecture.
Note
The logging mechanism performance depends on the cluster log load. In
case of a high load, you may need to increase the default resource requests
and limits for fluentdLogs. For details, see
StackLight configuration parameters: Resource limits.
HA Mode
Select to enable StackLight monitoring in the HA mode. For the
differences between HA and non-HA modes, see Deployment architecture.
If disabled, StackLight requires a Ceph cluster. To deploy a Ceph cluster,
refer to Add a Ceph cluster.
StackLight Default Logs Severity Level
Log severity (verbosity) level for all StackLight components.
The default value for this parameter is
Default component log level that respects original defaults
of each StackLight component.
For details about severity levels, see MOSK Operations Guide:
StackLight configuration parameters - Log verbosity.
Expand the drop-down menu for a specific component to display
its list of available log levels.
OpenSearch
Logstash Retention Time
Skip this parameter since Container Cloud 2.26.0 (17.1.0, 16.1.0). It
was removed from the code base and will be removed from the web UI in
one of the following releases.
Available if you select Enable Logging. Specifies the
logstash-* index retention time.
Events Retention Time
Available if you select Enable Logging. Specifies the
kubernetes_events-* index retention time.
Notifications Retention
Available if you select Enable Logging. Specifies the
notification-* index retention time and is used for Mirantis
OpenStack for Kubernetes.
Persistent Volume Claim Size
Available if you select Enable Logging. The OpenSearch
persistent volume claim size.
Select to enable notifications about resolved StackLight alerts.
Require TLS
Select to enable transmitting emails through TLS.
Email alerts configuration for StackLight
Fill out the following email alerts parameters as required:
To - the email address to send notifications to.
From - the sender address.
SmartHost - the SMTP host through which the emails are sent.
Authentication username - the SMTP user name.
Authentication password - the SMTP password.
Authentication identity - the SMTP identity.
Authentication secret - the SMTP secret.
StackLight Slack Alerts
Enable Slack alerts
Select to enable the StackLight Slack alerts.
Send Resolved
Select to enable notifications about resolved StackLight alerts.
Slack alerts configuration for StackLight
Fill out the following Slack alerts parameters as required:
API URL - The Slack webhook URL.
Channel - The channel to send notifications to, for example,
#channel-for-alerts.
Available since Container Cloud 2.24.0 and 2.24.2 for MOSK 23.2. Optional.
Technology Preview. Enable the Linux Audit daemon auditd
to monitor activity of cluster processes and prevent potential malicious
activity.
Boolean, default - false. Enables the auditd role to install the
auditd packages and configure rules. CIS rules: 4.1.1.1, 4.1.1.2.
enabledAtBoot
Boolean, default - false. Configures grub to audit processes that can
be audited even if they start up prior to auditd startup. CIS rule:
4.1.1.3.
backlogLimit
Integer, default - none. Configures the backlog to hold records. If during
boot audit=1 is configured, the backlog holds 64 records. If more than
64 records are created during boot, auditd records will be lost with a
potential malicious activity being undetected. CIS rule: 4.1.1.4.
maxLogFile
Integer, default - none. Configures the maximum size of the audit log file.
Once the log reaches the maximum size, it is rotated and a new log file is
created. CIS rule: 4.1.2.1.
maxLogFileAction
String, default - none. Defines handling of the audit log file reaching the
maximum file size. Allowed values:
keep_logs - rotate logs but never delete them
rotate - add a cron job to compress rotated log files and keep
maximum 5 compressed files.
compress - compress log files and keep them under the
/var/log/auditd/ directory. Requires
auditd_max_log_file_keep to be enabled.
CIS rule: 4.1.2.2.
maxLogFileKeep
Integer, default - 5. Defines the number of compressed log files to keep
under the /var/log/auditd/ directory. Requires
auditd_max_log_file_action=compress. CIS rules - none.
mayHaltSystem
Boolean, default - false. Halts the system when the audit logs are
full. Applies the following configuration:
space_left_action=email
action_mail_acct=root
admin_space_left_action=halt
CIS rule: 4.1.2.3.
customRules
String, default - none. Base64-encoded content of the 60-custom.rules
file for any architecture. CIS rules - none.
customRulesX32
String, default - none. Base64-encoded content of the 60-custom.rules
file for the i386 architecture. CIS rules - none.
customRulesX64
String, default - none. Base64-encoded content of the 60-custom.rules
file for the x86_64 architecture. CIS rules - none.
presetRules
String, default - none. Comma-separated list of the following built-in
preset rules:
access
actions
delete
docker
identity
immutable
logins
mac-policy
modules
mounts
perm-mod
privileged
scope
session
system-locale
time-change
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0) in the
Technology Preview scope, you can collect some of the preset rules indicated
above as groups and use them in presetRules:
ubuntu-cis-rules - this group contains rules to comply with the Ubuntu
CIS Benchmark recommendations, including the following CIS Ubuntu 20.04
v2.0.1 rules:
scope - 5.2.3.1
actions - same as 5.2.3.2
time-change - 5.2.3.4
system-locale - 5.2.3.5
privileged - 5.2.3.6
access - 5.2.3.7
identity - 5.2.3.8
perm-mod - 5.2.3.9
mounts - 5.2.3.10
session - 5.2.3.11
logins - 5.2.3.12
delete - 5.2.3.13
mac-policy - 5.2.3.14
modules - 5.2.3.19
docker-cis-rules - this group contains rules to comply with
Docker CIS Benchmark recommendations, including the docker Docker CIS
v1.6.0 rules 1.1.3 - 1.1.18.
You can also use two additional keywords inside presetRules:
none - select no built-in rules.
all - select all built-in rules. When using this keyword, you can add
the ! prefix to a rule name to exclude some rules. You can use the
! prefix for rules only if you add the all keyword as the
first rule. Place a rule with the ! prefix only after
the all keyword.
Example configurations:
presetRules:none - disable all preset rules
presetRules:docker - enable only the docker rules
presetRules:access,actions,logins - enable only the
access, actions, and logins rules
presetRules:ubuntu-cis-rules - enable all rules from the
ubuntu-cis-rules group
presetRules:docker-cis-rules,actions - enable all rules from
the docker-cis-rules group and the actions rule
presetRules:all - enable all preset rules
presetRules:all,!immutable,!sessions - enable all preset
rules except immutable and sessions
To monitor the cluster readiness, hover over the status icon of a specific
cluster in the Status column of the Clusters page.
Once the orange blinking status icon becomes green and Ready,
the cluster deployment or update is complete.
You can monitor live deployment status of the following cluster components:
Component
Description
Helm
Installation or upgrade status of all Helm releases
Kubelet
Readiness of the node in a Kubernetes cluster, as reported by kubelet
Kubernetes
Readiness of all requested Kubernetes objects
Nodes
Equality of the requested nodes number in the cluster to the number
of nodes having the Ready LCM status
OIDC
Readiness of the cluster OIDC configuration
StackLight
Health of all StackLight-related objects in a Kubernetes cluster
Swarm
Readiness of all nodes in a Docker Swarm cluster
LoadBalancer
Readiness of the Kubernetes API load balancer
ProviderInstance
Readiness of all machines in the underlying infrastructure
(virtual or bare metal, depending on the provider type)
Graceful Reboot
Readiness of a cluster during a scheduled graceful reboot,
available since Cluster releases 15.0.1 and 14.0.0.
Infrastructure Status
Available since Container Cloud 2.25.0 (Cluster releases 17.0.0 and 16.0.0).
Readiness of the MetalLBConfig object along with MetalLB and DHCP subnets.
LCM Operation
Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0). Health of all LCM operations on the cluster and its machines.
LCM Agent
Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0). Health of all LCM agents on cluster machines and the status of
LCM agents update to the version from the current Cluster release.
To simplify operations with L2 templates, before you start creating
them, inspect the general workflow of network interface name gathering
and processing.
Network interface naming workflow:
The Operator creates a BareMetalHostInventory object.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
The BareMetalHostInventory object executes the introspection stage
and becomes ready.
The Operator collects information about NIC count, naming, and so on
for further changes in the mapping logic.
At this stage, the NIC order in the object may change randomly
during each introspection, but the NIC names are always the same.
For more details, see Predictable Network Interface Names.
For example:
# Example commands:
# kubectl -n managed-ns get bmh baremetalhost1 -o custom-columns='NAME:.metadata.name,STATUS:.status.provisioning.state'
# NAME             STATE
# baremetalhost1   ready
# kubectl -n managed-ns get bmh baremetalhost1 -o yaml
# Example output:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
...
status:
  ...
  nics:
    - ip: fe80::ec4:7aff:fe6a:fb1f%eno2
      mac: 0c:c4:7a:6a:fb:1f
      model: 0x8086 0x1521
      name: eno2
      pxe: false
    - ip: fe80::ec4:7aff:fe1e:a2fc%ens1f0
      mac: 0c:c4:7a:1e:a2:fc
      model: 0x8086 0x10fb
      name: ens1f0
      pxe: false
    - ip: fe80::ec4:7aff:fe1e:a2fd%ens1f1
      mac: 0c:c4:7a:1e:a2:fd
      model: 0x8086 0x10fb
      name: ens1f1
      pxe: false
    - ip: 192.168.1.151  # Temp. PXE network address
      mac: 0c:c4:7a:6a:fb:1e
      model: 0x8086 0x1521
      name: eno1
      pxe: true
...
The Operator selects from the following options:
Create an l2template object with the ifMapping configuration.
For details, see Create L2 templates.
The baremetal-provider service links the Machine object
to the BareMetalHostInventory object.
The kaas-ipam and baremetal-provider services collect hardware
information from the BareMetalHostInventory object and use it to
configure host networking and services.
The kaas-ipam service:
Spawns the IpamHost object.
Renders the l2template object.
Spawns the ipaddr object.
Updates the IpamHost object status with all rendered
and linked information.
The baremetal-provider service collects the rendered networking
information from the IpamHost object.
The baremetal-provider service proceeds with the IpamHost object
provisioning.
After creating a basic Cluster object along with
the MetalLBConfig object and before creating an
L2 template, create the required subnets that can be used in the L2 template
to allocate IP addresses for the managed cluster nodes. Where required, create
a number of subnets for a particular project using the Subnet CR.
Each subnet used in an L2 template has its logical scope that is set using the
scope parameter in the corresponding L2Template.spec.l3Layout section.
One of the following logical scopes is used for each subnet referenced in an
L2 template:
global - CR uses the default namespace.
A subnet can be used for any cluster located in any project.
namespaced - CR uses the namespace that corresponds to a particular project
where managed clusters are located. A subnet can be used for any cluster
located in the same project.
cluster - Unsupported since Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0). CR uses the namespace where the referenced cluster is located.
A subnet is only accessible to the cluster that
L2Template.metadata.labels:cluster.sigs.k8s.io/cluster-name (mandatory
since 2.25.0) or L2Template.spec.clusterRef (deprecated since 2.25.0)
refers to. The Subnet objects with the cluster scope will be created
for every new cluster depending on the provided SubnetPool.
Note
The use of the ipam/SVC-MetalLB label in Subnet objects
is unsupported as part of the MetalLBConfigTemplate object deprecation
since Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0). No
actions are required for existing objects. A Subnet object containing
this label will be ignored by baremetal-provider after cluster update
to the mentioned Cluster releases.
You can have subnets with the same name in different projects.
In this case, the subnet that has the same project as the cluster
is used. One L2 template may reference several subnets; in this case,
those subnets may have different scopes.
The IP address objects (IPaddr CR) that are allocated from subnets
always have the same project as their corresponding IpamHost objects,
regardless of the subnet scope.
You can create subnets using either the Container Cloud web UI or CLI.
Any Subnet object may contain ipam/SVC-<serviceName> labels.
All IP addresses allocated from a Subnet object that has service labels
defined will inherit those labels.
When a particular IpamHost uses IP addresses allocated from such labeled
Subnet objects, the ServiceMap field in IpamHost.Status will
contain information about which IPs and interfaces correspond to which service
labels (that have been set in the Subnet objects). Using ServiceMap,
you can understand what IPs and interfaces of a particular host are used
for network traffic of a given service.
Currently, Container Cloud uses the following service labels that allow for
the use of specific subnets for particular Container Cloud services:
ipam/SVC-k8s-lcm
ipam/SVC-ceph-cluster
ipam/SVC-ceph-public
ipam/SVC-dhcp-range
ipam/SVC-MetalLB - Unsupported since 2.28.0 (17.3.0 and 16.3.0)
ipam/SVC-LBhost
Caution
The use of the ipam/SVC-k8s-lcm label is mandatory
for every cluster.
You can also add custom service labels to the Subnet objects the same way
you add Container Cloud service labels. The mapping of IPs and interfaces to
the defined services is displayed in IpamHost.Status.ServiceMap.
You can assign multiple service labels to one network. You can also assign the
ceph-* and dhcp-range services to multiple networks. In the latter
case, the system sorts the IP addresses in ascending order.
This section also applies to the bootstrap procedure of a
management cluster with the following difference: instead of creating the
Subnet object, add its configuration to
ipam-objects.yaml.template located in kaas-bootstrap/templates/bm/.
The Kubernetes Subnet object is created for a management cluster from
templates during bootstrap.
Each Subnet object can be used to define either a MetalLB address range or
MetalLB address pool. A MetalLB address pool may contain one or several
address ranges. The following rules apply to creation of address ranges or
pools:
To designate a subnet as a MetalLB address pool or range, use
the ipam/SVC-MetalLB label key. Set the label value to "1".
The object must contain the cluster.sigs.k8s.io/cluster-name label that
references the name of the target cluster where the MetalLB address pool
is used.
You may create multiple subnets with the ipam/SVC-MetalLB label to
define multiple IP address ranges or multiple address pools for MetalLB in
the cluster.
The IP addresses of the MetalLB address pool are not assigned to the
interfaces on hosts. This subnet is virtual. Do not include such subnets
in the L2 template definitions for your cluster.
If a Subnet object defines a MetalLB address range, no additional
object properties are required.
You can use any number of Subnet objects with each defining a single
MetalLB address range. In this case, all address ranges are aggregated into
a single MetalLB L2 address pool named services having the auto-assign
policy enabled.
Intersection of IP address ranges within any single MetalLB address pool
is not allowed.
The bare metal provider verifies intersection of IP address ranges.
If it detects intersection, the MetalLB configuration is blocked and
the provider logs contain corresponding error messages.
Use the following labels to identify the Subnet object as a MetalLB
address pool and configure the name and protocol for that address pool.
All labels below are mandatory for the Subnet object that configures
a MetalLB address pool.
Mandatory Subnet labels for a MetalLB address pool¶
Label
Description
Labels to link Subnet to the target cluster and region
cluster.sigs.k8s.io/cluster-name
Specifies the cluster name where the MetalLB address pool is used.
kaas.mirantis.com/provider
Specifies the provider of the cluster where the MetalLB address pool is used.
kaas.mirantis.com/region
Specifies the region name of the cluster where the MetalLB address pool is
used.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
ipam/SVC-MetalLB
Defines that the Subnet object will be used to provide
a new address pool or range for MetalLB.
metallb/address-pool-name
Every address pool must have a distinct name.
The services-pxe address pool is mandatory when configuring
a dedicated PXE network in the management cluster. This name will be
used in annotations for services exposed through the PXE network.
A bootstrap cluster also uses the services-pxe address pool
for its provision services so that management cluster nodes can be
provisioned from the bootstrap cluster. After a management cluster is
deployed, the bootstrap cluster is deleted and that address pool is
solely used by the newly deployed cluster.
metallb/address-pool-auto-assign
Configures the auto-assign policy of an address pool. Boolean.
Caution
For the address pools defined using the MetalLB Helm chart
values in the Cluster spec section, the auto-assign policy is
set to true and is not configurable.
For any service that does not have a specific MetalLB annotation
configured, MetalLB allocates external IPs from arbitrary address
pools that have the auto-assign policy set to true.
Only for the service that has a specific MetalLB annotation with
the address pool name, MetalLB allocates external IPs from the address
pool having the auto-assign policy set to false.
metallb/address-pool-protocol
Sets the address pool protocol.
The only supported value is layer2 (default).
Caution
Do not set the same address pool name for two or more
Subnet objects. Otherwise, the corresponding MetalLB address pool
configuration fails with a warning message in the bare metal provider log.
Caution
For the auto-assign policy, the following configuration
rules apply:
At least one MetalLB address pool must have the auto-assign
policy enabled so that unannotated services can have load balancer IPs
allocated for them. To satisfy this requirement, either configure one
of address pools using the Subnet object with
metallb/address-pool-auto-assign:"true" or configure address
range(s) using the Subnet object(s) without
metallb/address-pool-* labels.
When configuring multiple address pools with the auto-assign policy
enabled, keep in mind that it is not determined in advance which pool of
those multiple address pools is used to allocate an IP for a particular
unannotated service.
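For illustration, the following is a minimal sketch of a Subnet object labeled
as a MetalLB address pool; the object name, namespace, cluster name, and
addresses are assumptions, while the labels follow the rules described above:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: demo-metallb-pool                            # assumed name
  namespace: managed-ns                              # assumed project namespace
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: demo-cluster   # assumed cluster name
    ipam/SVC-MetalLB: "1"
    metallb/address-pool-name: services
    metallb/address-pool-auto-assign: "true"
    metallb/address-pool-protocol: layer2
spec:
  cidr: 10.0.11.0/24            # virtual subnet, not assigned to host interfaces
  includeRanges:
  - 10.0.11.61-10.0.11.80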
This section describes how to set up and verify MetalLB parameters before
configuring subnets for a managed cluster.
Caution
This section also applies to the bootstrap procedure of a
management cluster with the following differences:
Instead of the Cluster object, configure
templates/bm/cluster.yaml.template.
Instead of the MetalLBConfig object, configure
templates/bm/metallbconfig.yaml.template.
Instead of creating specific IPAM objects such as Subnet, add their
configuration to templates/bm/ipam-objects.yaml.template.
The Kubernetes objects described below are created for a management cluster
from template files during bootstrap.
Configuration rules for the ‘MetalLBConfig’ object¶
Caution
The use of the MetalLBConfig object is mandatory for
management and managed clusters after a management cluster upgrade to the
Cluster release 16.0.0.
The following rules and requirements apply to configuration of the
MetalLBConfig object:
Define one MetalLBConfig object per cluster.
Define the following mandatory labels:
cluster.sigs.k8s.io/cluster-name
Specifies the cluster name where the MetalLB address pool is used.
kaas.mirantis.com/provider
Specifies the provider of the cluster where the MetalLB address pool is used.
kaas.mirantis.com/region
Specifies the region name of the cluster where the MetalLB address pool is
used.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Intersection of IP address ranges within any single MetalLB address pool
is not allowed.
At least one MetalLB address pool must have the auto-assign policy enabled
so that unannotated services can have load balancer IP addresses allocated
to them.
When configuring multiple address pools with the auto-assign policy enabled,
keep in mind that it is not determined in advance which pool of those
multiple address pools is used to allocate an IP address for a particular
unannotated service.
Note
You can optimize address announcement for load-balanced services
using the interfaces selector for the l2Advertisements object. This
selector allows for address announcement only on selected host interfaces.
For details, see API Reference: MetalLB configuration examples.
Configuration rules for MetalLBConfigTemplate (obsolete since 2.27.0)
Caution
The MetalLBConfigTemplate object is deprecated in
Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0) and
unsupported since Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0). For details, see MOSK Deprecation Notes: MetalLBConfigTemplate
resource management.
All rules described above for MetalLBConfig also apply to
MetalLBConfigTemplate.
Optional. Define one MetalLBConfigTemplate object per cluster.
The use of this object without MetalLBConfig is not allowed.
When using MetalLBConfigTemplate:
MetalLBConfig must reference MetalLBConfigTemplate by name:
spec:
  templateName: <managed-metallb-template>
You can use Subnet objects for defining MetalLB address pools.
Refer to MetalLB configuration guidelines for subnets for guidelines on configuring
MetalLB address pools using Subnet objects.
You can optimize address announcement for load-balanced services using
the interfaces selector for the l2Advertisements object. This
selector allows for address announcement only on selected host
interfaces. For details, see API Reference: MetalLBConfigTemplate
spec.
Optional. Configure parameters related to the MetalLB components lifecycle,
such as deployment and update, using the metallb Helm chart values in
the Cluster spec section. For example:
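A minimal sketch, assuming the metallb values are set through helmReleases in
the Cluster providerSpec; the speaker.resources keys are illustrative
assumptions and may differ from the actual chart structure in your release:
spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          speaker:
            resources:
              limits:
                cpu: "1"
                memory: 500Mi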
Configure the MetalLB parameters related to IP address allocation and
announcement for load-balanced cluster services. Select from the following
options:
Since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0)
Recommended. Default. Mandatory after a management cluster upgrade to
the Cluster release 17.2.0.
In the Technology Preview scope, you can use BGP for announcement of
external addresses of Kubernetes load-balanced services for managed
clusters. To configure the BGP announcement mode for MetalLB, use the
MetalLBConfig object.
The use of BGP is required to announce IP addresses for load-balanced
services when using MetalLB on nodes that are distributed across
multiple racks. In this case, you must set rack-id labels on nodes;
these labels are used in node selectors for BGPPeer,
BGPAdvertisement, or both MetalLB objects to properly configure
BGP connections from each node.
Configuration example of the Machine object for the
BGP announcement mode
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: test-cluster-compute-1
  namespace: managed-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    ipam/RackRef: rack-1  # reference to the "rack-1" Rack
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  providerSpec:
    value:
      ...
      nodeLabels:
      - key: rack-id   # node label can be used in "nodeSelectors" inside
        value: rack-1  # "BGPPeer" and/or "BGPAdvertisement" MetalLB objects
      ...
Configuration example of the MetalLBConfig
object for the BGP announcement mode
apiVersion: ipam.mirantis.com/v1alpha1
kind: MetalLBConfig
metadata:
  name: test-cluster-metallb-config
  namespace: managed-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  ...
  bgpPeers:
  - name: svc-peer-1
    spec:
      holdTime: 0s
      keepaliveTime: 0s
      peerAddress: 10.77.42.1
      peerASN: 65100
      myASN: 65101
      nodeSelectors:
      - matchLabels:
          rack-id: rack-1  # references the nodes having
                           # the "rack-id=rack-1" label
  bgpAdvertisements:
  - name: services
    spec:
      aggregationLength: 32
      aggregationLengthV6: 128
      ipAddressPools:
      - services
      peers:
      - svc-peer-1
  ...
Since Container Cloud 2.24.x (Cluster releases 15.0.1, 14.0.1,
and 14.0.0)
Select from the following options:
Deprecated in Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0) and unsupported since Container Cloud 2.28.0 (Cluster releases
17.3.0 and 16.3.0).
Mandatory after a management cluster upgrade to the Cluster release
16.0.0.
Create MetalLBConfig and MetalLBConfigTemplate objects.
This method allows using the Subnet object to define MetalLB
address pools.
For managed clusters, this configuration method is generally
available since Cluster releases 17.0.0 and 16.0.0; it is
available as Technology Preview since Cluster releases 15.0.1,
14.0.1, and 14.0.0.
Since Cluster releases 15.0.3 and 14.0.3, in the Technology Preview
scope, you can use BGP for announcement of external addresses of
Kubernetes load-balanced services for managed clusters. To configure
the BGP announcement mode for MetalLB, use MetalLBConfig and
MetalLBConfigTemplate objects.
The use of BGP is required to announce IP addresses for load-balanced
services when using MetalLB on nodes that are distributed across
multiple racks. In this case, you must set rack-id labels on nodes;
these labels are used in node selectors for BGPPeer,
BGPAdvertisement, or both MetalLB objects to properly configure
BGP connections from each node.
Configuration example of the Machine object for the
BGP announcement mode
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: test-cluster-compute-1
  namespace: managed-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    ipam/RackRef: rack-1  # reference to the "rack-1" Rack
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  providerSpec:
    value:
      ...
      nodeLabels:
      - key: rack-id   # node label can be used in "nodeSelectors" inside
        value: rack-1  # "BGPPeer" and/or "BGPAdvertisement" MetalLB objects
      ...
Configuration example of the MetalLBConfigTemplate
object for the BGP announcement mode
apiVersion: ipam.mirantis.com/v1alpha1
kind: MetalLBConfigTemplate
metadata:
  name: test-cluster-metallb-config-template
  namespace: managed-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  templates:
    ...
    bgpPeers: |
      - name: svc-peer-1
        spec:
          peerAddress: 10.77.42.1
          peerASN: 65100
          myASN: 65101
          nodeSelectors:
          - matchLabels:
              rack-id: rack-1  # references the nodes having
                               # the "rack-id=rack-1" label
    bgpAdvertisements: |
      - name: services
        spec:
          ipAddressPools:
          - services
          peers:
          - svc-peer-1
    ...
The bgpPeers and bgpAdvertisements fields are used to
configure BGP announcement instead of l2Advertisements.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
The use of BGP for announcement also allows for better balancing
of service traffic between cluster nodes as well as gives more
configuration control and flexibility for infrastructure
administrators. For configuration examples, refer to
MetalLB configuration examples. For configuration procedure,
refer to Configure BGP announcement for cluster API LB address.
Deprecated since Container Cloud 2.24.0. Configure the configInline
value in the MetalLB chart of the Cluster object.
Warning
This functionality is removed during the management
cluster upgrade to the Cluster release 16.0.0. Therefore, this
option becomes unavailable on managed clusters after the parent
management cluster upgrade to 16.0.0.
Deprecated since Container Cloud 2.24.0. Configure the Subnet
objects without MetalLBConfigTemplate.
Warning
This functionality is removed during the management
cluster upgrade to the Cluster release 16.0.0. Therefore, this
option becomes unavailable on managed clusters after the parent
management cluster upgrade to 16.0.0.
Caution
If the MetalLBConfig object is not used for MetalLB
configuration related to address allocation and announcement for
load-balanced services, then automated migration applies during
creation of clusters of any type or cluster update to Cluster releases
15.0.x or 14.0.x.
During automated migration, the MetalLBConfig and
MetalLBConfigTemplate objects are created and contents of the
MetalLB chart configInline value are converted to the parameters
of the MetalLBConfigTemplate object.
Any change to the configInline value made on a 15.0.x or 14.0.x
cluster will be reflected in the MetalLBConfigTemplate object.
This automated migration is removed during your management cluster
upgrade to the Cluster release 16.0.0, which is introduced in Container
Cloud 2.25.0, together with the possibility to use the configInline
value of the MetalLB chart. After that, any changes in MetalLB
configuration related to address allocation and announcement for
load-balanced services will be applied using the MetalLBConfigTemplate
and Subnet objects only.
Before Container Cloud 2.24.x (Cluster releases 12.7.0, 11.7.0,
or earlier)
Configure the configInline value for the MetalLB chart in the
Cluster object.
Configure both the configInline value for the MetalLB chart and
Subnet objects.
The resulting MetalLB address pools configuration will contain address
ranges from both cluster specification and Subnet objects.
All address ranges for L2 address pools will be aggregated into a single
L2 address pool and sorted as strings.
Changes to be applied since Container Cloud 2.25.0
The configuration options above are deprecated since Container Cloud
2.24.0, after your management cluster upgrade to the Cluster release
14.0.0 or 14.0.1. Automated migration of MetalLB parameters applies
during cluster creation or update to Container Cloud 2.24.x.
During automated migration, the MetalLBConfig and
MetalLBConfigTemplate objects are created and contents of the
MetalLB chart configInline value are converted to the parameters of
the MetalLBConfigTemplate object.
Any change to the configInline value made on a Container Cloud
2.24.x cluster will be reflected in the MetalLBConfigTemplate
object.
This automated migration is removed during your management cluster
upgrade to the Cluster release 16.0.0, which is introduced in
Container Cloud 2.25.0, together with the possibility to use
the configInline value of the MetalLB chart. After that, any
changes in MetalLB configuration related to address allocation and
announcement for load-balanced services will be applied using the
MetalLBConfigTemplate and Subnet objects only.
Verify the current MetalLB configuration:
Since Container Cloud 2.21.0
Verify the MetalLB configuration that is stored in MetalLB objects:
The auto-assign parameter will be set to false for all address
pools except the default one. So, a particular service will get an
address from such an address pool only if the Service object has a
special metallb.universe.tf/address-pool annotation that points to
the specific address pool name.
Note
It is expected that every Container Cloud service on a management
cluster will be assigned to one of the address pools.
The current design uses two MetalLB address pools:
services-pxe is a reserved address pool name to use for
the Container Cloud services in the PXE network (Ironic API,
HTTP server, caching server).
default is an address pool to use for all other Container
Cloud services in the management network. No annotation
is required on the Service objects in this case.
The BGP configuration is not yet supported in the Container Cloud
web UI. In the meantime, use the CLI for this purpose. For details, see
Configure and verify MetalLB using the CLI.
Read the MetalLB configuration guidelines described in
Configure MetalLB.
Optional. Configure parameters related to the MetalLB components lifecycle,
such as deployment and update, using the metallb Helm chart values in
the Cluster spec section. For example values, see the sketch in
Configure MetalLB.
In the Networks section, click the MetalLB Configs
tab.
Click Create MetalLB Config.
Fill out the Create MetalLB Config form as required:
Name
Name of the MetalLB object being created.
Cluster
Name of the cluster that the MetalLB object is being created
for.
IP Address Pools
List of MetalLB IP address pool descriptions that will be used to create
the MetalLB IPAddressPool objects. Click the + button on
the right side of the section to add more objects.
Name
IP address pool name.
Addresses
Comma-separated ranges of the IP addresses included into the address
pool.
Auto Assign
Enable auto-assign policy for unannotated services to have load
balancer IP addresses allocated to them. At least one MetalLB address
pool must have the auto-assign policy enabled.
Service Allocation
IP address pool allocation to services. Click Edit to
insert a service allocation object with required label selectors for
services in the YAML format. For example:
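A hedged sketch based on the upstream MetalLB IPAddressPool serviceAllocation
fields; the namespace and label values below are illustrative assumptions:
serviceAllocation:
  priority: 100                         # lower number means higher priority
  namespaces:
  - demo-namespace                      # assumed namespace
  serviceSelectors:
  - matchLabels:
      app.kubernetes.io/name: nginx     # assumed service label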
L2 Advertisements
List of MetalLB L2Advertisement descriptions that will be used to create
the MetalLB L2Advertisement objects.
The l2Advertisements object allows defining interfaces to optimize
the announcement. When you use the interfaces selector, LB addresses
are announced only on selected host interfaces.
Mirantis recommends using the interfaces selector if nodes use separate
host networks for different types of traffic. The benefits of such a
configuration are less announcement traffic on other interfaces and networks
and a lower chance of reaching IP addresses of load-balanced services from
irrelevant interfaces and networks.
Caution
Interface names in the interfaces list must match those
on the corresponding nodes.
Add the following parameters:
Name
Name of the l2Advertisements object.
Interfaces
Optional. Comma-separated list of interface names that must match
the ones on the corresponding nodes. These names are defined in
L2 templates that are linked to the selected cluster.
IP Address Pools
Select the IP address pool to use for the l2Advertisements
object.
Node Selectors
Optional. Match labels and values for the Kubernetes node selector
to limit the nodes announced as next hops for the LoadBalancer
IP. If you do not provide any labels, all nodes are announced as
next hops.
In Networks > MetalLB Configs, verify the status of the created
MetalLB object:
Ready - object is operational.
Error - object is non-operational. Hover over the status
to obtain details of the issue.
Note
To verify the object details, in
Networks > MetalLB Configs, click the More action
icon in the last column of the required object section and select
MetalLB Config info.
By default, MetalLB speakers are deployed on all Kubernetes nodes.
You can configure MetalLB to run its speakers on a particular set of nodes.
This decreases the number of nodes that must be connected to the external
network. In this scenario, only a few nodes are exposed for ingress
traffic from the outside world.
To customize the MetalLB speaker node selector:
Using kubeconfig of the management cluster, open the Cluster object
of the managed cluster for editing:
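For example, assuming the standard kubectl workflow (the placeholders in angle
brackets are not real names):
kubectl --kubeconfig <mgmt-cluster-kubeconfig> -n <project-name> edit cluster <managed-cluster-name>
Then, as a hedged sketch, set the speaker node selector through the metallb
Helm chart values; the helmReleases path and the speaker.nodeSelector keys are
assumptions about the chart structure:
spec:
  providerSpec:
    value:
      helmReleases:
      - name: metallb
        values:
          speaker:
            nodeSelector:
              metallbSpeakerEnabled: "true"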
The metallbSpeakerEnabled: "true" parameter in this example is the
label on Kubernetes nodes where MetalLB speakers will be deployed.
It can be an existing node label or a new one.
You can add user-defined labels to nodes using the nodeLabels field.
It lists node labels to be attached to a node so that certain components
can run on separate cluster nodes. The list of allowed node labels
is located in the Cluster object status
providerStatus.releaseRef.current.allowedNodeLabels field.
If the value field is not defined in allowedNodeLabels, a label can
have any value.
Before or after a machine deployment, add the required label from the allowed
node labels list with the corresponding value to
spec.providerSpec.value.nodeLabels in machine.yaml. For example:
nodeLabels:
- key: stacklight
  value: enabled
The addition of a node label that is not available in the list of allowed node
labels is restricted.
Create subnets for a managed cluster using web UI¶
After creating the MetalLB configuration as described in Configure MetalLB
and before creating an L2 template, create the required subnets to use in the
L2 template to allocate IP addresses for the managed cluster nodes.
To create subnets for a managed cluster using web UI:
Log in to the Container Cloud web UI with the operator permissions.
Switch to the required non-default project using the
Switch Project action icon located on top of the main left-side
navigation panel.
In the left sidebar, navigate to Networks.
The Subnets tab opens.
Click Create Subnet.
Fill out the Create subnet form as required:
Name
Subnet name.
Subnet Type
Subnet type:
DHCP
DHCP subnet that configures DHCP address ranges used by the DHCP
server on the management cluster. For details, see
Configure multiple DHCP address ranges.
LB
Cluster API LB subnet.
LCM
LCM subnet(s).
Storage access
Available in the web UI since Container Cloud 2.28.0 (17.3.0 and 16.3.0).
Storage access subnet.
Storage replication
Available in the web UI since Container Cloud 2.28.0 (17.3.0 and 16.3.0).
Storage replication subnet.
Custom
Custom subnet. For example, external or Kubernetes workloads.
MetalLB
Services subnet(s).
Warning
Since Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0), disregard this parameter during subnet creation.
Configure MetalLB separately as described in
Configure MetalLB.
This parameter is removed from the Container Cloud web UI in
Container Cloud 2.29.0 (Cluster releases 17.4.0 and 16.4.0).
Cluster
Cluster name that the subnet is being created for. Not required only
for the DHCP subnet.
CIDR
A valid IPv4 address of the subnet in the CIDR notation, for example,
10.11.0.0/24.
Include Ranges (Optional)
A comma-separated list of IP address ranges within the given CIDR that should
be used in the allocation of IPs for nodes. The gateway, network, broadcast,
and DNS addresses will be excluded (protected) automatically if they intersect
with one of the ranges. The IPs outside the given ranges will not be used in
the allocation. Each element of the list can be either an interval
10.11.0.5-10.11.0.70 or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
Exclude Ranges (Optional)
A comma-separated list of IP address ranges within the given CIDR that should
not be used in the allocation of IPs for nodes. The IPs within the given CIDR
but outside the given ranges will be used in the allocation.
The gateway, network, broadcast, and DNS addresses will be excluded
(protected) automatically if they are included in the CIDR.
Each element of the list can be either an interval 10.11.0.5-10.11.0.70
or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
Gateway (Optional)
A valid IPv4 gateway address, for example, 10.11.0.9. Does not apply
to the MetalLB subnet.
Nameservers
IP addresses of nameservers separated by a comma. Does not apply
to the DHCP and MetalLB subnet types.
Use whole CIDR
Optional. Select to use the whole IPv4 address range that is set in
the CIDR field. Useful when defining a single IP address (/32),
for example, in the Cluster API load balancer (LB) subnet.
If not set, the network address and broadcast address in the IP
subnet are excluded from the address allocation.
Labels
Key-value pairs attached to the selected subnet:
Caution
The values of the created subnet labels must match the
ones in spec.l3Layout section of the corresponding
L2Template object.
Click Add a label and assign the first custom label
with the required name and value. To assign consecutive labels,
use the + button located on the right side of the
Labels section.
MetalLB:
Warning
Since Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0), disregard this label during subnet creation.
Configure MetalLB separately as described in
Configure MetalLB.
The label will be removed from the Container Cloud web UI in one
of the following releases.
metallb/address-pool-name
Name of the subnet address pool. Example values:
services, default, external, services-pxe.
The latter value is dedicated to management clusters only.
For details about address pool names of a management cluster,
see Separate PXE and management networks.
metallb/address-pool-auto-assign
Enables automatic assignment of address pool. Boolean.
metallb/address-pool-protocol
Defines the address pool protocol. Possible values:
layer2 - announcement using the ARP protocol.
bgp - announcement using the BGP protocol. Technology
Preview.
In the Networks tab, verify the status of the created
subnet:
Ready - object is operational.
Error - object is non-operational. Hover over the status
to obtain details of the issue.
Note
To verify subnet details, in the Networks tab,
click the More action icon in the last column of the
required subnet and select Subnet info.
Before 2.26.0 (17.1.0, 16.1.0)
In the Clusters tab, click the required cluster and scroll
down to the Subnets section.
Click Add Subnet.
Fill out the Add new subnet form as required:
Subnet Name
Subnet name.
CIDR
A valid IPv4 CIDR, for example, 10.11.0.0/24.
Include Ranges (Optional)
A comma-separated list of IP address ranges within the given CIDR that should
be used in the allocation of IPs for nodes. The gateway, network, broadcast,
and DNS addresses will be excluded (protected) automatically if they intersect
with one of the ranges. The IPs outside the given ranges will not be used in
the allocation. Each element of the list can be either an interval
10.11.0.5-10.11.0.70 or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
Exclude Ranges (Optional)
A comma-separated list of IP address ranges within the given CIDR that should
not be used in the allocation of IPs for nodes. The IPs within the given CIDR
but outside the given ranges will be used in the allocation.
The gateway, network, broadcast, and DNS addresses will be excluded
(protected) automatically if they are included in the CIDR.
Each element of the list can be either an interval 10.11.0.5-10.11.0.70
or a single address 10.11.0.77.
After creating the MetalLB configuration as described in Configure MetalLB
and before creating an L2 template, create the required subnets to use in the
L2 template to allocate IP addresses for the managed cluster nodes.
To create subnets for a managed cluster using CLI:
Log in to a local machine where your management cluster kubeconfig
is located and where kubectl is installed.
Note
The management cluster kubeconfig is created
during the last stage of the management cluster bootstrap.
Create a subnet using one of the following options:
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
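For example, the following is a minimal sketch of a Subnet definition; the
object name, namespace, cluster name, and service label are assumptions, and
the addresses match the field descriptions below. Apply it with
kubectl apply -f subnet.yaml.
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: demo-lcm-subnet                              # assumed name
  namespace: managed-ns                              # assumed project namespace
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: demo-cluster   # assumed cluster name
    ipam/SVC-k8s-lcm: "1"                            # example service label
spec:
  cidr: 10.11.0.0/24
  gateway: 10.11.0.9
  includeRanges:
  - 10.11.0.5-10.11.0.70
  excludeRanges:
  - 10.11.0.77
  nameservers:
  - 172.18.176.6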
includeRanges (list)
A comma-separated list of IP address ranges within the given CIDR that should
be used in the allocation of IPs for nodes. The gateway, network, broadcast,
and DNS addresses will be excluded (protected) automatically if they intersect
with one of the ranges. The IPs outside the given ranges will not be used in
the allocation. Each element of the list can be either an interval
10.11.0.5-10.11.0.70 or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
excludeRanges (list)
A comma-separated list of IP address ranges within the given CIDR that should
not be used in the allocation of IPs for nodes. The IPs within the given CIDR
but outside the given ranges will be used in the allocation.
The gateway, network, broadcast, and DNS addresses will be excluded
(protected) automatically if they are included in the CIDR.
Each element of the list can be either an interval 10.11.0.5-10.11.0.70
or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
useWholeCidr (boolean)
If set to true, the subnet address (10.11.0.0 in the example above)
and the broadcast address (10.11.0.255 in the example above)
are included into the address allocation for nodes. Otherwise,
(false by default), the subnet address and broadcast address
will be excluded from the address allocation.
gateway (singular)
A valid gateway address, for example, 10.11.0.9.
nameservers (list)
A list of the IP addresses of name servers. Each element of the list
is a single address, for example, 172.18.176.6.
Caution
The subnet for the PXE network of the management cluster
is automatically created during deployment.
Each cluster must use at least one subnet for its LCM network.
Every node must have an address allocated in the LCM network
using such subnet(s).
Each node of every cluster must have only one IP address in the
LCM network that is allocated from one of the Subnet objects having the
ipam/SVC-k8s-lcm label defined. Therefore, all Subnet objects used for
LCM networks must have the ipam/SVC-k8s-lcm label defined. For details,
see Service labels and their life cycle.
Note
You may use different subnets to allocate IP addresses
to different Container Cloud components in your cluster.
Add a label with the ipam/SVC- prefix to each subnet
that is used to configure a Container Cloud service.
For details, see Service labels and their life cycle and the optional steps
below.
Caution
Use of a dedicated network for Kubernetes pods traffic,
for external connection to the Kubernetes services exposed
by the cluster, and for the Ceph cluster access and replication
traffic is available as Technology Preview. Use such
configurations for testing and evaluation purposes only.
For the Technology Preview feature definition,
refer to Technology Preview features.
Optional. Technology Preview. Add a subnet for the externally accessible
API endpoint of the managed cluster.
Make sure that loadBalancerHost is set to "" (empty string)
in the Cluster spec.
Create a subnet with the ipam/SVC-LBhost label having the "1"
value to make the baremetal-provider use this subnet for allocation
of cluster API endpoints addresses.
One IP address will be allocated for each cluster to serve its
Kubernetes/MKE API endpoint.
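A minimal sketch of such a Subnet, assuming a dedicated /32 address and
illustrative names:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: demo-api-lb                                  # assumed name
  namespace: managed-ns                              # assumed project namespace
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: demo-cluster   # assumed cluster name
    ipam/SVC-LBhost: "1"
spec:
  cidr: 10.11.1.5/32
  useWholeCidr: true   # allows using the single /32 address for allocation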
Caution
Make sure that master nodes have host local-link addresses
in the same subnet as the cluster API endpoint address.
These host IP addresses will be used for VRRP traffic.
The cluster API endpoint address will be assigned to the same
interface on one of the master nodes where these host IPs
are assigned.
Note
We highly recommend that you assign the cluster API endpoint
address from the LCM network. For details on cluster networks
types, refer to Managed cluster networking.
See also the Single managed cluster use case example in the
following table.
You can use the following options to define the address allocation scope
for cluster API endpoints using subnets:
Use case
Example configuration
Several managed clusters in one management cluster
Create a subnet in the default namespace with no reference to any
cluster.
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Warning
Combining the ipam/SVC-LBhost label with any other
service labels on a single subnet is not supported. Use a dedicated
subnet for address allocation for cluster API endpoints.
Several managed clusters in a project
Create a subnet in a namespace corresponding to your project with no
reference to any cluster. Such subnet has priority over the one
described above.
Combining the ipam/SVC-LBhost label with any other
service labels on a single subnet is not supported. Use a dedicated
subnet for address allocation for cluster API endpoints.
Single managed cluster
Create a subnet in a namespace corresponding to your project with
a reference to the target cluster using the
cluster.sigs.k8s.io/cluster-name label. Such subnet has priority
over the ones described above. In this case, it is not obligatory to use
a dedicated subnet for address allocation of API endpoints.
You can add the ipam/SVC-LBhost label to the LCM subnet, and one of
the addresses from this subnet will be allocated for an API endpoint:
You can combine the ipam/SVC-LBhost label only with the
following service labels on a single subnet:
ipam/SVC-k8s-lcm
ipam/SVC-ceph-cluster
ipam/SVC-ceph-public
Otherwise, use a dedicated subnet for address allocation for the
cluster API endpoint. Other combinations are not supported and
can lead to unexpected results.
The above options can be used in conjunction. For example, you can define
a subnet for a region, a number of subnets within this region defined
for particular namespaces, and a number of subnets within the same region
and namespaces defined for particular clusters.
Optional. Add a subnet(s) for the Storage access network.
Set the ipam/SVC-ceph-public label with the value "1" to create
a subnet that will be used to configure the Ceph public network.
Set the cluster.sigs.k8s.io/cluster-name
label to the name of the target cluster during the subnet creation.
Use this subnet in the L2 template for storage nodes.
Assign this subnet to the interface connected to your Storage access
network.
Ceph will automatically use this subnet for its external connections.
A Ceph OSD will look for and bind to an address from this subnet
when it is started on a machine.
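A minimal sketch of such a Subnet with assumed names and CIDR; the Storage
replication subnet described in the next step is analogous, with the
ipam/SVC-ceph-cluster label instead of ipam/SVC-ceph-public:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: demo-ceph-public                             # assumed name
  namespace: managed-ns                              # assumed project namespace
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: demo-cluster   # assumed cluster name
    ipam/SVC-ceph-public: "1"
spec:
  cidr: 10.12.0.0/24                                 # assumed Storage access CIDR
  includeRanges:
  - 10.12.0.100-10.12.0.200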
Optional. Add a subnet(s) for the Storage replication network.
Set the ipam/SVC-ceph-cluster label with the value "1" to create
a subnet that will be used to configure the Ceph cluster network.
Set the cluster.sigs.k8s.io/cluster-name label to the name
of the target cluster during the subnet creation.
Use this subnet in the L2 template for storage nodes.
Assign this subnet to the interface connected to your Storage replication
network.
Ceph will automatically use this subnet for its internal replication
traffic.
Optional. Add a subnet for Kubernetes pods traffic.
Use this subnet in the L2 template for all nodes in the cluster.
Assign this subnet to the interface connected to your Kubernetes
workloads network.
Use the npTemplate.bridges.k8s-pods bridge name in the L2 template.
This bridge name is reserved for the Kubernetes workloads network.
When the k8s-pods bridge is defined in an L2 template,
Calico CNI uses that network for routing the pods traffic between nodes.
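A hedged npTemplate fragment that illustrates this; the interface name and the
k8s-pods-subnet name are assumptions and must match the ethernets and l3Layout
definitions of your L2 template:
bridges:
  k8s-pods:
    interfaces: [ten10gbe0s0]
    addresses:
    - {{ip "k8s-pods:k8s-pods-subnet"}}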
The Subnet object status includes the following fields:
state
Contains a short state description and a more detailed one if
applicable. The short status values are as follows:
OK - object is operational.
ERR - object is non-operational. This status has a detailed
description in the messages list.
TERM - object was deleted and is terminating.
messages (Since 2.23.0)
Contains error or warning messages if the object state is ERR.
For example, ERR: Wrong includeRange for CIDR….
statusMessage
Deprecated since Container Cloud 2.23.0 and will be removed in one
of the following releases in favor of state and messages.
Since Container Cloud 2.24.0, this field is not set for the objects
of newly created clusters.
cidr
Reflects the actual CIDR, has the same meaning as spec.cidr.
gateway
Reflects the actual gateway, has the same meaning as spec.gateway.
nameservers
Reflects the actual name servers, has the same meaning as spec.nameservers.
ranges
Specifies the address ranges that are calculated using the fields from
spec:cidr,includeRanges,excludeRanges,gateway,useWholeCidr.
These ranges are directly used for nodes IP allocation.
allocatable
Includes the number of currently available IP addresses that can be allocated
for nodes from the subnet.
allocatedIPs
Specifies the list of IPv4 addresses with the corresponding IPaddr object IDs
that were already allocated from the subnet.
capacity
Contains the total number of IP addresses held by ranges, which equals
the sum of the allocatable and allocatedIPs parameter values.
objCreated
Date, time, and IPAM version of the Subnet CR creation.
objStatusUpdated
Date, time, and IPAM version of the last update of the status
field in the Subnet CR.
objUpdated
Date, time, and IPAM version of the last Subnet CR update
by kaas-ipam.
Example of a successfully created subnet:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  labels:
    ipam/UID: 6039758f-23ee-40ba-8c0f-61c01b0ac863
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
    ipam/SVC-k8s-lcm: "1"
  name: kaas-mgmt
  namespace: default
spec:
  cidr: 172.16.170.0/24
  excludeRanges:
  - 172.16.170.100
  - 172.16.170.101-172.16.170.139
  gateway: 172.16.170.1
  includeRanges:
  - 172.16.170.70-172.16.170.99
  nameservers:
  - 172.18.176.6
  - 172.18.224.6
status:
  allocatable: 27
  allocatedIPs:
  - 172.16.170.70:ebabace8-7d9e-4913-a938-3d9e809f49fc
  - 172.16.170.71:c1109596-fba1-471b-950b-b1b60ef2c37c
  - 172.16.170.72:94c25734-c046-4a7e-a0fb-75582c5f20a9
  capacity: 30
  checksums:
    annotations: sha256:38e0b9de817f645c4bec37c0d4a3e58baecccb040f5718dc069a72c7385a0bed
    labels: sha256:5ed97704b05f15b204c1347603f9749ac015c29a4a16c6f599eed06babfb312e
    spec: sha256:60ead7c744564b3bfbbb3c4e846bce54e9128be49a279bf0c2bbebac2cfcebe6
  cidr: 172.16.170.0/24
  gateway: 172.16.170.1
  labelSetChecksum: 5ed97704b05f15b204c1347603f9749ac015c29a4a16c6f599eed06babfb312e
  nameservers:
  - 172.18.176.6
  - 172.18.224.6
  objCreated: 2023-03-03T03:06:20.00000Z by v6.4.999-20230127-091906-c451398
  objStatusUpdated: 2023-03-03T04:05:14.48469Z by v6.4.999-20230127-091906-c451398
  objUpdated: 2023-03-03T04:05:14.48469Z by v6.4.999-20230127-091906-c451398
  ranges:
  - 172.16.170.70-172.16.170.99
  state: OK
Proceed to creating an L2 template for one or multiple managed clusters
as described in Create L2 templates.
Operators of Mirantis Container Cloud for on-demand self-service
Kubernetes deployments will want their users to create networks without
extensive knowledge about network topology or IP addresses. For
that purpose, the Operator can prepare L2 network templates in advance
so that users can assign these templates to machines in their clusters.
The Operator can ensure that the users’ clusters have separate
IP address spaces using the SubnetPool resource.
SubnetPool allows for automatic creation of Subnet objects
that will consume blocks from the parent SubnetPool CIDR IP address
range. The SubnetPool blockSize setting defines the IP address
block size to allocate to each child Subnet. SubnetPool has a global
scope, so any SubnetPool can be used to create the Subnet objects
for any namespace and for any cluster.
You can use the SubnetPool resource in the L2Template resources to
automatically allocate IP addresses from an appropriate IP range that
corresponds to a specific cluster, or create a Subnet resource
if it does not exist yet. This way, every cluster will use subnets
that do not overlap with other clusters.
To automate multiple subnet creation using SubnetPool:
Log in to a local machine where your management cluster kubeconfig
is located and where kubectl is installed.
Note
The management cluster kubeconfig is created
during the last stage of the management cluster bootstrap.
Create the subnetpool.yaml file with a number of subnet pools:
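A minimal sketch of a SubnetPool definition; the CIDR, block size, and
nameservers values are assumptions, while the kaas-mgmt name matches the
verification example below:
apiVersion: ipam.mirantis.com/v1alpha1
kind: SubnetPool
metadata:
  name: kaas-mgmt
  namespace: default                 # SubnetPool has a global scope
  labels:
    kaas.mirantis.com/provider: baremetal
spec:
  cidr: 10.10.0.0/16                 # parent CIDR to carve child Subnets from
  blockSize: /25                     # IP address block size of each child Subnet
  nameservers:                       # assumed to be inherited by child Subnets
  - 172.18.176.6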
Note
You can define either or both subnets and subnet pools,
depending on the use case. A single L2 template can use
either or both subnets and subnet pools.
For the specification fields description of the SubnetPool object,
see SubnetPool spec.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Verify that the subnet pool is successfully created:
kubectl get subnetpool kaas-mgmt -o yaml
In the system output, verify the status fields of the SubnetPool object.
For the status fields description of the SubnetPool object,
see SubnetPool status.
Proceed to creating an L2 template for one or multiple managed clusters
as described in Create L2 templates. In this procedure, select
the exemplary L2 template for multiple subnets.
Caution
Using the l3Layout section, define all subnets that are used
in the npTemplate section.
Defining only part of subnets is not allowed.
If labelSelector is used in l3Layout, use any custom
label name that differs from system names. This allows for easier
cluster scaling in case of adding new subnets as described in
Expand IP addresses capacity in an existing cluster.
Mirantis recommends using a unique label prefix such as
user-defined/.
Since Container Cloud 2.9.0, L2 templates have a new format.
In the new L2 templates format, l2template:status:npTemplate
is used directly during provisioning. Therefore, a hardware node
obtains and applies a complete network configuration
during the first system boot.
After you create subnets for one or more managed clusters or projects
as described in Create subnets or Automate multiple subnet creation using SubnetPool,
follow the procedure below to create L2 templates for a managed cluster.
This procedure contains exemplary L2 templates for the following use cases:
This section contains an exemplary L2 template that demonstrates how to
set up bonds and bridges on hosts for your managed clusters
as described in Create L2 templates.
Use of a dedicated network for Kubernetes pods traffic,
for external connection to the Kubernetes services exposed
by the cluster, and for the Ceph cluster access and replication
traffic is available as Technology Preview. Use such
configurations for testing and evaluation purposes only.
For the Technology Preview feature definition,
refer to Technology Preview features.
Configure bonding options using the parameters field. The only
mandatory option is mode. See the example below for details.
Note
You can set any mode supported by
netplan
and your hardware.
Important
Bond monitoring is disabled in Ubuntu by default. However,
Mirantis highly recommends enabling it using Media Independent Interface
(MII) monitoring by setting the mii-monitor-interval parameter to a
non-zero value. For details, see Linux documentation: bond monitoring.
The Kubernetes LCM network connects LCM Agents running on nodes to the LCM API
of the management cluster. It is also used for communication between
kubelet and Kubernetes API server inside a Kubernetes cluster. The MKE
components use this network for communication inside a swarm cluster.
To configure each node with an IP address that will be used for LCM traffic,
use the npTemplate.bridges.k8s-lcm bridge in the L2 template, as
demonstrated in the example below.
Each node of every cluster must have only one IP address in the
LCM network that is allocated from one of the Subnet objects having the
ipam/SVC-k8s-lcm label defined. Therefore, all Subnet objects used for
LCM networks must have the ipam/SVC-k8s-lcm label defined. For details,
see Service labels and their life cycle.
As defined in Host networking, the LCM network can be collocated
with the PXE network.
Dedicated network for the Kubernetes pods traffic¶
If you want to use a dedicated network for Kubernetes pods traffic,
configure each node with an IPv4
address that will be used to route the pods traffic between nodes.
To accomplish that, use the npTemplate.bridges.k8s-pods bridge
in the L2 template, as demonstrated in the example below.
As defined in Host networking, this bridge name is reserved for the
Kubernetes pods network. When the k8s-pods bridge is defined in an L2
template, Calico CNI uses that network for routing the pods traffic between
nodes.
Dedicated network for the Kubernetes services traffic (MetalLB)¶
You can use a dedicated network for external connection to the Kubernetes
services exposed by the cluster.
If enabled, MetalLB will listen and respond on the dedicated virtual bridge.
To accomplish that, configure each node where metallb-speaker is deployed
with an IPv4 address. For details on selecting nodes for metallb-speaker,
see Configure node selector for MetalLB speaker.
Both the MetalLB IP address ranges and the IP
addresses configured on those nodes must fit in the same CIDR.
Use the npTemplate.bridges.k8s-ext bridge in the L2 template,
as demonstrated in the example below.
This bridge name is reserved for the Kubernetes external network.
The Subnet object that corresponds to the k8s-ext bridge must have
explicitly excluded the IP address ranges that are in use by MetalLB.
Dedicated network for the Ceph distributed storage traffic¶
You can configure dedicated networks for the Ceph cluster access and
replication traffic. Set labels on the Subnet CRs for the corresponding
networks, as described in Create subnets.
Container Cloud automatically configures Ceph to use the addresses from these
subnets. Ensure that the addresses are assigned to the storage nodes.
Use the npTemplate.bridges.ceph-cluster and
npTemplate.bridges.ceph-public bridges in the L2 template,
as demonstrated in the example below. These names are reserved for the Ceph
cluster access (public) and replication (cluster) networks.
The Subnet objects used to assign IP addresses to these bridges
must have corresponding labels ipam/SVC-ceph-public for the
ceph-public bridge and ipam/SVC-ceph-cluster for the
ceph-cluster bridge.
Example of an L2 template with interfaces bonding¶
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
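The following is a condensed sketch that combines the guidance above; the
interface and subnet names are assumptions, and the bond parameters show the
mandatory mode option together with the recommended mii-monitor-interval:
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  name: test-bond                                    # assumed name
  namespace: managed-ns                              # assumed project namespace
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: my-cluster
spec:
  autoIfMappingPrio:
  - provision
  - eno
  - ens
  - enp
  l3Layout:
  - subnetName: lcm-subnet
    scope: namespace
  npTemplate: |
    version: 2
    ethernets:
      ten10gbe0s0:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 2}}
        set-name: {{nic 2}}
      ten10gbe0s1:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 3}}
        set-name: {{nic 3}}
    bonds:
      bond0:
        interfaces:
        - ten10gbe0s0
        - ten10gbe0s1
        parameters:
          mode: 802.3ad              # the only mandatory bonding option
          mii-monitor-interval: 100  # enables MII-based bond monitoring
    bridges:
      k8s-lcm:
        interfaces: [bond0]
        addresses:
        - {{ip "k8s-lcm:lcm-subnet"}}
        gateway4: {{gateway_from_subnet "lcm-subnet"}}
        nameservers:
          addresses: {{nameservers_from_subnet "lcm-subnet"}}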
L2 template example for automatic multiple subnet creation¶
This section contains an exemplary L2 template for automatic multiple
subnet creation as described in Automate multiple subnet creation using SubnetPool. This template
also contains the L3Layout section that allows defining the Subnet
scopes and enables auto-creation of the Subnet objects from the
SubnetPool objects. For details about auto-creation of the Subnet
objects see Automate multiple subnet creation using SubnetPool.
Do not assign an IP address explicitly to the PXE NIC
({{nic 0}} in the example) to prevent IP duplication during updates.
The IP address is automatically assigned by the
bootstrapping engine.
Example of an L2 template for multiple subnets:
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  name: test-managed
  namespace: managed-ns
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
    cluster.sigs.k8s.io/cluster-name: my-cluster
spec:
  autoIfMappingPrio:
  - provision
  - eno
  - ens
  - enp
  l3Layout:
  - subnetName: lcm-subnet
    scope: namespace
  - subnetName: subnet-1
    subnetPool: kaas-mgmt
    scope: namespace
  - subnetName: subnet-2
    subnetPool: kaas-mgmt
    scope: cluster
  npTemplate: |
    version: 2
    ethernets:
      onboard1gbe0:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 0}}
        set-name: {{nic 0}}
        # IMPORTANT: do not assign an IP address here explicitly
        # to prevent IP duplication issues. The IP will be assigned
        # automatically by the bootstrapping engine.
        # addresses: []
      onboard1gbe1:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 1}}
        set-name: {{nic 1}}
      ten10gbe0s0:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 2}}
        set-name: {{nic 2}}
        addresses:
        - {{ip "2:subnet-1"}}
      ten10gbe0s1:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 3}}
        set-name: {{nic 3}}
        addresses:
        - {{ip "3:subnet-2"}}
    bridges:
      k8s-lcm:
        interfaces: [onboard1gbe0]
        addresses:
        - {{ip "k8s-lcm:lcm-subnet"}}
        gateway4: {{gateway_from_subnet "lcm-subnet"}}
        nameservers:
          addresses: {{nameservers_from_subnet "lcm-subnet"}}
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
In the template above, the following networks are defined
in the l3Layout section:
lcm-subnet - the subnet name to use for the
LCM network in the npTemplate. This subnet is
shared between the project clusters because it has the namespaced scope.
Since a subnet pool is not in use, manually create the corresponding Subnet
object before machines are attached to the cluster. For details, see
Create subnets for a managed cluster using CLI.
Mark this Subnet with the ipam/SVC-k8s-lcm label.
The L2 template must contain the definition of the virtual Linux bridge
(k8s-lcm in the L2 template example) that is used to set up the LCM
network interface. IP addresses for the defined bridge must be assigned
from the LCM subnet, which is marked with the ipam/SVC-k8s-lcm label.
Each node of every cluster must have only one IP address in the
LCM network that is allocated from one of the Subnet objects having the
ipam/SVC-k8s-lcm label defined. Therefore, all Subnet objects used for
LCM networks must have the ipam/SVC-k8s-lcm label defined. For details,
see Service labels and their life cycle.
subnet-1 - unless already created, this subnet will be created
from the kaas-mgmt subnet pool. The subnet name must be unique within
the project. This subnet is shared between the project clusters.
subnet-2 - will be created from the kaas-mgmt subnet pool.
This subnet has the cluster scope. Therefore, the real name of the
Subnet CR object consists of the subnet name defined in l3Layout
and the cluster UID.
But the npTemplate section of the L2 template must contain only
the subnet name defined in l3Layout.
The subnets of the cluster scope are not shared between clusters.
Caution
Using the l3Layout section, define all subnets that are used
in the npTemplate section.
Defining only part of subnets is not allowed.
If labelSelector is used in l3Layout, use any custom
label name that differs from system names. This allows for easier
cluster scaling in case of adding new subnets as described in
Expand IP addresses capacity in an existing cluster.
Mirantis recommends using a unique label prefix such as
user-defined/.
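For example, an l3Layout entry that selects the LCM subnet by a custom label
may look as follows. The user-defined/group label name and its value are
illustrative:

l3Layout:
  - subnetName: lcm-subnet
    scope: namespace
    labelSelector:
      user-defined/group: lcm-subnets  # must match the label set on the corresponding Subnet objects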
Caution
Modification of L2 templates in use is allowed with a mandatory
validation step from the Infrastructure Operator to prevent accidental
cluster failures due to unsafe changes. The list of risks posed by modifying
L2 templates includes:
Services running on hosts cannot reconfigure automatically to switch to
the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly, which can cause
data loss.
Incorrect configurations on hosts can lead to irrevocable loss of
connectivity between services and unexpected cluster partition or
disassembly.
You can create several L2 templates with different
configurations to be applied to different nodes of
the same cluster. See Assign L2 templates to machines for details.
Add or edit the mandatory parameters in the new L2 template.
The following tables provide the description of the mandatory
parameters in the example templates mentioned in the previous step.
clusterRef
Deprecated since Container Cloud 2.25.0 in favor of the mandatory
cluster.sigs.k8s.io/cluster-name label. Will be removed in one of the
following releases.
On existing clusters, this parameter is automatically migrated to the
cluster.sigs.k8s.io/cluster-name label since 2.25.0.
If an existing cluster has clusterRef:default set, the migration process
involves removing this parameter. Subsequently, it is not substituted with
the cluster.sigs.k8s.io/cluster-name label, ensuring the application of
the L2 template across the entire Kubernetes namespace.
The Cluster object name that this template is applied to.
The default value is used to apply the given template to all clusters
within a particular project, unless an L2 template that references
a specific cluster name exists. The clusterRef field has priority over
the cluster.sigs.k8s.io/cluster-name label:
When clusterRef is set to a non-default value, the
cluster.sigs.k8s.io/cluster-name label will be added or updated with
that value.
When clusterRef is set to default, the
cluster.sigs.k8s.io/cluster-name label will be absent or removed.
L2 template requirements
An L2 template must have the same project (Kubernetes namespace) as the
referenced cluster.
A cluster can be associated with many L2 templates. Only one of them can
have the ipam/DefaultForCluster label. Every L2 template that does not
have the ipam/DefaultForCluster label can be later assigned to a
particular machine using l2TemplateSelector.
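For illustration, a Machine object may reference a non-default L2 template by
name in its provider specification. The template name below is a placeholder:

spec:
  providerSpec:
    value:
      l2TemplateSelector:
        name: <nonDefaultL2TemplateName>

Depending on your setup, the selector can also reference the template by
label, as described for the l2template-<NAME>:"exists" label later in this
section.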
The following rules apply to the default L2 template of a namespace:
Since Container Cloud 2.25.0, creation of the default L2 template for
a namespace is disabled. On existing clusters, the
Spec.clusterRef:default parameter of such an L2 template is
automatically removed during the migration process. Subsequently,
this parameter is not substituted with the
cluster.sigs.k8s.io/cluster-name label, ensuring the application
of the L2 template across the entire Kubernetes namespace. Therefore,
you can continue using existing default namespaced L2 templates.
Before Container Cloud 2.25.0, the default L2Template object of a
namespace must have the Spec.clusterRef:default parameter that is
deprecated since 2.25.0.
ifMapping or autoIfMappingPrio
ifMapping
List of interface names for the template. The interface mapping is defined
globally for all bare metal hosts in the cluster but can be overridden at the
host level, if required, by editing the IpamHost object for a particular
host. The ifMapping parameter is mutually exclusive with
autoIfMappingPrio.
autoIfMappingPrio
autoIfMappingPrio is a list of prefixes, such as eno, ens,
and so on, to match the interfaces to automatically create a list
for the template. If you are not aware of any specific
ordering of interfaces on the nodes, use the default
ordering from
Predictable Network Interfaces Names specification for systemd.
You can also override the default NIC list per host
using the IfMappingOverride parameter of the corresponding
IpamHost. The provision value corresponds to the network
interface that was used to provision a node.
Usually, it is the first NIC found on a particular node.
It is defined explicitly to ensure that this interface
will not be reconfigured accidentally.
The autoIfMappingPrio parameter is mutually exclusive
with ifMapping.
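For example, an explicit mapping, which pins the template to a fixed NIC
order, may look as follows. The interface names are illustrative:

spec:
  ifMapping:
    - eno1
    - eno2
    - enp134s0f0
    - enp134s0f1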
l3Layout
Subnets to be used in the npTemplate section. The field contains
a list of subnet definitions with parameters used by template macros.
subnetName
Defines the alias name of the subnet that can be used to reference this
subnet from the template macros. This parameter is mandatory for every
entry in the l3Layout list.
subnetPool. Unsupported since 2.28.0 (17.3.0 and 16.3.0)
Optional. Default: none. Defines a name of the parent SubnetPool object
that will be used to create a Subnet object with a given subnetName
and scope. For deprecation details, see MOSK Deprecation Notes:
SubnetPool resource management.
If a corresponding Subnet object already exists,
nothing will be created and the existing object will be used.
If no SubnetPool is provided, no new Subnet object will be created.
scope
Logical scope of the Subnet object with a corresponding subnetName.
Possible values:
global - the Subnet object is accessible globally,
for any Container Cloud project and cluster, for example, the PXE subnet.
namespace - the Subnet object is accessible within the same
project where the L2 template is defined.
cluster - the Subnet object is only accessible to the cluster
that L2Template.spec.clusterRef refers to. The Subnet objects
with the cluster scope will be created for every new cluster.
labelSelector
Contains a dictionary of labels and their respective values that will be
used to find the matching Subnet object for the subnet. If the
labelSelector field is omitted, the Subnet object will be selected
by name, specified by the subnetName parameter.
Caution
The labels and their values in this section must match the ones
added for the corresponding Subnet object.
Caution
The l3Layout section is mandatory for each L2Template
custom resource.
npTemplate
A netplan-compatible configuration with special lookup functions that
defines the networking settings for the cluster hosts, where physical
NIC names and details are parameterized. This configuration will be
processed using Go templates. Instead of specifying IP and MAC addresses,
interface names, and other network details specific to a particular host,
the template supports use of special lookup functions. These lookup
functions, such as nic, mac, ip, and so on, return
host-specific network information when the template is rendered for
a particular host.
Caution
All rules and restrictions of the netplan configuration
also apply to L2 templates. For details,
see the official netplan documentation.
Caution
We strongly recommend following the below conventions on
network interface naming:
A physical NIC name set by an L2 template must not exceed
15 symbols. Otherwise, an L2 template creation fails.
This limit is set by the Linux kernel.
Names of virtual network interfaces such as VLANs, bridges,
bonds, veth, and so on must not exceed 15 symbols.
We recommend setting interface names that do not
exceed 13 symbols for both physical and virtual interfaces
to avoid corner cases and issues in netplan rendering.
The following table describes the main lookup functions for an
L2 template.
Lookup function
Description
{{nic N}}
Name of a NIC number N. NIC numbers correspond to the interface
mapping list. This macro can be used as a key for the elements
of the ethernets map, or as the value of the name and
set-name parameters of a NIC. It is also used to reference the
physical NIC from definitions of virtual interfaces (vlan,
bridge).
{{mac N}}
MAC address of a NIC number N registered during a host hardware
inspection.
{{ip "N:subnet-a"}}
IP address and mask for a NIC number N. The address will be auto-allocated
from the given subnet if the address does not exist yet.
{{ip "br0:subnet-x"}}
IP address and mask for a virtual interface, "br0" in this example.
The address will be auto-allocated from the given subnet
if the address does not exist yet.
For virtual interface names, an IP address placeholder must contain
a human-readable ID that is unique within the L2 template and must
have the following format:
The <shortUniqueHumanReadableID> is made equal to a virtual
interface name throughout this document and Container Cloud
bootstrap templates.
{{cidr_from_subnet "subnet-a"}}
IPv4 CIDR address from the given subnet.
{{gateway_from_subnet "subnet-a"}}
IPv4 default gateway address from the given subnet.
{{nameservers_from_subnet "subnet-a"}}
List of the IP addresses of name servers from the given subnet.
{{cluster_api_lb_ip}}
Technology Preview since Container Cloud 2.24.4. IP address for
a cluster API load balancer.
Note
Every subnet referenced in an L2 template can have either
a global or namespaced scope. In the latter case, the subnet
must exist in the same project where the corresponding cluster
and L2 template are located.
Optional. To designate an L2 template as default, assign the
ipam/DefaultForCluster label to it. Only one L2 template in
a cluster can have this label. It will be used for machines that
do not have an L2 template explicitly assigned to them.
To assign the default template to the cluster:
Since Container Cloud 2.25.0, use the mandatory
cluster.sigs.k8s.io/cluster-name label in the L2 template
metadata section.
Before Container Cloud 2.25.0, use the
cluster.sigs.k8s.io/cluster-name label or the clusterRef
parameter in the L2 template spec section. This parameter is
deprecated and will be removed in one of the following releases. During
cluster update to 2.25.0, this parameter is automatically migrated to the
cluster.sigs.k8s.io/cluster-name label.
Optional. Add the l2template-<NAME>:"exists" label to the L2 template.
Replace <NAME> with the unique L2 template name or any other unique
identifier. You can refer to this label to assign this L2 template
when you create machines.
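Combining the optional labels above, the metadata section of a default L2
template may look as follows. The object and cluster names are illustrative:

metadata:
  name: test-managed
  namespace: managed-ns
  labels:
    ipam/DefaultForCluster: "1"
    l2template-test-managed: "exists"
    cluster.sigs.k8s.io/cluster-name: my-cluster
    kaas.mirantis.com/provider: baremetal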
Add the L2 template to your management cluster. Select one of the following
options:
In the left sidebar, navigate to Networks and click
the L2 Templates tab.
Click Create L2 Template.
Fill out the Create L2 Template form as required:
Name
L2 template name.
Cluster
Cluster name that the L2 template is being added for. To set the
L2 template as default for all machines, also select
Set default for the cluster.
Specification
L2 specification in the YAML format that you have previously created.
Click Edit to edit the L2 template if required.
Note
Before Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0), the field name is YAML file, and you can
upload the required YAML file instead of inserting and editing it.
Labels
Available since Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0). Key-value pairs attached to the L2 template. For details,
see API Reference: L2Template metadata.
Proceed with Add a machine.
The resulting L2 template will be used to render the netplan configuration
for the managed cluster machines.
Workflow of the netplan configuration using an L2 template¶
The kaas-ipam service uses the data from BareMetalHost,
the L2 template, and subnets to generate the netplan configuration
for every cluster machine.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
The generated netplan configuration is saved in the
status.netconfigFiles section of the IpamHost resource.
If the status.netconfigFilesState field of the IpamHost resource
is OK, the configuration was rendered in the IpamHost resource
successfully. Otherwise, the status contains an error message.
Caution
The following fields of the ipamHost status are renamed since
Container Cloud 2.22.0 in the scope of the L2Template and IpamHost
objects refactoring:
netconfigV2 to netconfigCandidate
netconfigV2state to netconfigCandidateState
netconfigFilesState to netconfigFilesStates (per file)
No user actions are required after renaming.
The format of netconfigFilesState changed after renaming. The
netconfigFilesStates field contains a dictionary of statuses of network
configuration files stored in netconfigFiles. The dictionary keys are
file paths, and the values have the same per-file meaning that
netconfigFilesState previously had:
For a successfully rendered configuration file:
OK:<timestamp><sha256-hash-of-rendered-file>, where a timestamp
is in the RFC 3339 format.
For a failed rendering: ERR:<error-message>.
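A schematic IpamHost status fragment that illustrates both outcomes, with
placeholders instead of real paths and values:

status:
  netconfigFilesStates:
    <path-to-rendered-file>: 'OK:<timestamp> <sha256-hash-of-rendered-file>'
    <path-to-failed-file>: 'ERR:<error-message>'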
The baremetal-provider service copies data
from the status.netconfigFiles of IpamHost to the
Spec.StateItemsOverwrites['deploy']['bm_ipam_netconfigv2'] parameter
of LCMMachine.
The lcm-agent service on every host synchronizes the LCMMachine
data to its host. The lcm-agent service runs
a playbook to update the netplan configuration on the host
during the pre-download and deploy phases.
Configure BGP announcement for cluster API LB address¶
Technology Preview. Available since 2.24.4
When you create a bare metal managed cluster with the multi-rack topology,
where Kubernetes masters are distributed across multiple racks
without an L2 layer extension between them, you must configure
BGP announcement of the cluster API load balancer address.
For clusters where Kubernetes masters are in the same rack or with an L2 layer
extension between masters, you can configure either BGP or L2 (ARP)
announcement of the cluster API load balancer address.
The L2 (ARP) announcement is used by default and its configuration is covered
in Create a cluster using web UI.
Caution
Create Rack and MultiRackCluster objects, which are
described in the below procedure, before initiating the provisioning
of master nodes to ensure that both BGP and netplan configurations
are applied simultaneously during the provisioning process.
To enable the use of BGP announcement for the cluster API LB address:
In the Cluster object, set the useBGPAnnouncement parameter
to true:
spec:
  providerSpec:
    value:
      useBGPAnnouncement: true
Create the MultiRackCluster object that is mandatory when configuring
BGP announcement for the cluster API LB address. This object enables you
to set cluster-wide parameters for configuration of BGP announcement.
In this scenario, the MultiRackCluster object must be bound to the
corresponding Cluster object using the
cluster.sigs.k8s.io/cluster-name label.
Container Cloud uses the bird BGP daemon for announcement of the cluster
API LB address. For this reason, set the corresponding
bgpdConfigFileName and bgpdConfigFilePath parameters in the
MultiRackCluster object, so that bird can locate the configuration
file. For details, see the configuration example below.
The bgpdConfigTemplate object contains the default configuration file
template for the bird BGP daemon, which you can override in Rack
objects.
The defaultPeer parameter contains default parameters of the BGP
connection from master nodes to infrastructure BGP peers, which you can
override in Rack objects.
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting from these releases. On existing
clusters updated to these releases, or if the label is added manually,
it is ignored by Container Cloud.
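A minimal MultiRackCluster sketch that reflects the parameters described
above. The file location, template content, and ASN values are illustrative
only:

apiVersion: ipam.mirantis.com/v1alpha1
kind: MultiRackCluster
metadata:
  name: multirack-test-cluster
  namespace: managed-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
spec:
  bgpdConfigFileName: bird.conf   # file name that bird expects
  bgpdConfigFilePath: /etc/bird   # directory where bird looks up the file
  bgpdConfigTemplate: |
    ...
  defaultPeer:
    localASN: 65101               # illustrative ASN values, can be
    neighborASN: 65100            # overridden in Rack objects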
Create the Rack object(s). This object is mandatory when configuring
BGP announcement for the cluster API LB address and it allows you
to configure BGP announcement parameters for each rack.
In this scenario, Rack objects must be bound to Machine objects
corresponding to master nodes of the cluster.
Each Rack object describes the configuration for the bird BGP
daemon used to announce the cluster API LB address from a particular
master node or from several master nodes in the same rack.
The Machine object can optionally define the rack-id node label
that is not used for BGP announcement of the cluster API LB IP but
can be used for MetalLB. This label is required for MetalLB node selectors
when MetalLB is used to announce LB IP addresses on nodes that are
distributed across multiple racks. In this scenario, the L2 (ARP)
announcement mode cannot be used for MetalLB because master nodes are in
different L2 segments. So, the BGP announcement mode must be used for
MetalLB, and node selectors are required to properly configure BGP
connections from each node. See Configure MetalLB for details.
The L2Template object includes the lo interface configuration
to set the IP address for the bird BGP daemon that will be advertised
as the cluster API LB address. The {{ cluster_api_lb_ip }}
function is used in npTemplate to obtain the cluster API LB address
value.
Configuration example for Rack
apiVersion: ipam.mirantis.com/v1alpha1
kind: Rack
metadata:
  name: rack-master-1
  namespace: managed-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  bgpdConfigTemplate: |   # optional
    ...
  peeringMap:
    lcm-rack-control-1:
      peers:
        - neighborIP: 10.77.31.2   # "localASN" & "neighborASN" are taken from
        - neighborIP: 10.77.31.3   # "MultiRackCluster.spec.defaultPeer" if
                                   # not set here
Configuration example for Machine
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: test-cluster-master-1
  namespace: managed-ns
  annotations:
    metal3.io/BareMetalHost: managed-ns/test-cluster-master-1
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    cluster.sigs.k8s.io/control-plane: controlplane
    hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
    ipam/RackRef: rack-master-1   # reference to the "rack-master-1" Rack
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  providerSpec:
    value:
      kind: BareMetalMachineProviderSpec
      apiVersion: baremetal.k8s.io/v1alpha1
      hostSelector:
        matchLabels:
          kaas.mirantis.com/baremetalhost-id: test-cluster-master-1
      l2TemplateSelector:
        name: test-cluster-master-1
      nodeLabels:                # optional. it is not used for BGP announcement
        - key: rack-id           # of the cluster API LB IP but it can be used
          value: rack-master-1   # for MetalLB if "nodeSelectors" are required
      ...
Configuration example for L2Template
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
  name: test-cluster-master-1
  namespace: managed-ns
spec:
  ...
  l3Layout:
    - subnetName: lcm-rack-control-1   # this network is referenced
      scope: namespace                 # in the "rack-master-1" Rack
    - subnetName: ext-rack-control-1   # optional. this network is used
      scope: namespace                 # for k8s services traffic and
                                       # MetalLB BGP connections
  ...
  npTemplate: |
    ...
    ethernets:
      lo:
        addresses:
          - {{ cluster_api_lb_ip }}   # function for cluster API LB IP
        dhcp4: false
        dhcp6: false
    ...
The configuration example for the scenario where Kubernetes masters are
in the same rack or with an L2 layer extension between masters is described
in Single rack configuration example.
The configuration example for the scenario where Kubernetes masters are
distributed across multiple racks without L2 layer extension between them
is described in Multiple rack configuration example.
After you add machines to your new bare metal cluster as described in
Add a machine to bare metal managed cluster,
create a Ceph cluster on top of this managed cluster using the
Mirantis Container Cloud web UI or CLI.
Mirantis highly recommends adding a Ceph cluster using the CLI
instead of the web UI. For the CLI procedure, refer to Add a Ceph cluster using CLI.
The web UI capabilities for adding a Ceph cluster are limited and lack
flexibility in defining Ceph cluster specifications.
For example, if an error occurs while adding a Ceph cluster using the
web UI, usually you can address it only through the CLI.
The web UI functionality for managing Ceph clusters is going to be
deprecated in one of the following releases.
This section explains how to create a Ceph cluster on top of a managed cluster
using the Mirantis Container Cloud web UI. As a result, you will deploy a Ceph
cluster with minimum three Ceph nodes that provide persistent volumes to
the Kubernetes workloads for your managed cluster.
Note
For the advanced configuration through the KaaSCephCluster custom
resource, see Ceph advanced configuration.
Replication network for Ceph OSDs. Must contain the CIDR definition
and match the corresponding values of the cluster Subnet
object or the environment network values. For configuration examples,
see the descriptions of managed-ns_Subnet_storage YAML files
in Example of a complete L2 templates configuration for cluster creation.
Public Network
Public network for Ceph data. Must contain the CIDR definition and
match the corresponding values of the cluster Subnet object
or the environment network values. For configuration examples,
see the descriptions of managed-ns_Subnet_storage YAML files
in Example of a complete L2 templates configuration for cluster creation.
Enable OSDs LCM
Select to enable LCM for Ceph OSDs.
Machines / Machine #1-3
Select machine
Select the name of the Kubernetes machine that will host
the corresponding Ceph node in the Ceph cluster.
Manager, Monitor
Select the required Ceph services to install on the Ceph node.
Devices
Select the disk that Ceph will use.
Warning
Do not select the device for system services,
for example, sda.
Warning
A Ceph cluster does not support removable devices, that is,
devices on hosts with the hotplug functionality enabled. To use devices as
Ceph OSD data devices, make them non-removable or disable the
hotplug functionality in the BIOS settings for disks that are
configured to be used as Ceph OSD data devices.
Enable Object Storage
Select to enable the single-instance RGW Object Storage.
To add more Ceph nodes to the new Ceph cluster, click +
next to any Ceph Machine title in the Machines tab.
Configure a Ceph node as required.
Warning
Do not add more than 3 Manager and/or Monitor
services to the Ceph cluster.
After you add and configure all nodes in your Ceph cluster, click
Create.
Verify your Ceph cluster as described in Verify Ceph.
Verify that network addresses used on your clusters do not overlap with
the following default MKE network addresses for Swarm and MCR:
10.0.0.0/16 is used for Swarm networks. IP addresses from this
network are virtual.
10.99.0.0/16 is used for MCR networks. IP addresses from this
network are allocated on hosts.
Verification of Swarm and MCR network addresses
To verify Swarm and MCR network addresses, run on any master node:
Usually, not all Swarm and MCR addresses are in use. One Swarm Ingress
network is created by default and occupies the 10.0.0.0/24 address
block. Also, three MCR networks are created by default and occupy
three address blocks: 10.99.0.0/20, 10.99.16.0/20,
10.99.32.0/20.
To verify the actual network state and addresses in use, run:
This section explains how to create a Ceph cluster on top of a managed cluster
using the Mirantis Container Cloud CLI. As a result, you will deploy a Ceph
cluster with minimum three Ceph nodes that provide persistent volumes to
the Kubernetes workloads for your managed cluster.
Note
For the advanced configuration through the KaaSCephCluster custom
resource, see Ceph advanced configuration.
Substitute <managedClusterProject> and <clusterName> with
the corresponding managed cluster namespace and name accordingly.
Example output:
status:
  providerStatus:
    ready: true
    conditions:
      - message: Helm charts are successfully installed(upgraded).
        ready: true
        type: Helm
      - message: Kubernetes objects are fully up.
        ready: true
        type: Kubernetes
      - message: All requested nodes are ready.
        ready: true
        type: Nodes
      - message: Maintenance state of the cluster is false
        ready: true
        type: Maintenance
      - message: TLS configuration settings are applied
        ready: true
        type: TLS
      - message: Kubelet is Ready on all nodes belonging to the cluster
        ready: true
        type: Kubelet
      - message: Swarm is Ready on all nodes belonging to the cluster
        ready: true
        type: Swarm
      - message: All provider instances of the cluster are Ready
        ready: true
        type: ProviderInstance
      - message: LCM agents have the latest version
        ready: true
        type: LCMAgent
      - message: StackLight is fully up.
        ready: true
        type: StackLight
      - message: OIDC configuration has been applied.
        ready: true
        type: OIDC
      - message: Load balancer 10.100.91.150 for kubernetes API has status HEALTHY
        ready: true
        type: LoadBalancer
Create a YAML file with the Ceph cluster specification:
<publicNet> is a CIDR definition or comma-separated list of
CIDR definitions (if the managed cluster uses multiple networks) of
public network for the Ceph data. The values should match the
corresponding values of the cluster Subnet object.
<clusterNet> is a CIDR definition or comma-separated list of
CIDR definitions (if the managed cluster uses multiple networks) of
replication network for the Ceph data. The values should match
the corresponding values of the cluster Subnet object.
Configure Subnet objects for the Storage access network by setting
ipam/SVC-ceph-public:"1" and ipam/SVC-ceph-cluster:"1" labels
to the corresponding Subnet objects. For more details, refer to
Create subnets for a managed cluster using CLI, Step 5.
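For example, the metadata of the corresponding Subnet objects may carry the
following labels. The object name is illustrative:

apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: ceph-public-subnet   # illustrative name
  namespace: managed-ns
  labels:
    ipam/SVC-ceph-public: "1"   # use ipam/SVC-ceph-cluster: "1" for the replication network Subnet
    cluster.sigs.k8s.io/cluster-name: managed-cluster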
Configure Ceph Manager and Ceph Monitor roles to select nodes that
should place Ceph Monitor and Ceph Manager daemons:
Obtain the names of the machines to place Ceph Monitor and Ceph
Manager daemons at:
kubectl -n <managedClusterProject> get machine
Add the nodes section with mon and mgr roles defined:
Substitute <mgr-node-X> with the corresponding Machine object
names and <role-X> with the corresponding roles of daemon placement,
for example, mon or mgr.
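A sketch of the resulting nodes section with placeholder machine names:

spec:
  cephClusterSpec:
    nodes:
      <mgr-node-1>:
        roles:
          - mon
          - mgr
      <mgr-node-2>:
        roles:
          - mon
          - mgr
      <mgr-node-3>:
        roles:
          - mon
          - mgr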
Configure Ceph OSD daemons for Ceph cluster data storage:
Note
This step involves the deployment of Ceph Monitor and Ceph Manager
daemons on nodes that are different from the ones hosting Ceph cluster
OSDs. However, it is also possible to colocate Ceph OSDs, Ceph Monitor,
and Ceph Manager daemons on the same nodes. You can achieve this by
configuring the roles and storageDevices sections accordingly.
This kind of configuration flexibility is particularly useful in
scenarios such as hyper-converged clusters.
Warning
The minimal production cluster requires at least three nodes
for Ceph Monitor daemons and three nodes for Ceph OSDs.
Obtain the names of the machines with disks intended for storing Ceph
data:
kubectl -n <managedClusterProject> get machine
For each machine, use status.providerStatus.hardware.storage
to obtain information about node disks:
Select by-id symlinks on the disks to be used in the Ceph cluster.
The symlinks should meet the following requirements:
A by-id symlink should contain
status.providerStatus.hardware.storage.serialNumber
A by-id symlink should not contain wwn
For the example above, if you want to use the sdc disk
to store Ceph data, use the
/dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dbdc symlink.
It will be persistent and will not be affected by node reboot.
Specify selected by-id symlinks in the
spec.cephClusterSpec.nodes.storageDevices.fullPath field
along with the
spec.cephClusterSpec.nodes.storageDevices.config.deviceClass
field:
Substitute <storage-node-X> with the corresponding Machine
object names
<byIDSymlink-X> with the obtained by-id symlinks from
status.providerStatus.hardware.storage.byIDs
<deviceClass-X> with the obtained disk types from
status.providerStatus.hardware.storage.type
Before Container Cloud 2.25.0
Specify selected by-id symlinks in the
spec.cephClusterSpec.nodes.storageDevices.name field
along with the
spec.cephClusterSpec.nodes.storageDevices.config.deviceClass
field:
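For example, a storage node entry with a single data disk may look as
follows. The machine name and symlink are placeholders; before Container
Cloud 2.25.0, use name instead of fullPath:

spec:
  cephClusterSpec:
    nodes:
      <storage-node-1>:
        storageDevices:
          - fullPath: <byIDSymlink-1>        # for example, /dev/disk/by-id/scsi-SQEMU_QEMU_HARDDISK_2e52abb48862dbdc
            config:
              deviceClass: <deviceClass-1>   # for example, ssd or hdd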
Wait for the KaaSCephCluster status to appear and for
status.shortClusterInfo.state to become Ready:
kubectl -n <managedClusterProject> get kcc -o yaml
Example of a complete L2 templates configuration for cluster creation¶
The following example contains all required objects of an advanced network
and host configuration for a baremetal-based managed cluster.
The procedure below contains:
Various .yaml objects to be applied with a managed cluster
kubeconfig
Useful comments inside the .yaml example files
Example hardware and configuration data, such as network, disk,
auth, that must be updated accordingly to fit your cluster configuration
Example templates, such as l2template and baremetalhostprofile,
that illustrate how to implement a specific configuration
Caution
The exemplary configuration described below is not production
ready and is provided for illustration purposes only.
For illustration purposes, all files provided in this exemplary procedure
are named by the Kubernetes object types:
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
Create an empty .yaml file with the namespace object:
apiVersion: v1
kind: Namespace
metadata:
  name: managed-ns
Select from the following options:
Since Container Cloud 2.21.0 and 2.21.1 for MOSK 22.5
Create the required number of .yaml files with the
BareMetalHostCredential objects for each bmh node with the
unique name and authentication data. The following example
contains one BareMetalHostCredential object:
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting from these releases. On existing
clusters updated to these releases, or if the label is added manually,
it is ignored by Container Cloud.
Create the required number of .yaml files with the Secret
objects for each bmh node with the unique name and
authentication data. The following example contains one Secret
object:
apiVersion: kaas.mirantis.com/v1alpha1
kind: BareMetalHostInventory
metadata:
  annotations:
    inspect.metal3.io/hardwaredetails-storage-sort-term: hctl ASC, wwn ASC, by_id ASC, name ASC
  labels:
    cluster.sigs.k8s.io/cluster-name: managed-cluster
    # we will use those label, to link machine to exact bmh node
    kaas.mirantis.com/baremetalhost-id: cz7700
    kaas.mirantis.com/provider: baremetal
  name: cz7700-managed-cluster-control-noefi
  namespace: managed-ns
spec:
  bmc:
    address: 192.168.1.12
    bmhCredentialsName: 'cz7740-cred'
  bootMACAddress: 0c:c4:7a:34:52:04
  bootMode: legacy
  online: true
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  labels:
    cluster.sigs.k8s.io/cluster-name: managed-cluster
    hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
    # we will use those label, to link machine to exact bmh node
    kaas.mirantis.com/baremetalhost-id: cz7700
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
  annotations:
    kaas.mirantis.com/baremetalhost-credentials-name: cz7700-cred
  name: cz7700-managed-cluster-control-noefi
  namespace: managed-ns
spec:
  bmc:
    address: 192.168.1.12
    # credentialsName is updated automatically during cluster deployment
    credentialsName: ''
  bootMACAddress: 0c:c4:7a:34:52:04
  bootMode: legacy
  online: true
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  labels:
    cluster.sigs.k8s.io/cluster-name: managed-cluster
    hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
    # we will use those label, to link machine to exact bmh node
    kaas.mirantis.com/baremetalhost-id: cz7700
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
  name: cz7700-managed-cluster-control-noefi
  namespace: managed-ns
spec:
  bmc:
    address: 192.168.1.12
    # The secret for credentials requires the username and password
    # keys in the Base64 encoding.
    credentialsName: cz7700-cred
  bootMACAddress: 0c:c4:7a:34:52:04
  bootMode: legacy
  online: true
apiVersion: metal3.io/v1alpha1
kind: BareMetalHostProfile
metadata:
  labels:
    cluster.sigs.k8s.io/cluster-name: managed-cluster
    # This label indicates that this profile will be default in
    # namespaces, so machines w\o exact profile selecting will use
    # this template
    kaas.mirantis.com/defaultBMHProfile: 'true'
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
  name: bmhp-cluster-default
  namespace: managed-ns
spec:
  devices:
    - device:
        byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-1
        minSize: 120Gi
        wipe: true
      partitions:
        - name: bios_grub
          partflags:
            - bios_grub
          size: 4Mi
          wipe: true
        - name: uefi
          partflags:
            - esp
          size: 200Mi
          wipe: true
        - name: config-2
          size: 64Mi
          wipe: true
        - name: lvm_dummy_part
          size: 1Gi
          wipe: true
        - name: lvm_root_part
          size: 0
          wipe: true
    - device:
        byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-2
        minSize: 30Gi
        wipe: true
    - device:
        byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-3
        minSize: 30Gi
        wipe: true
      partitions:
        - name: lvm_lvp_part
          size: 0
          wipe: true
    - device:
        byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-4
        wipe: true
  fileSystems:
    - fileSystem: vfat
      partition: config-2
    - fileSystem: vfat
      mountPoint: /boot/efi
      partition: uefi
    - fileSystem: ext4
      logicalVolume: root
      mountPoint: /
    - fileSystem: ext4
      logicalVolume: lvp
      mountPoint: /mnt/local-volumes/
  grubConfig:
    defaultGrubOptions:
      - GRUB_DISABLE_RECOVERY="true"
      - GRUB_PRELOAD_MODULES=lvm
      - GRUB_TIMEOUT=30
  kernelParameters:
    modules:
      - content: 'options kvm_intel nested=1'
        filename: kvm_intel.conf
    sysctl:
      # For the list of options prohibited to change, refer to
      # https://docs.mirantis.com/mke/3.7/install/predeployment/set-up-kernel-default-protections.html
      fs.aio-max-nr: '1048576'
      fs.file-max: '9223372036854775807'
      fs.inotify.max_user_instances: '4096'
      kernel.core_uses_pid: '1'
      kernel.dmesg_restrict: '1'
      net.ipv4.conf.all.rp_filter: '0'
      net.ipv4.conf.default.rp_filter: '0'
      net.ipv4.conf.k8s-ext.rp_filter: '0'
      net.ipv4.conf.m-pub.rp_filter: '0'
      vm.max_map_count: '262144'
  logicalVolumes:
    - name: root
      size: 0
      vg: lvm_root
    - name: lvp
      size: 0
      vg: lvm_lvp
  postDeployScript: |
    #!/bin/bash -ex
    # used for test-debug only!
    echo "root:r00tme" | sudo chpasswd
    echo 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"' > /etc/udev/rules.d/60-ssd-scheduler.rules
    echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
  preDeployScript: |
    #!/bin/bash -ex
    echo "$(date) pre_deploy_script done" >> /root/pre_deploy_done
  volumeGroups:
    - devices:
        - partition: lvm_root_part
      name: lvm_root
    - devices:
        - partition: lvm_lvp_part
      name: lvm_lvp
    - devices:
        - partition: lvm_dummy_part
      # here we can create lvm, but do not mount or format it somewhere
      name: lvm_forawesomeapp
apiVersion: metal3.io/v1alpha1
kind: BareMetalHostProfile
metadata:
  labels:
    cluster.sigs.k8s.io/cluster-name: managed-cluster
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
  name: worker-storage1
  namespace: managed-ns
spec:
  devices:
    - device:
        minSize: 120Gi
        wipe: true
      partitions:
        - name: bios_grub
          partflags:
            - bios_grub
          size: 4Mi
          wipe: true
        - name: uefi
          partflags:
            - esp
          size: 200Mi
          wipe: true
        - name: config-2
          size: 64Mi
          wipe: true
        # Create dummy partition w\o mounting
        - name: lvm_dummy_part
          size: 1Gi
          wipe: true
        - name: lvm_root_part
          size: 0
          wipe: true
    - device:
        # Will be used for Ceph, so required to be wiped
        byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-1
        minSize: 30Gi
        wipe: true
    - device:
        byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-2
        minSize: 30Gi
        wipe: true
      partitions:
        - name: lvm_lvp_part
          size: 0
          wipe: true
    - device:
        byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-3
        wipe: true
    - device:
        byPath: /dev/disk/by-path/pci-0000:00:1f.2-ata-4
        minSize: 30Gi
        wipe: true
      partitions:
        - name: lvm_lvp_part_sdf
          size: 0
          wipe: true
  fileSystems:
    - fileSystem: vfat
      partition: config-2
    - fileSystem: vfat
      mountPoint: /boot/efi
      partition: uefi
    - fileSystem: ext4
      logicalVolume: root
      mountPoint: /
    - fileSystem: ext4
      logicalVolume: lvp
      mountPoint: /mnt/local-volumes/
  grubConfig:
    defaultGrubOptions:
      - GRUB_DISABLE_RECOVERY="true"
      - GRUB_PRELOAD_MODULES=lvm
      - GRUB_TIMEOUT=30
  kernelParameters:
    modules:
      - content: 'options kvm_intel nested=1'
        filename: kvm_intel.conf
    sysctl:
      # For the list of options prohibited to change, refer to
      # https://docs.mirantis.com/mke/3.6/install/predeployment/set-up-kernel-default-protections.html
      fs.aio-max-nr: '1048576'
      fs.file-max: '9223372036854775807'
      fs.inotify.max_user_instances: '4096'
      kernel.core_uses_pid: '1'
      kernel.dmesg_restrict: '1'
      net.ipv4.conf.all.rp_filter: '0'
      net.ipv4.conf.default.rp_filter: '0'
      net.ipv4.conf.k8s-ext.rp_filter: '0'
      net.ipv4.conf.m-pub.rp_filter: '0'
      vm.max_map_count: '262144'
  logicalVolumes:
    - name: root
      size: 0
      vg: lvm_root
    - name: lvp
      size: 0
      vg: lvm_lvp
  postDeployScript: |
    #!/bin/bash -ex
    # used for test-debug only! That would allow the operator to log in via TTY.
    echo "root:r00tme" | sudo chpasswd
    # Just an example for enforcing "ssd" disks to be switched to use the "deadline" i\o scheduler.
    echo 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"' > /etc/udev/rules.d/60-ssd-scheduler.rules
    echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
  preDeployScript: |
    #!/bin/bash -ex
    echo "$(date) pre_deploy_script done" >> /root/pre_deploy_done
  volumeGroups:
    - devices:
        - partition: lvm_root_part
      name: lvm_root
    - devices:
        - partition: lvm_lvp_part
        - partition: lvm_lvp_part_sdf
      name: lvm_lvp
    - devices:
        - partition: lvm_dummy_part
      name: lvm_forawesomeapp
Applies since Container Cloud 2.21.0 and 2.21.1 for
MOSK as TechPreview and since 2.24.0 as GA for
management clusters. For managed clusters, it is generally available
since Container Cloud 2.25.0.
The MetalLBConfigTemplate object is available as
Technology Preview since Container Cloud 2.24.0 and is generally
available since Container Cloud 2.25.0.
apiVersion: kaas.mirantis.com/v1alpha1
kind: KaaSCephCluster
metadata:
  name: ceph-cluster-managed-cluster
  namespace: managed-ns
spec:
  cephClusterSpec:
    nodes:
      # Add the exact ``nodes`` names.
      # Obtain the name from "get bmh -o wide" ``consumer`` field.
      cz812-managed-cluster-storage-worker-noefi-58spl:
        roles:
          - mgr
          - mon
        # All disk configuration must be reflected in ``baremetalhostprofile``
        storageDevices:
          - config:
              deviceClass: ssd
            fullPath: /dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231434939
      cz813-managed-cluster-storage-worker-noefi-lr4k4:
        roles:
          - mgr
          - mon
        storageDevices:
          - config:
              deviceClass: ssd
            fullPath: /dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231440912
      cz814-managed-cluster-storage-worker-noefi-z2m67:
        roles:
          - mgr
          - mon
        storageDevices:
          - config:
              deviceClass: ssd
            fullPath: /dev/disk/by-id/scsi-1ATA_WDC_WDS100T2B0A-00SM50_200231443409
    pools:
      - default: true
        deviceClass: ssd
        name: kubernetes
        replicated:
          size: 3
        role: kubernetes
  k8sCluster:
    name: managed-cluster
    namespace: managed-ns
Note
The storageDevices[].fullPath field is available since
Container Cloud 2.25.0. For the clusters running earlier product
versions, define the /dev/disk/by-id symlinks using
storageDevices[].name instead.
Obtain kubeconfig of the newly created managed cluster:
By default, MKE uses Keycloak as the OIDC provider. Using the
ClusterOIDCConfiguration custom resource, you can add your own OpenID
Connect (OIDC) provider for MKE on managed clusters to authenticate user
requests to Kubernetes. For OIDC provider requirements, see OIDC official
specification.
Note
For OpenStack and StackLight, Container Cloud supports only
Keycloak, which is configured on the management cluster,
as the OIDC provider.
To add a custom OIDC provider for MKE:
Configure the OIDC provider:
Log in to the OIDC provider dashboard.
Create an OIDC client. If you are going to use an existing one, skip
this step.
Add the MKE redirectURL of the managed cluster to the OIDC client.
By default, the URL format is https://<MKE IP>:6443/login.
Add the <Container Cloud web UI IP>/token to the OIDC client
for generation of kubeconfig files of the target managed cluster
through the Container Cloud web UI.
Ensure that the aud (audience) claim of the issued id_token
is equal to the created client ID.
Optional. Allow MKE to refresh authentication when id_token expires
by allowing the offline_access claim for the OIDC client.
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying the object.
Therefore, do not use kubectl apply on this object.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on this object, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the object using kubectl edit.
The ClusterOIDCConfiguration object is created in the management
cluster. Users with the m:kaas:ns@operator/writer/member roles have
access to this object.
Once done, the following dependent objects are created automatically in the
target managed cluster: the
rbac.authorization.k8s.io/v1/ClusterRoleBinding object that binds the
admin group defined in spec:adminRoleCriteria:value to the
cluster-admin rbac.authorization.k8s.io/v1/ClusterRole object.
In the Cluster object of the managed cluster, add the name of the
ClusterOIDCConfiguration object to the spec.providerSpec.value.oidc
field.
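A sketch of the corresponding Cluster object fragment, with an illustrative
object name:

spec:
  providerSpec:
    value:
      oidc: my-custom-oidc   # name of the ClusterOIDCConfiguration object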
Wait until the cluster machines switch from the Reconfigure to
Ready state for the changes to apply.
This section is intended only for advanced Infrastructure Operators
who are familiar with Kubernetes Cluster API.
Mirantis currently supports only those Mirantis
Container Cloud API features that are implemented in the
Container Cloud web UI.
Use other Container Cloud API features for testing
and evaluation purposes only.
The Container Cloud APIs are implemented using the Kubernetes
CustomResourceDefinitions (CRDs) that enable you to expand
the Kubernetes API. Different types of resources are grouped in the dedicated
files, such as cluster.yaml or machines.yaml.
For testing and evaluation purposes, you may also use the experimental public Container Cloud API that
allows for implementing custom clients for creating and operating
managed clusters. This repository contains branches that correspond to the
Container Cloud releases. For an example usage, refer to the
README
file of the repository.
This section describes the License custom resource (CR) used in Mirantis
Container Cloud API to maintain the Mirantis Container Cloud license data.
Warning
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying the object.
Therefore, do not use kubectl apply on this object.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on this object, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the object using kubectl edit.
The Container Cloud License CR contains the following fields:
apiVersion
The API version of the object that is kaas.mirantis.com/v1alpha1.
kind
The object type that is License.
metadata
The metadata object field of the License resource contains
the following fields:
name
The name of the License object, must be license.
spec
The spec object field of the License resource contains the
Secret reference where license data is stored.
license
secret
The Secret reference where the license data is stored.
key
The name of a key in the license Secret data field
under which the license data is stored.
name
The name of the Secret where the license data is stored.
value
The value of the updated license. If you need to update the license,
place it under this field. The new license data will be placed into the
Secret, and value will be cleared.
status
customerID
The unique ID of a customer generated during the license issuance.
instance
The unique ID of the current Mirantis Container Cloud instance.
dev
The license is for development.
openstack
The license limits for MOSK clusters:
clusters
The maximum number of MOSK clusters to be deployed.
If the field is absent, the number of deployments is unlimited.
workersPerCluster
The maximum number of workers per MOSK cluster to be
created. If the field is absent, the number of workers is unlimited.
expirationTime
The license expiration time in the ISO 8601 format.
expired
The license expiration state. If the value is true, the license has
expired. If the field is absent, the license is valid.
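Based on the fields above, a sketch of a License object that updates the
license may look as follows. The license value is a placeholder, and the
status section is populated by Container Cloud:

apiVersion: kaas.mirantis.com/v1alpha1
kind: License
metadata:
  name: license
spec:
  license:
    value: <new-license-data>   # cleared after the data is moved to the Secret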
This section describes the Diagnostic custom resource (CR) used in Mirantis
Container Cloud API to trigger self-diagnostics for management or managed
clusters.
The Container Cloud Diagnostic CR contains the following fields:
apiVersion
API version of the object that is diagnostic.mirantis.com/v1alpha1.
kind
Object type that is Diagnostic.
metadata
Object metadata that contains the following fields:
name
Name of the Diagnostic object.
namespace
Namespace used to create the Diagnostic object. Must be equal to the
namespace of the target cluster.
spec
Resource specification that contains the following fields:
cluster
Name of the target cluster to run diagnostics on.
checks
Reserved for internal usage; any override will be discarded.
status
finishedAt
Completion timestamp of diagnostics. If the Diagnostic Controller version
is outdated, this field is not set and the corresponding error message
is displayed in the error field.
error
Error that occurs during diagnostics or if the Diagnostic Controller
version is outdated. Omitted if empty.
controllerVersion
Version of the controller that launched diagnostics.
result
Map of check statuses where the key is the check name and the value is
the result of the corresponding diagnostic check:
description
Description of the check in plain text.
result
Result of diagnostics. Possible values are PASS, ERROR,
FAIL, WARNING, INFO.
message
Optional. Explanation of the check results. It may optionally contain
a reference to the documentation describing a known issue related to
the check results, including the existing workaround for the issue.
success
Success status of the check. Boolean.
ticketInfo
Optional. Information about the ticket to track the resolution
progress of the known issue related to the check results. For example,
FIELD-12345.
The Diagnostic resource example:
apiVersion: diagnostic.mirantis.com/v1alpha1
kind: Diagnostic
metadata:
  name: test-diagnostic
  namespace: test-namespace
spec:
  cluster: test-cluster
status:
  finishedAt: 2024-07-01T11:27:14Z
  error: ""
  controllerVersion: v1.40.11
  result:
    bm_address_capacity:
      description: Baremetal addresses capacity
      message: LCM Subnet 'default/k8s-lcm-nics' has 8 allocatable addresses (threshold
        is 5) - OK; PXE-NIC Subnet 'default/k8s-pxe-nics' has 7 allocatable addresses
        (threshold is 5) - OK; Auto-assignable address pool 'default' from MetallbConfig
        'default/kaas-mgmt-metallb' has left 21 available IP addresses (threshold
        is 10) - OK
      result: INFO
      success: true
    bm_artifacts_overrides:
      description: Baremetal overrides check
      message: BM operator has no undesired overrides
      result: PASS
      success: true
IAMUser is the Cluster (non-namespaced) object. Its objects are synced
from Keycloak, that is, they are created upon user creation in Keycloak and
deleted upon user deletion in Keycloak. The IAMUser object is exposed as
read-only to all users. It contains the following fields:
apiVersion
API version of the object that is iam.mirantis.com/v1alpha1
kind
Object type that is IAMUser
metadata
Object metadata that contains the following field:
name
Sanitized user name without special characters, with the first 8 symbols of
the user UUID appended to the end
displayName
Name of the user as defined in the Keycloak database
externalID
ID of the user as defined in the Keycloak database
The management-admin role is available since Container
Cloud 2.25.0 (Cluster releases 17.0.0, 16.0.0, 14.1.0).
description
Role description.
scope
Role scope.
Configuration example:
apiVersion: iam.mirantis.com/v1alpha1
kind: IAMRole
metadata:
  name: global-admin
description: Gives permission to manage IAM role bindings in the Container Cloud deployment.
scope: global
IAMGlobalRoleBinding is the Cluster (non-namespaced) object that
should be used for global role bindings in all namespaces. This object is
accessible to users with the global-admin IAMRole assigned through the
IAMGlobalRoleBinding object. The object contains the following fields:
apiVersion
API version of the object that is iam.mirantis.com/v1alpha1.
kind
Object type that is IAMGlobalRoleBinding.
metadata
Object metadata that contains the following field:
name
Role binding name. If the role binding is user-created, the user can set
any unique name. If a name relates to a binding that is synced by
user-controller from Keycloak, the naming convention is
<username>-<rolename>.
role
Object role that contains the following field:
name
Role name.
user
Object name that contains the following field:
name
Name of the iamuser object that the defined role is provided to.
Not equal to the user name in Keycloak.
legacy
Defines whether the role binding is legacy. Possible values are true or
false.
legacyRole
Applicable when the legacy field value is true.
Defines the legacy role name in Keycloak.
external
Defines whether the role is assigned through Keycloak and is synced by
user-controller with the Container Cloud API as the
IAMGlobalRoleBinding object. Possible values are true or false.
Caution
If you create the IAM*RoleBinding, do not set or modify
the legacy, legacyRole, and external fields unless absolutely
necessary and you understand all implications.
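A sketch of a user-created IAMGlobalRoleBinding with illustrative names; the
user name follows the sanitized-name convention described for IAMUser:

apiVersion: iam.mirantis.com/v1alpha1
kind: IAMGlobalRoleBinding
metadata:
  name: jdoe-global-admin   # any unique name for user-created bindings
role:
  name: global-admin
user:
  name: jdoe-f7a71ab3       # name of the iamuser object, not the Keycloak user name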
IAMRoleBinding is the namespaced object that represents a grant of one
role to one user in all clusters of the namespace. It is accessible to users
that have either of the following bindings assigned to them:
IAMGlobalRoleBinding that binds them with the global-admin,
operator, or user IAMRole. For user, the bindings are
read-only.
IAMRoleBinding that binds them with the operator or user
IAMRole in a particular namespace. For user, the bindings are
read-only.
apiVersion
API version of the object that is iam.mirantis.com/v1alpha1.
kind
Object type that is IAMRoleBinding.
metadata
Object metadata that contains the following fields:
namespace
Namespace that the defined binding belongs to.
name
Role binding name. If the role is user-created, the user can set any unique
name. If a name relates to a binding that is synced from Keycloak,
the naming convention is <userName>-<roleName>.
legacy
Defines whether the role binding is legacy. Possible values are true or
false.
legacyRole
Applicable when the legacy field value is true.
Defines the legacy role name in Keycloak.
external
Defines whether the role is assigned through Keycloak and is synced by
user-controller with the Container Cloud API as the
IAMGlobalRoleBinding object. Possible values are true or false.
Caution
If you create the IAM*RoleBinding, do not set or modify
the legacy, legacyRole, and external fields unless absolutely
necessary and you understand all implications.
role
Object role that contains the following field:
name
Role name.
user
Object user that contains the following field:
name
Name of the iamuser object that the defined role is granted to.
Not equal to the user name in Keycloak.
IAMClusterRoleBinding is the namespaced object that represents a grant
of one role to one user on one cluster in the namespace. This object is
accessible to users that have either of the following bindings
assigned to them:
IAMGlobalRoleBinding that binds them with the global-admin,
operator, or user IAMRole. For user, the bindings are
read-only.
IAMRoleBinding that binds them with the operator or user
IAMRole in a particular namespace. For user, the bindings are
read-only.
The IAMClusterRoleBinding object contains the following fields:
apiVersion
API version of the object that is iam.mirantis.com/v1alpha1.
kind
Object type that is IAMClusterRoleBinding.
metadata
Object metadata that contains the following fields:
namespace
Namespace of the cluster that the defined binding belongs to.
name
Role binding name. If the role is user-created, the user can set any unique
name. If a name relates to a binding that is synced from Keycloak,
the naming convention is <userName>-<roleName>-<clusterName>.
role
Object role that contains the following field:
name
Role name.
user
Object user that contains the following field:
name
Name of the iamuser object that the defined role is granted to.
Not equal to the user name in Keycloak.
cluster
Object cluster that contains the following field:
name
Name of the cluster on which the defined role is granted.
legacy
Defines whether the role binding is legacy. Possible values are true or
false.
legacyRole
Applicable when the legacy field value is true.
Defines the legacy role name in Keycloak.
external
Defines whether the role is assigned through Keycloak and is synced by
user-controller with the Container Cloud API as the
IAMGlobalRoleBinding object. Possible values are true or false.
Caution
If you create the IAM*RoleBinding, do not set or modify
the legacy, legacyRole, and external fields unless absolutely
necessary and you understand all implications.
This section contains description of the OpenID Connect (OIDC) custom resource
for Mirantis Container Cloud that you can use to customize OIDC for Mirantis
Kubernetes Engine (MKE) on managed clusters. Using this resource, add your own
OIDC provider to authenticate user requests to Kubernetes. For OIDC provider
requirements, see OIDC official specification.
The Container Cloud ClusterOIDCConfiguration custom resource contains
the following fields:
apiVersion
The API version of the object that is kaas.mirantis.com/v1alpha1.
kind
The object type that is ClusterOIDCConfiguration.
metadata
The metadata object field of the ClusterOIDCConfiguration resource
contains the following fields:
name
The object name.
namespace
The project name (Kubernetes namespace) of the related managed cluster.
spec
The spec object field of the ClusterOIDCConfiguration resource
contains the following fields:
adminRoleCriteria
Definition of the id_token claim with the admin role and the role
value.
matchType
Matching type of the claim with the requested role. Possible values
that MKE uses to match the claim with the requested value:
must
Requires a plain string in the id_token claim, for example,
"iam_role":"mke-admin".
contains
Requires an array of strings in the id_token claim,
for example, "iam_role":["mke-admin","pod-reader"].
name
Name of the admin id_token claim containing a role or array of
roles.
value
Role value that matches the "iam_role" value in the admin
id_token claim.
caBundle
Base64-encoded certificate authority bundle of the OIDC provider
endpoint.
clientID
ID of the OIDC client to be used by Kubernetes.
clientSecret
Secret value of the clientID parameter. After the
ClusterOIDCConfiguration object creation, this field is updated
automatically with a reference to the corresponding Secret.
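A sketch of a ClusterOIDCConfiguration object based on the fields above. All
values are illustrative, and the clientSecret value is replaced automatically
with a Secret reference after creation:

apiVersion: kaas.mirantis.com/v1alpha1
kind: ClusterOIDCConfiguration
metadata:
  name: my-custom-oidc
  namespace: managed-ns
spec:
  adminRoleCriteria:
    matchType: contains
    name: iam_role
    value: mke-admin
  caBundle: <base64-encoded-CA-bundle>
  clientID: my-oidc-client
  clientSecret: <client-secret-value>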
This section describes the UpdateGroup custom resource (CR) used in the
Container Cloud API to configure update concurrency for specific sets of
machines or machine pools within a cluster. This resource enhances the update
process by allowing a more granular control over the concurrency of machine
updates. This resource also provides a way to control the reboot behavior of
machines during a Cluster release update.
The Container Cloud UpdateGroup CR contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1.
kind
Object type that is UpdateGroup.
metadata
Metadata of the UpdateGroup CR that contains the following fields. All
of them are required.
name
Name of the UpdateGroup object.
namespace
Project where the UpdateGroup is created.
labels
Label to associate the UpdateGroup with a specific cluster in the
cluster.sigs.k8s.io/cluster-name:<cluster-name> format.
spec
Specification of the UpdateGroup CR that contains the following fields:
index
Index to determine the processing order of the UpdateGroup object.
Groups with the same index are processed concurrently.
Number of machines to update concurrently within UpdateGroup.
rebootIfUpdateRequires. Since 2.28.0 (17.3.0 and 16.3.0)
Technology Preview. Automatic reboot of controller or worker machines
of an update group if a Cluster release update involves node reboot,
for example, when kernel version update is available in new Cluster
release. You can set this parameter for management or managed clusters.
Boolean. By default, true on management clusters and false on
managed clusters. On managed clusters:
If set to true, related machines are rebooted as part of a Cluster
release update that requires a reboot.
If set to false, machines are not rebooted even if a Cluster
release update requires a reboot.
Caution
During a distribution upgrade, machines are always rebooted,
overriding rebootIfUpdateRequires:false.
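A sketch of an UpdateGroup object based on the fields above. The concurrency
field is shown under an assumed name, concurrentUpdates, and all values are
illustrative:

apiVersion: kaas.mirantis.com/v1alpha1
kind: UpdateGroup
metadata:
  name: update-group-workers
  namespace: managed-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: managed-cluster
spec:
  index: 10
  concurrentUpdates: 2   # assumed field name for the number of machines updated concurrently
  rebootIfUpdateRequires: false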
This section describes the MCCUpgrade resource used in Mirantis
Container Cloud API to configure a schedule for the Container Cloud update.
The Container Cloud MCCUpgrade CR contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1.
kind
Object type that is MCCUpgrade.
metadata
The metadata object field of the MCCUpgrade resource contains
the following fields:
name
The name of MCCUpgrade object, must be mcc-upgrade.
spec
The spec object field of the MCCUpgrade resource contains the
schedule when Container Cloud update is allowed or blocked. This field
contains the following fields:
blockUntil
Deprecated since Container Cloud 2.28.0 (Cluster release 16.3.0). Use
autoDelay instead.
Time stamp in the ISO 8601 format, for example,
2021-12-31T12:30:00-05:00. Updates will be disabled until this time.
You cannot set this field to more than 7 days in the future and more
than 30 days after the latest Container Cloud release.
autoDelay
Available since Container Cloud 2.28.0 (Cluster release 16.3.0).
Flag that enables delay of the management cluster auto-update to a new
Container Cloud release and ensures that auto-update is not started
immediately on the release date. Boolean, false by default.
The delay period is minimum 20 days for each newly discovered release
and depends on specifics of each release cycle and on optional
configuration of week days and hours selected for update. You can verify
the exact date of a scheduled auto-update in the status section of
the MCCUpgrade object.
Note
Modifying the delay period is not supported.
timeZone
Name of a time zone in the IANA Time Zone Database. This time zone will
be used for all schedule calculations. For example: Europe/Samara,
CET, America/Los_Angeles.
schedule
List of schedule items that allow an update at specific hours or
weekdays. The update process can proceed if at least one of these items
allows it. Schedule items allow update when both hours and
weekdays conditions are met. When this list is empty or absent,
update is allowed at any hour of any day. Every schedule item contains
the following fields:
hours
Object with 2 fields: from and to. Both must be non-negative
integers not greater than 24. The to field must be greater than
the from one. Update is allowed if the current hour in the
time zone specified by timeZone is greater than or equal to from
and less than to. If hours is absent, update is allowed
at any hour.
weekdays
Object with boolean fields with these names:
monday
tuesday
wednesday
thursday
friday
saturday
sunday
Update is allowed only on weekdays that have the corresponding field
set to true. If all fields are false or absent, or
weekdays is empty or absent, update is allowed on all weekdays.
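A spec sketch illustrating these fields, assuming the CET time zone:
spec:
  timeZone: CET
  schedule:
  - hours:
      from: 7
      to: 17
    weekdays:
      monday: true
  - hours:
      from: 10
      to: 17
    weekdays:
      tuesday: true
  - hours:
      from: 7
      to: 10
    weekdays:
      friday: true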
In this example, all schedule calculations are done in the CET timezone and
upgrades are allowed only:
From 7:00 to 17:00 on Mondays
From 10:00 to 17:00 on Tuesdays
From 7:00 to 10:00 on Fridays
status
The status object field of the MCCUpgrade resource contains
information about the next planned Container Cloud update, if available.
This field contains the following fields:
nextAttempt
Deprecated since 2.28.0 (Cluster release 16.3.0).
Time stamp in the ISO 8601 format indicating the time when the Release
Controller will attempt to discover and install a new Container Cloud
release. Set to the next allowed time according to the schedule
configured in spec or one minute in the future if the schedule
currently allows update.
message
Deprecated since 2.28.0 (Cluster release 16.3.0).
Message from the last update step or attempt.
nextRelease
Object describing the next release that Container Cloud will be updated
to. Absent if no new releases have been discovered. Contains the
following fields:
version
Semver-compatible version of the next Container Cloud release, for
example, 2.22.0.
date
Time stamp in the ISO 8601 format of the Container Cloud release
defined in version:
Since 2.28.0 (Cluster release 16.3.0), the field indicates the
publish time stamp of a new release.
Before 2.28.0 (Cluster release 16.2.x or earlier), the field
indicates the discovery time stamp of a new release.
scheduled
Available since Container Cloud 2.28.0 (Cluster release 16.3.0).
Time window that the pending Container Cloud release update is
scheduled for:
startTime
Time stamp in the ISO 8601 format indicating the start time of
the update for the pending Container Cloud release.
endTime
Time stamp in the ISO 8601 format indicating the end time of
the update for the pending Container Cloud release.
lastUpgrade
Time stamps of the latest Container Cloud update:
startedAt
Time stamp in the ISO 8601 format indicating the time when the last
Container Cloud update started.
finishedAt
Time stamp in the ISO 8601 format indicating the time when the last
Container Cloud update finished.
conditions
Available since Container Cloud 2.28.0 (Cluster release 16.3.0). List of
status conditions describing the status of the MCCUpgrade resource.
Each condition has the following format:
type
Condition type representing a particular aspect of the MCCUpgrade
object. Currently, the only supported condition type is Ready that
defines readiness to process a new release.
If the status field of the Ready condition type is False,
the Release Controller blocks the start of update operations.
status
Condition status. Possible values: True, False,
Unknown.
reason
Machine-readable explanation of the condition.
lastTransitionTime
Time of the latest condition transition.
message
Human-readable description of the condition.
Example of MCCUpgrade status:
status:
  conditions:
  - lastTransitionTime: "2024-09-16T13:22:27Z"
    message: New release scheduled for upgrade
    reason: ReleaseScheduled
    status: "True"
    type: Ready
  lastUpgrade: {}
  message: ''
  nextAttempt: "2024-09-16T13:23:27Z"
  nextRelease:
    date: "2024-08-25T21:05:46Z"
    scheduled:
      endTime: "2024-09-17T00:00:00Z"
      startTime: "2024-09-16T00:00:00Z"
    version: 2.28.0
Available since 2.27.0 (17.2.0 and 16.2.0). Technology Preview.
This section describes the ClusterUpdatePlan custom resource (CR) used in
the Container Cloud API to granularly control the update process of a managed
cluster by stopping the update after each step.
The ClusterUpdatePlan CR contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1.
kind
Object type that is ClusterUpdatePlan.
metadata
Metadata of the ClusterUpdatePlan CR that contains the following fields:
name
Name of the ClusterUpdatePlan object.
namespace
Project name of the cluster that relates to ClusterUpdatePlan.
spec
Specification of the ClusterUpdatePlan CR that contains the following
fields:
source
Source name of the Cluster release from which the cluster is updated.
target
Target name of the Cluster release to which the cluster is updated.
cluster
Name of the cluster for which ClusterUpdatePlan is created.
releaseNotes
Available since Container Cloud 2.29.0 (Cluster releases 17.4.0 and
16.4.0). Link to MOSK release notes of the target
release.
steps
List of update steps, where each step contains the following fields:
id
Available since Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0). Step ID.
name
Step name.
description
Step description.
constraints
Description of constraints applied during the step execution.
impact
Impact of the step on the cluster functionality and workloads.
Contains the following fields:
users
Impact on the Container Cloud user operations. Possible values:
none, major, or minor.
workloads
Impact on workloads. Possible values: none, major, or
minor.
info
Additional details on impact, if any.
duration
Details about duration of the step execution. Contains the following
fields:
estimated
Estimated time to complete the update step.
Note
Before Container Cloud 2.29.0 (Cluster releases 17.4.0
and 16.4.0), this field was named eta.
info
Additional details on update duration, if any.
granularity
Information on the current step granularity. Indicates whether the
current step is applied to each machine individually or to the entire
cluster at once. Possible values are cluster or machine.
commence
Flag that allows controlling the step execution. Boolean, false
by default. If set to true, the step starts execution after all
previous steps are completed.
Caution
Cancelling an already started update step is unsupported.
status
Status of the ClusterUpdatePlan CR that contains the following fields:
startedAt
Time when ClusterUpdatePlan has started.
completedAt
Available since Container Cloud 2.29.0 (Cluster releases 17.4.0 and
16.4.0). Time of update completion.
status
Overall object status.
steps
List of step statuses in the same order as defined in spec. Each step
status contains the following fields:
id
Available since Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0). Step ID.
name
Step name.
status
Step status. Possible values are:
NotStarted
Step has not started yet.
Scheduled
Available since Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0). Step is already triggered but its execution has not
started yet.
InProgress
Step is currently in progress.
AutoPaused
Available since Container Cloud 2.29.0 (Cluster release 17.4.0) as
Technology Preview. Update is automatically paused by the trigger
from a firing alert defined in the UpdateAutoPause
configuration. For details, see UpdateAutoPause resource.
Stuck
Step execution has encountered an issue, which also indicates that the
step does not fit into the estimate defined in the duration
field for this step in spec.
Completed
Step has been completed.
message
Message describing status details of the current update step.
duration
Current duration of the step execution.
startedAt
Start time of the step execution.
Example of a ClusterUpdatePlan object:
apiVersion: kaas.mirantis.com/v1alpha1
kind: ClusterUpdatePlan
metadata:
  creationTimestamp: "2025-02-06T16:53:51Z"
  generation: 11
  name: mosk-17.4.0
  namespace: child
  resourceVersion: "6072567"
  uid: 82c072be-1dc5-43dd-b8cf-bc643206d563
spec:
  cluster: mosk
  releaseNotes: https://docs.mirantis.com/mosk/latest/25.1-series.html
  source: mosk-17-3-0-24-3
  steps:
  - commence: true
    description:
    - install new version of OpenStack and Tungsten Fabric life cycle management modules
    - OpenStack and Tungsten Fabric container images pre-cached
    - OpenStack and Tungsten Fabric control plane components restarted in parallel
    duration:
      estimated: 1h30m0s
      info:
      - 15 minutes to cache the images and update the life cycle management modules
      - 1h to restart the components
    granularity: cluster
    id: openstack
    impact:
      info:
      - some of the running cloud operations may fail due to restart of API services and schedulers
      - DNS might be affected
      users: minor
      workloads: minor
    name: Update OpenStack and Tungsten Fabric
  - commence: true
    description:
    - Ceph version update
    - restart Ceph monitor, manager, object gateway (radosgw), and metadata services
    - restart OSD services node-by-node, or rack-by-rack depending on the cluster configuration
    duration:
      estimated: 8m30s
      info:
      - 15 minutes for the Ceph version update
      - around 40 minutes to update Ceph cluster of 30 nodes
    granularity: cluster
    id: ceph
    impact:
      info:
      - 'minor unavailability of object storage APIs: S3/Swift'
      - workloads may experience IO performance degradation for the virtual storage devices backed by Ceph
      users: minor
      workloads: minor
    name: Update Ceph
  - commence: true
    description:
    - new host OS kernel and packages get installed
    - host OS configuration re-applied
    - container runtime version gets bumped
    - new versions of Kubernetes components installed
    duration:
      estimated: 1h40m0s
      info:
      - about 20 minutes to update host OS per a Kubernetes controller, nodes updated one-by-one
      - Kubernetes components update takes about 40 minutes, all nodes in parallel
    granularity: cluster
    id: k8s-controllers
    impact:
      users: none
      workloads: none
    name: Update host OS and Kubernetes components on master nodes
  - commence: true
    description:
    - new host OS kernel and packages get installed
    - host OS configuration re-applied
    - container runtime version gets bumped
    - new versions of Kubernetes components installed
    - data plane components (Open vSwitch and Neutron L3 agents, TF agents and vrouter) restarted on gateway and compute nodes
    - storage nodes put to “no-out” mode to prevent rebalancing
    - by default, nodes are updated one-by-one, a node group can be configured to update several nodes in parallel
    duration:
      estimated: 8h0m0s
      info:
      - host OS update - up to 15 minutes per node (not including host OS configuration modules)
      - Kubernetes components update - up to 15 minutes per node
      - OpenStack controllers and gateways updated one-by-one
      - nodes hosting Ceph OSD, monitor, manager, metadata, object gateway (radosgw) services updated one-by-one
    granularity: machine
    id: k8s-workers-vdrok-child-default
    impact:
      info:
      - 'OpenStack controller nodes: some running OpenStack operations might not complete due to restart of components'
      - 'OpenStack compute nodes: minor loss of the East-West connectivity with the Open vSwitch networking backend that causes approximately 5 min of downtime'
      - 'OpenStack gateway nodes: minor loss of the North-South connectivity with the Open vSwitch networking backend: a non-distributed HA virtual router needs up to 1 minute to failover; a non-distributed and non-HA virtual router failover time depends on many factors and may take up to 10 minutes'
      users: major
      workloads: major
    name: Update host OS and Kubernetes components on worker nodes, group vdrok-child-default
  - commence: true
    description:
    - restart of StackLight, MetalLB services
    - restart of auxiliary controllers and charts
    duration:
      estimated: 1h30m0s
    granularity: cluster
    id: mcc-components
    impact:
      info:
      - minor cloud API downtime due restart of MetalLB components
      users: minor
      workloads: none
    name: Auxiliary components update
  target: mosk-17-4-0-25-1
status:
  completedAt: "2025-02-07T19:24:51Z"
  startedAt: "2025-02-07T17:07:02Z"
  status: Completed
  steps:
  - duration: 26m36.355605528s
    id: openstack
    message: Ready
    name: Update OpenStack and Tungsten Fabric
    startedAt: "2025-02-07T17:07:02Z"
    status: Completed
  - duration: 6m1.124356485s
    id: ceph
    message: Ready
    name: Update Ceph
    startedAt: "2025-02-07T17:33:38Z"
    status: Completed
  - duration: 24m3.151554465s
    id: k8s-controllers
    message: Ready
    name: Update host OS and Kubernetes components on master nodes
    startedAt: "2025-02-07T17:39:39Z"
    status: Completed
  - duration: 1h19m9.359184228s
    id: k8s-workers-vdrok-child-default
    message: Ready
    name: Update host OS and Kubernetes components on worker nodes, group vdrok-child-default
    startedAt: "2025-02-07T18:03:42Z"
    status: Completed
  - duration: 2m0.772243006s
    id: mcc-components
    message: Ready
    name: Auxiliary components update
    startedAt: "2025-02-07T19:22:51Z"
    status: Completed
This section describes the UpdateAutoPause custom resource (CR) used in the
Container Cloud API to configure automatic pausing of cluster release updates
in a managed cluster using StackLight alerts.
The Container Cloud UpdateAutoPause CR contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1.
kind
Object type that is UpdateAutoPause.
metadata
Metadata of the UpdateAutoPause CR that contains the following fields:
name
Name of the UpdateAutoPause object. Must match the cluster name.
namespace
Project where the UpdateAutoPause is created. Must match the cluster
namespace.
spec
Specification of the UpdateAutoPause CR that contains the following
field:
alerts
List of alert names. The occurrence of any alert from this list triggers
auto-pause of the cluster release update.
status
Status of the UpdateAutoPause CR that contains the following fields:
firingAlerts
List of currently firing alerts from the specified set.
error
Error message, if any, encountered during object processing.
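A minimal UpdateAutoPause sketch based on the fields described above; the alert names are illustrative:
apiVersion: kaas.mirantis.com/v1alpha1
kind: UpdateAutoPause
metadata:
  name: managed-cluster              # must match the cluster name
  namespace: managed-ns              # must match the cluster namespace
spec:
  alerts:                            # illustrative StackLight alert names
  - KubeletDown
  - CalicoDataplaneFailuresHigh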
Technology Preview. Available since 2.24.0 and 23.2 for MOSK clusters.
This section describes the CacheWarmupRequest custom resource (CR) used in
the Container Cloud API to predownload images and store them in the
mcc-cache service.
The Container Cloud CacheWarmupRequest CR contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1.
kind
Object type that is CacheWarmupRequest.
metadata
The metadata object field of the CacheWarmupRequest
resource contains the following fields:
name
Name of the CacheWarmupRequest object that must match the existing
management cluster name to which the warm-up operation applies.
namespace
Container Cloud project in which the cluster is created.
Always set to default as the only available project for management
clusters creation.
spec
The spec object field of the CacheWarmupRequest resource
contains the settings for artifacts fetching and artifacts filtering
through Cluster releases. This field contains the following fields:
clusterReleases
Array of strings. Defines a set of Cluster release names to
warm up in the mcc-cache service.
openstackReleases
Optional. Array of strings. Defines a set of OpenStack
releases to warm up in mcc-cache. Applicable only
if the clusterReleases field contains mosk releases.
If you plan to upgrade an OpenStack version, define the current and the
target versions including the intermediate versions, if any.
For example, to upgrade OpenStack from Victoria to Yoga:
openstackReleases:
- victoria
- wallaby
- xena
- yoga
fetchRequestTimeout
Optional. String. Time for a single request to download
a single artifact. Defaults to 30m. For example, 1h2m3s.
clientsPerEndpoint
Optional. Integer. Number of clients to use for fetching artifacts
per each mcc-cache service endpoint. Defaults to 2.
openstackOnly
Optional. Boolean. Enables fetching of the OpenStack-related artifacts
for MOSK. Defaults to false. Applicable only if the
clusterReleases field contains mosk releases. Useful when you
need to upgrade only an OpenStack version.
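A CacheWarmupRequest sketch based on the fields described above; the Cluster release and OpenStack release names are illustrative:
apiVersion: kaas.mirantis.com/v1alpha1
kind: CacheWarmupRequest
metadata:
  name: example-mgmt-cluster         # must match the management cluster name
  namespace: default
spec:
  clusterReleases:
  - mosk-17-3-0-24-3                 # illustrative Cluster release name
  openstackReleases:
  - yoga
  fetchRequestTimeout: 30m
  clientsPerEndpoint: 2
  openstackOnly: false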
This section describes the GracefulRebootRequest custom resource (CR)
used in the Container Cloud API for a rolling reboot of several or all cluster
machines without interrupting workloads. The resource is also useful for a
bulk reboot of machines, for example, on large clusters.
The Container Cloud GracefulRebootRequest CR contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1.
kind
Object type that is GracefulRebootRequest.
metadata
Metadata of the GracefulRebootRequest CR that contains the following
fields:
name
Name of the GracefulRebootRequest object. The object name must match
the name of the cluster on which you want to reboot machines.
namespace
Project where the GracefulRebootRequest is created.
spec
Specification of the GracefulRebootRequest CR that contains the
following fields:
machines
List of machines for a rolling reboot. Each machine of the list is
cordoned, drained, rebooted, and uncordoned in the order of cluster
upgrade policy. For details about the upgrade order,
see Change the upgrade order of a machine or machine pool.
Leave this field empty to reboot all cluster machines.
Caution
The cluster and machines must have the Ready status to
perform a graceful reboot.
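A GracefulRebootRequest sketch based on the fields described above; the machine names are illustrative, and an empty machines list reboots all cluster machines:
apiVersion: kaas.mirantis.com/v1alpha1
kind: GracefulRebootRequest
metadata:
  name: managed-cluster              # must match the cluster name
  namespace: managed-ns
spec:
  machines:
  - worker-1
  - worker-2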
This section describes the ContainerRegistry custom resource (CR) used in
Mirantis Container Cloud API to configure CA certificates on machines to access
private Docker registries.
The Container Cloud ContainerRegistry CR contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1
kind
Object type that is ContainerRegistry
metadata
The metadata object field of the ContainerRegistry CR contains
the following fields:
name
Name of the container registry
namespace
Project where the container registry is created
spec
The spec object field of the ContainerRegistry CR contains the
following fields:
domain
Host name and optional port of the registry
CACert
CA certificate of the registry in the base64-encoded format
Caution
Only one ContainerRegistry resource can exist per domain.
To configure multiple CA certificates for the same domain, combine them into
one certificate.
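A ContainerRegistry sketch based on the fields described above; the domain and certificate values are illustrative:
apiVersion: kaas.mirantis.com/v1alpha1
kind: ContainerRegistry
metadata:
  name: example-registry
  namespace: managed-ns
spec:
  domain: registry.example.com:5000
  CACert: <base64-encoded CA certificate of the registry>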
This section describes the TLSConfig resource used in Mirantis
Container Cloud API to configure TLS certificates for cluster applications.
Warning
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying the object.
Therefore, do not use kubectl apply on this object.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on this object, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the object using kubectl edit.
The Container Cloud TLSConfig CR contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1.
kind
Object type that is TLSConfig.
metadata
The metadata object field of the TLSConfig resource contains
the following fields:
name
Name of the public key.
namespace
Project where the TLS certificate is created.
spec
The spec object field contains the configuration to apply for an
application. It contains the following fields:
serverName
Host name of a server.
serverCertificate
Certificate to authenticate server’s identity to a client.
A valid certificate bundle can be passed.
The server certificate must be on the top of the chain.
privateKey
Reference to the Secret object that contains a private key.
A private key is a key for the server. It must correspond to the
public key used in the server certificate.
key
Key name in the secret.
name
Secret name.
caCertificate
Certificate that issued the server certificate. The top-most
intermediate certificate should be used if a CA certificate is
unavailable.
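A TLSConfig sketch based on the fields described above; the names and certificate contents are illustrative. As stated in the warning above, create the object with kubectl create rather than kubectl apply:
apiVersion: kaas.mirantis.com/v1alpha1
kind: TLSConfig
metadata:
  name: example-tls
  namespace: managed-ns
spec:
  serverName: ui.example.com
  serverCertificate: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
  privateKey:
    name: example-tls-key-secret     # Secret that stores the private key
    key: tls.key                     # key name in the Secret
  caCertificate: |
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----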
Private API since Container Cloud 2.29.0 (Cluster release 16.4.0)
Warning
Since Container Cloud 2.29.0 (Cluster release 16.4.0), use the
BareMetalHostInventory resource instead of BareMetalHost for
adding and modifying configuration of a bare metal server. Any change in the
BareMetalHost object will be overwritten by BareMetalHostInventory.
For any existing BareMetalHost object, a BareMetalHostInventory
object is created automatically during management cluster update to the
Cluster release 16.4.0.
This section describes the BareMetalHost resource used in the
Mirantis Container Cloud API. A BareMetalHost object
is created for each Machine and contains all information about
the machine hardware configuration. BareMetalHost objects are used to monitor
and manage the state of a bare metal server. This includes inspecting the host
hardware and firmware, operating system provisioning, power control, and server
deprovisioning. When a machine is created, the bare metal provider assigns a
BareMetalHost to that machine using labels and the BareMetalHostProfile
configuration.
For demonstration purposes, the Container Cloud BareMetalHost
custom resource (CR) can be split into the following major sections:
The Container Cloud BareMetalHost CR contains the following fields:
apiVersion
API version of the object that is metal3.io/v1alpha1.
kind
Object type that is BareMetalHost.
metadata
The metadata field contains the following subfields:
name
Name of the BareMetalHost object.
namespace
Project in which the BareMetalHost object was created.
annotations
Available since Cluster releases 12.5.0, 11.5.0, and 7.11.0.
Key-value pairs to attach additional metadata to the object:
kaas.mirantis.com/baremetalhost-credentials-name
Key that connects the BareMetalHost object with a previously
created BareMetalHostCredential object. The value of this key
must match the BareMetalHostCredential object name.
host.dnsmasqs.metal3.io/address
Available since Cluster releases 17.0.0 and 16.0.0.
Key that assigns a particular IP address to a bare metal host during
PXE provisioning.
baremetalhost.metal3.io/detached
Available since Cluster releases 17.0.0 and 16.0.0.
Key that pauses host management by the bare metal Operator for a
manual IP address assignment.
Note
If the host provisioning has already started or completed, adding
this annotation deletes the information about the host from Ironic without
triggering deprovisioning. The bare metal Operator recreates the host
in Ironic once you remove the annotation. For details, see
Metal3 documentation.
inspect.metal3.io/hardwaredetails-storage-sort-term
Available since Cluster releases 17.0.0 and 16.0.0. Optional.
Key that defines sorting of the bmh:status:storage[] list during
inspection of a bare metal host. Accepts multiple tags separated by
a comma or semicolon with the ASC/DESC suffix for sorting
direction. Example terms: sizeBytesDESC, hctlASC,
typeASC, nameDESC.
Since Cluster releases 17.1.0 and 16.1.0, the following default
value applies: hctlASC,wwnASC,by_idASC,nameASC.
labels
Labels used by the bare metal provider to find a matching
BareMetalHost object to deploy a machine:
hostlabel.bm.kaas.mirantis.com/controlplane
hostlabel.bm.kaas.mirantis.com/worker
hostlabel.bm.kaas.mirantis.com/storage
Each BareMetalHost object added using the Container Cloud web UI
will be assigned one of these labels. If the BareMetalHost and
Machine objects are created using API, any label may be used
to match these objects for a bare metal host to deploy a machine.
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
Configuration example:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: master-0
  namespace: default
  labels:
    kaas.mirantis.com/baremetalhost-id: hw-master-0
    kaas.mirantis.com/baremetalhost-id: <bareMetalHostHardwareNodeUniqueId>
  annotations: # Since 2.21.0 (7.11.0, 12.5.0, 11.5.0)
    kaas.mirantis.com/baremetalhost-credentials-name: hw-master-0-credentials
The spec section for the BareMetalHost object defines the desired state
of BareMetalHost. It contains the following fields:
bmc
Details for communication with the Baseboard Management Controller (bmc)
module on a host. Contains the following subfields:
address
URL for communicating with the BMC. URLs vary depending on the
communication protocol and the BMC type, for example:
IPMI
Default BMC type in the ipmi://<host>:<port> format. You can also
use a plain <host>:<port> format. A port is optional if using the
default port 623.
You can change the IPMI privilege level from the default
ADMINISTRATOR to OPERATOR with an optional URL parameter
privilegelevel: ipmi://<host>:<port>?privilegelevel=OPERATOR.
Redfish
BMC type in the redfish:// format. To disable TLS, you
can use the redfish+http:// format. A host name or IP address and
a path to the system ID are required for both formats. For example,
redfish://myhost.example/redfish/v1/Systems/System.Embedded.1
or redfish://myhost.example/redfish/v1/Systems/1.
credentialsName
Name of the secret containing the BareMetalHost object credentials.
Since Container Cloud 2.21.0 and 2.21.1 for MOSK 22.5,
this field is updated automatically during cluster deployment. For
details, see BareMetalHostCredential.
Before Container Cloud 2.21.0 or MOSK 22.5,
the secret requires the username and password keys in the
Base64 encoding.
disableCertificateVerification
Boolean to skip certificate validation when true.
bootMACAddress
MAC address for booting.
bootMode
Boot mode: UEFI if UEFI is enabled and legacy if disabled.
online
Defines whether the server must be online after provisioning is done.
Warning
Setting online:false to more than one bare metal host
in a management cluster at a time can make the cluster non-operational.
Configuration example for Container Cloud 2.21.0 or later:
metadata:
  name: node-1-name
  annotations:
    kaas.mirantis.com/baremetalhost-credentials-name: node-1-credentials # Since Container Cloud 2.21.0
spec:
  bmc:
    address: 192.168.33.106:623
    credentialsName: ''
  bootMACAddress: 0c:c4:7a:a8:d3:44
  bootMode: legacy
  online: true
Configuration example for Container Cloud 2.20.1 or earlier:
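A sketch of the earlier layout, where credentialsName references a manually created Secret that contains base64-encoded username and password keys; the Secret name is illustrative:
metadata:
  name: node-1-name
spec:
  bmc:
    address: 192.168.33.106:623
    credentialsName: node-1-bmc-secret   # Secret with base64-encoded username and password keys
  bootMACAddress: 0c:c4:7a:a8:d3:44
  bootMode: legacy
  online: true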
The status field of the BareMetalHost object defines the current
state of BareMetalHost. It contains the following fields:
errorMessage
Last error message reported by the provisioning subsystem.
goodCredentials
Last credentials that were validated.
hardware
Hardware discovered on the host. Contains information about the storage,
CPU, host name, firmware, and so on.
operationalStatus
Status of the host:
OK
Host is configured correctly and is manageable.
discovered
Host is only partially configured. For example, the bmc address
is discovered but not the login credentials.
error
Host has any sort of error.
poweredOn
Host availability status: powered on (true) or powered off (false).
provisioning
State information tracked by the provisioner:
state
Current action being done with the host by the provisioner.
id
UUID of a machine.
triedCredentials
Details of the last credentials sent to the provisioning backend.
Configuration example:
status:
  errorMessage: ""
  goodCredentials:
    credentials:
      name: master-0-bmc-secret
      namespace: default
    credentialsVersion: "13404"
  hardware:
    cpu:
      arch: x86_64
      clockMegahertz: 3000
      count: 32
      flags:
      - 3dnowprefetch
      - abm
      ...
      model: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
    firmware:
      bios:
        date: ""
        vendor: ""
        version: ""
    hostname: ipa-fcab7472-892f-473c-85a4-35d64e96c78f
    nics:
    - ip: ""
      mac: 0c:c4:7a:a8:d3:45
      model: 0x8086 0x1521
      name: enp8s0f1
      pxe: false
      speedGbps: 0
      vlanId: 0
    ...
    ramMebibytes: 262144
    storage:
    - by_path: /dev/disk/by-path/pci-0000:00:1f.2-ata-1
      hctl: "4:0:0:0"
      model: Micron_5200_MTFD
      name: /dev/sda
      rotational: false
      serialNumber: 18381E8DC148
      sizeBytes: 1920383410176
      vendor: ATA
      wwn: "0x500a07511e8dc148"
      wwnWithExtension: "0x500a07511e8dc148"
    ...
    systemVendor:
      manufacturer: Supermicro
      productName: SYS-6018R-TDW (To be filled by O.E.M.)
      serialNumber: E16865116300188
  operationalStatus: OK
  poweredOn: true
  provisioning:
    state: provisioned
  triedCredentials:
    credentials:
      name: master-0-bmc-secret
      namespace: default
    credentialsVersion: "13404"
This section describes the BareMetalHostCredential custom resource (CR)
used in the Mirantis Container Cloud API. The BareMetalHostCredential
object is created for each BareMetalHostInventory and contains all
information about the Baseboard Management Controller (bmc) credentials.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
Warning
The kubectl apply command automatically saves the
applied data as plain text into the
kubectl.kubernetes.io/last-applied-configuration annotation of the
corresponding object. This may result in revealing sensitive data in this
annotation when creating or modifying the object.
Therefore, do not use kubectl apply on this object.
Use kubectl create, kubectl patch, or
kubectl edit instead.
If you used kubectl apply on this object, you
can remove the kubectl.kubernetes.io/last-applied-configuration
annotation from the object using kubectl edit.
For demonstration purposes, the BareMetalHostCredential CR can be split
into the following sections:
The BareMetalHostCredential metadata contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1
kind
Object type that is BareMetalHostCredential
metadata
The metadata field contains the following subfields:
name
Name of the BareMetalHostCredential object
namespace
Container Cloud project in which the related BareMetalHostInventory
object was created
labels
Labels used by the bare metal provider:
kaas.mirantis.com/region
Region name
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
The spec section for the BareMetalHostCredential object contains
sensitive information that is moved to a separate
Secret object during cluster deployment:
username
User name of the bmc account with administrator privileges to control
the power state and boot source of the bare metal host
password
Details on the user password of the bmc account with administrator
privileges:
value
Password that will be automatically removed once saved in a separate
Secret object
name
Name of the Secret object where credentials are saved
The BareMetalHostCredential object creation triggers the following
automatic actions:
Create an underlying Secret object containing data about username
and password of the bmc account of the related
BareMetalHostCredential object.
Erase sensitive password data of the bmc account from the
BareMetalHostCredential object.
Add the created Secret object name to the spec.password.name
section of the related BareMetalHostCredential object.
Update BareMetalHostInventory.spec.bmc.bmhCredentialsName with the
BareMetalHostCredential object name.
Note
Before Container Cloud 2.29.0 (17.4.0 and 16.4.0),
BareMetalHost.spec.bmc.credentialsName was updated with the
BareMetalHostCredential object name.
Note
When you delete a BareMetalHostInventory object, the related
BareMetalHostCredential object is deleted automatically.
Note
On existing clusters, a BareMetalHostCredential object is
automatically created for each BareMetalHostInventory object during a
cluster update.
Example of BareMetalHostCredential before the cluster deployment starts:
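A sketch of such an object, based on the spec fields described above; the object name and credential values are illustrative:
apiVersion: kaas.mirantis.com/v1alpha1
kind: BareMetalHostCredential
metadata:
  name: hw-master-0-credentials
  namespace: default
spec:
  username: admin
  password:
    value: <plain-text password, removed automatically once saved in a Secret object>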
Available since Container Cloud 2.29.0 (Cluster release 16.4.0)
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
This section describes the BareMetalHostInventory resource used in the
Mirantis Container Cloud API to monitor and manage the state of a bare metal
server. This includes inspecting the host hardware, firmware, operating system
provisioning, power control, and server deprovision.
The BareMetalHostInventory object is created for each Machine and
contains all information about machine hardware configuration.
Each BareMetalHostInventory object is synchronized with an automatically
created BareMetalHost object, which is used for internal purposes of
the Container Cloud private API.
Use the BareMetalHostInventory object instead of BareMetalHost for
adding and modifying configuration of a bare metal server.
Caution
Any change in the BareMetalHost object will be overwritten by
BareMetalHostInventory.
For any existing BareMetalHost object, a BareMetalHostInventory
object is created automatically during management cluster update to
Container Cloud 2.29.0 (Cluster release 16.4.0).
For demonstration purposes, the Container Cloud BareMetalHostInventory
custom resource (CR) can be split into the following major sections:
baremetalhost.metal3.io/detached
Key that pauses host management by the bare metal Operator for a
manual IP address assignment.
Note
If the host provisioning has already started or completed, adding
this annotation deletes the information about the host from Ironic without
triggering deprovisioning. The bare metal Operator recreates the host
in Ironic once you remove the annotation. For details, see
Metal3 documentation.
inspect.metal3.io/hardwaredetails-storage-sort-term
Optional. Key that defines sorting of the bmh:status:storage[] list
during inspection of a bare metal host. Accepts multiple tags separated
by a comma or semicolon with the ASC/DESC suffix for sorting
direction. Example terms: sizeBytesDESC, hctlASC,
typeASC, nameDESC.
The default value is hctlASC,wwnASC,by_idASC,nameASC.
labels
Labels used by the bare metal provider to find a matching
BareMetalHostInventory object for machine deployment. For example:
hostlabel.bm.kaas.mirantis.com/controlplane
hostlabel.bm.kaas.mirantis.com/worker
hostlabel.bm.kaas.mirantis.com/storage
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
Configuration example:
apiVersion: kaas.mirantis.com/v1alpha1
kind: BareMetalHostInventory
metadata:
  name: master-0
  namespace: default
  labels:
    kaas.mirantis.com/baremetalhost-id: hw-master-0
  annotations:
    inspect.metal3.io/hardwaredetails-storage-sort-term: hctl ASC, wwn ASC, by_id ASC, name ASC
The spec section for the BareMetalHostInventory object defines the
required state of BareMetalHostInventory. It contains the following fields:
bmc
Details for communication with the Baseboard Management Controller (bmc)
module on a host. Contains the following subfields:
address
URL for communicating with the BMC. URLs vary depending on the
communication protocol and the BMC type. For example:
IPMI
Default BMC type in the ipmi://<host>:<port> format. You can also
use a plain <host>:<port> format. A port is optional if using the
default port 623.
You can change the IPMI privilege level from the default
ADMINISTRATOR to OPERATOR with an optional URL parameter
privilegelevel: ipmi://<host>:<port>?privilegelevel=OPERATOR.
Redfish
BMC type in the redfish:// format. To disable TLS, you can use
the redfish+http:// format. A host name or IP address and a path
to the system ID are required for both formats. For example,
redfish://myhost.example/redfish/v1/Systems/System.Embedded.1
or redfish://myhost.example/redfish/v1/Systems/1.
bmhCredentialsName
Name of the BareMetalHostCredentials object.
disableCertificateVerification
Key that disables certificate validation. Boolean, false by default.
When true, the validation is skipped.
bootMACAddress
MAC address for booting.
bootMode
Boot mode: UEFI if UEFI is enabled and legacy if disabled.
online
Defines whether the server must be online after provisioning is done.
Warning
Setting online:false to more than one bare metal host
in a management cluster at a time can make the cluster non-operational.
This section describes the BareMetalHostProfile resource used
in Mirantis Container Cloud API
to define how the storage devices and operating system
are provisioned and configured.
For demonstration purposes, the Container Cloud BareMetalHostProfile
custom resource (CR) is split into the following major sections:
The spec field of the BareMetalHostProfile object contains
the fields to customize your hardware configuration:
Warning
Any data stored on any device defined in the fileSystems
list can be deleted or corrupted during cluster (re)deployment. It happens
because each device from the fileSystems list is a part of the
rootfs directory tree that is overwritten during (re)deployment.
Examples of affected devices include:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field (deprecated) or the wipeDevice structure (recommended
since Container Cloud 2.26.0) has no effect in this case and cannot
protect data on these devices.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
devices
List of definitions of the physical storage devices. To configure more
than three storage devices per host, add additional devices to this list.
Each device in the list can have one or more
partitions defined by the list in the partitions field.
Each device in the list must have the following fields in the
properties section for device handling:
workBy (recommended, string)
Defines how the device should be identified. Accepts a comma-separated
string with the following recommended value (in order of priority):
by_id,by_path,by_wwn,by_name. Since 2.25.1, this value is set
by default.
wipeDevice (recommended, object)
Available since Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0). Enables and configures cleanup of a device or its metadata
before cluster deployment. Contains the following fields:
eraseMetadata (dictionary)
Enables metadata cleanup of a device. Contains the following
field:
enabled (boolean)
Enables the eraseMetadata option. False by default.
eraseDevice (dictionary)
Configures a complete cleanup of a device. Contains the following
fields:
blkdiscard (object)
Executes the blkdiscard command on the target device
to discard all data blocks. Contains the following fields:
enabled (boolean)
Enables the blkdiscard option. False by default.
zeroout (string)
Configures writing of zeroes to each block during device
erasure. Contains the following options:
fallback - default, blkdiscard attempts to
write zeroes only if the device does not support the block
discard feature. In this case, the blkdiscard
command is re-executed with an additional --zeroout
flag.
always - always write zeroes.
never - never write zeroes.
userDefined (object)
Enables execution of a custom command or shell script to erase
the target device. Contains the following fields:
enabled (boolean)
Enables the userDefined option. False by default.
command (string)
Defines a command to erase the target device. Empty by
default. Mutually exclusive with script. For the command
execution, the ansible.builtin.command module is called.
script (string)
Defines a plain-text script allowing pipelines (|) to
erase the target device. Empty by default. Mutually exclusive
with command. For the script execution, the
ansible.builtin.shell module is called.
When executing a command or a script, you can use the following
environment variables:
DEVICE_KNAME (always defined by Ansible)
Device kernel path, for example, /dev/sda
DEVICE_BY_NAME (optional)
Link from /dev/disk/by-name/ if it was added by
udev
DEVICE_BY_ID (optional)
Link from /dev/disk/by-id/ if it was added by
udev
DEVICE_BY_PATH (optional)
Link from /dev/disk/by-path/ if it was added by
udev
DEVICE_BY_WWN (optional)
Link from /dev/disk/by-wwn/ if it was added by
udev
wipe (deprecated, boolean)
Defines whether the device must be wiped of data before being used.
Note
This field is deprecated since Container Cloud 2.26.0
(Cluster releases 17.1.0 and 16.1.0) for the sake of wipeDevice
and will be removed in one of the following releases.
For backward compatibility, any existing wipe:true option
is automatically converted to the following structure:
wipeDevice:
  eraseMetadata:
    enabled: True
Before Container Cloud 2.26.0, the wipe field is mandatory.
Each device in the list can have the following fields in its
properties section that affect the selection of the specific device
when the profile is applied to a host:
type (optional, string)
The device type. Possible values: hdd, ssd,
nvme. This property is used to filter selected devices by type.
partflags (optional, string)
Extra partition flags to be applied on a partition. For example,
bios_grub.
minSizeGiB, maxSizeGiB (optional, string)
The lower and upper limits of the selected device size. Only the
devices matching these criteria are considered for allocation.
An omitted parameter means no upper or lower limit.
The minSize and maxSize parameter names are also available
for the same purpose.
Caution
Mirantis recommends using only one parameter name type and units
throughout the configuration files. If both sizeGiB and size are
used, sizeGiB is ignored during deployment and the suffix is adjusted
accordingly. For example, 1.5Gi will be serialized as 1536Mi.
The size without units is counted in bytes. For example, size:120 means
120 bytes.
Since Container Cloud 2.26.0 (Cluster releases 17.1.0 and 16.1.0),
minSizeGiB and maxSizeGiB are deprecated.
Instead of floats that define sizes in GiB for *GiB fields, use
the <sizeNumber>Gi text notation (Ki, Mi, and so on).
All newly created profiles are automatically migrated to the Gi
syntax. In existing profiles, migrate the syntax manually.
byName (forbidden in new profiles since 2.27.0, optional, string)
The specific device name to be selected during provisioning, such as
/dev/sda.
Warning
With NVME devices and certain hardware disk controllers,
you cannot reliably select such device by the system name.
Therefore, use a more specific byPath, serialNumber, or
wwn selector.
Caution
Since Container Cloud 2.26.0 (Cluster releases 17.1.0 and
16.1.0), byName is deprecated. Since Container Cloud 2.27.0
(Cluster releases 17.2.0 and 16.2.0), byName is blocked by
admission-controller in new BareMetalHostProfile objects.
As a replacement, use a more specific selector, such as byPath,
serialNumber, or wwn.
byPath (optional, string) Since 2.26.0 (17.1.0, 16.1.0)
The specific device name with its path to be selected during
provisioning, such as /dev/disk/by-path/pci-0000:00:07.0.
serialNumber (optional, string) Since 2.26.0 (17.1.0, 16.1.0)
The specific serial number of a physical disk to be selected during
provisioning, such as S2RBNXAH116186E.
wwn (optional, string) Since 2.26.0 (17.1.0, 16.1.0)
The specific World Wide Name number of a physical disk to be selected
during provisioning, such as 0x5002538d409aeeb4.
Warning
When using strict filters, such as byPath,
serialNumber, or wwn, Mirantis strongly recommends not
combining them with a soft filter, such as minSize / maxSize.
Use only one approach.
softRaidDevices
Technology Preview. List of definitions of a software-based Redundant Array of Independent
Disks (RAID) created by mdadm. Use the following fields to describe
an mdadm RAID device:
name (mandatory, string)
Name of a RAID device. Supports the following formats:
dev path, for example, /dev/md0.
simple name, for example, raid-name that will be created as
/dev/md/raid-name on the target OS.
devices (mandatory, list)
List of partitions from the devices list. The resulting list
of devices must include at least two partitions.
level (optional, string)
Level of a RAID device, defaults to raid1. Possible values:
raid1, raid0, raid10.
metadata (optional, string)
Metadata version of RAID, defaults to 1.0.
Possible values: 1.0, 1.1, 1.2. For details about the
differences in metadata, see
man 8 mdadm.
Warning
The EFI system partition partflags: ['esp'] must be
a physical partition in the main partition table of the disk, not under
LVM or mdadm software RAID.
fileSystems
List of file systems. Each file system can be created on top of either
device, partition, or logical volume. If more file systems are required
for additional devices, define them in this field. Each file system
in the list has the following fields:
fileSystem (mandatory, string)
Type of a file system to create on a partition. For example, ext4,
vfat.
mountOpts (optional, string)
Comma-separated string of mount options. For example,
rw,noatime,nodiratime,lazytime,nobarrier,commit=240,data=ordered.
mountPoint (optional, string)
Target mount point for a file system. For example,
/mnt/local-volumes/.
partition (optional, string)
Partition name to be selected for creation from the list in the
devices section. For example, uefi.
logicalVolume (optional, string)
LVM logical volume name if the file system is supposed to be created
on an LVM volume defined in the logicalVolumes section. For example,
lvp.
logicalVolumes
List of LVM logical volumes. Every logical volume belongs to a volume
group from the volumeGroups list and has the size attribute
for a size in the corresponding units.
You can also add a software-based RAID raid1 created by LVM
using the following fields:
name (mandatory, string)
Name of a logical volume.
vg (mandatory, string)
Name of a volume group that must be a name from the volumeGroups
list.
sizeGiB or size (mandatory, string)
Size of a logical volume in gigabytes. When set to 0, all available
space on the corresponding volume group will be used. The 0 value
equals -l100%FREE in the lvcreate command.
type (optional, string)
Type of a logical volume. If you require a usual logical volume,
you can omit this field.
Possible values:
linear
Default. A usual logical volume. This value is implied for bare metal
host profiles created using the Container Cloud release earlier than
2.12.0 where the type field is unavailable.
raid1 (Technology Preview)
Serves to build the raid1 type of LVM. Equivalent to the
lvcreate --type raid1... command. For details, see
man 8 lvcreate
and man 7 lvmraid.
Caution
Mirantis recommends using only one parameter name type and units
throughout the configuration files. If both sizeGiB and size are
used, sizeGiB is ignored during deployment and the suffix is adjusted
accordingly. For example, 1.5Gi will be serialized as 1536Mi.
The size without units is counted in bytes. For example, size:120 means
120 bytes.
volumeGroups
List of definitions of LVM volume groups. Each volume group contains one
or more devices or partitions from the devices list. Contains the
following field:
devices (mandatory, list)
List of partitions to be used in a volume group, for example, the
lvm_lvp_part1 and lvm_lvp_part2 partitions in the configuration example below.
name (mandatory, string)
Name of a volume group to be created. For example: lvm_root.
preDeployScript (optional, string)
Shell script that executes on a host before provisioning the target
operating system inside the ramfs system.
postDeployScript (optional, string)
Shell script that executes on a host after deploying the operating
system inside the ramfs system that is chrooted to the target
operating system. To use a specific default gateway (for example,
to have Internet access) on this stage, refer to
Migration of DHCP configuration for existing management clusters.
grubConfig (optional, object)
Set of options for the Linux GRUB bootloader on the target operating system.
Contains the following field:
defaultGrubOptions (optional, array)
Set of options passed to the Linux GRUB bootloader. Each string in the
list defines one parameter. For example:
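The following options mirror those used in the general configuration example later in this section:
grubConfig:
  defaultGrubOptions:
  - GRUB_DISABLE_RECOVERY="true"
  - GRUB_PRELOAD_MODULES=lvm
  - GRUB_TIMEOUT=20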
If asymmetric traffic is expected on some of the managed cluster
nodes, enable the loose mode for the corresponding interfaces on those
nodes by setting the net.ipv4.conf.<interface-name>.rp_filter
parameter to "2" in the kernelParameters.sysctl section.
For example:
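A sysctl sketch, assuming a hypothetical interface name k8s-ext; substitute your interface name:
kernelParameters:
  sysctl:
    net.ipv4.conf.k8s-ext.rp_filter: "2"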
General configuration example with the deprecated wipe
option for devices - applies before 2.26.0 (17.1.0 and 16.1.0)
spec:
  devices:
  - device:
      #byName: /dev/sda
      minSize: 61GiB
      wipe: true
      workBy: by_wwn,by_path,by_id,by_name
    partitions:
    - name: bios_grub
      partflags:
      - bios_grub
      size: 4Mi
      wipe: true
    - name: uefi
      partflags: ['esp']
      size: 200Mi
      wipe: true
    - name: config-2
      # limited to 64Mb
      size: 64Mi
      wipe: true
    - name: md_root_part1
      wipe: true
      partflags: ['raid']
      size: 60Gi
    - name: lvm_lvp_part1
      wipe: true
      partflags: ['raid']
      # 0 Means, all left space
      size: 0
  - device:
      #byName: /dev/sdb
      minSize: 61GiB
      wipe: true
      workBy: by_wwn,by_path,by_id,by_name
    partitions:
    - name: md_root_part2
      wipe: true
      partflags: ['raid']
      size: 60Gi
    - name: lvm_lvp_part2
      wipe: true
      # 0 Means, all left space
      size: 0
  - device:
      #byName: /dev/sdc
      minSize: 30Gib
      wipe: true
      workBy: by_wwn,by_path,by_id,by_name
  softRaidDevices:
  - name: md_root
    metadata: "1.2"
    devices:
    - partition: md_root_part1
    - partition: md_root_part2
  volumeGroups:
  - name: lvm_lvp
    devices:
    - partition: lvm_lvp_part1
    - partition: lvm_lvp_part2
  logicalVolumes:
  - name: lvp
    vg: lvm_lvp
    # Means, all left space
    sizeGiB: 0
  postDeployScript: |
    #!/bin/bash -ex
    echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
  preDeployScript: |
    #!/bin/bash -ex
    echo 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"' > /etc/udev/rules.d/60-ssd-scheduler.rules
    echo $(date) 'pre_deploy_script done' >> /root/pre_deploy_done
  fileSystems:
  - fileSystem: vfat
    partition: config-2
  - fileSystem: vfat
    partition: uefi
    mountPoint: /boot/efi/
  - fileSystem: ext4
    softRaidDevice: md_root
    mountPoint: /
  - fileSystem: ext4
    logicalVolume: lvp
    mountPoint: /mnt/local-volumes/
  grubConfig:
    defaultGrubOptions:
    - GRUB_DISABLE_RECOVERY="true"
    - GRUB_PRELOAD_MODULES=lvm
    - GRUB_TIMEOUT=20
  kernelParameters:
    sysctl:
      # For the list of options prohibited to change, refer to
      # https://docs.mirantis.com/mke/3.7/install/predeployment/set-up-kernel-default-protections.html
      kernel.dmesg_restrict: "1"
      kernel.core_uses_pid: "1"
      fs.file-max: "9223372036854775807"
      fs.aio-max-nr: "1048576"
      fs.inotify.max_user_instances: "4096"
      vm.max_map_count: "262144"
    modules:
    - filename: kvm_intel.conf
      content: |
        options kvm_intel nested=1
When configuring volume mounts, Mirantis strongly advises against mounting
the entire /var directory to a separate disk or partition. Otherwise, the
cloud-init service may fail to configure the target host system during
the first boot.
This recommendation prevents the following cloud-init issue caused by
systemd mounting volumes asynchronously while ignoring dependencies:
The system boots and mounts /.
The cloud-init service starts and processes data in
/var/lib/cloud-init, which at this point still resides on the root
file system ([/]var/lib/cloud-init).
The systemd service mounts /var/lib/cloud-init and breaks the
cloud-init service logic.
Recommended configuration example for /var/lib/nova
spec:
  devices:
  ...
  - device:
      serialNumber: BTWA516305VE480FGN
      type: ssd
      wipeDevice:
        eraseMetadata:
          enabled: true
    partitions:
    - name: var_part
      size: 0
  fileSystems:
  ....
  - fileSystem: ext4
    partition: var_part
    mountPoint: '/var' # NOT RECOMMENDED
    mountOpts: 'rw,noatime,nodiratime,lazytime'
The fields of the Cluster resource that are located
under the status section including providerStatus
are available for viewing only.
They are automatically generated by the bare metal cloud provider
and must not be modified using Container Cloud API.
The Container Cloud Cluster CR contains the following fields:
apiVersion
API version of the object that is cluster.k8s.io/v1alpha1.
kind
Object type that is Cluster.
The metadata object field of the Cluster resource
contains the following fields:
name
Name of a cluster. A managed cluster name is specified under the
ClusterName field in the Create Cluster wizard of the
Container Cloud web UI. A management cluster name is configurable in the
bootstrap script.
namespace
Project in which the cluster object was created. The management cluster is
always created in the default project. The managed cluster project
equals the selected project name.
labels
Key-value pairs attached to the object:
kaas.mirantis.com/provider
Provider type that is baremetal for the baremetal-based clusters.
kaas.mirantis.com/region
Region name. The default region name for the management cluster is
region-one.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
The spec object field of the Cluster object
represents the BaremetalClusterProviderSpec subresource that
contains a complete description of the desired bare metal cluster
state and all details to create the cluster-level
resources. It also contains the fields required for LCM deployment
and integration of the Container Cloud components.
The providerSpec object field is custom for each cloud provider and
contains the following generic fields for the bare metal provider:
apiVersion
API version of the object that is baremetal.k8s.io/v1alpha1
maintenance
Maintenance mode of a cluster. Prepares a cluster for maintenance
and enables the possibility to switch machines into maintenance mode.
containerRegistries
List of the ContainerRegistries resources names.
ntpEnabled
NTP server mode. Boolean, enabled by default.
Since Container Cloud 2.23.0, you can optionally disable NTP to disable
the management of chrony configuration by Container Cloud and use your
own system for chrony management. Otherwise, configure the regional NTP
server parameters to be applied to all machines of managed clusters.
Before Container Cloud 2.23.0, you can optionally configure NTP parameters
if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org) are
accessible from the node where a management cluster is being provisioned.
Otherwise, this configuration is mandatory.
NTP configuration
Configure the regional NTP server parameters to be applied to all machines
of managed clusters.
In the Cluster object, add the ntp:servers section
with the list of required server names:
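A minimal sketch, assuming the ntp section resides under the provider-specific part of the Cluster spec; the exact nesting may differ in your configuration, and the server names are illustrative:
spec:
  providerSpec:
    value:
      ntp:
        servers:
        - 0.pool.ntp.org
        - 1.pool.ntp.org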
audit
Optional. Auditing tools enabled on the cluster. Contains the auditd
field that enables the Linux Audit daemon auditd to monitor
activity of cluster processes and prevent potential malicious activity.
enabled
Boolean, default - false. Enables the auditd role to install the
auditd packages and configure rules. CIS rules: 4.1.1.1, 4.1.1.2.
enabledAtBoot
Boolean, default - false. Configures grub to audit processes that can
be audited even if they start up prior to auditd startup. CIS rule:
4.1.1.3.
backlogLimit
Integer, default - none. Configures the backlog to hold records. If during
boot audit=1 is configured, the backlog holds 64 records. If more than
64 records are created during boot, auditd records will be lost with a
potential malicious activity being undetected. CIS rule: 4.1.1.4.
maxLogFile
Integer, default - none. Configures the maximum size of the audit log file.
Once the log reaches the maximum size, it is rotated and a new log file is
created. CIS rule: 4.1.2.1.
maxLogFileAction
String, default - none. Defines handling of the audit log file reaching the
maximum file size. Allowed values:
keep_logs - rotate logs but never delete them
rotate - add a cron job to compress rotated log files and keep
a maximum of 5 compressed files.
compress - compress log files and keep them under the
/var/log/auditd/ directory. Requires
auditd_max_log_file_keep to be enabled.
CIS rule: 4.1.2.2.
maxLogFileKeep
Integer, default - 5. Defines the number of compressed log files to keep
under the /var/log/auditd/ directory. Requires
auditd_max_log_file_action=compress. CIS rules - none.
mayHaltSystem
Boolean, default - false. Halts the system when the audit logs are
full. Applies the following configuration:
space_left_action=email
action_mail_acct=root
admin_space_left_action=halt
CIS rule: 4.1.2.3.
customRules
String, default - none. Base64-encoded content of the 60-custom.rules
file for any architecture. CIS rules - none.
customRulesX32
String, default - none. Base64-encoded content of the 60-custom.rules
file for the i386 architecture. CIS rules - none.
customRulesX64
String, default - none. Base64-encoded content of the 60-custom.rules
file for the x86_64 architecture. CIS rules - none.
presetRules
String, default - none. Comma-separated list of the following built-in
preset rules:
access
actions
delete
docker
identity
immutable
logins
mac-policy
modules
mounts
perm-mod
privileged
scope
session
system-locale
time-change
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0) in the
Technology Preview scope, you can collect some of the preset rules indicated
above as groups and use them in presetRules:
ubuntu-cis-rules - this group contains rules to comply with the Ubuntu
CIS Benchmark recommendations, including the following CIS Ubuntu 20.04
v2.0.1 rules:
scope - 5.2.3.1
actions - same as 5.2.3.2
time-change - 5.2.3.4
system-locale - 5.2.3.5
privileged - 5.2.3.6
access - 5.2.3.7
identity - 5.2.3.8
perm-mod - 5.2.3.9
mounts - 5.2.3.10
session - 5.2.3.11
logins - 5.2.3.12
delete - 5.2.3.13
mac-policy - 5.2.3.14
modules - 5.2.3.19
docker-cis-rules - this group contains rules to comply with
Docker CIS Benchmark recommendations, including the docker Docker CIS
v1.6.0 rules 1.1.3 - 1.1.18.
You can also use two additional keywords inside presetRules:
none - select no built-in rules.
all - select all built-in rules. When using this keyword, you can exclude
specific rules by adding the ! prefix to their names. The all keyword
must be the first rule in the list, and any rule with the ! prefix must
be placed after it.
Example configurations:
presetRules:none - disable all preset rules
presetRules:docker - enable only the docker rules
presetRules:access,actions,logins - enable only the
access, actions, and logins rules
presetRules:ubuntu-cis-rules - enable all rules from the
ubuntu-cis-rules group
presetRules:docker-cis-rules,actions - enable all rules from
the docker-cis-rules group and the actions rule
presetRules:all - enable all preset rules
presetRules:all,!immutable,!session - enable all preset
rules except immutable and session
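For illustration, a hedged sketch of an auditd configuration. The name of the parent audit section and its placement under spec:providerSpec:value are assumptions, and all values are examples only:
spec:
  providerSpec:
    value:
      audit:
        auditd:
          enabled: true
          enabledAtBoot: true
          maxLogFile: 30
          maxLogFileAction: rotate
          maxLogFileKeep: 5
          presetRules: "docker,identity,logins"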
Optional. Technology Preview. Deprecated since Container Cloud 2.29.0
(Cluster releases 17.4.0 and 16.4.0). Available since Container Cloud 2.24.0
(Cluster release 14.0.0). Enables WireGuard for traffic encryption on the
Kubernetes workloads network. Boolean. Disabled by default.
Caution
Before enabling WireGuard, ensure that the Calico MTU size is
at least 60 bytes smaller than the interface MTU size of the workload
network. IPv4 WireGuard uses a 60-byte header. For details, see
Set the MTU size for Calico.
Caution
Changing this parameter on a running cluster causes a downtime
that can vary depending on the cluster size.
This section represents the Container Cloud components that are
enabled on a cluster. It contains the following fields:
management
Configuration for the management cluster components:
enabled
Management cluster enabled (true) or disabled (false).
helmReleases
List of the management cluster Helm releases that will be installed
on the cluster. A Helm release includes the name and values
fields. The specified values will be merged with relevant Helm release
values of the management cluster in the Release object.
regional
List of regional cluster components for the provider:
provider
Provider type that is baremetal.
helmReleases
List of the regional Helm releases that will be installed
on the cluster. A Helm release includes the name and values
fields. The specified values will be merged with relevant
regional Helm release values in the Release object.
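A hedged sketch of the structure described above. The name of the enclosing section is not shown in this excerpt and is therefore omitted; the release names and values are placeholders:
management:
  enabled: true
  helmReleases:
  - name: <management-release-name>
    values:
      <key>: <value>
regional:
- provider: baremetal
  helmReleases:
  - name: <regional-release-name>
    values:
      <key>: <value>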
The providerStatus object field of the Cluster resource reflects
the cluster readiness and contains the following fields:
persistentVolumesProviderProvisioned
Status of the persistent volumes provisioning.
Prevents the Helm releases that require persistent volumes from being
installed until a default StorageClass is added to the Cluster
object.
helm
Details about the deployed Helm releases:
ready
Status of the deployed Helm releases. The true value indicates that
all Helm releases are deployed successfully.
releases
List of the enabled Helm releases that run on the Container Cloud
cluster:
releaseStatuses
List of the deployed Helm releases. The success:true field
indicates that the release is deployed successfully.
stacklight
Status of the StackLight deployment. Contains URLs of all StackLight
components. The success:true field indicates that StackLight
is deployed successfully.
nodes
Details about the cluster nodes:
ready
Number of nodes that completed the deployment or update.
requested
Total number of nodes. If the number of ready nodes does not match
the number of requested nodes, the cluster is currently being
deployed or updated.
notReadyObjects
The list of the services, deployments, and statefulsets
Kubernetes objects that are not in the Ready state yet.
A service is not ready if its external address has not been provisioned
yet. A deployment or statefulset is not ready if the number of
ready replicas is not equal to the number of desired replicas. Both objects
contain the name and namespace of the object and the number of ready and
desired replicas (for controllers). If all objects are ready, the
notReadyObjects list is empty.
The oidc section of the providerStatus object field
in the Cluster resource reflects the OpenID Connect (OIDC) configuration details.
It contains the required details to obtain a token for
a Container Cloud cluster and consists of the following fields:
certificate
Base64-encoded OIDC certificate.
clientId
Client ID for OIDC requests.
groupsClaim
Name of an OIDC groups claim.
issuerUrl
Issuer URL to obtain the representation of the realm.
ready
OIDC status relevance. If true, the status corresponds to the
LCMCluster OIDC configuration.
The releaseRefs section of the providerStatus object field
in the Cluster resource provides the current Cluster release version
as well as the one available for upgrade. It contains the following fields:
current
Details of the currently installed Cluster release:
lcmType
Type of the Cluster release (ucp).
name
Name of the Cluster release resource.
version
Version of the Cluster release.
unsupportedSinceKaaSVersion
Indicates that a Container Cloud release newer than
the current one exists and that it does not support the current
Cluster release.
available
List of the releases available for upgrade. Contains the name and
version fields.
For security reasons and to ensure safe and reliable cluster
operability, test this configuration on a staging environment before
applying it to production. For any questions, contact Mirantis support.
Caution
While the feature is still in the development stage,
Mirantis highly recommends deleting all HostOSConfiguration objects,
if any, before automatic upgrade of the management cluster to Container Cloud
2.27.0 (Cluster release 16.2.0). After the upgrade, you can recreate the
required objects using the updated parameters.
This precautionary step prevents re-processing and re-applying of existing
configuration, which is defined in HostOSConfiguration objects, during
management cluster upgrade to 2.27.0. Such behavior is caused by changes in
the HostOSConfiguration API introduced in 2.27.0.
This section describes the HostOSConfiguration custom resource (CR)
used in the Container Cloud API. It contains all necessary information to
introduce and load modules for further configuration of the host operating
system of the related Machine object.
Note
This object must be created and managed on the management cluster.
For demonstration purposes, we split the Container Cloud
HostOSConfiguration CR into the following sections:
The spec object field contains configuration for a
HostOSConfiguration object and has the following fields:
machineSelector
Required for production deployments. A set of Machine objects to apply
the HostOSConfiguration object to. Has the format of the Kubernetes
label selector.
configs
Required. List of configurations to apply to Machine objects defined in
machineSelector. Each entry has the following fields:
module
Required. Name of the module that refers to an existing module in one of
the HostOSConfigurationModules
objects.
moduleVersion
Required. Version of the module in use in the SemVer format.
description
Optional. Description and purpose of the configuration.
order
Optional. Positive integer between 1 and 1024 that indicates the
order of applying the module configuration. A configuration with the
lowest order value is applied first. If the order field is not set:
Since 2.27.0 (Cluster releases 17.2.0 and 16.2.0)
The configuration is applied in the order of appearance in the list
after all configurations with the order value set are applied.
In 2.26.0 (Cluster releases 17.1.0 and 16.1.0)
The following rules apply to the ordering when comparing each pair
of entries:
Alphabetical ordering based on the module values, unless the values are
equal.
Otherwise, ordering by version based on the moduleVersion values, with
preference given to the lesser value.
values
Optional if secretValues is set. Module configuration in the format
of key-value pairs.
secretValues
Optional if values is set. Reference to a Secret object that
contains the configuration values for the module:
namespace
Project name of the Secret object.
name
Name of the Secret object.
Note
You can use both values and secretValues together.
If keys are duplicated, the secretValues data overrides the
duplicated keys of the values data.
Warning
The referenced Secret object must contain only primitive
non-nested values. Otherwise, the values will not be applied correctly.
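For illustration, a minimal sketch of a Secret object that matches the secretValues reference used in the configuration example below. The object name, namespace, and keys are placeholders, and only primitive, non-nested values are used:
apiVersion: v1
kind: Secret
metadata:
  name: values-from-secret
  namespace: default
stringData:
  foo: "1"
  bar: "baz"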
phase
Optional. LCM phase, in which a module configuration must be executed.
The only supported and default value is reconfigure. Hence, you may
omit this field.
orderRemoved in 2.27.0 (17.2.0 and 16.2.0)
Optional. Positive integer between 1 and 1024 that indicates the
order of applying HostOSConfiguration objects on newly added or newly
assigned machines. An object with the lowest order value is applied first.
If the value is not set, the object is applied last in the order.
If no order field is set for all HostOSConfiguration objects,
the objects are sorted by name.
Note
If a user changes the HostOSConfiguration object that was
already applied on some machines, then only the changed items from
the spec.configs section of the HostOSConfiguration object are
applied to those machines, and the execution order applies only to the
changed items.
The configuration changes are applied on corresponding LCMMachine
objects almost immediately after host-os-modules-controller
verifies the changes.
Configuration example:
spec:
  machineSelector:
    matchLabels:
      label-name: "label-value"
  configs:
  - description: Brief description of the configuration
    module: container-cloud-provided-module-name
    moduleVersion: 1.0.0
    order: 1
    # the 'phase' field is provided for illustration purposes. it is redundant
    # because the only supported value is "reconfigure".
    phase: "reconfigure"
    values:
      foo: 1
      bar: "baz"
    secretValues:
      name: values-from-secret
      namespace: default
The status field of the HostOSConfiguration object contains the
current state of the object:
controllerUpdateSince 2.27.0 (17.2.0 and 16.2.0)
Reserved. Indicates whether the status updates are initiated by
host-os-modules-controller.
isValidSince 2.27.0 (17.2.0 and 16.2.0)
Indicates whether all given configurations have been validated successfully
and are ready to be applied on machines. An invalid object is discarded
from processing.
specUpdatedAtSince 2.27.0 (17.2.0 and 16.2.0)
Defines the time of the last change in the object spec observed by
host-os-modules-controller.
containsDeprecatedModulesSince 2.28.0 (17.3.0 and 16.3.0)
Indicates whether the object uses one or several deprecated modules.
Boolean.
machinesStatesSince 2.27.0 (17.2.0 and 16.2.0)
Specifies the per-machine state observed by baremetal-provider.
The keys are machine names, and each entry has the following fields:
observedGeneration
Read-only. Specifies the sequence number representing the number of
changes in the object since its creation. For example, during object
creation, the value is 1.
selected
Indicates whether the machine satisfied the selector of the object.
Non-selected machines are not defined in machinesStates. Boolean.
secretValuesChanged
Indicates whether the secret values have been changed and the
corresponding stateItems have to be updated. Boolean.
The value is set to true by host-os-modules-controller if changes
in the secret data are detected. The value is set to false by
baremetal-provider after processing.
configStateItemsStatuses
Specifies key-value pairs with statuses of StateItems that are
applied to the machine. Each key contains the name and version
of the configuration module. Each key value has the following format:
Key: name of a configuration StateItem
Value: simplified status of the configuration StateItem that has
the following fields:
hash
Value of the hash sum from the status of the corresponding
StateItem in the LCMMachine object. Appears when the status
switches to Success.
state
Actual state of the corresponding StateItem from the
LCMMachine object. Possible values: NotStarted,
Running, Success, Failed.
configs
List of configuration statuses, indicating the results of applying each
configuration. Every entry has the following fields:
moduleName
Existing module name from the list defined in the spec:modules
section of the related HostOSConfigurationModules object.
moduleVersion
Existing module version defined in the spec:modules section of the
related HostOSConfigurationModules object.
modulesReference
Name of the HostOSConfigurationModules object that contains
the related module configuration.
modulePlaybook
Name of the Ansible playbook of the module. The value is taken from
the related HostOSConfigurationModules object where this module
is defined.
moduleURL
URL to the module package in the FQDN format. The value is taken
from the related HostOSConfigurationModules object where this module
is defined.
moduleHashsum
Hash sum of the module. The value is taken from the related
HostOSConfigurationModules object where this module is defined.
lastDesignatedConfiguration
Removed in Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0).
Key-value pairs representing the latest designated configuration data
for modules. Each key corresponds to a machine name, while the
associated value contains the configuration data encoded in the
gzip+base64 format.
lastValidatedSpec
Removed in Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0).
Last validated module configuration encoded in the gzip+base64
format.
valuesValid
Removed in Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0).
Validation state of the configuration and secret values defined in the
object spec against the module valuesValidationSchema.
Always true when valuesValidationSchema is empty.
error
Details of an error, if any, that occurs during the object processing
by host-os-modules-controller.
secretObjectVersion
Available since Container Cloud 2.27.0 (Cluster releases 17.2.0 and
16.2.0). Resource version of the corresponding Secret object observed
by host-os-modules-controller. Is present only if secretValues
is set.
moduleDeprecatedBy
Available since Container Cloud 2.28.0 (Cluster releases 17.3.0 and
16.3.0). List of modules that deprecate the currently configured module.
Contains the name and version fields specifying one or more
modules that deprecate the current module.
supportedDistributions
Available since Container Cloud 2.28.0 (Cluster releases 17.3.0
and 16.3.0). List of operating system distributions that are supported by
the current module. An empty list means support of any distribution by
the current module.
HostOSConfiguration status example:
status:
  configs:
  - moduleHashsum: bc5fafd15666cb73379d2e63571a0de96fff96ac28e5bce603498cc1f34de299
    moduleName: module-name
    modulePlaybook: main.yaml
    moduleURL: <url-to-module-archive.tgz>
    moduleVersion: 1.1.0
    modulesReference: mcc-modules
    moduleDeprecatedBy:
    - name: another-module-name
      version: 1.0.0
  - moduleHashsum: 53ec71760dd6c00c6ca668f961b94d4c162eef520a1f6cb7346a3289ac5d24cd
    moduleName: another-module-name
    modulePlaybook: main.yaml
    moduleURL: <url-to-another-module-archive.tgz>
    moduleVersion: 1.1.0
    modulesReference: mcc-modules
    secretObjectVersion: "14234794"
  containsDeprecatedModules: true
  isValid: true
  machinesStates:
    default/master-0:
      configStateItemsStatuses:
        # moduleName-moduleVersion
        module-name-1.1.0:
          # corresponding state item
          host-os-download-<object-name>-module-name-1.1.0-reconfigure:
            hash: 0e5c4a849153d3278846a8ed681f4822fb721f6d005021c4509e7126164f428d
            state: Success
          host-os-<object-name>-module-name-1.1.0-reconfigure:
            state: Not Started
        another-module-name-1.1.0:
          host-os-download-<object-name>-another-module-name-1.1.0-reconfigure:
            state: Not Started
          host-os-<object-name>-another-module-name-1.1.0-reconfigure:
            state: Not Started
      observedGeneration: 1
      selected: true
      updatedAt: "2024-04-23T14:10:28Z"
For security reasons and to ensure safe and reliable cluster
operability, test this configuration on a staging environment before
applying it to production. For any questions, contact Mirantis support.
This section describes the HostOSConfigurationModules custom resource (CR)
used in the Container Cloud API. It contains all necessary information to
introduce and load modules for further configuration of the host operating
system of the related Machine object. For description of module format,
schemas, and rules, see Format and structure of a module package.
Note
This object must be created and managed on the management cluster.
For demonstration purposes, we split the Container Cloud
HostOSConfigurationModules CR into the following sections:
url
Required for custom modules. URL to the archive containing the module
package in the FQDN format. If omitted, the module is considered as the
one provided and validated by Container Cloud.
version
Required. Module version in SemVer format that must equal the
corresponding custom module version defined in the metadata section
of the corresponding module. For reference, see MOSK
documentation: Day-2 operations - Metadata file format.
sha256sum
Required. Hash sum computed using the SHA-256 algorithm.
The hash sum is automatically validated upon fetching the module
package; the module does not load if the hash sum is invalid.
deprecatesSince 2.28.0 (17.3.0 and 16.3.0)
Reserved. List of modules that will be deprecated by the module.
This field is overridden by the same field, if any, of the module
metadata section.
Contains the name and version fields specifying one or more
modules to be deprecated. If name is omitted, it inherits the name
of the current module.
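For illustration, a hedged sketch of a custom module declaration in the spec:modules section. The presence of the name field and the overall list structure are assumptions based on the status example below; the URL and hash values are placeholders:
spec:
  modules:
  - name: custom-module-name
    url: https://fully.qualified.domain.name/to/module/archive/custom-module-name-1.0.0.tgz
    version: 1.0.0
    sha256sum: <sha256-hash-of-the-module-archive>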
The status field of the HostOSConfigurationModules object contains the
current state of the object:
modules
List of module statuses, indicating the loading results of each module.
Each entry has the following fields:
name
Name of the loaded module.
version
Version of the loaded module.
url
URL to the archive containing the loaded module package in the FQDN
format.
docURL
URL to the loaded module documentation if it was initially present
in the module package.
description
Description of the loaded module if it was initially present in the
module package.
sha256sum
Actual SHA-256 hash sum of the loaded module.
valuesValidationSchema
JSON schema used against the module configuration values if it was
initially present in the module package. The value is encoded in the
gzip+base64 format.
state
Actual availability state of the module. Possible values are:
available or error.
error
Error, if any, that occurred during the module fetching and verification.
playbookName
Name of the module package playbook.
deprecatesSince 2.28.0 (17.3.0 and 16.3.0)
List of modules that are deprecated by the module. Contains the name
and version fields specifying one or more modules deprecated by the
current module.
deprecatedBySince 2.28.0 (17.3.0 and 16.3.0)
List of modules that deprecate the current module. Contains the name
and version fields specifying one or more modules that deprecate
the current module.
supportedDistributionsSince 2.28.0 (17.3.0 and 16.3.0)
List of operating system distributions that are supported by
the current module. An empty list means support of any distribution by
the current module.
HostOSConfigurationModules status example:
status:
  modules:
  - description: Brief description of the module
    docURL: https://docs.mirantis.com
    name: mirantis-provided-module-name
    playbookName: directory/main.yaml
    sha256sum: ff3c426d5a2663b544acea74e583d91cc2e292913fc8ac464c7d52a3182ec146
    state: available
    url: https://example.mirantis.com/path/to/module-name-1.0.0.tgz
    valuesValidationSchema: <gzip+base64 encoded data>
    version: 1.0.0
    deprecates:
    - name: custom-module-name
      version: 1.0.0
  - description: Brief description of the module
    docURL: https://example.documentation.page/module-name
    name: custom-module-name
    playbookName: directory/main.yaml
    sha256sum: 258ccafac1570de7b7829bde108fa9ee71b469358dbbdd0215a081f8acbb63ba
    state: available
    url: https://fully.qualified.domain.name/to/module/archive/module-name-1.0.0.tgz
    version: 1.0.0
    deprecatedBy:
    - name: mirantis-provided-module-name
      version: 1.0.0
    supportedDistributions:
    - ubuntu/jammy
This section describes the IPaddr resource used in Mirantis
Container Cloud API. The IPAddr object describes an IP address
and contains all information about the associated MAC address.
For demonstration purposes, the Container Cloud IPaddr
custom resource (CR) is split into the following major sections:
The Container Cloud IPaddr CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1
kind
Object type that is IPaddr
metadata
The metadata field contains the following subfields:
name
Name of the IPaddr object in the auto-XX-XX-XX-XX-XX-XX format
where XX-XX-XX-XX-XX-XX is the associated MAC address
namespace
Project in which the IPaddr object was created
labels
Key-value pairs that are attached to the object:
ipam/IP
IPv4 address
ipam/IpamHostID
Unique ID of the associated IpamHost object
ipam/MAC
MAC address
ipam/SubnetID
Unique ID of the Subnet object
ipam/UID
Unique ID of the IPAddr object
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
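For illustration, a hedged sketch of the IPaddr metadata described above; all label values are placeholders:
apiVersion: ipam.mirantis.com/v1alpha1
kind: IPaddr
metadata:
  name: auto-XX-XX-XX-XX-XX-XX # XX-XX-XX-XX-XX-XX is the associated MAC address
  namespace: default
  labels:
    ipam/IP: <IPv4-address>
    ipam/IpamHostID: <IpamHost-object-UID>
    ipam/MAC: <MAC-address>
    ipam/SubnetID: <Subnet-object-UID>
    ipam/UID: <IPaddr-object-UID>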
The status object field of the IPAddr resource reflects the actual
state of the IPAddr object. It contains the following fields:
address
IP address.
cidr
IPv4 CIDR for the Subnet.
gateway
Gateway address for the Subnet.
mac
MAC address in the XX:XX:XX:XX:XX:XX format.
nameservers
List of the IP addresses of name servers of the Subnet.
Each element of the list is a single address, for example, 172.18.176.6.
stateSince 2.23.0
Message that reflects the current status of the resource.
The list of possible values includes the following:
OK - object is operational.
ERR - object is non-operational. This status has a detailed
description in the messages list.
TERM - object was deleted and is terminating.
messagesSince 2.23.0
List of error or warning messages if the object state is ERR.
objCreated
Date, time, and IPAM version of the resource creation.
objStatusUpdated
Date, time, and IPAM version of the last update of the status
field in the resource.
objUpdated
Date, time, and IPAM version of the last resource update.
phase
Deprecated since Container Cloud 2.23.0 and will be removed in one of the
following releases in favor of state. Possible values: Active,
Failed, or Terminating.
Configuration example:
status:
  address: 172.16.48.201
  cidr: 172.16.48.201/24
  gateway: 172.16.48.1
  objCreated: 2021-10-21T19:09:32Z by v5.1.0-20210930-121522-f5b2af8
  objStatusUpdated: 2021-10-21T19:14:18.748114886Z by v5.1.0-20210930-121522-f5b2af8
  objUpdated: 2021-10-21T19:09:32.606968024Z by v5.1.0-20210930-121522-f5b2af8
  mac: 0C:C4:7A:A8:B8:18
  nameservers:
  - 172.18.176.6
  state: OK
  phase: Active
This section describes the IpamHost resource used in Mirantis
Container Cloud API. The kaas-ipam controller monitors
the current state of the bare metal Machine and verifies that BareMetalHost
is successfully created and inspection is completed.
Then the kaas-ipam controller fetches the information about the network
interface configuration, creates the IpamHost object, and requests the IP
addresses.
The IpamHost object is created for each Machine and contains
all configuration of the host network interfaces and IP address.
It also contains the information about associated BareMetalHost,
Machine, and MAC addresses.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
For demonstration purposes, the Container Cloud IpamHost
custom resource (CR) is split into the following major sections:
The Container Cloud IpamHost CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1
kind
Object type that is IpamHost
metadata
The metadata field contains the following subfields:
name
Name of the IpamHost object
namespace
Project in which the IpamHost object has been created
labels
Key-value pairs that are attached to the object:
cluster.sigs.k8s.io/cluster-name
References the Cluster object name that IpamHost is
assigned to
ipam/BMHostID
Unique ID of the associated BareMetalHost object
ipam/MAC-XX-XX-XX-XX-XX-XX:"1"
Number of NICs of the host that the corresponding MAC address is
assigned to
ipam/MachineID
Unique ID of the associated Machine object
ipam/UID
Unique ID of the IpamHost object
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
The spec field of the IpamHost resource describes the desired
state of the object. It contains the following fields:
nicMACmap
Represents an unordered list of all NICs of the host obtained during the
bare metal host inspection.
Each NIC entry contains such fields as name, mac, ip,
and so on. The primary field defines which NIC was used for PXE booting.
Only one NIC can be primary. The IP address is not configurable
and is provided only for debug purposes.
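For illustration, a hedged sketch of a single nicMACmap entry; the exact set of fields may differ, and all values are placeholders:
nicMACmap:
- name: eno1
  mac: 0c:c4:7a:a8:b8:18
  ip: 172.16.48.201
  primary: true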
l2TemplateSelector
If specified, contains the name (first priority) or label
of the L2 template that will be applied during a machine creation.
The l2TemplateSelector field is copied from the Machine providerSpec object to the IpamHost object only once,
during a machine creation. To modify l2TemplateSelector after creation
of a Machine CR, edit the IpamHost object.
netconfigUpdateModeTechPreview
Update mode of network configuration. Possible values:
MANUAL
Default, recommended. An operator manually applies new network
configuration.
AUTO-UNSAFE
Unsafe, not recommended. If new network configuration is rendered by
kaas-ipam successfully, it is applied automatically with no
manual approval.
MANUAL-GRACEPERIOD
Initial value set during the IpamHost object creation. If new network
configuration is rendered by kaas-ipam successfully, it is applied
automatically with no manual approval. This value is implemented for
automatic changes in the IpamHost object during the host provisioning
and deployment. The value is changed automatically to MANUAL in
three hours after the IpamHost object creation.
Caution
For MKE clusters that are part of MOSK infrastructure, the
feature support will become available in one of the following
Container Cloud releases.
netconfigUpdateAllowTechPreview
Manual approval of network changes. Possible values: true or false.
Set to true to approve the Netplan configuration file candidate
(stored in netconfigCandidate) and copy its contents to the effective
Netplan configuration file list (stored in netconfigFiles). After that,
its value is automatically switched back to false.
Note
This value has effect only if netconfigUpdateMode is set to
MANUAL.
Set to true only if status.netconfigCandidateState of network
configuration candidate is OK.
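For illustration, a minimal sketch of these two fields in the IpamHost spec; their placement at the top level of spec is an assumption:
spec:
  netconfigUpdateMode: MANUAL
  netconfigUpdateAllow: true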
Caution
The following fields of the ipamHost status are renamed since
Container Cloud 2.22.0 in the scope of the L2Template and IpamHost
objects refactoring:
netconfigV2 to netconfigCandidate
netconfigV2state to netconfigCandidateState
netconfigFilesState to netconfigFilesStates (per file)
No user actions are required after renaming.
The format of netconfigFilesState changed after renaming. The
netconfigFilesStates field contains a dictionary of statuses of network
configuration files stored in netconfigFiles. The dictionary keys are
file paths, and each value has the same meaning for the corresponding file
as netconfigFilesState had:
For a successfully rendered configuration file:
OK:<timestamp><sha256-hash-of-rendered-file>, where a timestamp
is in the RFC 3339 format.
For a failed rendering: ERR:<error-message>.
Caution
For MKE clusters that are part of MOSK infrastructure, the
feature support will become available in one of the following
Container Cloud releases.
The status field of the IpamHost resource describes the observed
state of the object. It contains the following fields:
netconfigCandidate
Candidate of the Netplan configuration file in human-readable format that
is rendered using the corresponding L2Template. This field contains
valid data if l2RenderResult and netconfigCandidateState retain the
OK result.
l2RenderResultDeprecated
Status of a rendered Netplan configuration candidate stored in
netconfigCandidate. Possible values:
For a successful L2 template rendering:
OK: timestamp sha256-hash-of-rendered-netplan, where
timestamp is in the RFC 3339 format
For a failed rendering: ERR: <error-message>
This field is deprecated and will be removed in one of the following
releases. Use netconfigCandidateState instead.
netconfigCandidateStateTechPreview
Status of a rendered Netplan configuration candidate stored in
netconfigCandidate. Possible values:
For a successful L2 template rendering:
OK: timestamp sha256-hash-of-rendered-netplan, where
timestamp is in the RFC 3339 format
For a failed rendering: ERR: <error-message>
Caution
For MKE clusters that are part of MOSK infrastructure, the
feature support will become available in one of the following
Container Cloud releases.
netconfigFiles
List of Netplan configuration files rendered using the corresponding
L2Template. It is used to configure host networking during bare metal
host provisioning and during Kubernetes node deployment. For details, refer
to Workflow of the netplan configuration using an L2 template.
Its contents change only if the rendering of the Netplan configuration was
successful, so the field always retains the last successfully rendered Netplan
configuration. Applying changes to its contents requires the Infrastructure
Operator approval. For details, see Modify network configuration on an existing machine.
Every item in this list contains:
content
The base64-encoded Netplan configuration file that was rendered
using the corresponding L2Template.
path
The file path for the Netplan configuration file on the target host.
netconfigFilesStates
Status of Netplan configuration files stored in netconfigFiles.
Possible values are:
For a successful L2 template rendering:
OK: timestamp sha256-hash-of-rendered-netplan, where
timestamp is in the RFC 3339 format
For a failed rendering: ERR: <error-message>
serviceMap
Dictionary of services and their endpoints (IP address and optional
interface name) that have the ipam/SVC-<serviceName> label.
These addresses are added to the ServiceMap dictionary
during rendering of an L2 template for a given IpamHost.
For details, see Service labels and their life cycle.
stateSince 2.23.0
Message that reflects the current status of the resource.
The list of possible values includes the following:
OK - object is operational.
ERR - object is non-operational. This status has a detailed
description in the messages list.
TERM - object was deleted and is terminating.
messagesSince 2.23.0
List of error or warning messages if the object state is ERR.
objCreated
Date, time, and IPAM version of the resource creation.
objStatusUpdated
Date, time, and IPAM version of the last update of the status
field in the resource.
objUpdated
Date, time, and IPAM version of the last resource update.
Configuration example:
status:
  l2RenderResult: OK
  l2TemplateRef: namespace_name/l2-template-name/1/2589/88865f94-04f0-4226-886b-2640af95a8ab
  netconfigFiles:
  - content: ...<base64-encoded Netplan configuration file>...
    path: /etc/netplan/60-kaas-lcm-netplan.yaml
  netconfigFilesStates:
    /etc/netplan/60-kaas-lcm-netplan.yaml: 'OK: 2023-01-23T09:27:22.71802Z ece7b73808999b540e32ca1720c6b7a6e54c544cc82fa40d7f6b2beadeca0f53'
  netconfigCandidate: ...<Netplan configuration file in plain text, rendered from L2Template>...
  netconfigCandidateState: 'OK: 2022-06-08T03:18:08.49590Z a4a128bc6069638a37e604f05a5f8345cf6b40e62bce8a96350b5a29bc8bccde'
  serviceMap:
    ipam/SVC-ceph-cluster:
    - ifName: ceph-br2
      ipAddress: 10.0.10.11
    - ifName: ceph-br1
      ipAddress: 10.0.12.22
    ipam/SVC-ceph-public:
    - ifName: ceph-public
      ipAddress: 10.1.1.15
    ipam/SVC-k8s-lcm:
    - ifName: k8s-lcm
      ipAddress: 10.0.1.52
  phase: Active
  state: OK
  objCreated: 2021-10-21T19:09:32Z by v5.1.0-20210930-121522-f5b2af8
  objStatusUpdated: 2021-10-21T19:14:18.748114886Z by v5.1.0-20210930-121522-f5b2af8
  objUpdated: 2021-10-21T19:09:32.606968024Z by v5.1.0-20210930-121522-f5b2af8
This section describes the L2Template resource used in Mirantis
Container Cloud API.
By default, Container Cloud configures a single interface on cluster nodes,
leaving all other physical interfaces intact.
With L2Template, you can create advanced host networking configurations
for your clusters. For example, you can create bond interfaces on top of
physical interfaces on the host.
For demonstration purposes, the Container Cloud L2Template
custom resource (CR) is split into the following major sections:
The Container Cloud L2Template CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1.
kind
Object type that is L2Template.
metadata
The metadata field contains the following subfields:
name
Name of the L2Template object.
namespace
Project in which the L2Template object was created.
labels
Key-value pairs that are attached to the object:
Caution
All ipam/* labels, except ipam/DefaultForCluster,
are set automatically and must not be configured manually.
cluster.sigs.k8s.io/cluster-name
References the Cluster object name that this template is
applied to. Mandatory for newly created L2Template since
Container Cloud 2.25.0.
The process of selecting the L2Template object for a specific
cluster is as follows:
The kaas-ipam controller monitors the L2Template objects
with the cluster.sigs.k8s.io/cluster-name:<clusterName> label.
The L2Template object with the
cluster.sigs.k8s.io/cluster-name:<clusterName>
label is assigned to a cluster with Name:<clusterName>,
if available.
ipam/PreInstalledL2Template:"1"
Is automatically added during a management cluster deployment.
Indicates that the current L2Template object was preinstalled.
Represents L2 templates that are automatically copied to a project
once it is created. Once the L2 templates are copied,
the ipam/PreInstalledL2Template label is removed.
Note
Preinstalled L2 templates are removed in Container Cloud
2.26.0 (Cluster releases 17.1.0 and 16.1.0) along with the
ipam/PreInstalledL2Template label. During cluster update to the
mentioned releases, existing preinstalled templates are
automatically removed.
ipam/DefaultForCluster
This label is unique per cluster. When you use several L2 templates
per cluster, only the first template is automatically labeled
as the default one. All subsequent templates must be referenced
in the machine configuration files using l2TemplateSelector.
You can manually configure this label if required.
ipam/UID
Unique ID of an object.
kaas.mirantis.com/provider
Provider type.
kaas.mirantis.com/region
Region name.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting from these releases. On existing
clusters updated to these releases, or if the label was added manually, it
is ignored by Container Cloud.
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
The spec field of the L2Template resource describes the desired
state of the object. It contains the following fields:
clusterRef
Caution
Deprecated since Container Cloud 2.25.0 in favor of the mandatory
cluster.sigs.k8s.io/cluster-name label. Will be removed in one of the
following releases.
On existing clusters, this parameter is automatically migrated to the
cluster.sigs.k8s.io/cluster-name label since 2.25.0.
If an existing cluster has clusterRef:default set, the migration process
involves removing this parameter. Subsequently, it is not substituted with
the cluster.sigs.k8s.io/cluster-name label, ensuring the application of
the L2 template across the entire Kubernetes namespace.
The Cluster object name that this template is applied to.
The default value is used to apply the given template to all clusters
within a particular project, unless an L2 template that references
a specific cluster name exists. The clusterRef field has priority over
the cluster.sigs.k8s.io/cluster-name label:
When clusterRef is set to a non-default value, the
cluster.sigs.k8s.io/cluster-name label will be added or updated with
that value.
When clusterRef is set to default, the
cluster.sigs.k8s.io/cluster-name label will be absent or removed.
L2 template requirements
An L2 template must have the same project (Kubernetes namespace) as the
referenced cluster.
A cluster can be associated with many L2 templates. Only one of them can
have the ipam/DefaultForCluster label. Every L2 template that does not
have the ipam/DefaultForCluster label can be later assigned to a
particular machine using l2TemplateSelector.
The following rules apply to the default L2 template of a namespace:
Since Container Cloud 2.25.0, creation of the default L2 template for
a namespace is disabled. On existing clusters, the
Spec.clusterRef:default parameter of such an L2 template is
automatically removed during the migration process. Subsequently,
this parameter is not substituted with the
cluster.sigs.k8s.io/cluster-name label, ensuring the application
of the L2 template across the entire Kubernetes namespace. Therefore,
you can continue using existing default namespaced L2 templates.
Before Container Cloud 2.25.0, the default L2Template object of a
namespace must have the Spec.clusterRef:default parameter that is
deprecated since 2.25.0.
ifMapping
List of interface names for the template. The interface mapping is defined
globally for all bare metal hosts in the cluster but can be overridden at the
host level, if required, by editing the IpamHost object for a particular
host. The ifMapping parameter is mutually exclusive with
autoIfMappingPrio.
autoIfMappingPrio
List of interface name prefixes, such as eno, ens,
and so on, used to match the interfaces and automatically create
an interface list for the template. If you are not aware of any specific
ordering of interfaces on the nodes, use the default
ordering from
Predictable Network Interfaces Names specification for systemd.
You can also override the default NIC list per host
using the IfMappingOverride parameter of the corresponding
IpamHost. The provision value corresponds to the network
interface that was used to provision a node.
Usually, it is the first NIC found on a particular node.
It is defined explicitly to ensure that this interface
will not be reconfigured accidentally.
The autoIfMappingPrio parameter is mutually exclusive
with ifMapping.
l3Layout
Subnets to be used in the npTemplate section. The field contains
a list of subnet definitions with parameters used by template macros.
subnetName
Defines the alias name of the subnet that can be used to reference this
subnet from the template macros. This parameter is mandatory for every
entry in the l3Layout list.
subnetPoolUnsupported since 2.28.0 (17.3.0 and 16.3.0)
Optional. Default: none. Defines a name of the parent SubnetPool object
that will be used to create a Subnet object with a given subnetName
and scope. For deprecation details, see MOSK Deprecation Notes:
SubnetPool resource management.
If a corresponding Subnet object already exists,
nothing will be created and the existing object will be used.
If no SubnetPool is provided, no new Subnet object will be created.
scope
Logical scope of the Subnet object with a corresponding subnetName.
Possible values:
global - the Subnet object is accessible globally,
for any Container Cloud project and cluster, for example, the PXE subnet.
namespace - the Subnet object is accessible within the same
project where the L2 template is defined.
cluster - the Subnet object is only accessible to the cluster
that L2Template.spec.clusterRef refers to. The Subnet objects
with the cluster scope will be created for every new cluster.
labelSelector
Contains a dictionary of labels and their respective values that will be
used to find the matching Subnet object for the subnet. If the
labelSelector field is omitted, the Subnet object will be selected
by name, specified by the subnetName parameter.
Caution
The labels and their values in this section must match the ones
added for the corresponding Subnet object.
Caution
The l3Layout section is mandatory for each L2Template
custom resource.
npTemplate
A netplan-compatible configuration with special lookup functions that
defines the networking settings for the cluster hosts, where physical
NIC names and details are parameterized. This configuration will be
processed using Go templates. Instead of specifying IP and MAC addresses,
interface names, and other network details specific to a particular host,
the template supports use of special lookup functions. These lookup
functions, such as nic, mac, ip, and so on, return
host-specific network information when the template is rendered for
a particular host.
Caution
All rules and restrictions of the netplan configuration
also apply to L2 templates. For details,
see the official netplan documentation.
Caution
We strongly recommend following the below conventions on
network interface naming:
A physical NIC name set by an L2 template must not exceed
15 symbols. Otherwise, an L2 template creation fails.
This limit is set by the Linux kernel.
Names of virtual network interfaces such as VLANs, bridges,
bonds, veth, and so on must not exceed 15 symbols.
We recommend setting interface names that do not
exceed 13 symbols for both physical and virtual interfaces
to avoid corner cases and issues in netplan rendering.
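For illustration, a hedged sketch of an L2Template spec that combines the fields described above. The npTemplate contents and the exact macro invocation syntax are assumptions, and all names are placeholders:
spec:
  autoIfMappingPrio:
  - provision
  - eno
  - ens
  l3Layout:
  - subnetName: lcm-subnet
    scope: namespace
  npTemplate: |
    version: 2
    ethernets:
      {{nic 0}}:
        dhcp4: false
        addresses:
        - {{ip "0:lcm-subnet"}}
        match:
          macaddress: {{mac 0}}
        set-name: {{nic 0}}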
The status field of the L2Template resource reflects the actual state
of the L2Template object and contains the following fields:
stateSince 2.23.0
Message that reflects the current status of the resource.
The list of possible values includes the following:
OK - object is operational.
ERR - object is non-operational. This status has a detailed
description in the messages list.
TERM - object was deleted and is terminating.
messagesSince 2.23.0
List of error or warning messages if the object state is ERR.
objCreated
Date, time, and IPAM version of the resource creation.
objStatusUpdated
Date, time, and IPAM version of the last update of the status
field in the resource.
objUpdated
Date, time, and IPAM version of the last resource update.
phase
Deprecated since Container Cloud 2.23.0 and will be removed in one of the
following releases in favor of state. Possible values: Active,
Failed, or Terminating.
reason
Deprecated since Container Cloud 2.23.0 and will be removed in one of the
following releases in favor of messages. For the field description, see
messages.
Configuration example:
status:
  phase: Failed
  state: ERR
  messages:
  - "ERR: The kaas-mgmt subnet in the terminating state."
  objCreated: 2021-10-21T19:09:32Z by v5.1.0-20210930-121522-f5b2af8
  objStatusUpdated: 2021-10-21T19:14:18.748114886Z by v5.1.0-20210930-121522-f5b2af8
  objUpdated: 2021-10-21T19:09:32.606968024Z by v5.1.0-20210930-121522-f5b2af8
This section describes the Machine resource used in Mirantis
Container Cloud API for bare metal provider.
The Machine resource describes the machine-level parameters.
For demonstration purposes, the Container Cloud Machine
custom resource (CR) is split into the following major sections:
The Container Cloud Machine CR contains the following fields:
apiVersion
API version of the object that is cluster.k8s.io/v1alpha1.
kind
Object type that is Machine.
The metadata object field of the Machine resource contains
the following fields:
name
Name of the Machine object.
namespace
Project in which the Machine object is created.
annotations
Key-value pair to attach arbitrary metadata to the object:
metal3.io/BareMetalHost
Annotation attached to the Machine object to reference
the corresponding BareMetalHostInventory object in the
<BareMetalHostProjectName/BareMetalHostName> format.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
labels
Key-value pairs that are attached to the object:
kaas.mirantis.com/provider
Provider type that matches the provider type in the Cluster object
and must be baremetal.
kaas.mirantis.com/region
Region name that matches the region name in the Cluster object.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting from these releases. On existing
clusters updated to these releases, or if the label was added manually, it
is ignored by Container Cloud.
cluster.sigs.k8s.io/cluster-name
Cluster name that the Machine object is linked to.
cluster.sigs.k8s.io/control-plane
For the control plane role of a machine, this label contains any value,
for example, "true".
For the worker role, this label is absent.
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
Configuration example:
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: example-control-plane
  namespace: example-ns
  annotations:
    metal3.io/BareMetalHost: default/master-0
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: example-cluster
    cluster.sigs.k8s.io/control-plane: "true" # remove for worker
The spec object field of the Machine object represents
the BareMetalMachineProviderSpec subresource with all required
details to create a bare metal instance. It contains the following fields:
apiVersion
API version of the object that is baremetal.k8s.io/v1alpha1.
kind
Object type that is BareMetalMachineProviderSpec.
bareMetalHostProfile
Configuration profile of a bare metal host:
name
Name of a bare metal host profile
namespace
Project in which the bare metal host profile is created.
l2TemplateIfMappingOverride
If specified, overrides the interface mapping value for the corresponding
L2Template object.
l2TemplateSelector
If specified, contains the name (first priority) or label
of the L2 template that will be applied during a machine creation.
The l2TemplateSelector field is copied from the Machine providerSpec object to the IpamHost object only once,
during a machine creation. To modify l2TemplateSelector after creation
of a Machine CR, edit the IpamHost object.
hostSelector
Specifies the matching criteria for labels on the bare metal hosts.
Limits the set of the BareMetalHostInventory objects considered for
claiming for the Machine object. The following selector labels
can be added when creating a machine using the Container Cloud web UI:
hostlabel.bm.kaas.mirantis.com/controlplane
hostlabel.bm.kaas.mirantis.com/worker
hostlabel.bm.kaas.mirantis.com/storage
Any custom label that is assigned to one or more bare metal hosts using API
can be used as a host selector. If the BareMetalHostInventory objects
with the specified label are missing, the Machine object will not
be deployed until at least one bare metal host with the specified label
is available.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
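For illustration, a hedged sketch of a hostSelector definition; the matchLabels format and the label value are assumptions:
spec:
  providerSpec:
    value:
      hostSelector:
        matchLabels:
          hostlabel.bm.kaas.mirantis.com/worker: "<label-value>"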
nodeLabels
List of node labels to be attached to a node for the user to run certain
components on separate cluster nodes. The list of allowed node labels
is located in the Cluster object status
providerStatus.releaseRef.current.allowedNodeLabels field.
If the value field is not defined in allowedNodeLabels, a label can
have any value.
Before or after a machine deployment, add the required label from the allowed
node labels list with the corresponding value to
spec.providerSpec.value.nodeLabels in machine.yaml. For example:
nodeLabels:
- key: stacklight
  value: enabled
The addition of a node label that is not available in the list of allowed node
labels is restricted.
distributionMandatory
Specifies an operating system (OS) distribution ID that is present in the
current ClusterRelease object under the AllowedDistributions list.
When specified, the BareMetalHostInventory object linked to this
Machine object will be provisioned using the selected OS distribution
instead of the default one.
By default, ubuntu/jammy is installed on greenfield managed clusters:
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0), for
MOSK clusters
Since Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0), for
non-MOSK clusters
The default distribution is marked with the boolean flag default
inside one of the elements under the AllowedDistributions list.
The ubuntu/focal distribution was deprecated in Container Cloud 2.28.0
and is only supported for existing managed clusters. The Container Cloud 2.28.x
release series is the last one to support Ubuntu 20.04 as the host operating
system for managed clusters.
Caution
The outdated ubuntu/bionic distribution, which is removed
in Cluster releases 17.0.0 and 16.0.0, is only supported for existing
clusters based on Ubuntu 18.04. For greenfield deployments of managed
clusters, only ubuntu/jammy is supported.
Warning
During the course of the Container Cloud 2.28.x series, Mirantis
highly recommends upgrading the operating system on all nodes of your
managed cluster machines to Ubuntu 22.04 before the next major Cluster
release becomes available.
It is not mandatory to upgrade all machines at once. You can upgrade them
one by one or in small batches, for example, if the maintenance window is
limited in time.
Otherwise, the Cluster release update of the Ubuntu 20.04-based managed
clusters will become impossible as of Container Cloud 2.29.0 with Ubuntu
22.04 as the only supported version.
Management cluster update to Container Cloud 2.29.1 will be blocked if
at least one node of any related managed cluster is running Ubuntu 20.04.
maintenance
Maintenance mode of a machine. If enabled, the node of the selected machine
is drained, cordoned, and prepared for maintenance operations.
upgradeIndex (optional)
Positive numeral value that determines the order of machines upgrade. The
first machine to upgrade is always one of the control plane machines
with the lowest upgradeIndex. Other control plane machines are upgraded
one by one according to their upgrade indexes.
If the Cluster spec dedicatedControlPlane field is false, worker
machines are upgraded only after the upgrade of all control plane machines
finishes. Otherwise, they are upgraded after the first control plane
machine, concurrently with other control plane machines.
If two or more machines have the same value of upgradeIndex, these
machines are equally prioritized during upgrade.
deletionPolicy
Generally available since Container Cloud 2.25.0 (Cluster releases 17.0.0
and 16.0.0). Technology Preview since 2.21.0 (Cluster releases 11.5.0 and
7.11.0) for non-MOSK clusters. Policy used to identify steps
required during a Machine object deletion. Supported policies are as
follows:
graceful
Prepares a machine for deletion by cordoning, draining, and removing
from Docker Swarm of the related node. Then deletes Kubernetes objects
and associated resources. Can be aborted only before a node is removed
from Docker Swarm.
unsafe
Default. Deletes Kubernetes objects and associated resources without any
preparations.
forced
Deletes Kubernetes objects and associated resources without any
preparations. Removes the Machine object even if the cloud provider
or LCM Controller gets stuck at some step. May require a manual cleanup
of machine resources in case of the controller failure.
The status object field of the Machine object represents the
BareMetalMachineProviderStatus subresource that describes the current
bare metal instance state and contains the following fields:
apiVersion
API version of the object that is cluster.k8s.io/v1alpha1.
kind
Object type that is BareMetalMachineProviderStatus.
hardware
Provides a machine hardware information:
cpu
Number of CPUs.
ram
RAM capacity in GB.
storage
List of hard drives mounted on the machine. Contains the disk name
and size in GB.
status
Represents the current status of a machine:
Provision
A machine is yet to obtain a status
Uninitialized
A machine is yet to obtain the node IP address and host name
Pending
A machine is yet to receive the deployment instructions and
it is either not booted yet or waits for the LCM controller to be
deployed
Prepare
A machine is running the Prepare phase during which Docker images
and packages are being predownloaded
Deploy
A machine is processing the LCM Controller instructions
Reconfigure
A machine is being updated with a configuration without affecting
workloads running on the machine
Ready
A machine is deployed and the supported Mirantis Kubernetes Engine (MKE)
version is set
Maintenance
A machine host is cordoned, drained, and prepared for maintenance
operations
currentDistributionSince 2.24.0 as TechPreview and 2.24.2 as GA
Distribution ID of the current operating system installed on the machine.
For example, ubuntu/jammy.
maintenance
Maintenance mode of a machine. If enabled, the node of the selected machine
is drained, cordoned, and prepared for maintenance operations.
rebootAvailable since 2.22.0
Indicator of a host reboot to complete the Ubuntu operating system updates,
if any.
required
Specifies whether a host reboot is required. Boolean. If true,
a manual host reboot is required.
reason
Specifies the package name(s) to apply during a host reboot.
upgradeIndex
Positive numeral value that determines the order of machines upgrade. If
upgradeIndex in the Machine object spec is set, this status value
equals the one in the spec. Otherwise, this value displays the automatically
generated order of upgrade.
delete
Generally available since Container Cloud 2.25.0 (Cluster releases 17.0.0
and 16.0.0). Technology Preview since 2.21.0 for non-MOSK
clusters. Start of a machine deletion or a successful abortion. Boolean.
prepareDeletionPhase
Generally available since Container Cloud 2.25.0 (Cluster releases 17.0.0
and 16.0.0). Technology Preview since 2.21.0 for non-MOSK
clusters. Preparation phase for a graceful machine deletion. Possible values
are as follows:
started
Cloud provider controller prepares a machine for deletion by cordoning,
draining the machine, and so on.
completed
LCM Controller starts removing the machine resources since
the preparation for deletion is complete.
aborting
Cloud provider controller attempts to uncordon the node. If the attempt
fails, the status changes to failed.
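To illustrate the fields above, a condensed status sketch; the exact nesting under status.providerStatus, the storage subfield names, and all values are illustrative assumptions:
status:
  providerStatus:
    hardware:
      cpu: 16
      ram: 64          # GB
      storage:
        - name: sda
          size: 480    # GB
    status: Ready
    currentDistribution: ubuntu/jammy
    maintenance: false
    reboot:
      required: false
      reason: ""
    upgradeIndex: 1
    delete: false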
TechPreview since 2.21.0 and 2.21.1 for MOSK 22.5. GA since 2.24.0 for
management and regional clusters. GA since 2.25.0 for managed clusters.
This section describes the MetalLBConfig custom resource used in the
Container Cloud API that contains the MetalLB configuration objects for a
particular cluster.
For demonstration purposes, the Container Cloud MetalLBConfig
custom resource description is split into the following major sections:
The Container Cloud MetalLBConfig CR contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1.
kind
Object type that is MetalLBConfig.
The metadata object field of the MetalLBConfig resource
contains the following fields:
name
Name of the MetalLBConfig object.
namespace
Project in which the object was created. Must match the project name of
the target cluster.
labels
Key-value pairs attached to the object. Mandatory labels:
kaas.mirantis.com/provider
Provider type that is baremetal.
kaas.mirantis.com/region
Region name that matches the region name of the target cluster.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting from these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
cluster.sigs.k8s.io/cluster-name
Name of the cluster that the MetalLB configuration must apply to.
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
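For illustration, a minimal metadata sketch; the object and project names are hypothetical:
apiVersion: kaas.mirantis.com/v1alpha1
kind: MetalLBConfig
metadata:
  name: managed-cluster-metallb-config
  namespace: managed-ns
  labels:
    kaas.mirantis.com/provider: baremetal
    cluster.sigs.k8s.io/cluster-name: managed-cluster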
The spec field of the MetalLBConfig object represents the
MetalLBConfigSpec subresource that contains the description of MetalLB
configuration objects.
These objects are created in the target cluster during its deployment.
The spec field contains the following optional fields:
addressPools
Removed in Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0),
deprecated in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
List of MetalLBAddressPool objects to create MetalLB AddressPool
objects.
bfdProfiles
List of MetalLBBFDProfile objects to create MetalLB BFDProfile
objects.
bgpAdvertisements
List of MetalLBBGPAdvertisement objects to create MetalLB
BGPAdvertisement objects.
bgpPeers
List of MetalLBBGPPeer objects to create MetalLB BGPPeer objects.
communities
List of MetalLBCommunity objects to create MetalLB Community
objects.
ipAddressPools
List of MetalLBIPAddressPool objects to create MetalLB
IPAddressPool objects.
l2Advertisements
List of MetalLBL2Advertisement objects to create MetalLB
L2Advertisement objects.
The l2Advertisements object allows defining interfaces to optimize
the announcement. When you use the interfaces selector, LB addresses
are announced only on selected host interfaces.
Mirantis recommends using the interfaces selector if nodes use separate
host networks for different types of traffic. Such a configuration reduces
announcement traffic on other interfaces and networks and limits the chances
of reaching IP addresses of load-balanced services from irrelevant interfaces
and networks (see the sketch after the following caution).
Caution
Interface names in the interfaces list must match those
on the corresponding nodes.
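For illustration, a minimal spec sketch that combines an IP address pool with an L2 advertisement that uses the interfaces selector; the pool name, address range, and interface name are hypothetical:
spec:
  ipAddressPools:
    - name: services
      spec:
        addresses:
          - 10.100.100.151-10.100.100.170
        autoAssign: true
  l2Advertisements:
    - name: services
      spec:
        ipAddressPools:
          - services
        interfaces:
          - k8s-lcm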
templateName
Name of the MetalLBConfigTemplate object used as a source of MetalLB
configuration objects. Mutually exclusive with the fields listed below
that will be part of the MetalLBConfigTemplate object. For details,
see MetalLBConfigTemplate.
Before Cluster releases 17.2.0 and 16.2.0, MetalLBConfigTemplate is the
default configuration method for MetalLB on bare metal deployments. Since
Cluster releases 17.2.0 and 16.2.0, use the MetalLBConfig object
instead.
Caution
For MKE clusters that are part of MOSK infrastructure, the
feature support will become available in one of the following
Container Cloud releases.
Caution
For managed clusters, this field is available as Technology
Preview since Container Cloud 2.24.0, is generally available since
2.25.0, and is deprecated since 2.27.0.
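A minimal sketch of the deprecated templateName approach; the template name is hypothetical:
spec:
  templateName: managed-cluster-metallb-template   # hypothetical MetalLBConfigTemplate name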
The objects listed in the spec field of the MetalLBConfig object,
such as MetalLBIPAddressPool, MetalLBL2Advertisement, and so on,
are used as templates for the MetalLB objects that will be created in the
target cluster. Each of these objects has the following structure:
labels
Optional. Key-value pairs attached to the metallb.io/<objectName>
object as metadata.labels.
name
Name of the metallb.io/<objectName> object.
spec
Contents of the spec section of the metallb.io/<objectName> object.
The spec field has the metallb.io/<objectName>Spec type.
For details, see MetalLB objects.
For example, MetalLBIPAddressPool is a template for the
metallb.io/IPAddressPool object and has the following structure:
labels
Optional. Key-value pairs attached to the metallb.io/IPAddressPool
object as metadata.labels.
name
Name of the metallb.io/IPAddressPool object.
spec
Contents of the spec section of the metallb.io/IPAddressPool object.
The spec field has the metallb.io/IPAddressPoolSpec type.
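For illustration, a MetalLBIPAddressPool entry following this structure might look as follows; the name, label, and address range are hypothetical:
- name: services
  labels:
    metallb-pool-type: services    # hypothetical, arbitrary label
  spec:
    addresses:
      - 10.100.100.151-10.100.100.170
    autoAssign: true
    avoidBuggyIPs: false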
Container Cloud supports the following MetalLB object types of the
metallb.io API group:
IPAddressPool
Community
L2Advertisement
BFDProfile
BGPAdvertisement
BGPPeer
As of v1beta1 and v1beta2 API versions, metadata of MetalLB objects
has a standard format with no specific fields or labels defined for any
particular object:
apiVersion
API version of the object that can be metallb.io/v1beta1 or
metallb.io/v1beta2.
kind
Object type that is one of the metallb.io types listed above. For
example, IPAddressPool.
metadata
Object metadata that contains the following subfields:
name
Name of the object.
namespace
Namespace where the MetalLB components are located. It matches
metallb-system in Container Cloud.
labels
Optional. Key-value pairs that are attached to the object. It can be an
arbitrary set of labels. No special labels are defined as of v1beta1
and v1beta2 API versions.
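An illustrative sketch of the metadata of a resulting native MetalLB object; the object name is hypothetical:
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: services
  namespace: metallb-system
  labels: {}   # optional, arbitrary labels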
The MetalLBConfig object contains spec sections of the
metallb.io/<objectName> objects that have the
metallb.io/<objectName>Spec type. For metallb.io/<objectName> and
metallb.io/<objectName>Spec types definitions, refer to the official
MetalLB documentation:
Before Container Cloud 2.27.0 (Cluster releases 17.2.0 and 16.2.0),
metallb.io/<objectName> objects v0.13.9 are supported.
The l2Advertisements object allows defining interfaces to optimize
the announcement. When you use the interfaces selector, LB addresses
are announced only on selected host interfaces. Mirantis recommends this
configuration if nodes use separate host networks for different types
of traffic. Such a configuration reduces announcement traffic on other
interfaces and networks and limits the chances of reaching LB addresses
of services from irrelevant interfaces and networks.
Configuration example:
l2Advertisements: |
  - name: management-lcm
    spec:
      ipAddressPools:
        - default
      interfaces:
        # LB addresses from the "default" address pool will be announced
        # on the "k8s-lcm" interface
        - k8s-lcm
Caution
Interface names in the interfaces list must match those
on the corresponding nodes.
For managed clusters, this field is available as Technology
Preview and is generally available since Container Cloud 2.25.0.
Caution
For MKE clusters that are part of MOSK infrastructure, the
feature support will become available in one of the following
Container Cloud releases.
The status field describes the actual state of the object.
It contains the following fields:
bootstrapModeOnly in 2.24.0
Field that appears only during a management cluster bootstrap, set to true
and used internally for bootstrap. Once deployment completes, the value
changes to false and the field is excluded from the status output.
objects
Description of MetalLB objects that is used to create MetalLB native
objects in the target cluster.
The format of underlying objects is the same as for those in the spec
field, except templateName, which is obsolete since Container Cloud
2.28.0 (Cluster releases 17.3.0 and 16.3.0) and which is not present in this
field. The objects contents are rendered from the following locations,
with possible modifications for the bootstrap cluster:
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0),
MetalLBConfig.spec
Before Container Cloud 2.28.0 (Cluster releases 17.2.0, 16.2.0, or
earlier):
MetalLBConfigTemplate.status of the corresponding template if
MetalLBConfig.spec.templateName is defined
MetalLBConfig.spec if MetalLBConfig.spec.templateName is not
defined
propagateResult
Result of objects propagation. During objects propagation, native MetalLB
objects of the target cluster are created and updated according to the
description of the objects present in the status.objects field.
This field contains the following information:
message
Text message that describes the result of the last attempt of objects
propagation. Contains an error message if the last attempt was
unsuccessful.
success
Result of the last attempt of objects propagation. Boolean.
time
Timestamp of the last attempt of objects propagation. For example,
2023-07-04T00:30:36Z.
If the objects propagation was successful, the MetalLB objects of the
target cluster match the ones present in the status.objects field.
updateResult
Status of the MetalLB objects update. Contains the same subfields as
propagateResult described above.
During objects update, the status.objects contents are rendered as
described in the objects field definition above.
If the objects update was successful, the MetalLB objects description
present in status.objects is rendered successfully and up to date.
This description is used to update MetalLB objects in the target cluster.
If the objects update was not successful, MetalLB objects will not be
propagated to the target cluster.
After the object is created and processed by the MetalLB Controller, the
status field is added. For example:
status:
  objects:
    ipAddressPools:
      - name: services
        spec:
          addresses:
            - 10.100.100.151-10.100.100.170
          autoAssign: true
          avoidBuggyIPs: false
    l2Advertisements:
      - name: services
        spec:
          ipAddressPools:
            - services
  propagateResult:
    message: Objects were successfully updated
    success: true
    time: "2023-07-04T14:31:40Z"
  updateResult:
    message: Objects were successfully read from MetalLB configuration specification
    success: true
    time: "2023-07-04T14:31:39Z"
Example of native MetalLB objects to be created in the
managed-ns/managed-cluster cluster during deployment:
The MetalLBConfigTemplate object support status across Container Cloud and Cluster releases:
2.28.0 (Cluster releases 17.3.0 and 16.3.0) - Unsupported for any cluster type; the Admission Controller blocks creation of the object
2.27.0 (Cluster releases 17.2.0 and 16.2.0) - Deprecated for any cluster type
2.25.0 (Cluster releases 17.0.0 and 16.0.0) - Generally available for managed clusters
2.24.2 (Cluster releases 15.0.1, 14.0.1, 14.0.0) - Technology Preview for managed clusters
2.24.0 (Cluster release 14.0.0) - Generally available for management clusters
This section describes the MetalLBConfigTemplate custom resource used in
the Container Cloud API that contains the template for MetalLB configuration
for a particular cluster.
Note
The MetalLBConfigTemplate object applies to bare metal
deployments only.
Before Cluster releases 17.2.0 and 16.2.0, MetalLBConfigTemplate is the
default configuration method for MetalLB on bare metal deployments. This method
allows the use of Subnet objects to define MetalLB IP address pools
the same way as they were used before introducing the MetalLBConfig and
MetalLBConfigTemplate objects. Since Cluster releases 17.2.0 and 16.2.0,
use the MetalLBConfig object for this purpose instead.
For demonstration purposes, the Container Cloud MetalLBConfigTemplate
custom resource description is split into the following major sections:
The Container Cloud MetalLBConfigTemplate CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1.
kind
Object type that is MetalLBConfigTemplate.
The metadata object field of the MetalLBConfigTemplate resource
contains the following fields:
name
Name of the MetalLBConfigTemplate object.
namespace
Project in which the object was created. Must match the project name of
the target cluster.
labels
Key-value pairs attached to the object. Mandatory labels:
kaas.mirantis.com/provider
Provider type that is baremetal.
kaas.mirantis.com/region
Region name that matches the region name of the target cluster.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting from these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
cluster.sigs.k8s.io/cluster-name
Name of the cluster that the MetalLB configuration applies to.
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
The spec field of the MetalLBConfigTemplate object contains the
templates of MetalLB configuration objects and optional auxiliary variables.
Container Cloud uses these templates to create MetalLB configuration objects
during the cluster deployment.
The spec field contains the following optional fields:
machines
Key-value dictionary to select IpamHost objects corresponding to nodes
of the target cluster. Keys contain machine aliases used in
spec.templates. Values contain the NameLabelsSelector items that
select IpamHost by name or by labels. For example:
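The selector format below is a hedged sketch that assumes a NameLabelsSelector item accepts either a name or a labels dictionary; the aliases, host names, and labels are hypothetical:
machines:
  master-0:
    name: test-cluster-master-0        # select the IpamHost by name
  master-1:
    labels:                            # or select the IpamHost by labels
      kaas.mirantis.com/baremetalhost-id: test-cluster-master-1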
This field is required if some IP addresses of nodes are used in
spec.templates.
vars
Key-value dictionary of arbitrary user-defined variables that are used in
spec.templates. For example:
vars:
  localPort: 4561
templates
List of templates for MetalLB configuration objects that are used to
render MetalLB configuration definitions and create MetalLB objects in
the target cluster. Contains the following optional fields:
bfdProfiles
Template for the MetalLBBFDProfile object list to create MetalLB
BFDProfile objects.
bgpAdvertisements
Template for the MetalLBBGPAdvertisement object list to create
MetalLB BGPAdvertisement objects.
bgpPeers
Template for the MetalLBBGPPeer object list to create MetalLB
BGPPeer objects.
communities
Template for the MetalLBCommunity object list to create MetalLB
Community objects.
ipAddressPools
Template for the MetalLBIPAddressPool object list to create MetalLB
IPAddressPool objects.
l2Advertisements
Template for the MetalLBL2Advertisement object list to create
MetalLB L2Advertisement objects.
Each template is a string and has the same structure as the list of the
corresponding objects described in MetalLBConfig spec such as
MetalLBIPAddressPool and MetalLBL2Advertisement, but
you can use additional functions and variables inside these templates.
Note
When using the MetalLBConfigTemplate object, you can define
MetalLB IP address pools using both Subnet objects and
spec.ipAddressPools templates. IP address pools rendered from these
sources will be concatenated and then written to
status.renderedObjects.ipAddressPools.
You can use the following functions in templates:
ipAddressPoolNames
Selects all IP address pools of the given announcement type found for
the target cluster. Possible types: layer2, bgp, any.
The any type includes all IP address pools found for the target
cluster. The announcement types of IP address pools are verified using
the metallb/address-pool-protocol labels of the corresponding
Subnet object.
The ipAddressPools templates have no types as native MetalLB
IPAddressPool objects have no announcement type.
The l2Advertisements template can refer to IP address pools of the
layer2 or any type.
The bgpAdvertisements template can refer to IP address pools of the
bgp or any type.
IP address pools are searched in the templates.ipAddressPools field
and in the Subnet objects of the target cluster. For example:
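The following is a hedged sketch of how ipAddressPoolNames might be invoked inside a template; the advertisement name, the exact call site, and the rendered list format are assumptions:
l2Advertisements: |
  - name: l2services
    spec:
      ipAddressPools: {{ ipAddressPoolNames "layer2" }}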
The l2Advertisements object allows defining interfaces to optimize
the announcement. When you use the interfaces selector, LB addresses
are announced only on selected host interfaces. Mirantis recommends this
configuration if nodes use separate host networks for different types
of traffic. Such a configuration reduces announcement traffic on other
interfaces and networks and limits the chances of reaching LB addresses
of services from irrelevant interfaces and networks.
Configuration example:
l2Advertisements: |
  - name: management-lcm
    spec:
      ipAddressPools:
        - default
      interfaces:
        # LB addresses from the "default" address pool will be announced
        # on the "k8s-lcm" interface
        - k8s-lcm
Caution
Interface names in the interfaces list must match those
on the corresponding nodes.
The status field describes the actual state of the object.
It contains the following fields:
renderedObjects
MetalLB objects description rendered from spec.templates in the same
format as they are defined in the MetalLBConfig spec field.
All underlying objects are optional. The following objects can be present:
bfdProfiles, bgpAdvertisements, bgpPeers, communities,
ipAddressPools, l2Advertisements.
stateSince 2.23.0
Message that reflects the current status of the resource.
The list of possible values includes the following:
OK - object is operational.
ERR - object is non-operational. This status has a detailed
description in the messages list.
TERM - object was deleted and is terminating.
messagesSince 2.23.0
List of error or warning messages if the object state is ERR.
objCreated
Date, time, and IPAM version of the resource creation.
objStatusUpdated
Date, time, and IPAM version of the last update of the status
field in the resource.
objUpdated
Date, time, and IPAM version of the last resource update.
apiVersion: ipam.mirantis.com/v1alpha1
kind: MetalLBConfigTemplate
metadata:
  labels:
    cluster.sigs.k8s.io/cluster-name: kaas-mgmt
    kaas.mirantis.com/provider: baremetal
  name: mgmt-metallb-template
  namespace: default
spec:
  templates:
    l2Advertisements: |
      - name: management-lcm
        spec:
          ipAddressPools:
            - default
          interfaces:
            # IPs from the "default" address pool will be announced on the "k8s-lcm" interface
            - k8s-lcm
      - name: provision-pxe
        spec:
          ipAddressPools:
            - services-pxe
          interfaces:
            # IPs from the "services-pxe" address pool will be announced on the "k8s-pxe" interface
            - k8s-pxe
Configuration example for Subnet of the default pool
After the objects are created and processed by the kaas-ipam Controller,
the status field displays for MetalLBConfigTemplate:
Configuration example of the status field for
MetalLBConfigTemplate
status:
  checksums:
    annotations: sha256:38e0b9de817f645c4bec37c0d4a3e58baecccb040f5718dc069a72c7385a0bed
    labels: sha256:380337902278e8985e816978c349910a4f7ed98169c361eb8777411ac427e6ba
    spec: sha256:0860790fc94217598e0775ab2961a02acc4fba820ae17c737b94bb5d55390dbe
  messages:
    - Template for BFDProfiles is undefined
    - Template for BGPAdvertisements is undefined
    - Template for BGPPeers is undefined
    - Template for Communities is undefined
  objCreated: 2023-06-30T21:22:56.00000Z by v6.5.999-20230627-072014-ba8d918
  objStatusUpdated: 2023-07-04T00:30:35.82023Z by v6.5.999-20230627-072014-ba8d918
  objUpdated: 2023-06-30T22:10:51.73822Z by v6.5.999-20230627-072014-ba8d918
  renderedObjects:
    ipAddressPools:
      - name: default
        spec:
          addresses:
            - 10.0.34.101-10.0.34.120
          autoAssign: true
      - name: services-pxe
        spec:
          addresses:
            - 10.0.24.221-10.0.24.230
          autoAssign: false
    l2Advertisements:
      - name: management-lcm
        spec:
          interfaces:
            - k8s-lcm
          ipAddressPools:
            - default
      - name: provision-pxe
        spec:
          interfaces:
            - k8s-pxe
          ipAddressPools:
            - services-pxe
  state: OK
The following example illustrates contents of the status field that
displays for MetalLBConfig after the objects are processed
by the MetalLB Controller.
Configuration example of the status field for
MetalLBConfig
status:
  objects:
    ipAddressPools:
      - name: default
        spec:
          addresses:
            - 10.0.34.101-10.0.34.120
          autoAssign: true
          avoidBuggyIPs: false
      - name: services-pxe
        spec:
          addresses:
            - 10.0.24.221-10.0.24.230
          autoAssign: false
          avoidBuggyIPs: false
    l2Advertisements:
      - name: management-lcm
        spec:
          interfaces:
            - k8s-lcm
          ipAddressPools:
            - default
      - name: provision-pxe
        spec:
          interfaces:
            - k8s-pxe
          ipAddressPools:
            - services-pxe
  propagateResult:
    message: Objects were successfully updated
    success: true
    time: "2023-07-05T03:10:23Z"
  updateResult:
    message: Objects were successfully read from MetalLB configuration specification
    success: true
    time: "2023-07-05T03:10:23Z"
Using the objects described above, several native MetalLB objects are created
in the kaas-mgmt cluster during deployment.
Configuration example of MetalLB objects created during cluster
deployment
In the following configuration example, MetalLB is configured to use BGP for
announcement of external addresses of Kubernetes load-balanced services
for the managed cluster from master nodes. Each master node is located in
its own rack without the L2 layer extension between racks.
This section contains only examples of the objects required to illustrate
the MetalLB configuration. For Rack, MultiRackCluster, L2Template
and other objects required to configure BGP announcement of the cluster API
load balancer address for this scenario, refer to Multiple rack configuration example.
apiVersion: ipam.mirantis.com/v1alpha1
kind: MetalLBConfigTemplate
metadata:
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
  name: test-cluster-metallb-bgp-template
  namespace: managed-ns
spec:
  templates:
    bgpAdvertisements: |
      - name: services
        spec:
          ipAddressPools:
            - services
          peers:             # "peers" can be omitted if all defined peers
            - svc-peer-rack1 # are used in a particular "bgpAdvertisement"
            - svc-peer-rack2
            - svc-peer-rack3
    bgpPeers: |
      - name: svc-peer-rack1
        spec:
          peerAddress: 10.77.41.1 # peer address is in the external subnet #1
          peerASN: 65100
          myASN: 65101
          nodeSelectors:
            - matchLabels:
                rack-id: rack-master-1 # references the node corresponding
                                       # to the "test-cluster-master-1" Machine
      - name: svc-peer-rack2
        spec:
          peerAddress: 10.77.42.1 # peer address is in the external subnet #2
          peerASN: 65100
          myASN: 65101
          nodeSelectors:
            - matchLabels:
                rack-id: rack-master-2 # references the node corresponding
                                       # to the "test-cluster-master-2" Machine
      - name: svc-peer-rack3
        spec:
          peerAddress: 10.77.43.1 # peer address is in the external subnet #3
          peerASN: 65100
          myASN: 65101
          nodeSelectors:
            - matchLabels:
                rack-id: rack-master-3 # references the node corresponding
                                       # to the "test-cluster-master-3" Machine
The following objects illustrate configuration for three subnets that
are used to configure external network in three racks. Each master node uses
its own external L2/L3 network segment.
Configuration example for the Subnet ext-rack-control-1
Rack objects and ipam/RackRef labels in Machine objects are not
required for MetalLB configuration. In this example, however, Rack objects
are implied to be used for configuring BGP announcement of the cluster
API load balancer address, although they are not shown here.
Machine objects select different L2 templates because each master node uses
different L2/L3 network segments for LCM, external, and other networks.
Configuration example for the Machine test-cluster-master-1
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: test-cluster-master-1
  namespace: managed-ns
  annotations:
    metal3.io/BareMetalHost: managed-ns/test-cluster-master-1
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    cluster.sigs.k8s.io/control-plane: controlplane
    hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
    ipam/RackRef: rack-master-1
    kaas.mirantis.com/provider: baremetal
spec:
  providerSpec:
    value:
      kind: BareMetalMachineProviderSpec
      apiVersion: baremetal.k8s.io/v1alpha1
      hostSelector:
        matchLabels:
          kaas.mirantis.com/baremetalhost-id: test-cluster-master-1
      l2TemplateSelector:
        name: test-cluster-master-1
      nodeLabels:
        - key: rack-id          # it is used in "nodeSelectors"
          value: rack-master-1  # of "bgpPeer" MetalLB objects
Configuration example for the Machine test-cluster-master-2
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: test-cluster-master-2
  namespace: managed-ns
  annotations:
    metal3.io/BareMetalHost: managed-ns/test-cluster-master-2
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    cluster.sigs.k8s.io/control-plane: controlplane
    hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
    ipam/RackRef: rack-master-2
    kaas.mirantis.com/provider: baremetal
spec:
  providerSpec:
    value:
      kind: BareMetalMachineProviderSpec
      apiVersion: baremetal.k8s.io/v1alpha1
      hostSelector:
        matchLabels:
          kaas.mirantis.com/baremetalhost-id: test-cluster-master-2
      l2TemplateSelector:
        name: test-cluster-master-2
      nodeLabels:
        - key: rack-id          # it is used in "nodeSelectors"
          value: rack-master-2  # of "bgpPeer" MetalLB objects
Configuration example for the Machine test-cluster-master-3
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: test-cluster-master-3
  namespace: managed-ns
  annotations:
    metal3.io/BareMetalHost: managed-ns/test-cluster-master-3
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    cluster.sigs.k8s.io/control-plane: controlplane
    hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
    ipam/RackRef: rack-master-3
    kaas.mirantis.com/provider: baremetal
spec:
  providerSpec:
    value:
      kind: BareMetalMachineProviderSpec
      apiVersion: baremetal.k8s.io/v1alpha1
      hostSelector:
        matchLabels:
          kaas.mirantis.com/baremetalhost-id: test-cluster-master-3
      l2TemplateSelector:
        name: test-cluster-master-3
      nodeLabels:
        - key: rack-id          # it is used in "nodeSelectors"
          value: rack-master-3  # of "bgpPeer" MetalLB objects
This section describes the MultiRackCluster resource used in the
Container Cloud API.
When you create a bare metal managed cluster with a multi-rack topology,
where Kubernetes masters are distributed across multiple racks
without L2 layer extension between them, the MultiRackCluster resource
allows you to set cluster-wide parameters for configuration of the
BGP announcement of the cluster API load balancer address.
In this scenario, the MultiRackCluster object must be bound to the
Cluster object.
The MultiRackCluster object is generally used for a particular cluster
in conjunction with Rack objects described in Rack.
For demonstration purposes, the Container Cloud MultiRackCluster
custom resource (CR) description is split into the following major sections:
The Container Cloud MultiRackCluster CR metadata contains the following
fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1.
kind
Object type that is MultiRackCluster.
metadata
The metadata field contains the following subfields:
name
Name of the MultiRackCluster object.
namespace
Container Cloud project (Kubernetes namespace) in which the object was
created.
labels
Key-value pairs that are attached to the object:
cluster.sigs.k8s.io/cluster-name
Cluster object name that this MultiRackCluster object is
applied to. To enable the use of BGP announcement for the cluster API
LB address, set the useBGPAnnouncement parameter in the
Cluster object to true:
spec:
  providerSpec:
    value:
      useBGPAnnouncement: true
kaas.mirantis.com/provider
Provider name that is baremetal.
kaas.mirantis.com/region
Region name.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting from these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
The MultiRackCluster metadata configuration example:
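For instance, based on the full configuration examples later in this section:
apiVersion: ipam.mirantis.com/v1alpha1
kind: MultiRackCluster
metadata:
  name: multirack-test-cluster
  namespace: managed-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal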
The spec field of the MultiRackCluster resource describes the desired
state of the object. It contains the following fields:
bgpdConfigFileName
Name of the configuration file for the BGP daemon (bird). Recommended value
is bird.conf.
bgpdConfigFilePath
Path to the directory where the configuration file for the BGP daemon
(bird) is added. The recommended value is /etc/bird.
bgpdConfigTemplate
Optional. Configuration text file template for the BGP daemon (bird)
configuration file where you can use go template constructs and
the following variables:
RouterID, LocalIP
Local IP on the given network, which is a key in the
Rack.spec.peeringMap dictionary, for a given node. You can use
it, for example, in the router id {{$.RouterID}}; instruction.
LocalASN
Local AS number.
NeighborASN
Neighbor AS number.
NeighborIP
Neighbor IP address. Its values are taken from Rack.spec.peeringMap.
It can be used only inside the range iteration through the
Neighbors list.
Neighbors
List of peers in the given network and node. It can be iterated
through the range statement in the go template.
Values for LocalASN and NeighborASN are taken from:
MultiRackCluster.spec.defaultPeer - if used outside the range iteration
through the Neighbors list.
The corresponding values of Rack.spec.peeringMap - if used inside the
range iteration through the Neighbors list.
This template can be overridden using the Rack objects. For details,
see Rack spec.
defaultPeer
Configuration parameters for the default BGP peer. These parameters will
be used in rendering of the configuration file for BGP daemon from
the template if they are not overridden for a particular rack or network
using Rack objects. For details, see Rack spec.
localASN
Mandatory. Local AS number.
neighborASN
Mandatory. Neighbor AS number.
neighborIP
Reserved. Neighbor IP address. Leave it as an empty string.
password
Optional. Neighbor password. If not set, you can hardcode it in
bgpdConfigTemplate. It is required for MD5 authentication between
BGP peers.
Configuration examples:
Since Cluster releases 17.1.0 and 16.1.0 for bird v2.x
spec:
  bgpdConfigFileName: bird.conf
  bgpdConfigFilePath: /etc/bird
  bgpdConfigTemplate: |
    protocol device {
    }
    #
    protocol direct {
      interface "lo";
      ipv4;
    }
    #
    protocol kernel {
      ipv4 {
        export all;
      };
    }
    #
    {{range $i, $peer := .Neighbors}}
    protocol bgp 'bgp_peer_{{$i}}' {
      local port 1179 as {{.LocalASN}};
      neighbor {{.NeighborIP}} as {{.NeighborASN}};
      ipv4 {
        import none;
        export filter {
          if dest = RTD_UNREACHABLE then {
            reject;
          }
          accept;
        };
      };
    }
    {{end}}
  defaultPeer:
    localASN: 65101
    neighborASN: 65100
    neighborIP: ""
Before Cluster releases 17.1.0 and 16.1.0 for bird v1.x
spec:
  bgpdConfigFileName: bird.conf
  bgpdConfigFilePath: /etc/bird
  bgpdConfigTemplate: |
    listen bgp port 1179;
    protocol device {
    }
    #
    protocol direct {
      interface "lo";
    }
    #
    protocol kernel {
      export all;
    }
    #
    {{range $i, $peer := .Neighbors}}
    protocol bgp 'bgp_peer_{{$i}}' {
      local as {{.LocalASN}};
      neighbor {{.NeighborIP}} as {{.NeighborASN}};
      import all;
      export filter {
        if dest = RTD_UNREACHABLE then {
          reject;
        }
        accept;
      };
    }
    {{end}}
  defaultPeer:
    localASN: 65101
    neighborASN: 65100
    neighborIP: ""
The status field of the MultiRackCluster resource reflects the actual
state of the MultiRackCluster object and contains the following fields:
stateSince 2.23.0
Message that reflects the current status of the resource.
The list of possible values includes the following:
OK - object is operational.
ERR - object is non-operational. This status has a detailed
description in the messages list.
TERM - object was deleted and is terminating.
messagesSince 2.23.0
List of error or warning messages if the object state is ERR.
objCreated
Date, time, and IPAM version of the resource creation.
objStatusUpdated
Date, time, and IPAM version of the last update of the status
field in the resource.
objUpdated
Date, time, and IPAM version of the last resource update.
Configuration example:
status:
  checksums:
    annotations: sha256:38e0b9de817f645c4bec37c0d4a3e58baecccb040f5718dc069a72c7385a0bed
    labels: sha256:d8f8eacf487d57c22ca0ace29bd156c66941a373b5e707d671dc151959a64ce7
    spec: sha256:66b5d28215bdd36723fe6230359977fbede828906c6ae96b5129a972f1fa51e9
  objCreated: 2023-08-11T12:25:21.00000Z by v6.5.999-20230810-155553-2497818
  objStatusUpdated: 2023-08-11T12:32:58.11966Z by v6.5.999-20230810-155553-2497818
  objUpdated: 2023-08-11T12:32:57.32036Z by v6.5.999-20230810-155553-2497818
  state: OK
The following configuration examples of several bare metal objects illustrate
how to configure BGP announcement of the load balancer address used to expose
the cluster API.
In the following example, all master nodes are in a single rack. One Rack
object is required in this case for master nodes. Some worker nodes can
coexist in the same rack with master nodes or occupy separate racks. It is
implied that the useBGPAnnouncement parameter is set to true in the
corresponding Cluster object.
Configuration example for MultiRackCluster
Since Cluster releases 17.1.0 and 16.1.0 for bird v2.x:
apiVersion:ipam.mirantis.com/v1alpha1kind:MultiRackClustermetadata:name:multirack-test-clusternamespace:managed-nslabels:cluster.sigs.k8s.io/cluster-name:test-clusterkaas.mirantis.com/provider:baremetalkaas.mirantis.com/region:region-onespec:bgpdConfigFileName:bird.confbgpdConfigFilePath:/etc/birdbgpdConfigTemplate:|protocol device {}#protocol direct {interface "lo";ipv4;}#protocol kernel {ipv4 {export all;};}#{{range $i, $peer := .Neighbors}}protocol bgp 'bgp_peer_{{$i}}' {local port 1179 as {{.LocalASN}};neighbor {{.NeighborIP}} as {{.NeighborASN}};ipv4 {import none;export filter {if dest = RTD_UNREACHABLE then {reject;}accept;};};}{{end}}defaultPeer:localASN:65101neighborASN:65100neighborIP:""
Before Cluster releases 17.1.0 and 16.1.0 for bird v1.x:
apiVersion:ipam.mirantis.com/v1alpha1kind:MultiRackClustermetadata:name:multirack-test-clusternamespace:managed-nslabels:cluster.sigs.k8s.io/cluster-name:test-clusterkaas.mirantis.com/provider:baremetalspec:bgpdConfigFileName:bird.confbgpdConfigFilePath:/etc/birdbgpdConfigTemplate:|listen bgp port 1179;protocol device {}#protocol direct {interface "lo";}#protocol kernel {export all;}#{{range $i, $peer := .Neighbors}}protocol bgp 'bgp_peer_{{$i}}' {local as {{.LocalASN}};neighbor {{.NeighborIP}} as {{.NeighborASN}};import all;export filter {if dest = RTD_UNREACHABLE then {reject;}accept;};}{{end}}defaultPeer:localASN:65101neighborASN:65100neighborIP:""
Configuration example for Rack
apiVersion: ipam.mirantis.com/v1alpha1
kind: Rack
metadata:
  name: rack-master
  namespace: managed-ns
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
spec:
  peeringMap:
    lcm-rack-control:
      peers:
        - neighborIP: 10.77.31.1  # "localASN" and "neighborASN" are taken from
        - neighborIP: 10.77.37.1  # "MultiRackCluster.spec.defaultPeer"
                                  # if not set here
Configuration example for Machine
# "Machine" templates for "test-cluster-master-2" and "test-cluster-master-3"# differ only in BMH selectors in this example.apiVersion:cluster.k8s.io/v1alpha1kind:Machinemetadata:name:test-cluster-master-1namespace:managed-nsannotations:metal3.io/BareMetalHost:managed-ns/test-cluster-master-1labels:cluster.sigs.k8s.io/cluster-name:test-clustercluster.sigs.k8s.io/control-plane:controlplanehostlabel.bm.kaas.mirantis.com/controlplane:controlplaneipam/RackRef:rack-master# used to connect "IpamHost" to "Rack" objects, so that# BGP parameters can be obtained from "Rack" to# render BGP configuration for the given "IpamHost" objectkaas.mirantis.com/provider:baremetalspec:providerSpec:value:kind:BareMetalMachineProviderSpecapiVersion:baremetal.k8s.io/v1alpha1hostSelector:matchLabels:kaas.mirantis.com/baremetalhost-id:test-cluster-master-1l2TemplateSelector:name:test-cluster-master
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
Configuration example for L2Template
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
  name: test-cluster-master
  namespace: managed-ns
spec:
  ...
  l3Layout:
    - subnetName: lcm-rack-control  # this network is referenced in "rack-master" Rack
      scope: namespace
  ...
  npTemplate: |
    ...
    ethernets:
      lo:
        addresses:
          - {{ cluster_api_lb_ip }}  # function for cluster API LB IP
        dhcp4: false
        dhcp6: false
    ...
After the objects are created and nodes are provisioned, the IpamHost
objects will have BGP daemon configuration files in their status fields.
For example:
You can decode /etc/bird/bird.conf contents and verify the configuration:
echo"<<base64-string>>"|base64-d
The following system output applies to the above configuration examples:
Configuration example for the decoded bird.conf
Since Cluster releases 17.1.0 and 16.1.0 for bird v2.x:
protocol device {}#protocol direct {interface "lo";ipv4;}#protocol kernel {ipv4 {export all;};}#protocol bgp 'bgp_peer_0' {local port 1179 as 65101;neighbor 10.77.31.1 as 65100;ipv4 {import none;export filter {if dest = RTD_UNREACHABLE then {reject;}accept;};};}protocol bgp 'bgp_peer_1' {local port 1179 as 65101;neighbor 10.77.37.1 as 65100;ipv4 {import none;export filter {if dest = RTD_UNREACHABLE then {reject;}accept;};};}
Before Cluster releases 17.1.0 and 16.1.0 for bird v1.x:
listen bgp port 1179;protocol device {}#protocol direct {interface "lo";}#protocol kernel {export all;}#protocol bgp 'bgp_peer_0' {local as 65101;neighbor 10.77.31.1 as 65100;import all;export filter {if dest = RTD_UNREACHABLE then {reject;}accept;};}protocol bgp 'bgp_peer_1' {local as 65101;neighbor 10.77.37.1 as 65100;import all;export filter {if dest = RTD_UNREACHABLE then {reject;}accept;};}
BGP daemon configuration files are copied from IpamHost.status
to the corresponding LCMMachine object the same way as it is done for
netplan configuration files. Then, the configuration files are written to the
corresponding node by the LCM-Agent.
In the following configuration example, each master node is located in its own
rack. Three Rack objects are required in this case for master nodes.
Some worker nodes can coexist in the same racks with master nodes or occupy
separate racks.
Only objects that are required to show configuration for BGP announcement
of the cluster API load balancer address are provided here.
It is implied that the useBGPAnnouncement parameter is set to true
in the corresponding Cluster object.
Configuration example for MultiRackCluster
Since Cluster releases 17.1.0 and 16.1.0 for bird v2.x:
# It is the same object as in the single rack example.apiVersion:ipam.mirantis.com/v1alpha1kind:MultiRackClustermetadata:name:multirack-test-clusternamespace:managed-nslabels:cluster.sigs.k8s.io/cluster-name:test-clusterkaas.mirantis.com/provider:baremetalkaas.mirantis.com/region:region-onespec:bgpdConfigFileName:bird.confbgpdConfigFilePath:/etc/birdbgpdConfigTemplate:|protocol device {}#protocol direct {interface "lo";ipv4;}#protocol kernel {ipv4 {export all;};}#{{range $i, $peer := .Neighbors}}protocol bgp 'bgp_peer_{{$i}}' {local port 1179 as {{.LocalASN}};neighbor {{.NeighborIP}} as {{.NeighborASN}};ipv4 {import none;export filter {if dest = RTD_UNREACHABLE then {reject;}accept;};};}{{end}}defaultPeer:localASN:65101neighborASN:65100neighborIP:""
Before Cluster releases 17.1.0 and 16.1.0 for bird v1.x:
# It is the same object as in the single rack example.apiVersion:ipam.mirantis.com/v1alpha1kind:MultiRackClustermetadata:name:multirack-test-clusternamespace:managed-nslabels:cluster.sigs.k8s.io/cluster-name:test-clusterkaas.mirantis.com/provider:baremetalspec:bgpdConfigFileName:bird.confbgpdConfigFilePath:/etc/birdbgpdConfigTemplate:|listen bgp port 1179;protocol device {}#protocol direct {interface "lo";}#protocol kernel {export all;}#{{range $i, $peer := .Neighbors}}protocol bgp 'bgp_peer_{{$i}}' {local as {{.LocalASN}};neighbor {{.NeighborIP}} as {{.NeighborASN}};import all;export filter {if dest = RTD_UNREACHABLE then {reject;}accept;};}{{end}}defaultPeer:localASN:65101neighborASN:65100neighborIP:""
The following Rack objects differ in neighbor IP addresses and in the
network (L3 subnet) used for BGP connection to announce the cluster API LB IP
and for cluster API traffic.
Configuration example for Rack 1
apiVersion:ipam.mirantis.com/v1alpha1kind:Rackmetadata:name:rack-master-1namespace:managed-nslabels:cluster.sigs.k8s.io/cluster-name:test-clusterkaas.mirantis.com/provider:baremetalspec:peeringMap:lcm-rack-control-1:peers:-neighborIP:10.77.31.2# "localASN" and "neighborASN" are taken from-neighborIP:10.77.31.3# "MultiRackCluster.spec.defaultPeer" if# not set here
Configuration example for Rack 2
apiVersion:ipam.mirantis.com/v1alpha1kind:Rackmetadata:name:rack-master-2namespace:managed-nslabels:cluster.sigs.k8s.io/cluster-name:test-clusterkaas.mirantis.com/provider:baremetalspec:peeringMap:lcm-rack-control-2:peers:-neighborIP:10.77.32.2# "localASN" and "neighborASN" are taken from-neighborIP:10.77.32.3# "MultiRackCluster.spec.defaultPeer" if# not set here
Configuration example for Rack 3
apiVersion:ipam.mirantis.com/v1alpha1kind:Rackmetadata:name:rack-master-3namespace:managed-nslabels:cluster.sigs.k8s.io/cluster-name:test-clusterkaas.mirantis.com/provider:baremetalspec:peeringMap:lcm-rack-control-3:peers:-neighborIP:10.77.33.2# "localASN" and "neighborASN" are taken from-neighborIP:10.77.33.3# "MultiRackCluster.spec.defaultPeer" if# not set here
As compared to single rack examples, the following Machine objects differ
in:
BMH selectors
L2Template selectors
Rack selectors (the ipam/RackRef label)
The rack-id node labels
The labels on master nodes are required for MetalLB node selectors if
MetalLB is used to announce LB IP addresses on master nodes. In this
scenario, the L2 (ARP) announcement mode cannot be used for MetalLB because
master nodes are in different L2 segments.
So, the BGP announcement mode must be used for MetalLB. Node selectors
are required to properly configure BGP connections from each master node.
Note
Before update of the management cluster to Container Cloud 2.29.0
(Cluster release 16.4.0), instead of BareMetalHostInventory, use the
BareMetalHost object. For details, see BareMetalHost.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are allowed to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
Configuration example for Machine 1
apiVersion:cluster.k8s.io/v1alpha1kind:Machinemetadata:name:test-cluster-master-1namespace:managed-nsannotations:metal3.io/BareMetalHost:managed-ns/test-cluster-master-1labels:cluster.sigs.k8s.io/cluster-name:test-clustercluster.sigs.k8s.io/control-plane:controlplanehostlabel.bm.kaas.mirantis.com/controlplane:controlplaneipam/RackRef:rack-master-1kaas.mirantis.com/provider:baremetalspec:providerSpec:value:kind:BareMetalMachineProviderSpecapiVersion:baremetal.k8s.io/v1alpha1hostSelector:matchLabels:kaas.mirantis.com/baremetalhost-id:test-cluster-master-1l2TemplateSelector:name:test-cluster-master-1nodeLabels:# not used for BGP announcement of the-key:rack-id# cluster API LB IP but can be used forvalue:rack-master-1# MetalLB if "nodeSelectors" are required
Configuration example for Machine 2
apiVersion:cluster.k8s.io/v1alpha1kind:Machinemetadata:name:test-cluster-master-2namespace:managed-nsannotations:metal3.io/BareMetalHost:managed-ns/test-cluster-master-2labels:cluster.sigs.k8s.io/cluster-name:test-clustercluster.sigs.k8s.io/control-plane:controlplanehostlabel.bm.kaas.mirantis.com/controlplane:controlplaneipam/RackRef:rack-master-2kaas.mirantis.com/provider:baremetalspec:providerSpec:value:kind:BareMetalMachineProviderSpecapiVersion:baremetal.k8s.io/v1alpha1hostSelector:matchLabels:kaas.mirantis.com/baremetalhost-id:test-cluster-master-2l2TemplateSelector:name:test-cluster-master-2nodeLabels:# not used for BGP announcement of the-key:rack-id# cluster API LB IP but can be used forvalue:rack-master-2# MetalLB if "nodeSelectors" are required
Configuration example for Machine 3
apiVersion:cluster.k8s.io/v1alpha1kind:Machinemetadata:name:test-cluster-master-3namespace:managed-nsannotations:metal3.io/BareMetalHost:managed-ns/test-cluster-master-3labels:cluster.sigs.k8s.io/cluster-name:test-clustercluster.sigs.k8s.io/control-plane:controlplanehostlabel.bm.kaas.mirantis.com/controlplane:controlplaneipam/RackRef:rack-master-3kaas.mirantis.com/provider:baremetalspec:providerSpec:value:kind:BareMetalMachineProviderSpecapiVersion:baremetal.k8s.io/v1alpha1hostSelector:matchLabels:kaas.mirantis.com/baremetalhost-id:test-cluster-master-3l2TemplateSelector:name:test-cluster-master-3nodeLabels:# optional. not used for BGP announcement of-key:rack-id# the cluster API LB IP but can be used forvalue:rack-master-3# MetalLB if "nodeSelectors" are required
Configuration example for Subnet defining the cluster API
LB IP address
The following L2Template objects differ in LCM and external subnets that
each master node uses.
Configuration example for L2Template 1
apiVersion:ipam.mirantis.com/v1alpha1kind:L2Templatemetadata:labels:cluster.sigs.k8s.io/cluster-name:test-clusterkaas.mirantis.com/provider:baremetalname:test-cluster-master-1namespace:managed-nsspec:...l3Layout:-subnetName:lcm-rack-control-1# this network is referencedscope:namespace# in the "rack-master-1" Rack-subnetName:ext-rack-control-1# this optional network is used forscope:namespace# Kubernetes services traffic and# MetalLB BGP connections...npTemplate:|...ethernets:lo:addresses:- {{ cluster_api_lb_ip }} # function for cluster API LB IPdhcp4: falsedhcp6: false...
Configuration example for L2Template 2
apiVersion:ipam.mirantis.com/v1alpha1kind:L2Templatemetadata:labels:cluster.sigs.k8s.io/cluster-name:test-clusterkaas.mirantis.com/provider:baremetalname:test-cluster-master-2namespace:managed-nsspec:...l3Layout:-subnetName:lcm-rack-control-2# this network is referencedscope:namespace# in "rack-master-2" Rack-subnetName:ext-rack-control-2# this network is used for Kubernetes servicesscope:namespace# traffic and MetalLB BGP connections...npTemplate:|...ethernets:lo:addresses:- {{ cluster_api_lb_ip }} # function for cluster API LB IPdhcp4: falsedhcp6: false...
Configuration example for L2Template 3
apiVersion:ipam.mirantis.com/v1alpha1kind:L2Templatemetadata:labels:cluster.sigs.k8s.io/cluster-name:test-clusterkaas.mirantis.com/provider:baremetalname:test-cluster-master-3namespace:managed-nsspec:...l3Layout:-subnetName:lcm-rack-control-3# this network is referencedscope:namespace# in "rack-master-3" Rack-subnetName:ext-rack-control-3# this network is used for Kubernetes servicesscope:namespace# traffic and MetalLB BGP connections...npTemplate:|...ethernets:lo:addresses:- {{ cluster_api_lb_ip }} # function for cluster API LB IPdhcp4: falsedhcp6: false...
The following MetalLBConfig example illustrates how node labels
are used in nodeSelectors of bgpPeers. Each of bgpPeers
corresponds to one of master nodes.
Configuration example for MetalLBConfig
apiVersion: ipam.mirantis.com/v1alpha1
kind: MetalLBConfig
metadata:
  labels:
    cluster.sigs.k8s.io/cluster-name: test-cluster
    kaas.mirantis.com/provider: baremetal
  name: test-cluster-metallb-config
  namespace: managed-ns
spec:
  ...
  bgpPeers:
    - name: svc-peer-rack1
      spec:
        holdTime: 0s
        keepaliveTime: 0s
        peerAddress: 10.77.41.1  # peer address is in external subnet
                                 # instead of LCM subnet used for BGP
                                 # connection to announce cluster API LB IP
        peerASN: 65100  # the same as for BGP connection used to announce
                        # cluster API LB IP
        myASN: 65101    # the same as for BGP connection used to announce
                        # cluster API LB IP
        nodeSelectors:
          - matchLabels:
              rack-id: rack-master-1  # references the node corresponding
                                      # to "test-cluster-master-1" Machine
    - name: svc-peer-rack2
      spec:
        holdTime: 0s
        keepaliveTime: 0s
        peerAddress: 10.77.42.1
        peerASN: 65100
        myASN: 65101
        nodeSelectors:
          - matchLabels:
              rack-id: rack-master-2
    - name: svc-peer-rack3
      spec:
        holdTime: 0s
        keepaliveTime: 0s
        peerAddress: 10.77.43.1
        peerASN: 65100
        myASN: 65101
        nodeSelectors:
          - matchLabels:
              rack-id: rack-master-3
  ...
After the objects are created and nodes are provisioned, the IpamHost
objects will have BGP daemon configuration files in their status fields.
Refer to Single rack configuration example on how to verify the BGP configuration
files.
This section describes the Rack resource used in the Container Cloud API.
When you create a bare metal managed cluster with a multi-rack topology,
where Kubernetes masters are distributed across multiple racks
without L2 layer extension between them, the Rack resource allows you
to configure BGP announcement of the cluster API load balancer address from
each rack.
In this scenario, Rack objects must be bound to Machine objects
corresponding to master nodes of the cluster. Each Rack object describes
the configuration of the BGP daemon (bird) used to announce the cluster API LB
address from a particular master node (or from several nodes in the same rack).
Rack objects are used for a particular cluster only in conjunction with
the MultiRackCluster object described in MultiRackCluster.
For demonstration purposes, the Container Cloud Rack custom resource (CR)
description is split into the following major sections:
The Container Cloud Rack CR metadata contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1.
kind
Object type that is Rack.
metadata
The metadata field contains the following subfields:
name
Name of the Rack object. Corresponding Machine objects must
have their ipam/RackRef label value set to the name of the Rack
object. This label is required only for Machine objects of the
master nodes that announce the cluster API LB address.
namespace
Container Cloud project (Kubernetes namespace) where the object was
created.
labels
Key-value pairs that are attached to the object:
cluster.sigs.k8s.io/cluster-name
Cluster object name that this Rack object is applied to.
kaas.mirantis.com/provider
Provider name that is baremetal.
kaas.mirantis.com/region
Region name.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting from these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
The spec field of the Rack resource describes the desired
state of the object. It contains the following fields:
bgpdConfigTemplate
Optional. Configuration file template that will be used to create
configuration file for a BGP daemon on nodes in this rack. If not set, the
configuration file template from the corresponding MultiRackCluster
object is used.
peeringMap
Structure that describes general parameters of BGP peers to be used
in the configuration file for a BGP daemon for each network where BGP
announcement is used. Also, you can define a separate configuration file
template for the BGP daemon for each of those networks.
The peeringMap structure is as follows:
peeringMap:
  <network-name-a>:
    peers:
      - localASN: <localASN-1>
        neighborASN: <neighborASN-1>
        neighborIP: <neighborIP-1>
        password: <password-1>
      - localASN: <localASN-2>
        neighborASN: <neighborASN-2>
        neighborIP: <neighborIP-2>
        password: <password-2>
    bgpdConfigTemplate: |
      <configuration file template for a BGP daemon>
  ...
<network-name-a>
Name of the network where a BGP daemon should connect to the neighbor
BGP peers. By default, it is implied that the same network is used on the
node to make connection to the neighbor BGP peers as well as to receive
and respond to the traffic directed to the IP address being advertised.
In our scenario, the advertised IP address is the cluster API LB
IP address.
This network name must be the same as the subnet name used in the L2
template (l3Layout section) for the corresponding master node(s).
peers
Optional. List of dictionaries where each dictionary defines
configuration parameters for a particular BGP peer. Peer parameters are
as follows:
localASN
Optional. Local AS number. If not set, it can be taken from
MultiRackCluster.spec.defaultPeer or can be hardcoded in
bgpdConfigTemplate.
neighborASN
Optional. Neighbor AS number. If not set, it can be taken from
MultiRackCluster.spec.defaultPeer or can be hardcoded in
bgpdConfigTemplate.
neighborIP
Mandatory. Neighbor IP address.
password
Optional. Neighbor password. If not set, it can be taken from
MultiRackCluster.spec.defaultPeer or can be hardcoded in
bgpdConfigTemplate. It is required when MD5 authentication
between BGP peers is used.
bgpdConfigTemplate
Optional. Configuration file template that will be used to create the
configuration file for the BGP daemon of the <network-name-a> network
on a particular node. If not set, Rack.spec.bgpdConfigTemplate
is used.
Configuration example:
Since Cluster releases 17.1.0 and 16.1.0 for bird v2.x
spec:bgpdConfigTemplate:|protocol device {}#protocol direct {interface "lo";ipv4;}#protocol kernel {ipv4 {export all;};}#protocol bgp bgp_lcm {local port 1179 as {{.LocalASN}};neighbor {{.NeighborIP}} as {{.NeighborASN}};ipv4 {import none;export filter {if dest = RTD_UNREACHABLE then {reject;}accept;};};}peeringMap:lcm-rack1:peers:-localASN:65050neighborASN:65011neighborIP:10.77.31.1
Before Cluster releases 17.1.0 and 16.1.0 for bird v1.x
spec:bgpdConfigTemplate:|listen bgp port 1179;protocol device {}#protocol direct {interface "lo";}#protocol kernel {export all;}#protocol bgp bgp_lcm {local as {{.LocalASN}};neighbor {{.NeighborIP}} as {{.NeighborASN}};import all;export filter {if dest = RTD_UNREACHABLE then {reject;}accept;};}peeringMap:lcm-rack1:peers:-localASN:65050neighborASN:65011neighborIP:10.77.31.1
The status field of the Rack resource reflects the actual state
of the Rack object and contains the following fields:
stateSince 2.23.0
Message that reflects the current status of the resource.
The list of possible values includes the following:
OK - object is operational.
ERR - object is non-operational. This status has a detailed
description in the messages list.
TERM - object was deleted and is terminating.
messagesSince 2.23.0
List of error or warning messages if the object state is ERR.
objCreated
Date, time, and IPAM version of the resource creation.
objStatusUpdated
Date, time, and IPAM version of the last update of the status
field in the resource.
objUpdated
Date, time, and IPAM version of the last resource update.
Configuration example:
status:
  checksums:
    annotations: sha256:cd4b751d9773eacbfd5493712db0cbebd6df0762156aefa502d65a9d5e8af31d
    labels: sha256:fc2612d12253443955e1bf929f437245d304b483974ff02a165bc5c78363f739
    spec: sha256:8f0223b1eefb6a9cd583905a25822fd83ac544e62e1dfef26ee798834ef4c0c1
  objCreated: 2023-08-11T12:25:21.00000Z by v6.5.999-20230810-155553-2497818
  objStatusUpdated: 2023-08-11T12:33:00.92163Z by v6.5.999-20230810-155553-2497818
  objUpdated: 2023-08-11T12:32:59.11951Z by v6.5.999-20230810-155553-2497818
  state: OK
The Container Cloud Subnet CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1.
kind
Object type that is Subnet.
metadata
This field contains the following subfields:
name
Name of the Subnet object.
namespace
Project in which the Subnet object was created.
labels
Key-value pairs that are attached to the object:
ipam/DefaultSubnet: "1"Deprecated since 2.14.0
Indicates that this subnet was automatically created
for the PXE network.
ipam/UID
Unique ID of a subnet.
kaas.mirantis.com/provider
Provider type.
kaas.mirantis.com/region
Region name.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting from these releases. On existing
clusters updated to these releases, or if manually added, this label will
be ignored by Container Cloud.
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
The spec field of the Subnet resource describes the desired state of
a subnet. It contains the following fields:
cidr
A valid IPv4 CIDR, for example, 10.11.0.0/24.
gateway
A valid gateway address, for example, 10.11.0.9.
includeRanges
A comma-separated list of IP address ranges within the given CIDR that should
be used in the allocation of IPs for nodes. The gateway, network, broadcast,
and DNS addresses will be excluded (protected) automatically if they intersect
with one of the ranges. The IPs outside the given ranges will not be used in
the allocation. Each element of the list can be either an interval
10.11.0.5-10.11.0.70 or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
excludeRanges
A comma-separated list of IP address ranges within the given CIDR that should
not be used in the allocation of IPs for nodes. The IPs within the given CIDR
but outside the given ranges will be used in the allocation.
The gateway, network, broadcast, and DNS addresses will be excluded
(protected) automatically if they are included in the CIDR.
Each element of the list can be either an interval 10.11.0.5-10.11.0.70
or a single address 10.11.0.77.
Warning
Do not use values that are out of the given CIDR.
useWholeCidr
If set to false (by default), the subnet address and broadcast
address will be excluded from the address allocation.
If set to true, the subnet address and the broadcast address
are included into the address allocation for nodes.
nameservers
The list of IP addresses of name servers. Each element of the list
is a single address, for example, 172.18.176.6.
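For illustration, a minimal Subnet definition that combines the spec fields
described above may look as follows. This is a sketch with hypothetical names
and addresses; verify the exact field formats, for example, whether
includeRanges is a list or a comma-separated string, against the Subnet CRD of
your Container Cloud version.
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: demo-subnet          # hypothetical name
  namespace: demo-project    # hypothetical project
spec:
  cidr: 10.11.0.0/24
  gateway: 10.11.0.9
  includeRanges:
    - 10.11.0.5-10.11.0.70
    - 10.11.0.77
  excludeRanges:
    - 10.11.0.96-10.11.0.127
  useWholeCidr: false
  nameservers:
    - 172.18.176.6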
The status field of the Subnet resource describes the actual state of
a subnet. It contains the following fields:
allocatable
The number of IP addresses that are available for allocation.
allocatedIPs
The list of allocated IP addresses in the IP:<IPAddr object UID> format.
capacity
The total number of IP addresses to be allocated, including the sum of
allocatable and already allocated IP addresses.
cidr
The IPv4 CIDR for a subnet.
gateway
The gateway address for a subnet.
nameservers
The list of IP addresses of name servers.
ranges
The list of IP address ranges within the given CIDR that are used in
the allocation of IPs for nodes.
statusMessage
Deprecated since Container Cloud 2.23.0 and will be removed in one of the
following releases in favor of state and messages. Since Container
Cloud 2.24.0, this field is not set for the subnets of newly created
clusters. For the field description, see state.
state Since 2.23.0
Message that reflects the current status of the resource.
The list of possible values includes the following:
OK - object is operational.
ERR - object is non-operational. This status has a detailed
description in the messages list.
TERM - object was deleted and is terminating.
messages Since 2.23.0
List of error or warning messages if the object state is ERR.
objCreated
Date, time, and IPAM version of the resource creation.
objStatusUpdated
Date, time, and IPAM version of the last update of the status
field in the resource.
objUpdated
Date, time, and IPAM version of the last resource update.
Configuration example:
status:
  allocatable: 51
  allocatedIPs:
    - 172.16.48.200:24e94698-f726-11ea-a717-0242c0a85b02
    - 172.16.48.201:2bb62373-f726-11ea-a717-0242c0a85b02
    - 172.16.48.202:37806659-f726-11ea-a717-0242c0a85b02
  capacity: 54
  cidr: 172.16.48.0/24
  gateway: 172.16.48.1
  nameservers:
    - 172.18.176.6
  ranges:
    - 172.16.48.200-172.16.48.253
  objCreated: 2021-10-21T19:09:32Z by v5.1.0-20210930-121522-f5b2af8
  objStatusUpdated: 2021-10-21T19:14:18.748114886Z by v5.1.0-20210930-121522-f5b2af8
  objUpdated: 2021-10-21T19:09:32.606968024Z by v5.1.0-20210930-121522-f5b2af8
  state: OK
The Container Cloud SubnetPool CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1.
kind
Object type that is SubnetPool.
metadata
The metadata field contains the following subfields:
name
Name of the SubnetPool object.
namespace
Project in which the SubnetPool object was created.
labels
Key-value pairs that are attached to the object:
kaas.mirantis.com/provider
Provider type that is baremetal.
kaas.mirantis.com/region
Region name.
Note
The kaas.mirantis.com/region label is removed from all
Container Cloud objects in 2.26.0 (Cluster releases 17.1.0 and 16.1.0).
Therefore, do not add the label starting with these releases. On existing
clusters updated to these releases, or if the label is added manually, it is
ignored by Container Cloud.
Warning
Labels and annotations that are not documented in this API
Reference are generated automatically by Container Cloud. Do not modify them
using the Container Cloud API.
The spec field of the SubnetPool resource describes the desired state
of a subnet pool. It contains the following fields:
cidr
Valid IPv4 CIDR. For example, 10.10.0.0/16.
blockSize
IP address block size to use when assigning an IP address block
to every new child Subnet object. For example, if you set /25,
every new child Subnet will have 128 IPs to allocate.
Possible values are from /29 to the cidr size. Immutable.
nameservers
Optional. List of IP addresses of name servers to use for every new child
Subnet object. Each element of the list is a single address,
for example, 172.18.176.6. Default: empty.
gatewayPolicy
Optional. Method of assigning a gateway address to new child Subnet
objects. Default: none. Possible values are:
first - first IP of the IP address block assigned to a child
Subnet, for example, 10.11.10.1.
last - last IP of the IP address block assigned to a child Subnet,
for example, 10.11.10.254.
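For illustration, a minimal SubnetPool definition based on the fields above may
look as follows. This is a sketch with hypothetical names and addresses, not a
verified production example.
apiVersion: ipam.mirantis.com/v1alpha1
kind: SubnetPool
metadata:
  name: demo-subnetpool      # hypothetical name
  namespace: demo-project    # hypothetical project
  labels:
    kaas.mirantis.com/provider: baremetal
spec:
  cidr: 10.10.0.0/16
  blockSize: /24             # every child Subnet receives a /24 block (256 addresses)
  nameservers:
    - 172.18.176.6
  gatewayPolicy: first       # the first IP of each child block becomes the gateway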
The status field of the SubnetPool resource describes the actual state
of a subnet pool. It contains the following fields:
allocatedSubnets
List of allocated subnets. Each subnet has the <CIDR>:<SUBNET_UID>
format.
blockSize
Block size to use for IP address assignments from the defined pool.
capacity
Total number of IP addresses to be allocated. Includes the number of
allocatable and already allocated IP addresses.
allocatable
Number of subnets with the blockSize size that are available for
allocation.
state Since 2.23.0
Message that reflects the current status of the resource.
The list of possible values includes the following:
OK - object is operational.
ERR - object is non-operational. This status has a detailed
description in the messages list.
TERM - object was deleted and is terminating.
messages Since 2.23.0
List of error or warning messages if the object state is ERR.
objCreated
Date, time, and IPAM version of the resource creation.
objStatusUpdated
Date, time, and IPAM version of the last update of the status
field in the resource.
objUpdated
Date, time, and IPAM version of the last resource update.
Example:
status:
  allocatedSubnets:
    - 10.10.0.0/24:0272bfa9-19de-11eb-b591-0242ac110002
  blockSize: /24
  capacity: 54
  allocatable: 51
  objCreated: 2021-10-21T19:09:32Z by v5.1.0-20210930-121522-f5b2af8
  objStatusUpdated: 2021-10-21T19:14:18.748114886Z by v5.1.0-20210930-121522-f5b2af8
  objUpdated: 2021-10-21T19:09:32.606968024Z by v5.1.0-20210930-121522-f5b2af8
  state: OK
The Mirantis Container Cloud Release Compatibility Matrix
outlines the specific operating environments that are validated and supported.
The document provides the deployment compatibility for each product release and
determines the upgrade paths between major components versions when upgrading.
The document also provides the Container Cloud browser compatibility.
A Container Cloud management cluster upgrades automatically when a new
product release becomes available. Once the management cluster has been
updated, the user may trigger the managed clusters upgrade through the
Container Cloud web UI or API.
To view the full components list with their respective versions for each
Container Cloud release, refer to the Container Cloud Release Notes related
to the release version of your deployment or use the Releases
section in the web UI or API.
Caution
The document applies to the Container Cloud regular
deployments. For supported configurations of existing Mirantis Kubernetes
Engine (MKE) clusters that are not deployed by Container Cloud, refer to
MKE Compatibility Matrix.
The following tables outline the compatibility matrices of the most recent
major Container Cloud and Cluster releases along with patch releases and
their component versions. For details about unsupported releases, see
Releases summary.
Major and patch versions update path
The primary distinction between major and patch product versions lies in
the fact that major release versions introduce new functionalities,
whereas patch release versions predominantly offer minor product
enhancements, mostly CVE resolutions for your clusters.
Depending on your deployment needs, you can either update only between
major Cluster releases or apply patch updates between major releases.
Choosing the latter option ensures you receive security fixes as soon as
they become available. However, be prepared to update your cluster
frequently, approximately once every three weeks.
Otherwise, you can update only between major Cluster releases as each
subsequent major Cluster release includes patch Cluster release updates
of the previous major Cluster release.
Legend
Symbol
Definition
Cluster release is not included in the Container Cloud release yet.
Latest supported Cluster release to use for cluster deployment or update.
Deprecated Cluster release that you must update to the latest supported
Cluster release. The deprecated Cluster release will become unsupported
in one of the following Container Cloud releases. Greenfield deployments
based on a deprecated Cluster release are not supported.
Use the latest supported Cluster release instead.
Unsupported Cluster release that blocks automatic upgrade of a
management cluster. Update the Cluster release to the latest supported
one to unblock management cluster upgrade and obtain newest product
features and enhancements.
Component is included in the Container Cloud release.
Component is available in the Technology Preview
scope. Use it only for testing purposes on staging environments.
Component is unsupported in the Container Cloud release.
The following table outlines the compatibility matrix for the Container Cloud
release series 2.28.x and 2.29.0.
Container Cloud compatibility matrix 2.28.x and 2.29.0¶
The major Cluster release 14.1.0 is dedicated to the vSphere provider
only. This is the last Cluster release for the vSphere provider based
on MCR 20.10 and MKE 3.6.6 with Kubernetes 1.24.
OpenStack Antelope is supported as TechPreview since
MOSK 23.3.
A Container Cloud cluster based on MOSK Yoga or
Antelope with Tungsten Fabric is supported as TechPreview since
Container Cloud 2.25.1. Since Container Cloud 2.26.0, support for this
configuration is suspended. If you still require this configuration,
contact Mirantis support for further information.
OpenStack Victoria is supported until September 2023.
MOSK 23.2 is the last release version where OpenStack
Victoria packages are updated.
If you have not already upgraded your OpenStack version to Yoga,
Mirantis highly recommends doing this during the course of the
MOSK 23.2 series. For details, see
MOSK documentation: Upgrade OpenStack.
Since Container Cloud 2.27.3 (Cluster release 16.2.3), the VMware
vSphere configuration is unsupported. For details, see
Deprecation notes.
VMware vSphere is supported on RHEL 8.7 or Ubuntu 20.04.
RHEL 8.7 is generally available since Cluster releases 16.0.0 and
14.1.0. Before these Cluster releases, it is supported within the
Technology Preview features scope.
For Ubuntu deployments, Packer builds a vSphere virtual machine
template that is based on Ubuntu 20.04 with kernel
5.15.0-116-generic.
If you build a VM template manually, we recommend installing the
same kernel version 5.15.0-116-generic.
Attachment of non-Container Cloud-based MKE clusters is supported
only for vSphere-based management clusters on Ubuntu 20.04. Since Container
Cloud 2.27.3 (Cluster release 16.2.3), the vSphere-based configuration is
unsupported. For details, see Deprecation notes.
The kernel version of the host operating system is validated by Mirantis
and confirmed to be working for the supported use cases. If you use
custom kernel versions or third-party vendor-provided kernels, such
as FIPS-enabled ones, you assume full responsibility for validating the
compatibility of components in such environments.
On non-MOSK clusters, Ubuntu 22.04 is installed by
default on management and managed clusters. Ubuntu 20.04 is not
supported.
On MOSK clusters:
Since Container Cloud 2.28.0 (Cluster releases 17.3.0), Ubuntu 22.04
is generally available for managed clusters. All existing
deployments based on Ubuntu 20.04 must be upgraded to 22.04 within
the course of 2.28.x. Otherwise, update of managed clusters to
2.29.0 will become impossible and management cluster update to
2.29.1 will be blocked.
Before Container Cloud 2.28.0 (Cluster releases 17.2.0, 16.2.0, or
earlier), Ubuntu 22.04 is installed by default on management
clusters only. And Ubuntu 20.04 is the only supported distribution
for managed clusters.
The Container Cloud web UI runs in the browser, separate from any backend
software. As such, Mirantis aims to support browsers separately from
the backend software in use, although each Container Cloud release is tested
with specific browser versions.
Mirantis currently supports the following web browsers for the Container Cloud
web UI:
Browser
Supported version
Release date
Supported operating system
Firefox
94.0 or newer
November 2, 2021
Windows, macOS
Google Chrome
96.0.4664 or newer
November 15, 2021
Windows, macOS
Microsoft Edge
95.0.1020 or newer
October 21, 2021
Windows
Caution
This table does not apply to third-party web UIs such as the
StackLight or Keycloak endpoints that are available through the Container
Cloud web UI. Refer to the official documentation of the corresponding
third-party component for details about its supported browsers versions.
To ensure the best user experience, Mirantis recommends that you use the
latest version of any of the supported browsers. The use of other browsers
or older versions of the browsers we support can result in rendering issues,
and can even lead to glitches and crashes in the event that the Container Cloud
web UI does not support some JavaScript language features or browser web APIs.
Important
Mirantis does not tie browser support to any particular Container Cloud
release.
Mirantis strives to leverage the latest in browser technology to build more
performant client software, as well as ensuring that our customers benefit from
the latest browser security updates. To this end, our strategy is to regularly
move our supported browser versions forward, while also lagging behind the
latest releases by approximately one year to give our customers a
sufficient upgrade buffer.
The primary distinction between major and patch product versions lies in
the fact that major release versions introduce new functionalities,
whereas patch release versions predominantly offer minor product
enhancements, mostly CVE resolutions for your clusters.
Depending on your deployment needs, you can either update only between
major Cluster releases or apply patch updates between major releases.
Choosing the latter option ensures you receive security fixes as soon as
they become available. However, be prepared to update your cluster
frequently, approximately once every three weeks.
Otherwise, you can update only between major Cluster releases as each
subsequent major Cluster release includes patch Cluster release updates
of the previous major Cluster release.
This section outlines the release notes for the Mirantis
Container Cloud GA release. Within the scope of the Container Cloud GA
release, major releases are being published continuously with new features,
improvements, and critical issues resolutions to enhance the
Container Cloud GA version. Between major releases, patch releases that
incorporate fixes for CVEs of high and critical severity are being delivered.
For details, see Container Cloud releases, Cluster releases (managed), and
Patch releases.
Once a new Container Cloud release is available, a management cluster
automatically upgrades to a newer consecutive release unless this cluster
contains managed clusters with a Cluster release unsupported by the newer
Container Cloud release. For more details about the Container Cloud
release mechanism, see
Reference Architecture: Release Controller.
Greenfield deployments on deprecated Cluster releases of the 17.3.x and 16.3.x
series are not supported. Use the latest available Cluster releases of the
series instead.
Caution
Make sure to update the Cluster release version
of your managed cluster before the current Cluster release
version becomes unsupported by a new Container Cloud release
version.
Otherwise, Container Cloud stops auto-upgrade and eventually
Container Cloud itself becomes unsupported.
This section outlines release notes for the Container Cloud release 2.29.0.
To allow the operator to use the GitOps approach, implemented the
BareMetalHostInventory resource that must be used instead of
BareMetalHost for adding and modifying the configuration of bare metal servers.
The BareMetalHostInventory resource monitors and manages the state of a
bare metal server and is created for each Machine with all information
about machine hardware configuration.
Each BareMetalHostInventory object is synchronized with an automatically
created BareMetalHost object, which is now used for internal purposes of
the Container Cloud private API.
Caution
Any change in the BareMetalHost object will be overwritten by
BareMetalHostInventory.
For any existing BareMetalHost object, a BareMetalHostInventory object
is created automatically during cluster update.
Caution
While the Cluster release of the management cluster is 16.4.0,
BareMetalHostInventory operations are available to
m:kaas@management-admin only. Once the management cluster is updated
to the Cluster release 16.4.1 (or later), this limitation will be lifted.
Validation of the Subnet object changes against allocated IP addresses¶
Implemented a validation of the Subnet object changes against already
allocated IP addresses. This validation is performed by the Admission
Controller. The controller now blocks changes in the Subnet object
containing allocated IP addresses that are out of the allocatable IP address
space, which is formed by a CIDR address and include/exclude address ranges.
Improvements in calculation of update estimates using ClusterUpdatePlan¶
Improved calculation of update estimates for a managed cluster that is managed
by the ClusterUpdatePlan object. Each step of ClusterUpdatePlan now has
more precise estimates that are based on the following calculations:
The amount and type of components updated between releases during patch
updates
The amount of nodes with particular roles in the OpenStack cluster
The number of nodes and storage used in the Ceph cluster
Also, the ClusterUpdatePlan object now contains the releaseNotes field
that links to MOSK release notes of the target release.
Switch of the default container runtime from Docker to containerd¶
Switched the default container runtime from Docker to containerd on greenfield
management and managed clusters. The use of containerd allows for better
Kubernetes performance and component update without pod restart when applying
fixes for CVEs.
On existing clusters, perform the mandatory migration from Docker to containerd
in the scope of Container Cloud 2.29.x. Otherwise, the management cluster
update to Container Cloud 2.30.0 will be blocked.
Important
Container runtime migration involves machine cordoning and
draining.
The following issues have been addressed in the Mirantis Container Cloud
release 2.29.0 along with the Cluster releases 17.4.0 and
16.4.0. For the list of MOSK addressed
issues, see MOSK release notes 25.1: Addressed issues.
Note
This section provides descriptions of issues addressed since
the last Container Cloud patch release 2.28.5.
For details on addressed issues in earlier patch releases since 2.28.0,
which are also included into the major release 2.29.0, refer to
2.28.x patch releases.
[47263] [StackLight] Fixed the issue with configuration inconsistencies
for requests and limits between the deprecated
resourcesPerClusterSize and resources parameters.
[44193] [StackLight] Fixed the issue with OpenSearch reaching the 85%
disk usage watermark on High Availability clusters that use Local Volume
Provisioner, which caused the OpenSearch cluster state to switch to
Warning or Critical.
[46858] [Container Cloud web UI] Fixed the issue that prevented the
drop-down menu from displaying the full list of allowed node labels.
[39437] [LCM] Fixed the issue that caused failure to replace a master
node and the Kubelet's NodeReady condition is Unknown message in the
machine status on the remaining master nodes.
This section lists known issues with workarounds for the Mirantis
Container Cloud release 2.29.0 including the Cluster releases
17.4.0 and 16.4.0. For the list of
MOSK known issues, see
MOSK release notes 25.1: Known issues.
This section also outlines still valid known issues
from previous Container Cloud releases.
Bare metal¶[50287] BareMetalHost with a Redfish BMC address is stuck on registering phase¶
During addition of a bare metal host containing a Redfish Baseboard Management
Controller address with the following exemplary configuration, the host may get
stuck during the registering phase:
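The exact configuration from the reported issue is not reproduced here. As a
hypothetical illustration only, a Redfish BMC address in a bare metal host
object typically looks similar to the following, where the address, system
path, and secret name are placeholders:
spec:
  bmc:
    # Hypothetical Redfish endpoint; scheme, host, and system path are placeholders
    address: redfish+https://192.168.100.10:443/redfish/v1/Systems/1
    credentialsName: <bmcCredentialsSecret>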
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. Though, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
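For example, adding or changing an arbitrary label on the affected service
forces it to be re-processed. The following command is a sketch; the namespace,
service name, and label are placeholders:
kubectl -n <namespace> patch service <serviceName> --type merge \
  -p '{"metadata":{"labels":{"dummy-update":"1"}}}'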
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
Ceph¶[50637] Ceph creates second miracephnodedisable object during node disabling¶
During managed cluster update, if some node is being disabled and at the same
time ceph-maintenance-controller is restarted, a second
miracephnodedisable object is erroneously created for the node. As a
result, the second object fails in the Cleaning state, which blocks
managed cluster update.
Workaround
On the affected managed cluster, obtain the list of miracephnodedisable
objects:
kubectl get miracephnodedisable -n ceph-lcm-mirantis
The system response must contain one completed and one failed
miracephnodedisable object for the node being disabled. For example:
[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during CVE upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start
slow and even fail the starting probe with the following describe output in
the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contain the
FailedMount events:
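For example, a command along the following lines can be used; it is a sketch,
and the placeholders are explained below:
kubectl -n <affectedProjectName> describe pod <affectedPodName>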
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
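A possible way to identify the csi-rbdplugin Pod running on that node is shown
below. This is a sketch that assumes the default Rook namespace rook-ceph and
the app=csi-rbdplugin label; adjust both to your deployment:
kubectl -n rook-ceph get pod -l app=csi-rbdplugin -o wide | grep <nodeName>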
Scale up the affected StatefulSet or Deployment back to the
original number of replicas and wait until its state becomes Running.
LCM¶[50768] Failure to update the MCCUpgrade object¶
While editing the MCCUpgrade object, the following error occurs when trying
to save changes:
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",
"message":"Internal error occurred: failed calling webhook \"mccupgrades.kaas.mirantis.com\": failed to call webhook: the server could not find the requested resource",
"reason":"InternalError",
"details":{"causes":[{"message":"failed calling webhook \"mccupgrades.kaas.mirantis.com\": failed to call webhook: the server could not find the requested resource"}]},"code":500}
To work around the issue, remove the
name: mccupgrades.kaas.mirantis.com entry from
mutatingwebhookconfiguration:
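For example, the entry can be removed by editing the webhook configuration
object directly. The commands below are a sketch; the exact name of the
mutatingwebhookconfiguration object depends on the deployment:
# Find the configuration that contains the mccupgrades.kaas.mirantis.com webhook
kubectl get mutatingwebhookconfigurations
# Edit it and delete the webhooks entry with name: mccupgrades.kaas.mirantis.com
kubectl edit mutatingwebhookconfiguration <configurationName>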
[50561] The local-volume-provisioner pod switches to CrashLoopBackOff¶
After machine disablement and consequent re-enablement, persistent volumes
(PVs) provisioned by local-volume-provisioner that are not used by any pod
may cause the local-volume-provisioner pod on such machine to switch to the
CrashLoopBackOff state.
Workaround:
Identify the ID of the affected local-volume-provisioner:
To work around the issue, manually adjust the affected dashboards to
restore their custom appearance.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to inability to add any label to the control plane machines along with
inability to change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container
Cloud web UI and contains different access denied errors during the first five
minutes after creation.
To work around the issue, refresh the browser in five minutes after the
project creation.
[50140] The Ceph Clusters tab does not display Ceph cluster details¶
The Clusters page for the bare metal provider does not display
information about the Ceph cluster in the Ceph Clusters tab and
contains access denied errors.
The following table lists major components and their versions delivered in
Container Cloud 2.29.0. The components that are newly added, updated,
deprecated, or removed as compared to 2.28.0, are marked with a corresponding
superscript, for example, admission-controllerUpdated.
This section lists the artifacts of components included in the Container Cloud
release 2.29.0. The components that are newly added, updated,
deprecated, or removed as compared to 2.28.0, are marked with a corresponding
superscript, for example, admission-controllerUpdated.
In total, since Container Cloud 2.28.5, in 2.29.0, 736
Common Vulnerabilities and Exposures (CVE) have been fixed:
125 of critical and 611 of high severity.
The table below includes the total numbers of addressed unique and common
vulnerabilities and exposures (CVE) by product component since the 2.28.5
patch release. The common CVEs are issues addressed across several images.
This section describes the specific actions you as a cloud operator need to
complete before or after your Container Cloud cluster update to the Cluster
releases 17.4.0 or 16.4.0. For details on update impact and maintenance
window planning, see MOSK Update notes.
Pre-update actions¶Update managed clusters to Ubuntu 22.04¶
In Container Cloud 2.29.0, the Cluster release update of the Ubuntu 20.04-based
managed clusters becomes impossible, and Ubuntu 22.04 becomes the only
supported version of the operating system. Therefore, ensure that every node of
your managed clusters are running Ubuntu 22.04 to unblock managed cluster
update in Container Cloud 2.29.0.
Management cluster update to Container Cloud 2.29.1 will be blocked
if at least one node of any related managed cluster is running Ubuntu 20.04.
Note
Existing management clusters were automatically updated to Ubuntu
22.04 during cluster upgrade to the Cluster release 16.2.0 in Container
Cloud 2.27.0. Greenfield deployments of management clusters are also based
on Ubuntu 22.04.
Back up custom Grafana dashboards on managed clusters¶
In Container Cloud 2.29.0, Grafana is updated to version 11 where the following
deprecated Angular-based plugins are automatically migrated to the React-based
ones:
Graph (old) -> Time Series
Singlestat -> Stat
Stat (old) -> Stat
Table (old) -> Table
Worldmap -> Geomap
This migration may corrupt custom Grafana dashboards that have Angular-based
panels. Therefore, if you have such dashboards on managed clusters, back them
up and manually upgrade Angular-based panels before updating to the Cluster
release 17.4.0 to prevent custom appearance issues after plugin migration.
For management clusters that are updated automatically, it is
important to remove all Angular-based panels and prepare the backup of
custom Grafana dashboards before Container Cloud 2.29.0 is released. For
details, see Post update notes in 2.28.5 release notes. Otherwise, custom dashboards using Angular-based
plugins may be corrupted and must be manually restored without a backup.
Post-update actions¶Start using BareMetalHostInventory instead of BareMetalHost¶
Container Cloud 2.29.0 introduces the BareMetalHostInventory resource that
must be used instead of BareMetalHost for adding and modifying
configuration of bare metal servers. Therefore, if you need to modify an
existing or create a new configuration of a bare metal host, use
BareMetalHostInventory.
Each BareMetalHostInventory object is synchronized with an automatically
created BareMetalHost object, which is now used for internal purposes of
the Container Cloud private API.
Caution
Any change in the BareMetalHost object will be overwritten by
BareMetalHostInventory.
For any existing BareMetalHost object, a BareMetalHostInventory object
is created automatically during cluster update.
The rules are applied automatically to all cluster nodes during cluster
update. Therefore, if you use custom Linux accounts protected by passwords, do
not plan any critical maintenance activities right after cluster upgrade as you
may need to update Linux user passwords.
Note
By default, during cluster creation, mcc-user is created without
a password with an option to add an SSH key.
Migrate container runtime from Docker to containerd¶
Container Cloud 2.29.0 introduces switching of the default container runtime
from Docker to containerd on greenfield management and managed clusters.
On existing clusters, perform the mandatory migration from Docker to containerd
in the scope of Container Cloud 2.29.x. Otherwise, the management cluster
update to Container Cloud 2.30.0 will be blocked.
Important
Container runtime migration involves machine cordoning and
draining.
Note
If you have not upgraded the operating system distribution on your
machines to Jammy yet, Mirantis recommends migrating machines from Docker
to containerd on managed clusters together with distribution upgrade to
minimize the maintenance window.
In this case, ensure that all cluster machines are updated at once during
the same maintenance window to prevent machines from running different
container runtimes.
Support for the patch Cluster release 17.3.5
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
24.3.2.
Support for Mirantis Kubernetes Engine 3.7.18 and Mirantis Container Runtime
23.0.15, which includes containerd 1.6.36.
Optional migration of container runtime from Docker to containerd.
Bare metal: update of Ubuntu mirror from ubuntu-2024-12-05-003900 to
ubuntu-2025-01-08-003900 along with update of minor kernel version from
5.15.0-126-generic to 5.15.0-130-generic.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases
17.3.0 and 16.3.0. And it does not support greenfield
deployments based on deprecated Cluster releases. Use the latest available Cluster release
instead.
For main deliverables of the parent Container Cloud release of 2.28.5, refer
to 2.28.0.
This section describes the specific actions you as a cloud operator need to
complete before or after your Container Cloud cluster update to the Cluster
releases 17.3.5 or 16.3.5.
Post-update actions¶Optional migration of container runtime from Docker to containerd¶
Since Container Cloud 2.28.4, Mirantis introduced an optional migration of
container runtime from Docker to containerd, which is implemented for existing
management and managed bare metal clusters. The use of containerd allows for
better Kubernetes performance and component update without pod restart when
applying fixes for CVEs. For the migration procedure, refer to
MOSK Operations Guide: Migrate container runtime from Docker
to containerd.
Note
Container runtime migration becomes mandatory in the scope of
Container Cloud 2.29.x. Otherwise, the management cluster update to
Container Cloud 2.30.0 will be blocked.
Note
In the Container Cloud 2.28.x series, the default container runtime
remains Docker for greenfield deployments. Support for greenfield
deployments based on containerd will be announced in one of the following
releases.
Important
Container runtime migration involves machine cordoning and
draining.
Note
If you have not upgraded the operating system distribution on your
machines to Jammy yet, Mirantis recommends migrating machines from Docker
to containerd on managed clusters together with distribution upgrade to
minimize the maintenance window.
In this case, ensure that all cluster machines are updated at once during
the same maintenance window to prevent machines from running different
container runtimes.
In Container Cloud 2.29.0, Grafana will be updated to version 11 where
the following deprecated Angular-based plugins will be automatically migrated
to the React-based ones:
Graph (old) -> Time Series
Singlestat -> Stat
Stat (old) -> Stat
Table (old) -> Table
Worldmap -> Geomap
This migration may corrupt custom Grafana dashboards that have Angular-based
panels. Therefore, if you have such dashboards, back them up and manually
upgrade Angular-based panels during the course of Container Cloud 2.28.x
(Cluster releases 17.3.x and 16.3.x) to prevent custom appearance issues after
plugin migration in Container Cloud 2.29.0 (Cluster releases 17.4.0 and
16.4.0).
For management clusters that are updated automatically, it is
important to prepare the backup before Container Cloud 2.29.0 is released.
Otherwise, custom dashboards using Angular-based plugins may be corrupted.
For managed clusters, you can perform the backup after the Container Cloud
2.29.0 release date but before updating them to the Cluster release 17.4.0.
In total, since Container Cloud 2.28.4, 1 Common Vulnerability and Exposure
(CVE) of high severity has been fixed in 2.28.5.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since Container Cloud 2.28.4.
The common CVEs are issues addressed across several images.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. That can result in
inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If changing or adding of DHCP subnets is required to bootstrap
new nodes, wait after changing or adding DHCP subnets until the
dnsmasq pod becomes ready, then create BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the
inspection error:
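For example, list the BareMetalHost objects and inspect their status and error
fields. This is a sketch; the namespace is a placeholder and the output columns
depend on the deployment:
kubectl -n <managedClusterNamespace> get bmh -o wide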
Verify whether the dnsmasq pod was in Ready state when the
inspection of the affected baremetal hosts (test-worker-3 in the example
above) was started:
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
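The original template is not shown here. As a sketch, re-creating the object
from a saved manifest may look as follows, where the file name is a
placeholder:
kubectl -n managed-ns create -f <bareMetalHostTemplate>.yaml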
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. Though, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during CVE upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start
slow and even fail the starting probe with the following describe output in
the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contain the
FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
On High Availability (HA) clusters that use Local Volume Provisioner (LVP),
Prometheus and OpenSearch from StackLight may share the same pool of storage.
In such configuration, OpenSearch may approach the 85% disk usage watermark
due to the combined storage allocation and usage patterns set by the Persistent
Volume Claim (PVC) size parameters for Prometheus and OpenSearch, which consume
storage the most.
When the 85% threshold is reached, the affected node is transitioned to the
read-only state, preventing shard allocation and causing the OpenSearch cluster
state to transition to Warning (Yellow) or Critical (Red).
Caution
The issue and the provided workaround apply only for clusters on
which OpenSearch and Prometheus utilize the same storage pool.
Derived from .values.elasticsearch.persistentVolumeUsableStorageSizeGB,
defaulting to .values.elasticsearch.persistentVolumeClaimSize if
unspecified. To obtain the OpenSearch PVC size:
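For example, the requested size can be read from the opensearch-master
PersistentVolumeClaims. This is a sketch that assumes StackLight runs in the
stacklight namespace:
kubectl -n stacklight get pvc | grep opensearch-master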
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
If the formula result is positive, it is an early indication that the
cluster is affected.
Verify whether the OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical alert is firing. And if so,
verify the following:
Log in to the OpenSearch web UI.
In Management -> Dev Tools, run the following command:
GET _cluster/allocation/explain
The following system response indicates that the corresponding node is
affected:
"explanation":"the node is above the low watermark cluster setting \[cluster.routing.allocation.disk.watermark.low=85%], using more disk space \than the maximum allowed [85.0%], actual free: [xx.xxx%]"
Note
The system response may contain even higher watermark percent
than 85.0%, depending on the case.
Workaround:
Warning
The workaround implies adjustment of the retention threshold for
OpenSearch. Depending on the new threshold, some old logs will be
deleted.
A user-defined variable that specifies what percentage of the total storage
capacity should not be used by OpenSearch or Prometheus. This is used to
reserve space for other components. It should be expressed as a decimal.
For example, for 5% of reservation, Reserved_Percentage is 0.05.
Mirantis recommends using 0.05 as a starting point.
Filesystem_Reserve
Percentage to deduct for filesystems that may reserve some portion of the
available storage, which is marked as occupied. For example, for EXT4, it
is 5% by default, so the value must be 0.05.
Prometheus_PVC_Size_GB
Sourced from .values.prometheusServer.persistentVolumeClaimSize.
Total_Storage_Capacity_GB
Total capacity of the OpenSearch PVCs. For LVP, the capacity of the
storage pool. To obtain the total capacity:
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
Calculation of the above formula provides the maximum safe storage to allocate
for .values.elasticsearch.persistentVolumeUsableStorageSizeGB. Use this
formula as a reference for setting
.values.elasticsearch.persistentVolumeUsableStorageSizeGB on a cluster.
Wait up to 15-20 mins for OpenSearch to perform the cleaning.
Verify that the cluster is not affected anymore using the procedure above.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to inability to add any label to the control plane machines along with
inability to change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container
Cloud web UI and contains different access denied errors during the first five
minutes after creation.
To work around the issue, refresh the browser in five minutes after the
project creation.
This section lists the artifacts of components included in the Container Cloud
patch release 2.28.5. For artifacts of the Cluster releases introduced in
2.28.5, see patch Cluster releases 17.3.5 and 16.3.5.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
Support for the patch Cluster release 17.3.4
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
24.3.1.
Support for Mirantis Kubernetes Engine 3.7.17 and Mirantis Container
Runtime 23.0.15, which includes containerd 1.6.36.
Optional migration of container runtime from Docker to containerd.
Bare metal: update of Ubuntu mirror from ubuntu-2024-11-18-003900 to
ubuntu-2024-12-05-003900 along with update of minor kernel version from
5.15.0-125-generic to 5.15.0-126-generic.
Security fixes for CVEs in images.
OpenStack provider: suspension of support for cluster deployment and update.
For details, see Deprecation notes.
This patch release also supports the latest major Cluster releases
17.3.0 and 16.3.0. And it does not support greenfield
deployments based on deprecated Cluster releases. Use the latest available Cluster release
instead.
For main deliverables of the parent Container Cloud release of 2.28.4, refer
to 2.28.0.
This section describes the specific actions you as a cloud operator need to
complete before or after your Container Cloud cluster update to the Cluster
releases 17.3.4 or 16.3.4.
Important
For MOSK deployments, although
MOSK 24.3.1 is classified as a patch release, as a
cloud operator, you will be performing a major update regardless of the
upgrade path: whether you are upgrading from patch 24.2.5 or major version
24.3. For details, see MOSK 24.3.1 release notes: Update
notes.
Post-update actions¶Optional migration of container runtime from Docker to containerd¶
Container Cloud 2.28.4 introduces an optional migration of container runtime
from Docker to containerd, which is implemented for existing management and
managed bare metal clusters. The use of containerd allows for better Kubernetes
performance and component update without pod restart when applying fixes for
CVEs. For the migration procedure, refer to MOSK Operations
Guide: Migrate container runtime from Docker to containerd.
Note
Container runtime migration becomes mandatory in the scope of
Container Cloud 2.29.x. Otherwise, the management cluster update to
Container Cloud 2.30.0 will be blocked.
Note
In the Container Cloud 2.28.x series, the default container runtime
remains Docker for greenfield deployments. Support for greenfield
deployments based on containerd will be announced in one of the following
releases.
Important
Container runtime migration involves machine cordoning and
draining.
Note
If you have not upgraded the operating system distribution on your
machines to Jammy yet, Mirantis recommends migrating machines from Docker
to containerd on managed clusters together with distribution upgrade to
minimize the maintenance window.
In this case, ensure that all cluster machines are updated at once during
the same maintenance window to prevent machines from running different
container runtimes.
In Container Cloud 2.29.0, Grafana will be updated to version 11 where
the following deprecated Angular-based plugins will be automatically migrated
to the React-based ones:
Graph (old) -> Time Series
Singlestat -> Stat
Stat (old) -> Stat
Table (old) -> Table
Worldmap -> Geomap
This migration may corrupt custom Grafana dashboards that have Angular-based
panels. Therefore, if you have such dashboards, back them up and manually
upgrade Angular-based panels during the course of Container Cloud 2.28.x
(Cluster releases 17.3.x and 16.3.x) to prevent custom appearance issues after
plugin migration in Container Cloud 2.29.0 (Cluster releases 17.4.0 and
16.4.0).
For management clusters that are updated automatically, it is
important to prepare the backup before Container Cloud 2.29.0 is released.
Otherwise, custom dashboards using Angular-based plugins may be corrupted.
For managed clusters, you can perform the backup after the Container Cloud
2.29.0 release date but before updating them to the Cluster release 17.4.0.
In total, since Container Cloud 2.28.3, 158 Common Vulnerabilities and
Exposures (CVE) have been fixed in 2.28.4: 10 of critical and 148 of
high severity.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since Container Cloud 2.28.3.
The common CVEs are issues addressed across several images.
The following issues have been addressed in the Container Cloud patch release
2.28.4 along with the patch Cluster releases 16.3.4 and
17.3.4:
[30294] [LCM] Fixed the issue that prevented replacement of a manager
machine during the calico-node Pod start on a new node that has the
same IP address as the node being replaced.
[5782] [LCM] Fixed the issue that prevented deployment of a manager
machine during node replacement.
[5568] [LCM] Fixed the issue that prevented cleaning of resources by the
calico-kube-controllers Pod during unsafe or forced deletion of a
manager machine.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. That can result in
inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If changing or adding of DHCP subnets is required to bootstrap
new nodes, wait after changing or adding DHCP subnets until the
dnsmasq pod becomes ready, then create BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the
inspection error:
Verify whether the dnsmasq pod was in Ready state when the
inspection of the affected baremetal hosts (test-worker-3 in the example
above) was started:
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. Though, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during CVE upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start
slow and even fail the starting probe with the following describe output in
the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
cephosdsetnoout
Once the Ceph OSDs image upgrade is done, unset the flag:
cephosdunsetnoout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contain the
FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
On High Availability (HA) clusters that use Local Volume Provisioner (LVP),
Prometheus and OpenSearch from StackLight may share the same pool of storage.
In such configuration, OpenSearch may approach the 85% disk usage watermark
due to the combined storage allocation and usage patterns set by the Persistent
Volume Claim (PVC) size parameters for Prometheus and OpenSearch, which consume
storage the most.
When the 85% threshold is reached, the affected node is transitioned to the
read-only state, preventing shard allocation and causing the OpenSearch cluster
state to transition to Warning (Yellow) or Critical (Red).
Caution
The issue and the provided workaround apply only for clusters on
which OpenSearch and Prometheus utilize the same storage pool.
Derived from .values.elasticsearch.persistentVolumeUsableStorageSizeGB,
defaulting to .values.elasticsearch.persistentVolumeClaimSize if
unspecified. To obtain the OpenSearch PVC size:
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
If the formula result is positive, it is an early indication that the
cluster is affected.
Verify whether the OpenSearchClusterStatusWarning or OpenSearchClusterStatusCritical alert is firing. If so, verify the following:
Log in to the OpenSearch web UI.
In Management -> Dev Tools, run the following command:
GET _cluster/allocation/explain
The following system response indicates that the corresponding node is affected:
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [xx.xxx%]"
Note
The system response may contain a higher watermark percentage than 85.0%, depending on the case.
Workaround:
Warning
The workaround implies adjustment of the retention threshold for OpenSearch. Depending on the new threshold, some old logs will be deleted.
Reserved_Percentage
A user-defined variable that specifies what percentage of the total storage capacity should not be used by OpenSearch or Prometheus, reserving space for other components. Express it as a decimal. For example, for a 5% reservation, Reserved_Percentage is 0.05. Mirantis recommends using 0.05 as a starting point.
Filesystem_Reserve
Percentage to deduct for filesystems that may reserve some portion of the
available storage, which is marked as occupied. For example, for EXT4, it
is 5% by default, so the value must be 0.05.
Prometheus_PVC_Size_GB
Sourced from .values.prometheusServer.persistentVolumeClaimSize.
Total_Storage_Capacity_GB
Total capacity of the OpenSearch PVCs. For LVP, the capacity of the
storage pool. To obtain the total capacity:
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
The calculation above provides the maximum safe storage to allocate for .values.elasticsearch.persistentVolumeUsableStorageSizeGB. Use it as a reference for setting .values.elasticsearch.persistentVolumeUsableStorageSizeGB on a cluster.
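As a purely hypothetical illustration, assuming the variables defined above combine as (1 - Reserved_Percentage - Filesystem_Reserve) * Total_Storage_Capacity_GB - Prometheus_PVC_Size_GB, the arithmetic could look as follows; verify the exact formula in the product documentation before applying it:
# Example values only: 512 GB storage pool, 120 GB Prometheus PVC, 5% reserve, 5% filesystem overhead
echo "(1 - 0.05 - 0.05) * 512 - 120" | bc -l
# Returns approximately 340.8, a candidate value for .values.elasticsearch.persistentVolumeUsableStorageSizeGB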
Wait up to 15-20 minutes for OpenSearch to perform the cleanup.
Verify that the cluster is no longer affected using the procedure above.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI due to the inability to add any label to the control plane machines and to change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container Cloud web UI and shows various access denied errors during the first five minutes after creation.
To work around the issue, refresh the browser five minutes after the project creation.
This section lists the artifacts of components included in the Container Cloud
patch release 2.28.4. For artifacts of the Cluster releases introduced in
2.28.4, see patch Cluster releases 17.3.4 and 16.3.4.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
This section contains historical information on the unsupported Container
Cloud releases delivered in 2024. For the latest supported Container
Cloud release, see Container Cloud releases.
Container Cloud 2.27.2 is the second patch release of the 2.27.x
release series that introduces the following updates:
Support for the patch Cluster release 16.2.2.
Support for the patch Cluster releases 16.1.7 and 17.1.7 that
represents MOSK patch release
24.1.7.
Support for MKE 3.7.11.
Bare metal: update of Ubuntu mirror to ubuntu-2024-07-16-014744 along with
update of the minor kernel version to 5.15.0-116-generic (Cluster release 16.2.2).
Support for the patch Cluster releases 16.2.7 and 17.2.7
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
24.2.5.
Bare metal: update of Ubuntu mirror from ubuntu-2024-10-28-012906 to
ubuntu-2024-11-18-003900 along with update of minor kernel version from
5.15.0-124-generic to 5.15.0-125-generic.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases 17.3.0 and 16.3.0. It does not support greenfield deployments based on deprecated Cluster releases; use the latest available Cluster release instead.
For main deliverables of the parent Container Cloud release of 2.28.3, refer
to 2.28.0.
In total, since Container Cloud 2.28.2, 66 Common Vulnerabilities and
Exposures (CVE) have been fixed in 2.28.3: 4 of critical and 62 of
high severity.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since Container Cloud 2.28.2.
The common CVEs are issues addressed across several images.
The following issues have been addressed in the Container Cloud patch release
2.28.3 along with the patch Cluster releases 16.3.3,
16.2.7, and 17.2.7:
[47594] [StackLight] Fixed the issue with Patroni pods getting stuck in the CrashLoopBackOff state due to the patroni container being terminated with reason: OOMKilled.
[47929] [LCM] Fixed the issue with incorrect restrictive permissions set
for registry certificate files in /etc/docker/certs.d, which were set to
644 instead of 444.
This section lists known issues with workarounds for the Mirantis
Container Cloud release 2.28.3 including the Cluster releases 16.2.7,
16.3.3, and 17.2.7.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. This can result in an inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If you need to change or add DHCP subnets to bootstrap new nodes, wait until the dnsmasq pod becomes ready after the change, and only then create the BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the inspection error:
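A hedged example of such a verification; the managed-ns namespace and the test-worker-3 name match the examples used later in this procedure, and the referenced status fields are standard BareMetalHost fields, but the exact output may vary by version:
kubectl -n managed-ns get bmh test-worker-3 -o jsonpath='{.status.operationalStatus}{" "}{.status.errorType}{" "}{.status.errorMessage}{"\n"}'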
Verify whether the dnsmasq pod was in the Ready state when the inspection of the affected bare metal hosts (test-worker-3 in the example above) was started:
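One possible way to correlate the timestamps, assuming the dnsmasq pod runs in the kaas namespace with the app=dnsmasq label and that the inspection start time is recorded in the BareMetalHost operation history (all of these are assumptions):
# Inspection start time of the affected bare metal host
kubectl -n managed-ns get bmh test-worker-3 -o jsonpath='{.status.operationHistory.inspect.start}{"\n"}'
# Readiness and last termination time of the dnsmasq containers
kubectl -n kaas get pod -l app=dnsmasq -o jsonpath='{range .items[*].status.containerStatuses[*]}{.name}{"\t"}{.ready}{"\t"}{.lastState.terminated.finishedAt}{"\n"}{end}'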
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove the BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
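A minimal, hypothetical BareMetalHost template for illustration only; real objects in Container Cloud typically carry additional provider-specific labels, annotations, and credentials, so base yours on the original object definition:
kubectl -n managed-ns apply -f - <<EOF
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: test-worker-3
spec:
  online: true
  bootMACAddress: 0c:c4:7a:xx:xx:xx
  bmc:
    address: ipmi://<bmcIP>
    credentialsName: test-worker-3-bmc-secret
EOF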
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that was changed remains without an external IP address assigned, whereas the second service, which was changed later, has the external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
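For example, adding or updating an arbitrary label is sufficient as a dummy change; the label key and value below are placeholders:
kubectl -n <serviceNamespace> label service <serviceName> dummy-change=trigger --overwrite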
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during a CVE-related upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start slowly and even fail the startup probe, producing the following describe output in the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a bare metal-based managed cluster with Ceph enabled fails with the PersistentVolumeClaim getting stuck in the Pending state for the prometheus-server StatefulSet and the MountVolume.MountDevice failed for volume warning in the StackLight event logs.
Workaround:
Verify that the description of the Pods that failed to run contains the FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
During the replacement of a master node on a cluster of any type, the process may get stuck with Kubelet's NodeReady condition is Unknown in the machine status on the remaining master nodes.
As a workaround, log in to the affected node and run the following command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
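The exact alias depends on the MKE version and datastore settings; a rough sketch of the shape such an alias may take, where the image tag, mounts, and environment variables are assumptions to adapt to your environment:
alias calicoctl="docker run -i --rm --net=host \
  -v /var/run/calico:/var/run/calico \
  -v /root/.kube/config:/root/.kube/config:ro \
  -e DATASTORE_TYPE=kubernetes \
  -e KUBECONFIG=/root/.kube/config \
  mirantis/ucp-dsinfo:<mkeVersion> \
  calicoctl"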
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
StackLight¶[44193] OpenSearch reaches 85% disk usage watermark affecting the cluster state¶
On High Availability (HA) clusters that use Local Volume Provisioner (LVP),
Prometheus and OpenSearch from StackLight may share the same pool of storage.
In such a configuration, OpenSearch may approach the 85% disk usage watermark
due to the combined storage allocation and usage patterns set by the Persistent
Volume Claim (PVC) size parameters for Prometheus and OpenSearch, which consume
storage the most.
When the 85% threshold is reached, the affected node is transitioned to the
read-only state, preventing shard allocation and causing the OpenSearch cluster
state to transition to Warning (Yellow) or Critical (Red).
Caution
The issue and the provided workaround apply only for clusters on
which OpenSearch and Prometheus utilize the same storage pool.
Derived from .values.elasticsearch.persistentVolumeUsableStorageSizeGB,
defaulting to .values.elasticsearch.persistentVolumeClaimSize if
unspecified. To obtain the OpenSearch PVC size:
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
If the formula result is positive, it is an early indication that the
cluster is affected.
Verify whether the OpenSearchClusterStatusWarning or OpenSearchClusterStatusCritical alert is firing. If so, verify the following:
Log in to the OpenSearch web UI.
In Management -> Dev Tools, run the following command:
GET _cluster/allocation/explain
The following system response indicates that the corresponding node is affected:
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [xx.xxx%]"
Note
The system response may contain a higher watermark percentage than 85.0%, depending on the case.
Workaround:
Warning
The workaround implies adjustment of the retention threshold for OpenSearch. Depending on the new threshold, some old logs will be deleted.
Reserved_Percentage
A user-defined variable that specifies what percentage of the total storage capacity should not be used by OpenSearch or Prometheus, reserving space for other components. Express it as a decimal. For example, for a 5% reservation, Reserved_Percentage is 0.05. Mirantis recommends using 0.05 as a starting point.
Filesystem_Reserve
Percentage to deduct for filesystems that may reserve some portion of the
available storage, which is marked as occupied. For example, for EXT4, it
is 5% by default, so the value must be 0.05.
Prometheus_PVC_Size_GB
Sourced from .values.prometheusServer.persistentVolumeClaimSize.
Total_Storage_Capacity_GB
Total capacity of the OpenSearch PVCs. For LVP, the capacity of the
storage pool. To obtain the total capacity:
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
The calculation above provides the maximum safe storage to allocate for .values.elasticsearch.persistentVolumeUsableStorageSizeGB. Use it as a reference for setting .values.elasticsearch.persistentVolumeUsableStorageSizeGB on a cluster.
Wait up to 15-20 minutes for OpenSearch to perform the cleanup.
Verify that the cluster is no longer affected using the procedure above.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI due to the inability to add any label to the control plane machines and to change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container Cloud web UI and shows various access denied errors during the first five minutes after creation.
To work around the issue, refresh the browser five minutes after the project creation.
This section lists the artifacts of components included in the Container Cloud
patch release 2.28.3. For artifacts of the Cluster releases introduced in
2.28.3, see patch Cluster releases 17.2.7, 16.3.3, and
16.2.7.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
Support for the patch Cluster releases 16.2.6 and 17.2.6
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
24.2.4.
Support for MKE 3.7.16.
Bare metal: update of Ubuntu mirror from ubuntu-2024-10-14-013948 to
ubuntu-2024-10-28-012906 along with update of minor kernel version from
5.15.0-122-generic to 5.15.0-124-generic.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases 17.3.0 and 16.3.0. It does not support greenfield deployments based on deprecated Cluster releases; use the latest available Cluster release instead.
For main deliverables of the parent Container Cloud release of 2.28.2, refer
to 2.28.0.
In total, since Container Cloud 2.28.1, 15 Common Vulnerabilities and
Exposures (CVE) of high severity have been fixed in 2.28.2.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since Container Cloud 2.28.1.
The common CVEs are issues addressed across several images.
The following issues have been addressed in the Container Cloud patch release
2.28.2 along with the patch Cluster releases 16.3.2,
16.2.6, and 17.2.6.
[47741] [LCM] Fixed the issue with upgrade to MKE 3.7.15 getting stuck
due to the leftover ucp-upgrade-check-images service that is part of MKE
3.7.12.
[47304] [StackLight] Fixed the issue with OpenSearch not storing kubelet
logs due to the JSON-based format of ucp-kubelet logs.
This section lists known issues with workarounds for the Mirantis
Container Cloud release 2.28.2 including the Cluster releases 16.2.6,
16.3.2, and 17.2.6.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. This can result in an inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If you need to change or add DHCP subnets to bootstrap new nodes, wait until the dnsmasq pod becomes ready after the change, and only then create the BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the inspection error:
Verify whether the dnsmasq pod was in the Ready state when the inspection of the affected bare metal hosts (test-worker-3 in the example above) was started:
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove the BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that was changed remains without an external IP address assigned, whereas the second service, which was changed later, has the external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during a CVE-related upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start slowly and even fail the startup probe, producing the following describe output in the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a bare metal-based managed cluster with Ceph enabled fails with the PersistentVolumeClaim getting stuck in the Pending state for the prometheus-server StatefulSet and the MountVolume.MountDevice failed for volume warning in the StackLight event logs.
Workaround:
Verify that the description of the Pods that failed to run contains the FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
During the replacement of a master node on a cluster of any type, the process may get stuck with Kubelet's NodeReady condition is Unknown in the machine status on the remaining master nodes.
As a workaround, log in to the affected node and run the following command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
StackLight¶[47594] Patroni pods may get stuck in the CrashLoopBackOff state¶
The Patroni pods may get stuck in the CrashLoopBackOff state due to the patroni container being terminated with reason: OOMKilled, which you can see in the pod status. For example:
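One way to surface the termination reason, assuming the Patroni pods run in the stacklight namespace with the app=patroni label (both are assumptions):
kubectl -n stacklight get pod -l app=patroni -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'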
On High Availability (HA) clusters that use Local Volume Provisioner (LVP),
Prometheus and OpenSearch from StackLight may share the same pool of storage.
In such a configuration, OpenSearch may approach the 85% disk usage watermark
due to the combined storage allocation and usage patterns set by the Persistent
Volume Claim (PVC) size parameters for Prometheus and OpenSearch, which consume
storage the most.
When the 85% threshold is reached, the affected node is transitioned to the
read-only state, preventing shard allocation and causing the OpenSearch cluster
state to transition to Warning (Yellow) or Critical (Red).
Caution
The issue and the provided workaround apply only for clusters on
which OpenSearch and Prometheus utilize the same storage pool.
Derived from .values.elasticsearch.persistentVolumeUsableStorageSizeGB,
defaulting to .values.elasticsearch.persistentVolumeClaimSize if
unspecified. To obtain the OpenSearch PVC size:
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
If the formula result is positive, it is an early indication that the
cluster is affected.
Verify whether the OpenSearchClusterStatusWarning or OpenSearchClusterStatusCritical alert is firing. If so, verify the following:
Log in to the OpenSearch web UI.
In Management -> Dev Tools, run the following command:
GET _cluster/allocation/explain
The following system response indicates that the corresponding node is affected:
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [xx.xxx%]"
Note
The system response may contain a higher watermark percentage than 85.0%, depending on the case.
Workaround:
Warning
The workaround implies adjustment of the retention threshold for OpenSearch. Depending on the new threshold, some old logs will be deleted.
Reserved_Percentage
A user-defined variable that specifies what percentage of the total storage capacity should not be used by OpenSearch or Prometheus, reserving space for other components. Express it as a decimal. For example, for a 5% reservation, Reserved_Percentage is 0.05. Mirantis recommends using 0.05 as a starting point.
Filesystem_Reserve
Percentage to deduct for filesystems that may reserve some portion of the
available storage, which is marked as occupied. For example, for EXT4, it
is 5% by default, so the value must be 0.05.
Prometheus_PVC_Size_GB
Sourced from .values.prometheusServer.persistentVolumeClaimSize.
Total_Storage_Capacity_GB
Total capacity of the OpenSearch PVCs. For LVP, the capacity of the
storage pool. To obtain the total capacity:
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
The calculation above provides the maximum safe storage to allocate for .values.elasticsearch.persistentVolumeUsableStorageSizeGB. Use it as a reference for setting .values.elasticsearch.persistentVolumeUsableStorageSizeGB on a cluster.
Wait up to 15-20 minutes for OpenSearch to perform the cleanup.
Verify that the cluster is no longer affected using the procedure above.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI due to the inability to add any label to the control plane machines and to change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container Cloud web UI and shows various access denied errors during the first five minutes after creation.
To work around the issue, refresh the browser five minutes after the project creation.
This section lists the artifacts of components included in the Container Cloud
patch release 2.28.2. For artifacts of the Cluster releases introduced in
2.28.2, see patch Cluster releases 17.2.6, 16.3.2, and
16.2.6.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
Support for the patch Cluster releases 16.2.5 and 17.2.5
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
24.2.3.
Support for MKE 3.7.15.
Bare metal: update of Ubuntu mirror from 2024-09-11-014225 to
ubuntu-2024-10-14-013948 along with update of minor kernel version from
5.15.0-119-generic to 5.15.0-122-generic.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases 17.3.0 and 16.3.0. It does not support greenfield deployments based on deprecated Cluster releases; use the latest available Cluster release instead.
For main deliverables of the parent Container Cloud release of 2.28.1, refer
to 2.28.0.
In total, since Container Cloud 2.28.0, 400 Common Vulnerabilities and
Exposures (CVE) have been fixed in 2.28.1: 46 of critical and 354 of
high severity.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since Container Cloud 2.28.0.
The common CVEs are issues addressed across several images.
This section lists known issues with workarounds for the Mirantis
Container Cloud release 2.28.1 including the Cluster releases 16.2.5,
16.3.1, and 17.2.5.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. This can result in an inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If you need to change or add DHCP subnets to bootstrap new nodes, wait until the dnsmasq pod becomes ready after the change, and only then create the BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the inspection error:
Verify whether the dnsmasq pod was in the Ready state when the inspection of the affected bare metal hosts (test-worker-3 in the example above) was started:
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove the BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that was changed remains without an external IP address assigned, whereas the second service, which was changed later, has the external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during a CVE-related upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start slowly and even fail the startup probe, producing the following describe output in the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a bare metal-based managed cluster with Ceph enabled fails with the PersistentVolumeClaim getting stuck in the Pending state for the prometheus-server StatefulSet and the MountVolume.MountDevice failed for volume warning in the StackLight event logs.
Workaround:
Verify that the description of the Pods that failed to run contains the FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
During the replacement of a master node on a cluster of any type, the process may get stuck with Kubelet's NodeReady condition is Unknown in the machine status on the remaining master nodes.
As a workaround, log in to the affected node and run the following command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
StackLight¶[47594] Patroni pods may get stuck in the CrashLoopBackOff state¶
The Patroni pods may get stuck in the CrashLoopBackOff state due to the patroni container being terminated with reason: OOMKilled, which you can see in the pod status. For example:
Due to the JSON-based format of ucp-kubelet logs, OpenSearch does not store kubelet logs. Mirantis is working on the issue and will deliver the resolution in one of the upcoming patch releases.
[44193] OpenSearch reaches 85% disk usage watermark affecting the cluster state¶
On High Availability (HA) clusters that use Local Volume Provisioner (LVP),
Prometheus and OpenSearch from StackLight may share the same pool of storage.
In such a configuration, OpenSearch may approach the 85% disk usage watermark
due to the combined storage allocation and usage patterns set by the Persistent
Volume Claim (PVC) size parameters for Prometheus and OpenSearch, which consume
storage the most.
When the 85% threshold is reached, the affected node is transitioned to the
read-only state, preventing shard allocation and causing the OpenSearch cluster
state to transition to Warning (Yellow) or Critical (Red).
Caution
The issue and the provided workaround apply only for clusters on
which OpenSearch and Prometheus utilize the same storage pool.
Derived from .values.elasticsearch.persistentVolumeUsableStorageSizeGB,
defaulting to .values.elasticsearch.persistentVolumeClaimSize if
unspecified. To obtain the OpenSearch PVC size:
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
If the formula result is positive, it is an early indication that the
cluster is affected.
Verify whether the OpenSearchClusterStatusWarning or OpenSearchClusterStatusCritical alert is firing. If so, verify the following:
Log in to the OpenSearch web UI.
In Management -> Dev Tools, run the following command:
GET _cluster/allocation/explain
The following system response indicates that the corresponding node is affected:
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [xx.xxx%]"
Note
The system response may contain a higher watermark percentage than 85.0%, depending on the case.
Workaround:
Warning
The workaround implies adjustment of the retention threshold for OpenSearch. Depending on the new threshold, some old logs will be deleted.
Reserved_Percentage
A user-defined variable that specifies what percentage of the total storage capacity should not be used by OpenSearch or Prometheus, reserving space for other components. Express it as a decimal. For example, for a 5% reservation, Reserved_Percentage is 0.05. Mirantis recommends using 0.05 as a starting point.
Filesystem_Reserve
Percentage to deduct for filesystems that may reserve some portion of the
available storage, which is marked as occupied. For example, for EXT4, it
is 5% by default, so the value must be 0.05.
Prometheus_PVC_Size_GB
Sourced from .values.prometheusServer.persistentVolumeClaimSize.
Total_Storage_Capacity_GB
Total capacity of the OpenSearch PVCs. For LVP, the capacity of the
storage pool. To obtain the total capacity:
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
The calculation above provides the maximum safe storage to allocate for .values.elasticsearch.persistentVolumeUsableStorageSizeGB. Use it as a reference for setting .values.elasticsearch.persistentVolumeUsableStorageSizeGB on a cluster.
Wait up to 15-20 minutes for OpenSearch to perform the cleanup.
Verify that the cluster is no longer affected using the procedure above.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI due to the inability to add any label to the control plane machines and to change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container Cloud web UI and shows various access denied errors during the first five minutes after creation.
To work around the issue, refresh the browser five minutes after the project creation.
This section lists the artifacts of components included in the Container Cloud
patch release 2.28.1. For artifacts of the Cluster releases introduced in
2.28.1, see patch Cluster releases 17.2.5, 16.3.1, and
16.2.5.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
Does not support greenfield deployments on deprecated Cluster releases
of the 17.2.x and 16.2.x series. Use the latest available Cluster releases
of the series instead.
Caution
Make sure to update the Cluster release version
of your managed cluster before the current Cluster release
version becomes unsupported by a new Container Cloud release
version.
Otherwise, Container Cloud stops auto-upgrade and eventually
Container Cloud itself becomes unsupported.
This section outlines release notes for the Container Cloud release 2.28.0.
This section outlines new features and enhancements introduced in the
Container Cloud release 2.28.0. For the list of enhancements delivered with
the Cluster releases introduced by Container Cloud 2.28.0, see
17.3.0 and 16.3.0.
General availability for Ubuntu 22.04 on MOSK clusters¶
Implemented full support for Ubuntu 22.04 LTS (Jammy Jellyfish) as the default
host operating system in MOSK clusters, including greenfield
deployments and update from Ubuntu 20.04 to 22.04 on existing clusters.
Ubuntu 20.04 is deprecated for greenfield deployments and supported during the
MOSK 24.3 release cycle only for existing clusters.
Warning
During the course of the Container Cloud 2.28.x series, Mirantis highly recommends upgrading the operating system on all nodes of your managed cluster machines to Ubuntu 22.04 before the next major Cluster release becomes available.
It is not mandatory to upgrade all machines at once. You can upgrade them
one by one or in small batches, for example, if the maintenance window is
limited in time.
Otherwise, the Cluster release update of the Ubuntu 20.04-based managed
clusters will become impossible as of Container Cloud 2.29.0 with Ubuntu
22.04 as the only supported version.
Management cluster update to Container Cloud 2.29.1 will be blocked if
at least one node of any related managed cluster is running Ubuntu 20.04.
Note
Since Container Cloud 2.27.0 (Cluster release 16.2.0), existing
MOSK management clusters were automatically updated to
Ubuntu 22.04 during cluster upgrade. Greenfield deployments of management
clusters are also based on Ubuntu 22.04.
Day-2 operations for bare metal: updating modules¶
TechPreview
Implemented the capability to update custom modules using deprecation. Once
you create a new custom module, you can use it to deprecate another module
by adding the deprecates field to metadata.yaml of the new module.
The related HostOSConfiguration and HostOSConfigurationModules objects
reflect the deprecation status of new and old modules using the corresponding
fields in spec and status sections.
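A hypothetical metadata.yaml fragment illustrating the idea; the exact schema of the deprecates field may differ, so consult the day-2 operations reference before using it:
# metadata.yaml of the new custom module
name: custom-sysctl
version: 1.1.0
deprecates:
  - name: custom-sysctl
    version: 1.0.0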
Also, added monitoring of deprecated modules by implementing the StackLight
metrics for the Host Operating System Modules Controller along with the
Day2ManagementControllerTargetDown and Day2ManagementDeprecatedConfigs
alerts to notify the cloud operator about detected deprecations and issues with
host-os-modules-controller.
Note
Deprecation is soft, meaning that no actual restrictions are applied
to the usage of a deprecated module.
Caution
Deprecating a version automatically deprecates all lower SemVer
versions of the specified module.
Day-2 operations for bare metal: configuration enhancements for modules¶
TechPreview
Introduced the following configuration enhancements for custom modules:
Module-specific Ansible configuration
Updated the Ansible execution mechanism for running any modules. The default
ansible.cfg file is now placed in /etc/ansible/mcc.cfg and used for
execution of lcm-ansible and day-2 modules. However, if a module has its
own ansible.cfg in the module root folder, such configuration is used
for the module execution instead of the default one.
Configuration of supported operating system distribution
Added the supportedDistributions field to the metadata section of a module custom resource to define the list of supported operating system distributions for the module. This field is informative and does not block the module execution on machines running non-supported distributions, but such execution will most probably complete with an error.
Separate flag for machines requiring reboot
Introduced a separate /run/day2/reboot-required file for day-2 modules
to add a notification about required reboot for a machine and a reason for
reboot that appear after the module execution. The feature allows for
separation of the reboot reason between LCM and day-2 operations.
Implemented the update group for controller nodes using the UpdateGroup
resource, which is automatically generated during initial cluster creation with
the following settings:
Name: <cluster-name>-control
Index: 1
Concurrent updates: 1
This feature decouples the concurrency settings from the global cluster level
and provides update flexibility.
All control plane nodes are automatically assigned to the control update
group with no possibility to change it.
Note
On existing clusters created before 2.28.0 (Cluster releases 17.2.0,
16.2.0, or earlier), the control update group is created after upgrade of
the Container Cloud release to 2.28.0 (Cluster release 16.3.0) on the
management cluster.
Implemented the rebootIfUpdateRequires parameter for the UpdateGroup
custom resource. The parameter allows for rebooting a set of controller or
worker machines added to an update group during a Cluster release update that
requires a reboot, for example, when kernel version update is available in the
target Cluster release. The feature reduces manual intervention and overall
downtime during cluster update.
Note
By default, rebootIfUpdateRequires is set to false on managed
clusters and to true on management clusters.
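A sketch of what such an UpdateGroup object may look like, combining the settings described above; the apiVersion and spec field names are assumptions to verify against the UpdateGroup API reference:
apiVersion: kaas.mirantis.com/v1alpha1
kind: UpdateGroup
metadata:
  name: <cluster-name>-control
  namespace: <project-name>
spec:
  index: 1
  concurrentUpdates: 1
  rebootIfUpdateRequires: false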
Self-diagnostics for management and managed clusters¶
Implemented the Diagnostic Controller that is a tool with a set of diagnostic
checks to perform self-diagnostics of any Container Cloud cluster and help the
operator to easily understand, troubleshoot, and resolve potential issues
against the following major subsystems: core, bare metal, Ceph, StackLight,
Tungsten Fabric, and OpenStack. The Diagnostic Controller analyzes the
configuration of the cluster subsystems and reports results of checks that
contain useful information about cluster health.
Running self-diagnostics on both management and managed clusters is essential
to ensure the overall health and optimal performance of your cluster. Mirantis
recommends running self-diagnostics before cluster update, node replacement, or
any other significant changes in the cluster to prevent potential issues and
optimize maintenance window.
Simplified the default auditd configuration by implementing the preset
groups that you can use in presetRules instead of exact names or the
virtual group all. The feature allows enabling a limited set of presets
using a single keyword (group name).
Also, optimized disk usage by removing the following Docker rule, which was dropped from the Docker CIS Benchmark 1.3.0 due to producing excessive events:
# 1.2.4 Ensure auditing is configured for Docker files and directories - /var/lib/docker
-w /var/lib/docker -k docker
Enhanced the ClusterUpdatePlan object by adding a separate update step for
each UpdateGroup of worker nodes of a managed cluster. The feature allows
the operator to granularly control the update process and its impact on
workloads, with the option to pause the update after each step.
Also, added several StackLight alerts to notify the operator about the update
progress and potential update issues.
Refactoring of delayed auto-update of a management cluster¶
Refactored the MCCUpgrade object by implementing a new mechanism to delay
Container Cloud release updates. You now have the following options for
auto-update of a management cluster:
Automatically update a cluster on the publish day of a new release
(by default).
Set specific days and hours for an auto-update allowing delays of up to one
week. For example, if a release becomes available on Monday, you can delay it
until Sunday by setting Sunday as the only permitted day for auto-updates.
Delay auto-update for a minimum of 20 days for each newly discovered release.
The exact number of delay days is set in the release metadata and cannot be
changed by the user. It depends on the specifics of each release cycle and
on optional configuration of week days and hours selected for update.
You can verify the exact date of a scheduled auto-update either in the
Status section of the Management Cluster Updates
page in the web UI or in the status section of the MCCUpgrade
object.
Combine auto-update delay with the specific days and hours setting
(two previous options).
Also, optimized monitoring of auto-update by implementing several StackLight
metrics for the kaas-exporter job along with the MCCUpdateBlocked and
MCCUpdateScheduled alerts to notify the cloud operator about new releases
as well as other important information about management cluster auto-update.
On top of continuous improvements delivered to the existing Container Cloud
guides, added documentation on how to run Ceph performance tests using
Kubernetes batch or cron jobs that run
fio
processes according to a predefined KaaSCephOperationRequest CR.
The following issues have been addressed in the Mirantis Container Cloud
release 2.28.0 along with the Cluster releases 17.3.0 and
16.3.0.
Note
This section provides descriptions of issues addressed since
the last Container Cloud patch release 2.27.4.
For details on addressed issues in earlier patch releases since 2.27.0,
which are also included into the major release 2.28.0, refer to
2.27.x patch releases.
[41305] [Bare metal] Fixed the issue with newly added management cluster
nodes failing to undergo provisioning if the management cluster nodes were
configured with a single L2 segment used for all network traffic (PXE and
LCM/management networks).
[46245] [Bare metal] Fixed the issue with lack of permissions for
serviceuser and users with the global-admin and operator
roles to fetch
HostOSConfigurationModules and
HostOSConfiguration custom resources.
[43164] [StackLight] Fixed the issue with the rollover policy not being added to indices created without a policy.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. This can result in an inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If you need to change or add DHCP subnets to bootstrap new nodes, wait until the dnsmasq pod becomes ready after the change, and only then create the BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the inspection error:
Verify whether the dnsmasq pod was in the Ready state when the inspection of the affected bare metal hosts (test-worker-3 in the example above) was started:
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove the BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that was changed remains without an external IP address assigned, whereas the second service, which was changed later, has the external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during a CVE-related upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start slowly and even fail the startup probe, producing the following describe output in the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a bare metal-based managed cluster with Ceph enabled fails with the PersistentVolumeClaim getting stuck in the Pending state for the prometheus-server StatefulSet and the MountVolume.MountDevice failed for volume warning in the StackLight event logs.
Workaround:
Verify that the description of the Pods that failed to run contains the FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
After upgrade of kernel to the latest supported version, old kernel
metapackages may remain on the cluster. The issue occurs if the system kernel
line is changed from LTS to HWE. This setting is controlled by the
upgrade_kernel_version parameter located in the ClusterRelease object
under the deploy StateItem. As a result, the operating system has both
LTS and HWE kernel packages installed and regularly updated, but only one
kernel image is used (loaded into memory). The unused kernel images consume a minimal amount of disk space.
Therefore, you can safely disregard the issue because it does not affect
cluster operability. If you still require removing unused kernel metapackages,
contact Mirantis support for detailed
instructions.
[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
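A minimal sketch of such an alias, assuming the mirantis/ucp-dsinfo image tag matches your MKE version and that calicoctl reads the Kubernetes datastore through the admin kubeconfig (adjust the tag, mounts, and environment variables to your environment):
alias calicoctl='docker run -i --rm --net=host -e DATASTORE_TYPE=kubernetes -e KUBECONFIG=/etc/kubernetes/admin.conf -v /etc/kubernetes:/etc/kubernetes:ro mirantis/ucp-dsinfo:<mkeVersion> calicoctl'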
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
StackLight¶[47594] Patroni pods may get stuck in the CrashLoopBackOff state¶
The Patroni pods may get stuck in the CrashLoopBackOff state due to the
patroni container being terminated with reason: OOMKilled that you can
see in the pod status. For example:
Due to the JSON-based format of ucp-kubelet logs, OpenSearch does not store
kubelet logs. Mirantis is working on the issue and will deliver the resolution
in one of the upcoming patch releases.
[44193] OpenSearch reaches 85% disk usage watermark affecting the cluster state¶
On High Availability (HA) clusters that use Local Volume Provisioner (LVP),
Prometheus and OpenSearch from StackLight may share the same pool of storage.
In such configuration, OpenSearch may approach the 85% disk usage watermark
due to the combined storage allocation and usage patterns set by the Persistent
Volume Claim (PVC) size parameters for Prometheus and OpenSearch, which consume
storage the most.
When the 85% threshold is reached, the affected node is transitioned to the
read-only state, preventing shard allocation and causing the OpenSearch cluster
state to transition to Warning (Yellow) or Critical (Red).
Caution
The issue and the provided workaround apply only for clusters on
which OpenSearch and Prometheus utilize the same storage pool.
Derived from .values.elasticsearch.persistentVolumeUsableStorageSizeGB,
defaulting to .values.elasticsearch.persistentVolumeClaimSize if
unspecified. To obtain the OpenSearch PVC size:
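For example, assuming StackLight runs in the stacklight namespace and the OpenSearch PVCs carry the app=opensearch-master label (both are assumptions):
kubectl -n stacklight get pvc -l app=opensearch-master -o custom-columns=NAME:.metadata.name,CAPACITY:.status.capacity.storage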
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
If the formula result is positive, it is an early indication that the
cluster is affected.
Verify whether the OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical alert is firing. If so,
verify the following:
Log in to the OpenSearch web UI.
In Management -> Dev Tools, run the following command:
GET _cluster/allocation/explain
The following system response indicates that the corresponding node is
affected:
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [xx.xxx%]"
Note
The system response may contain an even higher watermark percentage
than 85.0%, depending on the case.
Workaround:
Warning
The workaround implies adjustment of the retention threshold for
OpenSearch. Depending on the new threshold, some old logs will be
deleted.
Reserved_Percentage
A user-defined variable that specifies what percentage of the total storage
capacity should not be used by OpenSearch or Prometheus. This is used to
reserve space for other components. It should be expressed as a decimal.
For example, for a 5% reservation, Reserved_Percentage is 0.05.
Mirantis recommends using 0.05 as a starting point.
Filesystem_Reserve
Percentage to deduct for filesystems that may reserve some portion of the
available storage, which is marked as occupied. For example, for EXT4, it
is 5% by default, so the value must be 0.05.
Prometheus_PVC_Size_GB
Sourced from .values.prometheusServer.persistentVolumeClaimSize.
Total_Storage_Capacity_GB
Total capacity of the OpenSearch PVCs. For LVP, the capacity of the
storage pool. To obtain the total capacity:
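For example, assuming the stacklight namespace, the opensearch container name, and the default data mount path (all three are assumptions), you can check the capacity visible to each opensearch-master Pod:
kubectl -n stacklight get pod -l app=opensearch-master -o name | xargs -I{} kubectl -n stacklight exec {} -c opensearch -- df -h /usr/share/opensearch/data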
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
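The formula itself is not reproduced above; based on the variable definitions, it presumably has the following shape:
(Total_Storage_Capacity_GB * (1 - Reserved_Percentage - Filesystem_Reserve)) - Prometheus_PVC_Size_GB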
Calculating the formula above provides the maximum safe storage to allocate
for .values.elasticsearch.persistentVolumeUsableStorageSizeGB. Use this
formula as a reference for setting
.values.elasticsearch.persistentVolumeUsableStorageSizeGB on a cluster.
Wait up to 15-20 minutes for OpenSearch to perform the cleaning.
Verify that the cluster is not affected anymore using the procedure above.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines and to
change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container
Cloud web UI and shows various access denied errors during the first five
minutes after creation.
To work around the issue, wait for five minutes after the project creation and
then refresh the browser.
The following table lists the major components and their versions delivered in
Container Cloud 2.28.0. The components that are newly added, updated,
deprecated, or removed as compared to 2.27.0, are marked with a corresponding
superscript, for example, admission-controllerUpdated.
This section lists the artifacts of components included in the Container Cloud
release 2.28.0. The components that are newly added, updated,
deprecated, or removed as compared to 2.27.0, are marked with a corresponding
superscript, for example, admission-controllerUpdated.
In total, since Container Cloud 2.27.0, in 2.28.0, 2614
Common Vulnerabilities and Exposures (CVE) have been fixed:
299 of critical and 2315 of high severity.
The table below includes the total numbers of addressed unique and common
vulnerabilities and exposures (CVE) by product component since the 2.27.4
patch release. The common CVEs are issues addressed across several images.
This section describes the specific actions you as a cloud operator need to
complete before or after your Container Cloud cluster update to the Cluster
releases 17.3.0 or 16.3.0.
Pre-update actions¶Change label values in Ceph metrics used in customizations¶
Note
If you do not use Ceph metrics in any customizations, for example,
custom alerts, Grafana dashboards, or queries in custom workloads, skip
this section.
After deprecating the performance metric exporter that is integrated into the
Ceph Manager daemon for the sake of the dedicated Ceph Exporter daemon in
Container Cloud 2.27.0, you may need to update values of several labels in Ceph
metrics if you use them in any customizations such as custom alerts, Grafana
dashboards, or queries in custom tools. These labels are changed in Container
Cloud 2.28.0 (Cluster releases 16.3.0 and 17.3.0).
Note
Names of metrics are not changed, and no metrics are removed.
All Ceph metrics to be collected by the Ceph Exporter daemon changed their
labels job and instance due to scraping metrics from new Ceph Exporter
daemon instead of the performance metric exporter of Ceph Manager:
Values of the job labels are changed from rook-ceph-mgr to
prometheus-rook-exporter for all Ceph metrics moved to Ceph
Exporter. The full list of moved metrics is presented below.
Values of the instance labels are changed from the metric endpoint
of Ceph Manager with port 9283 to the metric endpoint of Ceph Exporter
with port 9926 for all Ceph metrics moved to Ceph Exporter. The full
list of moved metrics is presented below.
Values of the instance_id labels of Ceph metrics from the RADOS
Gateway (RGW) daemons are changed from the daemon GID to the daemon
subname. For example, instead of instance_id="<RGW_PROCESS_GID>",
the instance_id="a" (ceph_rgw_qlen{instance_id="a"}) is now
used. The list of moved Ceph RGW metrics is presented below.
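For example, a custom alert or Grafana query that selects one of the moved metrics by the old labels changes roughly as follows (ceph_osd_op_w is used only as an illustrative metric, and the endpoint addresses are placeholders):
ceph_osd_op_w{job="rook-ceph-mgr", instance="<ceph-mgr-endpoint>:9283"}
# changes to
ceph_osd_op_w{job="prometheus-rook-exporter", instance="<ceph-exporter-endpoint>:9926"}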
List of affected Ceph RGW metrics
ceph_rgw_cache_.*
ceph_rgw_failed_req
ceph_rgw_gc_retire_object
ceph_rgw_get.*
ceph_rgw_keystone_.*
ceph_rgw_lc_.*
ceph_rgw_lua_.*
ceph_rgw_pubsub_.*
ceph_rgw_put.*
ceph_rgw_qactive
ceph_rgw_qlen
ceph_rgw_req
List of all metrics to be collected by Ceph Exporter instead of
Ceph Manager
ceph_bluefs_.*
ceph_bluestore_.*
ceph_mds_cache_.*
ceph_mds_caps
ceph_mds_ceph_.*
ceph_mds_dir_.*
ceph_mds_exported_inodes
ceph_mds_forward
ceph_mds_handle_.*
ceph_mds_imported_inodes
ceph_mds_inodes.*
ceph_mds_load_cent
ceph_mds_log_.*
ceph_mds_mem_.*
ceph_mds_openino_dir_fetch
ceph_mds_process_request_cap_release
ceph_mds_reply_.*
ceph_mds_request
ceph_mds_root_.*
ceph_mds_server_.*
ceph_mds_sessions_.*
ceph_mds_slow_reply
ceph_mds_subtrees
ceph_mon_election_.*
ceph_mon_num_.*
ceph_mon_session_.*
ceph_objecter_.*
ceph_osd_numpg.*
ceph_osd_op.*
ceph_osd_recovery_.*
ceph_osd_stat_.*
ceph_paxos.*
ceph_prioritycache.*
ceph_purge.*
ceph_rgw_cache_.*
ceph_rgw_failed_req
ceph_rgw_gc_retire_object
ceph_rgw_get.*
ceph_rgw_keystone_.*
ceph_rgw_lc_.*
ceph_rgw_lua_.*
ceph_rgw_pubsub_.*
ceph_rgw_put.*
ceph_rgw_qactive
ceph_rgw_qlen
ceph_rgw_req
ceph_rocksdb_.*
Post-update actions¶Manually disable collection of performance metrics by Ceph Manager (optional)¶
Since Container Cloud 2.28.0 (Cluster releases 17.3.0 and 16.3.0), Ceph cluster
metrics are collected by the dedicated Ceph Exporter daemon. At the same time,
same metrics are still available to be collected by the Ceph Manager daemon.
To improve performance of the Ceph Manager daemon, you can manually disable
collection of performance metrics by Ceph Manager, which are collected by the
Ceph Exporter daemon.
To disable performance metrics for the Ceph Manager daemon, add the following
parameter to the KaaSCephCluster spec in the rookConfig section:
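The parameter itself is not reproduced here. A sketch under the assumption that the product relies on the upstream Ceph mgr/prometheus/exclude_perf_counters option and that rookConfig resides under cephClusterSpec (verify both against the KaaSCephCluster reference before applying):
spec:
  cephClusterSpec:
    rookConfig:
      # assumed option name from the upstream Ceph Prometheus module
      "mgr/prometheus/exclude_perf_counters": "true"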
Once you add this option, Ceph performance metrics are collected by the Ceph
Exporter daemon only. For more details, see Official Ceph documentation.
Upgrade to Ubuntu 22.04 on baremetal-based clusters¶
In Container Cloud 2.29.0, the Cluster release update of the Ubuntu 20.04-based
managed clusters will become impossible, and Ubuntu 22.04 will become the only
supported version of the operating system. Therefore, ensure that every node of
your managed clusters is running Ubuntu 22.04 to unblock the managed cluster
update in Container Cloud 2.29.0.
Warning
Management cluster update to Container Cloud 2.29.1 will be blocked
if at least one node of any related managed cluster is running Ubuntu 20.04.
It is not mandatory to upgrade all machines at once. You can upgrade them
one by one or in small batches, for example, if the maintenance window is
limited in time.
Note
Existing management clusters were automatically updated to Ubuntu
22.04 during cluster upgrade to the Cluster release 16.2.0 in Container
Cloud 2.27.0. Greenfield deployments of management clusters are also based
on Ubuntu 22.04.
Warning
Usage of third-party software, which is not part of
Mirantis-supported configurations, for example, the use of custom DPDK
modules, may block upgrade of an operating system distribution. Users are
fully responsible for ensuring the compatibility of such custom components
with the latest supported Ubuntu version.
For MOSK clusters, Container Cloud 2.27.4 is the
second patch release of MOSK 24.2.x series using the patch
Cluster release 17.2.4. For the update path of 24.1 and 24.2 series, see
MOSK documentation: Cluster update scheme.
The Container Cloud patch release 2.27.4, which is based on the
2.27.0 major release, provides the following updates:
Support for the patch Cluster releases 16.2.4 and 17.2.4
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
24.2.2.
Bare metal: update of Ubuntu mirror from ubuntu-2024-08-06-014502 to
ubuntu-2024-08-21-014714 along with update of the minor kernel version from
5.15.0-117-generic to 5.15.0-119-generic for Jammy and to 5.15.0-118-generic
for Focal.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases
17.2.0 and 16.2.0. It does not support greenfield
deployments based on deprecated Cluster releases. Use the latest available Cluster release
instead.
For main deliverables of the parent Container Cloud release of 2.27.4, refer
to 2.27.0.
In total, since Container Cloud 2.27.3, 131 Common Vulnerabilities and
Exposures (CVE) have been fixed in 2.27.4: 15 of critical and 116 of
high severity.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since Container Cloud 2.27.3.
The common CVEs are issues addressed across several images.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. That can result in
inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If you need to change or add DHCP subnets to bootstrap new nodes,
wait until the dnsmasq pod becomes Ready after the change, and only
then create the BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the
inspection error:
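For example, assuming the hosts reside in the managed-ns namespace used in this example:
kubectl -n managed-ns get bmh -o custom-columns=NAME:.metadata.name,STATE:.status.provisioning.state,ERROR:.status.errorType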
Verify whether the dnsmasq pod was in Ready state when the
inspection of the affected baremetal hosts (test-worker-3 in the example
above) was started:
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove the BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
When trying to list the HostOSConfigurationModules and HostOSConfiguration
custom resources, serviceuser or a user with the global-admin or
operator role obtains the access denied error.
For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
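For example, assuming the pods run in the kaas namespace of the management cluster (adjust the namespace if needed):
kubectl -n kaas get pod -o wide | grep -E 'dnsmasq|dhcp-relay'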
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
LCM¶[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during CVE upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start
slowly and even fail the startup probe, producing the following describe output
in the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contains the
FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
Scale up the affected StatefulSet or Deployment back to the
original number of replicas and wait until its state becomes Running.
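For example, assuming the prometheus-server StatefulSet in the stacklight namespace with an original replica count of 1 (adjust both values to your cluster):
kubectl -n stacklight scale statefulset prometheus-server --replicas=1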
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines and to
change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container
Cloud web UI and shows various access denied errors during the first five
minutes after creation.
To work around the issue, wait for five minutes after the project creation and
then refresh the browser.
Patch cluster update¶[49713] Patch update is stuck with some nodes in Prepare state¶
Patch update from 2.27.3 to 2.27.4 may get stuck with one or more management
cluster nodes remaining in the Prepare state and with the following error
in the lcm-controller logs on the management cluster:
failed to create cluster updater for cluster default/kaas-mgmt: machine update group not found for machine default/master-0
To work around the issue, in the LCMMachine objects of the management
cluster, set the following annotation:
This section lists the artifacts of components included in the Container Cloud
patch release 2.27.4. For artifacts of the Cluster releases introduced in
2.27.4, see patch Cluster releases 16.2.4 and 17.2.4.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
For MOSK clusters, Container Cloud 2.27.3 is the
first patch release of MOSK 24.2.x series using the patch
Cluster release 17.2.3. For the update path of 24.1 and 24.2 series, see
MOSK documentation: Cluster update scheme.
The Container Cloud patch release 2.27.3, which is based on the
2.27.0 major release, provides the following updates:
Support for the patch Cluster releases 16.2.3 and 17.2.3
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
24.2.1.
MKE:
Support for MKE 3.7.12.
Improvements in the MKE benchmark compliance (control ID 5.1.5): analyzed
and fixed the majority of failed compliance checks for the following
components:
Container Cloud: iam-keycloak in the kaas namespace and
opensearch-dashboards in the stacklight namespace
MOSK: opensearch-dashboards in the stacklight
namespace
Bare metal: update of Ubuntu mirror from ubuntu-2024-07-16-014744 to
ubuntu-2024-08-06-014502 along with update of the minor kernel version from
5.15.0-116-generic to 5.15.0-117-generic.
VMware vSphere: suspension of support for cluster deployment, update, and
attachment. For details, see Deprecation notes.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases
17.2.0 and 16.2.0. It does not support greenfield
deployments based on deprecated Cluster releases. Use the latest available Cluster release
instead.
For main deliverables of the parent Container Cloud release of 2.27.3, refer
to 2.27.0.
In total, since Container Cloud 2.27.2, 1559 Common Vulnerabilities and
Exposures (CVE) have been fixed in 2.27.3: 253 of critical and 1306 of
high severity.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since Container Cloud 2.27.2.
The common CVEs are issues addressed across several images.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. That can result in
inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If you need to change or add DHCP subnets to bootstrap new nodes,
wait until the dnsmasq pod becomes Ready after the change, and only
then create the BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the
inspection error:
Verify whether the dnsmasq pod was in Ready state when the
inspection of the affected baremetal hosts (test-worker-3 in the example
above) was started:
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove the BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
When trying to list the HostOSConfigurationModules and HostOSConfiguration
custom resources, serviceuser or a user with the global-admin or
operator role obtains the access denied error.
For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
LCM¶[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during CVE upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start
slowly and even fail the startup probe, producing the following describe output
in the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contains the
FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
Scale up the affected StatefulSet or Deployment back to the
original number of replicas and wait until its state becomes Running.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines and to
change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container
Cloud web UI and shows various access denied errors during the first five
minutes after creation.
To work around the issue, wait for five minutes after the project creation and
then refresh the browser.
This section lists the artifacts of components included in the Container Cloud
patch release 2.27.3. For artifacts of the Cluster releases introduced in
2.27.3, see patch Cluster releases 16.2.3 and 17.2.3.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
For MOSK clusters, Container Cloud 2.27.2 is the
continuation for MOSK 24.1.x series using the patch Cluster
release 17.1.7. For the update path of 24.1 and 24.2 series, see
MOSK documentation: Cluster update scheme.
The management cluster of a MOSK 24.1, 24.1.5, or 24.1.6
cluster is automatically updated to the latest patch Cluster release
16.2.2.
The Container Cloud patch release 2.27.2, which is based on the
2.27.0 major release, provides the following updates:
Support for the patch Cluster releases 16.1.7 and 17.1.7
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
24.1.7.
Support for MKE 3.7.11.
Bare metal: update of Ubuntu mirror from ubuntu-2024-06-27-095142 to
ubuntu-2024-07-16-014744 along with update of minor kernel version from
5.15.0-113-generic to 5.15.0-116-generic (Cluster release 16.2.2).
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases
17.2.0 and 16.2.0. It does not support greenfield
deployments based on deprecated Cluster releases. Use the latest available Cluster release
instead.
For main deliverables of the parent Container Cloud release of 2.27.2, refer
to 2.27.0.
In total, since Container Cloud 2.27.1, 95 Common Vulnerabilities and
Exposures (CVE) have been fixed in 2.27.2: 6 of critical and 89 of
high severity.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since Container Cloud 2.27.1.
The common CVEs are issues addressed across several images.
This section lists known issues with workarounds for the Mirantis
Container Cloud release 2.27.2 including the Cluster releases 16.2.2,
16.1.7, and 17.1.7.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. That can result in
inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If you need to change or add DHCP subnets to bootstrap new nodes,
wait until the dnsmasq pod becomes Ready after the change, and only
then create the BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the
inspection error:
Verify whether the dnsmasq pod was in Ready state when the
inspection of the affected baremetal hosts (test-worker-3 in the example
above) was started:
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove the BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
When trying to list the HostOSConfigurationModules and HostOSConfiguration
custom resources, serviceuser or a user with the global-admin or
operator role obtains the access denied error.
For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
LCM¶[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during CVE upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start
slowly and even fail the startup probe, producing the following describe output
in the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contains the
FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
Scale up the affected StatefulSet or Deployment back to the
original number of replicas and wait until its state becomes Running.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines and to
change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container
Cloud web UI and shows various access denied errors during the first five
minutes after creation.
To work around the issue, wait for five minutes after the project creation and
then refresh the browser.
This section lists the artifacts of components included in the Container Cloud
patch release 2.27.2. For artifacts of the Cluster releases introduced in
2.27.2, see patch Cluster releases 16.2.2, 16.1.7, and
17.1.7.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
For MOSK clusters, Container Cloud 2.27.1 is the
continuation for MOSK 24.1.x series using the patch Cluster
release 17.1.6. For the update path of 24.1 and 24.2 series, see
MOSK documentation: Cluster update scheme.
The management cluster of a MOSK 24.1 or 24.1.5 cluster is
automatically updated to the latest patch Cluster release 16.2.1.
The Container Cloud patch release 2.27.1, which is based on the
2.27.0 major release, provides the following updates:
Support for the patch Cluster releases 16.1.6 and 17.1.6
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
24.1.6.
Support for MKE 3.7.10.
Support for docker-ee-cli 23.0.13 in MCR 23.0.11 to fix several CVEs.
Bare metal: update of Ubuntu mirror from ubuntu-2024-05-17-013445 to
ubuntu-2024-06-27-095142 along with update of minor kernel version from
5.15.0-107-generic to 5.15.0-113-generic.
Security fixes for CVEs in images.
Bug fixes.
This patch release also supports the latest major Cluster releases
17.2.0 and 16.2.0. It does not support greenfield
deployments based on deprecated Cluster releases. Use the latest available Cluster release
instead.
For main deliverables of the parent Container Cloud release of 2.27.1, refer
to 2.27.0.
In total, since Container Cloud 2.27.0, 270 Common Vulnerabilities and
Exposures (CVE) of high severity have been fixed in 2.27.1.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since Container Cloud 2.27.0.
The common CVEs are issues addressed across several images.
The following issues have been addressed in the Container Cloud patch release
2.27.1 along with the patch Cluster releases 16.2.1,
16.1.6, and 17.1.6.
[42304] [StackLight] [Cluster releases 17.1.6, 16.1.6] Fixed the issue
with failure of shard relocation in the OpenSearch cluster on large
Container Cloud managed clusters.
[40020] [StackLight] [Cluster releases 17.1.6, 16.1.6] Fixed the issue
with rollover_policy not being applied to the current indices while
updating the policy for the current system* and audit* data streams.
This section lists known issues with workarounds for the Mirantis
Container Cloud release 2.27.1 including the Cluster releases 16.2.1,
16.1.6, and 17.1.6.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. That can result in
inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If you need to change or add DHCP subnets to bootstrap new nodes,
wait until the dnsmasq pod becomes Ready after the change, and only
then create the BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the
inspection error:
Verify whether the dnsmasq pod was in Ready state when the
inspection of the affected baremetal hosts (test-worker-3 in the example
above) was started:
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove the BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
When trying to list the HostOSConfigurationModules and HostOSConfiguration
custom resources, serviceuser or a user with the global-admin or
operator role obtains the access denied error.
For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
LCM¶[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during CVE upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start
slowly and even fail the startup probe, producing the following describe output
in the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contains the
FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
Scale up the affected StatefulSet or Deployment back to the
original number of replicas and wait until its state becomes Running.
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines and to
change dedicatedControlPlane: false using the web UI.
To work around the issue, manually add the required labels using the CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container
Cloud web UI and shows various access denied errors during the first five
minutes after creation.
To work around the issue, wait for five minutes after the project creation and
then refresh the browser.
This section describes the specific actions you as a cloud operator need to
complete before or after your Container Cloud cluster update to the Cluster
releases 17.1.6, 16.2.1, or 16.1.6.
Post-update actions¶Prepare for changing label values in Ceph metrics used in customizations¶
Note
If you do not use Ceph metrics in any customizations, for example,
custom alerts, Grafana dashboards, or queries in custom workloads, skip
this section.
After deprecating the performance metric exporter that is integrated into the
Ceph Manager daemon for the sake of the dedicated Ceph Exporter daemon in
Container Cloud 2.27.0, you may need to prepare for updating values of several
labels in Ceph metrics if you use them in any customizations such as custom
alerts, Grafana dashboards, or queries in custom tools. These labels will be
changed in Container Cloud 2.28.0 (Cluster releases 16.3.0 and 17.3.0).
Note
Names of metrics will not be changed, no metrics will be removed.
All Ceph metrics to be collected by the Ceph Exporter daemon will change their
labels job and instance due to scraping metrics from new Ceph Exporter
daemon instead of the performance metric exporter of Ceph Manager:
Values of the job labels will be changed from rook-ceph-mgr to
prometheus-rook-exporter for all Ceph metrics moved to Ceph
Exporter. The full list of moved metrics is presented below.
Values of the instance labels will be changed from the metric endpoint
of Ceph Manager with port 9283 to the metric endpoint of Ceph Exporter
with port 9926 for all Ceph metrics moved to Ceph Exporter. The full
list of moved metrics is presented below.
Values of the instance_id labels of Ceph metrics from the RADOS
Gateway (RGW) daemons will be changed from the daemon GID to the daemon
subname. For example, instead of instance_id="<RGW_PROCESS_GID>",
the instance_id="a" (ceph_rgw_qlen{instance_id="a"}) will be
used. The list of moved Ceph RGW metrics is presented below.
List of affected Ceph RGW metrics
ceph_rgw_cache_.*
ceph_rgw_failed_req
ceph_rgw_gc_retire_object
ceph_rgw_get.*
ceph_rgw_keystone_.*
ceph_rgw_lc_.*
ceph_rgw_lua_.*
ceph_rgw_pubsub_.*
ceph_rgw_put.*
ceph_rgw_qactive
ceph_rgw_qlen
ceph_rgw_req
List of all metrics to be collected by Ceph Exporter instead of
Ceph Manager
This section lists the artifacts of components included in the Container Cloud
patch release 2.27.1. For artifacts of the Cluster releases introduced in
2.27.1, see patch Cluster releases 16.2.1, 16.1.6, and
17.1.6.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
Does not support greenfield deployments on deprecated Cluster releases
of the 17.1.x and 16.1.x series. Use the latest available Cluster releases
of the series instead.
Caution
Make sure to update the Cluster release version
of your managed cluster before the current Cluster release
version becomes unsupported by a new Container Cloud release
version.
Otherwise, Container Cloud stops auto-upgrade and eventually
Container Cloud itself becomes unsupported.
This section outlines release notes for the Container Cloud release 2.27.0.
This section outlines new features and enhancements introduced in the
Container Cloud release 2.27.0. For the list of enhancements delivered with
the Cluster releases introduced by Container Cloud 2.27.0, see
17.2.0 and 16.2.0.
General availability for Ubuntu 22.04 on bare metal clusters¶
Implemented full support for Ubuntu 22.04 LTS (Jammy Jellyfish) as the default
host operating system that now installs on non-MOSK bare metal
management and managed clusters.
For MOSK:
Existing management clusters are automatically updated to Ubuntu 22.04 during
cluster upgrade to Container Cloud 2.27.0 (Cluster release 16.2.0).
Greenfield deployments of management clusters are based on Ubuntu 22.04.
Existing and greenfield deployments of managed clusters are still based on
Ubuntu 20.04. The support for Ubuntu 22.04 on this cluster type will be
announced in one of the following releases.
Caution
Upgrading from Ubuntu 20.04 to 22.04 on existing deployments
of Container Cloud managed clusters is not supported.
Improvements in the day-2 management API for bare metal clusters¶
TechPreview
Enhanced the day-2 management API for the bare metal provider with several key
improvements:
Implemented the sysctl, package, and irqbalance configuration
modules, which become available for usage after your management cluster
upgrade to the Cluster release 16.2.0. These Container Cloud modules use the
designated HostOSConfiguration object named mcc-modules to distinguish
them from custom modules.
Configuration modules allow managing the operating system of a bare metal
host granularly without rebuilding the node from scratch. Such approach
prevents workload evacuation and significantly reduces configuration time.
Optimized performance for faster, more efficient operations.
Enhanced user experience for easier and more intuitive interactions.
Resolved various internal issues to ensure smoother functionality.
Added comprehensive documentation, including concepts, guidelines,
and recommendations for effective use of day-2 operations.
Optimization of strict filtering for devices on bare metal clusters¶
Optimized the BareMetalHostProfile custom resource, which now uses
strict byID filtering to target system disks through the reliable byPath,
serialNumber, and wwn device options instead of the unpredictable
byName naming format.
The optimization includes changes in admission-controller that now blocks
the use of bmhp:spec:devices:by_name in new BareMetalHostProfile
objects.
Deprecation of SubnetPool and MetalLBConfigTemplate objects¶
As part of the bare metal provider refactoring, deprecated the
SubnetPool and MetalLBConfigTemplate objects. The objects will be
completely removed from the product in one of the following releases.
Both objects are automatically migrated to the MetalLBConfig object during
cluster update to the Cluster release 17.2.0 or 16.2.0.
The ClusterUpdatePlan object for a granular cluster update¶
TechPreview
Implemented the ClusterUpdatePlan custom resource to enable a granular
step-by-step update of a managed cluster. The operator can control the update
process by manually launching update stages using the commence flag.
Between the update stages, a cluster remains functional from the perspective
of cloud users and workloads.
A ClusterUpdatePlan object is automatically created by the respective
Container Cloud provider when a new Cluster release becomes available for your
cluster. This object contains a list of predefined self-descriptive update
steps that are cluster-specific. These steps are defined in the spec
section of the object with information about their impact on the cluster.
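For illustration, the operator could launch the next update stage by editing the object and setting the commence flag of that step to true. The resource name clusterupdateplan and the exact field layout are assumptions here; verify them against the API reference before use:
kubectl -n <project-name> edit clusterupdateplan <plan-name>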
Implemented the UpdateGroup custom resource for creation of update groups
for worker machines on managed clusters. The use of update groups provides
enhanced control over update of worker machines. This feature decouples the
concurrency settings from the global cluster level, providing update
flexibility based on the workload characteristics of different worker machine
sets.
Implemented the same heartbeat model for the LCM Agent as Kubernetes uses
for Nodes. This model allows reflecting the actual status of the LCM Agent
when it fails. For visual representation, added the corresponding
LCM Agent status to the Container Cloud web UI for clusters and
machines, which reflects the health status of the LCM Agent along with the
status of its update to the version of the current Cluster release.
Handling secret leftovers using secret-controller¶
Implemented secret-controller that runs on a management cluster and cleans
up leftover credential secrets that are not removed automatically after
creation of new secrets. This controller replaces rhellicense-controller,
proxy-controller, and byo-credentials-controller as well as partially
replaces the functionality of license-controller and other credential
controllers.
Note
You can change memory limits for secret-controller on a
management cluster using the resources:limits parameter in the
spec:providerSpec:value:kaas:management:helmReleases: section of the
Cluster object.
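For reference, a minimal sketch of such an override in the Cluster object; the helm release name secret-controller and the memory value shown are assumptions, so verify them against your release before applying:
spec:
  providerSpec:
    value:
      kaas:
        management:
          helmReleases:
          - name: secret-controller
            values:
              resources:
                limits:
                  memory: 150Mi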
MariaDB backup for bare metal and vSphere providers¶
Implemented the capability to back up and restore MariaDB databases on
management clusters for bare metal and vSphere providers. Also, added
documentation on how to change the storage node for backups on clusters of
these provider types.
The following issues have been addressed in the Mirantis Container Cloud
release 2.27.0 along with the Cluster releases 17.2.0 and
16.2.0.
Note
This section provides descriptions of issues addressed since
the last Container Cloud patch release 2.26.5.
For details on addressed issues in earlier patch releases since 2.26.0,
which are also included into the major release 2.27.0, refer to
2.26.x patch releases.
[42304] [StackLight] Fixed the issue with failure of shard relocation in
the OpenSearch cluster on large Container Cloud managed clusters.
[41890] [StackLight] Fixed the issue with Patroni failing to start
because of the short default timeout.
[40020] [StackLight] Fixed the issue with rollover_policy not being
applied to the current indices while updating the policy for the current
system* and audit* data streams.
[41819] [Ceph] Fixed the issue with the graceful cluster reboot being
blocked by active Ceph ClusterWorkloadLock objects.
[28865] [LCM] Fixed the issue with validation of the NTP configuration
before cluster deployment. Now, deployment does not start until the NTP
configuration is validated.
If the dnsmasq pod is restarted during the bootstrap of newly added
nodes, those nodes may fail to undergo inspection. That can result in
inspection error in the corresponding BareMetalHost objects.
The issue can occur when:
The dnsmasq pod was moved to another node.
DHCP subnets were changed, including addition or removal. In this case, the
dhcpd container of the dnsmasq pod is restarted.
Caution
If you need to change or add DHCP subnets to bootstrap new nodes, wait
until the dnsmasq pod becomes Ready after the change, and only then create
the BareMetalHost objects.
To verify whether the nodes are affected:
Verify whether the BareMetalHost objects contain the
inspection error:
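For example (assuming the managed-ns namespace used later in this procedure; the ERROR column is part of the default BareMetalHost output):
kubectl get bmh -n managed-ns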
Verify whether the dnsmasq pod was in Ready state when the
inspection of the affected baremetal hosts (test-worker-3 in the example
above) was started:
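For example, find the dnsmasq pod and inspect the timestamps of its dhcpd container state and Ready condition, assuming the pod runs in the kaas namespace of the management cluster:
kubectl -n kaas get pods | grep dnsmasq
kubectl -n kaas get pod <dnsmasq-pod-name> -o yaml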
In the system response above, inspection was started at
"2024-10-11T07:38:19Z", immediately before the period of the dhcpd
container downtime. Therefore, this node is most likely affected by the
issue.
Workaround
Reboot the node using the IPMI reset or cycle
command.
If the node fails to boot, remove the failed BareMetalHost object and
create it again:
Remove the BareMetalHost object. For example:
kubectl delete bmh -n managed-ns test-worker-3
Verify that the BareMetalHost object is removed:
kubectl get bmh -n managed-ns test-worker-3
Create a BareMetalHost object from the template. For example:
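For example (the template path is hypothetical; use the template that matches your deployment):
kubectl create -f <path-to-baremetalhost-template>.yaml -n managed-ns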
When trying to list the HostOSConfigurationModules and HostOSConfiguration custom resources, serviceuser or a user with
the global-admin or operator role obtains the access denied error.
For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
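For example, one possible dummy change is adding or updating a temporary annotation on the affected service (the annotation key is arbitrary):
kubectl -n <namespace> annotate service <serviceName> dummy-update="$(date +%s)" --overwrite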
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
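For example (assuming both pods run in the kaas namespace of the management cluster):
kubectl -n kaas get pods -o wide | grep -e dnsmasq -e dhcp-relay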
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
LCM¶[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
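A minimal sketch of such an alias; the image tag and any additional mounts or etcd-related environment variables depend on your MKE version and setup, so adjust before use:
alias calicoctl="docker run -i --rm --net host --pid host -v /var/run/calico:/var/run/calico mirantis/ucp-dsinfo:<MKE_VERSION> calicoctl"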
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[50566] Ceph upgrade is very slow during patch or major cluster update¶
Due to the upstream Ceph issue
66717,
during CVE upgrade of the Ceph daemon image of Ceph Reef 18.2.4, OSDs may start
slowly and even fail the startup probe with the following describe output in
the rook-ceph-osd-X pod:
Complete the following steps during every patch or major cluster update of the
Cluster releases 17.2.x, 17.3.x, and 17.4.x (until Ceph 18.2.5 becomes
supported):
Plan extra time in the maintenance window for the patch cluster update.
Slow starts will still impact the update procedure, but after completing the
following step, the recovery process noticeably shortens without affecting
the overall cluster state and data responsiveness.
Select one of the following options:
Before the cluster update, set the noout flag:
ceph osd set noout
Once the Ceph OSDs image upgrade is done, unset the flag:
ceph osd unset noout
Monitor the Ceph OSDs image upgrade. If the symptoms of slow start appear,
set the noout flag as soon as possible. Once the Ceph OSDs image
upgrade is done, unset the flag.
[42908] The ceph-exporter pods are present in the Ceph crash list¶
After a managed cluster update, the ceph-exporter pods are present in the
ceph crash ls list while rook-ceph-exporter attempts to obtain
the port that is still in use. The issue does not block the managed cluster
update. Once the port becomes available, rook-ceph-exporter obtains the
port and the issue disappears.
As a workaround, run ceph crash archive-all to remove
ceph-exporter pods from the Ceph crash list.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal with Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contain the
FailedMount events:
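For example:
kubectl describe pod <affectedPodName> -n <affectedProjectName>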
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
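For example, filter the csi-rbdplugin pods by the node identified in the previous step, assuming the rook-ceph namespace and the standard app=csi-rbdplugin label:
kubectl -n rook-ceph get pods -l app=csi-rbdplugin -o wide --field-selector spec.nodeName=<nodeName>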
On High Availability (HA) clusters that use Local Volume Provisioner (LVP),
Prometheus and OpenSearch from StackLight may share the same pool of storage.
In such configuration, OpenSearch may approach the 85% disk usage watermark
due to the combined storage allocation and usage patterns set by the Persistent
Volume Claim (PVC) size parameters for Prometheus and OpenSearch, which consume
storage the most.
When the 85% threshold is reached, the affected node is transitioned to the
read-only state, preventing shard allocation and causing the OpenSearch cluster
state to transition to Warning (Yellow) or Critical (Red).
Caution
The issue and the provided workaround apply only for clusters on
which OpenSearch and Prometheus utilize the same storage pool.
Derived from .values.elasticsearch.persistentVolumeUsableStorageSizeGB,
defaulting to .values.elasticsearch.persistentVolumeClaimSize if
unspecified. To obtain the OpenSearch PVC size:
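For example (assuming StackLight runs in the stacklight namespace):
kubectl -n stacklight get pvc | grep opensearch-master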
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
If the formula result is positive, it is an early indication that the
cluster is affected.
Verify whether the OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical alert is firing. If so,
verify the following:
Log in to the OpenSearch web UI.
In Management -> Dev Tools, run the following command:
GET _cluster/allocation/explain
The following system response indicates that the corresponding node is
affected:
"explanation":"the node is above the low watermark cluster setting \[cluster.routing.allocation.disk.watermark.low=85%], using more disk space \than the maximum allowed [85.0%], actual free: [xx.xxx%]"
Note
The system response may contain even higher watermark percent
than 85.0%, depending on the case.
Workaround:
Warning
The workaround implies adjustment of the retention threshold for
OpenSearch. Depending on the new threshold, some old logs will be
deleted.
A user-defined variable that specifies what percentage of the total storage
capacity should not be used by OpenSearch or Prometheus. This is used to
reserve space for other components. It should be expressed as a decimal.
For example, for 5% of reservation, Reserved_Percentage is 0.05.
Mirantis recommends using 0.05 as a starting point.
Filesystem_Reserve
Percentage to deduct for filesystems that may reserve some portion of the
available storage, which is marked as occupied. For example, for EXT4, it
is 5% by default, so the value must be 0.05.
Prometheus_PVC_Size_GB
Sourced from .values.prometheusServer.persistentVolumeClaimSize.
Total_Storage_Capacity_GB
Total capacity of the OpenSearch PVCs. For LVP, the capacity of the
storage pool. To obtain the total capacity:
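For example, list the capacity of the PersistentVolumes bound to the opensearch-master claims; this is an illustrative approach, so adjust it to how storage is provisioned on your cluster:
kubectl get pv -o custom-columns=NAME:.metadata.name,CAPACITY:.spec.capacity.storage,CLAIM:.spec.claimRef.name | grep opensearch-master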
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
Calculating the above formula provides the maximum safe storage to allocate
for .values.elasticsearch.persistentVolumeUsableStorageSizeGB. Use this
formula as a reference for setting
.values.elasticsearch.persistentVolumeUsableStorageSizeGB on a cluster.
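A plausible form of the calculation, assuming that the filesystem reserve and the reserved percentage both apply to the total capacity (verify against the StackLight documentation before relying on it):
persistentVolumeUsableStorageSizeGB = (Total_Storage_Capacity_GB - Filesystem_Reserve * Total_Storage_Capacity_GB - Prometheus_PVC_Size_GB) * (1 - Reserved_Percentage)
For example, with a 500 GB pool, the EXT4 reserve of 0.05, a 60 GB Prometheus PVC, and Reserved_Percentage of 0.05, the result is (500 - 25 - 60) * 0.95 = 394.25 GB.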
Wait up to 15-20 minutes for OpenSearch to perform the cleanup.
Verify that the cluster is not affected anymore using the procedure above.
[43164] Rollover policy is not added to indices created without a policy¶
The initial index for the system* and audit* data streams can be
created without any policy attached due to a race condition.
One of the indicators that the cluster is most likely affected is the
KubeJobFailed alert firing for the elasticsearch-curator job and one or
both of the following errors being present in elasticsearch-curator pods
that remain in the Error status:
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. \
<class 'curator.exceptions.FailedExecution'>: Exception encountered. \
Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. \
Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-system-000001] \
is the write index for data stream [system] and cannot be deleted')
or
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. \
<class 'curator.exceptions.FailedExecution'>: Exception encountered. \
Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. \
Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-audit-000001] \
is the write index for data stream [audit] and cannot be deleted')
If the above-mentioned alert and errors are present, immediate action is
required because the corresponding index size has already exceeded the space
allocated for the index.
To verify that the cluster is affected:
Caution
Verify and apply the workaround to both index patterns, system and
audit, separately.
If one of indices is affected, the second one is most likely affected
as well. Although in rare cases, only one index may be affected.
Perform again the last step of the cluster verification procedure provided
above and make sure that the policy is attached to the index.
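For reference, one way to check whether a policy is attached to an index is the ISM explain API in the OpenSearch Dev Tools console, for example (the index name is illustrative):
GET _plugins/_ism/explain/.ds-system-000001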
Container Cloud web UI¶[50181] Failure to deploy a compact cluster¶
A compact MOSK cluster fails to be deployed through the Container Cloud web UI
due to the inability to add any label to the control plane machines and to
change dedicatedControlPlane:false using the web UI.
To work around the issue, manually add the required labels using CLI. Once
done, the cluster deployment resumes.
[50168] Inability to use a new project right after creation¶
A newly created project does not display all available tabs in the Container
Cloud web UI and shows various access denied errors during the first five
minutes after creation.
To work around the issue, refresh the browser five minutes after the
project creation.
The following table lists the major components and their versions delivered in
Container Cloud 2.27.0.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
This section lists the artifacts of components included in the Container Cloud
release 2.27.0.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
In total, 408 Common Vulnerabilities and Exposures (CVE) have been fixed
in 2.27.0 since Container Cloud 2.26.0:
26 of critical and 382 of high severity.
The table below includes the total numbers of addressed unique and common
vulnerabilities and exposures (CVE) by product component since the 2.26.5
patch release. The common CVEs are issues addressed across several images.
This section describes the specific actions you as a cloud operator need to
complete before or after your Container Cloud cluster update to the Cluster
releases 17.2.0 or 16.2.0.
For those clusters that update only between major versions, the update
scheme remains unchanged.
Caution
In Container Cloud patch releases 2.27.1 and 2.27.2,
only the 16.2.x patch Cluster releases will be delivered with an
automatic update of management clusters and the possibility to update
non-MOSK managed clusters.
In parallel, 2.27.1 and 2.27.2 will include new 16.1.x and 17.1.x patches
for MOSK 24.1.x. The first 17.2.x patch Cluster release
for MOSK 24.2.x will be delivered in 2.27.3. For details,
see MOSK documentation: Update path for 24.1 and 24.2 series.
Pre-update actions¶Update bird configuration on BGP-enabled bare metal clusters¶
Note
If you have already completed the below procedure after updating
your clusters to Container Cloud 2.26.0 (Cluster releases 17.1.0 or 16.1.0),
skip this subsection.
Container Cloud 2.26.0 introduced the bird daemon update from v1.6.8
to v2.0.7 on master nodes if BGP is used for announcement of the cluster
API load balancer address.
Configuration files for bird v1.x are not fully compatible with those for
bird v2.x. Therefore, if you used BGP announcement of cluster API LB address
on a deployment based on Cluster releases 17.0.0 or 16.0.0, update bird
configuration files to fit bird v2.x using configuration examples provided in
the API Reference: MultiRackCluster section.
Review and adjust the storage parameters for OpenSearch¶
Note
If you have already completed the below procedure after updating
your clusters to Container Cloud 2.26.0 (Cluster releases 17.1.0 or 16.1.0),
skip this subsection.
To prevent underused or overused storage space, review your storage space
parameters for OpenSearch on the StackLight cluster:
Review the value of elasticsearch.persistentVolumeClaimSize and
the real storage available on volumes.
Decide whether you have to additionally set
elasticsearch.persistentVolumeUsableStorageSizeGB.
Post-update actions¶Prepare for changing label values in Ceph metrics used in customizations¶
Note
If you do not use Ceph metrics in any customizations, for example,
custom alerts, Grafana dashboards, or queries in custom workloads, skip
this section.
After deprecating the performance metric exporter that is integrated into the
Ceph Manager daemon for the sake of the dedicated Ceph Exporter daemon in
Container Cloud 2.27.0, you may need to prepare for updating values of several
labels in Ceph metrics if you use them in any customizations such as custom
alerts, Grafana dashboards, or queries in custom tools. These labels will be
changed in Container Cloud 2.28.0 (Cluster releases 16.3.0 and 17.3.0).
Note
Names of metrics will not be changed, and no metrics will be removed.
All Ceph metrics to be collected by the Ceph Exporter daemon will change their
labels job and instance due to scraping metrics from new Ceph Exporter
daemon instead of the performance metric exporter of Ceph Manager:
Values of the job labels will be changed from rook-ceph-mgr to
prometheus-rook-exporter for all Ceph metrics moved to Ceph
Exporter. The full list of moved metrics is presented below.
Values of the instance labels will be changed from the metric endpoint
of Ceph Manager with port 9283 to the metric endpoint of Ceph Exporter
with port 9926 for all Ceph metrics moved to Ceph Exporter. The full
list of moved metrics is presented below.
Values of the instance_id labels of Ceph metrics from the RADOS
Gateway (RGW) daemons will be changed from the daemon GID to the daemon
subname. For example, instead of instance_id="<RGW_PROCESS_GID>",
the instance_id="a" (ceph_rgw_qlen{instance_id="a"}) will be
used. The list of moved Ceph RGW metrics is presented below.
List of affected Ceph RGW metrics
ceph_rgw_cache_.*
ceph_rgw_failed_req
ceph_rgw_gc_retire_object
ceph_rgw_get.*
ceph_rgw_keystone_.*
ceph_rgw_lc_.*
ceph_rgw_lua_.*
ceph_rgw_pubsub_.*
ceph_rgw_put.*
ceph_rgw_qactive
ceph_rgw_qlen
ceph_rgw_req
List of all metrics to be collected by Ceph Exporter instead of
Ceph Manager
The Container Cloud patch release 2.26.5, which is based on the
2.26.0 major release, provides the following updates:
Support for the patch Cluster releases 16.1.5
and 17.1.5 that represents Mirantis OpenStack for Kubernetes
(MOSK) patch release
24.1.5.
Bare metal: update of Ubuntu mirror from 20.04~20240502102020 to
20.04~20240517090228 along with update of minor kernel version from
5.15.0-105-generic to 5.15.0-107-generic.
Security fixes for CVEs in images.
Bug fixes.
This patch release also supports the latest major Cluster releases
17.1.0 and 16.1.0. It does not support greenfield
deployments based on deprecated Cluster releases; use the latest available
Cluster release instead.
For main deliverables of the parent Container Cloud release of 2.26.5, refer
to 2.26.0.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since the Container Cloud 2.26.4 patch
release. The common CVEs are issues addressed across several images.
The following issues have been addressed in the Container Cloud patch release
2.26.5 along with the patch Cluster releases 17.1.5
and 16.1.5.
[42408] [bare metal] Fixed the issue with old versions of system
packages, including kernel, remaining on the manager nodes after cluster
update.
[41540] [LCM] Fixed the issue with lcm-agent failing to grab storage
information on a host and leaving lcmmachine.status.hostinfo.hardware
empty due to issues with managing physical NVME devices.
When trying to list the HostOSConfigurationModules and HostOSConfiguration custom resources, serviceuser or a user with
the global-admin or operator role obtains the access denied error.
For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
LCM¶[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[41819] Graceful cluster reboot is blocked by the Ceph ClusterWorkloadLocks¶
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal with Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contain the
FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
On large managed clusters, shard relocation may fail in the OpenSearch cluster
with the yellow or red status of the OpenSearch cluster.
The characteristic symptom of the issue is that in the stacklight
namespace, the statefulset.apps/opensearch-master containers are
experiencing throttling with the KubeContainersCPUThrottlingHigh alert
firing for the following set of labels:
The throttling that OpenSearch is experiencing may be a temporary
situation, which may be related, for example, to a peaky load and the
ongoing shards initialization as part of disaster recovery or after node
restart. In this case, Mirantis recommends waiting until initialization
of all shards is finished. After that, verify the cluster state and whether
throttling still exists. And only if throttling does not disappear, apply
the workaround below.
To verify that the initialization of shards is ongoing:
The system response above indicates that shards from the
.ds-system-000072, .ds-system-000073, and .ds-audit-000001
indices are in the INITIALIZING state. In this case, Mirantis
recommends waiting until this process is finished, and only then consider
changing the limit.
You can additionally analyze the exact level of throttling and the current
CPU usage on the Kubernetes Containers dashboard in Grafana.
Workaround:
Verify the currently configured CPU requests and limits for the
opensearch containers:
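For example, the following command prints the configured requests and limits, assuming the StatefulSet and container names used by StackLight are opensearch-master and opensearch respectively:
kubectl -n stacklight get statefulset opensearch-master -o jsonpath='{.spec.template.spec.containers[?(@.name=="opensearch")].resources}'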
In the example above, the CPU request is 500m and the CPU limit is
600m.
Increase the CPU limit to a reasonably high number.
For example, the default CPU limit for the clusters with the
clusterSize:large parameter set was increased from
8000m to 12000m for StackLight in Container Cloud 2.27.0
(Cluster releases 17.2.0 and 16.2.0).
If the CPU limit for the opensearch component is already set, increase
it in the Cluster object for the opensearch parameter. Otherwise,
the default StackLight limit is used. In this case, increase the CPU limit
for the opensearch component using the resources parameter.
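A minimal sketch of such an override in the StackLight values; the exact nesting under the Cluster object and the target value are assumptions, so align them with your StackLight configuration reference:
resources:
  opensearch:
    limits:
      cpu: "12000m"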
Wait until all opensearch-master pods are recreated with the new CPU
limits and become running and ready.
To verify the current CPU limit for every opensearch container in every
opensearch-master pod separately:
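For example (the container name opensearch is an assumption):
for pod in $(kubectl -n stacklight get pods -o name | grep opensearch-master); do kubectl -n stacklight get $pod -o jsonpath='{.metadata.name}{": "}{.spec.containers[?(@.name=="opensearch")].resources.limits.cpu}{"\n"}'; done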
The waiting time may take up to 20 minutes depending on the cluster size.
If the issue is fixed, the KubeContainersCPUThrottlingHigh alert stops
firing immediately, while OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical can still be firing for some time during
shard relocation.
If the KubeContainersCPUThrottlingHigh alert is still firing, proceed with
another iteration of the CPU limit increase.
[40020] Rollover policy update is not applied to the current index¶
While updating rollover_policy for the current system* and audit*
data streams, the update is not applied to indices.
One of indicators that the cluster is most likely affected is the
KubeJobFailed alert firing for the elasticsearch-curator job and one or
both of the following errors being present in elasticsearch-curator pods
that remain in the Error status:
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-audit-000001] is the write index for data stream [audit] and cannot be deleted')
or
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-system-000001] is the write index for data stream [system] and cannot be deleted')
Note
Instead of .ds-audit-000001 or .ds-system-000001 index names,
similar names can be present with the same prefix but different suffix
numbers.
If the above mentioned alert and errors are present, an immediate action is
required, because it indicates that the corresponding index size has already
exceeded the space allocated for the index.
To verify that the cluster is affected:
Caution
Verify and apply the workaround to both index patterns, system and
audit, separately.
If one of indices is affected, the second one is most likely affected
as well. Although in rare cases, only one index may be affected.
The cluster is affected if the rollover policy is missing.
Otherwise, proceed to the following step.
Verify the system response from the previous step. For example:
{"_id":"system_rollover_policy","_version":7229,"_seq_no":42362,"_primary_term":28,"policy":{"policy_id":"system_rollover_policy","description":"system index rollover policy.","last_updated_time":1708505222430,"schema_version":19,"error_notification":null,"default_state":"rollover","states":[{"name":"rollover","actions":[{"retry":{"count":3,"backoff":"exponential","delay":"1m"},"rollover":{"min_size":"14746mb","copy_alias":false}}],"transitions":[]}],"ism_template":[{"index_patterns":["system*"],"priority":200,"last_updated_time":1708505222430}]}}
Verify and capture the following items separately for every policy:
The _seq_no and _primary_term values
The rollover policy threshold, which is defined in
policy.states[0].actions[0].rollover.min_size
If the rollover policy is not attached, the cluster is affected.
If the rollover policy is attached but _seq_no and _primary_term
numbers do not match the previously captured ones, the cluster is
affected.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size),
the cluster is most probably affected.
Perform again the last step of the cluster verification procedure provided
above and make sure that the policy is attached to the index and has
the same _seq_no and _primary_term.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size), wait
up to 15 minutes and verify that the additional index is created with
the consecutive number in the index name. For example:
system: if you applied changes to .ds-system-000001, wait until
.ds-system-000002 is created.
audit: if you applied changes to .ds-audit-000001, wait until
.ds-audit-000002 is created.
If such index is not created, escalate the issue to Mirantis support.
This section describes the specific actions you as a cloud operator need to
complete before or after your Container Cloud cluster update to the Cluster
releases 17.1.5 or 16.1.5.
To improve user update experience and make the update path more flexible,
Container Cloud is introducing a new scheme of updating between patch Cluster
releases. More specifically, Container Cloud intends to ultimately provide a
possibility to update to any newer patch version within a single series at any
point in time. Patch version downgrade is not supported.
However, in some cases, Mirantis may request that you update to a specific
patch version in the series to be able to update to the next major series.
This may be necessary due to the specifics of technical content already
released or planned for the release. For possible update paths in
MOSK in 24.1 and 24.2 series, see MOSK
documentation: Cluster update scheme.
The exact number of patch releases for the 16.1.x and 17.1.x series is yet to
be confirmed, but the current target is 7 releases.
Note
The management cluster update scheme remains the same.
A management cluster obtains the new product version automatically
after release.
Post-update actions¶Delete ‘HostOSConfiguration’ objects on baremetal-based clusters¶
If you use the HostOSConfiguration and HostOSConfigurationModules
custom resources for the bare metal provider, which are available in the
Technology Preview scope in Container Cloud 2.26.x, delete all
HostOSConfiguration objects right after update of your managed cluster to
the Cluster release 17.1.5 or 16.1.5, before automatic upgrade of the
management cluster to Container Cloud 2.27.0 (Cluster release 16.2.0).
After the upgrade, you can recreate the required objects using the updated
parameters.
This precautionary step prevents re-processing and re-applying of existing
configuration, which is defined in HostOSConfiguration objects, during
management cluster upgrade to 2.27.0. Such behavior is caused by changes in
the HostOSConfiguration API introduced in 2.27.0.
Configure Kubernetes auditing and profiling for log rotation¶
Note
Skip this procedure if you have already completed it after updating
your managed cluster to Container Cloud 2.26.4 (Cluster release 17.1.4 or
16.1.4).
After the MKE update to 3.7.8, if you are going to enable or already enabled
Kubernetes auditing and profiling on your managed or management cluster,
keep in mind that enabling audit log rotation requires an additional
step. Set the following options in the MKE configuration file after enabling
auditing and profiling:
This section lists the artifacts of components included in the Container Cloud
patch release 2.26.5. For artifacts of the Cluster releases introduced in
2.26.5, see patch Cluster releases 17.1.5 and 16.1.5.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The Container Cloud patch release 2.26.4, which is based on the
2.26.0 major release, provides the following updates:
Support for the patch Cluster releases 16.1.4
and 17.1.4 that represents Mirantis OpenStack for Kubernetes
(MOSK) patch release
24.1.4.
Support for MKE 3.7.8.
Bare metal: update of Ubuntu mirror from 20.04~20240411171541 to
20.04~20240502102020 along with update of minor kernel version from
5.15.0-102-generic to 5.15.0-105-generic.
Security fixes for CVEs in images.
Bug fixes.
This patch release also supports the latest major Cluster releases
17.1.0 and 16.1.0. It does not support greenfield
deployments based on deprecated Cluster releases; use the latest available
Cluster release instead.
For main deliverables of the parent Container Cloud release of 2.26.4, refer
to 2.26.0.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since the Container Cloud 2.26.3 patch
release. The common CVEs are issues addressed across several images.
The following issues have been addressed in the Container Cloud patch release
2.26.4 along with the patch Cluster releases 17.1.4
and 16.1.4.
[41806] [Container Cloud web UI] Fixed the issue with failure to
configure management cluster using the Configure cluster web UI
menu without updating the Keycloak Truststore settings.
When trying to list the HostOSConfigurationModules and HostOSConfiguration custom resources, serviceuser or a user with
the global-admin or operator role obtains the access denied error.
For example:
After managed cluster update, old versions of system packages, including
kernel, may remain on the manager nodes. This issue occurs because the task
responsible for updating packages fails to run after updating Ubuntu mirrors.
As a workaround, manually run apt-get upgrade on every manager
node after the cluster update but before rebooting the node.
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
LCM¶[41540] LCM Agent cannot grab storage information on a host¶
Due to issues with managing physical NVME devices, lcm-agent cannot grab
storage information on a host. As a result,
lcmmachine.status.hostinfo.hardware is empty and the following example
error is present in logs:
{"level":"error","ts":"2024-05-02T12:26:10Z","logger":"agent",\"msg":"get hardware details",\"host":"kaas-node-548b2861-aed0-41c9-8ff2-10c5476b000b",\"error":"new storage info: get disk info \"nvme0c0n1\": \invoke command: exit status 1","errorVerbose":"exit status 1
As a workaround, on the affected node, create a symlink for any device
indicated in lcm-agent logs. For example:
ln -sfn /dev/nvme0n1 /dev/nvme0c0n1
[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[41819] Graceful cluster reboot is blocked by the Ceph ClusterWorkloadLocks¶
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal with Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contain the
FailedMount events:
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
On large managed clusters, shard relocation may fail in the OpenSearch cluster
with the yellow or red status of the OpenSearch cluster.
The characteristic symptom of the issue is that in the stacklight
namespace, the statefulset.apps/opensearch-master containers are
experiencing throttling with the KubeContainersCPUThrottlingHigh alert
firing for the following set of labels:
The throttling that OpenSearch is experiencing may be a temporary
situation, which may be related, for example, to a peaky load and the
ongoing shards initialization as part of disaster recovery or after node
restart. In this case, Mirantis recommends waiting until initialization
of all shards is finished. After that, verify the cluster state and whether
throttling still exists. And only if throttling does not disappear, apply
the workaround below.
To verify that the initialization of shards is ongoing:
The system response above indicates that shards from the
.ds-system-000072, .ds-system-000073, and .ds-audit-000001
indices are in the INITIALIZING state. In this case, Mirantis
recommends waiting until this process is finished, and only then consider
changing the limit.
You can additionally analyze the exact level of throttling and the current
CPU usage on the Kubernetes Containers dashboard in Grafana.
Workaround:
Verify the currently configured CPU requests and limits for the
opensearch containers:
In the example above, the CPU request is 500m and the CPU limit is
600m.
Increase the CPU limit to a reasonably high number.
For example, the default CPU limit for the clusters with the
clusterSize:large parameter set was increased from
8000m to 12000m for StackLight in Container Cloud 2.27.0
(Cluster releases 17.2.0 and 16.2.0).
If the CPU limit for the opensearch component is already set, increase
it in the Cluster object for the opensearch parameter. Otherwise,
the default StackLight limit is used. In this case, increase the CPU limit
for the opensearch component using the resources parameter.
Wait until all opensearch-master pods are recreated with the new CPU
limits and become running and ready.
To verify the current CPU limit for every opensearch container in every
opensearch-master pod separately:
The waiting time may take up to 20 minutes depending on the cluster size.
If the issue is fixed, the KubeContainersCPUThrottlingHigh alert stops
firing immediately, while OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical can still be firing for some time during
shard relocation.
If the KubeContainersCPUThrottlingHigh alert is still firing, proceed with
another iteration of the CPU limit increase.
[40020] Rollover policy update is not applied to the current index¶
While updating rollover_policy for the current system* and audit*
data streams, the update is not applied to indices.
One of indicators that the cluster is most likely affected is the
KubeJobFailed alert firing for the elasticsearch-curator job and one or
both of the following errors being present in elasticsearch-curator pods
that remain in the Error status:
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-audit-000001] is the write index for data stream [audit] and cannot be deleted')
or
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-system-000001] is the write index for data stream [system] and cannot be deleted')
Note
Instead of .ds-audit-000001 or .ds-system-000001 index names,
similar names can be present with the same prefix but different suffix
numbers.
If the above mentioned alert and errors are present, an immediate action is
required, because it indicates that the corresponding index size has already
exceeded the space allocated for the index.
To verify that the cluster is affected:
Caution
Verify and apply the workaround to both index patterns, system and
audit, separately.
If one of indices is affected, the second one is most likely affected
as well. Although in rare cases, only one index may be affected.
The cluster is affected if the rollover policy is missing.
Otherwise, proceed to the following step.
Verify the system response from the previous step. For example:
{"_id":"system_rollover_policy","_version":7229,"_seq_no":42362,"_primary_term":28,"policy":{"policy_id":"system_rollover_policy","description":"system index rollover policy.","last_updated_time":1708505222430,"schema_version":19,"error_notification":null,"default_state":"rollover","states":[{"name":"rollover","actions":[{"retry":{"count":3,"backoff":"exponential","delay":"1m"},"rollover":{"min_size":"14746mb","copy_alias":false}}],"transitions":[]}],"ism_template":[{"index_patterns":["system*"],"priority":200,"last_updated_time":1708505222430}]}}
Verify and capture the following items separately for every policy:
The _seq_no and _primary_term values
The rollover policy threshold, which is defined in
policy.states[0].actions[0].rollover.min_size
If the rollover policy is not attached, the cluster is affected.
If the rollover policy is attached but _seq_no and _primary_term
numbers do not match the previously captured ones, the cluster is
affected.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size),
the cluster is most probably affected.
Perform again the last step of the cluster verification procedure provided
above and make sure that the policy is attached to the index and has
the same _seq_no and _primary_term.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size), wait
up to 15 minutes and verify that the additional index is created with
the consecutive number in the index name. For example:
system: if you applied changes to .ds-system-000001, wait until
.ds-system-000002 is created.
audit: if you applied changes to .ds-audit-000001, wait until
.ds-audit-000002 is created.
If such index is not created, escalate the issue to Mirantis support.
This section describes the specific actions you as a cloud operator need to
complete before or after your Container Cloud cluster update to the Cluster
releases 17.1.4 or 16.1.4.
Post-update actions¶Configure Kubernetes auditing and profiling for log rotation¶
After the MKE update to 3.7.8, if you are going to enable or already enabled
Kubernetes auditing and profiling on your managed or management cluster,
keep in mind that enabling audit log rotation requires an additional
step. Set the following options in the MKE configuration file after enabling
auditing and profiling:
This section lists the artifacts of components included in the Container Cloud
patch release 2.26.4. For artifacts of the Cluster releases introduced in
2.26.4, see patch Cluster releases 17.1.4 and 16.1.4.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The Container Cloud patch release 2.26.3, which is based on the
2.26.0 major release, provides the following updates:
Support for the patch Cluster releases 16.1.3
and 17.1.3 that represents Mirantis OpenStack for Kubernetes
(MOSK) patch release
24.1.3.
Support for MKE 3.7.7.
Bare metal: update of Ubuntu mirror from 20.04~20240324172903 to
20.04~20240411171541 along with update of minor kernel version from
5.15.0-101-generic to 5.15.0-102-generic.
Security fixes for CVEs in images.
Bug fixes.
This patch release also supports the latest major Cluster releases
17.1.0 and 16.1.0. It does not support greenfield
deployments based on deprecated Cluster releases; use the latest available
Cluster release instead.
For main deliverables of the parent Container Cloud release of 2.26.3, refer
to 2.26.0.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since the Container Cloud 2.26.2 patch
release. The common CVEs are issues addressed across several images.
When trying to list the HostOSConfigurationModules and HostOSConfiguration custom resources, serviceuser or a user with
the global-admin or operator role obtains the access denied error.
For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
LCM¶[41540] LCM Agent cannot grab storage information on a host¶
Due to issues with managing physical NVME devices, lcm-agent cannot grab
storage information on a host. As a result,
lcmmachine.status.hostinfo.hardware is empty and the following example
error is present in logs:
{"level":"error","ts":"2024-05-02T12:26:10Z","logger":"agent",\"msg":"get hardware details",\"host":"kaas-node-548b2861-aed0-41c9-8ff2-10c5476b000b",\"error":"new storage info: get disk info \"nvme0c0n1\": \invoke command: exit status 1","errorVerbose":"exit status 1
As a workaround, on the affected node, create a symlink for any device
indicated in lcm-agent logs. For example:
ln -sfn /dev/nvme0n1 /dev/nvme0c0n1
[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[41819] Graceful cluster reboot is blocked by the Ceph ClusterWorkloadLocks¶
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contains the
FailedMount events:
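A minimal sketch of such a check, using the placeholders described below:
kubectl describe pod <affectedPodName> -n <affectedProjectName> | grep -A 5 FailedMount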
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
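A possible way to identify it, assuming the Ceph CSI plugin runs in the rook-ceph namespace with the app=csi-rbdplugin label, is to match the pod scheduled on the affected node:
kubectl get pods -n rook-ceph -l app=csi-rbdplugin -o wide | grep <nodeName>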
On large managed clusters, shard relocation may fail in the OpenSearch cluster
with the yellow or red status of the OpenSearch cluster.
The characteristic symptom of the issue is that in the stacklight
namespace, the statefulset.apps/opensearch-master containers are
experiencing throttling with the KubeContainersCPUThrottlingHigh alert
firing for the following set of labels:
The throttling that OpenSearch is experiencing may be a temporary
situation, which may be related, for example, to a peak load and the
ongoing shards initialization as part of disaster recovery or after node
restart. In this case, Mirantis recommends waiting until initialization
of all shards is finished. After that, verify the cluster state and whether
throttling still exists. And only if throttling does not disappear, apply
the workaround below.
To verify that the initialization of shards is ongoing:
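One way to check this, assuming the default OpenSearch port and the opensearch container name, is to query the _cat/shards API from an opensearch-master pod and filter for initializing shards:
kubectl exec -n stacklight opensearch-master-0 -c opensearch -- curl -s localhost:9200/_cat/shards | grep -i initializing
Example output for illustration:
.ds-system-000072 1 r INITIALIZING x.x.x.x opensearch-master-1
.ds-system-000073 0 r INITIALIZING x.x.x.x opensearch-master-2
.ds-audit-000001 0 r INITIALIZING x.x.x.x opensearch-master-1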
The system response above indicates that shards from the
.ds-system-000072, .ds-system-000073, and .ds-audit-000001
indices are in the INITIALIZING state. In this case, Mirantis
recommends waiting until this process is finished, and only then consider
changing the limit.
You can additionally analyze the exact level of throttling and the current
CPU usage on the Kubernetes Containers dashboard in Grafana.
Workaround:
Verify the currently configured CPU requests and limits for the
opensearch containers:
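For example, the following jsonpath query prints the resources configured for the opensearch container of the opensearch-master StatefulSet; the container name is an assumption:
kubectl get statefulset opensearch-master -n stacklight -o jsonpath='{.spec.template.spec.containers[?(@.name=="opensearch")].resources}'
Example output for illustration:
{"limits":{"cpu":"600m"},"requests":{"cpu":"500m"}}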
In the example above, the CPU request is 500m and the CPU limit is
600m.
Increase the CPU limit to a reasonably high number.
For example, the default CPU limit for the clusters with the
clusterSize: large parameter set was increased from
8000m to 12000m for StackLight in Container Cloud 2.27.0
(Cluster releases 17.2.0 and 16.2.0).
If the CPU limit for the opensearch component is already set, increase
it in the Cluster object for the opensearch parameter. Otherwise,
the default StackLight limit is used. In this case, increase the CPU limit
for the opensearch component using the resources parameter.
Wait until all opensearch-master pods are recreated with the new CPU
limits and become running and ready.
To verify the current CPU limit for every opensearch container in every
opensearch-master pod separately:
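A minimal sketch that iterates over the pods by name and prints the CPU limit of the opensearch container (the container name is an assumption):
for p in $(kubectl get pods -n stacklight -o name | grep opensearch-master); do kubectl get $p -n stacklight -o jsonpath='{.metadata.name}{": "}{.spec.containers[?(@.name=="opensearch")].resources.limits.cpu}{"\n"}'; done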
The waiting time may take up to 20 minutes depending on the cluster size.
If the issue is fixed, the KubeContainersCPUThrottlingHigh alert stops
firing immediately, while OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical can still be firing for some time during
shard relocation.
If the KubeContainersCPUThrottlingHigh alert is still firing, proceed with
another iteration of the CPU limit increase.
[40020] Rollover policy update is not applied to the current index¶
While updating rollover_policy for the current system* and audit*
data streams, the update is not applied to indices.
One of the indicators that the cluster is most likely affected is the
KubeJobFailed alert firing for the elasticsearch-curator job and one or
both of the following errors being present in elasticsearch-curator pods
that remain in the Error status:
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-audit-000001] is the write index for data stream [audit] and cannot be deleted')
or
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-system-000001] is the write index for data stream [system] and cannot be deleted')
Note
Instead of .ds-audit-000001 or .ds-system-000001 index names,
similar names can be present with the same prefix but different suffix
numbers.
If the above-mentioned alert and errors are present, immediate action is
required, because it indicates that the corresponding index size has already
exceeded the space allocated for the index.
To verify that the cluster is affected:
Caution
Verify and apply the workaround to both index patterns, system and
audit, separately.
If one of the indices is affected, the second one is most likely affected
as well. Although in rare cases, only one index may be affected.
The cluster is affected if the rollover policy is missing.
Otherwise, proceed to the following step.
Verify the system response from the previous step. For example:
{"_id":"system_rollover_policy","_version":7229,"_seq_no":42362,"_primary_term":28,"policy":{"policy_id":"system_rollover_policy","description":"system index rollover policy.","last_updated_time":1708505222430,"schema_version":19,"error_notification":null,"default_state":"rollover","states":[{"name":"rollover","actions":[{"retry":{"count":3,"backoff":"exponential","delay":"1m"},"rollover":{"min_size":"14746mb","copy_alias":false}}],"transitions":[]}],"ism_template":[{"index_patterns":["system*"],"priority":200,"last_updated_time":1708505222430}]}}
Verify and capture the following items separately for every policy:
The _seq_no and _primary_term values
The rollover policy threshold, which is defined in
policy.states[0].actions[0].rollover.min_size
If the rollover policy is not attached, the cluster is affected.
If the rollover policy is attached but _seq_no and _primary_term
numbers do not match the previously captured ones, the cluster is
affected.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size),
the cluster is most probably affected.
Perform again the last step of the cluster verification procedure provided
above and make sure that the policy is attached to the index and has
the same _seq_no and _primary_term.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size), wait
up to 15 minutes and verify that the additional index is created with
the consecutive number in the index name. For example:
system: if you applied changes to .ds-system-000001, wait until
.ds-system-000002 is created.
audit: if you applied changes to .ds-audit-000001, wait until
.ds-audit-000002 is created.
If such an index is not created, escalate the issue to Mirantis support.
Container Cloud web UI¶[41806] Configuration of a management cluster fails without Keycloak settings¶
During configuration of management cluster settings using the
Configure cluster web UI menu, updating the Keycloak Truststore
settings is erroneously required, although these settings should be optional.
As a workaround, update the management cluster using the API or CLI.
This section lists the artifacts of components included in the Container Cloud
patch release 2.26.3. For artifacts of the Cluster releases introduced in
2.26.3, see patch Cluster releases 17.1.3 and 16.1.3.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The Container Cloud patch release 2.26.2, which is based on the
2.26.0 major release, provides the following updates:
Support for the patch Cluster releases 16.1.2
and 17.1.2 that represents Mirantis OpenStack for Kubernetes
(MOSK) patch release
24.1.2.
Support for MKE 3.7.6.
Support for docker-ee-cli 23.0.10 in MCR 23.0.9 to fix several CVEs.
Bare metal: update of Ubuntu mirror from 20.04~20240302175618 to
20.04~20240324172903 along with update of minor kernel version from
5.15.0-97-generic to 5.15.0-101-generic.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases
17.1.0 and 16.1.0. It does not
support greenfield deployments based on deprecated Cluster releases. Use the latest available Cluster release
instead.
For main deliverables of the parent Container Cloud release of 2.26.2, refer
to 2.26.0.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since the Container Cloud 2.26.1 patch
release. The common CVEs are issues addressed across several images.
When trying to list the HostOSConfigurationModules and HostOSConfiguration custom resources, a service user or a user with
the global-admin or operator role obtains the access denied error.
For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
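One possible dummy change, shown here only as a sketch, is to add or refresh a throwaway annotation on the affected service so that MetalLB reconciles it again; the annotation key is arbitrary and used for illustration only:
kubectl annotate service <serviceName> -n <namespace> dummy-update="$(date +%s)" --overwrite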
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
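For example, the following command lists both pods together with the nodes they run on, so that you can compare the NODE column:
kubectl get pods -A -o wide | grep -E 'dnsmasq|dhcp-relay'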
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
LCM¶[41540] LCM Agent cannot grab storage information on a host¶
Due to issues with managing physical NVME devices, lcm-agent cannot grab
storage information on a host. As a result,
lcmmachine.status.hostinfo.hardware is empty and the following example
error is present in logs:
{"level":"error","ts":"2024-05-02T12:26:10Z","logger":"agent",\"msg":"get hardware details",\"host":"kaas-node-548b2861-aed0-41c9-8ff2-10c5476b000b",\"error":"new storage info: get disk info \"nvme0c0n1\": \invoke command: exit status 1","errorVerbose":"exit status 1
As a workaround, on the affected node, create a symlink for any device
indicated in lcm-agent logs. For example:
ln -sfn /dev/nvme0n1 /dev/nvme0c0n1
[40811] Pod is stuck in the Terminating state on the deleted node¶
During deletion of a machine, the related DaemonSet Pod can remain on the
deleted node in the Terminating state. As a workaround, manually
delete the Pod:
kubectl delete pod -n <podNamespace> <podName>
[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[41819] Graceful cluster reboot is blocked by the Ceph ClusterWorkloadLocks¶
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contains the
FailedMount events:
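A minimal sketch of such a check, using the placeholders described below:
kubectl describe pod <affectedPodName> -n <affectedProjectName> | grep -A 5 FailedMount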
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
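A possible way to identify it, assuming the Ceph CSI plugin runs in the rook-ceph namespace with the app=csi-rbdplugin label, is to match the pod scheduled on the affected node:
kubectl get pods -n rook-ceph -l app=csi-rbdplugin -o wide | grep <nodeName>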
On large managed clusters, shard relocation may fail in the OpenSearch cluster
with the yellow or red status of the OpenSearch cluster.
The characteristic symptom of the issue is that in the stacklight
namespace, the statefulset.apps/opensearch-master containers are
experiencing throttling with the KubeContainersCPUThrottlingHigh alert
firing for the following set of labels:
The throttling that OpenSearch is experiencing may be a temporary
situation, which may be related, for example, to a peak load and the
ongoing shards initialization as part of disaster recovery or after node
restart. In this case, Mirantis recommends waiting until initialization
of all shards is finished. After that, verify the cluster state and whether
throttling still exists. And only if throttling does not disappear, apply
the workaround below.
To verify that the initialization of shards is ongoing:
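One way to check this, assuming the default OpenSearch port and the opensearch container name, is to query the _cat/shards API from an opensearch-master pod and filter for initializing shards:
kubectl exec -n stacklight opensearch-master-0 -c opensearch -- curl -s localhost:9200/_cat/shards | grep -i initializing
Example output for illustration:
.ds-system-000072 1 r INITIALIZING x.x.x.x opensearch-master-1
.ds-system-000073 0 r INITIALIZING x.x.x.x opensearch-master-2
.ds-audit-000001 0 r INITIALIZING x.x.x.x opensearch-master-1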
The system response above indicates that shards from the
.ds-system-000072, .ds-system-000073, and .ds-audit-000001
indices are in the INITIALIZING state. In this case, Mirantis
recommends waiting until this process is finished, and only then consider
changing the limit.
You can additionally analyze the exact level of throttling and the current
CPU usage on the Kubernetes Containers dashboard in Grafana.
Workaround:
Verify the currently configured CPU requests and limits for the
opensearch containers:
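For example, the following jsonpath query prints the resources configured for the opensearch container of the opensearch-master StatefulSet; the container name is an assumption:
kubectl get statefulset opensearch-master -n stacklight -o jsonpath='{.spec.template.spec.containers[?(@.name=="opensearch")].resources}'
Example output for illustration:
{"limits":{"cpu":"600m"},"requests":{"cpu":"500m"}}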
In the example above, the CPU request is 500m and the CPU limit is
600m.
Increase the CPU limit to a reasonably high number.
For example, the default CPU limit for the clusters with the
clusterSize: large parameter set was increased from
8000m to 12000m for StackLight in Container Cloud 2.27.0
(Cluster releases 17.2.0 and 16.2.0).
If the CPU limit for the opensearch component is already set, increase
it in the Cluster object for the opensearch parameter. Otherwise,
the default StackLight limit is used. In this case, increase the CPU limit
for the opensearch component using the resources parameter.
Wait until all opensearch-master pods are recreated with the new CPU
limits and become running and ready.
To verify the current CPU limit for every opensearch container in every
opensearch-master pod separately:
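A minimal sketch that iterates over the pods by name and prints the CPU limit of the opensearch container (the container name is an assumption):
for p in $(kubectl get pods -n stacklight -o name | grep opensearch-master); do kubectl get $p -n stacklight -o jsonpath='{.metadata.name}{": "}{.spec.containers[?(@.name=="opensearch")].resources.limits.cpu}{"\n"}'; done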
The waiting time may take up to 20 minutes depending on the cluster size.
If the issue is fixed, the KubeContainersCPUThrottlingHigh alert stops
firing immediately, while OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical can still be firing for some time during
shard relocation.
If the KubeContainersCPUThrottlingHigh alert is still firing, proceed with
another iteration of the CPU limit increase.
[40020] Rollover policy update is not applied to the current index¶
While updating rollover_policy for the current system* and audit*
data streams, the update is not applied to indices.
One of the indicators that the cluster is most likely affected is the
KubeJobFailed alert firing for the elasticsearch-curator job and one or
both of the following errors being present in elasticsearch-curator pods
that remain in the Error status:
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-audit-000001] is the write index for data stream [audit] and cannot be deleted')
or
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-system-000001] is the write index for data stream [system] and cannot be deleted')
Note
Instead of .ds-audit-000001 or .ds-system-000001 index names,
similar names can be present with the same prefix but different suffix
numbers.
If the above-mentioned alert and errors are present, immediate action is
required, because it indicates that the corresponding index size has already
exceeded the space allocated for the index.
To verify that the cluster is affected:
Caution
Verify and apply the workaround to both index patterns, system and
audit, separately.
If one of the indices is affected, the second one is most likely affected
as well. Although in rare cases, only one index may be affected.
The cluster is affected if the rollover policy is missing.
Otherwise, proceed to the following step.
Verify the system response from the previous step. For example:
{"_id":"system_rollover_policy","_version":7229,"_seq_no":42362,"_primary_term":28,"policy":{"policy_id":"system_rollover_policy","description":"system index rollover policy.","last_updated_time":1708505222430,"schema_version":19,"error_notification":null,"default_state":"rollover","states":[{"name":"rollover","actions":[{"retry":{"count":3,"backoff":"exponential","delay":"1m"},"rollover":{"min_size":"14746mb","copy_alias":false}}],"transitions":[]}],"ism_template":[{"index_patterns":["system*"],"priority":200,"last_updated_time":1708505222430}]}}
Verify and capture the following items separately for every policy:
The _seq_no and _primary_term values
The rollover policy threshold, which is defined in
policy.states[0].actions[0].rollover.min_size
If the rollover policy is not attached, the cluster is affected.
If the rollover policy is attached but _seq_no and _primary_term
numbers do not match the previously captured ones, the cluster is
affected.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size),
the cluster is most probably affected.
Perform again the last step of the cluster verification procedure provided
above and make sure that the policy is attached to the index and has
the same _seq_no and _primary_term.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size), wait
up to 15 minutes and verify that the additional index is created with
the consecutive number in the index name. For example:
system: if you applied changes to .ds-system-000001, wait until
.ds-system-000002 is created.
audit: if you applied changes to .ds-audit-000001, wait until
.ds-audit-000002 is created.
If such an index is not created, escalate the issue to Mirantis support.
Container Cloud web UI¶[41806] Configuration of a management cluster fails without Keycloak settings¶
During configuration of management cluster settings using the
Configure cluster web UI menu, updating the Keycloak Truststore
settings is erroneously required, although these settings should be optional.
As a workaround, update the management cluster using the API or CLI.
This section lists the artifacts of components included in the Container Cloud
patch release 2.26.2. For artifacts of the Cluster releases introduced in
2.26.2, see patch Cluster releases 17.1.2 and 16.1.2.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The Container Cloud patch release 2.26.1, which is based on the
2.26.0 major release, provides the following updates:
Support for the patch Cluster releases 16.1.1
and 17.1.1 that represents Mirantis OpenStack for Kubernetes
(MOSK) patch release
24.1.1.
Delivery mechanism for CVE fixes on Ubuntu in bare metal clusters that
includes update of Ubuntu kernel minor version.
For details, see Enhancements.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases
17.1.0 and 16.1.0. It does not
support greenfield deployments based on deprecated Cluster releases. Use the latest available Cluster release
instead.
For main deliverables of the parent Container Cloud release of 2.26.1, refer
to 2.26.0.
This section outlines new features and enhancements introduced in the
Container Cloud patch release 2.26.1 along with Cluster releases 17.1.1 and
16.1.1.
Delivery mechanism for CVE fixes on Ubuntu in bare metal clusters¶
Introduced the ability to update Ubuntu packages including kernel minor
version update, when available in a Cluster release, for both management and
managed bare metal clusters to address CVE issues on a host operating system.
On management clusters, the update of Ubuntu mirror along with the update
of minor kernel version occurs automatically with cordon-drain and reboot
of machines.
On managed clusters, the update of Ubuntu mirror along with the update
of minor kernel version applies during a manual cluster update without
automatic cordon-drain and reboot of machines. After a managed cluster
update, all cluster machines have the reboot is required notification.
You can manually handle the reboot of machines during a convenient
maintenance window using
GracefulRebootRequest.
This section lists the artifacts of components included in the Container Cloud
patch release 2.26.1. For artifacts of the Cluster releases introduced in
2.26.1, see patch Cluster releases 17.1.1 and 16.1.1.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since the Container Cloud 2.26.0 major
release. The common CVEs are issues addressed across several images.
The following issues have been addressed in the Container Cloud patch release
2.26.1 along with the patch Cluster releases 17.1.1
and 16.1.1.
[39330] [StackLight] Fixed the issue with the OpenSearch cluster being
stuck due to initializing replica shards.
[39220] [StackLight] Fixed the issue with Patroni failure due to no limit
configuration for the max_timelines_history parameter.
[39080] [StackLight] Fixed the issue with the
OpenSearchClusterStatusWarning alert firing during cluster upgrade if
StackLight is deployed in the HA mode.
[38970] [StackLight] Fixed the issue with the Logs dashboard
in the OpenSearch Dashboards web UI not working for the system index.
[38937] [StackLight] Fixed the issue with the
View logs in OpenSearch Dashboards link not working in the
Grafana web UI.
[40747] [vSphere] Fixed the issue with the unsupported Cluster release
being available for greenfield vSphere-based managed cluster deployments
in the drop-down menu of the cluster creation window in the Container Cloud
web UI.
[40036] [LCM] Fixed the issue causing nodes to remain in the
Kubernetes cluster when the corresponding Machine object is disabled
during cluster update.
When trying to list the HostOSConfigurationModules and HostOSConfiguration custom resources, a service user or a user with
the global-admin or operator role obtains the access denied error.
For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
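One possible dummy change, shown here only as a sketch, is to add or refresh a throwaway annotation on the affected service so that MetalLB reconciles it again; the annotation key is arbitrary and used for illustration only:
kubectl annotate service <serviceName> -n <namespace> dummy-update="$(date +%s)" --overwrite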
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
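For example, the following command lists both pods together with the nodes they run on, so that you can compare the NODE column:
kubectl get pods -A -o wide | grep -E 'dnsmasq|dhcp-relay'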
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
LCM¶[41540] LCM Agent cannot grab storage information on a host¶
Due to issues with managing physical NVME devices, lcm-agent cannot grab
storage information on a host. As a result,
lcmmachine.status.hostinfo.hardware is empty and the following example
error is present in logs:
{"level":"error","ts":"2024-05-02T12:26:10Z","logger":"agent",\"msg":"get hardware details",\"host":"kaas-node-548b2861-aed0-41c9-8ff2-10c5476b000b",\"error":"new storage info: get disk info \"nvme0c0n1\": \invoke command: exit status 1","errorVerbose":"exit status 1
As a workaround, on the affected node, create a symlink for any device
indicated in lcm-agent logs. For example:
ln -sfn /dev/nvme0n1 /dev/nvme0c0n1
[40811] Pod is stuck in the Terminating state on the deleted node¶
During deletion of a machine, the related DaemonSet Pod can remain on the
deleted node in the Terminating state. As a workaround, manually
delete the Pod:
kubectl delete pod -n <podNamespace> <podName>
[39437] Failure to replace a master node on a Container Cloud cluster¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[41819] Graceful cluster reboot is blocked by the Ceph ClusterWorkloadLocks¶
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contains the
FailedMount events:
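A minimal sketch of such a check, using the placeholders described below:
kubectl describe pod <affectedPodName> -n <affectedProjectName> | grep -A 5 FailedMount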
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
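A possible way to identify it, assuming the Ceph CSI plugin runs in the rook-ceph namespace with the app=csi-rbdplugin label, is to match the pod scheduled on the affected node:
kubectl get pods -n rook-ceph -l app=csi-rbdplugin -o wide | grep <nodeName>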
On large managed clusters, shard relocation may fail in the OpenSearch cluster
with the yellow or red status of the OpenSearch cluster.
The characteristic symptom of the issue is that in the stacklight
namespace, the statefulset.apps/opensearch-master containers are
experiencing throttling with the KubeContainersCPUThrottlingHigh alert
firing for the following set of labels:
The throttling that OpenSearch is experiencing may be a temporary
situation, which may be related, for example, to a peak load and the
ongoing shards initialization as part of disaster recovery or after node
restart. In this case, Mirantis recommends waiting until initialization
of all shards is finished. After that, verify the cluster state and whether
throttling still exists. And only if throttling does not disappear, apply
the workaround below.
To verify that the initialization of shards is ongoing:
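One way to check this, assuming the default OpenSearch port and the opensearch container name, is to query the _cat/shards API from an opensearch-master pod and filter for initializing shards:
kubectl exec -n stacklight opensearch-master-0 -c opensearch -- curl -s localhost:9200/_cat/shards | grep -i initializing
Example output for illustration:
.ds-system-000072 1 r INITIALIZING x.x.x.x opensearch-master-1
.ds-system-000073 0 r INITIALIZING x.x.x.x opensearch-master-2
.ds-audit-000001 0 r INITIALIZING x.x.x.x opensearch-master-1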
The system response above indicates that shards from the
.ds-system-000072, .ds-system-000073, and .ds-audit-000001
indices are in the INITIALIZING state. In this case, Mirantis
recommends waiting until this process is finished, and only then consider
changing the limit.
You can additionally analyze the exact level of throttling and the current
CPU usage on the Kubernetes Containers dashboard in Grafana.
Workaround:
Verify the currently configured CPU requests and limits for the
opensearch containers:
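For example, the following jsonpath query prints the resources configured for the opensearch container of the opensearch-master StatefulSet; the container name is an assumption:
kubectl get statefulset opensearch-master -n stacklight -o jsonpath='{.spec.template.spec.containers[?(@.name=="opensearch")].resources}'
Example output for illustration:
{"limits":{"cpu":"600m"},"requests":{"cpu":"500m"}}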
In the example above, the CPU request is 500m and the CPU limit is
600m.
Increase the CPU limit to a reasonably high number.
For example, the default CPU limit for the clusters with the
clusterSize: large parameter set was increased from
8000m to 12000m for StackLight in Container Cloud 2.27.0
(Cluster releases 17.2.0 and 16.2.0).
If the CPU limit for the opensearch component is already set, increase
it in the Cluster object for the opensearch parameter. Otherwise,
the default StackLight limit is used. In this case, increase the CPU limit
for the opensearch component using the resources parameter.
Wait until all opensearch-master pods are recreated with the new CPU
limits and become running and ready.
To verify the current CPU limit for every opensearch container in every
opensearch-master pod separately:
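A minimal sketch that iterates over the pods by name and prints the CPU limit of the opensearch container (the container name is an assumption):
for p in $(kubectl get pods -n stacklight -o name | grep opensearch-master); do kubectl get $p -n stacklight -o jsonpath='{.metadata.name}{": "}{.spec.containers[?(@.name=="opensearch")].resources.limits.cpu}{"\n"}'; done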
The waiting time may take up to 20 minutes depending on the cluster size.
If the issue is fixed, the KubeContainersCPUThrottlingHigh alert stops
firing immediately, while OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical can still be firing for some time during
shard relocation.
If the KubeContainersCPUThrottlingHigh alert is still firing, proceed with
another iteration of the CPU limit increase.
[40020] Rollover policy update is not applied to the current index¶
While updating rollover_policy for the current system* and audit*
data streams, the update is not applied to indices.
One of the indicators that the cluster is most likely affected is the
KubeJobFailed alert firing for the elasticsearch-curator job and one or
both of the following errors being present in elasticsearch-curator pods
that remain in the Error status:
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-audit-000001] is the write index for data stream [audit] and cannot be deleted')
or
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-system-000001] is the write index for data stream [system] and cannot be deleted')
Note
Instead of .ds-audit-000001 or .ds-system-000001 index names,
similar names can be present with the same prefix but different suffix
numbers.
If the above-mentioned alert and errors are present, immediate action is
required, because it indicates that the corresponding index size has already
exceeded the space allocated for the index.
To verify that the cluster is affected:
Caution
Verify and apply the workaround to both index patterns, system and
audit, separately.
If one of the indices is affected, the second one is most likely affected
as well. Although in rare cases, only one index may be affected.
The cluster is affected if the rollover policy is missing.
Otherwise, proceed to the following step.
Verify the system response from the previous step. For example:
{"_id":"system_rollover_policy","_version":7229,"_seq_no":42362,"_primary_term":28,"policy":{"policy_id":"system_rollover_policy","description":"system index rollover policy.","last_updated_time":1708505222430,"schema_version":19,"error_notification":null,"default_state":"rollover","states":[{"name":"rollover","actions":[{"retry":{"count":3,"backoff":"exponential","delay":"1m"},"rollover":{"min_size":"14746mb","copy_alias":false}}],"transitions":[]}],"ism_template":[{"index_patterns":["system*"],"priority":200,"last_updated_time":1708505222430}]}}
Verify and capture the following items separately for every policy:
The _seq_no and _primary_term values
The rollover policy threshold, which is defined in
policy.states[0].actions[0].rollover.min_size
If the rollover policy is not attached, the cluster is affected.
If the rollover policy is attached but _seq_no and _primary_term
numbers do not match the previously captured ones, the cluster is
affected.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size),
the cluster is most probably affected.
Perform again the last step of the cluster verification procedure provided
above and make sure that the policy is attached to the index and has
the same _seq_no and _primary_term.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size), wait
up to 15 minutes and verify that the additional index is created with
the consecutive number in the index name. For example:
system: if you applied changes to .ds-system-000001, wait until
.ds-system-000002 is created.
audit: if you applied changes to .ds-audit-000001, wait until
.ds-audit-000002 is created.
If such an index is not created, escalate the issue to Mirantis support.
Container Cloud web UI¶[41806] Configuration of a management cluster fails without Keycloak settings¶
During configuration of management cluster settings using the
Configure cluster web UI menu, updating the Keycloak Truststore
settings is erroneously required, although these settings should be optional.
As a workaround, update the management cluster using the API or CLI.
Does not support greenfield deployments on deprecated Cluster releases
of the 17.0.x and 16.0.x series. Use the latest available Cluster releases
of the series instead.
Caution
Make sure to update the Cluster release version
of your managed cluster before the current Cluster release
version becomes unsupported by a new Container Cloud release
version.
Otherwise, Container Cloud stops auto-upgrade and eventually
Container Cloud itself becomes unsupported.
This section outlines release notes for the Container Cloud release 2.26.0.
This section outlines new features and enhancements introduced in the
Container Cloud release 2.26.0. For the list of enhancements delivered with
the Cluster releases introduced by Container Cloud 2.26.0, see
17.1.0 and 16.1.0.
Pre-update inspection of pinned product artifacts in a ‘Cluster’ object¶
To ensure that Container Cloud clusters remain consistently updated with the
latest security fixes and product improvements, the Admission Controller
has been enhanced. Now, it actively prevents the utilization of pinned
custom artifacts for Container Cloud components. Specifically, it blocks
a management or managed cluster release update, or any cluster configuration
update, for example, adding public keys or proxy, if a Cluster object
contains any custom Container Cloud artifacts with global or image-related
values overwritten in the helm-releases section, until these values are
removed.
Normally, the Container Cloud clusters do not contain pinned artifacts,
which eliminates the need for any pre-update actions in most deployments.
However, if the update of your cluster is blocked with the
invalid HelmReleases configuration error, refer to
Update notes: Pre-update actions for details.
Note
In rare cases, if the image-related or global values should be
changed, you can use the ClusterRelease or KaaSRelease objects
instead. But make sure to update these values manually after every major
and patch update.
Note
The pre-update inspection applies only to images delivered by
Container Cloud that are overwritten. Any custom images unrelated to the
product components are not verified and do not block cluster update.
Disablement of worker machines on managed clusters¶
TechPreview
Implemented the machine disabling API that allows you to seamlessly remove
a worker machine from the LCM control of a managed cluster. This action
isolates the affected node without impacting other machines in the cluster,
effectively eliminating it from the Kubernetes cluster. This functionality
proves invaluable in scenarios where a malfunctioning machine impedes cluster
updates.
Added initial Technology Preview support for the HostOSConfiguration and
HostOSConfigurationModules custom resources in the bare metal provider.
These resources introduce configuration modules that allow managing the
operating system of a bare metal host granularly without rebuilding the node
from scratch. This approach avoids workload evacuation and significantly
reduces configuration time.
Configuration modules manage various settings of the operating system using
Ansible playbooks, adhering to specific schemas and metadata requirements.
For description of module format, schemas, and rules, contact Mirantis support.
Warning
For security reasons and to ensure safe and reliable cluster
operability, contact Mirantis support
to start using these custom resources.
Caution
Since the feature is still in the development stage,
Mirantis highly recommends deleting all HostOSConfiguration objects,
if any, before automatic upgrade of the management cluster to Container Cloud
2.27.0 (Cluster release 16.2.0). After the upgrade, you can recreate the
required objects using the updated parameters.
This precautionary step prevents re-processing and re-applying of existing
configuration, which is defined in HostOSConfiguration objects, during
management cluster upgrade to 2.27.0. Such behavior is caused by changes in
the HostOSConfiguration API introduced in 2.27.0.
Strict filtering for devices on bare metal clusters¶
Implemented the strict byID filtering for targeting system disks using
specific device options: byPath, serialNumber, and wwn.
These options offer a more reliable alternative to the unpredictable
byName naming format.
Mirantis recommends adopting these new device naming options when adding new
nodes and redeploying existing ones to ensure a predictable and stable device
naming schema.
Dynamic IP allocation for faster host provisioning¶
Introduced a mechanism in the Container Cloud dnsmasq server to dynamically
allocate IP addresses for bare metal hosts during provisioning. This new
mechanism replaces sequential IP allocation that includes the ping check
with dynamic IP allocation without the ping check. Such behavior significantly
increases the number of bare metal servers that you can provision in parallel,
which allows you to streamline the process of setting up a large managed
cluster.
Support for Kubernetes auditing and profiling on management clusters¶
Added support for the Kubernetes auditing and profiling enablement and
configuration on management clusters. The auditing option is enabled by
default. You can configure both options using the Cluster object of the
management cluster.
Note
For managed clusters, you can also configure Kubernetes auditing
along with profiling using the Cluster object of a managed cluster.
Cleanup of LVM thin pool volumes during cluster provisioning¶
Implemented automatic cleanup of LVM thin pool volumes during the provisioning
stage to prevent issues with logical volume detection before removal, which
could cause node cleanup failure during cluster redeployment.
Wiping a device or partition before a bare metal cluster deployment¶
Implemented the capability to erase existing data from hardware devices to be
used for a bare metal management or managed cluster deployment. Using the
new wipeDevice structure, you can either erase an existing partition or
remove all existing partitions from a physical device. For these purposes,
use the eraseMetadata or eraseDevice option that configures cleanup
behavior during configuration of a custom bare metal host profile.
Note
The wipeDevice option replaces the deprecated wipe option
that will be removed in one of the following releases. For backward
compatibility, any existing wipe:true option is automatically converted
to the following structure:
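A sketch of the converted structure, assuming the eraseMetadata form of wipeDevice:
wipeDevice:
  eraseMetadata:
    enabled: true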
Policy Controller for validating pod image signatures¶
Technology Preview
Introduced initial Technology Preview support for the Policy Controller that
validates signatures of pod images.
The Policy Controller verifies that images used by the Container Cloud and
Mirantis OpenStack for Kubernetes controllers are signed by a trusted authority.
The Policy Controller inspects defined image policies that list Docker
registries and authorities for signature validation.
Added support for configuring Keycloak truststore using the Container Cloud
web UI to allow for a proper validation of client self-signed certificates.
The truststore is used to ensure secured connection to identity brokers,
LDAP identity providers, and others.
Added the LCM Operation condition to monitor the health of all LCM
operations on a cluster and its machines, which is useful during cluster update.
You can monitor the status of LCM operations using the Container Cloud
web UI in the status hover menus of a cluster and machine.
On top of continuous improvements delivered to the existing Container Cloud
guides, added the documentation on how to export logs from OpenSearch
dashboards to CSV.
The following issues have been addressed in the Mirantis Container Cloud
release 2.26.0 along with the Cluster releases 17.1.0 and
16.1.0.
Note
This section provides descriptions of issues addressed since
the last Container Cloud patch release 2.25.4.
For details on addressed issues in earlier patch releases since 2.25.0,
which are also included into the major release 2.26.0, refer to
2.25.x patch releases.
[32761] [LCM] Fixed the issue with node cleanup failing on
MOSK clusters due to the Ansible provisioner hanging in a
loop while trying to remove LVM thin pool logical volumes, which
occurred due to issues with volume detection before removal during cluster
redeployment. The issue resolution comprises implementation of automatic
cleanup of LVM thin pool volumes during the provisioning stage.
[36924] [LCM] Fixed the issue with Ansible starting to run on nodes
of a managed cluster after the mcc-cache certificate is applied
on a management cluster.
[37268] [LCM] Fixed the issue with Container Cloud cluster being
blocked by a node stuck in the Prepare or Deploy state with
error processing package openssh-server. The issue was caused by
customizations in /etc/ssh/sshd_config, such as additional Match
statements.
[34820] [Ceph] Fixed the issue with the Ceph rook-operator failing
to connect to Ceph RADOS Gateway pods on clusters with the
Federal Information Processing Standard mode enabled.
[38340] [StackLight] Fixed the issue with Telegraf Docker Swarm timing
out while collecting data by increasing its timeout from 10 to 25 seconds.
When trying to list the HostOSConfigurationModules and HostOSConfiguration custom resources, a service user or a user with
the global-admin or operator role obtains the access denied error.
For example:
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned. However, the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
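One possible dummy change, shown here only as a sketch, is to add or refresh a throwaway annotation on the affected service so that MetalLB reconciles it again; the annotation key is arbitrary and used for illustration only:
kubectl annotate service <serviceName> -n <namespace> dummy-update="$(date +%s)" --overwrite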
After node maintenance of a management cluster, the newly added nodes may
fail to undergo provisioning successfully. The issue relates to new nodes
that are in the same L2 domain as the management cluster.
The issue was observed on environments having management cluster nodes
configured with a single L2 segment used for all network traffic
(PXE and LCM/management networks).
To verify whether the cluster is affected:
Verify whether the dnsmasq and dhcp-relay pods run on the same node
in the management cluster:
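For example, the following command lists both pods together with the nodes they run on, so that you can compare the NODE column:
kubectl get pods -A -o wide | grep -E 'dnsmasq|dhcp-relay'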
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
vSphere¶[40747] Unsupported Cluster release is available for managed cluster deployment¶
The Cluster release 16.0.0, which is not supported for greenfield vSphere-based
deployments, is still available in the drop-down menu of the cluster creation
window in the Container Cloud web UI.
Do not select this Cluster release to prevent deployment failures.
Use the latest supported version instead.
LCM¶[41540] LCM Agent cannot grab storage information on a host¶
Due to issues with managing physical NVME devices, lcm-agent cannot grab
storage information on a host. As a result,
lcmmachine.status.hostinfo.hardware is empty and the following example
error is present in logs:
{"level":"error","ts":"2024-05-02T12:26:10Z","logger":"agent",\"msg":"get hardware details",\"host":"kaas-node-548b2861-aed0-41c9-8ff2-10c5476b000b",\"error":"new storage info: get disk info \"nvme0c0n1\": \invoke command: exit status 1","errorVerbose":"exit status 1
As a workaround, on the affected node, create a symlink for any device
indicated in lcm-agent logs. For example:
ln -sfn /dev/nvme0n1 /dev/nvme0c0n1
[40036] Node is not removed from a cluster when its Machine is disabled¶
During the replacement of a master node on a cluster of any type, the process
may get stuck with Kubelet's NodeReady condition is Unknown in the
machine status on the remaining master nodes.
As a workaround, log in on the affected node and run the following
command:
docker restart ucp-kubelet
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶[41819] Graceful cluster reboot is blocked by the Ceph ClusterWorkloadLocks¶
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal and Ceph enabled fails with
PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contains the
FailedMount events:
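A minimal sketch of such a check, using the placeholders described below:
kubectl describe pod <affectedPodName> -n <affectedProjectName> | grep -A 5 FailedMount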
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
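A possible way to identify it, assuming the Ceph CSI plugin runs in the rook-ceph namespace with the app=csi-rbdplugin label, is to match the pod scheduled on the affected node:
kubectl get pods -n rook-ceph -l app=csi-rbdplugin -o wide | grep <nodeName>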
On High Availability (HA) clusters that use Local Volume Provisioner (LVP),
Prometheus and OpenSearch from StackLight may share the same pool of storage.
In such a configuration, OpenSearch may approach the 85% disk usage watermark
due to the combined storage allocation and usage patterns set by the Persistent
Volume Claim (PVC) size parameters for Prometheus and OpenSearch, which consume
storage the most.
When the 85% threshold is reached, the affected node is transitioned to the
read-only state, preventing shard allocation and causing the OpenSearch cluster
state to transition to Warning (Yellow) or Critical (Red).
Caution
The issue and the provided workaround apply only for clusters on
which OpenSearch and Prometheus utilize the same storage pool.
Derived from .values.elasticsearch.persistentVolumeUsableStorageSizeGB,
defaulting to .values.elasticsearch.persistentVolumeClaimSize if
unspecified. To obtain the OpenSearch PVC size:
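For example, assuming the default stacklight namespace and PVC naming, you can read the PVC capacity as follows:
kubectl get pvc -n stacklight -o custom-columns=NAME:.metadata.name,CAPACITY:.status.capacity.storage | grep opensearch-master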
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
If the formula result is positive, it is an early indication that the
cluster is affected.
Verify whether the OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical alert is firing. And if so,
verify the following:
Log in to the OpenSearch web UI.
In Management -> Dev Tools, run the following command:
GET _cluster/allocation/explain
The following system response indicates that the corresponding node is
affected:
"explanation":"the node is above the low watermark cluster setting \[cluster.routing.allocation.disk.watermark.low=85%], using more disk space \than the maximum allowed [85.0%], actual free: [xx.xxx%]"
Note
The system response may contain even higher watermark percent
than 85.0%, depending on the case.
Workaround:
Warning
The workaround implies adjustment of the retention threshold for
OpenSearch. Depending on the new threshold, some old logs will be
deleted.
A user-defined variable that specifies what percentage of the total storage
capacity should not be used by OpenSearch or Prometheus. This is used to
reserve space for other components. It should be expressed as a decimal.
For example, for 5% of reservation, Reserved_Percentage is 0.05.
Mirantis recommends using 0.05 as a starting point.
Filesystem_Reserve
Percentage to deduct for filesystems that may reserve some portion of the
available storage, which is marked as occupied. For example, for EXT4, it
is 5% by default, so the value must be 0.05.
Prometheus_PVC_Size_GB
Sourced from .values.prometheusServer.persistentVolumeClaimSize.
Total_Storage_Capacity_GB
Total capacity of the OpenSearch PVCs. For LVP, the capacity of the
storage pool. To obtain the total capacity:
The system response contains multiple outputs, one per opensearch-master
node. Select the capacity for the affected node.
Note
Convert the values to GB if they are set in different units.
The calculation of the above formula provides the maximum safe storage to allocate
for .values.elasticsearch.persistentVolumeUsableStorageSizeGB. Use this
formula as a reference for setting
.values.elasticsearch.persistentVolumeUsableStorageSizeGB on a cluster.
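For illustration only, assuming the formula subtracts the reserved percentages and the Prometheus PVC size from the total capacity (the exact formula is defined in the full procedure and is not reproduced here), a worked example could look as follows:
Total_Storage_Capacity_GB = 100
Reserved_Percentage = 0.05
Filesystem_Reserve = 0.05
Prometheus_PVC_Size_GB = 40
Usable storage for OpenSearch ≈ 100 * (1 - 0.05 - 0.05) - 40 = 50 GB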
Wait up to 15-20 minutes for OpenSearch to perform the cleanup.
Verify that the cluster is not affected anymore using the procedure above.
[42304] Failure of shard relocation in the OpenSearch cluster¶
On large managed clusters, shard relocation may fail in the OpenSearch cluster
with the yellow or red status of the OpenSearch cluster.
The characteristic symptom of the issue is that in the stacklight
namespace, the statefulset.apps/opensearch-master containers are
experiencing throttling with the KubeContainersCPUThrottlingHigh alert
firing for the following set of labels:
The throttling that OpenSearch is experiencing may be a temporary
situation, which may be related, for example, to a peak load and the
ongoing shards initialization as part of disaster recovery or after node
restart. In this case, Mirantis recommends waiting until initialization
of all shards is finished. After that, verify the cluster state and whether
throttling still exists. And only if throttling does not disappear, apply
the workaround below.
To verify that the initialization of shards is ongoing:
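One way to check this, assuming the default OpenSearch port and the opensearch container name, is to query the _cat/shards API from an opensearch-master pod and filter for initializing shards:
kubectl exec -n stacklight opensearch-master-0 -c opensearch -- curl -s localhost:9200/_cat/shards | grep -i initializing
Example output for illustration:
.ds-system-000072 1 r INITIALIZING x.x.x.x opensearch-master-1
.ds-system-000073 0 r INITIALIZING x.x.x.x opensearch-master-2
.ds-audit-000001 0 r INITIALIZING x.x.x.x opensearch-master-1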
The system response above indicates that shards from the
.ds-system-000072, .ds-system-000073, and .ds-audit-000001
indices are in the INITIALIZING state. In this case, Mirantis
recommends waiting until this process is finished and only then
considering a change of the limit.
You can additionally analyze the exact level of throttling and the current
CPU usage on the Kubernetes Containers dashboard in Grafana.
Workaround:
Verify the currently configured CPU requests and limits for the
opensearch containers:
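A minimal sketch of such a check, assuming the stacklight namespace and the opensearch container name in the opensearch-master StatefulSet:
kubectl -n stacklight get statefulset opensearch-master -o jsonpath='{.spec.template.spec.containers[?(@.name=="opensearch")].resources}'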
In the example above, the CPU request is 500m and the CPU limit is
600m.
Increase the CPU limit to a reasonably high number.
For example, the default CPU limit for the clusters with the
clusterSize:large parameter set was increased from
8000m to 12000m for StackLight in Container Cloud 2.27.0
(Cluster releases 17.2.0 and 16.2.0).
If the CPU limit for the opensearch component is already set, increase
it in the Cluster object for the opensearch parameter. Otherwise,
the default StackLight limit is used. In this case, increase the CPU limit
for the opensearch component using the resources parameter.
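For example, a sketch of the StackLight values that raise the limit, assuming that the resources parameter accepts per-component overrides for opensearch (adjust the numbers to your cluster size):
resources:
  opensearch:
    requests:
      cpu: "500m"
    limits:
      cpu: "12000m"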
Wait until all opensearch-master pods are recreated with the new CPU
limits and become running and ready.
To verify the current CPU limit for every opensearch container in every
opensearch-master pod separately:
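A sketch of such a per-pod check, assuming the stacklight namespace, the app=opensearch-master label, and the opensearch container name:
kubectl -n stacklight get pods -l app=opensearch-master -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[?(@.name=="opensearch")].resources.limits.cpu}{"\n"}{end}'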
The waiting time may take up to 20 minutes depending on the cluster size.
If the issue is fixed, the KubeContainersCPUThrottlingHigh alert stops
firing immediately, while OpenSearchClusterStatusWarning or
OpenSearchClusterStatusCritical can still be firing for some time during
shard relocation.
If the KubeContainersCPUThrottlingHigh alert is still firing, proceed with
another iteration of the CPU limit increase.
[40020] Rollover policy update is not applied to the current index¶
While updating rollover_policy for the current system* and audit*
data streams, the update is not applied to indices.
One of the indicators that the cluster is most likely affected is the
KubeJobFailed alert firing for the elasticsearch-curator job and one or
both of the following errors being present in elasticsearch-curator pods
that remain in the Error status:
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-audit-000001] is the write index for data stream [audit] and cannot be deleted')
or
2024-05-31 13:16:04,459 ERROR Failed to complete action: delete_indices. <class 'curator.exceptions.FailedExecution'>: Exception encountered. Rerun with loglevel DEBUG and/or check Elasticsearch logs for more information. Exception: RequestError(400, 'illegal_argument_exception', 'index [.ds-system-000001] is the write index for data stream [system] and cannot be deleted')
Note
Instead of .ds-audit-000001 or .ds-system-000001 index names,
similar names can be present with the same prefix but different suffix
numbers.
If the above-mentioned alert and errors are present, immediate action is
required, because they indicate that the corresponding index size has already
exceeded the space allocated for the index.
To verify that the cluster is affected:
Caution
Verify and apply the workaround to both index patterns, system and
audit, separately.
If one of the indices is affected, the second one is most likely affected
as well, although in rare cases only one index may be affected.
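For example, the policy attachment and the policy contents can be inspected in Management -> Dev Tools using the ISM API; the index and policy names below are examples for the system pattern, so substitute the audit equivalents when verifying the second pattern:
GET _plugins/_ism/explain/.ds-system-000001
GET _plugins/_ism/policies/system_rollover_policy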
The cluster is affected if the rollover policy is missing.
Otherwise, proceed to the following step.
Verify the system response from the previous step. For example:
{"_id":"system_rollover_policy","_version":7229,"_seq_no":42362,"_primary_term":28,"policy":{"policy_id":"system_rollover_policy","description":"system index rollover policy.","last_updated_time":1708505222430,"schema_version":19,"error_notification":null,"default_state":"rollover","states":[{"name":"rollover","actions":[{"retry":{"count":3,"backoff":"exponential","delay":"1m"},"rollover":{"min_size":"14746mb","copy_alias":false}}],"transitions":[]}],"ism_template":[{"index_patterns":["system*"],"priority":200,"last_updated_time":1708505222430}]}}
Verify and capture the following items separately for every policy:
The _seq_no and _primary_term values
The rollover policy threshold, which is defined in
policy.states[0].actions[0].rollover.min_size
If the rollover policy is not attached, the cluster is affected.
If the rollover policy is attached but _seq_no and _primary_term
numbers do not match the previously captured ones, the cluster is
affected.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size),
the cluster is most probably affected.
Repeat the last step of the cluster verification procedure provided
above and make sure that the policy is attached to the index and has
the same _seq_no and _primary_term.
If the index size drastically exceeds the defined threshold of the
rollover policy (which is the previously captured min_size), wait
up to 15 minutes and verify that the additional index is created with
the consecutive number in the index name. For example:
system: if you applied changes to .ds-system-000001, wait until
.ds-system-000002 is created.
audit: if you applied changes to .ds-audit-000001, wait until
.ds-audit-000002 is created.
If such an index is not created, escalate the issue to Mirantis support.
Container Cloud web UI¶
[41806] Configuration of a management cluster fails without Keycloak settings¶
During configuration of management cluster settings using the
Configure cluster web UI menu, the web UI requires updating the
Keycloak Truststore settings even though these settings are optional.
As a workaround, update the management cluster using the API or CLI.
The following table lists the major components and their versions delivered in
the Container Cloud 2.26.0.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
This section lists the artifacts of components included in the Container Cloud
release 2.26.0.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The table below includes the total numbers of addressed unique and common
vulnerabilities and exposures (CVE) by product component since the 2.25.4
patch release. The common CVEs are issues addressed across several images.
This section describes the specific actions you as a cloud operator need to
complete before or after your Container Cloud cluster update to the Cluster
releases 17.1.0 or 16.1.0.
Pre-update actions¶
Unblock cluster update by removing any pinned product artifacts¶
If any pinned product artifacts are present in the Cluster object of a
management or managed cluster, the update will be blocked by the Admission
Controller with the invalid HelmReleases configuration error until such
artifacts are removed. The update process does not start and any changes in
the Cluster object are blocked by the Admission Controller except the
removal of fields with pinned product artifacts.
Therefore, verify that the following sections of the Cluster objects
do not contain any image-related (tag, name, pullPolicy,
repository) and global values inside Helm releases:
The custom pinned product artifacts are inspected and blocked by the
Admission Controller to ensure that Container Cloud clusters remain
consistently updated with the latest security fixes and product improvements.
Note
The pre-update inspection applies only to images delivered by
Container Cloud that are overwritten. Any custom images unrelated to the
product components are not verified and do not block cluster update.
Update queries for custom log-based metrics in StackLight¶
Container Cloud 2.26.0 introduces a reorganized and significantly improved
StackLight logging pipeline. It involves changes in the queries implemented
in the scope of the logging.metricQueries feature designed for creation
of custom log-based metrics. For the procedure, see StackLight
operations: Create logs-based metrics.
If you already have some custom log-based metrics:
Before the cluster update, save existing queries.
After the cluster update, update the queries according to the changes
implemented in the scope of the logging.metricQueries feature.
These steps prevent failures of queries containing fields that are renamed
or removed in Container Cloud 2.26.0.
Post-update actions¶
Update bird configuration on BGP-enabled bare metal clusters¶
Container Cloud 2.26.0 introduces the bird daemon update from v1.6.8
to v2.0.7 on master nodes if BGP is used for announcement of the cluster
API load balancer address.
Configuration files for bird v1.x are not fully compatible with those for
bird v2.x. Therefore, if you used BGP announcement of cluster API LB address
on a deployment based on Cluster releases 17.0.0 or 16.0.0, update bird
configuration files to fit bird v2.x using configuration examples provided in
the API Reference: MultiRackCluster section.
Review and adjust the storage parameters for OpenSearch¶
To prevent underused or overused storage space, review your storage space
parameters for OpenSearch on the StackLight cluster:
Review the value of elasticsearch.persistentVolumeClaimSize and
the real storage available on volumes.
Decide whether you have to additionally set
elasticsearch.persistentVolumeUsableStorageSizeGB.
The Container Cloud patch release 2.25.4, which is based on the
2.25.0 major release, provides the following updates:
Support for the patch Cluster releases 16.0.4 and 17.0.4
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
23.3.4.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases
17.0.0 and 16.0.0. And it does not support greenfield
deployments based on deprecated Cluster releases. Use the latest available Cluster release
instead.
For main deliverables of the parent Container Cloud release of 2.25.4, refer
to 2.25.0.
This section lists the artifacts of components included in the Container Cloud
patch release 2.25.4. For artifacts of the Cluster releases introduced in
2.25.4, see patch Cluster releases 17.0.4 and 16.0.4.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since the Container Cloud 2.25.3 patch
release. The common CVEs are issues addressed across several images.
The following issues have been addressed in the Container Cloud patch release
2.25.4 along with the patch Cluster releases 17.0.4
and 16.0.4.
[38259] Fixed the issue causing the failure to attach an existing
MKE cluster to a Container Cloud management cluster. The issue was related
to byo-provider and prevented the attachment of MKE clusters having
fewer than three manager nodes and two worker nodes.
[38399] Fixed the issue causing the failure to deploy a management
cluster in the offline mode due to the issue in the setup script.
This section contains historical information on the unsupported Container
Cloud releases delivered in 2023. For the latest supported Container
Cloud release, see Container Cloud releases.
Introduces the major Cluster release 15.0.1 that is based on 14.0.1
and supports Mirantis OpenStack for Kubernetes (MOSK)
23.2.
Supports the Cluster release 14.0.1.
The deprecated Cluster release 14.0.0 and the 12.7.x along with
11.7.x series are not supported for new deployments.
Contains features and amendments of the parent releases
2.24.0 and 2.24.1.
Support for the patch Cluster releases 16.0.3 and 17.0.3
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
23.3.3.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases
17.0.0 and 16.0.0. And it does not support greenfield
deployments based on deprecated Cluster releases. Use the latest available Cluster release
instead.
For main deliverables of the parent Container Cloud release of 2.25.3, refer
to 2.25.0.
This section lists the artifacts of components included in the Container Cloud
patch release 2.25.3. For artifacts of the Cluster releases introduced in
2.25.3, see patch Cluster releases 17.0.3 and 16.0.3.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since the Container Cloud 2.25.2 patch
release. The common CVEs are issues addressed across several images.
The following issues have been addressed in the Container Cloud patch release
2.25.3 along with the patch Cluster releases 17.0.3
and 16.0.3.
[37634][OpenStack] Fixed the issue with a management or managed cluster
deployment or upgrade being blocked by all pods being stuck in the
Pending state due to incorrect secrets being used to initialize
the OpenStack external Cloud Provider Interface.
[37766][IAM] Fixed the issue with sign-in to the MKE web UI of the
management cluster using the Sign in with External Provider
option, which failed with the invalid parameter: redirect_uri
error.
The Container Cloud patch release 2.25.2, which is based on the
2.25.0 major release, provides the following updates:
Renewed support for attachment of MKE clusters that are not originally
deployed by Container Cloud for vSphere-based management clusters.
Support for the patch Cluster releases 16.0.2 and 17.0.2
that represents Mirantis OpenStack for Kubernetes (MOSK) patch release
23.3.2.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases
17.0.0 and 16.0.0. And it does not support greenfield
deployments based on deprecated Cluster releases 14.0.1,
15.0.1, 16.0.1, and 17.0.1. Use the latest
available Cluster releases instead.
For main deliverables of the parent Container Cloud release of 2.25.2, refer
to 2.25.0.
This section lists the artifacts of components included in the Container Cloud
patch release 2.25.2. For artifacts of the Cluster releases introduced in
2.25.2, see patch Cluster releases 17.0.2 and 16.0.2.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since the Container Cloud 2.25.1 patch
release. The common CVEs are issues addressed across several images.
The Container Cloud patch release 2.25.1, which is based on the
2.25.0 major release, provides the following updates:
Support for the patch Cluster releases 16.0.1
and 17.0.1 that represents Mirantis OpenStack for Kubernetes
(MOSK) patch release
23.3.1.
Several product improvements. For details, see Enhancements.
Security fixes for CVEs in images.
This patch release also supports the latest major Cluster releases
17.0.0 and 16.0.0.
And it does not support greenfield deployments based on deprecated Cluster
releases 14.1.0, 14.0.1, and
15.0.1. Use the latest available Cluster releases
instead.
For main deliverables of the parent Container Cloud release of 2.25.1, refer
to 2.25.0.
This section outlines new features and enhancements introduced in the
Container Cloud patch release 2.25.1 along with Cluster releases 17.0.1 and
16.0.1.
Introduced support for Mirantis Kubernetes Engine (MKE) 3.7.2 on Container
Cloud management and managed clusters. On existing managed clusters, MKE is
updated to the latest supported version when you update your cluster to the
patch Cluster release 17.0.1 or 16.0.1.
To simplify MKE configuration through API, moved management of MKE parameters
controlled by Container Cloud from lcm-ansible to lcm-controller.
Now, Container Cloud overrides only a set of MKE configuration parameters that
are automatically managed by Container Cloud.
Introduced Kubernetes network policies for all StackLight components. The
feature is implemented using the networkPolicies parameter that is enabled
by default.
The Kubernetes NetworkPolicy resource allows controlling network connections
to and from Pods within a cluster. This enhances security by restricting
communication from compromised Pod applications and provides transparency
into how applications communicate with each other.
External vSphere CCM with CSI supporting vSphere 6.7 on Kubernetes 1.27¶
Switched to the external vSphere cloud controller manager (CCM) that uses
vSphere Container Storage Plug-in 3.0 for volume attachment. The feature
implementation implies an automatic migration of PersistentVolume and
PersistentVolumeClaim.
The external vSphere CCM supports vSphere 6.7 on Kubernetes 1.27 as compared
to the in-tree vSphere CCM that does not support vSphere 6.7 since
Kubernetes 1.25.
Important
The major Cluster release 14.1.0 is the last Cluster release
for the vSphere provider based on MCR 20.10 and MKE 3.6.6 with
Kubernetes 1.24. Therefore, Mirantis highly recommends updating your
existing vSphere-based managed clusters to the Cluster release
16.0.1 that contains newer versions of MCR and MKE with
Kubernetes. Otherwise, your management cluster upgrade to Container Cloud
2.25.2 will be blocked.
Since Container Cloud 2.25.1, the major Cluster release 14.1.0 is deprecated.
Greenfield vSphere-based deployments on this Cluster release are not
supported. Use the patch Cluster release 16.0.1 for new deployments instead.
This section lists the artifacts of components included in the Container Cloud
patch release 2.25.1. For artifacts of the Cluster releases introduced in
2.25.1, see patch Cluster releases 17.0.1 and
16.0.1.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The table below includes the total numbers of addressed unique and common
CVEs in images by product component since the Container Cloud 2.25.0 major
release. The common CVEs are issues addressed across several images.
The following issues have been addressed in the Container Cloud patch release
2.25.1 along with the patch Cluster releases 17.0.1
and 16.0.1.
[35426] [StackLight] Fixed the issue with the prometheus-libvirt-exporter
Pod failing to reconnect to libvirt after the libvirt Pod recovery from
a failure.
[35339] [LCM] Fixed the issue with the LCM Ansible task of copying
kubectl from the ucp-hyperkube image failing if
kubectl exec is in use, for example, during a management cluster
upgrade.
[35089] [bare metal, Calico] Fixed the issue with arbitrary Kubernetes pods
getting stuck in an error loop due to a failed Calico networking setup for
that pod.
[33936] [bare metal, Calico] Fixed the issue with deletion failure of a
controller node during machine replacement due to the upstream
Calico issue.
The Mirantis Container Cloud major release 2.25.0:
Introduces support for the Cluster release 17.0.0
that is based on the Cluster release 16.0.0 and
represents Mirantis OpenStack for Kubernetes (MOSK)
23.3.
Introduces support for the Cluster release 16.0.0 that
is based on Mirantis Container Runtime (MCR) 23.0.7 and Mirantis Kubernetes
Engine (MKE) 3.7.1 with Kubernetes 1.27.
Introduces support for the Cluster release 14.1.0 that
is dedicated for the vSphere provider only. This is the last Cluster
release for the vSphere provider based on MKE 3.6.6 with Kubernetes 1.24.
Does not support greenfield deployments on deprecated Cluster releases
of the 15.x and 14.x series. Use the latest available Cluster releases
of the series instead.
Caution
Make sure to update the Cluster release version
of your managed cluster before the current Cluster release
version becomes unsupported by a new Container Cloud release
version.
Otherwise, Container Cloud stops auto-upgrade and eventually
Container Cloud itself becomes unsupported.
This section outlines release notes for the Container Cloud release 2.25.0.
This section outlines new features and enhancements introduced in the
Container Cloud release 2.25.0. For the list of enhancements delivered with
the Cluster releases introduced by Container Cloud 2.25.0, see
17.0.0, 16.0.0, and
14.1.0.
Implemented Container Cloud Bootstrap v2 that provides an exceptional user
experience to set up Container Cloud. With Bootstrap v2, you also gain access
to a comprehensive and user-friendly web UI for the OpenStack and vSphere
providers.
Bootstrap v2 empowers you to effortlessly provision management clusters before
deployment, while benefiting from a streamlined process that isolates
each step. This approach not only simplifies the bootstrap process but also
enhances troubleshooting capabilities for addressing any potential
intermediate failures.
Note
The Bootstrap web UI support for the bare metal provider will be
added in one of the following Container Cloud releases.
General availability for ‘MetalLBConfigTemplate’ and ‘MetalLBConfig’ objects¶
Completed development of the MetalLB configuration related to address
allocation and announcement for load-balanced services using the
MetalLBConfigTemplate object for bare metal and the MetalLBConfig
object for vSphere. Container Cloud uses these objects in default templates as
recommended during creation of a management or managed cluster.
At the same time, removed the possibility to use the deprecated options, such
as configInline value of the MetalLB chart and the use of Subnet
objects without new MetalLBConfigTemplate and MetalLBConfig objects.
The automated migration, which was applied to these deprecated options during
creation of clusters of any type or cluster update to Container Cloud 2.24.x,
is removed automatically during your management cluster upgrade to Container
Cloud 2.25.0. After that, any changes in MetalLB configuration related to
address allocation and announcement for load-balanced services will be applied
using the MetalLBConfig, MetalLBConfigTemplate, and Subnet objects
only.
These annotations are helpful if you have a limited amount of free and unused
IP addresses for server provisioning. Using these annotations, you can
manually create bare metal hosts one by one and provision servers in small,
manually managed chunks.
Status of infrastructure health for bare metal and OpenStack providers¶
Implemented the Infrastructure Status condition to monitor
infrastructure readiness in the Container Cloud web UI during cluster
deployment for bare metal and OpenStack providers. Readiness of the following
components is monitored:
Bare metal: the MetalLBConfig object along with MetalLB and DHCP subnets
OpenStack: cluster network, routers, load balancers, and Bastion along with
their ports and floating IPs
For the bare metal provider, also implemented the
Infrastructure Status condition for machines to monitor readiness
of the IPAMHost, L2Template, BareMetalHost, and
BareMetalHostProfile objects associated with the machine.
General availability for RHEL 8.7 on vSphere-based clusters¶
Introduced general availability support for RHEL 8.7 on VMware vSphere-based
clusters. You can install this operating system on any type of a Container
Cloud cluster including the bootstrap node.
Note
RHEL 7.9 is not supported as the operating system for the bootstrap
node.
Caution
A Container Cloud cluster based on mixed RHEL versions, such as
RHEL 7.9 and 8.7, is not supported.
Implemented automatic cleanup of old Ubuntu kernels and other unnecessary
system packages. During cleanup, Container Cloud keeps the two most recent
kernel versions, which is the default behavior of the Ubuntu
apt autoremove command.
Mirantis recommends keeping two kernel versions with the previous kernel
version as a fallback option in the event that the current kernel may become
unstable at any time. However, if you absolutely require leaving only the
latest version of kernel packages, you can use the
cleanup-kernel-packages script after considering all possible risks.
Configuration of a custom OIDC provider for MKE on managed clusters¶
Implemented the ability to configure a custom OpenID Connect (OIDC) provider
for MKE on managed clusters using the ClusterOIDCConfiguration custom
resource. Using this resource, you can add your own OIDC provider
configuration to authenticate user requests to Kubernetes.
Note
For OpenStack and StackLight, Container Cloud supports only
Keycloak, which is configured on the management cluster,
as the OIDC provider.
Implemented the management-admin OIDC role to grant full admin access
specifically to a management cluster. This role enables the user to manage
Pods and all other resources of the cluster, for example, for debugging
purposes.
General availability for graceful machine deletion¶
Introduced general availability support for graceful machine deletion with
a safe cleanup of node resources:
Changed the default deletion policy from unsafe to graceful for
machine deletion using the Container Cloud API.
Using the deletionPolicy:graceful parameter in the
providerSpec.value section of the Machine object, the cloud provider
controller prepares a machine for deletion by cordoning, draining, and
removing the related node from Docker Swarm. If required, you can abort a
machine deletion when using deletionPolicy:graceful, but only before
the related node is removed from Docker Swarm (see the example after this list).
Implemented the following machine deletion methods in the Container Cloud
web UI: Graceful, Unsafe, Forced.
Added support for deletion of manager machines, which is intended only for
replacement or recovery of failed nodes, for MOSK-based
clusters using either of the deletion policies mentioned above.
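A minimal sketch of the Machine object section that sets the graceful policy described above, with surrounding fields omitted:
spec:
  providerSpec:
    value:
      deletionPolicy: graceful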
General availability for parallel update of worker nodes¶
Completed development of the parallel update of worker nodes during cluster
update by implementing the ability to configure the required options using the
Container Cloud web UI. Parallelizing of node update operations significantly
optimizes the update efficiency of large clusters.
The following options are added to the Create Cluster window:
Parallel Upgrade Of Worker Machines that sets the maximum number
of worker nodes to update simultaneously
Parallel Preparation For Upgrade Of Worker Machines that sets
the maximum number of worker nodes for which new artifacts are downloaded
at a given moment of time
The following issues have been addressed in the Mirantis Container Cloud
release 2.25.0 along with the Cluster releases 17.0.0,
16.0.0, and 14.1.0.
Note
This section provides descriptions of issues addressed since
the last Container Cloud patch release 2.24.5.
For details on addressed issues in earlier patch releases since 2.24.0,
which are also included into the major release 2.25.0, refer to
2.24.x patch releases.
[34462] [BM] Fixed the issue with incorrect handling of the DHCP egress
traffic by reconfiguring the external traffic policy for the dhcp-lb
Kubernetes Service. For details about the issue, refer to the
Kubernetes upstream bug.
On existing clusters with multiple L2 segments using DHCP relays on the
border switches, in order to successfully provision new nodes or reprovision
existing ones, manually point the DHCP relays on your network infrastructure
to the new IP address of the dhcp-lb Service of the Container Cloud
cluster.
To obtain the new IP address:
kubectl -n kaas get service dhcp-lb
[35429] [BM] Fixed the issue with the WireGuard interface not having
the IPv4 address assigned. The fix implies automatic restart of the
calico-node Pod to allocate the IPv4 address on the WireGuard interface.
[36131] [BM] Fixed the issue with IpamHost object changes not being
propagated to LCMMachine during netplan configuration after cluster
deployment.
[34657] [LCM] Fixed the issue with iam-keycloak Pods not starting
after powering up master nodes and starting the Container Cloud upgrade
right after.
[34750] [LCM] Fixed the issue with journald generating a lot of log
messages that already exist in the auditd log due to enabled
systemd-journald-audit.socket.
[35738] [StackLight] Fixed the issue with ucp-node-exporter failing to
start because it could not bind the port 9100 due to a conflict with
the StackLight node-exporter binding the same port.
The resolution of the issue involves an automatic change of the port for the
StackLight node-exporter from 9100 to 19100. No manual port update is
required.
If your cluster uses a firewall, add an additional firewall rule that
grants the same permissions to port 19100 as those currently assigned
to port 9100 on all cluster nodes.
[34296] [StackLight] Fixed the issue with the CPU over-consumption by
helm-controller leading to the KubeContainersCPUThrottlingHigh
alert firing.
This section lists known issues with workarounds for the Mirantis
Container Cloud release 2.25.0 including the Cluster releases
17.0.0, 16.0.0, and
14.1.0.
This section also outlines still valid known issues
from previous Container Cloud releases.
Bare metal¶
[42386] A load balancer service does not obtain the external IP address¶
Due to the MetalLB upstream issue,
a load balancer service may not obtain the external IP address.
The issue occurs when two services share the same external IP address and have
the same externalTrafficPolicy value. Initially, the services have the
external IP address assigned and are accessible. After modifying the
externalTrafficPolicy value for both services from Cluster to
Local, the first service that has been changed remains with no external IP
address assigned, while the second service, which was changed later, has the
external IP assigned as expected.
To work around the issue, make a dummy change to the service object where
external IP is <pending>:
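For example, one way to make a dummy change is to add or update an arbitrary annotation on the affected Service; the annotation key below is only an example:
kubectl -n <serviceNamespace> annotate service <serviceName> dummy-change=retrigger --overwrite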
An arbitrary Kubernetes pod may get stuck in an error loop due to a failed
Calico networking setup for that pod. The pod cannot access any network
resources. The issue occurs more often during cluster upgrade or node
replacement, but this can sometimes happen during the new deployment as well.
You may find the following log for the failed pod IP (for example,
10.233.121.132) in calico-node logs:
Due to the upstream Calico issue, a controller node
cannot be deleted if the calico-node Pod is stuck blocking node deletion.
One of the symptoms is the following warning in the baremetal-operator
logs:
Resolving dependency Service dhcp-lb in namespace kaas failed: the server
was unable to return a response in the time allotted, but may still be
processing the request (get endpoints dhcp-lb).
As a workaround, delete the Pod that is stuck to retrigger the node
deletion.
[24005] Deletion of a node with ironic Pod is stuck in the Terminating state¶
During deletion of a manager machine running the ironic Pod from a bare
metal management cluster, the following problems occur:
All Pods are stuck in the Terminating state
A new ironic Pod fails to start
The related bare metal host is stuck in the deprovisioning state
As a workaround, before deletion of the node running the ironic Pod,
cordon and drain the node using the kubectl cordon <nodeName> and
kubectl drain <nodeName> commands.
OpenStack¶
[37634] Cluster deployment or upgrade is blocked by all pods in ‘Pending’ state¶
When using OpenStackCredential with a custom CACert, a management or
managed cluster deployment or upgrade is blocked by all pods being stuck in
the Pending state. The issue is caused by incorrect secrets being used to
initialize the OpenStack external Cloud Provider Interface.
As a workaround, copy CACert from the OpenStackCredential object
to openstack-ca-secret:
A sign-in to the MKE web UI of the management cluster using the
Sign in with External Provider option can fail with the
invalid parameter: redirect_uri error.
Workaround:
Log in to the Keycloak admin console.
In the sidebar menu, switch to the IAM realm.
Navigate to Clients > kaas.
On the page, navigate to
Settings > Access settings > Valid redirect URIs.
Add https://<mgmt MKE IP>:6443/* to the list of valid redirect URIs
and click Save.
Refresh the browser window with the sign-in URI.
LCM¶
[31186,34132] Pods get stuck during MariaDB operations¶
During MariaDB operations on a management cluster, Pods may get stuck
in continuous restarts with the following example error:
On MOSK clusters, the Ansible provisioner may hang in a loop while trying to
remove LVM thin pool logical volumes (LVs) due to issues with volume detection
before removal. The Ansible provisioner cannot remove LVM thin pool LVs
correctly, so it consistently detects the same volumes whenever it scans
disks, leading to a repetitive cleanup process.
The following symptoms mean that a cluster can be affected:
A node was configured to use thin pool LVs. For example, it had the
OpenStack Cinder role in the past.
A bare metal node deployment flaps between the provisioning and
deprovisioning states.
In the Ansible provisioner logs, the following example warnings are growing:
88621.log:7389:2023-06-22 16:30:45.109 88621 ERROR ansible.plugins.callback.ironic_log[-] Ansible task clean:fail failed on node 14eb0dbc-c73a-4298-8912-4bb12340ff49:{'msg':'There are more devices to clean', '_ansible_no_log': None, 'changed': False}
Important
There are more devices to clean is a regular warning
indicating some in-progress tasks. But if the number of such warnings is
growing along with the node flapping between the provisioning and
deprovisioning states, the cluster is highly likely affected by the
issue.
As a workaround, erase disks manually using any preferred tool.
[30294] Replacement of a master node is stuck on the calico-node Pod start¶
During replacement of a master node on a cluster of any type, the
calico-node Pod fails to start on a new node that has the same IP address
as the node being replaced.
Workaround:
Log in to any master node.
From a CLI with an MKE client bundle, create a shell alias to start
calicoctl using the mirantis/ucp-dsinfo image:
During the unsafe or forced deletion of a manager machine running the
calico-kube-controllers Pod in the kube-system namespace,
the following issues occur:
The calico-kube-controllers Pod fails to clean up resources associated
with the deleted node
The calico-node Pod may fail to start up on a newly created node if the
machine is provisioned with the same IP address as the deleted machine had
As a workaround, before deletion of the node running the
calico-kube-controllers Pod, cordon and drain the node:
kubectl cordon <nodeName>
kubectl drain <nodeName>
Ceph¶
[34820] The Ceph ‘rook-operator’ fails to connect to RGW on FIPS nodes¶
Due to the upstream Ceph issue,
on clusters with the Federal Information Processing Standard (FIPS) mode
enabled, the Ceph rook-operator fails to connect to Ceph RADOS Gateway
(RGW) pods.
As a workaround, do not place Ceph RGW pods on nodes where FIPS mode is
enabled.
[26441] Cluster update fails with the MountDevice failed for volume warning¶
Update of a managed cluster based on bare metal with Ceph enabled fails with
a PersistentVolumeClaim getting stuck in the Pending state for the
prometheus-server StatefulSet and the
MountVolume.MountDevice failed for volume warning in the StackLight event
logs.
Workaround:
Verify that the description of the Pods that failed to run contains the
FailedMount events:
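A sketch of such a check:
kubectl -n <affectedProjectName> describe pod <affectedPodName>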
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name where
the Pods failed to run
<affectedPodName> is a Pod name that failed to run in the specified project
In the Pod description, identify the node name where the Pod failed to run.
Verify that the csi-rbdplugin logs of the affected node contain the
rbd volume mount failed: <csi-vol-uuid> is being used error.
The <csi-vol-uuid> is a unique RBD volume name.
Identify csiPodName of the corresponding csi-rbdplugin:
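A sketch of such a lookup, assuming that the Ceph CSI plugin runs in the rook-ceph namespace with the app=csi-rbdplugin label:
kubectl -n rook-ceph get pods -l app=csi-rbdplugin -o wide | grep <nodeName>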
Container Cloud upgrade may be blocked by a node being stuck in the Prepare
or Deploy state with the error processing package openssh-server error.
The issue is caused by customizations in /etc/ssh/sshd_config, such as
additional Match statements. This file is managed by Container Cloud and
must not be altered manually.
As a workaround, move customizations from sshd_config to a new file
in the /etc/ssh/sshd_config.d/ directory.
[36928] The helm-controller Deployment is stuck during cluster update¶
During a cluster update, a Kubernetes helm-controller Deployment may
get stuck in a restarting Pod loop with Terminating and Running states
flapping. Other Deployment types may also be affected.
As a workaround, restart the Deployment that got stuck:
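A sketch of such a restart by scaling the Deployment down and back up:
kubectl -n <affectedProjectName> get deploy <affectedDeployName> -o jsonpath='{.spec.replicas}'
kubectl -n <affectedProjectName> scale deploy <affectedDeployName> --replicas 0
kubectl -n <affectedProjectName> scale deploy <affectedDeployName> --replicas <replicasNumber>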
In the command above, replace the following values:
<affectedProjectName> is the Container Cloud project name containing
the cluster with stuck Pods
<affectedDeployName> is the Deployment name that failed to run Pods
in the specified project
<replicasNumber> is the original number of replicas for the
Deployment that you can obtain using the get deploy command
[33438] ‘CalicoDataplaneFailuresHigh’ alert is firing during cluster update¶
During cluster update of a managed bare metal cluster, the false positive
CalicoDataplaneFailuresHigh alert may be firing. Disregard this alert,
which will disappear once cluster update succeeds.
The observed behavior is typical for calico-node during upgrades,
as workload changes occur frequently. Consequently, there is a possibility
of temporary desynchronization in the Calico dataplane. This can occasionally
result in throttling when applying workload changes to the Calico dataplane.
The following table lists the major components and their versions delivered in
the Container Cloud 2.25.0.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
This section lists the artifacts of components included in the Container Cloud
release 2.25.0.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The table below includes the total numbers of addressed unique and common
CVEs by product component since the 2.24.5 patch release. The common
CVEs are issues addressed across several images.
This section describes the specific actions you as a cloud operator need to
complete before or after your Container Cloud cluster update to the Cluster
releases 17.0.0, 16.0.0, or 14.1.0.
Pre-update actions¶
Upgrade to Ubuntu 20.04 on baremetal-based clusters¶
The Cluster release series 14.x and 15.x are the last ones where Ubuntu 18.04
is supported on existing clusters. A Cluster release update to 17.0.0 or
16.0.0 is impossible for a cluster running on Ubuntu 18.04.
Configure managed clusters with the etcd storage quota set¶
If your cluster has custom etcd storage quota set as described in
Increase storage quota for etcd, before the management cluster upgrade to 2.25.0,
configure LCMMachine resources:
Manually set the ucp_etcd_storage_quota parameter in LCMMachine
resources of the cluster controller nodes:
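A minimal sketch of the relevant LCMMachine section, assuming that stateItemsOverwrites resides under spec; the quota value is an example only:
spec:
  stateItemsOverwrites:
    deploy:
      ucp_etcd_storage_quota: "4GB"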
After the management cluster is upgraded to 2.25.0, update your managed
cluster to the Cluster release 17.0.0 or 16.0.0.
Manually remove the ucp_etcd_storage_quota parameter from the
stateItemsOverwrites.deploy section.
Allow the TCP port 12392 for management cluster nodes¶
The Cluster release 16.x and 17.x series are shipped with MKE 3.7.x.
To ensure cluster operability after the update, verify that the TCP
port 12392 is allowed in your network for the Container Cloud management
cluster nodes.
Post-update actions¶
Migrate Ceph cluster to address storage devices using by-id¶
Container Cloud uses the device by-id identifier as the default method
of addressing the underlying devices of Ceph OSDs. This is the only persistent
device identifier for a Ceph cluster that remains stable after cluster
upgrade or any other cluster maintenance.
Point DHCP relays on routers to the new dhcp-lb IP address¶
If your managed cluster has multiple L2 segments using DHCP relays on the
border switches, after the related management cluster automatically upgrades
to Container Cloud 2.25.0, manually point the DHCP relays on your network
infrastructure to the new IP address of the dhcp-lb service of the
Container Cloud managed cluster in order to successfully provision new nodes
or reprovision existing ones.
To obtain the new IP address:
kubectl -n kaas get service dhcp-lb
This change is required because the product now includes the resolution of
the issue related to the incorrect handling of DHCP egress traffic. The fix
involves reconfiguring the external traffic policy for the dhcp-lb
Kubernetes Service. For details about the issue, refer to the
Kubernetes upstream bug.
The Container Cloud patch release 2.24.5, which is based on the
2.24.2 major release, provides the following updates:
Support for the patch Cluster releases 14.0.4
and 15.0.4 that represents Mirantis OpenStack for Kubernetes
(MOSK) patch release
23.2.3.
Security fixes for CVEs of Critical and High severity
This patch release also supports the latest major Cluster releases
14.0.1 and 15.0.1.
And it does not support greenfield deployments based on deprecated Cluster
releases 15.0.3, 15.0.2,
14.0.3, 14.0.2
along with 12.7.x and 11.7.x series.
Use the latest available Cluster releases for new deployments instead.
For main deliverables of the parent Container Cloud releases of 2.24.5, refer
to 2.24.0 and 2.24.1.
This section lists the components artifacts of the Container Cloud patch
release 2.24.5. For artifacts of the Cluster releases introduced in 2.24.5,
see patch Cluster releases 15.0.4 and
14.0.4.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
In total, since Container Cloud 2.24.4, in 2.24.5, 21
Common Vulnerabilities and Exposures (CVE) have been fixed:
18 of critical and 3 of high severity.
The summary table contains the total number of unique CVEs along with the
total number of issues fixed across the images.
The full list of the CVEs present in the current Container Cloud release is
available at the Mirantis Security Portal.
The Container Cloud patch release 2.24.4, which is based on the
2.24.2 major release, provides the following updates:
Support for the patch Cluster releases 14.0.3
and 15.0.3 that represents Mirantis OpenStack for Kubernetes
(MOSK) patch release
23.2.2.
Support for the multi-rack topology on bare metal managed clusters
Support for configuration of the etcd storage quota
Security fixes for CVEs of Critical and High severity
This patch release also supports the latest major Cluster releases
14.0.1 and 15.0.1.
And it does not support greenfield deployments based on deprecated Cluster
releases 15.0.2, 14.0.2,
along with 12.7.x and 11.7.x series.
Use the latest available Cluster releases for new deployments instead.
For main deliverables of the parent Container Cloud releases of 2.24.4, refer
to 2.24.0 and 2.24.1.
Added the capability to configure storage quota, which is 2 GB by default.
You may need to increase the default etcd storage quota if etcd runs out of
space and there is no other way to clean up the storage on your management
or managed cluster.
Multi-rack topology for bare metal managed clusters¶
TechPreview
Added support for the multi-rack topology on bare metal managed clusters.
Implementation of the multi-rack topology implies the use of Rack and
MultiRackCluster objects that support configuration of BGP announcement
of the cluster API load balancer address.
You can now create a managed cluster where cluster nodes including Kubernetes
masters are distributed across multiple racks without L2 layer extension
between them, and use BGP for announcement of the cluster API load balancer
address and external addresses of Kubernetes load-balanced services.
This section lists the components artifacts of the Container Cloud patch
release 2.24.4. For artifacts of the Cluster releases introduced in 2.24.4,
see patch Cluster releases 15.0.3 and
14.0.3.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
In total, since Container Cloud 2.24.3, in 2.24.4, 18
Common Vulnerabilities and Exposures (CVE) have been fixed:
3 of critical and 15 of high severity.
The summary table contains the total number of unique CVEs along with the
total number of issues fixed across the images.
The full list of the CVEs present in the current Container Cloud release is
available at the Mirantis Security Portal.
Support for enablement of Kubernetes auditing and profiling options using
the Container Cloud Cluster object on managed clusters. For details,
see Configure Kubernetes auditing and profiling.
Support for the patch Cluster releases 14.0.2
and 15.0.2 that represents Mirantis OpenStack for Kubernetes
(MOSK) patch release 23.2.1.
This patch release also supports the latest major Cluster releases
14.0.1 and 15.0.1.
And it does not support greenfield deployments based on deprecated Cluster
release 14.0.0 along with 12.7.x and
11.7.x series. Use the latest available Cluster releases
instead.
For main deliverables of the parent Container Cloud releases of 2.24.3, refer
to 2.24.0 and 2.24.1.
This section lists the components artifacts of the Container Cloud patch
release 2.24.3. For artifacts of the Cluster releases introduced in 2.24.3,
see Cluster releases 15.0.2 and 14.0.2.
Note
The components that are newly added, updated, deprecated, or removed
as compared to the previous release version, are marked
with a corresponding superscript,
for example, lcm-ansibleUpdated.
The Container Cloud major release 2.24.2 based on 2.24.0
and 2.24.1 provides the following:
Introduces support for the major Cluster release 15.0.1
that is based on the Cluster release 14.0.1 and
represents Mirantis OpenStack for Kubernetes (MOSK)
23.2.
This Cluster release is based on the updated version of Mirantis Kubernetes
Engine 3.6.5 with Kubernetes 1.24 and Mirantis Container Runtime 20.10.17.
Does not support greenfield deployments based on deprecated Cluster release
14.0.0 along with 12.7.x and
11.7.x series. Use the latest available Cluster releases
of the series instead.
For main deliverables of the Container Cloud release 2.24.2, refer to its
parent release 2.24.0:
Make sure to update the Cluster release version
of your managed cluster before the current Cluster release
version becomes unsupported by a new Container Cloud release
version.
Otherwise, Container Cloud stops auto-upgrade and eventually
Container Cloud itself becomes unsupported.
The Container Cloud patch release 2.24.1 based on 2.24.0
includes updated baremetal-operator, admission-controller, and iam
artifacts and provides hot fixes for the following issues:
[34218] Fixed the issue with the iam-keycloak Pod being stuck in the
Pending state during Keycloak upgrade to version 21.1.1.
[34247] Fixed the issue with MKE backup failing during cluster update
due to wrong permissions in the etcd backup directory. If the issue still
persists, which may occur on clusters that were originally deployed using
early Container Cloud releases delivered in 2020-2021, follow the
workaround steps described in Known issues: LCM.
Note
Container Cloud patch release 2.24.1 does not introduce new Cluster
releases.
For main deliverables of the Container Cloud release 2.24.1, refer to its
parent release 2.24.0:
Make sure to update the Cluster release version
of your managed cluster before the current Cluster release
version becomes unsupported by a new Container Cloud release
version.
Otherwise, Container Cloud stops auto-upgrade and eventually
Container Cloud itself becomes unsupported.
Container Cloud 2.24.0 has been successfully applied to a
certain number of clusters. The 2.24.0 related documentation content
fully applies to these clusters.
If your cluster started to update but was reverted to the previous product
version or the update is stuck, you automatically receive the 2.24.1 patch
release with the bug fixes to unblock the update to the 2.24 series.
There is no impact on the cluster workloads. For details on the patch
release, see 2.24.1.
The Mirantis Container Cloud GA release 2.24.0:
Introduces support for the Cluster release 14.0.0
that is based on Mirantis Container Runtime 20.10.17 and
Mirantis Kubernetes Engine 3.6.5 with Kubernetes 1.24.
Supports the latest major and patch Cluster releases of the
12.7.x series that supports Mirantis OpenStack for Kubernetes
(MOSK) 23.1 series.
Does not support greenfield deployments on deprecated Cluster releases
12.7.3, 11.7.4, or earlier patch
releases, 12.5.0, or 11.7.0.
Use the latest available Cluster releases of the series instead.
Caution
Make sure to update the Cluster release version
of your managed cluster before the current Cluster release
version becomes unsupported by a new Container Cloud release
version.
Otherwise, Container Cloud stops auto-upgrade and eventually
Container Cloud itself becomes unsupported.
This section outlines release notes for the Container Cloud release 2.24.0.
This section outlines new features and enhancements introduced in the
Mirantis Container Cloud release 2.24.0. For the list of enhancements in the
Cluster release 14.0.0 that is introduced by the Container Cloud
release 2.24.0, see the 14.0.0.
Automated upgrade of operating system on bare metal clusters¶
Support status of the feature
Since MOSK 23.2, the feature is generally available for
MOSK clusters.
Since Container Cloud 2.24.2, the feature is generally available for any
type of bare metal clusters.
Since Container Cloud 2.24.0, the feature is available as Technology
Preview for management and regional clusters only.
Implemented automatic in-place upgrade of an operating system (OS)
distribution on bare metal clusters. The OS upgrade occurs as part of cluster
update that requires machines reboot. The OS upgrade workflow is as follows:
The distribution ID value is taken from the id field of the
distribution from the allowedDistributions list in the spec of the
ClusterRelease object.
The distribution that has the default:true value is used during
update. This distribution ID is set in the
spec:providerSpec:value:distribution field of the Machine object
during cluster update.
On management and regional clusters, the operating system upgrades
automatically during cluster update. For managed clusters, an in-place OS
distribution upgrade should be performed between cluster updates.
This scenario implies a machine cordoning, draining, and reboot.
Warning
During the course of the Container Cloud 2.28.x series, Mirantis
highly recommends upgrading the operating system on all machines of your
managed clusters to Ubuntu 22.04 before the next major Cluster
release becomes available.
It is not mandatory to upgrade all machines at once. You can upgrade them
one by one or in small batches, for example, if the maintenance window is
limited in time.
Otherwise, the Cluster release update of the Ubuntu 20.04-based managed
clusters will become impossible as of Container Cloud 2.29.0 with Ubuntu
22.04 as the only supported version.
Management cluster update to Container Cloud 2.29.1 will be blocked if
at least one node of any related managed cluster is running Ubuntu 20.04.
Added initial Technology Preview support for WireGuard that enables traffic
encryption on the Kubernetes workloads network. Set secureOverlay:true
in the Cluster object during deployment of management, regional, or
managed bare metal clusters to enable WireGuard encryption.
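A minimal sketch of the corresponding Cluster object snippet, assuming the parameter resides under spec.providerSpec.value:
spec:
  providerSpec:
    value:
      secureOverlay: true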
Also, added the possibility to configure the maximum transmission unit (MTU)
size for Calico that is required for the WireGuard functionality and allows
maximizing network performance.
Note
For MOSK-based deployments, the feature support is
available since MOSK 23.2.
MetalLB configuration changes for bare metal and vSphere¶
For management and regional clusters
Caution
For managed clusters, this object is available as Technology
Preview and will become generally available in one of the following
Container Cloud releases.
Introduced the following MetalLB configuration changes and objects related to
address allocation and announcement of services LB for bare metal and vSphere
providers:
Introduced the MetalLBConfigTemplate object for bare metal and the
MetalLBConfig object for vSphere to be used as default and recommended.
For vSphere, during creation of clusters of any type, now a separate
MetalLBConfig object is created instead of corresponding settings
in the Cluster object.
The use of either Subnet objects without the new MetalLB objects or the
configInline MetalLB value of the Cluster object is deprecated and
will be removed in one of the following releases.
If the MetalLBConfig object is not used for MetalLB configuration
related to address allocation and announcement of services LB, then
automated migration applies during creation of clusters of any type or
cluster update to Container Cloud 2.24.0.
During automated migration, the MetalLBConfig and
MetalLBConfigTemplate objects for bare metal or the MetalLBConfig
for vSphere are created, and the contents of the MetalLB chart configInline
value are converted to the parameters of the MetalLBConfigTemplate object
for bare metal or of the MetalLBConfig object for vSphere.
The following changes apply to the bare metal bootstrap procedure:
Moved the following environment variables from cluster.yaml.template to
the dedicated ipam-objects.yaml.template:
BOOTSTRAP_METALLB_ADDRESS_POOL
KAAS_BM_BM_DHCP_RANGE
SET_METALLB_ADDR_POOL
SET_LB_HOST
Modified the default network configuration. Now it includes a bond interface
and separated PXE and management networks. Mirantis recommends using
separate PXE and management networks for management and regional clusters.
Added support for RHEL 8.7 on the vSphere-based management, regional, and
managed clusters.
Custom flavors for Octavia on OpenStack-based clusters¶
Implemented the possibility to use custom Octavia Amphora flavors that you can
enable in spec:providerSpec section of the Cluster object using
serviceAnnotations:loadbalancer.openstack.org/flavor-id during
management or regional cluster deployment.
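A sketch of the corresponding Cluster object snippet, assuming the annotation is set under spec.providerSpec.value; the flavor ID is a placeholder:
spec:
  providerSpec:
    value:
      serviceAnnotations:
        loadbalancer.openstack.org/flavor-id: "<octaviaFlavorID>"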
Note
For managed clusters, you can enable the feature through the
Container Cloud API. The web UI functionality will be added in one of the
following Container Cloud releases.
Deletion of persistent volumes during an OpenStack-based cluster deletion¶
Completed the development of persistent volumes deletion during an
OpenStack-based managed cluster deletion by implementing the
Delete all volumes in the cluster check box in the cluster
deletion menu of the Container Cloud web UI.
Upgraded the Keycloak major version from 18.0.0 to 21.1.1. For the list of new
features and enhancements, see
Keycloak Release Notes.
The upgrade path is fully automated. No data migration or custom LCM changes
are required.
Important
After the Keycloak upgrade, access the Keycloak Admin Console
using the new URL format: https://<keycloak.ip>/auth instead of
https://<keycloak.ip>. Otherwise, the Resource not found
error displays in a browser.
Added initial Technology Preview support for custom host names of machines on
any supported provider and any cluster type. When enabled, any machine host
name in a particular region matches the related Machine object name. For
example, instead of the default kaas-node-<UID>, a machine host name will
be master-0. The custom naming format is more convenient and easier to
operate with.
You can enable the feature before or after management or regional cluster
deployment. If enabled after deployment, custom host names will apply to all
newly deployed machines in the region. Existing host names will remain the
same.
Added initial Technology Preview support for parallelizing of node update
operations that significantly improves the efficiency of your cluster. To
configure the parallel node update, use the following parameters located under
spec.providerSpec of the Cluster object:
maxWorkerUpgradeCount - maximum number of worker nodes for simultaneous
update to limit machine draining during update
maxWorkerPrepareCount - maximum number of workers for artifacts
downloading to limit network load during update
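A sketch of the corresponding Cluster object snippet; the numbers are examples and the exact nesting under providerSpec may differ:
spec:
  providerSpec:
    value:
      maxWorkerUpgradeCount: 3
      maxWorkerPrepareCount: 5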
Implemented the CacheWarmupRequest resource to predownload, also known as warm
up, a list of artifacts included in a given set of Cluster releases into the
mcc-cache service only once per release. The feature facilitates and
speeds up deployment and update of managed clusters.
After a successful cache warm-up, the CacheWarmupRequest object is
automatically deleted from the cluster, and the cache remains available for
managed cluster deployment or update until the next Container Cloud
auto-upgrade of the management or regional cluster.
Caution
If the disk space for cache runs out, the cache for the oldest
object is evicted. To avoid running out of space in the cache, verify and
adjust its size before each cache warm-up.
Note
For MOSK-based deployments, the feature support is
available since MOSK 23.2.
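The following minimal sketch illustrates the idea of a warm-up request
created on the management cluster. The resource kind comes from this release
note; the API group, the spec fields, and the release name are illustrative
assumptions.

apiVersion: kaas.mirantis.com/v1alpha1   # assumption: the actual API group and version may differ
kind: CacheWarmupRequest
metadata:
  name: example-warmup                   # hypothetical name
  namespace: default
spec:
  clusterReleases:                       # assumption: illustrative list of Cluster releases to predownload
    - mke-14-0-0                         # hypothetical Cluster release name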
Added initial Technology Preview support for the Linux Audit daemon
auditd to monitor the activity of cluster processes on any type of
Container Cloud cluster. The feature is an essential requirement of many
security guides because it enables auditing of any cluster process to detect
potentially malicious activity.
You can enable and configure auditd either during or after cluster deployment
using the Cluster object.
Note
For MOSK-based deployments, the feature support is
available since MOSK 23.2.
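A hedged sketch of enabling auditd through the Cluster object follows. Only
the fact that auditd is enabled and configured through the Cluster object
comes from this release note; every field name below is a hypothetical
placeholder.

spec:
  providerSpec:
    value:
      audit:                  # hypothetical section name
        auditd:
          enabled: true       # turn on the Linux Audit daemon on cluster machines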
Enhanced TLS certificate configuration for cluster applications:
Added support for configuring TLS certificates for MKE on management
and regional clusters, in addition to the existing support on managed clusters.
Implemented the ability to configure TLS certificates using the Container
Cloud web UI through the Security section located in the
More > Configure cluster menu.
Expanded the capability to perform a graceful reboot on a management,
regional, or managed cluster for all supported providers by adding the
Reboot machines option to the cluster menu in the Container
Cloud web UI. The feature allows for a rolling reboot of all cluster
machines without workload interruption. The reboot occurs in the order
defined by the cluster upgrade policy.
Note
For MOSK-based deployments, the feature support is
available since MOSK 23.2.
Creation and deletion of bare metal host credentials using web UI
Improved management of bare metal host credentials using the Container Cloud
web UI:
Added the Add Credential menu to the Credentials
tab. The feature facilitates association of credentials with bare metal
hosts created using the BM Hosts tab.
Implemented automatic deletion of credentials during the deletion of bare
metal hosts after the deletion of a managed cluster.
Improved the Node Labels menu in the Container Cloud web UI by
making it more intuitive. Replaced the greyed-out (disabled) label names with
the No labels have been assigned to this machine. message and
the Add a node label button link.
Also, added the possibility to configure node labels for machine pools
after deployment using the More > Configure Pool option.
On top of continuous improvements delivered to the existing Container Cloud
guides, added the documentation on managing Ceph OSDs with a separate metadata
device.
The following issues have been addressed in the Mirantis Container Cloud
release 2.24.0 along with the Cluster release 14.0.0. For
the list of hot fixes delivered in the 2.24.1 patch release, see
2.24.1.
[5981] Fixed the issue with the upgrade of a cluster containing more than
120 nodes getting stuck on one node with errors about IP address
exhaustion in the Docker logs. On existing clusters, after updating to
the Cluster release 14.0.0 or later, you can optionally remove the abandoned
mke-overlay network using docker network rm mke-overlay.
[29604] Fixed the issue with the false positive
failed to get kubeconfig error occurring on the
Waiting for TLS settings to be applied stage during TLS configuration.
[29762] Fixed the issue with a wrong IP address being assigned after the
MetalLB controller restart.
[30635] Fixed the issue with the pg_autoscaler module of Ceph
Manager failing with the pool <poolNumber> has overlapping roots error
if a Ceph cluster contains a mix of pools with deviceClass
either explicitly specified or not specified.
[30857] Fixed the issue with an irrelevant error message being displayed in
the osd-prepare Pod during the deployment of Ceph OSDs on removable devices
on AMD nodes. Now, the error message clearly states that removable devices
(with hotplug enabled) are not supported for deploying Ceph OSDs.
This issue has been addressed since the Cluster release 14.0.0.
[30781] Fixed the issue with cAdvisor failing to collect metrics on
CentOS-based deployments. Missing metrics affected the
KubeContainersCPUThrottlingHigh alert and the following Grafana
dashboards: Kubernetes Containers, Kubernetes Pods,
and Kubernetes Namespaces.
[31288] Fixed the issue with the Fluentd agent failing and the
fluentd-logs Pods reporting the maximum open shards limit error,
thus preventing OpenSearch from accepting new logs. The fix enables the
possibility to increase the limit for maximum open shards using
cluster.max_shards_per_node. For details, see Tune StackLight for long-term log retention.
[31485] Fixed the issue with Elasticsearch Curator not deleting indices
according to the configured retention period on any type of Container Cloud
clusters.