The documentation is intended to help operators understand the core concepts
of the product.
The information provided in this documentation set is continuously improved
and amended based on feedback and requests from our software consumers. This
documentation set describes the features that are supported within the two
latest Container Cloud minor releases, each marked with a corresponding
Available since release note.
The following table lists the guides included in the documentation set you
are reading:
GUI elements that include any part of the interactive user interface and
menu navigation.
Superscript
Some extra, brief information. For example, if a feature is
available from a specific release or if a feature is in the
Technology Preview development stage.
Note
The Note block
Messages of a generic nature that may be useful to the user.
Caution
The Caution block
Information that helps the user avoid mistakes and undesirable
consequences when following the procedures.
Warning
The Warning block
Messages with details that can be easily missed but should not be ignored
by the user because they are valuable before proceeding.
See also
The See also block
List of references that may be helpful for understanding some related
tools, concepts, and so on.
Learn more
The Learn more block
Used in the Release Notes to wrap a list of internal references to
the reference architecture, deployment and operation procedures specific
to a newly implemented product feature.
A Technology Preview feature provides early access to upcoming product
innovations, allowing customers to experiment with the functionality and
provide feedback.
Technology Preview features may be privately or publicly available, but
neither is intended for production use. While Mirantis will provide
assistance with such features through official channels, normal Service
Level Agreements do not apply.
As Mirantis considers making future iterations of Technology Preview features
generally available, we will do our best to resolve any issues that customers
experience when using these features.
During the development of a Technology Preview feature, additional components
may become available to the public for evaluation. Mirantis cannot guarantee
the stability of such features. As a result, if you are using Technology
Preview features, you may not be able to seamlessly upgrade to subsequent
product releases.
Mirantis makes no guarantees that Technology Preview features will graduate
to generally available features.
The documentation set refers to Mirantis Container Cloud GA as the latest
released GA version of the product. For details about the Container Cloud
GA minor release dates, refer to
Container Cloud releases.
Mirantis Container Cloud enables you to ship code faster by combining
speed with choice, simplicity, and security. Through a single pane
of glass you can deploy, manage, and observe Kubernetes
clusters on public clouds, private clouds, or bare metal infrastructure.
Mirantis Container Cloud provides the ability to leverage multiple
on-premises (VMware, OpenStack, and bare metal) and public cloud
(AWS, Azure, Equinix Metal) infrastructures.
The list of the most common use cases includes:
Multi-cloud
Organizations are increasingly moving toward a multi-cloud strategy,
with the goal of enabling the effective placement of workloads over
multiple platform providers. Multi-cloud strategies can introduce
a lot of complexity and management overhead. Mirantis Container Cloud
enables you to effectively deploy and manage container clusters
(Kubernetes and Swarm) across multiple cloud provider platforms,
both on premises and in the cloud.
Hybrid cloud
The challenges of consistently deploying, tracking, and managing hybrid
workloads across multiple cloud platforms are compounded by not having
a single point that provides information on all available resources.
Mirantis Container Cloud enables hybrid cloud workloads by providing
a central point of management and visibility of all your cloud resources.
Kubernetes cluster lifecycle management
The consistent lifecycle management of a single Kubernetes cluster
is a complex task on its own that is made infinitely more difficult
when you have to manage multiple clusters across different platforms
spread across the globe. Mirantis Container Cloud provides a single,
centralized point from which you can perform full lifecycle management
of your container clusters, including automated updates and upgrades.
We also support attaching existing Mirantis Kubernetes Engine clusters.
Highly regulated industries
Regulated industries need fine-grained access control, high security
standards, and extensive reporting capabilities to meet and exceed
security standards and compliance requirements.
Mirantis Container Cloud provides a fine-grained Role-Based Access
Control (RBAC) mechanism and easy integration and federation with existing
identity management (IDM) systems.
Logging, monitoring, alerting
Complete operational visibility is required to identify and address issues
in the shortest amount of time, before a problem becomes serious.
Mirantis StackLight is a proactive monitoring, logging, and alerting
solution designed for large-scale container and cloud observability with
extensive collectors, dashboards, trend reporting, and alerts.
Storage
Cloud environments require a unified pool of storage that can be scaled up by
simply adding storage server nodes. Ceph is a unified, distributed storage
system designed for excellent performance, reliability, and scalability.
Container Cloud deploys Ceph using Rook to provide and manage robust
persistent storage that can be used by Kubernetes workloads on the bare
metal and Equinix Metal based clusters.
Security
Security is a core concern for all enterprises, especially as more
systems are exposed to the Internet as a norm. Mirantis
Container Cloud provides a multi-layered security approach that
includes effective identity management and role-based authentication,
secure out-of-the-box defaults, and extensive security scanning and
monitoring during the development process.
5G and Edge
The introduction of 5G technologies and the support of Edge workloads
require an effective multi-tenant solution to manage the underlying
container infrastructure. Mirantis Container Cloud provides a full-stack,
secure, multi-cloud cluster management and Day-2 operations
solution that supports both on-premises bare metal and cloud.
Mirantis Container Cloud is a set of microservices
that are deployed using Helm charts and run in a Kubernetes cluster.
Container Cloud is based on the Kubernetes Cluster API community initiative.
The following diagram illustrates an overview of Container Cloud
and the clusters it manages:
All artifacts used by Kubernetes and workloads are stored
on the Container Cloud content delivery network (CDN):
mirror.mirantis.com (Debian packages including the Ubuntu mirrors)
binary.mirantis.com (Helm charts and binary artifacts)
mirantis.azurecr.io (Docker image registry)
All Container Cloud components are deployed in the Kubernetes clusters.
All Container Cloud APIs are implemented using Kubernetes
Custom Resource Definitions (CRDs) that represent custom objects
stored in Kubernetes and allow you to extend the Kubernetes API.
The Container Cloud logic is implemented using controllers.
A controller handles the changes in custom resources defined
in the controller CRD.
A custom resource consists of a spec that describes the desired state
of a resource provided by a user.
On every change, the controller reconciles the external state of a custom
resource with the user-provided parameters and stores this external state
in the status subresource of its custom resource.
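The following minimal Python sketch illustrates the reconcile pattern described
above under simplified assumptions. It is not the Container Cloud controller
code: the in-memory back end and the field names are hypothetical placeholders
for a real provider back end.

    # Minimal sketch of the reconcile pattern; FakeBackend stands in for a
    # real cloud provider API.
    class FakeBackend:
        def __init__(self):
            self.state = {}

        def fetch_state(self, name):
            return self.state.get(name)

        def apply_changes(self, name, desired):
            self.state[name] = dict(desired)
            return self.state[name]

    def reconcile(custom_resource, backend):
        """Converge the external state toward spec and record it in status."""
        name = custom_resource["metadata"]["name"]
        desired = custom_resource["spec"]            # desired state from the user
        observed = backend.fetch_state(name)
        if observed != desired:                      # drift detected
            observed = backend.apply_changes(name, desired)
        custom_resource["status"] = {"observedState": observed}
        return custom_resource

    if __name__ == "__main__":
        cr = {"metadata": {"name": "demo"}, "spec": {"replicas": 3}}
        print(reconcile(cr, FakeBackend())["status"])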
Container Cloud can have several regions. A region is a physical location,
for example, a data center, that has access to one or several cloud provider
back ends. A separate regional cluster manages a region that can include
multiple providers. A region requires two-way (full) network connectivity
between the regional cluster and a cloud provider back end. For example,
an OpenStack VM must have access to the related regional cluster, and this
regional cluster must have access to the OpenStack floating IPs and
load balancers.
The following diagram illustrates the structure of the Container Cloud regions:
The types of the Container Cloud clusters include:
Bootstrap cluster
Runs the bootstrap process on a seed node. For the OpenStack, AWS,
Equinix Metal, Microsoft Azure, or VMware vSphere-based Container Cloud,
it can be an operator desktop computer. For the baremetal-based Container
Cloud, this is the first temporary data center node.
Requires access to a provider back end: OpenStack, AWS, Azure, vSphere,
Equinix Metal, or bare metal.
Contains a minimal set of services to deploy
the management and regional clusters.
Is destroyed completely after a successful bootstrap.
Management and regional clusters
Management cluster:
Runs all public APIs and services including the web UIs
of Container Cloud.
Does not require access to any provider back end.
Regional cluster:
Is combined with the management cluster by default.
Runs the provider-specific services and internal API including
LCMMachine and LCMCluster. Also, it runs an LCM controller for
orchestrating managed clusters and other controllers for handling
different resources.
Requires two-way access to a provider back end. The provider connects
to a back end to spawn managed cluster nodes,
and the agent running on the nodes accesses the regional cluster
to obtain the deployment information.
Requires access to a management cluster to obtain user parameters.
Supports multi-regional deployments.
For example, you can deploy an AWS-based management cluster
with AWS-based and OpenStack-based regional clusters.
Managed cluster
A Mirantis Kubernetes Engine (MKE) cluster that an end user
creates using the Container Cloud web UI.
Requires access to a regional cluster. Each node of a managed
cluster runs an LCM agent that connects to the LCM machine of the
regional cluster to obtain the deployment details.
An attached MKE cluster that is not created using
Container Cloud. In this case, the nodes of the attached cluster
do not contain the LCM agent. For supported MKE versions that can be attached
to Container Cloud, see Release compatibility matrix.
Baremetal-based managed clusters support the Mirantis OpenStack for Kubernetes
(MOSK) product. For details, see
MOSK documentation.
All types of the Container Cloud clusters except the bootstrap cluster
are based on the MKE and Mirantis Container Runtime (MCR) architecture.
For details, see MKE and
MCR documentation.
The following diagram illustrates the distribution of services
between each type of the Container Cloud clusters:
The Mirantis Container Cloud provider is the central component
of Container Cloud that provisions a node of a management, regional,
or managed cluster and runs the LCM agent on this node.
It runs in the management and regional clusters and requires a connection
to a provider back end.
The Container Cloud provider interacts with the following
types of public API objects:
Public API object name
Description
Container Cloud release object
Contains the following information about clusters:
Version of the supported Cluster release for the management and
regional clusters
List of supported Cluster releases for the managed clusters
and supported upgrade path
Description of Helm charts that are installed
on the management and regional clusters
depending on the selected provider
Cluster release object
Provides a specific version of a management, regional, or
managed cluster.
A Cluster release object, as well as a Container Cloud release
object, never changes; only new releases can be added.
Any change leads to a new release of a cluster.
Contains references to all components and their versions
that are used to deploy all cluster types:
LCM components:
LCM agent
Ansible playbooks
Scripts
Description of steps to execute during a cluster deployment
and upgrade
Helm controller image references
Supported Helm charts description:
Helm chart name and version
Helm release name
Helm values
Cluster object
References the Credentials, KaaSRelease and ClusterRelease objects.
Is tied to a specific Container Cloud region and provider.
Represents all cluster-level resources. For example, for the OpenStack-based
clusters, it represents networks, load balancer for the Kubernetes
API, and so on. It uses data from the Credentials object
to create these resources and data from the KaaSRelease and ClusterRelease objects
to ensure that all lower-level cluster objects are created.
Machine object
References the Cluster object.
Represents one node of a managed cluster, for example, an OpenStack VM,
and contains all data to provision it.
Credentials object
Contains all information necessary to connect to a provider back end.
Is tied to a specific Container Cloud region and provider.
PublicKey object
Is provided to every machine to obtain an SSH access.
The following diagram illustrates the Container Cloud provider data flow:
The Container Cloud provider performs the following operations
in Container Cloud:
Consumes the following types of data from the management and regional clusters:
Credentials to connect to a provider back end
Deployment instructions from the KaaSRelease and ClusterRelease
objects
The cluster-level parameters from the Cluster objects
The machine-level parameters from the Machine objects
Prepares data for all Container Cloud components:
Creates the LCMCluster and LCMMachine custom resources
for LCM controller and LCM agent. The LCMMachine custom resources
are created empty to be later handled by the LCM controller.
Creates the HelmBundle custom resources for the Helm controller
using data from the KaaSRelease and ClusterRelease objects.
Creates service accounts for these custom resources.
Creates a scope in Identity and access management (IAM)
for a user access to a managed cluster.
Provisions nodes for a managed cluster using the cloud-init script
that downloads and runs the LCM agent.
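As an illustration of this flow, the following hedged Python sketch creates an
LCMMachine-style custom resource with the official Kubernetes Python client.
The API group, version, and field names are assumptions for demonstration only
and do not necessarily match the real Container Cloud CRDs.

    from kubernetes import client, config

    config.load_kube_config()                      # or config.load_incluster_config()
    api = client.CustomObjectsApi()

    lcm_machine = {
        "apiVersion": "lcm.example.com/v1alpha1",  # assumed group/version
        "kind": "LCMMachine",
        "metadata": {"name": "managed-worker-0", "namespace": "default"},
        "spec": {},                                # created empty; LCM controller fills it later
    }

    # Create the custom object in the cluster where the provider runs.
    api.create_namespaced_custom_object(
        group="lcm.example.com",
        version="v1alpha1",
        namespace="default",
        plural="lcmmachines",
        body=lcm_machine,
    )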
The Mirantis Container Cloud release controller is responsible
for the following functionality:
Monitor and control the KaaSRelease and ClusterRelease objects
present in a management cluster. If any release object is used
in a cluster, the release controller prevents the deletion
of such an object.
Trigger the Container Cloud auto-upgrade procedure if a new
KaaSRelease object is found:
Search for the managed clusters with old Cluster releases
that are not supported by a new Container Cloud release.
If any are detected, abort the auto-upgrade and display
a corresponding note about an old Cluster release in the Container
Cloud web UI for the affected managed clusters. In this case, a user must
update all managed clusters using the Container Cloud web UI (see the sketch
of this gating logic after this list).
Once all managed clusters are upgraded to the Cluster releases
supported by a new Container Cloud release,
the Container Cloud auto-upgrade is retriggered
by the release controller.
Trigger the Container Cloud release upgrade of all Container Cloud
components in a management cluster.
The upgrade itself is processed by the Container Cloud provider.
Trigger the Cluster release upgrade of a management cluster
to the Cluster release version that is indicated
in the upgraded Container Cloud release version.
The LCMCluster components, such as MKE, are upgraded before
the HelmBundle components, such as StackLight or Ceph.
Verify the regional cluster(s) status. If the regional cluster
is ready, trigger the Cluster release upgrade
of the regional cluster.
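The following hedged Python sketch illustrates the auto-upgrade gating logic
mentioned above. The data structures are illustrative and are not the real
release controller objects.

    def upgrade_blockers(managed_clusters, supported_cluster_releases):
        """Return the clusters that block the Container Cloud auto-upgrade.

        An empty list means the new Container Cloud release can be applied;
        otherwise the auto-upgrade is aborted until the listed clusters are
        updated through the web UI.
        """
        return [
            cluster["name"]
            for cluster in managed_clusters
            if cluster["clusterRelease"] not in supported_cluster_releases
        ]

    blockers = upgrade_blockers(
        [{"name": "demo", "clusterRelease": "7.5.0"}],
        supported_cluster_releases={"11.0.0", "11.1.0"},
    )
    print("abort auto-upgrade" if blockers else "proceed", blockers)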
Once a management cluster is upgraded, an option to update
a managed cluster becomes available in the Container Cloud web UI.
During a managed cluster update, all cluster components including
Kubernetes are automatically upgraded to newer versions if available.
The LCMCluster components, such as MKE, are upgraded before
the HelmBundle components, such as StackLight or Ceph.
The Operator can delay the Container Cloud automatic upgrade procedure for a
limited amount of time or schedule the upgrade to run at specific hours or on
specific weekdays.
For details, see Schedule Mirantis Container Cloud upgrades.
Container Cloud remains operational during the management and
regional clusters upgrade. Managed clusters are not affected
during this upgrade. For the list of components that are updated during
the Container Cloud upgrade, see the Components versions section
of the corresponding Container Cloud release in
Release Notes.
When Mirantis announces support of the newest versions of
Mirantis Container Runtime (MCR) and Mirantis Kubernetes Engine
(MKE), Container Cloud automatically upgrades these components as well.
For the maintenance window best practices before upgrade of these
components, see
MKE Documentation.
The Mirantis Container Cloud web UI is mainly designed
to create and update the managed clusters as well as add or remove machines
to or from an existing managed cluster.
It also allows attaching existing Mirantis Kubernetes Engine (MKE) clusters.
You can use the Container Cloud web UI
to obtain the management cluster details including endpoints, release version,
and so on.
The management cluster update occurs automatically;
the change log of a new release is available through the Container Cloud web UI.
The Container Cloud web UI is a JavaScript application that is based
on the React framework. The Container Cloud web UI is designed to work
on the client side only. Therefore, it does not require a special back end.
It interacts with the Kubernetes and Keycloak APIs directly.
The Container Cloud web UI uses a Keycloak token
to interact with Container Cloud API and download kubeconfig
for the management and managed clusters.
The Container Cloud web UI uses NGINX that runs on a management cluster
and handles the Container Cloud web UI static files.
NGINX proxies the Kubernetes and Keycloak APIs
for the Container Cloud web UI.
The bare metal service provides for the discovery, deployment, and management
of bare metal hosts.
The bare metal management in Mirantis Container Cloud
is implemented as a set of modular microservices.
Each microservice implements a certain requirement or function
within the bare metal management system.
OpenStack Ironic
The back-end bare metal manager in a standalone mode with its auxiliary
services that include httpd, dnsmasq, and mariadb.
OpenStack Ironic Inspector
Introspects and discovers the bare metal hosts inventory.
Includes OpenStack Ironic Python Agent (IPA) that is used
as a provision-time agent for managing bare metal hosts.
Ironic Operator
Monitors changes in the external IP addresses of httpd, ironic,
and ironic-inspector and automatically reconciles the configuration
for dnsmasq, ironic, baremetal-provider,
and baremetal-operator.
Bare Metal Operator
Manages bare metal hosts through the Ironic API. The Container Cloud
bare-metal operator implementation is based on the Metal³ project.
Bare metal resources manager
Ensures that the bare metal provisioning artifacts, such as the
distribution image of the operating system, are available and up to date.
cluster-api-provider-baremetal
The plugin for the Kubernetes Cluster API integrated with Container Cloud.
Container Cloud uses the Metal³ implementation of
cluster-api-provider-baremetal for the Cluster API.
HAProxy
Load balancer for external access to the Kubernetes API endpoint.
LCM agent
Used for managing physical and logical storage, physical and logical
networking, and control over the life cycle of bare metal machine resources.
Ceph
Distributed shared storage is required by the Container Cloud services
to create persistent volumes to store their data.
MetalLB
Load balancer for Kubernetes services on bare metal.
NGINX
Starting from Container Cloud 2.15.0, replaced with HAProxy. For
details, see the HAProxy description above.
Keepalived
Monitoring service that ensures availability of the virtual IP for
the external load balancer endpoint (HAProxy).
IPAM
IP address management services provide consistent IP address space
to the machines in bare metal clusters. See details in
IP Address Management.
Mirantis Container Cloud on bare metal uses IP Address Management (IPAM)
to keep track of the network addresses allocated to bare metal hosts.
This is necessary to avoid IP address conflicts
and expiration of address leases to machines through DHCP.
Note
Only IPv4 address family is currently supported by Container Cloud
and IPAM. IPv6 is not supported and not used in Container Cloud.
IPAM is provided by the kaas-ipam controller. Its functions
include:
Allocation of IP address ranges or subnets to newly created clusters using
SubnetPool and Subnet resources.
Allocation of IP addresses to machines and cluster services at the request
of baremetal-provider using the IpamHost and IPaddr resources.
Creation and maintenance of host networking configuration
on the bare metal hosts using the IpamHost resources.
The IPAM service can support different networking topologies and network
hardware configurations on the bare metal hosts.
In the most basic network configuration, IPAM uses a single L3 network
to assign addresses to all bare metal hosts, as defined in
Managed cluster networking.
You can apply complex networking configurations to a bare metal host
using the L2 templates. The L2 templates imply multihomed host networking
and enable you to create a managed cluster where nodes use separate host
networks for different types of traffic. Multihoming is required
to ensure the security and performance of a managed cluster.
Warning
Avoid modifying existing L2 templates and subnets that the
deployed machines use. This prevents failures of multiple clusters
caused by unsafe changes. The risks posed by modifying
L2 templates in use include:
Services running on hosts cannot reconfigure automatically
to switch to the new IP addresses and/or interfaces.
Connections between services are interrupted unexpectedly,
which can cause data loss.
Incorrect configurations on hosts can lead to irrevocable loss
of connectivity between services and unexpected cluster
partition or disassembly.
Note
Starting from Container Cloud 2.17.0, modification of L2 templates
in use is prohibited in the API to prevent accidental cluster failures
due to unsafe changes.
The main purpose of networking in a Container Cloud management or regional
cluster is to provide access to the Container Cloud Management API
that consists of the Kubernetes API of the Container Cloud management and
regional clusters and the Container Cloud LCM API. This API allows end users to
provision and configure managed clusters and machines. Also, this API is used
by LCM agents in managed clusters to obtain configuration and report status.
The following types of networks are supported for the management and regional
clusters in Container Cloud:
PXE network
Enables PXE boot of all bare metal machines in the Container Cloud region.
PXE subnet
Provides IP addresses for DHCP and network boot of the bare metal hosts
for initial inspection and operating system provisioning.
This network may not have the default gateway or a router connected
to it. The PXE subnet is defined by the Container Cloud Operator
during bootstrap.
Provides IP addresses for the bare metal management services of
Container Cloud, such as bare metal provisioning service (Ironic).
These addresses are allocated and served by MetalLB.
Management network
Connects LCM agents running on the hosts to the Container Cloud LCM API.
Serves the external connections to the Container Cloud Management API.
The network is also used for communication between kubelet
and the Kubernetes API server inside a Kubernetes cluster. The MKE
components use this network for communication inside a swarm cluster.
The network also serves storage traffic for the built-in Ceph cluster.
LCM subnet
Provides IP addresses for the Kubernetes nodes in the management cluster.
This network also provides a Virtual IP (VIP) address for the load
balancer that enables external access to the Kubernetes API
of a management cluster. This VIP is also the endpoint to access
the Container Cloud Management API in the management cluster.
Provides IP addresses for the externally accessible services of
Container Cloud, such as Keycloak, web UI, StackLight.
These addresses are allocated and served by MetalLB.
Kubernetes workloads network
Technology Preview
Serves the internal traffic between workloads on the management cluster.
Kubernetes workloads subnet
Provides IP addresses that are assigned to nodes and used by Calico.
Out-of-Band (OOB) network
Connects to Baseboard Management Controllers of the servers that host
the management cluster. The OOB subnet must be accessible from the
management network through IP routing. The OOB network
is not managed by Container Cloud and is not represented in the IPAM API.
Kubernetes cluster networking is typically focused on connecting pods on
different nodes. On bare metal, however, the cluster networking is more
complex because it needs to facilitate many different types of traffic.
Kubernetes clusters managed by Mirantis Container Cloud
have the following types of traffic:
PXE network
Enables the PXE boot of all bare metal machines in Container Cloud.
This network is not configured on the hosts in a managed cluster.
It is used by the bare metal provider to provision additional
hosts in managed clusters and is disabled on the hosts after
provisioning is done.
Life-cycle management (LCM) network
Connects LCM agents running on the hosts to the Container Cloud LCM API.
The LCM API is provided by the regional or management cluster.
The LCM network is also used for communication between kubelet
and the Kubernetes API server inside a Kubernetes cluster. The MKE
components use this network for communication inside a swarm cluster.
LCM subnet
Provides IP addresses that are statically allocated by the IPAM service
to bare metal hosts. This network must be connected to the Kubernetes API
endpoint of the regional cluster through an IP router.
LCM agents running on managed clusters will connect to
the regional cluster API through this router. LCM subnets may be
different per managed cluster as long as this connection requirement is
satisfied.
The Virtual IP (VIP) address for load balancer that enables access to
the Kubernetes API of the managed cluster must be allocated from the LCM
subnet.
Kubernetes workloads network
Technology Preview
Serves as an underlay network for traffic between pods in
the managed cluster. This network should not be shared between clusters.
Kubernetes workloads subnet
Provides IP addresses that are assigned to nodes and used by Calico.
Kubernetes external network
Serves ingress traffic to the managed cluster from the outside world.
This network can be shared between clusters, but must have a dedicated
subnet per cluster.
Services subnet
Technology Preview
Provides IP addresses for externally available load-balanced services.
The address ranges for MetalLB are assigned from this subnet.
This subnet must be unique per managed cluster.
Storage network
Serves storage access and replication traffic from and to Ceph OSD services.
The storage network does not need to be connected to any IP routers
and does not require external access, unless you want to use Ceph
from outside of a Kubernetes cluster.
To use a dedicated storage network, define and configure
both subnets listed below.
Storage access subnet
Provides IP addresses that are assigned to Ceph nodes.
The Ceph OSD services bind to these addresses on their respective
nodes. Serves Ceph access traffic from and to storage clients.
This is a public network in Ceph terms.
This subnet is unique per managed cluster.
Storage replication subnet
Provides IP addresses that are assigned to Ceph nodes.
The Ceph OSD services bind to these addresses on their respective
nodes. Serves Ceph internal replication traffic. This is a
cluster network in Ceph terms.
This subnet is unique per managed cluster.
Out-of-Band (OOB) network
Connects baseboard management controllers (BMCs) of the bare metal hosts.
This network must not be accessible from the managed clusters.
The following diagram illustrates the networking schema of the Container Cloud
deployment on bare metal with a managed cluster:
The following network roles are defined for all Mirantis Container Cloud
cluster nodes on bare metal, including the bootstrap,
management, regional, and managed cluster nodes:
Out-of-band (OOB) network
Connects the Baseboard Management Controllers (BMCs) of the hosts
in the network to Ironic. This network is out of band for the
host operating system.
PXE network
Enables remote booting of servers through the PXE protocol. In management
or regional clusters, DHCP server listens on this network for hosts
discovery and inspection. In managed clusters, hosts use this network
for the initial PXE boot and provisioning.
LCM network
Connects LCM agents running on the node to the LCM API of the management
or regional cluster. It is also used for communication between kubelet
and the Kubernetes API server inside a Kubernetes cluster. The MKE
components use this network for communication inside a swarm cluster.
In management or regional clusters, it is replaced by the management
network.
Kubernetes workloads (pods) network
Technology Preview
Serves connections between Kubernetes pods.
Each host has an address on this network, and this address is used
by Calico as an endpoint to the underlay network.
Kubernetes external network
Technology Preview
Serves external connection to the Kubernetes API
and the user services exposed by the cluster. In management or regional
clusters, it is replaced by the management network.
Management network
Serves external connections to the Container Cloud Management API and
services of the management or regional cluster.
Not available in a managed cluster.
Storage access network
Connects Ceph nodes to the storage clients. The Ceph OSD service is
bound to the address on this network. This is a public network in
Ceph terms.
In management or regional clusters, it is replaced by the management
network.
Storage replication network
Connects Ceph nodes to each other. Serves internal replication traffic.
This is a cluster network in Ceph terms.
In management or regional clusters, it is replaced by the management
network.
Each network is represented on the host by a virtual Linux bridge. Physical
interfaces may be connected to one of the bridges directly, or through a
logical VLAN subinterface, or combined into a bond interface that is in
turn connected to a bridge.
The following table summarizes the default names used for the bridges
connected to the networks listed above:
Mirantis Container Cloud provides APIs that enable you
to define hardware configurations that extend the reference architecture:
Bare Metal Host Profile API
Enables quick configuration of host boot and storage devices
and assignment of custom configuration profiles to individual machines.
See Create a custom bare metal host profile.
IP Address Management API
Enables quick configuration of host network interfaces and IP addresses
and setup of IP address ranges for automatic allocation.
See Create L2 templates.
Typically, operations with the extended hardware configurations are available
through the API and CLI, but not the web UI.
To keep the operating system on a bare metal host up to date with the latest
security updates, the operating system requires periodic software
package upgrades that may or may not require a host reboot.
Mirantis Container Cloud uses life cycle management tools to update
the operating system packages on the bare metal hosts. Container Cloud
may also trigger a restart of bare metal hosts to apply the updates.
In the management cluster of Container Cloud, software package upgrades and
host restarts are applied automatically when a new Container Cloud version
with available kernel or software package upgrades is released.
In managed clusters, package upgrades and host restarts are applied
as part of the usual cluster upgrade using the Update cluster option
in the Container Cloud web UI.
Operating system upgrades and host restarts are applied to cluster
nodes one by one. If Ceph is installed in the cluster, the Container
Cloud orchestration securely pauses the Ceph OSDs on the node before
restart. This avoids degradation of the storage service.
Caution
Depending on the cluster configuration, applying security
updates and host restart can increase the update time for each node to up to
1 hour.
Cluster nodes are updated one by one. Therefore, for large clusters,
the update may take several days to complete.
The Mirantis Container Cloud managed clusters that are based on
vSphere, Equinix Metal, or bare metal use MetalLB for load balancing
of services and HAProxy with VIP managed by Virtual Router Redundancy
Protocol (VRRP) with Keepalived for the Kubernetes API load balancer.
Every control plane node of each Kubernetes cluster runs the kube-api
service in a container. This service provides a Kubernetes API endpoint.
Every control plane node also runs the haproxy server that provides
load balancing with back-end health checking for all kube-api endpoints as
back ends.
The default load balancing method is least_conn. With this method,
a request is sent to the server with the least number of active
connections. The default load balancing method cannot be changed
using the Container Cloud API.
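The following minimal Python sketch only illustrates the least_conn selection
rule described above, that is, sending the next request to the back end with
the fewest active connections; it is not the haproxy implementation.

    def pick_backend(active_connections):
        """Return the kube-api back end with the fewest active connections."""
        return min(active_connections, key=active_connections.get)

    # Hypothetical connection counts per control plane node.
    backends = {"master-0:6443": 12, "master-1:6443": 7, "master-2:6443": 9}
    print(pick_backend(backends))   # -> master-1:6443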
Only one of the control plane nodes at any given time serves as a
front end for Kubernetes API. To ensure this, the Kubernetes clients
use a virtual IP (VIP) address for accessing Kubernetes API.
This VIP is assigned to one node at a time using VRRP. Keepalived running on
each control plane node provides health checking and failover of the VIP.
Keepalived is configured in multicast mode.
Note
The use of a VIP address for load balancing of the Kubernetes API requires
that all control plane nodes of a Kubernetes cluster are connected
to a shared L2 segment. This limitation prevents installing
full L3 topologies where control plane nodes are split
between different L2 segments and L3 networks.
Caution
External load balancers for services are not supported by
the current version of the Container Cloud vSphere provider.
The built-in load balancing described in this section is the only supported
option and cannot be disabled.
The services provided by the Kubernetes clusters, including
Container Cloud and user services, are balanced by MetalLB.
The metallb-speaker service runs on every worker node in
the cluster and handles connections to the service IP addresses.
MetalLB runs in the MAC-based (L2) mode. This means that all
control plane nodes must be connected to a shared L2 segment.
This limitation does not allow installing full L3
cluster topologies.
Caution
External load balancers for services are not supported by
the current version of the Container Cloud vSphere provider.
The built-in load balancing described in this section is the only supported
option and cannot be disabled.
VMware vSphere network objects and IPAM recommendations
The VMware vSphere provider of Mirantis Container Cloud supports
the following types of vSphere network objects:
Virtual network
A network of virtual machines running on a hypervisor(s) that are logically
connected to each other so that they can exchange data. Virtual machines
can be connected to virtual networks that you create when you add a network.
Distributed port group
A port group associated with a vSphere distributed switch that specifies
port configuration options for each member port. Distributed port groups
define how connection is established through the vSphere distributed switch
to the network.
A Container Cloud cluster can be deployed using one of these network objects
with or without a DHCP server in the network:
Non-DHCP
Container Cloud uses IPAM service to manage IP addresses assignment to
machines. You must provide additional network parameters, such as
CIDR, gateway, IP ranges, and nameservers.
Container Cloud transforms this data into cloud-init metadata and
passes it to machines during their bootstrap.
DHCP
Container Cloud relies on a DHCP server to assign IP addresses
to virtual machines.
Mirantis recommends using the IP address management (IPAM) service provided
by Container Cloud for cluster machines. IPAM must be enabled
for deployment in non-DHCP vSphere networks, but Mirantis
recommends enabling IPAM in the DHCP-based networks as well. In this case,
the dedicated IPAM range must not intersect with the IP range used in the
DHCP server configuration for the provided vSphere network.
Such configuration prevents issues with accidental IP address change
for machines. For the issue details, see
vSphere troubleshooting.
The following parameters are required to enable IPAM:
Network CIDR.
Network gateway address.
Minimum 1 DNS server.
IP address include range to be allocated for cluster machines.
Make sure that this range is not part of the DHCP range if the network has
a DHCP server.
Minimum number of addresses in the range:
3 IPs for a management or regional cluster
3+N IPs for a managed cluster, where N is the number of worker nodes
Optional. IP address exclude range that is the list of IPs not to be
assigned to machines from the include ranges.
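The following hedged Python example shows how the parameters listed above can
be sanity-checked with the standard ipaddress module; all values are
illustrative.

    import ipaddress

    network = ipaddress.ip_network("10.0.0.0/24")          # network CIDR
    gateway = ipaddress.ip_address("10.0.0.1")             # network gateway
    include_range = (ipaddress.ip_address("10.0.0.100"),
                     ipaddress.ip_address("10.0.0.150"))   # IPAM include range
    dhcp_range = (ipaddress.ip_address("10.0.0.10"),
                  ipaddress.ip_address("10.0.0.99"))       # DHCP range, if any
    worker_nodes = 5

    assert gateway in network, "gateway must belong to the network CIDR"
    assert all(ip in network for ip in include_range), "include range outside CIDR"
    # The include range must not overlap the DHCP range of the vSphere network.
    assert include_range[1] < dhcp_range[0] or include_range[0] > dhcp_range[1], \
        "IPAM include range overlaps the DHCP range"
    # 3 IPs for a management or regional cluster, 3 + N for a managed cluster.
    available = int(include_range[1]) - int(include_range[0]) + 1
    assert available >= 3 + worker_nodes, "not enough addresses in the include range"
    print("IPAM parameters look consistent")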
A dedicated Container Cloud network must not contain any virtual machines
with a keepalived instance running inside them, as this may lead to a
vrouter_id conflict. By default, the Container Cloud management or
regional cluster is deployed with vrouter_id set to 1.
Managed clusters are deployed with the vrouter_id value starting from
2 and higher.
This section describes the architecture for a Mirantis Container Cloud
deployment based on the Equinix Metal infrastructure using private networks.
Private networks are required for the following use cases:
Connect the Container Cloud to the on-premises corporate networks without
exposing it to the Internet. This can be required by corporate security
policies.
Reduce ingress and egress bandwidth costs and the number of public IP
addresses utilized by the deployment. Public IP addresses are a scarce
and valuable resource, and Container Cloud should only expose the necessary
services in that address space.
Testing and staging environments typically do not require accepting
connections from the outside of the cluster. Such Container Cloud clusters
should be isolated in private VLANs.
The following diagram illustrates a high-level overview of the architecture.
It covers the Container Cloud deployment across multiple Metros, marked as
A and B on the diagram.
Container Cloud clusters are isolated in private VLANs and do not use
any public IP addresses. An external infrastructure allows exposing necessary
services to the outside world. This external infrastructure must be provided
by the Operator before installing Container Cloud.
The Operator who installs Container Cloud must deploy and configure certain
infrastructure before the actual installation starts. This
infrastructure must include the following elements, created and configured
through the Equinix Metal UI or API and the command line:
At least one node that provides the following services:
Routers
IP router service that connects all private networks in each Metro and
allows managed clusters to communicate with the management cluster.
It must be connected to all VLANs in a Metro, provide IP
routing to the hosts connected to these VLANs, and act as the default
router for all of them. In a multi-Metro case, it should have a VXLAN
tunnel with all other routers in other Metros.
DHCP Relay
Forwards DHCP requests from the private network (VLAN) of
the managed cluster to the DHCP server of the bare metal management
service of Container Cloud. The DHCP server is placed in the VLAN
of the management cluster.
Proxy Server (optional)
If direct access to the Internet from management or managed clusters is
not desired, a proxy server can be used to provide access to the
artifacts placed in the Container Cloud CDN and other external
resources.
Management and regional clusters require direct or proxy access to the
Mirantis CDN to download artifacts and send encrypted telemetry.
Temporary seed node for the management cluster bootstrap
The seed node should be deployed through the Equinix Metal console or API
in the Metro where the management cluster will be deployed. This node
must be attached to the VLAN that will be used by the management cluster.
Equinix Backend Transfer enabled in the current Container Cloud project
Backend Transfer enables inter-Metro communication between
managed and management clusters placed in different Metros.
Caution
Ensure that the IP subnets allocated to VLANs do not overlap. Correctly
and consistently configure the IP routing and the allocation of IP addresses
to the management, regional, and managed clusters.
Before deploying Container Cloud, verify the following:
Subnets and IP ranges in the bootstrap templates to avoid CIDRs overlapping
Proxy configuration in the templates and environment variables
VLAN attachments to routers
Note
For an example of Terraform templates and Ansible playbooks to
use for deployment and configuration of all components described
above, see
Container Cloud on Equinix Metal templates.
To improve isolation, each cluster is placed in its own private network (VLAN).
All other connectivity, including access to the Equinix Metal internal network,
is disabled on all nodes.
The following diagram illustrates a high-level overview of the architecture.
It covers the Container Cloud deployment across multiple Metros, marked as
A and B on the diagram.
The main element in the private networking architecture is the
infrastructure-level IP routing provided by routers that are connected to
the VLAN of each cluster, enabling communication between clusters.
Another critical service that routers provide is DHCP relay. It forwards
DHCP/PXE requests from managed cluster nodes to the Ironic DHCP server in the
management cluster.
Example configuration:
Metro A has 2 VLANs:
VLAN 1001 with the IP range 192.168.0.0/24 connected to the router
interface bond0.1001 with the IP 192.168.0.1 for the management
cluster
VLAN 1002 with IP range 192.168.1.0/24 connected to the router
interface bond0.1002 with IP 192.168.1.1 for the managed cluster
The IP address of the Ironic DHCP server is 192.168.0.50
With these settings, the router should have the following configuration:
Forwarding between the bond0.1001 and bond0.1002 interfaces
Route for 192.168.0.0/24 via the bond0.1001 interface
Route for 192.168.1.0/24 via the bond0.1002 interface
DHCP relay on both bond0.1001 and bond0.1002 interfaces forwarding
DHCP requests to 192.168.0.50
One of the following services:
SNAT for traffic from the 192.168.0.0/24 and 192.168.1.0/24
networks to the Internet or the proxy service
Proxy service for the Mirantis CDN and telemetry servers running
on the router itself
During Container Cloud deployment, the seed node will have to:
Be deployed in Metro A
Be directly connected to VLAN 1001
Have access to the proxy service or the Internet
If users need to deploy clusters in different Metros, every Metro requires
a separate router and VLAN configuration.
For example, Metro B has a separate router and 2 VLANs:
VLAN 2001 with the IP range 192.168.16.0/24 connected to the router
interface bond0.2001 with the IP 192.168.16.1 for the management
cluster
VLAN 2002 with the IP range 192.168.17.0/24 connected to the router
interface bond0.2002 with the IP 192.168.17.1 for the managed cluster
With this configuration, to deploy managed clusters using the management
cluster from Metro A, the router in Metro B should have the following
configuration:
The VXLAN interface (or any other tunnel) vxlan1, for example, with the
IP address 192.168.255.2 and the remote address of the router in Metro A
Forwarding between the bond0.2001, bond0.2002, and vxlan1
interfaces
Route for 192.168.16.0/24 via the bond0.2001 interface
Route for 192.168.17.0/24 via the bond0.2002 interface
Route for 192.168.0.0/20 via 192.168.255.1 that is the VXLAN address
of the router in Metro A
DHCP relay on both bond0.2001 and bond0.2002 interfaces forwarding
DHCP requests to 192.168.0.50
The router in Metro A requires additional configuration for:
The VXLAN interface (or any other tunnel) vxlan1, for example, with the
IP address 192.168.255.1 and the remote address of the router in Metro B
Route for 192.168.16.0/20 via 192.168.255.2 that is the VXLAN address
of the router in Metro B
You can deploy a separate regional cluster in Metro B to reduce traffic
between Metros. For example, if you use VLAN 2001 for the regional cluster,
the router in Metro B requires additional configuration for:
DHCP relay on both bond0.2001 and bond0.2002 interfaces forwarding
DHCP requests to 192.168.16.50 that is the IP address of the Ironic
DHCP server in the regional cluster
SNAT or proxy service to allow the regional cluster access to the Mirantis
CDN and telemetry
During regional cluster deployment, the seed node for the regional cluster
will have to:
Be deployed in Metro B
Be directly connected to VLAN 2001
Have access to the proxy service or the Internet
The following diagram illustrates the Container Cloud deployment with an
additional regional cluster:
Note
For an example of Terraform templates and Ansible playbooks to
use for deployment and configuration of all components described
above, see
Container Cloud on Equinix Metal templates.
The Kubernetes lifecycle management (LCM) engine in Mirantis Container Cloud
consists of the following components:
LCM controller
Responsible for all LCM operations. Consumes the LCMCluster object
and orchestrates actions through LCM agent.
LCM agent
Relates only to Mirantis Kubernetes Engine (MKE) clusters deployed
using Container Cloud, and is not used for attached MKE clusters.
Runs on the target host. Executes Ansible playbooks in headless mode.
Helm controller
Responsible for the lifecycle of the Helm charts.
It is installed by LCM controller and interacts with Tiller.
The Kubernetes LCM components handle the following custom resources:
LCMCluster
LCMMachine
HelmBundle
The following diagram illustrates handling of the LCM custom resources by
the Kubernetes LCM components. On a managed cluster,
apiserver handles multiple Kubernetes objects,
for example, deployments, nodes, RBAC, and so on.
The Kubernetes LCM components handle the following custom resources (CRs):
LCMMachine
LCMCluster
HelmBundle
LCMMachine
Describes a machine that is located in a cluster.
It contains the machine type (control or worker) and
StateItems that correspond to Ansible playbooks and miscellaneous actions,
for example, downloading a file or executing a shell command.
LCMMachine reflects the current state of the machine, for example,
a node IP address, and each StateItem through its status.
Multiple LCMMachine CRs can correspond to a single cluster.
LCMCluster
Describes a managed cluster. In its spec,
LCMCluster contains a set of StateItems for each type of LCMMachine,
which describe the actions that must be performed to deploy the cluster.
LCMCluster is created by the provider, using machineTypes
of the Release object. The status field of LCMCluster
reflects the status of the cluster,
for example, the number of ready or requested nodes.
HelmBundle
Wrapper for Helm charts that is handled by Helm controller.
HelmBundle tracks what Helm charts must be installed
on a managed cluster.
LCM controller runs on the management and regional clusters and orchestrates
the LCMMachine objects according to their type and their LCMCluster object.
Once the LCMCluster and LCMMachine objects are created, LCM controller
starts monitoring them to modify the spec fields and update
the status fields of the LCMMachine objects when required.
The status field of LCMMachine is updated by LCM agent
running on a node of a management, regional, or managed cluster.
Each LCMMachine has the following lifecycle states:
Uninitialized - the machine is not yet assigned to an LCMCluster.
Pending - the agent reports a node IP address and host name.
Prepare - the machine executes StateItems that correspond
to the prepare phase. This phase usually involves downloading
the necessary archives and packages.
Deploy - the machine executes StateItems that correspond
to the deploy phase that is becoming a Mirantis Kubernetes Engine (MKE)
node.
Ready - the machine is deployed.
Upgrade - the machine is being upgraded to the new MKE version.
Reconfigure - the machine executes StateItems that correspond
to the reconfigure phase. The machine configuration is being updated
without affecting workloads running on the machine.
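The following Python sketch models these lifecycle states as a simple state
machine. The allowed transitions are an assumption for illustration; the real
LCM controller logic is more involved.

    from enum import Enum

    class LCMMachineState(Enum):
        UNINITIALIZED = "Uninitialized"
        PENDING = "Pending"
        PREPARE = "Prepare"
        DEPLOY = "Deploy"
        READY = "Ready"
        UPGRADE = "Upgrade"
        RECONFIGURE = "Reconfigure"

    # Assumed "happy path" transitions between the documented states.
    NEXT_STATES = {
        LCMMachineState.UNINITIALIZED: {LCMMachineState.PENDING},
        LCMMachineState.PENDING: {LCMMachineState.PREPARE},
        LCMMachineState.PREPARE: {LCMMachineState.DEPLOY},
        LCMMachineState.DEPLOY: {LCMMachineState.READY},
        LCMMachineState.READY: {LCMMachineState.UPGRADE, LCMMachineState.RECONFIGURE},
        LCMMachineState.UPGRADE: {LCMMachineState.READY},
        LCMMachineState.RECONFIGURE: {LCMMachineState.READY},
    }

    def advance(current, new):
        """Validate and apply a state transition."""
        if new not in NEXT_STATES[current]:
            raise ValueError(f"illegal transition {current.value} -> {new.value}")
        return new

    state = advance(LCMMachineState.UNINITIALIZED, LCMMachineState.PENDING)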
The templates for StateItems are stored in the machineTypes
field of an LCMCluster object, with separate lists
for the MKE manager and worker nodes.
Each StateItem has the execution phase field for a management,
regional, and managed cluster:
The prepare phase is executed for all machines for which
it was not executed yet. This phase comprises downloading the files
necessary for the cluster deployment, installing the required packages,
and so on.
During the deploy phase, a node is added to the cluster.
LCM controller applies the deploy phase to the nodes
in the following order:
The first manager node is deployed.
The remaining manager nodes are deployed one by one
and the worker nodes are deployed in batches (by default,
up to 50 worker nodes at the same time).
After at least one manager and one worker node
are in the ready state, helm-controller is installed
on the cluster.
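A hedged Python sketch of this deploy ordering follows; node names and the
batch size parameter are illustrative.

    def deploy_batches(managers, workers, worker_batch_size=50):
        """Yield groups of nodes in the order the deploy phase is applied."""
        if managers:
            yield [managers[0]]                       # first manager node alone
            for manager in managers[1:]:              # remaining managers, one by one
                yield [manager]
        for i in range(0, len(workers), worker_batch_size):
            yield workers[i:i + worker_batch_size]    # workers in batches of up to 50

    for batch in deploy_batches(["m0", "m1", "m2"], [f"w{i}" for i in range(120)]):
        print(len(batch), batch[:3])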
LCM controller deploys and upgrades a Mirantis Container Cloud cluster
by setting StateItems of LCMMachine objects following the corresponding
StateItems phases described above. The Container Cloud cluster upgrade
process follows the same logic that is used for a new deployment,
that is, applying a new set of StateItems to the LCMMachine objects after
updating the LCMCluster object. However, during the upgrade, the following
additional actions are performed:
If an existing worker node is being upgraded, LCM controller cordons and
drains this node, honoring the
Pod Disruption Budgets.
This operation prevents unexpected disruptions of the workloads.
LCM controller verifies that the required version of helm-controller
is installed.
LCM agent handles a single machine that belongs to a management, regional, or
managed cluster.
It runs on the machine operating system but communicates with apiserver
of the regional cluster. LCM agent is deployed as a systemd unit using
cloud-init. LCM agent has a built-in self-upgrade mechanism.
LCM agent monitors the spec of a particular LCMMachine object
to reconcile the machine state with the object StateItems and update
the LCMMachine status accordingly. The actions that LCM agent performs
while handling the StateItems are as follows:
Download configuration files
Run shell commands
Run Ansible playbooks in headless mode
LCM agent provides the IP address and hostname of the machine
for the LCMMachine status parameter.
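The following illustrative Python sketch dispatches StateItem actions of the
kinds listed above. The StateItem schema and the ansible-playbook invocation
are assumptions for demonstration, not the real LCM agent implementation.

    import subprocess
    import urllib.request

    def handle_state_item(item):
        kind = item["kind"]
        if kind == "download":
            # Download a configuration file to the target path.
            urllib.request.urlretrieve(item["url"], item["dest"])
        elif kind == "shell":
            # Run a shell command and fail on a non-zero exit code.
            subprocess.run(item["command"], shell=True, check=True)
        elif kind == "ansible":
            # Run an Ansible playbook in headless (non-interactive) mode.
            subprocess.run(["ansible-playbook", item["playbook"]], check=True)
        else:
            raise ValueError(f"unknown StateItem kind: {kind}")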
Helm controller is used by Mirantis Container Cloud to handle
the core addons of management, regional, and managed clusters, such as
StackLight, and the application addons, such as the OpenStack components.
Helm controller runs in the same pod as the Tiller process.
The Tiller gRPC endpoint is not accessible outside the pod.
The pod is created using StatefulSet inside a cluster
by LCM controller once the cluster contains at least one manager and worker
node.
The Helm release information is stored in the KaaSRelease object for
the management and regional clusters and in the ClusterRelease object
for all types of the Container Cloud clusters.
These objects are used by the Container Cloud provider.
The Container Cloud provider uses the information from the
ClusterRelease object together with the Container Cloud API
Cluster spec. In the Cluster spec, the operator can specify
the Helm release name and charts to use.
By combining the information from the Cluster providerSpec parameter
and its ClusterRelease object, the cluster actuator generates
the LCMCluster objects. These objects are further handled by LCM controller,
and the HelmBundle object is handled by Helm controller.
HelmBundle must have the same name as the LCMCluster object
for the cluster that HelmBundle applies to.
Although a cluster actuator can only create a single HelmBundle
per cluster, Helm controller can handle multiple HelmBundle objects
per cluster.
Helm controller handles the HelmBundle objects and reconciles them with the
Tiller state in its cluster.
Helm controller can also be used by the management cluster with corresponding
HelmBundle objects created as part of the initial management cluster setup.
Identity and access management (IAM) provides a central point
of user and permission management for the Mirantis Container
Cloud cluster resources in a granular and unified manner.
Also, IAM provides infrastructure for single sign-on user experience
across all Container Cloud web portals.
IAM for Container Cloud consists of the following components:
Keycloak
Provides the OpenID Connect endpoint
Integrates with an external identity provider (IdP), for example,
existing LDAP or Google Open Authorization (OAuth)
Stores roles mapping for users
IAM controller
Provides IAM API with data about Container Cloud projects
Handles all role-based access control (RBAC) components in Kubernetes API
IAM API
Provides an abstraction API for creating user scopes and roles
Mirantis IAM exposes a versioned and backward-compatible Google
remote procedure call (gRPC) API for interaction with the IAM CLI.
IAM API is designed as a user-facing functionality. For this reason,
it operates in the context of user authentication and authorization.
In IAM API, an operator can use the following entities:
Grants - to grant or revoke user access
Scopes - to describe user roles
Users - to provide user account information
Mirantis Container Cloud UI interacts with IAM API
on behalf of the user. However, the user can directly work with IAM API
using IAM CLI.
IAM CLI uses the OpenID Connect (OIDC) endpoint to obtain
the OIDC token for authentication in IAM API and enable you to perform
different API operations.
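As a hedged example, the following Python snippet obtains an OIDC token from
a Keycloak token endpoint using the Resource Owner Password Credentials grant.
The Keycloak URL, realm, client ID, and credentials are placeholders that
depend on the deployment.

    import requests

    KEYCLOAK_URL = "https://keycloak.example.com"   # placeholder
    REALM = "iam"                                   # placeholder realm name
    TOKEN_ENDPOINT = (
        f"{KEYCLOAK_URL}/auth/realms/{REALM}/protocol/openid-connect/token"
    )

    response = requests.post(
        TOKEN_ENDPOINT,
        data={
            "grant_type": "password",
            "client_id": "iam-cli",                 # placeholder client
            "username": "operator",
            "password": "secret",
        },
        timeout=10,
    )
    response.raise_for_status()
    access_token = response.json()["access_token"]  # JWT used against IAM API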
The following diagram illustrates the interaction between IAM API and CLI:
To be consistent and keep the integrity of the user database
and user permissions, IAM in Mirantis Container Cloud
stores the user identity information internally.
However, in real deployments, an identity provider usually already exists.
Out of the box, in Container Cloud, IAM supports
integration with LDAP and Google Open Authorization (OAuth).
If LDAP is configured as an external identity provider,
IAM performs one-way synchronization by mapping attributes according
to configuration.
In the case of the Google Open Authorization (OAuth) integration,
the user is automatically registered and their credentials are stored
in the internal database according to the user template configuration.
The Google OAuth registration workflow is as follows:
The user requests a Container Cloud web UI resource.
The user is redirected to the IAM login page and logs in using
the Log in with Google account option.
IAM creates a new user with the default access rights that are defined
in the user template configuration.
The user can access the Container Cloud web UI resource.
The following diagram illustrates the external IdP integration to IAM:
You can configure simultaneous integration with both external IdPs
with the user identity matching feature enabled.
Mirantis IAM acts as an OpenID Connect (OIDC) provider:
it issues tokens and exposes discovery endpoints.
The credentials can be handled by IAM itself or delegated
to an external identity provider (IdP).
The issued JSON Web Token (JWT) is sufficient to perform operations across
Mirantis Container Cloud according to the scope and role defined
in it. Mirantis recommends using asymmetric cryptography for token signing
(RS256) to minimize the dependency between IAM and managed components.
When Container Cloud calls Mirantis Kubernetes Engine (MKE),
the user in Keycloak is created automatically with a JWT issued by Keycloak
on behalf of the end user.
MKE, in its turn, verifies whether the JWT is issued by Keycloak. If
the user retrieved from the token does not exist in the MKE database,
the user is automatically created in the MKE database based on the
information from the token.
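A hedged Python sketch of verifying such an RS256-signed JWT with the PyJWT
library follows; the public key handling and audience checks are placeholders
that depend on the client configuration.

    import jwt  # PyJWT

    def verify_token(token, public_key_pem):
        """Return the decoded claims if the RS256 signature and expiry are valid."""
        return jwt.decode(
            token,
            public_key_pem,
            algorithms=["RS256"],           # asymmetric signing, as recommended
            options={"verify_aud": False},  # audience checks depend on the client
        )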
The authorization implementation is out of the scope of IAM in Container
Cloud. This functionality is delegated to the component level.
IAM interacts with a Container Cloud component using the OIDC token
content, which the component itself processes to enforce the required
authorization. Such an approach enables you to have any underlying authorization
that is not dependent on IAM and still to provide a unified user experience
across all Container Cloud components.
The following diagram illustrates the Kubernetes CLI authentication flow.
The authentication flow for Helm and other Kubernetes-oriented CLI utilities
is identical to the Kubernetes CLI flow,
but JSON Web Tokens (JWT) must be pre-provisioned.
The baremetal-based or Equinix Metal based Mirantis Container Cloud uses Ceph
as a distributed storage system for file, block, and object storage.
This section provides an overview of a Ceph cluster deployed by
Container Cloud.
Mirantis Container Cloud deploys Ceph on the baremetal-based
management and managed clusters and on the Equinix Metal based managed clusters
using Helm charts with the following components:
Rook Ceph Operator
A storage orchestrator that deploys Ceph on top of a Kubernetes cluster. Also
known as Rook or RookOperator. Rook operations include:
Deploying and managing a Ceph cluster based on provided Rook CRs such as
CephCluster, CephBlockPool, CephObjectStore, and so on.
Orchestrating the state of the Ceph cluster and all its daemons.
KaaSCephCluster custom resource (CR)
Represents the customization of a Kubernetes installation and allows you to
define the required Ceph configuration through the Container Cloud web UI
before deployment. For example, you can define the failure domain, Ceph pools,
Ceph node roles, number of Ceph components such as Ceph OSDs, and so on.
The ceph-kcc-controller controller on the Container Cloud management
cluster manages the KaaSCephCluster CR.
Ceph Controller
A Kubernetes controller that obtains the parameters from Container Cloud
through a CR, creates CRs for Rook and updates its CR status based on the Ceph
cluster deployment progress. It creates users, pools, and keys for OpenStack
and Kubernetes and provides Ceph configurations and keys to access them. Also,
Ceph Controller eventually obtains the data from the OpenStack Controller for
the Keystone integration and updates the RADOS Gateway services configurations
to use Kubernetes for user authentication. Ceph Controller operations include:
Transforming user parameters from the Container Cloud Ceph CR into Rook CRs
and deploying a Ceph cluster using Rook.
Providing integration of the Ceph cluster with Kubernetes.
Providing data for OpenStack to integrate with the deployed Ceph cluster.
Ceph Status Controller
A Kubernetes controller that collects all valuable parameters from the current
Ceph cluster, its daemons, and entities and exposes them into the
KaaSCephCluster status. Ceph Status Controller operations include:
Collecting all statuses from a Ceph cluster and corresponding Rook CRs.
Collecting additional information on the health of Ceph daemons.
Providing information to the status section of the KaaSCephCluster
CR.
Ceph Request Controller
A Kubernetes controller that obtains the parameters from Container Cloud
through a CR and handles Ceph OSD lifecycle management (LCM) operations. It
allows for a safe Ceph OSD removal from the Ceph cluster. Ceph Request
Controller operations include:
Providing an ability to perform Ceph OSD LCM operations.
Obtaining specific CRs to remove Ceph OSDs and executing them.
Pausing the regular Ceph Controller reconcile until all requests are
completed.
A typical Ceph cluster consists of the following components:
Ceph Monitors - three or, in rare cases, five Ceph Monitors.
Ceph Managers - one Ceph Manager in a regular cluster.
RADOS Gateway services - Mirantis recommends having three or more RADOS
Gateway instances for HA.
Ceph OSDs - the number of Ceph OSDs may vary according to the deployment
needs.
Warning
A Ceph cluster with 3 Ceph nodes does not provide
hardware fault tolerance and is not eligible
for recovery operations,
such as a disk or an entire Ceph node replacement.
A Ceph cluster uses a replication factor of 3.
If the number of Ceph OSDs is less than 3, the Ceph cluster
moves to the degraded state with write operations
restricted until the number of alive Ceph OSDs
equals the replication factor again.
The placement of Ceph Monitors and Ceph Managers is defined in the
KaaSCephCluster CR.
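For example, a minimal sketch of the nodes section of KaaSCephCluster that places the Ceph Monitor and Manager roles on specific machines may look as follows. The machine names are placeholders, and the exact field layout may differ between product releases:
spec:
  cephClusterSpec:
    nodes:
      machine-1:
        roles:
          - mon
          - mgr
      machine-2:
        roles:
          - mon
          - mgr
      machine-3:
        roles:
          - mon
          - mgr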
The following diagram illustrates the way a Ceph cluster is deployed in
Container Cloud:
The following diagram illustrates the processes within a deployed Ceph cluster:
A Ceph cluster configuration in Mirantis Container Cloud
is subject to the following limitations, among others:
Only one Ceph Controller per management, regional, or managed cluster
and only one Ceph cluster per Ceph Controller are supported.
The replication size for any Ceph pool must be set to more than 1.
Only one CRUSH tree per cluster. The separation of devices per Ceph pool is
supported through device classes
with only one pool of each type for a device class.
All CRUSH rules must have the same failure_domain.
Only the following types of CRUSH buckets are supported:
topology.kubernetes.io/region
topology.kubernetes.io/zone
topology.rook.io/datacenter
topology.rook.io/room
topology.rook.io/pod
topology.rook.io/pdu
topology.rook.io/row
topology.rook.io/rack
topology.rook.io/chassis
Consuming an existing Ceph cluster is not supported.
CephFS is not fully supported (Technology Preview).
Only IPv4 is supported.
If two or more Ceph OSDs are located on the same device, there must be no
dedicated WAL or DB for this class.
Only full collocation or dedicated WAL and DB configurations are supported.
The minimum size of any defined Ceph OSD device is 5 GB.
Reducing the number of Ceph Monitors is not supported and causes the
removal of Ceph Monitor daemons from random nodes.
Removal of the mgr role in the nodes section of the
KaaSCephCluster CR does not remove Ceph Managers. To remove a Ceph
Manager from a node, remove it from the nodes spec and manually delete
the mgr pod in the Rook namespace.
When adding a Ceph node with the Ceph Monitor role, if any issues occur with
the Ceph Monitor, rook-ceph removes it and adds a new Ceph Monitor instead,
named using the next alphabetic character in order. Therefore, the Ceph Monitor
names may not follow the alphabetical order. For example, a, b, d,
instead of a, b, c.
Mirantis Container Cloud uses StackLight, the logging,
monitoring, and alerting solution that provides a single pane of glass
for cloud maintenance and day-to-day operations as well as
offers critical insights into cloud health including operational information
about the components deployed in management, regional, and managed clusters.
StackLight is based on Prometheus, an open-source monitoring solution and a
time series database.
Mirantis Container Cloud deploys the StackLight stack
as a release of a Helm chart that contains the helm-controller
and helmbundles.lcm.mirantis.com (HelmBundle) custom resources.
The StackLight HelmBundle consists of a set of Helm charts
with the StackLight components that include:
Alerta
Receives, consolidates, and deduplicates the alerts sent by Alertmanager
and visually represents them through a simple web UI. Using the Alerta
web UI, you can view the most recent or watched alerts, and group and
filter alerts.
Alertmanager
Handles the alerts sent by client applications such as Prometheus,
deduplicates, groups, and routes alerts to receiver integrations.
Using the Alertmanager web UI, you can view the most recent fired
alerts, silence them, or view the Alertmanager configuration.
Elasticsearch Curator
Maintains the data (indexes) in Elasticsearch by performing
such operations as creating, closing, or opening an index as well as
deleting a snapshot. Also, manages the data retention policy in
Elasticsearch.
Elasticsearch Exporter
The Prometheus exporter that gathers internal Elasticsearch
metrics.
Grafana
Builds and visually represents metric graphs based on time series
databases. Grafana supports querying of Prometheus using the PromQL
language.
Database back ends
StackLight uses PostgreSQL for Alerta and Grafana. PostgreSQL reduces
the data storage fragmentation while enabling high availability.
High availability is achieved using Patroni, the PostgreSQL cluster
manager that monitors for node failures and manages failover
of the primary node. StackLight also uses Patroni to manage major
version upgrades of PostgreSQL clusters, which allows leveraging
the database engine functionality and improvements
as they are introduced upstream in new releases,
maintaining functional continuity without version lock-in.
Logging stack
Responsible for collecting, processing, and persisting logs and
Kubernetes events. By default, when deploying through the Container
Cloud web UI, only the metrics stack is enabled on managed clusters. To
enable StackLight to gather managed cluster logs, enable the logging
stack during deployment. On management clusters, the logging stack is
enabled by default. The logging stack components include:
Elasticsearch, which stores logs and notifications.
Fluentd-elasticsearch, which collects logs, sends them to
Elasticsearch, generates metrics based on analysis of
incoming log entries, and exposes these metrics to Prometheus.
Kibana, which provides real-time visualization of the data
stored in Elasticsearch and enables you to detect issues.
Metricbeat, which collects Kubernetes events and sends them to
Elasticsearch for storage.
Prometheus-es-exporter, which presents the Elasticsearch data
as Prometheus metrics by periodically sending configured queries to
the Elasticsearch cluster and exposing the results to a
scrapable HTTP endpoint like other Prometheus targets.
Optional. Cerebro, a web UI for managing the Elasticsearch
cluster. Using the Cerebro web UI, you can get a detailed view on your
Elasticsearch cluster and debug issues. Cerebro is disabled
by default.
Note
The logging mechanism performance depends on the cluster log load. In
case of a high load, you may need to increase the default resource requests
and limits for fluentdElasticsearch. For details, see
StackLight configuration parameters: Resource limits.
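For illustration, such an override typically follows the standard Kubernetes requests and limits layout, similar to the sketch below. The exact parameter path in the StackLight chart values is documented in StackLight configuration parameters: Resource limits, and the figures are placeholders:
resources:
  fluentdElasticsearch:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: "1"
      memory: 2Gi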
Metric collector
Collects telemetry data (CPU or memory usage, number of active alerts,
and so on) from Prometheus and sends the data to centralized cloud
storage for further processing and analysis. Metric collector runs on
the management cluster.
Prometheus
Gathers metrics. Automatically discovers and monitors the endpoints.
Using the Prometheus web UI, you can view simple visualizations and
debug. By default, the Prometheus database stores metrics of the past 15
days or up to 15 GB of data depending on the limit that is reached
first.
Prometheus Blackbox Exporter
Allows monitoring endpoints over HTTP, HTTPS, DNS, TCP, and ICMP.
Prometheus-es-exporter
Presents the Elasticsearch data as Prometheus metrics by
periodically sending configured queries to the Elasticsearch
cluster and exposing the results to a scrapable HTTP endpoint like other
Prometheus targets.
Prometheus Node Exporter
Gathers hardware and operating system metrics exposed by the kernel.
Prometheus Relay
Adds a proxy layer to Prometheus to merge the results from underlay
Prometheus servers to prevent gaps in case some data is missing on
some servers. Is available only in the HA StackLight mode.
Pushgateway
Enables ephemeral and batch jobs to expose their metrics
to Prometheus. Since these jobs may not exist long enough to be
scraped, they can instead push their metrics to Pushgateway, which
then exposes these metrics to Prometheus. Pushgateway is not an
aggregator or a distributed counter but rather a metrics cache.
The pushed metrics are exactly the same as scraped from a
permanently running program.
Salesforce notifier
Enables sending Alertmanager notifications to Salesforce to allow
creating Salesforce cases and closing them once the alerts are resolved.
Disabled by default.
Salesforce reporter
Queries Prometheus for the data about the amount of vCPU, vRAM, and
vStorage used and available, combines the data, and sends it to
Salesforce daily. Mirantis uses the collected data for further analysis
and reports to improve the quality of customer support. Disabled by
default.
Telegraf
Collects metrics from the system. Telegraf is plugin-driven and has
the concept of two distinct sets of plugins: input plugins collect
metrics from the system, services, or third-party APIs; output plugins
write and expose metrics to various destinations.
The Telegraf agents used in Container Cloud include:
telegraf-ds-smart monitors SMART disks, and runs on both
management and managed clusters.
telegraf-ironic monitors Ironic on the baremetal-based
management clusters. The ironic input plugin collects and
processes data from the Ironic HTTP API, while the http_response
input plugin checks the Ironic HTTP API availability. As an output plugin,
to expose the collected data as a Prometheus target, Telegraf uses
prometheus.
telegraf-docker-swarm gathers metrics from the Mirantis Container
Runtime API about the Docker nodes, networks, and Swarm services. This
is a Docker Telegraf input plugin with downstream additions.
Telemeter
Enables a multi-cluster view through a Grafana dashboard of the
management cluster. Telemeter includes a Prometheus federation push
server and clients to enable isolated Prometheus instances, which
cannot be scraped from a central Prometheus instance, to push metrics
to the central location.
The Telemeter services are distributed as follows:
Management cluster hosts the Telemeter server
Regional clusters host the Telemeter server and Telemeter client
Managed clusters host the Telemeter client
The metrics from managed clusters are aggregated on regional clusters.
Then both regional and managed clusters metrics are sent from regional
clusters to the management cluster.
Every Helm chart contains a default values.yaml file. These default values
are partially overridden by custom values defined in the StackLight Helm chart.
Before deploying a managed cluster, you can select the HA or non-HA StackLight
architecture type. The non-HA mode is set by default. On the management and
regional clusters, StackLight is deployed in the HA mode only.
The following table lists the differences between the HA and non-HA modes:
One persistent volume is provided for storing data. In case of a service
or node failure, a new pod is redeployed and the volume is reattached to
provide the existing data. Such a setup has a reduced hardware footprint
but provides less performance.
Local Volume Provisioner is used to provide local host storage. In case
of a service or node failure, the traffic is automatically redirected to
any other running Prometheus or Elasticsearch server. For
better performance, Mirantis recommends that you deploy StackLight in
the HA mode.
Depending on the Container Cloud cluster type and selected StackLight database
mode, StackLight is deployed on the following number of nodes:
Starting from Container Cloud 2.16.0, Elasticsearch has switched to
OpenSearch and Kibana has switched to OpenSearch Dashboards. For details,
see Elasticsearch switch to OpenSearch.
StackLight provides five web UIs including Prometheus, Alertmanager, Alerta,
Kibana, and Grafana. Access to StackLight web UIs is protected by
Keycloak-based Identity and access management (IAM). All web UIs except Alerta
are exposed to IAM through the IAM proxy middleware. The Alerta configuration
provides direct integration with IAM.
The following diagram illustrates accessing the IAM-proxied StackLight web UIs,
for example, Prometheus web UI:
Authentication flow for the IAM-proxied StackLight web UIs:
A user enters the public IP of a StackLight web UI, for example, Prometheus
web UI.
The public IP leads to IAM proxy, deployed as a Kubernetes LoadBalancer,
which protects the Prometheus web UI.
LoadBalancer routes the HTTP request to Kubernetes internal IAM proxy
service endpoints, specified in the X-Forwarded-Proto or X-Forwarded-Host
headers.
The Keycloak login form opens (the login_url field in the IAM proxy
configuration, which points to Keycloak realm) and the user enters
the user name and password.
Keycloak validates the user name and password.
The user obtains access to the Prometheus web UI (the upstreams field
in the IAM proxy configuration).
Note
The discovery URL is the URL of the IAM service.
The upstream URL is the hidden endpoint of a web UI (Prometheus web UI in
the example above).
The following diagram illustrates accessing the Alerta web UI:
Authentication flow for the Alerta web UI:
A user enters the public IP of the Alerta web UI.
The public IP leads to Alerta deployed as a Kubernetes LoadBalancer type.
LoadBalancer routes the HTTP request to the Kubernetes internal Alerta
service endpoint.
The Keycloak login form opens (Alerta refers to the IAM realm) and
the user enters the user name and password.
Using the Mirantis Container Cloud web UI,
at the pre-deployment stage of a managed cluster,
you can view, enable or disable, or tune the following available
StackLight features:
StackLight HA mode.
Database retention size and time for Prometheus.
Tunable index retention period for Elasticsearch.
Tunable PersistentVolumeClaim (PVC) size for Prometheus and Elasticsearch,
set to 16 GB for Prometheus and 30 GB for Elasticsearch by
default. The PVC size must be logically aligned with the retention periods or
sizes for these components.
Email and Slack receivers for the Alertmanager notifications.
Predefined set of dashboards.
Predefined set of alerts and capability to add
new custom alerts for Prometheus in the following exemplary format:
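A minimal sketch of such a custom alert, assuming the alerts are listed under a customAlerts parameter and use the standard Prometheus alerting rule fields; the alert name, expression, and labels below are examples only:
customAlerts:
  - alert: ExampleTargetDown
    expr: up{job="example-job"} == 0
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: Example target is down
      description: The example-job target has been unreachable for 5 minutes.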
StackLight measures, analyzes, and reports in a timely manner the failures
that may occur in the following Mirantis Container Cloud
components and their sub-components, if any:
The data collected and transmitted through an encrypted channel back to
Mirantis provides our Customer Success Organization information to better
understand the operational usage patterns our customers are experiencing
as well as to provide feedback on product usage statistics to enable our
product teams to enhance our products and services for our customers.
The node-level resource data are broken down into three broad categories:
Cluster, Node, and Namespace. The telemetry data tracks Allocatable,
Capacity, Limits, Requests, and actual Usage of node-level resources.
StackLight components, which require external access, automatically use the
same proxy that is configured for Mirantis Container Cloud clusters. Therefore,
you only need to configure proxy during deployment of your management,
regional, or managed clusters. No additional actions are required to set up
proxy for StackLight. For more details about implementation of proxy support in
Container Cloud, see Proxy and cache support.
Note
Proxy handles only the HTTP and HTTPS traffic. Therefore, for
clusters with limited or no Internet access, it is not possible to set up
Alertmanager email notifications, which use SMTP, when proxy is used.
Proxy is used for the following StackLight components:
Component
Cluster type
Usage
Alertmanager
Any
As a default http_config
for all HTTP-based receivers except the predefined HTTP-alerta and
HTTP-salesforce. For these receivers, http_config is overridden on
the receiver level.
Metric collector
Management
To send outbound cluster metrics to Mirantis.
Salesforce notifier
Any
To send notifications to the Salesforce instance.
Salesforce reporter
Any
To send metric reports to the Salesforce instance.
Telemeter client
Regional
To send all metrics from the clusters of a region, including the managed
and regional clusters, to the management cluster. Proxy is not used for
the Telemeter client on managed clusters because managed clusters must
have direct access to their regional cluster.
Using Mirantis Container Cloud, you can deploy a Mirantis Kubernetes Engine
(MKE) cluster on bare metal, OpenStack, Microsoft Azure, VMware vSphere,
Equinix Metal, or Amazon Web Services (AWS).
Each cloud provider requires corresponding resources.
Note
Using the free Mirantis license, you can create up to three
Container Cloud managed clusters
with three worker nodes on each cluster.
Within the same quota, you can also attach existing
MKE clusters that are not deployed by Container Cloud.
If you need to increase this quota,
contact Mirantis support for further details.
A bootstrap node is necessary only to deploy the management cluster.
When the bootstrap is complete, the bootstrap node can be
redeployed and its resources can be reused
for the managed cluster workloads.
The minimum reference system requirements of a baremetal-based bootstrap
seed node are described in System requirements for the seed node.
The minimum reference system requirements for a bootstrap node for other
supported Container Cloud providers are as follows:
Any local machine on Ubuntu 20.04 that requires access to the provider API
with the following configuration:
2 vCPUs
4 GB of RAM
5 GB of available storage
Docker version currently available for Ubuntu 20.04
Internet access for downloading all required artifacts
Note
For the vSphere cloud provider, you can use RHEL 7.9 as the
operating system for the bootstrap node. The system requirements
are the same as for Ubuntu.
Note
For the Equinix Metal cloud provider with private networks, a
bootstrap node must be attached to the VLAN that will be used
to deploy a management cluster.
If you use a firewall or proxy, make sure that the bootstrap, management,
and regional clusters have access to the following IP ranges and domain names:
The following hardware configuration is used as a reference to deploy
Mirantis Container Cloud with bare metal Container Cloud clusters with
Mirantis Kubernetes Engine.
Reference hardware configuration for Container Cloud
management and managed clusters on bare metal
Three manager nodes for HA and three worker/storage
nodes for a minimal Ceph cluster. For more details about Ceph
requirements, see Management cluster storage.
A management cluster requires 2 volumes for Container Cloud
(total 50 GB) and 5 volumes for StackLight (total 60 GB).
A managed cluster requires 5 volumes for StackLight.
The seed node is necessary only to deploy the management cluster.
When the bootstrap is complete, the bootstrap node can be
redeployed and its resources can be reused
for the managed cluster workloads.
The minimum reference system requirements for a baremetal-based bootstrap
seed node are as follows:
Basic server on Ubuntu 20.04 with the following configuration:
Kernel version 4.15.0-76.86 or later
8 GB of RAM
4 CPU
10 GB of free disk space for the bootstrap cluster cache
No DHCP or TFTP servers on any NIC networks
Routable access to the IPMI network of the hardware servers. For more details, see
Host networking.
Internet access for downloading all required artifacts
The following diagram illustrates the physical and virtual L2 underlay
networking schema for the final state of the Mirantis Container Cloud
bare metal deployment.
The network fabric reference configuration is a spine/leaf with 2 leaf ToR
switches and one out-of-band (OOB) switch per rack.
Reference configuration uses the following switches for ToR and OOB:
Cisco WS-C3560E-24TD with 24 x 1 GbE ports. Used in the OOB network
segment.
Dell Force 10 S4810P with 48 x 1/10 GbE ports. Used as ToR in the Common/PXE
network segment.
In the reference configuration, all odd interfaces from NIC0 are connected
to TORSwitch1, and all even interfaces from NIC0 are connected
to TORSwitch2. The Baseboard Management Controller (BMC) interfaces
of the servers are connected to OOBSwitch1.
The following recommendations apply to all types of nodes:
Use the Link Aggregation Control Protocol (LACP) bonding mode
with MC-LAG domains configured on leaf switches. This corresponds to
the 802.3ad bond mode on hosts.
Use ports from different multi-port NICs when creating bonds. This makes
network connections redundant if failure of a single NIC occurs.
Configure the ports that connect servers to the PXE network with PXE VLAN
as native or untagged. On these ports, configure LACP fallback to ensure
that the servers can reach DHCP server and boot over network.
The management cluster requires a minimum of three storage devices per node.
Each device is used for a different type of storage.
The first device is always used for boot partitions and the root
file system. SSD is recommended. RAID device is not supported.
One storage device per server is reserved for local persistent
volumes. These volumes are served by the Local Storage Static Provisioner
(local-volume-provisioner) and used by many services of Container Cloud.
At least one disk per server must be configured
as a device managed by a Ceph OSD.
The recommended number of Ceph OSDs per management cluster node is
2 OSDs per node, for a total of 6 OSDs. The recommended replication
factor of 3 ensures that no data is lost if any single node
of the management cluster fails.
While planning the deployment of an OpenStack-based Mirantis Container Cloud
cluster with Mirantis Kubernetes Engine (MKE), consider the following general
requirements:
Kubernetes on OpenStack requires the Cinder and Octavia APIs availability.
The only supported OpenStack networking is Open vSwitch. Other networking
technologies, such as Tungsten Fabric, are not supported.
Container Cloud is developed and tested on OpenStack Queens.
If you use a firewall or proxy, make sure that the bootstrap, management,
and regional clusters have access to the following IP ranges and domain names:
mirror.mirantis.com and repos.mirantis.com for packages
binary.mirantis.com for binaries and Helm charts
mirantis.azurecr.io and *.blob.core.windows.net for Docker images
mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry
(port 443 if proxy is enabled)
mirantis.my.salesforce.com for Salesforce alerts
Note
Access to Salesforce is required from any Container Cloud
cluster type.
If any additional Alertmanager notification receiver is enabled,
for example, Slack, its endpoint must also be accessible
from the cluster.
Requirements for an OpenStack-based Container Cloud cluster
Resource
Management or regional cluster
Managed cluster
Comments
# of nodes
3 (HA) + 1 (Bastion)
5 (6 with StackLight HA)
A bootstrap cluster requires access to the OpenStack API.
Each management or regional cluster requires 3 nodes for the manager nodes HA.
Adding more than 3 nodes to a management or regional cluster is not supported.
A managed cluster requires 3 manager nodes for HA and 2 worker nodes for the
Container Cloud workloads. If the multiserver mode is enabled for StackLight,
3 worker nodes are required for workloads.
Each management or regional cluster requires 1 node for the Bastion instance
that is created with a public IP address to allow SSH access to instances.
# of vCPUs per node
8
8
The Bastion node requires 1 vCPU.
Refer to the RAM recommendations described below to plan resources
for different types of nodes.
RAM in GB per node
24
16
To prevent issues with low RAM, Mirantis recommends the following types
of instances for a managed cluster with 50-200 nodes:
16 vCPUs and 32 GB of RAM - manager node
16 vCPUs and 128 GB of RAM - nodes where the StackLight server components run
The Bastion node requires 1 GB of RAM.
Storage in GB per node
120
120
For the Bastion node, the default amount of storage is enough.
While planning the deployment of an AWS-based Mirantis Container Cloud cluster
with Mirantis Kubernetes Engine, consider the requirements described below.
Some of the AWS features required for Container Cloud
may not be included into your AWS account quota.
Therefore, carefully consider the AWS fees
applied to your account that may increase
for the Container Cloud infrastructure.
If you use a firewall or proxy, make sure that the bootstrap, management,
and regional clusters have access to the following IP ranges and domain names:
Requirements for an AWS-based Container Cloud cluster
Resource
Management or regional cluster
Managed cluster
Comment
# of nodes
3 (HA)
5 (6 with StackLight HA)
A management cluster requires 3 nodes for the manager nodes HA. Adding
more than 3 nodes to a management or regional cluster is not supported.
A managed cluster requires 3 manager nodes for HA and 2 worker nodes for the
Container Cloud workloads. If the multiserver mode is enabled for StackLight,
3 worker nodes are required for workloads.
# of vCPUs per node
8
8
RAM in GB per node
24
16
Storage in GB per node
120
120
Operating system
Ubuntu 20.04
Ubuntu 20.04
For a management and managed cluster, a base Ubuntu 20.04 image is required.
MCR
20.10.8
20.10.8
Mirantis Container Runtime (MCR) is deployed by Container Cloud as a
Container Runtime Interface (CRI) instead of Docker Engine.
Instance type
c5.4xlarge
c5.2xlarge
To prevent issues with low RAM, Mirantis recommends the following types
of instances for a managed cluster with 50-200 nodes:
c5.4xlarge - manager node
r5.4xlarge - nodes where the StackLight server components run
Starting from Container Cloud 2.17.0, the /var/lib/docker Docker
data is located on the same EBS volume drive as the operating system.
Warning
Do not stop the AWS instances dedicated to the Container Cloud
clusters to prevent data failure and cluster disaster.
Bastion host instance type
t2.micro
t2.micro
The Bastion instance is created with a public Elastic IP address
to allow SSH access to instances.
# of volumes
7 (total 110 GB)
5 (total 60 GB)
A management cluster requires 2 volumes for Container Cloud (total 50 GB)
and 5 volumes for StackLight (total 60 GB)
A managed cluster requires 5 volumes for StackLight
# of Elastic load balancers to be used
10
6
Elastic LBs for a management cluster: 1 for Kubernetes, 4 for Container Cloud,
5 for StackLight
Elastic LBs for a managed cluster: 1 for Kubernetes and 5 for StackLight
While planning the deployment of an Azure-based Mirantis Container Cloud
cluster with Mirantis Kubernetes Engine, consider the requirements
described below.
Some of the Azure features required for Container Cloud
may not be included into your Azure account quota.
Therefore, carefully consider the Azure fees
applied to your account that may increase
for the Container Cloud infrastructure.
If you use a firewall or proxy, make sure that the bootstrap, management,
and regional clusters have access to the following IP ranges and domain names:
mirror.mirantis.com and repos.mirantis.com for packages
binary.mirantis.com for binaries and Helm charts
mirantis.azurecr.io and *.blob.core.windows.net for Docker images
mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry
(port 443 if proxy is enabled)
mirantis.my.salesforce.com for Salesforce alerts
Note
Access to Salesforce is required from any Container Cloud
cluster type.
If any additional Alertmanager notification receiver is enabled,
for example, Slack, its endpoint must also be accessible
from the cluster.
Requirements for an Azure-based Container Cloud cluster
Resource
Management or regional cluster
Managed cluster
Comment
# of nodes
3 (HA)
5 (6 with StackLight HA)
A management cluster requires 3 nodes for the manager nodes HA. Adding
more than 3 nodes to a management or regional cluster is not supported.
A managed cluster requires 3 manager nodes for HA and 2 worker nodes for the
Container Cloud workloads. If the multiserver mode is enabled for StackLight,
3 worker nodes are required for workloads.
# of vCPUs per node
8
8
RAM in GB per node
24
16
Storage in GB per node
128
128
Operating system
Ubuntu 20.04
Ubuntu 20.04
For a management, regional and managed cluster, a base Ubuntu 20.04 image
is required.
MCR
20.10.8
20.10.8
Mirantis Container Runtime (MCR) is deployed by Container Cloud as a
Container Runtime Interface (CRI) instead of Docker Engine.
Virtual Machine size
Standard_F16s_v2
Standard_F8s_v2
To prevent issues with low RAM, Mirantis recommends selecting Azure
virtual machine sizes that meet the following minimum requirements
for managed clusters:
16 GB RAM (24 GB RAM for a cluster with 50-200 nodes)
8 CPUs
Ephemeral OS drive supported
OS drive size is more than 128 GB
# of Azure resource groups
1
1
# of Azure networks
1
1
# of Azure subnets
1
1
# of Azure security groups
1
1
# of Azure network interfaces
3
One network interface per each machine
# of Azure route tables
1
1
# of Azure load balancers to be used
2
2
1 load balancer for an API server and 1 for Kubernetes services
# of public IP addresses to be used
12 (management) / 9 (regional)
8
Management cluster: 10 public IPs for Kubernetes services and 2
public IPs as front-end IPs for load balancers
Regional cluster: 7 public IPs for Kubernetes services and 2
public IPs as front-end IPs for load balancers
Managed cluster: 6 public IPs for Kubernetes services and 2 public
IPs as front-end IPs for load balancers
# of OS disks
3
1 OS disk per each machine
# of data disks
0
5 (total 60 GB)
A managed cluster requires 5 volumes for StackLight
While planning the deployment of a Mirantis Container Cloud cluster with MKE
that is based on the Equinix Metal cloud provider,
consider the requirements described below.
Mirantis supports deploying of clusters on Equinix Metal in two modes: with
public or private networks. The deployment mode for management and managed
clusters must be the same. For details on the private networks mode, see
Equinix Metal with private networking.
For the Equinix Metal cloud provider with private networks, a
bootstrap node must be attached to the VLAN that will be used
to deploy a management cluster.
If you want to deploy an Equinix Metal based managed cluster with public
networks on top of an
AWS management cluster, also refer to Requirements for an AWS-based cluster.
If you use a firewall or proxy, make sure that the bootstrap, management,
and regional clusters have access to the following IP ranges and domain names:
mirror.mirantis.com and repos.mirantis.com for packages
binary.mirantis.com for binaries and Helm charts
mirantis.azurecr.io and *.blob.core.windows.net for Docker images
mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry
(port 443 if proxy is enabled)
mirantis.my.salesforce.com for Salesforce alerts
Note
Access to Salesforce is required from any Container Cloud
cluster type.
If any additional Alertmanager notification receiver is enabled,
for example, Slack, its endpoint must also be accessible
from the cluster.
Requirements for an Equinix Metal based Container Cloud cluster
Resource
Management or regional cluster
Managed cluster
Comment
# of nodes
3 (HA)
5 (6 with StackLight HA)
A management cluster requires 3 nodes for the manager nodes HA. Adding
more than 3 nodes to a management or regional cluster is not supported.
A managed cluster requires 3 manager nodes for HA and 2 worker nodes for the
Container Cloud workloads. If the multiserver mode is enabled for StackLight,
3 worker nodes are required for workloads.
# of vCPUs per node
8
8
RAM in GB per node
24
16
Operating system
Ubuntu 20.04
Ubuntu 20.04
MCR
20.10.8
20.10.8
Mirantis Container Runtime (MCR) is deployed by Container Cloud as a
Container Runtime Interface (CRI) instead of Docker Engine.
Server type
c3.small.x86
c3.small.x86
Most available Equinix Metal servers are configured with minimal
requirements to deploy Container Cloud clusters. However, ensure that
the selected Equinix Metal server type meets the following
minimal requirements for a managed cluster:
16 GB RAM
8 CPUs
2 storage devices with more than 120 GB each
Warning
If the Equinix Metal data center does not have enough capacity,
the server provisioning request will fail. Servers
of particular types can be unavailable at a given time.
Therefore, before you deploy a cluster, verify that the
selected server type is available as described in
Verify the capacity of the Equinix Metal facility.
Elastic IPs for a management cluster: 1 for Kubernetes, 5 for Container Cloud,
6 for StackLight
Elastic IPs for a managed cluster: 1 for Kubernetes and 5 for StackLight
Elastic IPs are not needed for clusters with private networks
# of IP addresses for a cluster with private networks
12
5
Managed cluster requires 5 IPs for StackLight
Management cluster requires IPs for the following services:
6 for StackLight
2 for IAM
2 for Ironic
1 for mcc-cache
1 for UI
# VLANs for a cluster with private networks
1
1
Each cluster deployed on Equinix Metal with private networks requires
1 separate VLAN.
Ceph nodes
-
See comments
Recommended minimal number of Ceph node roles, depending on the number of
Storage nodes:
1-2 Storage nodes: 1 Manager and Monitor node
3-500 Storage nodes: 3 Manager and Monitor nodes (for HA)
> 500 Storage nodes: 5 Manager and Monitor nodes
If you select Manual Ceph Configuration during the cluster
creation, you can manually configure Ceph roles for each machine in the
cluster following the recommended minimal number of Ceph node roles.
Otherwise, the Equinix Metal cloud provider automatically configures
Ceph roles: all control plane machines are configured with the
Storage and Manager and Monitor roles, and all
worker machines are configured with the Storage role.
If you use a firewall or proxy, make sure that the bootstrap, management,
and regional clusters have access to the following IP ranges and domain names:
mirror.mirantis.com and repos.mirantis.com for packages
binary.mirantis.com for binaries and Helm charts
mirantis.azurecr.io and *.blob.core.windows.net for Docker images
mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry
(port 443 if proxy is enabled)
mirantis.my.salesforce.com for Salesforce alerts
Note
Access to Salesforce is required from any Container Cloud
cluster type.
If any additional Alertmanager notification receiver is enabled,
for example, Slack, its endpoint must also be accessible
from the cluster.
Requirements for a vSphere-based Container Cloud cluster
Resource
Management cluster
Managed cluster
Comments
# of nodes
3 (HA)
5 (6 with StackLight HA)
A bootstrap cluster requires access to the vSphere API.
A management cluster requires 3 nodes for the manager nodes HA. Adding
more than 3 nodes to a management or regional cluster is not supported.
A managed cluster requires 3 manager nodes for HA and 2 worker nodes for the
Container Cloud workloads. If the multiserver mode is enabled for StackLight,
3 worker nodes are required for workloads.
# of vCPUs per node
8
8
Refer to the RAM recommendations described below to plan resources
for different types of nodes.
RAM in GB per node
24
16
To prevent issues with low RAM, Mirantis recommends the following VM
templates for a managed cluster with 50-200 nodes:
16 vCPUs and 32 GB of RAM - manager node
16 vCPUs and 128 GB of RAM - nodes where the StackLight server components run
Storage in GB per node
120
120
The listed amount of disk space must be available as a shared
datastore of any type, for example, NFS or vSAN, mounted on all
hosts of the vCenter cluster.
For a management and managed cluster, a base OS VM template
must be present in the VMware VM templates folder
available to Container Cloud. For details about the template,
see Prepare the OVF template.
This license type allows running unlimited guests inside one hypervisor.
The number of licenses is equal to the number of hypervisors in
vCenter Server that will be used to host RHEL-based machines.
Container Cloud schedules machines according to the scheduling rules
applied to vCenter Server. Therefore, make sure that your
Red Hat Customer Portal account has enough licenses for the allowed
hypervisors.
MCR
20.10.8
20.10.8
Mirantis Container Runtime (MCR) is deployed by Container Cloud as a
Container Runtime Interface (CRI) instead of Docker Engine.
VMware vSphere version
7.0, 6.7
7.0, 6.7
cloud-init version
19.4 for RHEL/CentOS 7.9, 20.3 for RHEL 8.4 TechPreview
19.4 for RHEL/CentOS 7.9, 20.3 for RHEL 8.4 TechPreview
A shared datastore must be mounted on all hosts of the
vCenter cluster. Combined with Distributed Resources Scheduler (DRS),
it ensures that the VMs are dynamically scheduled to the cluster hosts.
RHEL 7.8 deployment is possible with allowed access to the
rhel-7-server-rpms repository provided by the Red Hat Enterprise
Linux Server 7 x86_64.
Verify that your RHEL license or activation key meets this requirement.
CentOS 7.9 and RHEL 8.4 deployments are available as Technology Preview.
Use this configuration for testing and evaluation purposes only.
A Container Cloud cluster based on both RHEL and CentOS operating
systems or on mixed RHEL versions is not supported.
StackLight requirements for an MKE attached cluster
Note
Attachment of MKE clusters is tested on the following operating
systems:
Ubuntu 20.04
RHEL 7.9
CentOS 8 and 7.9
While planning the attachment of an existing Mirantis Kubernetes Engine (MKE)
cluster that is not deployed by Container Cloud, consider the following cluster
size requirements for StackLight. Depending on the following specific
StackLight HA and logging settings, use the example size guidelines below:
The non-HA mode - StackLight services are installed on a minimum of one node
with the StackLight label (StackLight nodes) with no redundancy
using Persistent Volumes (PVs) from the default storage class to store data.
Metric collection agents are installed on each node (Other nodes).
The HA mode - StackLight services are installed on a minimum of three nodes
with the StackLight label (StackLight nodes) with redundancy
using PVs provided by Local Volume Provisioner to store data. Metric
collection agents are installed on each node (Other nodes).
Logging enabled - the Enable logging option is turned on, which
enables the Elasticsearch cluster to store
infrastructure logs.
Logging disabled - the Enable logging option is turned off. In
this case, StackLight will not install Elasticsearch and will not
collect infrastructure logs.
LoadBalancer (LB) Services support is required to provide external access
to StackLight web UIs.
StackLight requirements for an attached MKE cluster, with logging enabled:
In the non-HA mode, StackLight components are bound to the nodes labeled
with the StackLight label. If there are no nodes labeled, StackLight
components will be scheduled to all schedulable worker nodes until the
StackLight label(s) are added. The requirements presented in the
table for the non-HA mode are summarized requirements for all StackLight
nodes.
If you require all Internet access to go through a proxy server
for security and audit purposes, you can bootstrap management and regional
clusters using proxy. The proxy server settings consist of three standard
environment variables that are set prior to the bootstrap process:
HTTP_PROXY
HTTPS_PROXY
NO_PROXY
These settings are not propagated to managed clusters. However, you can enable
a separate proxy access on a managed cluster using the Container Cloud web UI.
This proxy is intended for the end user needs and is not used
for a managed cluster deployment or for access to the Mirantis resources.
Caution
Since Container Cloud uses the OpenID Connect (OIDC) protocol
for IAM authentication, management clusters require
a direct non-proxy access from regional and managed clusters.
StackLight components, which require external access, automatically use the
same proxy that is configured for Container Cloud clusters.
On the managed clusters with limited Internet access, a proxy is required for
StackLight components that use HTTP and HTTPS and are disabled by default but
need external access if enabled, for example, for the Salesforce integration
and Alertmanager notifications external rules.
For more details about proxy implementation in StackLight, see StackLight proxy.
For the list of Mirantis resources and IP addresses to be accessible
from the Container Cloud clusters, see Hardware and system requirements.
The Container Cloud managed clusters are deployed without direct Internet
access in order to consume less Internet traffic in your cloud.
The Mirantis artifacts used during managed clusters deployment are downloaded
through a cache running on a regional cluster.
The feature is enabled by default on new managed clusters
and will be automatically enabled on existing clusters during upgrade
to the latest version.
Caution
IAM operations require a direct non-proxy access
of a managed cluster to a management cluster.
To ensure the Mirantis Container Cloud stability in managing
the Container Cloud-based Mirantis Kubernetes Engine (MKE) clusters,
the following MKE API functionality is not available for the
Container Cloud-based MKE clusters as compared to the attached MKE clusters
that are not deployed by Container Cloud.
Use the Container Cloud web UI or CLI for this functionality instead.
Public APIs limitations in a Container Cloud-based MKE cluster
API endpoint
Limitation
GET /swarm
Swarm Join Tokens are filtered out for all users, including admins.
PUT /api/ucp/config-toml
All requests are forbidden.
POST /nodes/{id}/update
Requests for the following changes are forbidden:
Change Role
Add or remove the com.docker.ucp.orchestrator.swarm and
com.docker.ucp.orchestrator.kubernetes labels.
The bare metal management system enables the Infrastructure Operator
to deploy Mirantis Container Cloud on a set of bare metal
servers.
It also enables Container Cloud to deploy managed clusters on bare metal
servers without a pre-provisioned operating system.
The Infrastructure Operator performs the following steps to install
Container Cloud in a bare metal environment:
The baremetal-based Container Cloud does not manage
the underlay networking fabric but requires
specific network configuration to operate.
Install Ubuntu 20.04 on one of the bare metal machines to create a seed
node and copy the bootstrap tarball to this node.
Obtain the Mirantis license file that will be required during the bootstrap.
Create the deployment configuration files that include the bare metal hosts
metadata.
Validate the deployment templates using fast preflight.
Run the bootstrap script for the fully automated installation of the
management cluster onto the selected bare metal hosts.
Using the bootstrap script, the Container Cloud bare metal management
system prepares the seed node for the management cluster
and starts the deployment of Container Cloud itself. The bootstrap script
performs all necessary operations to perform the automated management cluster
setup. The deployment diagram below illustrates the bootstrap workflow
of a baremetal-based management cluster.
Install basic Ubuntu 20.04 server using standard installation images of the
operating system on the bare metal seed node.
Log in to the seed node that is running Ubuntu 20.04.
Prepare the system and network configuration:
Create a virtual bridge to connect to your PXE network on the
seed node. Use the following netplan-based configuration file
as an example:
# cat /etc/netplan/config.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: false
      dhcp6: false
  bridges:
    br0:
      addresses:
        # Please, adjust for your environment
        - 10.0.0.15/24
      dhcp4: false
      dhcp6: false
      # Please, adjust for your environment
      gateway4: 10.0.0.1
      interfaces:
        # Interface name may be different in your environment
        - ens3
      nameservers:
        addresses:
          # Please, adjust for your environment
          - 8.8.8.8
      parameters:
        forward-delay: 4
        stp: false
Apply the new network configuration using netplan:
sudo netplan apply
Verify the new network configuration:
sudo brctl show
Example of system response:
bridge name bridge id STP enabled interfaces
br0 8000.fa163e72f146 no ens3
Verify that the interface connected to the PXE network
belongs to the previously configured bridge.
Install the current Docker version available for Ubuntu 20.04:
sudo apt install docker.io
Grant your user access to the Docker daemon by adding it to the docker group:
sudo usermod -aG docker $USER
Log out and log in again to the seed node to apply the changes.
Verify that Docker is configured correctly and has access
to Container Cloud CDN. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
curl https://binary.mirantis.com"
The system output must contain a JSON response with no error messages.
In case of errors, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
Before you proceed to bootstrapping the management cluster on bare metal,
perform the following steps:
Verify that the seed node has direct access to the Baseboard Management
Controller (BMC) of each baremetal host. All target hardware nodes must
be in the poweroff state.
For example, using the IPMI tool:
ipmitool -I lanplus -H 'IPMI IP' -U 'IPMI Login' -P 'IPMI password' \
chassis power status
Example of system response:
Chassis Power is off
Verify that you configured each bare metal host as follows:
Enable the boot NIC support for UEFI load. Usually, at least the built-in
network interfaces support it.
Enable the UEFI-LAN-OPROM support in
BIOS -> Advanced -> PCI/PCIe.
Enable the IPv4-PXE stack.
Set the following boot order:
UEFI-DISK
UEFI-PXE
If your PXE network is not configured to use the first network interface,
fix the UEFI-PXE boot order to speed up node discovery
by selecting only one required network interface.
Power off all bare metal hosts.
Warning
Only one Ethernet port on a host must be connected to the
PXE network at any given time. The physical address
(MAC) of this interface must be noted and used to configure
the BareMetalHost object describing the host.
Prepare metadata and deploy the management cluster
Using the example procedure below, replace the addresses and credentials
in the configuration YAML files with the data from your environment.
Keep everything else as is, including the file names and YAML structure.
The overall network mapping scheme with all L2/L3 parameters, for example,
for a single 10.0.0.0/24 network, is described in the following table.
The configuration of each parameter indicated in this table is described
in the steps below.
Log in to your account and download the mirantis.lic license file.
Save the license file as mirantis.lic under the kaas-bootstrap
directory on the bootstrap node.
Verify that mirantis.lic contains the exact Container Cloud license
previously downloaded from www.mirantis.com
by decoding the license JWT token, for example, using jwt.io.
Example of a valid decoded Container Cloud license data with the mandatory
license field:
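For illustration only, the decoded data may look similar to the following sketch. Except for the mandatory license field, the field names and values below are placeholders:
{
  "exp": 1700000000,
  "iat": 1670000000,
  "customerID": "example-customer",
  "license": {
    "dev": false
  }
}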
Update the cluster definition template in
templates/bm/cluster.yaml.template
according to the environment configuration. Use the table below.
Manually set all parameters that start with SET_. For example,
SET_METALLB_ADDR_POOL.
The IP address of the externally accessible API endpoint
of the cluster. This address must NOT be
within the SET_METALLB_ADDR_POOL range but must be within
the PXE/Management network. External load balancers are not supported.
10.0.0.90
SET_METALLB_ADDR_POOL
The IP range to be used as external load balancers for the Kubernetes
services with the LoadBalancer type. This range must be within
the PXE/Management network. The minimum required range is 19 IP addresses.
The dnsmasq configuration options dhcp-option=3 and dhcp-option=6
are absent in the default configuration. So, by default, dnsmasq
will send the DNS server and default route to DHCP clients as defined in the
dnsmasq official documentation:
The netmask and broadcast address are the same as on the host
running dnsmasq.
The DNS server and default route are set to the address of the host
running dnsmasq.
If the domain name option is set, this name is sent to DHCP clients.
If such behavior is not desirable during the cluster deployment,
add the corresponding DHCP options, such as a specific gateway address
and DNS addresses, using the dnsmasq.dnsmasq_extra_opts parameter
for the baremetal-operator release in
templates/bm/cluster.yaml.template:
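A minimal sketch of such an override; the helmReleases nesting is an assumption about where the baremetal-operator values live in the template, and the gateway and DNS addresses are placeholders:
helmReleases:
  - name: baremetal-operator
    values:
      dnsmasq:
        dnsmasq_extra_opts:
          - dhcp-option=3,10.0.0.1
          - dhcp-option=6,8.8.8.8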
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where your cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/bm/cluster.yaml.template, add the ntp:servers section
with the list of required servers names:
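For example, a minimal sketch of the ntp section with placeholder server names; its exact location within the cluster template values may vary:
ntp:
  servers:
    - 0.pool.ntp.org
    - 1.pool.ntp.org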
Inspect the default bare metal host profile definition in
templates/bm/baremetalhostprofiles.yaml.template.
If your hardware configuration differs from the reference,
adjust the default profile to match. For details, see
Customize the default bare metal host profile.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
Update the bare metal hosts definition template in
templates/bm/baremetalhosts.yaml.template
according to the environment configuration. Use the table below.
Manually set all parameters that start with SET_.
SET_MACHINE_0_IPMI_USERNAME
The IPMI user name in the base64 encoding to access the BMC.
dXNlcg== (base64 encoded user)
SET_MACHINE_0_IPMI_PASSWORD
The IPMI password in the base64 encoding to access the BMC.
cGFzc3dvcmQ= (base64 encoded password)
SET_MACHINE_0_MAC
The MAC address of the first master node in the PXE network.
ac:1f:6b:02:84:71
SET_MACHINE_0_BMC_ADDRESS
The IP address of the BMC endpoint for the first master node in
the cluster. Must be an address from the OOB network
that is accessible through the PXE network default gateway.
192.168.100.11
SET_MACHINE_1_IPMI_USERNAME
The IPMI user name in the base64 encoding to access the BMC.
dXNlcg== (base64 encoded user)
SET_MACHINE_1_IPMI_PASSWORD
The IPMI password in the base64 encoding to access the BMC.
cGFzc3dvcmQ= (base64 encoded password)
SET_MACHINE_1_MAC
The MAC address of the second master node in the PXE network.
ac:1f:6b:02:84:72
SET_MACHINE_1_BMC_ADDRESS
The IP address of the BMC endpoint for the second master node in
the cluster. Must be an address from the OOB network
that is accessible through the PXE network default gateway.
192.168.100.12
SET_MACHINE_2_IPMI_USERNAME
The IPMI user name in the base64 encoding to access the BMC.
dXNlcg== (base64 encoded user)
SET_MACHINE_2_IPMI_PASSWORD
The IPMI password in the base64 encoding to access the BMC.
cGFzc3dvcmQ= (base64 encoded password)
SET_MACHINE_2_MAC
The MAC address of the third master node in the PXE network.
ac:1f:6b:02:84:73
SET_MACHINE_2_BMC_ADDRESS
The IP address of the BMC endpoint for the third master node in
the cluster. Must be an address from the OOB network
that is accessible through the PXE network default gateway.
You can obtain the base64-encoded user name and password using
the following command in your Linux console:
$ echo -n <username|password> | base64
Update the Subnet objects definition template in
templates/bm/ipam-objects.yaml.template
according to the environment configuration. Use the table below.
Manually set all parameters that start with SET_.
For example, SET_IPAM_POOL_RANGE.
The IP address of the externally accessible API endpoint
of the cluster. This address must NOT be
within the SET_METALLB_ADDR_POOL range but must be within the
PXE/Management network. External load balancers are not supported.
The IP address range to be used as external load balancers for the
Kubernetes services with the LoadBalancer type. This range must
be within the PXE/Management network. The minimum required range is
19 IP addresses.
Use the same value that you used for this parameter in the
cluster.yaml.template file (see above).
Optional. To configure the separated PXE and management networks instead of
one PXE/management network, proceed to Separate PXE and management networks.
Optional. To connect the cluster hosts to the PXE/Management
network using bond interfaces, proceed to Configure NIC bonding.
If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables to bootstrap
the cluster using proxy:
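For example, the bootstrap.env entries may look as follows; the proxy address and the exclusion list are placeholders to adjust for your environment:
HTTP_PROXY=http://proxy.example.com:3128
HTTPS_PROXY=http://proxy.example.com:3128
NO_PROXY=10.0.0.0/24,.example.com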
Optional. Technology Preview. Configure Ceph
controller to manage Ceph nodes resources. In
templates/bm/cluster.yaml.template, in the ceph-controller
section of spec.providerSpec.value.helmReleases, specify the
hyperconverge parameter with required resource requests, limits, or
tolerations:
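A minimal sketch of such a hyperconverge section with placeholder resource figures; tolerations, if required, follow the standard Kubernetes format, and the exact schema under hyperconverge may differ between product releases:
- name: ceph-controller
  values:
    hyperconverge:
      resources:
        requests:
          cpu: "1"
          memory: 2Gi
        limits:
          cpu: "2"
          memory: 4Gi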
Set up the disk configuration according to your hardware node
specification. Verify that the storageDevices section
has a valid list of HDD, SSD, or NVME device names and each
device is empty, that is, no file system is present on it.
...
# This part of KaaSCephCluster should contain valid networks definition
network:
  clusterNet: 10.10.10.0/24
  publicNet: 10.10.11.0/24
...
nodes:
  master-0:
    ...
  <node_name>:
    ...
    # This part of KaaSCephCluster should contain valid device names
    storageDevices:
      - name: sdb
        config:
          deviceClass: hdd
      # Each storageDevices dict can have several devices
      - name: sdc
        config:
          deviceClass: hdd
      # All devices for Ceph also should be described to ``wipe`` in
      # ``baremetalhosts.yaml.template``
      - name: sdd
        config:
          deviceClass: hdd
      # Do not include first devices here (like vda or sda)
      # because they will be allocated for the operating system
In machines.yaml.template, verify that the metadata:name
structure matches the machine names in the spec:nodes
structure of kaascephcluster.yaml.template.
Verify that the kaas-bootstrap directory contains the following files:
The provisioning IP address. This address will be assigned to the
interface of the seed node defined by the KAAS_BM_PXE_BRIDGE
parameter (see below). The PXE service of the bootstrap cluster will
use this address to network boot the bare metal hosts for the
cluster.
10.0.0.20
KAAS_BM_PXE_MASK
The CIDR prefix for the PXE network. It is used together with the KAAS_BM_PXE_IP
address when assigning it to the network interface.
24
KAAS_BM_PXE_BRIDGE
The PXE network bridge name. The name must match the name
of the bridge created on the seed node during the
Prepare the seed node stage.
br0
KAAS_BM_BM_DHCP_RANGE
The start_ip and end_ip addresses must be within the PXE network.
This range will be used by dnsmasq to provide IP addresses for nodes
during provisioning.
10.0.0.30,10.0.0.49,255.255.255.0
BOOTSTRAP_METALLB_ADDRESS_POOL
The pool of IP addresses that will be used by services
in the bootstrap cluster. Can be the same as the
SET_METALLB_ADDR_POOL range for the cluster, or a different range.
10.0.0.61-10.0.0.80
Run the verification preflight script to validate the deployment
templates configuration:
./bootstrap.sh preflight
The command outputs a human-readable report with the verification details.
The report includes the list of verified bare metal nodes and their
ChassisPower status.
This status is based on the deployment templates configuration used
during the verification.
Caution
If the report contains information about missing dependencies
or incorrect configuration, fix the issues before proceeding
to the next step.
The Keycloak URL that the system outputs when the bootstrap completes.
The admin password for Keycloak is located in
kaas-bootstrap/passwords.yml along with other IAM passwords.
Note
The Container Cloud web UI and StackLight endpoints are available
through Transport Layer Security (TLS) and communicate with Keycloak
to authenticate users. Keycloak is exposed using HTTPS and
self-signed TLS certificates that are not trusted by web browsers.
You can configure L2 templates for the management cluster to set up
a bond network interface for the PXE/Management network.
This configuration must be applied to the bootstrap templates,
before you run the bootstrap script to deploy the management
cluster.
Caution
This configuration requires each host in your management
cluster to have at least two physical interfaces.
Connect at least two interfaces per host to an Ethernet switch
that supports Link Aggregation Control Protocol (LACP)
port groups and LACP fallback.
Configure an LACP group on the ports connected
to the NICs of a host.
Configure the LACP fallback on the port group to ensure that
the host can boot over the PXE network before the bond interface
is set up on the host operating system.
Configure server BIOS for both NICs of a bond to be PXE-enabled.
If the server does not support booting from multiple NICs,
configure the port of the LACP group that is connected to the
PXE-enabled NIC of the server to be the primary port.
With this setting, the port becomes active in the fallback mode.
To configure a bond interface that aggregates two interfaces
for the PXE/Management network:
In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:
Configure only the following parameters for the declaration
of {{nic0}}, as shown in the example below:
dhcp4
dhcp6
match
set-name
Remove other parameters.
Add the declaration of the second NIC {{nic1}} to be added to the
bond interface:
Specify match:mac-address:{{mac1}} to match the MAC
of the desired NIC.
Specify set-name:{{nic1}} to ensure the correct name of the NIC.
Add the declaration of the bond interface bond0. It must have the
interfaces parameter listing both Ethernet interfaces.
Set the interfaces parameter of the k8s-lcm bridge to include
bond0.
Set the addresses, gateway4, and nameservers fields of
the k8s-lcm bridge to fetch data from the kaas-mgmt subnet.
Configure bonding options using the parameters field. The only
mandatory option is mode. See the example below for details.
Note
You can set any mode supported by netplan
and your hardware.
Verify your configuration using the following example:
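A hedged sketch of the resulting npTemplate fragment; the {{mac0}} macro,
the template functions, and the 802.3ad mode are assumptions based on the
steps above, and the kaas-mgmt subnet name comes from this procedure:
ethernets:
  {{nic0}}:
    dhcp4: false
    dhcp6: false
    match:
      macaddress: {{mac0}}
    set-name: {{nic0}}
  {{nic1}}:
    dhcp4: false
    dhcp6: false
    match:
      macaddress: {{mac1}}
    set-name: {{nic1}}
bonds:
  bond0:
    interfaces:
      - {{nic0}}
      - {{nic1}}
    parameters:
      mode: 802.3ad   # the only mandatory option; any mode supported by netplan works
bridges:
  k8s-lcm:
    interfaces:
      - bond0
    # The address, gateway, and nameservers are fetched from the kaas-mgmt
    # subnet; the exact template functions may differ in your release.
    addresses:
      - {{ip "k8s-lcm:kaas-mgmt"}}
    gateway4: {{gateway_from_subnet "kaas-mgmt"}}
    nameservers:
      addresses: {{nameservers_from_subnet "kaas-mgmt"}}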
This section describes the bare metal host profile settings and
instructs how to configure this profile before deploying
Mirantis Container Cloud on physical servers.
The bare metal host profile is a Kubernetes custom resource.
It allows the Infrastructure Operator to define how the storage devices
and the operating system are provisioned and configured.
The bootstrap templates for a bare metal deployment include the template for
the default BareMetalHostProfile object in the following file
that defines the default bare metal host profile:
templates/bm/baremetalhostprofiles.yaml.template
Note
Using BareMetalHostProfile, you can configure LVM or mdadm-based
software RAID support during a management or managed cluster
creation. For details, see Configure RAID support.
This feature is available as Technology Preview. Use such
configuration for testing and evaluation purposes only. For the
Technology Preview feature definition, refer to Technology Preview features.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
The customization procedure of BareMetalHostProfile is almost the same for
the management and managed clusters, with the following differences:
For a management cluster, the customization automatically applies
to machines during bootstrap. And for a managed cluster, you apply
the changes using kubectl before creating a managed cluster.
For a management cluster, you edit the default
baremetalhostprofiles.yaml.template. And for a managed cluster, you
create a new BareMetalHostProfile with the necessary configuration.
For the procedure details, see Create a custom bare metal host profile.
Use this procedure for both types of clusters considering the differences
described above.
This section describes how to configure a dedicated PXE network for a
management or regional bare metal cluster.
A separate PXE network allows isolating the sensitive bare metal provisioning
process from the end users. The users still have access to Container Cloud
services, such as Keycloak, to authenticate workloads in managed clusters,
such as Horizon in a Mirantis OpenStack for Kubernetes cluster.
The following table describes the overall network mapping scheme with all
L2/L3 parameters, for example, for two networks, PXE (CIDR 10.0.0.0/24)
and management (CIDR 10.0.11.0/24):
When using a separate PXE network, the management cluster services are exposed
in different networks using two separate MetalLB address pools:
Services exposed through the PXE network are as follows:
Ironic API (bare metal provisioning server)
HTTP server that provides images for network boot and server
provisioning
Caching server for accessing the Container Cloud artifacts deployed
on hosts
Services exposed through the management network are all other Container Cloud
services, such as Keycloak, web UI, and so on
To configure separate PXE and management networks:
In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:
Substitute all the Subnet object templates with the new ones
as described in the example template below
Update the L2 template spec.l3Layout and spec.npTemplate fields
as described in the example template below
Example of the Subnet object templates
# Subnet object that provides IP addresses for bare metal hosts of
# management cluster in the PXE network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-pxe
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
    kaas-mgmt-pxe-subnet: ""
spec:
  cidr: SET_IPAM_CIDR
  gateway: SET_PXE_NW_GW
  nameservers:
    - SET_PXE_NW_DNS
  includeRanges:
    - SET_IPAM_POOL_RANGE
  excludeRanges:
    - SET_METALLB_PXE_ADDR_POOL
---
# Subnet object that provides IP addresses for bare metal hosts of
# management cluster in the management network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-lcm
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
    kaas-mgmt-lcm-subnet: ""
    ipam/SVC-k8s-lcm: "1"
    ipam/SVC-ceph-cluster: "1"
    ipam/SVC-ceph-public: "1"
    cluster.sigs.k8s.io/cluster-name: CLUSTER_NAME
spec:
  cidr: {{SET_LCM_CIDR}}
  includeRanges:
    - {{SET_LCM_RANGE}}
  excludeRanges:
    - SET_LB_HOST
    - SET_METALLB_ADDR_POOL
---
# Subnet object that provides configuration for "services-pxe" MetalLB
# address pool that will be used to expose services LB endpoints in the
# PXE network.
apiVersion: "ipam.mirantis.com/v1alpha1"
kind: Subnet
metadata:
  name: mgmt-pxe-lb
  namespace: default
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
    ipam/SVC-MetalLB: ""
    metallb/address-pool-name: services-pxe
    metallb/address-pool-protocol: layer2
    cluster.sigs.k8s.io/cluster-name: CLUSTER_NAME
spec:
  cidr: SET_IPAM_CIDR
  includeRanges:
    - SET_METALLB_PXE_ADDR_POOL
The last Subnet template named mgmt-pxe-lb in the example above
will be used to configure the MetalLB address pool in the PXE network.
The bare metal provider will automatically configure MetalLB
with address pools using the Subnet objects identified by specific
labels.
Use the following labels to identify the Subnet object as a MetalLB
address pool and configure the name and protocol for that address pool.
All labels below are mandatory for the Subnet object that configures
a MetalLB address pool.
Mandatory Subnet labels for a MetalLB address pool¶
Label
Description
ipam/SVC-MetalLB
Defines that the Subnet object will be used to provide
a new address pool/range for MetalLB.
metallb/address-pool-name
Sets the name services-pxe for the newly created address pool.
The services-pxe address pool name is mandatory when configuring
a dedicated PXE network in the management cluster. This name will be
used in annotations for services exposed through the PXE network.
Every address pool must have a distinct name. The default name
is reserved for the address pool of the management network.
metallb/address-pool-protocol
Sets the address pool protocol.
The only supported value is layer2 (default).
cluster.sigs.k8s.io/cluster-name
Specifies the management or regional cluster name that
the Subnet should be bound to.
Caution
Do not set the same address pool name for two or more
Subnet objects. Otherwise, the corresponding MetalLB address pool
configuration fails with a warning message in the bare metal provider
log.
Verify the current MetalLB configuration that is stored in the
ConfigMap object:
kubectl -n metallb-system get cm metallb -o jsonpath={.data.config}
For the example configuration described above, the following lines must
appear in the ConfigMap for MetalLB:
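For illustration, the relevant fragment might look similar to the following;
the address ranges are placeholders taken from this example configuration:
address-pools:
- name: services-pxe
  protocol: layer2
  auto-assign: false
  addresses:
  - 10.0.0.61-10.0.0.70
- name: default
  protocol: layer2
  addresses:
  - 10.0.11.61-10.0.11.80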
The auto-assign parameter will be set to false for all address
pools except the default one. So, a particular service will get an
address from such an address pool only if the Service object has a
special metallb.universe.tf/address-pool annotation that points to
the specific address pool name.
Note
It is expected that every Container Cloud service on a management
and regional cluster will be assigned to one of the address pools.
Currently, two MetalLB address pools are used:
services-pxe is a reserved address pool name to use for
the Container Cloud services in the PXE network (Ironic API,
HTTP server, caching server)
default is an address pool to use for all other Container
Cloud services in the management network. No annotation
is required on the Service objects in this case.
In kaas-bootstrap/templates/bm/cluster.yaml.template,
add the dedicatedMetallbPools flag and set it to true:
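For example, a minimal sketch; the placement under spec:providerSpec:value
is an assumption based on the other cluster template parameters:
spec:
  providerSpec:
    value:
      # Enables separate MetalLB address pools for PXE and management services
      dedicatedMetallbPools: true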
Setting this flag enables splitting of the LB endpoints for the Container
Cloud services. The metallb.universe.tf/address-pool annotations on the
Service objects are configured by the bare metal provider automatically
when the dedicatedMetallbPools flag is set to true.
Example Service object configured by the baremetal-operator Helm
release:
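A minimal sketch of such a Service; the namespace and port are illustrative
assumptions, while the annotation is the part set by the provider:
apiVersion: v1
kind: Service
metadata:
  name: ironic-api
  namespace: kaas              # assumed namespace, for illustration only
  annotations:
    metallb.universe.tf/address-pool: services-pxe
spec:
  type: LoadBalancer
  ports:
    - name: api
      port: 6385               # default Ironic API port, shown for illustration
      targetPort: 6385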
The metallb.universe.tf/address-pool annotation on the Service
object is set to services-pxe by the baremetal provider, so the
ironic-api service will be assigned an LB address from the
corresponding MetalLB address pool.
Address of a management network for the management cluster
in the CIDR notation. You can later share this network with managed
clusters where it will act as the LCM network.
If managed clusters have their separate LCM networks,
those networks must be routable to the management network.
10.0.11.0/24
SET_LCM_RANGE
Address range that includes addresses to be allocated to
bare metal hosts in the management network for the management
cluster. When this network is shared with managed clusters,
the size of this range limits the number of hosts that can be
deployed in all clusters that share this network.
When this network is solely used by a management cluster,
the range should include at least 3 IP addresses
for bare metal hosts of the management cluster.
10.0.11.100-10.0.11.109
SET_METALLB_PXE_ADDR_POOL
Address range to be used for LB endpoints of the Container Cloud
services: Ironic-API, HTTP server, and caching server.
This range must be within the PXE network.
The minimum required range is 5 IP addresses.
Subnet template parameters migrated to management network¶
Parameter
Description
Example value
SET_LB_HOST
IP address of the externally accessible API endpoint
of the management cluster. This address must NOT be
within the SET_METALLB_ADDR_POOL range but within the
management network. External load balancers are not supported.
10.0.11.90
SET_METALLB_ADDR_POOL
The address range to be used for the externally accessible LB
endpoints of the Container Cloud services, such as Keycloak, web UI,
and so on. This range must be within the management network.
The minimum required range is 19 IP addresses.
Configure multiple DHCP ranges using Subnet resources¶
To facilitate multi-rack and other types of distributed bare metal datacenter
topologies, the dnsmasq DHCP server used for host provisioning in Container
Cloud supports working with multiple L2 segments through network routers that
support DHCP relay.
Caution
Networks used for hosts provisioning of a managed cluster
must have routes to the PXE network (when a dedicated PXE network
is configured) or to the combined PXE/management network
of the management cluster. This configuration enables hosts to
have access to the management cluster services that are used
during host provisioning.
To configure DHCP ranges for dnsmasq, create the Subnet objects
tagged with the ipam/SVC-dhcp-range label while setting up subnets
for a managed cluster using CLI.
For every dhcp-range record, Container Cloud also configures the
dhcp-option record to pass the default route through the default gateway
from the corresponding subnet to all hosts that obtain addresses
from that DHCP range. You can also specify DNS server addresses for servers
that boot over PXE. They will be configured by Container Cloud using another
dhcp-option record.
Note
The Subnet objects for DHCP ranges should not reference
any specific cluster, as DHCP server configuration is only
applicable to the management or regional cluster.
The kaas.mirantis.com/region label that specifies the region
will be used to determine where to apply the DHCP ranges from the
given Subnet object. The Cluster reference will be ignored.
The baremetal-operator chart allows using multiple DHCP ranges
in the dnsmasq.conf file. The chart iterates over a list
of the dhcp-range parameters from its values and adds all items
from the list to the dnsmasq configuration.
The baremetal-operator chart also allows using a single DHCP range
for backward compatibility. By default, the
KAAS_BM_BM_DHCP_RANGE environment variable is still used
to define the DHCP range for a management or regional cluster
nodes during provisioning.
The dnsmasq configuration options dhcp-option=3 and dhcp-option=6
are absent in the default configuration. So, by default, dnsmasq
will send the DNS server and default route to DHCP clients as defined in the
dnsmasq official documentation:
The netmask and broadcast address are the same as on the host
running dnsmasq.
The DNS server and default route are set to the address of the host
running dnsmasq.
If the domain name option is set, this name is sent to DHCP clients.
If such default behavior is not desirable during deployment of managed
clusters:
Open the management cluster spec for editing.
In the baremetal-operator release values, remove the
dnsmasq.dhcp_range parameter:
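For illustration, the release values fragment to remove might look similar
to the following; its exact placement within the cluster spec and the range
value are assumptions:
- name: baremetal-operator
  values:
    dnsmasq:
      dhcp_range: 10.0.0.30,10.0.0.49,255.255.255.0   # remove this parameter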
The DHCP range is set according to the cidr and includeRanges parameters
of the Subnet object. The mgmt-dhcp-range-0 tag is formed from the Subnet
object name and address range index within the Subnet object.
Optional, available when the nameservers parameter is set
in the Subnet object. The DNS server option is set according
to the nameservers parameter of the Subnet object.
The tag is the same as in the dhcp-range parameter.
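For illustration, the resulting dnsmasq.conf entries might look similar to
the following; the addresses are placeholders:
dhcp-range=set:mgmt-dhcp-range-0,10.0.0.30,10.0.0.49,255.255.255.0
dhcp-option=tag:mgmt-dhcp-range-0,option:router,10.0.0.1
dhcp-option=tag:mgmt-dhcp-range-0,option:dns-server,10.0.0.1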
Verify that the changes are applied to dnsmasq.conf:
kubectl --kubeconfig <pathToMgmtOrRegionalClusterKubeconfig> \
  -n kaas get cm dnsmasq-config -o json | jq -r '.data."dnsmasq.conf"'
For servers to access the DHCP server across the L2 segment boundaries,
for example, from another rack with a different VLAN for the PXE network,
you must configure the DHCP relay service on the border switch of the segment.
For example, on a top-of-rack (ToR) or leaf (distribution) switch, depending
on the data center network topology.
In Container Cloud, the dnsmasq server listens on the PXE interface of the
management cluster node.
To configure DHCP relay, you need to specify the address(es) of a
DHCP helper or the server that handles DHCP requests.
Depending on the PXE network setup, select from the following options:
If the PXE network is combined with the management network, identify LCM
addresses of the management cluster nodes:
kubectl -n default get lcmmachine -o wide
In the output, select the addresses from the INTERNALIP column to use
as the DHCP helper addresses.
If you use a dedicated PXE network, identify the addresses assigned
to your nodes using the corresponding IpamHost objects:
kubectl -n default get ipamhost -o yaml
In status.netconfigV2 of each management cluster host, obtain the
interface name used for PXE network and collect associated addresses to use
as the DHCP helper addresses. For example:
Log in to any personal computer or VM running Ubuntu 20.04
that you will be using as the bootstrap node.
If you use a newly created VM, run:
sudo apt-get update
Install the current Docker version available for Ubuntu 20.04:
sudo apt install docker.io
Grant your USER access to the Docker daemon:
sudo usermod -aG docker $USER
Log off and log in again to the bootstrap node to apply the changes.
Verify that Docker is configured correctly and has access
to Container Cloud CDN. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
  curl https://binary.mirantis.com"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
After you complete the prerequisite steps described in Prerequisites,
proceed with bootstrapping your OpenStack-based Mirantis Container Cloud
management cluster.
To bootstrap an OpenStack-based management cluster:
Log in to the bootstrap node running Ubuntu 20.04 that is configured
as described in Prerequisites.
Prepare the bootstrap script:
Download and run the Container Cloud bootstrap script:
Log in to your account and download the mirantis.lic license file.
Save the license file as mirantis.lic under the kaas-bootstrap
directory on the bootstrap node.
Verify that mirantis.lic contains the exact Container Cloud license
previously downloaded from www.mirantis.com
by decoding the license JWT token, for example, using jwt.io.
Example of a valid decoded Container Cloud license data with the mandatory
license field:
Verify access to the target cloud endpoint from Docker. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
  curl https://auth.openstack.example.com:5000/v3"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Configure the cluster and machines metadata:
In templates/machines.yaml.template,
modify the spec:providerSpec:value section for 3 control plane nodes
marked with the cluster.sigs.k8s.io/control-plane label
by substituting the flavor and image parameters
with the corresponding values of the control plane nodes in the related
OpenStack cluster. For example:
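A hedged sketch of the relevant fragment; the flavor and image names below
are placeholders for values that exist in your OpenStack cloud:
spec:
  providerSpec:
    value:
      # Placeholders: use a flavor and image available in your OpenStack cloud
      flavor: kaas.minimal
      image: focal-server-cloudimg-amd64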
The flavor parameter value provided in the example above
is cloud-specific and must meet the Container Cloud
requirements.
Also, modify other parameters as required.
Modify the templates/cluster.yaml.template parameters to fit your
deployment. For example, add the corresponding values for cidrBlocks
in the spec::clusterNetwork::services section.
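For example, a minimal sketch of the services network block; the CIDR below
is a placeholder:
spec:
  clusterNetwork:
    services:
      cidrBlocks:
        - 10.96.0.0/16    # placeholder service CIDR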
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where the management cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/cluster.yaml.template, add the ntp:servers section
with the list of the required server names:
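For example, a minimal sketch; the exact placement of the ntp section in the
cluster template may differ between providers and releases, and the server
names are placeholders:
ntp:
  servers:
    - 0.pool.ntp.org
    - 1.pool.ntp.org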
Optional. If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the management and regional cluster using proxy:
The Keycloak URL that the system outputs when the bootstrap completes.
The admin password for Keycloak is located in
kaas-bootstrap/passwords.yml along with other IAM passwords.
Note
The Container Cloud web UI and StackLight endpoints are available
through Transport Layer Security (TLS) and communicate with Keycloak
to authenticate users. Keycloak is exposed using HTTPS and
self-signed TLS certificates that are not trusted by web browsers.
Log in to any personal computer or VM running Ubuntu 20.04
that you will be using as the bootstrap node.
If you use a newly created VM, run:
sudo apt-get update
Install the current Docker version available for Ubuntu 20.04:
sudo apt install docker.io
Grant your USER access to the Docker daemon:
sudo usermod -aG docker $USER
Log off and log in again to the bootstrap node to apply the changes.
Verify that Docker is configured correctly and has access
to Container Cloud CDN. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
  curl https://binary.mirantis.com"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
After you complete the prerequisite steps described in Prerequisites,
proceed with bootstrapping your AWS-based Mirantis Container Cloud
management cluster.
To bootstrap an AWS-based management cluster:
Log in to the bootstrap node running Ubuntu 20.04 that is configured
as described in Prerequisites.
Prepare the bootstrap script:
Download and run the Container Cloud bootstrap script:
Log in to your account and download the mirantis.lic license file.
Save the license file as mirantis.lic under the kaas-bootstrap
directory on the bootstrap node.
Verify that mirantis.lic contains the exact Container Cloud license
previously downloaded from www.mirantis.com
by decoding the license JWT token, for example, using jwt.io.
Example of a valid decoded Container Cloud license data with the mandatory
license field:
The MKE license does not apply to mirantis.lic. For
details about MKE license, see MKE documentation.
Prepare the AWS deployment templates:
Verify access to the target cloud endpoint from Docker. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
  curl https://ec2.amazonaws.com"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Change the directory to the kaas-bootstrap folder.
In templates/aws/machines.yaml.template,
modify the spec:providerSpec:value section
by substituting the ami:id parameter with the corresponding value
for Ubuntu 20.04 from the required AWS region. For example:
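A minimal sketch of the fragment; the AMI ID below is a placeholder for the
Ubuntu 20.04 image in your AWS region:
spec:
  providerSpec:
    value:
      ami:
        id: ami-0123456789abcdef0   # placeholder; look up the Ubuntu 20.04 AMI for your region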
Do not stop the AWS instances dedicated to the Container Cloud
clusters to prevent data loss and cluster failure.
Optional. In templates/aws/cluster.yaml.template,
modify the values of the spec:providerSpec:value:bastion:amiId and
spec:providerSpec:value:bastion:instanceType sections
by setting the necessary Ubuntu AMI ID and instance type in the required
AWS region respectively. For example:
Optional. In templates/aws/cluster.yaml.template, modify the default
configuration of the AWS instance types and AMI IDs for further creation
of managed clusters:
providerSpec:
  value:
    ...
    kaas:
      ...
      regional:
        - provider: aws
          helmReleases:
            - name: aws-credentials-controller
              values:
                config:
                  allowedInstanceTypes:
                    minVCPUs: 8
                    # in MiB
                    minMemory: 16384
                    # in GB
                    minStorage: 120
                    supportedArchitectures:
                      - "x86_64"
                    filters:
                      - name: instance-storage-info.disk.type
                        values:
                          - "ssd"
                  allowedAMIs:
                    -
                      - name: name
                        values:
                          - "ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210325"
                      - name: owner-id
                        values:
                          - "099720109477"
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where the management cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/aws/cluster.yaml.template, add the ntp:servers section
with the list of the required server names:
Generate the AWS Access Key ID with Secret Access Key for the user with
the IAMFullAccess permissions and select the AWS default region name.
For details, see AWS General Reference: Programmatic access.
Export the following parameters by adding the corresponding values
for the AWS IAMFullAccess user credentials created in the previous step:
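For example, assuming the standard AWS CLI environment variables; the values
are placeholders for the credentials you generated:
export AWS_ACCESS_KEY_ID=<IAMFullAccess user access key ID>
export AWS_SECRET_ACCESS_KEY=<IAMFullAccess user secret access key>
export AWS_DEFAULT_REGION=<AWS default region name>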
Configure the bootstrapper.cluster-api-provider-aws.kaas.mirantis.com
user created in the previous steps:
Using your AWS Management Console, generate the AWS Access Key ID with
Secret Access Key for
bootstrapper.cluster-api-provider-aws.kaas.mirantis.com
and select the AWS default region name.
Note
Other authorization methods, such as usage of
AWS_SESSION_TOKEN, are not supported.
Export the AWS bootstrapper.cluster-api-provider-aws.kaas.mirantis.com
user credentials that were created in the previous step:
If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the management and regional cluster using proxy:
The Keycloak URL that the system outputs when the bootstrap completes.
The admin password for Keycloak is located in
kaas-bootstrap/passwords.yml along with other IAM passwords.
Note
The Container Cloud web UI and StackLight endpoints are available
through Transport Layer Security (TLS) and communicate with Keycloak
to authenticate users. Keycloak is exposed using HTTPS and
self-signed TLS certificates that are not trusted by web browsers.
Now, you can proceed with operating your management cluster using
the Container Cloud web UI and deploying managed clusters as described in
Create and operate an AWS-based managed cluster.
Log in to any personal computer or VM running Ubuntu 20.04
that you will be using as the bootstrap node.
If you use a newly created VM, run:
sudo apt-get update
Install the current Docker version available for Ubuntu 20.04:
sudo apt install docker.io
Grant your USER access to the Docker daemon:
sudo usermod -aG docker $USER
Log off and log in again to the bootstrap node to apply the changes.
Verify that Docker is configured correctly and has access
to Container Cloud CDN. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
  curl https://binary.mirantis.com"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
After you complete the prerequisite steps described in
Prerequisites, proceed with bootstrapping your Mirantis
Container Cloud management cluster based on the Azure provider.
To bootstrap an Azure-based management cluster:
Log in to the bootstrap node running Ubuntu 20.04 that is configured
as described in Prerequisites.
Prepare the bootstrap script:
Download and run the Container Cloud bootstrap script:
Log in to your account and download the mirantis.lic license file.
Save the license file as mirantis.lic under the kaas-bootstrap
directory on the bootstrap node.
Verify that mirantis.lic contains the exact Container Cloud license
previously downloaded from www.mirantis.com
by decoding the license JWT token, for example, using jwt.io.
Example of a valid decoded Container Cloud license data with the mandatory
license field:
In templates/azure/azure-config.yaml.template, modify the following
parameters using credentials obtained in the previous steps or using
credentials of an existing Azure service principal obtained from the
subscription owner:
spec:subscriptionID is the subscription ID of your Azure account
spec:tenantID is the value of "tenant"
spec:clientID is the value of "appId"
spec:clientSecret:value is the value of "password"
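For example, a minimal sketch of the template with placeholder values:
spec:
  subscriptionID: <your Azure subscription ID>
  tenantID: <tenant value from the service principal output>
  clientID: <appId value from the service principal output>
  clientSecret:
    value: <password value from the service principal output>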
In templates/azure/cluster.yaml.template,
modify the default configuration of the Azure cluster location.
This is an Azure region that your subscription has quota for.
To obtain the list of available locations, run:
az account list-locations -o=table
For example:
providerSpec:
  value:
    ...
    location: southcentralus
Also, modify other parameters as required.
Optional. In templates/azure/machines.yaml.template,
modify the default configuration of the Azure virtual machine
size and OS disk size.
Mirantis Container Cloud only supports Azure virtual machine sizes
that meet the following minimum requirements:
More than 8 CPUs
More than 24 GB RAM
Ephemeral OS drive supported
Temporary storage size is more than 128 GB
Set the OS disk size parameter to at least 128 GB (default value)
and verify that it does not exceed the temporary storage size.
To obtain the list of all Azure virtual machine sizes available in the
selected Azure region:
az vm list-skus -l southcentralus -o=json
To filter virtual machine sizes by the Container Cloud minimum
requirements:
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where the management cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/azure/cluster.yaml.template, add the ntp:servers
section with the list of the required server names:
If you require Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the management and regional cluster using proxy:
The Keycloak URL that the system outputs when the bootstrap completes.
The admin password for Keycloak is located in
kaas-bootstrap/passwords.yml along with other IAM passwords.
Note
The Container Cloud web UI and StackLight endpoints are available
through Transport Layer Security (TLS) and communicate with Keycloak
to authenticate users. Keycloak is exposed using HTTPS and
self-signed TLS certificates that are not trusted by web browsers.
Deploy an Equinix Metal based management cluster with public networking¶
This section describes how to bootstrap a Mirantis Container Cloud management
cluster that is based on the Equinix Metal cloud provider with public
networking.
Log in to any personal computer or VM running Ubuntu 20.04
that you will be using as the bootstrap node.
If you use a newly created VM, run:
sudo apt-get update
Install the current Docker version available for Ubuntu 20.04:
sudo apt install docker.io
Grant your USER access to the Docker daemon:
sudo usermod -aG docker $USER
Log off and log in again to the bootstrap node to apply the changes.
Verify that Docker is configured correctly and has access
to Container Cloud CDN. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
  curl https://binary.mirantis.com"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
Before deploying an Equinix Metal based Container Cloud cluster with public
networking, ensure that local Border Gateway Protocol (BGP) is enabled and
properly configured for your Equinix Metal project.
To configure BGP in the Equinix Metal project:
Log in to the Equinix Metal console.
In IPs & Networks, select BGP.
In the window that opens:
Click Activate BGP on This Project.
Select local type.
Click Add and wait for the request to finalize.
Verify the value of the max_prefix BGP parameter:
Set the token variable to your project token.
To obtain the token in the Equinix Metal console, navigate to
Project Settings > Project API Keys > Add New Key.
Set the project variable to your project ID.
To obtain the project ID in the Equinix Metal console, navigate to
Project Settings > General > PROJECT ID.
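For illustration, assuming the standard Equinix Metal API, the project BGP
configuration can be queried similar to the following; verify the endpoint
and the output field against the current API reference:
curl -s -H "X-Auth-Token: ${token}" \
  "https://api.equinix.com/metal/v1/projects/${project}/bgp-config" | jq '.max_prefix'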
In the system output, if the value is 10 (default), contact
Equinix Metal support to increase this parameter to at least 150.
The default value allows creating only two Container Cloud clusters per
Equinix Metal project. Hence, Mirantis recommends increasing the
max_prefix value.
Verify the capacity of the Equinix Metal facility¶
Before deploying an Equinix Metal based Container Cloud cluster with public
networking, ensure that the Equinix Metal project has enough capacity to
deploy the required number of machines. Otherwise, the machines will be stuck
in the Provisioned state with an error message about no available servers
of a particular type in your facility.
To verify the capacity of the Equinix Metal facility:
After you complete the prerequisite steps described in
Prerequisites, proceed with bootstrapping your Mirantis
Container Cloud management cluster based on the Equinix Metal provider with
public networking.
To bootstrap an Equinix Metal based management cluster with public
networking:
Log in to the bootstrap node running Ubuntu 20.04 that is configured
as described in Prerequisites.
Prepare the bootstrap script:
Download and run the Container Cloud bootstrap script:
Log in to your account and download the mirantis.lic license file.
Save the license file as mirantis.lic under the kaas-bootstrap
directory on the bootstrap node.
Verify that mirantis.lic contains the exact Container Cloud license
previously downloaded from www.mirantis.com
by decoding the license JWT token, for example, using jwt.io.
Example of a valid decoded Container Cloud license data with the mandatory
license field:
The MKE license does not apply to mirantis.lic. For
details about MKE license, see MKE documentation.
Using the Equinix Metal console, obtain the project ID and
the user-level API Key of the Equinix Metal project to be used
for the Container Cloud deployment:
Log in to the Equinix Metal console.
Select the project that you want to use for the Container Cloud deployment.
In Project Settings > General, capture your
Project ID.
In Profile Settings > Personal API Keys, capture
the existing user-level API Key or create a new one:
In Profile Settings > Personal API Keys,
click Add New Key.
Fill in the Description and select
the Read/Write permissions.
Click Add Key.
Prepare the Equinix Metal configuration:
Change the directory to kaas-bootstrap.
In templates/equinix/equinix-config.yaml.template,
modify spec:projectID and spec:apiToken:value using the values
obtained in the previous steps. For example:
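For example, a minimal sketch with placeholder values:
spec:
  projectID: <your Equinix Metal project ID>
  apiToken:
    value: <your user-level API Key>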
In templates/equinix/cluster.yaml.template,
modify the default configuration of the Equinix Metal facility
depending on the previously prepared capacity settings:
providerSpec:
  value:
    ...
    facility: am6
Also, modify other parameters as required.
Optional. In templates/equinix/machines.yaml.template,
modify the default configuration of the Equinix Metal machine type.
The minimal required type is c3.small.x86.
providerSpec:
  value:
    ...
    machineType: c3.small.x86
Also, modify other parameters as required.
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where the management cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/equinix/cluster.yaml.template, add the ntp:servers
section with the list of the required server names:
If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the management and regional cluster using proxy:
Re-verify that the selected Equinix Metal facility for the management
cluster bootstrap is still available and has enough capacity:
metal capacity check --facility $EQUINIX_FACILITY --plan $EQUINIX_MACHINE_TYPE --quantity $MACHINES_AMOUNT
In the system response, if the value in the AVAILABILITY section
has changed from true to false, find an available facility and
update the previously configured facility field in
cluster.yaml.template.
The Keycloak URL that the system outputs when the bootstrap completes.
The admin password for Keycloak is located in
kaas-bootstrap/passwords.yml along with other IAM passwords.
Note
The Container Cloud web UI and StackLight endpoints are available
through Transport Layer Security (TLS) and communicate with Keycloak
to authenticate users. Keycloak is exposed using HTTPS and
self-signed TLS certificates that are not trusted by web browsers.
Deploy an Equinix Metal based management cluster with private networking¶
This section describes how to bootstrap a Mirantis Container Cloud management
cluster that is based on the Equinix Metal cloud provider with private
networking.
Before you start with bootstrapping the Equinix Metal based management
or regional cluster with private networking, complete the following
prerequisite steps:
Deploy all necessary infrastructure (VLANs, routers, proxy server,
and so on) as described in Infrastructure prerequisites.
Configure the bootstrap node:
Log in to any personal computer or VM running Ubuntu 20.04
that you will be using as the bootstrap node.
If you use a newly created VM, run:
sudo apt-get update
Install the current Docker version available for Ubuntu 20.04:
sudo apt install docker.io
Grant your USER access to the Docker daemon:
sudo usermod -aG docker $USER
Log off and log in again to the bootstrap node to apply the changes.
Verify that Docker is configured correctly and has access
to Container Cloud CDN. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
  curl https://binary.mirantis.com"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
Verify the capacity of the Equinix Metal facility¶
Before deploying an Equinix Metal based Container Cloud cluster with private
networking, ensure that the Equinix Metal project has enough capacity to
deploy the required number of machines. Otherwise, the machines will be stuck
in the Provisioned state with an error message about no available servers
of a particular type in your facility.
To verify the capacity of the Equinix Metal facility:
Warning
Mirantis highly recommends using the c3.small.x86 machine
type for the control plane machines deployed with private network
to prevent hardware issues with incorrect BIOS boot order.
After you complete the prerequisite steps described in
Prerequisites, proceed with bootstrapping your Mirantis
Container Cloud management cluster based on the Equinix Metal provider with
private networking.
To bootstrap an Equinix Metal based management cluster with private
networking:
Log in to the bootstrap node running Ubuntu 20.04 that is configured
as described in Prerequisites.
Prepare the bootstrap script:
Download and run the Container Cloud bootstrap script:
Log in to your account and download the mirantis.lic license file.
Save the license file as mirantis.lic under the kaas-bootstrap
directory on the bootstrap node.
Verify that mirantis.lic contains the exact Container Cloud license
previously downloaded from www.mirantis.com
by decoding the license JWT token, for example, using jwt.io.
Example of a valid decoded Container Cloud license data with the mandatory
license field:
The MKE license does not apply to mirantis.lic. For
details about MKE license, see MKE documentation.
Using the Equinix Metal console, obtain the project ID and
the user-level API Key of the Equinix Metal project to be used
for the Container Cloud deployment:
Log in to the Equinix Metal console.
Select the project that you want to use for the Container Cloud deployment.
In Project Settings > General, capture your
Project ID.
In Profile Settings > Personal API Keys, capture
the existing user-level API Key or create a new one:
In Profile Settings > Personal API Keys,
click Add New Key.
Fill in the Description and select
the Read/Write permissions.
Click Add Key.
Prepare the Equinix Metal configuration:
Change the directory to kaas-bootstrap.
In templates/equinixmetalv2/equinix-config.yaml.template,
modify spec:projectID and spec:apiToken:value using the values
obtained in the previous steps. For example:
In templates/equinixmetalv2/cluster.yaml.template:
Modify the default configuration of the Equinix Metal facility
depending on the previously prepared capacity settings as described in
Prerequisites:
providerSpec:
  value:
    # ...
    facility: am6
Add projectSSHKeys, the list of the Equinix Metal project
SSH key names to be attached to cluster machines. These keys are required
for access to the Equinix Metal out-of-band Serial Over SSH (SOS) console
to debug provisioning failures. Mirantis recommends adding at least one
project SSH key per cluster.
ID of the VLAN created in the corresponding Equinix Metal Metro that
the seed node and cluster nodes should be attached to.
loadBalancerHost
IP address to use for the MKE and Kubernetes API endpoints
of the cluster.
metallbRanges
List of IP ranges in the 192.168.0.129-192.168.0.200 format to use
for Kubernetes LoadBalancer services. For example, on a
management cluster, these services include the Container Cloud web UI
and Keycloak. This list should include at least 12 addresses for a
management cluster and 5 for managed clusters.
cidr
Network address in CIDR notation. For example, 192.168.0.0/24.
gateway
IP address of a gateway attached to this VLAN that provides the
necessary external connectivity.
dhcpRanges
List of IP ranges in the 192.168.0.10-192.168.0.50 format.
IP addresses from these ranges will be allocated to nodes that boot
from DHCP during the provisioning process. Should include at least
one address for each machine in the cluster.
includeRanges
List of IP ranges in the 192.168.0.51-192.168.0.128 format.
IP addresses from these ranges will be allocated as permanent
addresses of machines in this cluster. Should include at least one
address for each machine in the cluster.
excludeRanges
Optional. List of IP ranges in the 192.168.0.51-192.168.0.128 format.
IP addresses from these ranges will not be allocated as permanent
addresses of machines in this cluster.
nameservers
List of IP addresses of DNS servers that should be configured on machines.
These servers must be accessible through the gateway from the
provided VLAN. Required unless a proxy server is used.
Add the following parameters to the bootstrap.env file:
Parameter
Description
KAAS_BM_PXE_BRIDGE
Name of the bridge that will be used to provide PXE services to provision
machines during bootstrap.
KAAS_BM_PXE_IP
IP address that will be used for PXE services. Will be assigned to the
KAAS_BM_PXE_BRIDGE bridge. Must be part of the cidr parameter.
KAAS_BM_PXE_MASK
Number of bits in the network address KAAS_BM_PXE_IP.
Must match the CIDR suffix in the cidr parameter.
BOOTSTRAP_METALLB_ADDRESS_POOL
IP range in the 192.168.0.129-192.168.0.200 format that will be
used for Kubernetes LoadBalancer services in the bootstrap cluster.
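For example, a minimal sketch of these parameters in bootstrap.env; the
values are placeholders consistent with the 192.168.0.0/24 example network
used above:
KAAS_BM_PXE_BRIDGE=br0
KAAS_BM_PXE_IP=192.168.0.5
KAAS_BM_PXE_MASK=24
BOOTSTRAP_METALLB_ADDRESS_POOL=192.168.0.129-192.168.0.200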
Optional. In templates/equinixmetalv2/machines.yaml.template,
modify the default configuration of the Equinix Metal machine type.
The minimal required type is c3.small.x86.
Warning
Mirantis highly recommends using the c3.small.x86 machine
type for the control plane machines deployed with private network
to prevent hardware issues with incorrect BIOS boot order.
providerSpec:
  value:
    # ...
    machineType: c3.small.x86
Also, modify other parameters as required.
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the VLAN where the management cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/equinixmetalv2/cluster.yaml.template, add the ntp:servers
section with the list of the required server names:
If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the management and regional cluster using proxy:
Re-verify that the selected Equinix Metal facility for the management
cluster bootstrap is still available and has enough capacity:
metal capacity check --facility $EQUINIX_FACILITY --plan $EQUINIX_MACHINE_TYPE --quantity $MACHINES_AMOUNT
In the system response, if the value in the AVAILABILITY section
has changed from true to false, find an available facility and
update the previously configured facility field in
cluster.yaml.template.
The Keycloak URL that the system outputs when the bootstrap completes.
The admin password for Keycloak is located in
kaas-bootstrap/passwords.yml along with other IAM passwords.
Note
The Container Cloud web UI and StackLight endpoints are available
through Transport Layer Security (TLS) and communicate with Keycloak
to authenticate users. Keycloak is exposed using HTTPS and
self-signed TLS certificates that are not trusted by web browsers.
This section describes how to bootstrap a VMware vSphere-based Mirantis
Container Cloud management cluster.
Note
You can deploy vSphere-based clusters on CentOS. Support of this operating
system is available as Technology Preview.
Use it for testing and evaluation purposes only.
Deployment of a Container Cloud cluster that is based on both
RHEL and CentOS operating systems is not supported.
The VMware vSphere provider of Mirantis Container Cloud requires the following
resources to successfully create virtual machines for Container Cloud clusters:
Data center
All resources below must be related to one data center.
Cluster
All virtual machines must run on the hosts of one cluster.
Storage for virtual machine disks and Kubernetes volumes.
Folder
Placement of virtual machines.
Resource pool
Pool of CPU and memory resources for virtual machines.
You must provide the data center and cluster resources by name.
You can provide other resources by:
Name
Resource name must be unique in the data center and cluster.
Otherwise, the vSphere provider detects multiple resources with the same name
and cannot determine which one to use.
Full path (recommended)
Full path to a resource depends on its type. For example:
Log in to any personal computer or VM running Ubuntu 20.04
that you will be using as the bootstrap node.
If you use a newly created VM, run:
sudo apt-get update
Install the current Docker version available for Ubuntu 20.04:
sudo apt install docker.io
Grant your USER access to the Docker daemon:
sudo usermod -aG docker $USER
Log off and log in again to the bootstrap node to apply the changes.
Verify that Docker is configured correctly and has access
to Container Cloud CDN. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
  curl https://binary.mirantis.com"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
For RHEL:
Log in to a VM running RHEL 7.9 or 8.4 TechPreview that you will
be using as a bootstrap node.
If you do not use RedHat Satellite server locally in your
infrastructure and require all Internet access to
go through a proxy server, including access to RedHat customer
portal, configure proxy parameters for subscription-manager using
the example below:
Add the Docker mirror according to the operating system major version
(7 for 7.9 and 8 for 8.4 TechPreview).
Provide the proxy URL, if required, or set to _none_.
Verify that Docker is configured correctly and has access
to Container Cloud CDN. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
  curl https://binary.mirantis.com"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
For RHEL deployments, if you do not have a RHEL machine with the
virt-who service configured to report the vSphere environment
configuration and hypervisors information to RedHat Customer Portal
or RedHat Satellite server, set up the virt-who service
inside the Container Cloud machines for a proper RHEL license activation.
Create a virt-who user with at least read-only access
to all objects in the vCenter Data Center.
The virt-who service on RHEL machines will be provided with the
virt-who user credentials to properly manage RHEL subscriptions.
After you complete the prerequisite steps described in Prerequisites,
proceed with bootstrapping your VMware vSphere-based Mirantis Container Cloud
management cluster.
To bootstrap a vSphere-based management cluster:
Log in to the bootstrap node running Ubuntu 20.04 that is configured
as described in Prerequisites.
Prepare the bootstrap script:
Download and run the Container Cloud bootstrap script:
Log in to your account and download the mirantis.lic license file.
Save the license file as mirantis.lic under the kaas-bootstrap
directory on the bootstrap node.
Verify that mirantis.lic contains the exact Container Cloud license
previously downloaded from www.mirantis.com
by decoding the license JWT token, for example, using jwt.io.
Example of a valid decoded Container Cloud license data with the mandatory
license field:
Port of the vCenter Server. For example, port:"8443".
Leave empty to use 443 by default.
SET_VSPHERE_DATACENTER
vSphere data center name.
SET_VSPHERE_SERVER_INSECURE
Flag that controls validation of the vSphere Server certificate.
Must be true or false.
SET_VSPHERE_CAPI_PROVIDER_USERNAME
vSphere Cluster API provider user name that you added when
preparing the deployment user setup and permissions.
SET_VSPHERE_CAPI_PROVIDER_PASSWORD
vSphere Cluster API provider user password.
SET_VSPHERE_CLOUD_PROVIDER_USERNAME
vSphere Cloud Provider deployment user name that you added when
preparing the deployment user setup and permissions.
SET_VSPHERE_CLOUD_PROVIDER_PASSWORD
vSphere Cloud Provider deployment user password.
Modify the templates/vsphere/cluster.yaml.template parameters
to fit your deployment. For example, add the corresponding values
for cidrBlocks in the spec::clusterNetwork::services section.
Provide the following additional parameters for a proper network setup
on machines using embedded IP address management (IPAM)
in templates/vsphere/cluster.yaml.template
Enables IPAM. Set to true for networks without DHCP.
SET_VSPHERE_NETWORK_CIDR
CIDR of the provided vSphere network. For example, 10.20.0.0/16.
SET_VSPHERE_NETWORK_GATEWAY
Gateway of the provided vSphere network.
SET_VSPHERE_CIDR_INCLUDE_RANGES
Optional. IP range for the cluster machines. Specify the range of the provided CIDR.
For example, 10.20.0.100-10.20.0.200.
SET_VSPHERE_CIDR_EXCLUDE_RANGES
Optional. IP ranges to be excluded from being assigned to the cluster
machines. The MetalLB range and SET_LB_HOST should not intersect with
the addresses for IPAM. For example, 10.20.0.150-10.20.0.170.
SET_VSPHERE_NETWORK_NAMESERVERS
List of nameservers for the provided vSphere network.
For RHEL deployments, fill out
templates/vsphere/rhellicenses.yaml.template
using one of the following sets of parameters for the RHEL machines subscription:
The user name and password of your RedHat Customer Portal account
associated with your RHEL license for Virtual Datacenters.
Optionally, provide the subscription allocation pools to use for the RHEL
subscriptions activation. If not needed, remove the poolIDs field
for subscription-manager to automatically select the licenses for
machines.
The activation key and organization ID associated with your RedHat
account with RHEL license for Virtual Datacenters. The activation key can
be created by the organization administrator on RedHat Customer Portal.
If you use the RedHat Satellite server for management of your
RHEL infrastructure, you can provide a pre-generated activation key from
that server. In this case:
Provide the URL to the RedHat Satellite RPM for installation
of the CA certificate that belongs to that server.
Configure squid-proxy on the management or regional cluster to allow
access to your Satellite server. For details, see Configure squid-proxy.
For RHEL 8.4 TechPreview, verify mirrors
configuration for your activation key. For more details,
see RHEL 8 mirrors configuration.
Caution
Provide only one set of parameters.
Mixing of parameters from different activation methods
will cause deployment failure.
For CentOS deployments, in templates/vsphere/rhellicenses.yaml.template,
remove all lines under items:.
In bootstrap.env, add the KAAS_VSPHERE_ENABLED=true environment
variable that enables the vSphere provider deployment in Container Cloud.
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where the management cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/vsphere/cluster.yaml.template, add the ntp:servers
section with the list of the required server names:
The <rhel-license-name> value is the RHEL license name defined
in rhellicenses.yaml.template; it defaults to
kaas-mgmt-rhel-license.
Remove or comment out this parameter for CentOS deployments.
If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the management and regional cluster using proxy:
The Keycloak URL that the system outputs when the bootstrap completes.
The admin password for Keycloak is located in
kaas-bootstrap/passwords.yml along with other IAM passwords.
Note
The Container Cloud web UI and StackLight endpoints are available
through Transport Layer Security (TLS) and communicate with Keycloak
to authenticate users. Keycloak is exposed using HTTPS and
self-signed TLS certificates that are not trusted by web browsers.
To deploy Mirantis Container Cloud on a vSphere-based
environment, the OVF template for cluster machines must be
prepared according to the following requirements:
The VMware Tools package is installed.
The cloud-init utility is installed and configured with the
specific VMwareGuestInfo data source.
For RHEL deployments, the virt-who service is enabled and configured
to connect to the VMware vCenter Server to properly apply the
RHEL subscriptions on the nodes. The virt-who service can run
on a standalone machine or can be integrated into a VM template.
The following procedures describe how to meet the requirements above
either using the Container Cloud script or manually.
To prepare the OVF template using the Container Cloud script:
Prepare the Container Cloud bootstrap and modify
templates/vsphere/vsphere-config.yaml.template and
templates/vsphere/cluster.yaml.template
as described in Bootstrap a management cluster, steps 1-9.
Download the ISO image depending on the target OS:
After the template is prepared, set the SET_VSPHERE_TEMPLATE_PATH
parameter in templates/vsphere/machines.yaml.template as described
in Bootstrap a management cluster.
To prepare the OVF template manually:
Run a virtual machine on the vSphere data center with the DVD ISO
mounted to it. Specify the amount of resources that will be used
in the Container Cloud setup. A minimal resources configuration must match
the Requirements for a VMware vSphere-based cluster for a vSphere-based Container Cloud cluster.
Bootstrap the OS using vSphere Web Console. Select a minimal setup in the
VM installation configuration. Create a user with root or sudo permissions
to access the machine.
Log in to the VM when it starts.
Optional. If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables:
Add 99-DataSourceVMwareGuestInfo.cfg to /etc/cloud/cloud.cfg.d/.
Depending on the Python version on the VM operating system,
add DataSourceVMwareGuestInfo.py to the cloud-init sources
folder. Obtain the cloud-init folder on the OS:
python -c 'import os; from cloudinit import sources; print(os.path.dirname(sources.__file__));'
Specifies the connection of the defined virt-who user
to the vCenter Server.
server
The FQDN of the vCenter Server.
username
The virt-who user name on the vCenter Server with the read-only access.
encrypted_password
The virt-who password encrypted by the virt-who-password utility
using the virt-who-password -p <password> command.
owner
The organization that the hypervisors belong to.
hypervisor_id
Specifies how to identify the hypervisors. Use a host name
to provide meaningful host names to the Subscription Management.
Alternatively, use uuid or hwuuid to avoid duplication
in case of hypervisor renaming.
filter_hosts
List of hypervisors that never run RHEL VMs.
Such hypervisors do not have to be reported by virt-who.
For CentOS, verify that the yum mirrors are set to use
only the *.centos.org URLs. Otherwise, access to other mirrors
may be blocked by squid-proxy on managed clusters.
For details, see Configure squid-proxy.
For RHEL, remove the RHEL subscription from the node.
By default, squid-proxy allows access only to the official RedHat
subscription.rhsm.redhat.com and .cdn.redhat.com URLs or to the
CentOS *.centos.org mirrors.
If you use the RedHat Satellite server or if you want to access specific
yum repositories of RedHat or CentOS, allow those domains
(or IP addresses) in the squid-proxy configuration
on the management or regional cluster.
Note
You can apply the procedure below before or after the management or
regional cluster deployment.
To configure squid-proxy for access to specific domains:
Modify the allowed domains for squid-proxy in the regional Helm
releases configuration for the vsphere provider using the example below:
For new deployments, modify templates/vsphere/cluster.yaml.template.
For existing deployments, modify the management or regional cluster
configuration:
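A hypothetical sketch of the allowed-domains override, assuming the squid-proxy Helm release accepts an allow-list value. The srv_acl_whitelist key name and the exact nesting are assumptions for illustration only; confirm the values structure for your release before applying it:
spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: vsphere
          helmReleases:
          - name: squid-proxy
            values:
              # Assumed key name for the domain allow-list
              srv_acl_whitelist:
              - satellite.example.com
              - mirror.example.org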
By default, the RHEL subscription grants access to the AppStream
and BaseOS repositories that are not bound to a specific operating system
version; they are stream repositories and are therefore updated frequently.
To deploy RHEL 8.4 and make sure that packages are installed
from the version 8.4 AppStream and BaseOS repositories,
the RHEL VM template must have the releasever variable for yum set to 8.4.
You can verify this variable in /etc/yum/vars/releasever on the VM.
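For example, a quick check on the VM; the expected output shown below assumes a correctly configured 8.4 template:
cat /etc/yum/vars/releasever
8.4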
If you are using the RedHat Satellite server, verify that your activation key
is configured with the release version set to 8.4 and includes only
the following repositories:
Red Hat Enterprise Linux 8 for x86_64 - BaseOS RPMs 8.4
Red Hat Enterprise Linux 8 for x86_64 - AppStream RPMs 8.4
After you bootstrap a management cluster of the required cloud provider type,
you can optionally deploy an additional regional cluster of the same
or different provider type. For details about regions, see Container Cloud regions.
Perform this procedure if you wish to operate managed clusters across clouds
from a single Mirantis Container Cloud management plane.
Caution
A regional cluster requires access to the management cluster.
If you deploy a management cluster on a public cloud, such as
AWS, Equinix Metal, or Microsoft Azure, you can add any type
of regional cluster.
If you deploy a management cluster on a private cloud, such as
OpenStack or vSphere, you can add only private-based regional
clusters.
Multi-regional deployment enables you to create managed clusters of several
provider types using one management cluster. For example, you can
bootstrap an AWS-based management cluster and deploy an OpenStack-based
regional cluster on this management cluster.
Such a cluster enables the creation of both OpenStack-based and AWS-based
managed clusters with Kubernetes deployments.
Note
If the bootstrap node for deployment of an additional regional
cluster is not the same where you bootstrapped the management
cluster, first prepare the bootstrap as described in
Configure the bootstrap node.
This section describes how to prepare a new bootstrap node for an additional
regional cluster deployment on top of the management cluster.
To use the same node where you bootstrapped the management cluster, skip this
instruction and proceed to deploying a regional cluster of the required
provider type.
To configure a new bootstrap node for a regional cluster:
Install and configure Docker:
Log in to any personal computer or VM running Ubuntu 20.04
that you will be using as the bootstrap node.
If you use a newly created VM, run:
sudo apt-get update
Install the current Docker version available for Ubuntu 20.04:
sudo apt install docker.io
Grant your USER access to the Docker daemon:
sudo usermod -aG docker $USER
Log off and log in again to the bootstrap node to apply the changes.
Verify that Docker is configured correctly and has access
to Container Cloud CDN. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
curl https://binary.mirantis.com"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
Prepare the bootstrap script:
Download and run the Container Cloud bootstrap script:
Log in to your account and download the mirantis.lic license file.
Save the license file as mirantis.lic under the kaas-bootstrap
directory on the bootstrap node.
Verify that mirantis.lic contains the exact Container Cloud license
previously downloaded from www.mirantis.com
by decoding the license JWT token, for example, using jwt.io.
Example of a valid decoded Container Cloud license data with the mandatory
license field:
For clusters deployed using the Container Cloud release earlier than 2.11.0
or if you deleted the kaas-bootstrap folder, download and run
the Container Cloud bootstrap script:
Prepare the AWS configuration for the new regional cluster:
Verify access to the target cloud endpoint from Docker. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
curl https://ec2.amazonaws.com"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Change the directory to the kaas-bootstrap folder.
In templates/aws/machines.yaml.template,
modify the spec:providerSpec:value section
by substituting the ami:id parameter with the corresponding value
for Ubuntu 20.04 from the required AWS region. For example:
Do not stop the AWS instances dedicated to the Container Cloud
clusters to prevent data loss and cluster failure.
Optional. In templates/aws/cluster.yaml.template,
modify the values of the spec:providerSpec:value:bastion:amiId and
spec:providerSpec:value:bastion:instanceType sections
by setting the necessary Ubuntu AMI ID and instance type in the required
AWS region respectively. For example:
Optional. In templates/aws/cluster.yaml.template, modify the default
configuration of the AWS instance types and AMI IDs for further creation
of managed clusters:
providerSpec:
  value:
    ...
    kaas:
      ...
      regional:
      - provider: aws
        helmReleases:
        - name: aws-credentials-controller
          values:
            config:
              allowedInstanceTypes:
                minVCPUs: 8
                # in MiB
                minMemory: 16384
                # in GB
                minStorage: 120
                supportedArchitectures:
                - "x86_64"
                filters:
                - name: instance-storage-info.disk.type
                  values:
                  - "ssd"
              allowedAMIs:
              -
                - name: name
                  values:
                  - "ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210325"
                - name: owner-id
                  values:
                  - "099720109477"
Also, modify other parameters as required.
If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the regional cluster using proxy:
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where the regional cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/aws/cluster.yaml.template, add the ntp:servers section
with the list of the required server names:
Configure the bootstrapper.cluster-api-provider-aws.kaas.mirantis.com
user created in the previous steps:
Using your AWS Management Console, generate the AWS Access Key ID with
Secret Access Key for
bootstrapper.cluster-api-provider-aws.kaas.mirantis.com
and select the AWS default region name.
Note
Other authorization methods, such as usage of
AWS_SESSION_TOKEN, are not supported.
Export the AWS bootstrapper.cluster-api-provider-aws.kaas.mirantis.com
user credentials that were created in the previous step:
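A hedged sketch of the export commands is shown below. The standard AWS variable names and the REGION and REGIONAL_CLUSTER_NAME parameters are taken from this procedure, while the KAAS_AWS_ENABLED flag is an assumption that may differ between releases; treat all of it as illustrative and substitute your own values:
export KAAS_AWS_ENABLED=true
export AWS_ACCESS_KEY_ID=<accessKeyID>
export AWS_SECRET_ACCESS_KEY=<secretAccessKey>
export AWS_DEFAULT_REGION=<awsDefaultRegionName>
export REGION=<newRegionName>
export REGIONAL_CLUSTER_NAME=<newRegionalClusterName>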
Substitute the parameters enclosed in angle brackets with the corresponding
values of your cluster.
Caution
The REGION and REGIONAL_CLUSTER_NAME parameters values
must contain only lowercase alphanumeric characters, hyphens,
or periods.
Note
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, also
export SSH_KEY_NAME. It is required for the management
cluster to create a publicKey Kubernetes CRD with the
public part of your newly generated ssh_key
for the regional cluster.
export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
Run the regional cluster bootstrap script:
./bootstrap.sh deploy_regional
Note
When the bootstrap is complete, obtain and save in a secure location
the kubeconfig-<regionalClusterName> file
located in the same directory as the bootstrap script.
This file contains the admin credentials for the regional cluster.
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, a new
regional ssh_key will be generated.
Make sure to save this key in a secure location as well.
The workflow of the regional cluster bootstrap script
1. Prepare the bootstrap cluster for the new regional cluster.
2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.
3. Connect to each machine of the management cluster through SSH.
4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.
5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.
6. Forward the bootstrap cluster endpoint to helm-controller.
7. Wait for all CRDs to be available and verify the objects created using these CRDs.
8. Pivot the cluster API stack to the regional cluster.
9. Switch the LCM agent from the bootstrap cluster to the regional one.
10. Wait for the Container Cloud components to start on the regional cluster.
In templates/azure/azure-config.yaml.template, modify the following
parameters using credentials obtained in the previous steps or using
credentials of an existing Azure service principal obtained from the
subscription owner:
spec:subscriptionID is the subscription ID of your Azure account
spec:tenantID is the value of "tenant"
spec:clientID is the value of "appId"
spec:clientSecret:value is the value of "password"
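For illustration, a minimal sketch of templates/azure/azure-config.yaml.template with the four parameters above filled in; the placeholder values are illustrative:
spec:
  subscriptionID: <subscriptionID>
  tenantID: <tenant>
  clientID: <appId>
  clientSecret:
    value: <password>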
In templates/azure/cluster.yaml.template,
modify the default configuration of the Azure cluster location.
This is an Azure region that your subscription has quota for.
To obtain the list of available locations, run:
az account list-locations -o=table
For example:
providerSpec:
  value:
    ...
    location: southcentralus
Also, modify other parameters as required.
If you require Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the regional cluster using proxy:
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where the regional cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/azure/cluster.yaml.template, add the ntp:servers
section with the list of the required server names:
Substitute the parameters enclosed in angle brackets with the corresponding
values of your cluster.
Caution
The REGION and REGIONAL_CLUSTER_NAME parameters values
must contain only lowercase alphanumeric characters, hyphens,
or periods.
Note
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, also
export SSH_KEY_NAME. It is required for the management
cluster to create a publicKey Kubernetes CRD with the
public part of your newly generated ssh_key
for the regional cluster.
export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
Run the regional cluster bootstrap script:
./bootstrap.sh deploy_regional
Note
When the bootstrap is complete, obtain and save in a secure location
the kubeconfig-<regionalClusterName> file
located in the same directory as the bootstrap script.
This file contains the admin credentials for the regional cluster.
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, a new
regional ssh_key will be generated.
Make sure to save this key in a secure location as well.
The workflow of the regional cluster bootstrap script
1. Prepare the bootstrap cluster for the new regional cluster.
2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.
3. Connect to each machine of the management cluster through SSH.
4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.
5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.
6. Forward the bootstrap cluster endpoint to helm-controller.
7. Wait for all CRDs to be available and verify the objects created using these CRDs.
8. Pivot the cluster API stack to the regional cluster.
9. Switch the LCM agent from the bootstrap cluster to the regional one.
10. Wait for the Container Cloud components to start on the regional cluster.
For clusters deployed using the Container Cloud release earlier than 2.11.0
or if you deleted the kaas-bootstrap folder, download and run
the Container Cloud bootstrap script:
Verify access to the target cloud endpoint from Docker. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
curl https://auth.openstack.example.com:5000/v3"
The system output must contain no error records.
In case of issues, follow the steps provided in Troubleshooting.
Configure the cluster and machines metadata:
In templates/machines.yaml.template,
modify the spec:providerSpec:value section for 3 control plane nodes
marked with the cluster.sigs.k8s.io/control-plane label
by substituting the flavor and image parameters
with the corresponding values of the control plane nodes in the related
OpenStack cluster. For example:
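A hedged sketch of the relevant part of the machine template; the placeholders below are illustrative, and the exact nesting should match your machines.yaml.template:
spec:
  providerSpec:
    value:
      # Cloud-specific flavor; must meet the Container Cloud requirements
      flavor: <osFlavorName>
      # Image available in your OpenStack cloud
      image: <osImageName>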
The flavor parameter value provided in the example above
is cloud-specific and must meet the Container Cloud
requirements.
Also, modify other parameters as required.
Modify the templates/cluster.yaml.template parameters to fit your
deployment. For example, add the corresponding values for cidrBlocks
in the spec::clusterNetwork::services section.
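For illustration, a hedged sketch of that section; the 10.96.0.0/16 range is a placeholder to replace with your own services CIDR:
spec:
  clusterNetwork:
    services:
      cidrBlocks:
      - 10.96.0.0/16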
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where the regional cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/cluster.yaml.template, add the ntp:servers section
with the list of the required server names:
Optional. If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the regional cluster using proxy:
Substitute the parameters enclosed in angle brackets with the corresponding
values of your cluster.
Caution
The REGION and REGIONAL_CLUSTER_NAME parameters values
must contain only lowercase alphanumeric characters, hyphens,
or periods.
Note
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, also
export SSH_KEY_NAME. It is required for the management
cluster to create a publicKey Kubernetes CRD with the
public part of your newly generated ssh_key
for the regional cluster.
export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
Run the regional cluster bootstrap script:
./bootstrap.sh deploy_regional
Note
When the bootstrap is complete, obtain and save in a secure location
the kubeconfig-<regionalClusterName> file
located in the same directory as the bootstrap script.
This file contains the admin credentials for the regional cluster.
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, a new
regional ssh_key will be generated.
Make sure to save this key in a secure location as well.
The workflow of the regional cluster bootstrap script
1. Prepare the bootstrap cluster for the new regional cluster.
2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.
3. Connect to each machine of the management cluster through SSH.
4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.
5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.
6. Forward the bootstrap cluster endpoint to helm-controller.
7. Wait for all CRDs to be available and verify the objects created using these CRDs.
8. Pivot the cluster API stack to the regional cluster.
9. Switch the LCM agent from the bootstrap cluster to the regional one.
10. Wait for the Container Cloud components to start on the regional cluster.
Deploy an Equinix Metal based regional cluster with public networking
You can deploy an additional regional Equinix Metal based cluster with
public networking to create managed clusters of several provider types or
with different configurations.
To deploy an Equinix Metal based regional cluster:
For clusters deployed using the Container Cloud release earlier than 2.11.0
or if you deleted the kaas-bootstrap folder, download and run
the Container Cloud bootstrap script:
Prepare the Equinix Metal configuration for the new regional cluster:
Change the directory to kaas-bootstrap.
In templates/equinix/equinix-config.yaml.template,
modify spec:projectID and spec:apiToken:value using the values
obtained in the previous steps. For example:
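For illustration, a hedged sketch of equinix-config.yaml.template with placeholder values for the two parameters above:
spec:
  projectID: <equinixProjectID>
  apiToken:
    value: <equinixUserLevelAPIToken>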
In templates/equinix/cluster.yaml.template,
modify the default configuration of the Equinix Metal facility
depending on the previously prepared capacity settings:
providerSpec:
  value:
    ...
    facility: am6
Also, modify other parameters as required.
Optional. In templates/equinix/machines.yaml.template,
modify the default configuration of the Equinix Metal machine type.
The minimal required type is c3.small.x86.
providerSpec:
  value:
    ...
    machineType: c3.small.x86
Also, modify other parameters as required.
If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the regional cluster using proxy:
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where the regional cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/equinix/cluster.yaml.template, add the ntp:servers
section with the list of the required server names:
Substitute the parameters enclosed in angle brackets with the corresponding
values of your cluster.
Caution
The REGION and REGIONAL_CLUSTER_NAME parameters values
must contain only lowercase alphanumeric characters, hyphens,
or periods.
Note
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, also
export SSH_KEY_NAME. It is required for the management
cluster to create a publicKey Kubernetes CRD with the
public part of your newly generated ssh_key
for the regional cluster.
export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
Run the regional cluster bootstrap script:
./bootstrap.sh deploy_regional
Note
When the bootstrap is complete, obtain and save in a secure location
the kubeconfig-<regionalClusterName> file
located in the same directory as the bootstrap script.
This file contains the admin credentials for the regional cluster.
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, a new
regional ssh_key will be generated.
Make sure to save this key in a secure location as well.
The workflow of the regional cluster bootstrap script
1. Prepare the bootstrap cluster for the new regional cluster.
2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.
3. Connect to each machine of the management cluster through SSH.
4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.
5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.
6. Forward the bootstrap cluster endpoint to helm-controller.
7. Wait for all CRDs to be available and verify the objects created using these CRDs.
8. Pivot the cluster API stack to the regional cluster.
9. Switch the LCM agent from the bootstrap cluster to the regional one.
10. Wait for the Container Cloud components to start on the regional cluster.
Deploy an Equinix Metal based regional cluster with private networking
Before you deploy an additional regional Equinix Metal based cluster with
private networking, complete the prerequisite steps described in
Prerequisites.
To deploy an Equinix Metal based regional cluster with private networking:
Log in to the bootstrap node running Ubuntu 20.04 that is configured
as described in Prerequisites. Properly connect this node
to the regional cluster VLAN.
Prepare the bootstrap script:
Download and run the Container Cloud bootstrap script:
Log in to your account and download the mirantis.lic license file.
Save the license file as mirantis.lic under the kaas-bootstrap
directory on the bootstrap node.
Verify that mirantis.lic contains the exact Container Cloud license
previously downloaded from www.mirantis.com
by decoding the license JWT token, for example, using jwt.io.
Example of a valid decoded Container Cloud license data with the mandatory
license field:
The MKE license does not apply to mirantis.lic. For
details about the MKE license, see the MKE documentation.
Using the Equinix Metal console, obtain the project ID and
the user-level API Key of the Equinix Metal project to be used
for the Container Cloud deployment:
Log in to the Equinix Metal console.
Select the project that you want to use for the Container Cloud deployment.
In Project Settings > General, capture your
Project ID.
In Profile Settings > Personal API Keys, capture
the existing user-level API Key or create a new one:
In Profile Settings > Personal API Keys,
click Add New Key.
Fill in the Description and select
the Read/Write permissions.
Click Add Key.
Prepare the Equinix Metal configuration:
Change the directory to kaas-bootstrap.
In templates/equinixmetalv2/equinix-config.yaml.template,
modify spec:projectID and spec:apiToken:value using the values
obtained in the previous steps. For example:
In templates/equinixmetalv2/cluster.yaml.template:
Modify the default configuration of the Equinix Metal facility
depending on the previously prepared capacity settings as described in
Prerequisites:
providerSpec:
  value:
    # ...
    facility: am6
Add projectSSHKeys, the list of the Equinix Metal project
SSH key names to be attached to cluster machines. These keys are required
for access to the Equinix Metal out-of-band Serial Over SSH (SOS)
console to debug provisioning failures. We recommend adding at least one
project SSH key per cluster.
ID of the VLAN created in the corresponding Equinix Metal Metro that
the seed node and cluster nodes should be attached to.
loadBalancerHost
IP address to use for the MKE and Kubernetes API endpoints
of the cluster.
metallbRanges
List of IP ranges in the 192.168.0.129-192.168.0.200 format to use
for Kubernetes LoadBalancer services. For example, on a
management cluster, these services include the Container Cloud web UI
and Keycloak. This list should include at least 12 addresses for a
management cluster and 5 for managed clusters.
cidr
Network address in CIDR notation. For example, 192.168.0.0/24.
gateway
IP address of a gateway attached to this VLAN that provides the
necessary external connectivity.
dhcpRanges
List of IP ranges in the 192.168.0.10-192.168.0.50 format.
IP addresses from these ranges will be allocated to nodes that boot
from DHCP during the provisioning process. Should include at least
one address for each machine in the cluster.
includeRanges
List of IP ranges in the 192.168.0.51-192.168.0.128 format.
IP addresses from these ranges will be allocated as permanent
addresses of machines in this cluster. Should include at least one
address for each machine in the cluster.
excludeRanges
Optional. List of IP ranges in the 192.168.0.51-192.168.0.128 format.
IP addresses from these ranges will not be allocated as permanent
addresses of machines in this cluster.
nameservers
List of IP addresses of DNS servers that should be configured on machines.
These servers must be accessible through the gateway from the
provided VLAN. Required unless a proxy server is used.
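Pulling the fields above together, the following is a hypothetical sketch of the network-related values. Only the field names and value formats described above are taken from this guide; the parent key and the example addresses are assumptions to adapt to your cluster.yaml.template:
network:
  vlanId: "1234"
  loadBalancerHost: 192.168.0.5
  metallbRanges:
  - 192.168.0.129-192.168.0.200
  cidr: 192.168.0.0/24
  gateway: 192.168.0.1
  dhcpRanges:
  - 192.168.0.10-192.168.0.50
  includeRanges:
  - 192.168.0.51-192.168.0.128
  # excludeRanges is optional and omitted here
  nameservers:
  - 8.8.8.8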
Add the following parameters to the bootstrap.env file:
Parameter
Description
KAAS_BM_PXE_BRIDGE
Name of the bridge that will be used to provide PXE services to provision
machines during bootstrap.
KAAS_BM_PXE_IP
IP address that will be used for PXE services. Will be assigned to the
KAAS_BM_PXE_BRIDGE bridge. Must be part of the cidr parameter.
KAAS_BM_PXE_MASK
Number of bits in the network address KAAS_BM_PXE_IP.
Must match the CIDR suffix in the cidr parameter.
BOOTSTRAP_METALLB_ADDRESS_POOL
IP range in the 192.168.0.129-192.168.0.200 format that will be
used for Kubernetes LoadBalancer services in the bootstrap cluster.
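An illustrative bootstrap.env fragment using the parameter names above; the bridge name and addresses are placeholders consistent with the 192.168.0.0/24 examples in this section:
KAAS_BM_PXE_BRIDGE=br0
KAAS_BM_PXE_IP=192.168.0.6
KAAS_BM_PXE_MASK=24
BOOTSTRAP_METALLB_ADDRESS_POOL=192.168.0.129-192.168.0.200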
Optional. In templates/equinixmetalv2/machines.yaml.template,
modify the default configuration of the Equinix Metal machine type.
The minimal required type is c3.small.x86.
Warning
Mirantis highly recommends using the c3.small.x86 machine
type for the control plane machines deployed with private network
to prevent hardware issues with incorrect BIOS boot order.
providerSpec:
  value:
    # ...
    machineType: c3.small.x86
Also, modify other parameters as required.
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the VLAN where the regional cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/equinixmetalv2/cluster.yaml.template, add the ntp:servers
section with the list of the required server names:
If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the regional cluster using proxy:
Substitute the parameters enclosed in angle brackets with the corresponding
values of your cluster.
Caution
The REGION and REGIONAL_CLUSTER_NAME parameters values
must contain only lowercase alphanumeric characters, hyphens,
or periods.
Re-verify that the selected Equinix Metal facility for the regional
cluster bootstrap is still available and has enough capacity:
metal capacity check --facility $EQUINIX_FACILITY --plan $EQUINIX_MACHINE_TYPE --quantity $MACHINES_AMOUNT
In the system response, if the value in the AVAILABILITY section
has changed from true to false, find an available facility and
update the previously configured facility field in
cluster.yaml.template.
When the bootstrap is complete, obtain and save in a secure
location the kubeconfig-<regionalClusterName> file
located in the same directory as the bootstrap script.
This file contains the admin credentials for the regional cluster.
The workflow of the regional cluster bootstrap script
1. Prepare the bootstrap cluster for the new regional cluster.
2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.
3. Connect to each machine of the management cluster through SSH.
4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.
5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.
6. Forward the bootstrap cluster endpoint to helm-controller.
7. Wait for all CRDs to be available and verify the objects created using these CRDs.
8. Pivot the cluster API stack to the regional cluster.
9. Switch the LCM agent from the bootstrap cluster to the regional one.
10. Wait for the Container Cloud components to start on the regional cluster.
Establish connection to the cluster private network:
To reduce network traffic costs and avoid complicating the
network infrastructure, you must deploy managed clusters in the same
region as the regional cluster so that both clusters are located in the same
metro.
For example, if you have a management cluster with region-one in
Frankfurt and a regional cluster with region-two in Silicon Valley,
create all Frankfurt-based managed clusters in region-one and all Silicon
Valley based managed clusters in region-two.
You can deploy an additional regional baremetal-based cluster
to create managed clusters of several provider types or with different
configurations within a single Container Cloud deployment.
To deploy a baremetal-based regional cluster:
Log in to the node where you bootstrapped the Container Cloud management
cluster.
Verify that the bootstrap directory is updated.
Select from the following options:
For clusters deployed using Container Cloud 2.11.0 or later:
For clusters deployed using the Container Cloud release earlier than 2.11.0
or if you deleted the kaas-bootstrap folder, download and run
the Container Cloud bootstrap script:
Prepare the bare metal configuration for the new regional cluster:
Create a virtual bridge to connect to your PXE network on the
seed node. Use the following netplan-based configuration file
as an example:
# cat /etc/netplan/config.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    ens3:
      dhcp4: false
      dhcp6: false
  bridges:
    br0:
      addresses:
        # Please, adjust for your environment
        - 10.0.0.15/24
      dhcp4: false
      dhcp6: false
      # Please, adjust for your environment
      gateway4: 10.0.0.1
      interfaces:
        # Interface name may be different in your environment
        - ens3
      nameservers:
        addresses:
          # Please, adjust for your environment
          - 8.8.8.8
      parameters:
        forward-delay: 4
        stp: false
Apply the new network configuration using netplan:
sudo netplan apply
Verify the new network configuration:
sudo brctl show
Example of system response:
bridge name bridge id STP enabled interfaces
br0 8000.fa163e72f146 no ens3
Verify that the interface connected to the PXE network
belongs to the previously configured bridge.
Install the current Docker version available for Ubuntu 20.04:
sudo apt install docker.io
Grant your USER access to the Docker daemon:
sudo usermod -aG docker $USER
Log out and log in again to the seed node to apply the changes.
Verify that Docker is configured correctly and has access
to Container Cloud CDN. For example:
docker run --rm alpine sh -c "apk add --no-cache curl; \
curl https://binary.mirantis.com"
The system output must contain a JSON response with no error messages.
In case of errors, follow the steps provided in Troubleshooting.
Note
If you require all Internet access to go through a proxy server
for security and audit purposes, configure Docker proxy settings
as described in the official
Docker documentation.
Verify that the seed node has direct access to the Baseboard Management
Controller (BMC) of each baremetal host. All target hardware nodes must
be in the poweroff state.
For example, using the IPMI tool:
ipmitool -I lanplus -H 'IPMI IP' -U 'IPMI Login' -P 'IPMI password' \
chassis power status
Example of system response:
Chassis Power is off
Prepare the deployment configuration files that contain the cluster
and machines metadata, including Ceph configuration:
Create a copy of the current templates directory for future reference.
Update the cluster definition template in
templates/bm/cluster.yaml.template
according to the environment configuration. Use the table below.
Manually set all parameters that start with SET_. For example,
SET_METALLB_ADDR_POOL.
The IP address of the externally accessible API endpoint
of the cluster. This address must NOT be
within the SET_METALLB_ADDR_POOL range but must be within
the PXE/Management network. External load balancers are not supported.
10.0.0.90
SET_METALLB_ADDR_POOL
The IP range to be used as external load balancers for the Kubernetes
services with the LoadBalancer type. This range must be within
the PXE/Management network. The minimum required range is 19 IP addresses.
The dnsmasq configuration options dhcp-option=3 and dhcp-option=6
are absent in the default configuration. So, by default, dnsmasq
will send the DNS server and default route to DHCP clients as defined in the
dnsmasq official documentation:
The netmask and broadcast address are the same as on the host
running dnsmasq.
The DNS server and default route are set to the address of the host
running dnsmasq.
If the domain name option is set, this name is sent to DHCP clients.
If such behavior is not desirable during the cluster deployment,
add the corresponding DHCP options, such as a specific gateway address
and DNS addresses, using the dnsmasq.dnsmasq_extra_opts parameter
for the baremetal-operator release in
templates/bm/cluster.yaml.template:
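A hypothetical sketch of the override, assuming the options are passed as a list under the dnsmasq.dnsmasq_extra_opts path of the baremetal-operator release values mentioned above; the nesting and the gateway and DNS addresses are assumptions to adapt:
spec:
  providerSpec:
    value:
      kaas:
        regional:
        - provider: baremetal
          helmReleases:
          - name: baremetal-operator
            values:
              dnsmasq:
                dnsmasq_extra_opts:
                # Default gateway to send to DHCP clients
                - dhcp-option=3,10.0.0.1
                # DNS server to send to DHCP clients
                - dhcp-option=6,8.8.8.8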
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where your cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/bm/cluster.yaml.template, add the ntp:servers section
with the list of the required server names:
Inspect the default bare metal host profile definition in
templates/bm/baremetalhostprofiles.yaml.template.
If your hardware configuration differs from the reference,
adjust the default profile to match. For details, see
Customize the default bare metal host profile.
Warning
All data will be wiped during cluster deployment on devices
defined directly or indirectly in the fileSystems list of
BareMetalHostProfile. For example:
A raw device partition with a file system on it
A device partition in a volume group with a logical volume that has a
file system on it
An mdadm RAID device with a file system on it
An LVM RAID device with a file system on it
The wipe field is always considered true for these devices.
The false value is ignored.
Therefore, to prevent data loss, move the necessary data from these file
systems to another server beforehand, if required.
Update the bare metal hosts definition template in
templates/bm/baremetalhosts.yaml.template
according to the environment configuration. Use the table below.
Manually set all parameters that start with SET_.
The IPMI user name in the base64 encoding to access the BMC.
dXNlcg== (base64 encoded user)
SET_MACHINE_0_IPMI_PASSWORD
The IPMI password in the base64 encoding to access the BMC.
cGFzc3dvcmQ= (base64 encoded password)
SET_MACHINE_0_MAC
The MAC address of the first master node in the PXE network.
ac:1f:6b:02:84:71
SET_MACHINE_0_BMC_ADDRESS
The IP address of the BMC endpoint for the first master node in
the cluster. Must be an address from the OOB network
that is accessible through the PXE network default gateway.
192.168.100.11
SET_MACHINE_1_IPMI_USERNAME
The IPMI user name in the base64 encoding to access the BMC.
dXNlcg== (base64 encoded user)
SET_MACHINE_1_IPMI_PASSWORD
The IPMI password in the base64 encoding to access the BMC.
cGFzc3dvcmQ= (base64 encoded password)
SET_MACHINE_1_MAC
The MAC address of the second master node in the PXE network.
ac:1f:6b:02:84:72
SET_MACHINE_1_BMC_ADDRESS
The IP address of the BMC endpoint for the second master node in
the cluster. Must be an address from the OOB network
that is accessible through the PXE network default gateway.
192.168.100.12
SET_MACHINE_2_IPMI_USERNAME
The IPMI user name in the base64 encoding to access the BMC.
dXNlcg== (base64 encoded user)
SET_MACHINE_2_IPMI_PASSWORD
The IPMI password in the base64 encoding to access the BMC.
cGFzc3dvcmQ= (base64 encoded password)
SET_MACHINE_2_MAC
The MAC address of the third master node in the PXE network.
ac:1f:6b:02:84:73
SET_MACHINE_2_BMC_ADDRESS
The IP address of the BMC endpoint for the third master node in
the cluster. Must be an address from the OOB network
that is accessible through the PXE network default gateway.
You can obtain the base64-encoded user name and password using
the following command in your Linux console:
$ echo -n <username|password> | base64
Update the Subnet objects definition template in
templates/bm/ipam-objects.yaml.template
according to the environment configuration. Use the table below.
Manually set all parameters that start with SET_.
For example, SET_IPAM_POOL_RANGE.
The IP address of the externally accessible API endpoint
of the cluster. This address must NOT be
within the SET_METALLB_ADDR_POOL range but must be within the
PXE/Management network. External load balancers are not supported.
The IP address range to be used as external load balancers for the
Kubernetes services with the LoadBalancer type. This range must
be within the PXE/Management network. The minimum required range is
19 IP addresses.
Use the same value that you used for this parameter in the
cluster.yaml.template file (see above).
Optional. To configure the separated PXE and management networks instead of
one PXE/management network, proceed to Separate PXE and management networks.
Optional. To connect the cluster hosts to the PXE/Management
network using bond interfaces, proceed to Configure NIC bonding.
If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables to bootstrap
the cluster using proxy:
Optional. Technology Preview. Configure Ceph
controller to manage Ceph nodes resources. In
templates/bm/cluster.yaml.template, in the ceph-controller
section of spec.providerSpec.value.helmReleases, specify the
hyperconverge parameter with required resource requests, limits, or
tolerations:
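A hypothetical sketch of the hyperconverge section under the ceph-controller Helm release is shown below. The keys under hyperconverge are assumptions for illustration only, so verify the exact structure for your release; the resource values are placeholders:
spec:
  providerSpec:
    value:
      helmReleases:
      - name: ceph-controller
        values:
          hyperconverge:
            # Assumed structure for resource requests and limits
            resources:
              osd:
                requests:
                  cpu: "2"
                  memory: 4Gi
                limits:
                  cpu: "4"
                  memory: 8Gi
            # Tolerations can be added here as well; the structure depends on your release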
Set up the disk configuration according to your hardware node
specification. Verify that the storageDevices section
has a valid list of HDD, SSD, or NVME device names and each
device is empty, that is, no file system is present on it.
...
# This part of KaaSCephCluster should contain a valid networks definition
network:
  clusterNet: 10.10.10.0/24
  publicNet: 10.10.11.0/24
...
nodes:
  master-0:
    ...
  <node_name>:
    ...
    # This part of KaaSCephCluster should contain valid device names
    storageDevices:
    - name: sdb
      config:
        deviceClass: hdd
    # Each storageDevices dict can have several devices
    - name: sdc
      config:
        deviceClass: hdd
    # All devices for Ceph should also be described to ``wipe`` in
    # ``baremetalhosts.yaml.template``
    - name: sdd
      config:
        deviceClass: hdd
    # Do not include the first devices here (like vda or sda)
    # because they will be allocated for the operating system
In machines.yaml.template, verify that the metadata:name
structure matches the machine names in the spec:nodes
structure of kaascephcluster.yaml.template.
Verify that the kaas-bootstrap directory contains the following files:
The provisioning IP address. This address will be assigned to the
interface of the seed node defined by the KAAS_BM_PXE_BRIDGE
parameter (see below). The PXE service of the bootstrap cluster will
use this address to network boot the bare metal hosts for the
cluster.
10.0.0.20
KAAS_BM_PXE_MASK
The CIDR prefix for the PXE network. It will be used with KAAS_BM_PXE_IP
address when assigning it to network interface.
24
KAAS_BM_PXE_BRIDGE
The PXE network bridge name. The name must match the name
of the bridge created on the seed node during the
Prepare the seed node stage.
br0
KAAS_BM_BM_DHCP_RANGE
The start_ip and end_ip addresses must be within the PXE network.
This range will be used by dnsmasq to provide IP addresses for nodes
during provisioning.
10.0.0.30,10.0.0.49,255.255.255.0
BOOTSTRAP_METALLB_ADDRESS_POOL
The pool of IP addresses that will be used by services
in the bootstrap cluster. Can be the same as the
SET_METALLB_ADDR_POOL range for the cluster, or a different range.
10.0.0.61-10.0.0.80
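Putting the example values above together, an illustrative bootstrap.env fragment; adapt the addresses to your environment:
KAAS_BM_PXE_IP=10.0.0.20
KAAS_BM_PXE_MASK=24
KAAS_BM_PXE_BRIDGE=br0
KAAS_BM_BM_DHCP_RANGE=10.0.0.30,10.0.0.49,255.255.255.0
BOOTSTRAP_METALLB_ADDRESS_POOL=10.0.0.61-10.0.0.80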
Run the verification preflight script to validate the deployment
templates configuration:
./bootstrap.sh preflight
The command outputs a human-readable report with the verification details.
The report includes the list of verified bare metal nodes and their
Chassis Power status.
This status is based on the deployment templates configuration used
during the verification.
Caution
If the report contains information about missing dependencies
or incorrect configuration, fix the issues before proceeding
to the next step.
Verify that the following provider selection parameters are unset:
Substitute the parameters enclosed in angle brackets with the corresponding
values of your cluster.
Caution
The REGION and REGIONAL_CLUSTER_NAME parameters values
must contain only lowercase alphanumeric characters, hyphens,
or periods.
Note
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, also
export SSH_KEY_NAME. It is required for the management
cluster to create a publicKey Kubernetes CRD with the
public part of your newly generated ssh_key
for the regional cluster.
export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
Run the regional cluster bootstrap script:
./bootstrap.sh deploy_regional
Note
When the bootstrap is complete, obtain and save in a secure location
the kubeconfig-<regionalClusterName> file
located in the same directory as the bootstrap script.
This file contains the admin credentials for the regional cluster.
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, a new
regional ssh_key will be generated.
Make sure to save this key in a secure location as well.
The workflow of the regional cluster bootstrap script
1. Prepare the bootstrap cluster for the new regional cluster.
2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.
3. Connect to each machine of the management cluster through SSH.
4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.
5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.
6. Forward the bootstrap cluster endpoint to helm-controller.
7. Wait for all CRDs to be available and verify the objects created using these CRDs.
8. Pivot the cluster API stack to the regional cluster.
9. Switch the LCM agent from the bootstrap cluster to the regional one.
10. Wait for the Container Cloud components to start on the regional cluster.
You can deploy an additional regional VMware vSphere-based cluster
to create managed clusters of several provider types or with different
configurations.
To deploy a vSphere-based regional cluster:
Log in to the node where you bootstrapped a management
cluster.
Verify that the bootstrap directory is updated.
Select from the following options:
For clusters deployed using Container Cloud 2.11.0 or later:
For clusters deployed using the Container Cloud release earlier than 2.11.0
or if you deleted the kaas-bootstrap folder, download and run
the Container Cloud bootstrap script:
Port of the vCenter Server. For example, port:"8443".
Leave empty to use 443 by default.
SET_VSPHERE_DATACENTER
vSphere data center name.
SET_VSPHERE_SERVER_INSECURE
Flag that controls validation of the vSphere Server certificate.
Must be true or false.
SET_VSPHERE_CAPI_PROVIDER_USERNAME
vSphere Cluster API provider user name that you added when
preparing the deployment user setup and permissions.
SET_VSPHERE_CAPI_PROVIDER_PASSWORD
vSphere Cluster API provider user password.
SET_VSPHERE_CLOUD_PROVIDER_USERNAME
vSphere Cloud Provider deployment user name that you added when
preparing the deployment user setup and permissions.
SET_VSPHERE_CLOUD_PROVIDER_PASSWORD
vSphere Cloud Provider deployment user password.
Modify the templates/vsphere/cluster.yaml.template parameters
to fit your deployment. For example, add the corresponding values
for cidrBlocks in the spec::clusterNetwork::services section.
Provide the following additional parameters for a proper network setup
on machines using embedded IP address management (IPAM)
in templates/vsphere/cluster.yaml.template
Enables IPAM. Set to true for networks without DHCP.
SET_VSPHERE_NETWORK_CIDR
CIDR of the provided vSphere network. For example, 10.20.0.0/16.
SET_VSPHERE_NETWORK_GATEWAY
Gateway of the provided vSphere network.
SET_VSPHERE_CIDR_INCLUDE_RANGES
Optional. IP range for the cluster machines. Specify a range within the provided CIDR.
For example, 10.20.0.100-10.20.0.200.
SET_VSPHERE_CIDR_EXCLUDE_RANGES
Optional. IP ranges to be excluded from being assigned to the cluster
machines. The MetalLB range and SET_LB_HOST should not intersect with
the addresses for IPAM. For example, 10.20.0.150-10.20.0.170.
SET_VSPHERE_NETWORK_NAMESERVERS
List of nameservers for the provided vSphere network.
For RHEL deployments, fill out
templates/vsphere/rhellicenses.yaml.template
using one of the following sets of parameters for the RHEL machines subscription:
The user name and password of your RedHat Customer Portal account
associated with your RHEL license for Virtual Datacenters.
Optionally, provide the subscription allocation pools to use for the RHEL
subscriptions activation. If not needed, remove the poolIDs field
for subscription-manager to automatically select the licenses for
machines.
The activation key and organization ID associated with your RedHat
account with RHEL license for Virtual Datacenters. The activation key can
be created by the organization administrator on RedHat Customer Portal.
If you use the RedHat Satellite server for management of your
RHEL infrastructure, you can provide a pre-generated activation key from
that server. In this case:
Provide the URL to the RedHat Satellite RPM for installation
of the CA certificate that belongs to that server.
Configure squid-proxy on the management or regional cluster to allow
access to your Satellite server. For details, see Configure squid-proxy.
For RHEL 8.4 TechPreview, verify mirrors
configuration for your activation key. For more details,
see RHEL 8 mirrors configuration.
Caution
Provide only one set of parameters.
Mixing of parameters from different activation methods
will cause deployment failure.
For CentOS deployments, in templates/vsphere/rhellicenses.yaml.template,
remove all lines under items:.
Optional if servers from the Ubuntu NTP pool (*.ubuntu.pool.ntp.org)
are accessible from the node where the regional cluster is being
provisioned. Otherwise, this step is mandatory.
Configure the regional NTP server parameters to be applied to all machines
of regional and managed clusters in the specified region.
In templates/vsphere/cluster.yaml.template, add the ntp:servers
section with the list of the required server names:
The <rhel-license-name> value is the RHEL license name defined
in rhellicenses.yaml.template; it defaults to
kaas-mgmt-rhel-license.
Remove or comment out this parameter for CentOS deployments.
Optional. If you require all Internet access to go through a proxy server,
in bootstrap.env, add the following environment variables
to bootstrap the regional cluster using proxy:
Substitute the parameters enclosed in angle brackets with the corresponding
values of your cluster.
Caution
The REGION and REGIONAL_CLUSTER_NAME parameters values
must contain only lowercase alphanumeric characters, hyphens,
or periods.
Note
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, also
export SSH_KEY_NAME. It is required for the management
cluster to create a publicKey Kubernetes CRD with the
public part of your newly generated ssh_key
for the regional cluster.
export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
Run the regional cluster bootstrap script:
./bootstrap.sh deploy_regional
Note
When the bootstrap is complete, obtain and save in a secure location
the kubeconfig-<regionalClusterName> file
located in the same directory as the bootstrap script.
This file contains the admin credentials for the regional cluster.
If the bootstrap node for the regional cluster deployment is not
the same where you bootstrapped the management cluster, a new
regional ssh_key will be generated.
Make sure to save this key in a secure location as well.
The workflow of the regional cluster bootstrap script
1. Prepare the bootstrap cluster for the new regional cluster.
2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.
3. Connect to each machine of the management cluster through SSH.
4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.
5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.
6. Forward the bootstrap cluster endpoint to helm-controller.
7. Wait for all CRDs to be available and verify the objects created using these CRDs.
8. Pivot the cluster API stack to the regional cluster.
9. Switch the LCM agent from the bootstrap cluster to the regional one.
10. Wait for the Container Cloud components to start on the regional cluster.
Create initial users after a management cluster bootstrap
Once you bootstrap your management or regional cluster,
create Keycloak users for access to the Container Cloud web UI.
Use the created credentials to log in to the Container Cloud web UI.
Mirantis recommends creating at least two users, user and operator,
that are required for a typical Container Cloud deployment.
To create the user for access to the Container Cloud web UI, use the following
command:
./container-cloud bootstrap user add --username <userName> --roles <roleName> \
--kubeconfig <pathToMgmtKubeconfig>
Note
You will be asked for the user password interactively.
Set the following command flags as required:
Flag
Description
--username
Required. Name of the user to create.
--roles
Required. Role to assign to the user:
If you run the command without the --namespace flag,
you can assign the following roles:
global-admin - read and write access for global role bindings
writer - read and write access
reader - view access
operator - required for bare metal deployments only
to create and manage the BaremetalHost objects
If you run the command for a specific project using the --namespace
flag, you can assign the following roles:
operator or writer - read and write access
user or reader - view access
bm-pool-operator - required for bare metal deployments only
to create and manage the BaremetalHost objects
--kubeconfig
Required. Path to the management cluster kubeconfig generated during
the management cluster bootstrap.
--namespace
Optional. Name of the Container Cloud project where the user will be
created. If not set, a global user will be created for all Container
Cloud projects with the corresponding role access to view or manage
all Container Cloud public objects.
--password-stdin
Optional. Flag to provide the user password from a file or stdin:
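For example, a hedged usage sketch that pipes the password from a file; the file name, user name, and role are illustrative:
cat password.txt | ./container-cloud bootstrap user add --username operator --roles global-admin \
--kubeconfig <pathToMgmtKubeconfig> --password-stdin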
/objects/cluster - logs of the non-namespaced Kubernetes objects
/objects/namespaced - logs of the namespaced Kubernetes objects
/objects/namespaced/<namespaceName>/core/pods
- pod logs from a specified Kubernetes namespace
/objects/namespaced/<namespaceName>/core/pods/<containerName>.prev.log
- logs of the pods from a specified Kubernetes namespace
that were previously removed or failed
/objects/namespaced/<namespaceName>/core/pods/<ironicPodName>/syslog.log
- Technology Preview. Ironic pod logs of the bare metal clusters
Note
Logs collected by the syslog container during the bootstrap phase
are not transferred to the management cluster during pivoting.
These logs are located in
/volume/log/ironic/ansible_conductor.log inside the Ironic
pod.
Depending on the type of issue found in logs, apply the corresponding fixes.
For example, if you detect the LoadBalancer ERROR state errors
during the bootstrap of an OpenStack-based management cluster,
contact your system administrator to fix the issue.
To troubleshoot other issues, refer to the corresponding section
in Troubleshooting.
If you have issues related to the default network address configuration,
cURL either hangs or the following error occurs:
curl: (7) Failed to connect to xxx.xxx.xxx.xxx port xxxx: Host is unreachable
The issue may occur because the default Docker network address
172.17.0.0/16 and/or the Docker network used by kind
overlap with your cloud address or other addresses
of the network configuration.
Workaround:
Log in to your local machine.
Verify routing to the IP addresses of the target cloud endpoints:
Obtain the IP address of your target cloud. For example: