Mirantis Container Cloud Documentation

The documentation is intended to help operators understand the core concepts of the product.

The information provided in this documentation set is constantly improved and amended based on the feedback and requests from our software consumers. This documentation set describes the features that are supported within the two latest Container Cloud minor releases, with a corresponding Available since release note.

The following table lists the guides included in the documentation set you are reading:

Guides list

Guide

Purpose

Reference Architecture

Learn the fundamentals of Container Cloud reference architecture to plan your deployment.

Deployment Guide

Deploy Container Cloud in a preferred configuration using supported deployment profiles tailored to the demands of specific business cases.

Operations Guide

Deploy and operate the Container Cloud managed clusters.

Release compatibility matrix

Deployment compatibility of the Container Cloud component versions for each product release.

Release Notes

Learn about new features and bug fixes in the current Container Cloud version as well as in the Container Cloud minor releases.

QuickStart Guides

Easy and lightweight instructions to get started with Container Cloud.

Intended audience

This documentation assumes that the reader is familiar with network and cloud concepts and is intended for the following users:

  • Infrastructure Operator

    • Is a member of the IT operations team

    • Has working knowledge of Linux, virtualization, Kubernetes API and CLI, and OpenStack to support the application development team

    • Accesses Mirantis Container Cloud and Kubernetes through a local machine or web UI

    • Provides verified artifacts through a central repository to the Tenant DevOps engineers

  • Tenant DevOps engineer

    • Is a member of the application development team and reports to a line of business (LOB)

    • Has working knowledge of Linux, virtualization, Kubernetes API and CLI to support application owners

    • Accesses Container Cloud and Kubernetes through a local machine or web UI

    • Consumes artifacts from a central repository approved by the Infrastructure Operator

Conventions

This documentation set uses the following conventions in the HTML format:

Documentation conventions

Convention

Description

boldface font

Inline CLI tools and commands, titles of the procedures and system response examples, table titles.

monospaced font

File names and paths, Helm chart parameters and their values, package names, node names and labels, and so on.

italic font

Information that distinguishes some concept or term.

Links

External links and cross-references, footnotes.

Main menu > menu item

GUI elements, including any part of the interactive user interface and menu navigation.

Superscript

Some extra, brief information. For example, if a feature is available from a specific release or if a feature is in the Technology Preview development stage.

Note

The Note block

Messages of a generic meaning that may be useful to the user.

Caution

The Caution block

Information that helps the user avoid mistakes and undesirable consequences when following the procedures.

Warning

The Warning block

Messages with details that can be easily missed but should not be ignored, as they are valuable before proceeding.

See also

The See also block

A list of references that may be helpful for understanding related tools, concepts, and so on.

Learn more

The Learn more block

Used in the Release Notes to wrap a list of internal references to the reference architecture, deployment and operation procedures specific to a newly implemented product feature.

Technology Preview support scope

This documentation set includes descriptions of the Technology Preview features. A Technology Preview feature provides early access to upcoming product innovations, allowing customers to experience the functionality and provide feedback during the development process. Technology Preview features may be privately or publicly available, but neither is intended for production use. While Mirantis will provide support for such features through official channels, normal Service Level Agreements do not apply. Customers may be supported by Mirantis Customer Support or Mirantis Field Support.

As Mirantis considers making future iterations of Technology Preview features generally available, we will attempt to resolve any issues that customers experience when using these features.

During the development of a Technology Preview feature, additional components may become available to the public for testing. Because Technology Preview features are still under development, Mirantis cannot guarantee their stability. As a result, if you are using Technology Preview features, you may not be able to seamlessly upgrade to subsequent releases of that feature. Mirantis makes no guarantees that Technology Preview features will be graduated to a generally available product release.

The Mirantis Customer Success Organization may create bug reports on behalf of support cases filed by customers. These bug reports will then be forwarded to the Mirantis Product team for possible inclusion in a future release.

Documentation history

The documentation set refers to Mirantis Container Cloud GA as the latest released GA version of the product. For details about the Container Cloud GA minor release dates, refer to Container Cloud releases.

Product Overview

Mirantis Container Cloud enables you to ship code faster by enabling speed with choice, simplicity, and security. Through a single pane of glass, you can deploy, manage, and observe Kubernetes clusters on public clouds, private clouds, or bare metal infrastructure. Mirantis Container Cloud provides the ability to leverage multiple on-premises (VMware, OpenStack, and bare metal) and public cloud (AWS, Azure, Equinix Metal) infrastructures.

The list of the most common use cases includes:

Multi-cloud

Organizations are increasingly moving toward a multi-cloud strategy, with the goal of enabling the effective placement of workloads over multiple platform providers. Multi-cloud strategies can introduce a lot of complexity and management overhead. Mirantis Container Cloud enables you to effectively deploy and manage container clusters (Kubernetes and Swarm) across multiple cloud provider platforms, both on premises and in the cloud.

Hybrid cloud

The challenges of consistently deploying, tracking, and managing hybrid workloads across multiple cloud platforms are compounded by not having a single point that provides information on all available resources. Mirantis Container Cloud enables hybrid cloud workloads by providing a central point of management and visibility of all your cloud resources.

Kubernetes cluster lifecycle management

The consistent lifecycle management of a single Kubernetes cluster is a complex task on its own that is made infinitely more difficult when you have to manage multiple clusters across different platforms spread across the globe. Mirantis Container Cloud provides a single, centralized point from which you can perform full lifecycle management of your container clusters, including automated updates and upgrades. We also support attaching existing Mirantis Kubernetes Engine clusters.

Highly regulated industries

Regulated industries need fine-grained access control, high security standards, and extensive reporting capabilities to ensure that they can meet and exceed security requirements. Mirantis Container Cloud provides for a fine-grained Role-Based Access Control (RBAC) mechanism and easy integration and federation with existing identity management (IDM) systems.

Logging, monitoring, alerting

Complete operational visibility is required to identify and address issues in the shortest amount of time, before a problem becomes serious. Mirantis StackLight is the proactive monitoring, logging, and alerting solution designed for large-scale container and cloud observability with extensive collectors, dashboards, trend reporting, and alerts.

Storage

Cloud environments require a unified pool of storage that can be scaled up by simply adding storage server nodes. Ceph is a unified, distributed storage system designed for excellent performance, reliability, and scalability. Deploy Ceph utilizing Rook to provide and manage a robust persistent storage that can be used by Kubernetes workloads on the bare metal and Equinix Metal based clusters.

Security

Security is a core concern for all enterprises, especially as exposing systems to the Internet becomes the norm. Mirantis Container Cloud provides for a multi-layered security approach that includes effective identity management and role-based authentication, secure out-of-the-box defaults, and extensive security scanning and monitoring during the development process.

5G and Edge

The introduction of 5G technologies and the support of Edge workloads require an effective multi-tenant solution to manage the underlying container infrastructure. Mirantis Container Cloud provides for a full-stack, secure, multi-cloud cluster management and Day-2 operations solution that supports both on-premises bare metal and cloud deployments.

Reference Architecture

Overview

Mirantis Container Cloud is a set of microservices that are deployed using Helm charts and run in a Kubernetes cluster. Container Cloud is based on the Kubernetes Cluster API community initiative.

The following diagram illustrates an overview of Container Cloud and the clusters it manages:

_images/cluster-overview.png

All artifacts used by Kubernetes and workloads are stored on the Container Cloud content delivery network (CDN):

  • mirror.mirantis.com (Debian packages including the Ubuntu mirrors)

  • binary.mirantis.com (Helm charts and binary artifacts)

  • mirantis.azurecr.io (Docker image registry)

All Container Cloud components are deployed in the Kubernetes clusters. All Container Cloud APIs are implemented using Kubernetes Custom Resource Definitions (CRDs) that represent custom objects stored in Kubernetes and allow you to expand the Kubernetes API.

The Container Cloud logic is implemented using controllers. A controller handles the changes in the custom resources defined in the controller CRD. A custom resource consists of a spec that describes the desired state of a resource provided by a user. On every change, a controller reconciles the external state of a custom resource with the user parameters and stores this external state in the status subresource of its custom resource.
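
The following schematic example illustrates this pattern. The API group, kind, and fields are purely hypothetical and are shown only to demonstrate how the spec carries the desired state while the controller records the observed state in status:

    apiVersion: example.mirantis.com/v1alpha1   # hypothetical API group, for illustration only
    kind: ExampleResource
    metadata:
      name: demo
      namespace: my-project
    spec:                    # desired state provided by the user
      replicas: 3
    status:                  # external state observed and stored by the controller
      readyReplicas: 3
      phase: Ready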

Container Cloud regions

Container Cloud can have several regions. A region is a physical location, for example, a data center, that has access to one or several cloud provider back ends. A separate regional cluster manages a region that can include multiple providers. A region must have two-way (full) network connectivity between a regional cluster and a cloud provider back end. For example, an OpenStack VM must have access to the related regional cluster, and this regional cluster must have access to the OpenStack floating IPs and load balancers.

The following diagram illustrates the structure of the Container Cloud regions:

_images/regions.png
Container Cloud cluster types

The types of the Container Cloud clusters include:

Bootstrap cluster
  • Runs the bootstrap process on a seed node. For the OpenStack, AWS, Equinix Metal, Microsoft Azure, or VMware vSphere-based Container Cloud, it can be an operator desktop computer. For the baremetal-based Container Cloud, this is the first temporary data center node.

  • Requires access to a provider back end: OpenStack, AWS, Azure, vSphere, Equinix Metal, or bare metal.

  • Contains a minimum set of services required to deploy the management and regional clusters.

  • Is destroyed completely after a successful bootstrap.

Management and regional clusters
  • Management cluster:

    • Runs all public APIs and services including the web UIs of Container Cloud.

    • Does not require access to any provider back end.

  • Regional cluster:

    • Is combined with the management cluster by default.

    • Runs the provider-specific services and internal API including LCMMachine and LCMCluster. Also, it runs an LCM controller for orchestrating managed clusters and other controllers for handling different resources.

    • Requires two-way access to a provider back end. The provider connects to a back end to spawn managed cluster nodes, and the agent running on the nodes accesses the regional cluster to obtain the deployment information.

    • Requires access to a management cluster to obtain user parameters.

    • Supports multi-regional deployments. For example, you can deploy an AWS-based management cluster with AWS-based and OpenStack-based regional clusters.

Management and regional clusters comprise Container Cloud as a product. For deployment details, see the Deployment Guide and the Deploy an additional regional cluster (optional) section for the required cloud provider.

Managed cluster
  • A Mirantis Kubernetes Engine (MKE) cluster that an end user creates using the Container Cloud web UI.

  • Requires access to a regional cluster. Each node of a managed cluster runs an LCM agent that connects to the LCM machine of the regional cluster to obtain the deployment details.

  • An attached MKE cluster that is not created using Container Cloud. In this case, the nodes of the attached cluster do not contain the LCM agent. For the supported MKE versions that can be attached to Container Cloud, see Release compatibility matrix.

  • Baremetal-based managed clusters support the Mirantis OpenStack for Kubernetes (MOS) product. For details, see MOS documentation.

All types of the Container Cloud clusters except the bootstrap cluster are based on the MKE and Mirantis Container Runtime (MCR) architecture. For details, see MKE and MCR documentation.

The following diagram illustrates the distribution of services between each type of the Container Cloud clusters:

_images/cluster-types.png

Cloud provider

The Mirantis Container Cloud provider is the central component of Container Cloud that provisions a node of a management, regional, or managed cluster and runs the LCM agent on this node. It runs in the management and regional clusters and requires a connection to a provider back end.

The Container Cloud provider interacts with the following types of public API objects:

Public API object name

Description

Container Cloud release object

Contains the following information about clusters:

  • Version of the supported Cluster release for the management and regional clusters

  • List of supported Cluster releases for the managed clusters and supported upgrade path

  • Description of Helm charts that are installed on the management and regional clusters depending on the selected provider

Cluster release object

  • Provides a specific version of a management, regional, or managed cluster. A Cluster release object, as well as a Container Cloud release object, never changes; only new releases can be added. Any change leads to a new release of a cluster.

  • Contains references to all components and their versions that are used to deploy all cluster types:

    • LCM components:

      • LCM agent

      • Ansible playbooks

      • Scripts

      • Description of steps to execute during a cluster deployment and upgrade

      • Helm controller image references

    • Supported Helm charts description:

      • Helm chart name and version

      • Helm release name

      • Helm values

Cluster object

  • References the Credentials, KaaSRelease, and ClusterRelease objects. See the illustrative example after this table.

  • Is tied to a specific Container Cloud region and provider.

  • Represents all cluster-level resources. For example, for the OpenStack-based clusters, it represents networks, load balancer for the Kubernetes API, and so on. It uses data from the Credentials object to create these resources and data from the KaaSRelease and ClusterRelease objects to ensure that all lower-level cluster objects are created.

Machine object

  • References the Cluster object.

  • Represents one node of a managed cluster, for example, an OpenStack VM, and contains all data to provision it.

Credentials object

  • Contains all information necessary to connect to a provider back end.

  • Is tied to a specific Container Cloud region and provider.

PublicKey object

Is provided to every machine to obtain SSH access.
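
The following heavily simplified sketch illustrates how a Cluster object (described above) might reference the related objects. The labels and the fields under providerSpec are assumptions for illustration and do not reproduce the exact Container Cloud schema, which differs between providers and releases:

    # Illustrative only: labels and providerSpec fields are assumptions, not the exact schema.
    apiVersion: cluster.k8s.io/v1alpha1
    kind: Cluster
    metadata:
      name: managed-cluster
      namespace: my-project
      labels:
        kaas.mirantis.com/provider: openstack    # binding to a provider (assumed label)
        kaas.mirantis.com/region: region-one     # binding to a region (assumed label)
    spec:
      providerSpec:
        value:
          credentials: cloud-config              # reference to a Credentials object (assumed field)
          release: example-cluster-release       # reference to a ClusterRelease object (assumed field)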

The following diagram illustrates the Container Cloud provider data flow:

_images/provider-dataflow.png

The Container Cloud provider performs the following operations in Container Cloud:

  • Consumes the below types of data from a management and regional cluster:

    • Credentials to connect to a provider back end

    • Deployment instructions from the KaaSRelease and ClusterRelease objects

    • The cluster-level parameters from the Cluster objects

    • The machine-level parameters from the Machine objects

  • Prepares data for all Container Cloud components:

    • Creates the LCMCluster and LCMMachine custom resources for LCM controller and LCM agent. The LCMMachine custom resources are created empty to be later handled by the LCM controller.

    • Creates the HelmBundle custom resources for the Helm controller using data from the KaaSRelease and ClusterRelease objects, as shown in the sketch after this list.

    • Creates service accounts for these custom resources.

    • Creates a scope in Identity and access management (IAM) for user access to a managed cluster.

  • Provisions nodes for a managed cluster using the cloud-init script that downloads and runs the LCM agent.
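
As referenced above, the provider generates HelmBundle custom resources from the release objects. The rough sketch below illustrates what such an object might look like. The apiVersion follows the lcm.mirantis.com group used by the LCM resources, but the release name, chart URL, and field layout are placeholders and assumptions rather than the exact schema:

    # Illustrative only: release name, chart URL, and values are placeholders.
    apiVersion: lcm.mirantis.com/v1alpha1
    kind: HelmBundle
    metadata:
      name: managed-cluster              # matches the LCMCluster name of the target cluster
      namespace: my-project
    spec:
      releases:                          # one entry per Helm release (assumed layout)
        - name: example-addon
          chartURL: https://binary.mirantis.com/core/helm/example-addon-0.1.0.tgz   # placeholder
          namespace: example-addon
          values:
            enabled: true                # placeholder values derived from the release objects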

Release controller

The Mirantis Container Cloud release controller is responsible for the following functionality:

  • Monitor and control the KaaSRelease and ClusterRelease objects present in a management cluster. If any release object is used in a cluster, the release controller prevents the deletion of such an object.

  • Sync the KaaSRelease and ClusterRelease objects published at https://binary.mirantis.com/releases/ with an existing management cluster.

  • Trigger the Container Cloud auto-upgrade procedure if a new KaaSRelease object is found:

    1. Search for the managed clusters with old Cluster releases that are not supported by a new Container Cloud release. If any are detected, abort the auto-upgrade and display a corresponding note about an old Cluster release in the Container Cloud web UI for the managed clusters. In this case, a user must update all managed clusters using the Container Cloud web UI. Once all managed clusters are upgraded to the Cluster releases supported by a new Container Cloud release, the Container Cloud auto-upgrade is retriggered by the release controller.

    2. Trigger the Container Cloud release upgrade of all Container Cloud components in a management cluster. The upgrade itself is processed by the Container Cloud provider.

    3. Trigger the Cluster release upgrade of a management cluster to the Cluster release version that is indicated in the upgraded Container Cloud release version. The LCMCluster components, such as MKE, are upgraded before the HelmBundle components, such as StackLight or Ceph.

    4. Verify the regional cluster(s) status. If the regional cluster is ready, trigger the Cluster release upgrade of the regional cluster.

      Once a management cluster is upgraded, an option to update a managed cluster becomes available in the Container Cloud web UI. During a managed cluster update, all cluster components including Kubernetes are automatically upgraded to newer versions if available. The LCMCluster components, such as MKE, are upgraded before the HelmBundle components, such as StackLight or Ceph.

Container Cloud remains operational during the management and regional clusters upgrade. Managed clusters are not affected during this upgrade. For the list of components that are updated during the Container Cloud upgrade, see the Components versions section of the corresponding Container Cloud release in Release Notes.

When Mirantis announces support of the newest versions of Mirantis Container Runtime (MCR) and Mirantis Kubernetes Engine (MKE), Container Cloud automatically upgrades these components as well. For the maintenance window best practices before upgrade of these components, see MKE Documentation.

Web UI

The Mirantis Container Cloud web UI is mainly designed to create and update the managed clusters as well as add or remove machines to or from an existing managed cluster. It also allows attaching existing Mirantis Kubernetes Engine (MKE) clusters.

You can use the Container Cloud web UI to obtain the management cluster details including endpoints, release version, and so on. The management cluster update occurs automatically with a new release change log available through the Container Cloud web UI.

The Container Cloud web UI is a JavaScript application based on the React framework. It is designed to work on the client side only and therefore does not require a dedicated back end. It interacts with the Kubernetes and Keycloak APIs directly. The Container Cloud web UI uses a Keycloak token to interact with the Container Cloud API and to download kubeconfig for the management and managed clusters.

The Container Cloud web UI uses NGINX that runs on a management cluster and handles the Container Cloud web UI static files. NGINX proxies the Kubernetes and Keycloak APIs for the Container Cloud web UI.

Bare metal

The bare metal service provides for the discovery, deployment, and management of bare metal hosts.

The bare metal management in Mirantis Container Cloud is implemented as a set of modular microservices. Each microservice implements a certain requirement or function within the bare metal management system.

Bare metal components

The bare metal management solution for Mirantis Container Cloud includes the following components:

Bare metal components

Component

Description

OpenStack Ironic

The back-end bare metal manager in a standalone mode with its auxiliary services that include httpd, dnsmasq, and mariadb.

OpenStack Ironic Inspector

Introspects and discovers the bare metal hosts inventory. Includes OpenStack Ironic Python Agent (IPA) that is used as a provision-time agent for managing bare metal hosts.

Ironic Operator

Monitors changes in the external IP addresses of httpd, ironic, and ironic-inspector and automatically reconciles the configuration for dnsmasq, ironic, baremetal-provider, and baremetal-operator.

Bare Metal Operator

Manages bare metal hosts through the Ironic API. The Container Cloud bare-metal operator implementation is based on the Metal³ project.

cluster-api-provider-baremetal

The plugin for the Kubernetes Cluster API integrated with Container Cloud. Container Cloud uses the Metal³ implementation of cluster-api-provider-baremetal for the Cluster API.

LCM agent

Used to manage physical and logical storage, physical and logical networking, and the life cycle of bare metal machine resources.

Ceph

Distributed shared storage is required by the Container Cloud services to create persistent volumes to store their data.

MetalLB

Load balancer for Kubernetes services on bare metal. 1

NGINX

Load balancer for external access to the Kubernetes API endpoint.

Keepalived

Monitoring service that ensures availability of the virtual IP for the external load balancer endpoint (NGINX). 1

IPAM

IP address management services provide consistent IP address space to the machines in bare metal clusters. See details in IP Address Management.

1 For details, see Built-in load balancing.

The diagram below summarizes the following components and resource kinds:

  • Metal³-based bare metal management in Container Cloud (white)

  • Internal APIs (yellow)

  • External dependency components (blue)

_images/bm-component-stack.png
Bare metal networking

This section provides an overview of the networking configuration and the IP address management in the Mirantis Container Cloud on bare metal.

IP Address Management

Mirantis Container Cloud on bare metal uses IP Address Management (IPAM) to keep track of the network addresses allocated to bare metal hosts. This is necessary to avoid IP address conflicts and expiration of address leases to machines through DHCP.

IPAM is provided by the kaas-ipam controller. Its functions include:

  • Allocation of IP address ranges or subnets to newly created clusters using SubnetPool and Subnet resources.

  • Allocation of IP addresses to machines and cluster services at the request of baremetal-provider using the IpamHost and IPaddr resources.

  • Creation and maintenance of host networking configuration on the bare metal hosts using the IpamHost resources.

The IPAM service can support different networking topologies and network hardware configurations on the bare metal hosts.

In the most basic network configuration, IPAM uses a single L3 network to assign addresses to all bare metal hosts, as defined in Managed cluster networking.
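
For example, such a single network can be described with a Subnet resource similar to the following sketch. The field names follow the common kaas-ipam pattern (cidr, gateway, includeRanges, excludeRanges, nameservers) but are given here as assumptions and may differ between releases:

    # Illustrative only: addresses and field names are example assumptions.
    apiVersion: ipam.mirantis.com/v1alpha1
    kind: Subnet
    metadata:
      name: managed-cluster-lcm
      namespace: my-project
    spec:
      cidr: 10.0.11.0/24
      gateway: 10.0.11.1
      nameservers:
        - 172.18.176.6
      includeRanges:
        - 10.0.11.100-10.0.11.200   # addresses handed out to cluster machines
      excludeRanges:
        - 10.0.11.150-10.0.11.160   # addresses reserved for other purposes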

You can apply complex networking configurations to a bare metal host using the L2 templates. The L2 templates imply multihomed host networking and enable you to create a managed cluster where nodes use separate host networks for different types of traffic. Multihoming is required to ensure the security and performance of a managed cluster.
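
The following heavily trimmed sketch shows the general idea of an L2 template: physical interfaces are mapped to roles, and a netplan-style template renders the per-host configuration with separate bridges per traffic type. The field names, interface names, and template macros are assumptions based on the typical kaas-ipam layout and may not match your release:

    # Illustrative only: field names, interface names, and template macros are assumptions.
    apiVersion: ipam.mirantis.com/v1alpha1
    kind: L2Template
    metadata:
      name: multihomed-workers
      namespace: my-project
    spec:
      ifMapping:                # assumed: maps template NIC indexes to physical interfaces
        - enp9s0f0
        - enp9s0f1
      npTemplate: |             # assumed: netplan-style template rendered for each host
        version: 2
        ethernets:
          {{nic 0}}:
            dhcp4: false
          {{nic 1}}:
            dhcp4: false
        bridges:
          k8s-lcm:              # LCM traffic
            interfaces: [{{nic 0}}]
          k8s-pods:             # Kubernetes workloads traffic
            interfaces: [{{nic 1}}]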

Management cluster networking

The main purpose of networking in a Container Cloud management or regional cluster is to provide access to the Container Cloud Management API that consists of the Kubernetes API of the Container Cloud management and regional clusters and the Container Cloud LCM API. This API allows end users to provision and configure managed clusters and machines. Also, this API is used by LCM agents in managed clusters to obtain configuration and report status.

The following types of networks are supported for the management and regional clusters in Container Cloud:

  • PXE/Management network

    Enables PXE boot of all bare metal machines in the Container Cloud region. Connects LCM agents running on the hosts to the Container Cloud LCM API. Serves the external connections to the Container Cloud Management API. In management and regional clusters, this network also serves storage traffic for the built-in Ceph cluster.

    • PXE subnet

      Provides IP addresses for DHCP and network boot of the bare metal hosts for initial inspection and operating system provisioning. This network does not require a default gateway or a router connected to it. The PXE subnet is defined by the Container Cloud Operator during bootstrap.

    • LCM subnet

      Provides IP addresses for the Kubernetes nodes in the management cluster. This network also provides a Virtual IP (VIP) address for the load balancer that enables external access to the Kubernetes API of a management cluster. This VIP is also the endpoint to access the Container Cloud Management API in the management cluster.

      Provides IP addresses for the services of Container Cloud, such as bare metal provisioning service (Ironic). These addresses are allocated and served by MetalLB.

  • Kubernetes workloads network

    Technology Preview

    Serves the internal traffic between workloads on the management cluster.

    • Kubernetes workloads subnet

      Provides IP addresses that are assigned to nodes and used by Calico.

  • Out-of-Band (OOB) network

    Connects to Baseboard Management Controllers of the servers that host the management cluster. The OOB subnet must be accessible from the management network through IP routing. The OOB network is not managed by Container Cloud and is not represented in the IPAM API.

Managed cluster networking

Kubernetes cluster networking is typically focused on connecting pods on different nodes. On bare metal, however, cluster networking is more complex because it needs to facilitate many different types of traffic.

Kubernetes clusters managed by Mirantis Container Cloud use the following types of networks and traffic:

  • PXE/Lifecycle management (LCM) network

    Enables the PXE boot of all bare metal machines in Container Cloud. Connects LCM agents running on the hosts to the Container Cloud LCM API. The LCM API is provided by the regional or management cluster.

    • LCM subnet

      Provides IP addresses that are statically allocated by the IPAM service to bare metal hosts. This network must be connected to the Kubernetes API endpoint of the regional cluster through an IP router. LCM agents running on managed clusters will connect to the regional cluster API through this router. LCM subnets may be different per managed cluster as long as this connection requirement is satisfied. The Virtual IP (VIP) address for load balancer that enables access to the Kubernetes API of the managed cluster must be allocated from the LCM subnet.

  • Kubernetes workloads network

    Technology Preview

    Serves as an underlay network for traffic between pods in the managed cluster. This network should not be shared between clusters.

    • Kubernetes workloads subnet

      Provides IP addresses that are assigned to nodes and used by Calico.

  • Kubernetes external network

    Serves ingress traffic to the managed cluster from the outside world. This network can be shared between clusters, but must have a dedicated subnet per cluster.

    • Services subnet

      Technology Preview

      Provides IP addresses for externally available load-balanced services. The address ranges for MetalLB are assigned from this subnet. This subnet must be unique per managed cluster.

  • Storage network

    Serves storage access and replication traffic from and to Ceph OSD services. The storage network does not need to be connected to any IP routers and does not require external access, unless you want to use Ceph from outside of a Kubernetes cluster. To use a dedicated storage network, define and configure both subnets listed below.

    • Storage access subnet

      Provides IP addresses that are assigned to Ceph nodes. The Ceph OSD services bind to these addresses on their respective nodes. Serves Ceph access traffic from and to storage clients. This is a public network in Ceph terms. 1 This subnet is unique per managed cluster.

    • Storage replication subnet

      Provides IP addresses that are assigned to Ceph nodes. The Ceph OSD services bind to these addresses on their respective nodes. Serves Ceph internal replication traffic. This is a cluster network in Ceph terms. 1 This subnet is unique per managed cluster.

  • Out-of-Band (OOB) network

    Connects baseboard management controllers (BMCs) of the bare metal hosts. This network must not be accessible from the managed clusters.

The following diagram illustrates the networking schema of the Container Cloud deployment on bare metal with a managed cluster:

_images/bm-cluster-l3-networking-multihomed.png
1 For more details about Ceph networks, see Ceph Network Configuration Reference.

Host networking

The following network roles are defined for all Mirantis Container Cloud cluster nodes on bare metal, including the bootstrap, management, regional, and managed cluster nodes:

  • Out-of-band (OOB) network

    Connects the Baseboard Management Controllers (BMCs) of the hosts in the network to Ironic. This network is out of band for the host operating system.

  • PXE/LCM network

    Enables remote booting of servers through the PXE protocol. In management or regional clusters, DHCP server listens on this network for hosts discovery and inspection. In managed clusters, hosts use this network for the initial PXE boot and provisioning.

    Connects LCM agents running on the node to the LCM API of the management or regional cluster. In management or regional clusters, it is replaced by the management network.

  • Kubernetes workloads (pods) network

    Technology Preview

    Serves connections between Kubernetes pods. Each host has an address on this network, and this address is used by Calico as an endpoint to the underlay network.

  • Kubernetes external network

    Technology Preview

    Serves external connection to the Kubernetes API and the user services exposed by the cluster. In management or regional clusters, it is replaced by the management network.

  • Storage access (access) network

    Connects Ceph nodes to the storage clients. The Ceph OSD service is bound to the address on this network. In management or regional clusters, it is replaced by the management network.

  • Storage replication (cluster) network

    Connects Ceph nodes to each other. Serves internal replication traffic. In management or regional clusters, it is replaced by the management network.

Each network is represented on the host by a virtual Linux bridge. Physical interfaces may be connected to one of the bridges directly, or through a logical VLAN subinterface, or combined into a bond interface that is in turn connected to a bridge.
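
As an illustration only (this is not configuration generated by Container Cloud), a netplan-style host configuration with a bond uplink, a VLAN subinterface, and two of the bridges listed below might look as follows:

    # Illustrative netplan-style sketch: interface names, VLAN ID, and addresses are examples.
    network:
      version: 2
      ethernets:
        enp9s0f0: {}
        enp9s0f1: {}
      bonds:
        bond0:
          interfaces: [enp9s0f0, enp9s0f1]
          parameters:
            mode: 802.3ad
      vlans:
        bond0.403:
          id: 403
          link: bond0
      bridges:
        k8s-lcm:                          # PXE/LCM network, connected to the bond directly
          interfaces: [bond0]
          addresses: [10.0.11.15/24]
        k8s-pods:                         # Kubernetes workloads network on a VLAN subinterface
          interfaces: [bond0.403]
          addresses: [10.0.12.15/24]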

The following table summarizes the default names used for the bridges connected to the networks listed above:

Management or regional cluster

  Network type                            Bridge name    Assignment method (TechPreview)
  OOB network                             N/A            N/A
  PXE/LCM network                         k8s-lcm 0      By a static interface name
  Kubernetes workloads network            k8s-pods 0     By a static interface name

Managed cluster

  Network type                            Bridge name    Assignment method
  PXE/LCM network                         k8s-lcm 0      By a static interface name
  Kubernetes workloads network            k8s-pods 0     By a static interface name
  Kubernetes external network             k8s-ext        By the subnet label ipam/SVC-MetalLB
  Storage access (public) network         ceph-public    By the subnet label ipam/SVC-ceph-public
  Storage replication (cluster) network   ceph-cluster   By the subnet label ipam/SVC-ceph-cluster

0 Interface name for this network role is static and cannot be changed.

Extended hardware configuration

Mirantis Container Cloud provides APIs that enable you to define hardware configurations that extend the reference architecture:

  • Bare Metal Host Profile API

    Enables quick configuration of host boot and storage devices and assignment of custom configuration profiles to individual machines. See Create a custom bare metal host profile.

  • IP Address Management API

    Enables quick configuration of host network interfaces and IP addresses and setting up IP address ranges for automatic allocation. See Advanced networking configuration.

Typically, operations with the extended hardware configurations are available through the API and CLI, but not the web UI.

Built-in load balancing

The Mirantis Container Cloud managed clusters that are based on vSphere or bare metal use MetalLB for load balancing of services, and NGINX with a VIP managed by the Virtual Router Redundancy Protocol (VRRP) through Keepalived as the Kubernetes API load balancer.

Kubernetes API load balancing

Every control plane node of each Kubernetes cluster runs the kube-api service in a container. This service provides a Kubernetes API endpoint. Every control plane node also runs an nginx server that load balances across all kube-api endpoints, with back-end health checking.

The default load balancing method is least_conn. With this method, a request is sent to the server with the least number of active connections. The default load balancing method cannot be changed using the Container Cloud API.

Only one of the control plane nodes at any given time serves as a front end for Kubernetes API. To ensure this, the Kubernetes clients use a virtual IP (VIP) address for accessing Kubernetes API. This VIP is assigned to one node at a time using VRRP. The keepalived daemon running on each control plane node provides health checking and failover of the VIP.

The keepalived daemon is configured in multicast mode.

Note

The use of a VIP address for load balancing of the Kubernetes API requires that all control plane nodes of a Kubernetes cluster are connected to a shared L2 segment. This limitation prevents installing full L3 topologies where control plane nodes are split between different L2 segments and L3 networks.

Caution

External load balancers for services are not supported by the current version of the Container Cloud vSphere provider. The built-in load balancing described below is the only supported option and cannot be disabled.

Services load balancing

The services provided by the Kubernetes clusters, including Container Cloud and user services, are balanced by MetalLB. The metallb-speaker service runs on every worker node in the cluster and handles connections to the service IP addresses.

MetalLB runs in the MAC-based (L2) mode. This means that all nodes that run the metallb-speaker service must be connected to a shared L2 segment. This limitation does not allow installing full L3 cluster topologies.
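
For reference, a generic MetalLB layer 2 configuration in the upstream ConfigMap format looks similar to the sketch below. In Container Cloud, the address ranges are normally supplied through the product configuration rather than by editing MetalLB objects directly, so treat this only as an illustration of the L2 mode:

    # Generic upstream MetalLB L2 example; the address range is a placeholder.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: config
      namespace: metallb-system
    data:
      config: |
        address-pools:
          - name: services
            protocol: layer2
            addresses:
              - 10.0.13.100-10.0.13.120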

Caution

External load balancers for services are not supported by the current version of the Container Cloud vSphere provider. The built-in load balancing described above is the only supported option and cannot be disabled.

VMware vSphere network objects and IPAM recommendations

The VMware vSphere provider of Mirantis Container Cloud supports the following types of vSphere network objects:

  • Virtual network

    A network of virtual machines running on a hypervisor(s) that are logically connected to each other so that they can exchange data. Virtual machines can be connected to virtual networks that you create when you add a network.

  • Distributed port group

    A port group associated with a vSphere distributed switch that specifies port configuration options for each member port. Distributed port groups define how connection is established through the vSphere distributed switch to the network.

A Container Cloud cluster can be deployed using one of these network objects with or without a DHCP server in the network:

  • Non-DHCP

    Container Cloud uses the IPAM service to manage IP address assignment to machines. You must provide additional network parameters, such as CIDR, gateway, IP ranges, and nameservers. Container Cloud renders this data into the cloud-init metadata and passes it to machines during bootstrap.

  • DHCP

    Container Cloud relies on a DHCP server to assign IP addresses to virtual machines.

Mirantis recommends using the IP address management (IPAM) for cluster machines provided by Container Cloud. IPAM must be enabled for deployment in non-DHCP vSphere networks, and Mirantis recommends enabling it in DHCP-based networks as well. In this case, the dedicated IPAM range should not intersect with the IP range used in the DHCP server configuration for the provided vSphere network. Such a configuration prevents issues with accidental IP address changes for machines. For the issue details, see the vSphere known issue 14080.

The following parameters are required to enable IPAM:

  • Network CIDR.

  • Network gateway address.

  • Minimum 1 DNS server.

  • IP address include range to be allocated for cluster machines. Make sure that this range is not part of the DHCP range if the network has a DHCP server.

    Minimum number of addresses in the range:

    • 3 IPs for management or regional cluster

    • 3+N IPs for a managed cluster, where N is the number of worker nodes

  • Optional. IP address exclude range that is the list of IPs not to be assigned to machines from the include ranges.

A dedicated Container Cloud network must not contain any virtual machines with a keepalived instance running inside them, as this may lead to a vrouter_id conflict. By default, the Container Cloud management or regional cluster is deployed with vrouter_id set to 1. Managed clusters are deployed with vrouter_id values of 2 and higher.

Kubernetes lifecycle management

The Kubernetes lifecycle management (LCM) engine in Mirantis Container Cloud consists of the following components:

LCM controller

Responsible for all LCM operations. Consumes the LCMCluster object and orchestrates actions through LCM agent.

LCM agent

Relates only to Mirantis Kubernetes Engine (MKE) clusters deployed using Container Cloud, and is not used for attached MKE clusters. Runs on the target host. Executes Ansible playbooks in headless mode.

Helm controller

Responsible for the lifecycle of the Helm charts. It is installed by LCM controller and interacts with Tiller.

The Kubernetes LCM components handle the following custom resources:

  • LCMCluster

  • LCMMachine

  • HelmBundle

The following diagram illustrates handling of the LCM custom resources by the Kubernetes LCM components. On a managed cluster, apiserver handles multiple Kubernetes objects, for example, deployments, nodes, RBAC, and so on.

_images/lcm-components.png
LCM custom resources

The Kubernetes LCM components handle the following custom resources (CRs):

  • LCMMachine

  • LCMCluster

  • HelmBundle

LCMMachine

Describes a machine that is located in a cluster. It contains the machine type (control or worker) and StateItems that correspond to Ansible playbooks and miscellaneous actions, for example, downloading a file or executing a shell command. LCMMachine reflects the current state of the machine, for example, a node IP address, and the state of each StateItem through its status. Multiple LCMMachine CRs can correspond to a single cluster.

LCMCluster

Describes a managed cluster. In its spec, LCMCluster contains a set of StateItems for each type of LCMMachine, which describe the actions that must be performed to deploy the cluster. LCMCluster is created by the provider, using machineTypes of the Release object. The status field of LCMCluster reflects the status of the cluster, for example, the number of ready or requested nodes.

HelmBundle

Wrapper for Helm charts that is handled by Helm controller. HelmBundle tracks what Helm charts must be installed on a managed cluster.
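
To illustrate these resources, the following heavily simplified LCMMachine sketch shows how a machine relates to its LCMCluster. The fields under spec and status are assumptions for illustration and may not match the exact schema of your Container Cloud release:

    # Illustrative only: field names and values are simplified assumptions.
    apiVersion: lcm.mirantis.com/v1alpha1
    kind: LCMMachine
    metadata:
      name: managed-cluster-node-0
      namespace: my-project
    spec:
      clusterName: managed-cluster    # the LCMCluster this machine belongs to
      type: worker                    # control or worker
    status:
      hostname: node-0
      addresses:
        - type: InternalIP
          address: 10.0.11.115
      state: Ready                    # see the lifecycle states described in the LCM controller section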

LCM controller

LCM controller runs on the management and regional cluster and orchestrates the LCMMachine objects according to their type and their LCMCluster object.

Once the LCMCluster and LCMMachine objects are created, LCM controller starts monitoring them to modify the spec fields and update the status fields of the LCMMachine objects when required. The status field of LCMMachine is updated by LCM agent running on a node of a management, regional, or managed cluster.

Each LCMMachine has the following lifecycle states:

  1. Uninitialized - the machine is not yet assigned to an LCMCluster.

  2. Pending - the agent reports a node IP address and hostname.

  3. Prepare - the machine executes StateItems that correspond to the prepare phase. This phase usually involves downloading the necessary archives and packages.

  4. Deploy - the machine executes StateItems that correspond to the deploy phase, that is, becoming a Mirantis Kubernetes Engine (MKE) node.

  5. Ready - the machine is deployed.

  6. Upgrade - the machine is being upgraded to the new MKE version.

  7. Reconfigure - the machine is being updated with a new set of manager nodes. Once done, the machine moves to the ready state again.

The templates for StateItems are stored in the machineTypes field of an LCMCluster object, with separate lists for the MKE manager and worker nodes. Each StateItem has the execution phase field for a management, regional, and managed cluster:

  1. The prepare phase is executed for all machines for which it was not executed yet. This phase comprises downloading the files necessary for the cluster deployment, installing the required packages, and so on.

  2. During the deploy phase, a node is added to the cluster. LCM controller applies the deploy phase to the nodes in the following order:

    1. First manager node is deployed.

    2. The remaining manager nodes are deployed one by one and the worker nodes are deployed in batches (by default, up to 50 worker nodes at the same time). After at least one manager and one worker node are in the ready state, helm-controller is installed on the cluster.

LCM controller deploys and upgrades a Mirantis Container Cloud cluster by setting the StateItems of LCMMachine objects following the corresponding StateItem phases described above. The Container Cloud cluster upgrade process follows the same logic that is used for a new deployment, that is, applying a new set of StateItems to the LCMMachine objects after updating the LCMCluster object. However, during the upgrade, the following additional actions are performed:

  • If an existing worker node is being upgraded, LCM controller cordons and drains this node, honoring the Pod Disruption Budgets. This operation prevents unexpected disruptions of the workloads.

  • LCM controller verifies that the required version of helm-controller is installed.

LCM agent

LCM agent handles a single machine that belongs to a management, regional, or managed cluster. It runs on the machine operating system but communicates with apiserver of the regional cluster. LCM agent is deployed as a systemd unit using cloud-init. LCM agent has a built-in self-upgrade mechanism.

LCM agent monitors the spec of a particular LCMMachine object to reconcile the machine state with the object StateItems and update the LCMMachine status accordingly. The actions that LCM agent performs while handling the StateItems are as follows:

  • Download configuration files

  • Run shell commands

  • Run Ansible playbooks in headless mode

LCM agent provides the IP address and hostname of the machine for the LCMMachine status parameter.

Helm controller

Helm controller is used by Mirantis Container Cloud to handle the core addons of management, regional, and managed clusters, such as StackLight, as well as the application addons, such as the OpenStack components.

Helm controller runs in the same pod as the Tiller process. The Tiller gRPC endpoint is not accessible outside the pod. The pod is created by LCM controller using a StatefulSet inside a cluster once the cluster contains at least one manager and one worker node.

The Helm release information is stored in the KaaSRelease object for the management and regional clusters and in the ClusterRelease object for all types of the Container Cloud clusters. These objects are used by the Container Cloud provider. The Container Cloud provider uses the information from the ClusterRelease object together with the Container Cloud API Cluster spec. In Cluster spec, the operator can specify the Helm release name and charts to use. By combining the information from the Cluster providerSpec parameter and its ClusterRelease object, the cluster actuator generates the LCMCluster objects. These objects are further handled by LCM controller and the HelmBundle object handled by Helm controller. HelmBundle must have the same name as the LCMCluster object for the cluster that HelmBundle applies to.

Although a cluster actuator can only create a single HelmBundle per cluster, Helm controller can handle multiple HelmBundle objects per cluster.

Helm controller handles the HelmBundle objects and reconciles them with the Tiller state in its cluster.

Helm controller can also be used by the management cluster with corresponding HelmBundle objects created as part of the initial management cluster setup.

Identity and access management

Identity and access management (IAM) provides a central point of user and permission management for the Mirantis Container Cloud cluster resources in a granular and unified manner. Also, IAM provides the infrastructure for a single sign-on user experience across all Container Cloud web portals.

IAM for Container Cloud consists of the following components:

Keycloak
  • Provides the OpenID Connect endpoint

  • Integrates with an external identity provider (IdP), for example, existing LDAP or Google Open Authorization (OAuth)

  • Stores roles mapping for users

IAM controller
  • Provides IAM API with data about Container Cloud projects

  • Handles all role-based access control (RBAC) components in Kubernetes API

IAM API

Provides an abstraction API for creating user scopes and roles

IAM API and CLI

Mirantis IAM exposes the versioned and backward compatible Google remote procedure call (gRPC) protocol API to interact with IAM CLI.

IAM API is designed as a user-facing functionality. For this reason, it operates in the context of user authentication and authorization.

In IAM API, an operator can use the following entities:

  • Grants - to grant or revoke user access

  • Scopes - to describe user roles

  • Users - to provide user account information

Mirantis Container Cloud UI interacts with IAM API on behalf of the user. However, the user can directly work with IAM API using IAM CLI. IAM CLI uses the OpenID Connect (OIDC) endpoint to obtain the OIDC token for authentication in IAM API and enables you to perform different API operations.

The following diagram illustrates the interaction between IAM API and CLI:

_images/iam-api-cli.png

See also

IAM CLI

External identity provider integration

To keep the user database and user permissions consistent, IAM in Mirantis Container Cloud stores the user identity information internally. However, in real deployments, an identity provider usually already exists.

Out of the box, in Container Cloud, IAM supports integration with LDAP and Google Open Authorization (OAuth). If LDAP is configured as an external identity provider, IAM performs one-way synchronization by mapping attributes according to configuration.

In the case of the Google Open Authorization (OAuth) integration, the user is automatically registered and their credentials are stored in the internal database according to the user template configuration. The Google OAuth registration workflow is as follows:

  1. The user requests a Container Cloud web UI resource.

  2. The user is redirected to the IAM login page and logs in using the Log in with Google account option.

  3. IAM creates a new user with the default access rights that are defined in the user template configuration.

  4. The user can access the Container Cloud web UI resource.

The following diagram illustrates the external IdP integration to IAM:

_images/iam-ext-idp.png

You can configure simultaneous integration with both external IdPs with the user identity matching feature enabled.

Authentication and authorization

Mirantis IAM uses the OpenID Connect (OIDC) protocol for handling authentication.

Implementation flow

Mirantis IAM acts as an OpenID Connect (OIDC) provider: it issues tokens and exposes discovery endpoints.

The credentials can be handled by IAM itself or delegated to an external identity provider (IdP).

The issued JSON Web Token (JWT) is sufficient to perform operations across Mirantis Container Cloud according to the scope and role defined in it. Mirantis recommends using asymmetric cryptography for token signing (RS256) to minimize the dependency between IAM and managed components.

When Container Cloud calls Mirantis Kubernetes Engine (MKE), it authenticates with a JWT issued by Keycloak on behalf of the end user. MKE, in its turn, verifies whether the JWT is issued by Keycloak. If the user retrieved from the token does not exist in the MKE database, the user is automatically created in the MKE database based on the information from the token.

The authorization implementation is out of the scope of IAM in Container Cloud. This functionality is delegated to the component level. IAM passes the OIDC token content to a Container Cloud component, which processes it itself and enforces the required authorization. Such an approach enables any underlying authorization that is not dependent on IAM while still providing a unified user experience across all Container Cloud components.

Kubernetes CLI authentication flow

The following diagram illustrates the Kubernetes CLI authentication flow. The authentication flow for Helm and other Kubernetes-oriented CLI utilities is identical to the Kubernetes CLI flow, but JSON Web Tokens (JWT) must be pre-provisioned.

_images/iam-authn-k8s.png
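
For illustration, a kubeconfig user entry that authenticates through OIDC using the standard kubectl oidc auth provider might look like the sketch below. The issuer URL, client ID, and tokens are placeholders, and the kubeconfig generated by Container Cloud may use different names and mechanisms:

    # Illustrative kubeconfig fragment: all values are placeholders.
    users:
      - name: demo-user
        user:
          auth-provider:
            name: oidc
            config:
              idp-issuer-url: https://keycloak.example.com/auth/realms/iam
              client-id: kubernetes
              refresh-token: <refresh token obtained from Keycloak>
              id-token: <JWT issued by Keycloak>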

Storage

The baremetal-based or Equinix Metal based Mirantis Container Cloud uses Ceph as a distributed storage system for file, block, and object storage. This section provides an overview of a Ceph cluster deployed by Container Cloud.

Overview

Mirantis Container Cloud deploys Ceph on the baremetal-based management and managed clusters and on the Equinix Metal based managed clusters using Helm charts with the following components:

  • Ceph controller - a Kubernetes controller that obtains the parameters from Container Cloud through a custom resource (CR), creates CRs for Rook, and updates its CR status based on the Ceph cluster deployment progress. It creates users, pools, and keys for OpenStack and Kubernetes and provides Ceph configurations and keys to access them. Also, Ceph controller eventually obtains the data from the OpenStack Controller for the Keystone integration and updates the RADOS Gateway services configurations to use Kubernetes for user authentication.

  • Ceph operator

    • Transforms user parameters from the Container Cloud Ceph CR into Rook objects and deploys a Ceph cluster using Rook.

    • Provides integration of the Ceph cluster with Kubernetes

    • Provides data for OpenStack to integrate with the deployed Ceph cluster

  • Custom resource (CR) - represents the customization of a Kubernetes installation and allows you to define the required Ceph configuration through the Container Cloud web UI before deployment. For example, you can define the failure domain, pools, Ceph node roles, number of Ceph components such as Ceph OSDs, and so on.

  • Rook - a storage orchestrator that deploys Ceph on top of a Kubernetes cluster.

A typical Ceph cluster consists of the following components:

Ceph Monitors

Three or, in rare cases, five Ceph Monitors.

Ceph Managers

Mirantis recommends having three Ceph Managers in every cluster.

RADOS Gateway services

Mirantis recommends having three or more RADOS Gateway services for HA.

Ceph OSDs

The number of Ceph OSDs may vary according to the deployment needs.

Warning

  • A Ceph cluster with 3 Ceph nodes does not provide hardware fault tolerance and is not eligible for recovery operations, such as a disk or an entire Ceph node replacement.

  • A Ceph cluster uses the replication factor that equals 3. If the number of Ceph OSDs is less than 3, a Ceph cluster moves to the degraded state with the write operations restriction until the number of alive Ceph OSDs equals the replication factor again.

The placement of Ceph Monitors and Ceph Managers is defined in the custom resource.
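
A heavily trimmed sketch of such a custom resource is shown below to illustrate how Monitor and Manager placement and Ceph OSD devices might be described. Node names, device names, and the field layout are assumptions and do not reproduce the exact KaaSCephCluster schema:

    # Illustrative only: node names, devices, and field layout are assumptions.
    apiVersion: kaas.mirantis.com/v1alpha1
    kind: KaaSCephCluster
    metadata:
      name: ceph-managed-cluster
      namespace: my-project
    spec:
      cephClusterSpec:
        nodes:
          machine-0:
            roles: [mon, mgr]
          machine-1:
            roles: [mon, mgr]
          machine-2:
            roles: [mon, mgr]
          machine-3:
            storageDevices:
              - name: sdb
                config:
                  deviceClass: hdd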

The following diagram illustrates the way a Ceph cluster is deployed in Container Cloud:

_images/ceph-deployment.png

The following diagram illustrates the processes within a deployed Ceph cluster:

_images/ceph-data-flow.png
Limitations

A Ceph cluster configuration in Mirantis Container Cloud includes but is not limited to the following limitations:

  • Only one Ceph controller per management, regional, or managed cluster and only one Ceph cluster per Ceph controller are supported.

  • The replication size for any Ceph pool must be set to more than 1.

  • Only one CRUSH tree per cluster. The separation of devices per Ceph pool is supported through device classes with only one pool of each type for a device class.

  • All CRUSH rules must have the same failure_domain.

  • Only the following types of CRUSH buckets are supported:

    • topology.kubernetes.io/region

    • topology.kubernetes.io/zone

    • topology.rook.io/datacenter

    • topology.rook.io/room

    • topology.rook.io/pod

    • topology.rook.io/pdu

    • topology.rook.io/row

    • topology.rook.io/rack

    • topology.rook.io/chassis

  • Consuming an existing Ceph cluster is not supported.

  • CephFS is not supported.

  • Only IPv4 is supported.

  • If two or more Ceph OSDs are located on the same device, there must be no dedicated WAL or DB for this class.

  • Only a full collocation or dedicated WAL and DB configurations are supported.

  • The minimum size of any defined Ceph OSD device is 5 GB.

  • Reducing the number of Ceph Monitors is not supported and causes the removal of Ceph Monitor daemons from random nodes.

  • Removal of the mgr role in the nodes section of the KaaSCephCluster CR does not remove Ceph Managers. To remove a Ceph Manager from a node, remove it from the nodes spec and manually delete the mgr pod in the Rook namespace.

  • When adding a Ceph node with the Ceph Monitor role, if any issues occur with the Ceph Monitor, rook-ceph removes it and adds a new Ceph Monitor instead, named using the next alphabetic character in order. Therefore, the Ceph Monitor names may not follow the alphabetical order. For example, a, b, d, instead of a, b, c.

Monitoring

Mirantis Container Cloud uses StackLight, the logging, monitoring, and alerting solution that provides a single pane of glass for cloud maintenance and day-to-day operations. StackLight offers critical insights into cloud health, including operational information about the components deployed in management, regional, and managed clusters. StackLight is based on Prometheus, an open-source monitoring solution and a time series database.

Deployment architecture

Mirantis Container Cloud deploys the StackLight stack as a release of a Helm chart that contains the helm-controller and helmbundles.lcm.mirantis.com (HelmBundle) custom resources. The StackLight HelmBundle consists of a set of Helm charts with the StackLight components that include:

StackLight components overview

StackLight component

Description

Alerta

Receives, consolidates, and deduplicates the alerts sent by Alertmanager and visually represents them through a simple web UI. Using the Alerta web UI, you can view the most recent or watched alerts, and group and filter them.

Alertmanager

Handles the alerts sent by client applications such as Prometheus, deduplicates, groups, and routes alerts to receiver integrations. Using the Alertmanager web UI, you can view the most recent fired alerts, silence them, or view the Alertmanager configuration.

Elasticsearch curator

Maintains the data (indexes) in Elasticsearch by performing such operations as creating, closing, or opening an index as well as deleting a snapshot. Also, manages the data retention policy in Elasticsearch.

Elasticsearch exporter

The Prometheus exporter that gathers internal Elasticsearch metrics.

Grafana

Builds and visually represents metric graphs based on time series databases. Grafana supports querying of Prometheus using the PromQL language.

Database back ends

StackLight uses PostgreSQL for Alerta and Grafana. PostgreSQL reduces the data storage fragmentation while enabling high availability. High availability is achieved using Patroni, the PostgreSQL cluster manager that monitors for node failures and manages failover of the primary node. StackLight also uses Patroni to manage major version upgrades of PostgreSQL clusters, which allows leveraging the database engine functionality and improvements as they are introduced upstream in new releases, maintaining functional continuity without version lock-in.

Logging stack

Responsible for collecting, processing, and persisting logs and Kubernetes events. By default, when deploying through the Container Cloud web UI, only the metrics stack is enabled on managed clusters. To enable StackLight to gather managed cluster logs, enable the logging stack during deployment. On management clusters, the logging stack is enabled by default. The logging stack components include:

  • Elasticsearch, which stores logs and notifications.

  • Fluentd-elasticsearch, which collects logs, sends them to Elasticsearch, generates metrics based on analysis of incoming log entries, and exposes these metrics to Prometheus.

  • Kibana, which provides real-time visualization of the data stored in Elasticsearch and enables you to detect issues.

  • Metricbeat, which collects Kubernetes events and sends them to Elasticsearch for storage.

  • Prometheus-es-exporter, which presents the Elasticsearch data as Prometheus metrics by periodically sending configured queries to the Elasticsearch cluster and exposing the results to a scrapable HTTP endpoint like other Prometheus targets.

  • Optional. Cerebro, a web UI for managing the Elasticsearch cluster. Using the Cerebro web UI, you can get a detailed view on your Elasticsearch cluster and debug issues. Cerebro is disabled by default.

Note

The logging mechanism performance depends on the cluster log load. In case of a high load, you may need to increase the default resource requests and limits for fluentdElasticsearch. For details, see StackLight configuration parameters: Resource limits.
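For example, a minimal sketch of the corresponding StackLight values override; the resources.fluentdElasticsearch layout is an assumption to verify against StackLight configuration parameters: Resource limits:

  # Illustrative values fragment; adjust the numbers to your log load
  resources:
    fluentdElasticsearch:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        cpu: "1"
        memory: 2Gi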

Metric collector

Collects telemetry data (CPU or memory usage, number of active alerts, and so on) from Prometheus and sends the data to centralized cloud storage for further processing and analysis. Metric collector runs on the management cluster.

Prometheus

Gathers metrics. Automatically discovers and monitors the endpoints. Using the Prometheus web UI, you can view simple visualizations and debug. By default, the Prometheus database stores metrics of the past 15 days or up to 15 GB of data depending on the limit that is reached first.

Prometheus-es-exporter

Presents the Elasticsearch data as Prometheus metrics by periodically sending configured queries to the Elasticsearch cluster and exposing the results to a scrapable HTTP endpoint like other Prometheus targets.

Prometheus node exporter

Gathers hardware and operating system metrics exposed by the kernel.

Prometheus Relay

Adds a proxy layer to Prometheus to merge the results from underlay Prometheus servers to prevent gaps in case some data is missing on some servers. Is available only in the HA StackLight mode.

Pushgateway

Enables ephemeral and batch jobs to expose their metrics to Prometheus. Since these jobs may not exist long enough to be scraped, they can instead push their metrics to Pushgateway, which then exposes these metrics to Prometheus. Pushgateway is not an aggregator or a distributed counter but rather a metrics cache. The pushed metrics are exactly the same as scraped from a permanently running program.

Salesforce notifier

Enables sending Alertmanager notifications to Salesforce to allow creating Salesforce cases and closing them once the alerts are resolved. Disabled by default.

Salesforce reporter

Queries Prometheus for the data about the amount of vCPU, vRAM, and vStorage used and available, combines the data, and sends it to Salesforce daily. Mirantis uses the collected data for further analysis and reports to improve the quality of customer support. Disabled by default.

Telegraf

Collects metrics from the system. Telegraf is plugin-driven and has the concept of two distinct sets of plugins: input plugins collect metrics from the system, services, or third-party APIs; output plugins write and expose metrics to various destinations.

The Telegraf agents used in Container Cloud include:

  • telegraf-ds-smart monitors SMART disks, and runs on both management and managed clusters.

  • telegraf-ironic monitors Ironic on the baremetal-based management clusters. The ironic input plugin collects and processes data from the Ironic HTTP API, while the http_response input plugin checks Ironic HTTP API availability. Telegraf uses the prometheus output plugin to expose the collected data as a Prometheus target.

  • telegraf-docker-swarm gathers metrics from the Mirantis Container Runtime API about the Docker nodes, networks, and Swarm services. This is a Docker Telegraf input plugin with downstream additions.

Telemeter

Enables a multi-cluster view through a Grafana dashboard of the management cluster. Telemeter includes a Prometheus federation push server and clients to enable isolated Prometheus instances, which cannot be scraped from a central Prometheus instance, to push metrics to the central location.

The Telemeter services are distributed as follows:

  • Management cluster hosts the Telemeter server

  • Regional clusters host the Telemeter server and Telemeter client

  • Managed clusters host the Telemeter client

The metrics from managed clusters are aggregated on regional clusters. Then both regional and managed clusters metrics are sent from regional clusters to the management cluster.

Every Helm chart contains a default values.yaml file. These default values are partially overridden by custom values defined in the StackLight Helm chart.

Before deploying a management or managed cluster, you can select the HA or non-HA StackLight architecture type. The non-HA mode is set by default. The following table lists the differences between the HA and non-HA modes:

StackLight database modes

Non-HA StackLight mode (default)

HA StackLight mode

  • One Prometheus instance

  • One Elasticsearch instance

  • One PostgreSQL instance

One persistent volume is provided for storing data. In case of a service or node failure, a new pod is redeployed and the volume is reattached to provide the existing data. Such a setup has a reduced hardware footprint but provides lower performance.

  • Two Prometheus instances

  • Three Elasticsearch instances

  • Three PostgreSQL instances

Local Volume Provisioner is used to provide local host storage. In case of a service or node failure, the traffic is automatically redirected to any other running Prometheus or Elasticsearch server. For better performance, Mirantis recommends that you deploy StackLight in the HA mode.
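The architecture type is selected before deployment and maps to a single value in the StackLight Helm chart. A minimal sketch, assuming the highAvailabilityEnabled parameter name; verify it against the StackLight configuration reference:

  # Illustrative values fragment; the parameter name is an assumption
  highAvailabilityEnabled: true   # 2 Prometheus, 3 Elasticsearch, and 3 PostgreSQL instances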

Authentication flow

StackLight provides five web UIs including Prometheus, Alertmanager, Alerta, Kibana, and Grafana. Access to StackLight web UIs is protected by Keycloak-based Identity and access management (IAM). All web UIs except Alerta are exposed to IAM through the IAM proxy middleware. The Alerta configuration provides direct integration with IAM.

The following diagram illustrates accessing the IAM-proxied StackLight web UIs, for example, Prometheus web UI:

_images/sl-auth-iam-proxied.png

Authentication flow for the IAM-proxied StackLight web UIs:

  1. A user enters the public IP of a StackLight web UI, for example, Prometheus web UI.

  2. The public IP leads to IAM proxy, deployed as a Kubernetes LoadBalancer, which protects the Prometheus web UI.

  3. LoadBalancer routes the HTTP request to Kubernetes internal IAM proxy service endpoints, specified in the X-Forwarded-Proto or X-Forwarded-Host headers.

  4. The Keycloak login form opens (the login_url field in the IAM proxy configuration, which points to Keycloak realm) and the user enters the user name and password.

  5. Keycloak validates the user name and password.

  6. The user obtains access to the Prometheus web UI (the upstreams field in the IAM proxy configuration).

Note

  • The discovery URL is the URL of the IAM service.

  • The upstream URL is the hidden endpoint of a web UI (Prometheus web UI in the example above).
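A minimal sketch of how these fields could fit together in the IAM proxy configuration; only the login_url and upstreams field names come from the flow above, while the discovery URL key and all values are placeholders:

  # Illustrative IAM proxy configuration fragment; values are placeholders
  discovery_url: https://<keycloak-host>/auth/realms/iam     # URL of the IAM (Keycloak) service
  login_url: https://<keycloak-host>/auth/realms/iam         # points to the Keycloak realm login form
  upstreams:
  - https://<prometheus-service>:<port>                      # hidden endpoint of the proxied web UI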

The following diagram illustrates accessing the Alerta web UI:

_images/sl-authentication-direct.png

Authentication flow for the Alerta web UI:

  1. A user enters the public IP of the Alerta web UI.

  2. The public IP leads to Alerta deployed as a Kubernetes LoadBalancer type.

  3. LoadBalancer routes the HTTP request to the Kubernetes internal Alerta service endpoint.

  4. The Keycloak login form opens (Alerta refers to the IAM realm) and the user enters the user name and password.

  5. Keycloak validates the user name and password.

  6. The user obtains access to the Alerta web UI.

Supported features

Using the Mirantis Container Cloud web UI, on the pre-deployment stage of a managed cluster, you can view, enable or disable, or tune the following StackLight features:

  • StackLight HA mode.

  • Database retention size and time for Prometheus.

  • Tunable index retention period for Elasticsearch.

  • Tunable PersistentVolumeClaim (PVC) size for Prometheus and Elasticsearch, set by default to 16 GB for Prometheus and 30 GB for Elasticsearch. The PVC size must be logically aligned with the retention periods or sizes for these components (see the values sketch after this list).

  • Email and Slack receivers for the Alertmanager notifications.

  • Predefined set of dashboards.

  • Predefined set of alerts and capability to add new custom alerts for Prometheus in the following exemplary format:

    - alert: HighErrorRate
      expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
      for: 10m
      labels:
        severity: page
      annotations:
        summary: High request latency
    
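A minimal sketch of how several of these options might look in the StackLight Helm chart values; the parameter names are assumptions for illustration and must be verified against the StackLight configuration reference:

  # Illustrative values fragment; parameter names are assumptions
  prometheusServer:
    retentionTime: 15d                # metrics retention period
    retentionSize: 15GB               # metrics retention size
    persistentVolumeClaimSize: 16Gi   # default Prometheus PVC size
  elasticsearch:
    persistentVolumeClaimSize: 30Gi   # default Elasticsearch PVC size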
Monitored components

StackLight measures, analyzes, and promptly reports failures that may occur in the following Mirantis Container Cloud components and their sub-components, if any:

  • Ceph

  • Ironic (Container Cloud bare-metal provider)

  • Kubernetes services:

    • Calico

    • etcd

    • Kubernetes cluster

    • Kubernetes containers

    • Kubernetes deployments

    • Kubernetes nodes

  • NGINX

  • Node hardware and operating system

  • PostgreSQL

  • SMART disks

  • StackLight:

    • Alertmanager

    • Elasticsearch

    • Grafana

    • Prometheus

    • Prometheus Relay

    • Pushgateway

    • Salesforce notifier

    • Telemeter

  • SSL certificates

  • Mirantis Kubernetes Engine (MKE)

    • Docker/Swarm metrics (through Telegraf)

    • Built-in MKE metrics

Outbound cluster metrics

The data is collected and transmitted to Mirantis through an encrypted channel. It helps the Mirantis Customer Success Organization better understand the operational usage patterns that customers experience and provides product usage statistics that enable Mirantis product teams to enhance the products and services for customers.

The node-level resource data are broken down into three broad categories: Cluster, Node, and Namespace. The telemetry data tracks Allocatable, Capacity, Limits, Requests, and actual Usage of node-level resources.

Terms explanation

Term

Definition

Allocatable

On a Kubernetes Node, the amount of compute resources that are available for pods

Capacity

The total number of available resources regardless of current consumption

Limits

Constraints imposed by Administrators

Requests

The resources that a given container application is requesting

Usage

The actual usage or consumption of a given resource

The full list of the outbound data includes:

  • From all Container Cloud managed clusters:

    • If Ceph is enabled:

      • ceph_pool_available

      • ceph_pool_size

      • ceph_pool_used

    • cluster_filesystem_size_bytes

    • cluster_filesystem_usage_bytes

    • cluster_filesystem_usage_ratio

    • cluster_master_nodes_total

    • cluster_nodes_total

    • cluster_persistentvolumeclaim_requests_storage_bytes

    • cluster_total_alerts_triggered

    • cluster_capacity_cpu_cores

    • cluster_capacity_memory_bytes

    • cluster_usage_cpu_cores

    • cluster_usage_memory_bytes

    • cluster_usage_per_capacity_cpu_ratio

    • cluster_usage_per_capacity_memory_ratio

    • cluster_worker_nodes_total

    • kaas_info

    • kaas_clusters

    • kaas_machines_ready

    • kaas_machines_requested

    • kubernetes_api_availability

    • mke_api_availability

    • mke_cluster_nodes_total

    • mke_cluster_containers_total

    • mke_cluster_vcpu_free

    • mke_cluster_vcpu_used

    • mke_cluster_vram_free

    • mke_cluster_vram_used

    • mke_cluster_vstorage_free

    • mke_cluster_vstorage_used

  • From Mirantis OpenStack for Kubernetes (MOS) managed clusters only:

    • openstack_cinder_api_status

    • openstack_cinder_volumes_total

    • openstack_glance_api_status

    • openstack_glance_images_total

    • openstack_glance_snapshots_total

    • openstack_heat_stacks_total

    • openstack_host_aggregate_instances

    • openstack_host_aggregate_memory_used_ratio

    • openstack_host_aggregate_memory_utilisation_ratio

    • openstack_host_aggregate_cpu_utilisation_ratio

    • openstack_host_aggregate_vcpu_used_ratio

    • openstack_instance_create_end

    • openstack_instance_create_error

    • openstack_instance_create_start

    • openstack_keystone_api_status

    • openstack_keystone_tenants_total

    • openstack_keystone_users_total

    • openstack_kpi_provisioning

    • openstack_neutron_api_status

    • openstack_neutron_lbaas_loadbalancers_total

    • openstack_neutron_networks_total

    • openstack_neutron_ports_total

    • openstack_neutron_routers_total

    • openstack_neutron_subnets_total

    • openstack_nova_api_status

    • openstack_nova_computes_total

    • openstack_nova_disk_total_gb

    • openstack_nova_instances_active_total

    • openstack_nova_ram_total_gb

    • openstack_nova_used_disk_total_gb

    • openstack_nova_used_ram_total_gb

    • openstack_nova_used_vcpus_total

    • openstack_nova_vcpus_total

    • openstack_quota_instances

    • openstack_quota_ram_gb

    • openstack_quota_vcpus

    • openstack_quota_volume_storage_gb

    • openstack_usage_instances

    • openstack_usage_ram_gb

    • openstack_usage_vcpus

    • openstack_usage_volume_storage_gb

StackLight proxy

StackLight components, which require external access, automatically use the same proxy that is configured for Mirantis Container Cloud clusters. Therefore, you only need to configure proxy during deployment of your management, regional, or managed clusters. No additional actions are required to set up proxy for StackLight. For more details about implementation of proxy support in Container Cloud, see Proxy and cache support.

Note

Proxy handles only the HTTP and HTTPS traffic. Therefore, for clusters with limited or no Internet access, it is not possible to set up Alertmanager email notifications, which use SMTP, when proxy is used.

Proxy is used for the following StackLight components:

Component

Cluster type

Usage

Alertmanager

Any

As a default http_config for all HTTP-based receivers except the predefined HTTP-alerta and HTTP-salesforce. For these receivers, http_config is overridden on the receiver level.

Metric collector

Management

To send outbound cluster metrics to Mirantis.

Salesforce notifier

Any

To send notifications to the Salesforce instance.

Salesforce reporter

Any

To send metric reports to the Salesforce instance.

Telemeter client

Regional

To send all metrics from the clusters of a region, including the managed and regional clusters, to the management cluster. Proxy is not used for the Telemeter client on managed clusters because managed clusters must have a direct access to their regional cluster.

Hardware and system requirements

Using Mirantis Container Cloud, you can deploy a Mirantis Kubernetes Engine (MKE) cluster on bare metal, OpenStack, Microsoft Azure, VMware vSphere, Equinix Metal, or Amazon Web Services (AWS). Each cloud provider requires corresponding resources.

Note

Using the free Mirantis license, you can create up to three Container Cloud managed clusters with three worker nodes on each cluster. Within the same quota, you can also attach existing MKE clusters that are not deployed by Container Cloud. If you need to increase this quota, contact Mirantis support for further details.

Requirements for a bootstrap node

A bootstrap node is necessary only to deploy the management cluster. When the bootstrap is complete, the bootstrap node can be redeployed and its resources can be reused for the managed cluster workloads.

The minimum reference system requirements of a baremetal-based bootstrap seed node are described in System requirements for the seed node. The minimum reference system requirements for a bootstrap node for the other supported Container Cloud providers are as follows:

  • Any local machine running Ubuntu 16.04 or 18.04 that has access to the provider API and the following configuration:

    • 2 vCPUs

    • 4 GB of RAM

    • 5 GB of available storage

    • Docker version currently available for Ubuntu 18.04

  • Internet access to download all required artifacts

Note

For the vSphere cloud provider, you can use RHEL 7.9 as the operating system for the bootstrap node. The system requirements are the same as for Ubuntu.

Requirements for a baremetal-based cluster

If you use a firewall or proxy, make sure that the bootstrap, management, and regional clusters have access to the following IP ranges and domain names:

  • IP ranges:

  • Domain names:

    • mirror.mirantis.com and repos.mirantis.com for packages

    • binary.mirantis.com for binaries and Helm charts

    • mirantis.azurecr.io for Docker images

    • mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry (port 443 if proxy is enabled)

    • mirantis.my.salesforce.com for Salesforce alerts

Note

  • Access to Salesforce is required from any Container Cloud cluster type.

  • If any additional Alertmanager notification receiver is enabled, for example, Slack, its endpoint must also be accessible from the cluster.

Reference hardware configuration

The following hardware configuration is used as a reference to deploy Mirantis Container Cloud management and managed bare metal clusters with Mirantis Kubernetes Engine.

Reference hardware configuration for Container Cloud management and managed clusters on bare metal

Server role | Management cluster | Managed cluster

# of servers [0] | 3 [1] | 6 [2]

CPU sockets | 1 | 1

RAM, GB | 128 | 128

SSD system, GB [3] | 1x 960 | 1x 960

SSD/HDD storage, GB [4] | 2x 1900 | 2x 1900

Onboard LAN ports | 2 | 2

Discrete NICs | 2 | 2

Total LAN ports [5] | 6 | 6

[0] The Container Cloud reference architecture uses the following hardware models:

  • Server - Supermicro 1U SYS-6018R-TDW

  • CPU - Intel Xeon E5-2620v4

  • Discrete NICs - Intel X520-DA2

[1] Adding more than 3 nodes to a management or regional cluster is not supported.

[2] Three manager nodes for HA and three worker storage nodes for a minimal Ceph cluster. For more details about Ceph requirements, see Management cluster storage.

[3] A management cluster requires 2 volumes for Container Cloud (total 50 GB) and 5 volumes for StackLight (total 60 GB). A managed cluster requires 5 volumes for StackLight.

[4] In total, at least 3 disks are required:

  • sda - minimum 120 GB for the system

  • sdb - minimum 120 GB for LocalVolumeProvisioner

  • sdc - for the Ceph OSD

For the default storage schema, see Default configuration of the host system storage.

[5] Only one PXE port per node is allowed. The OOB management (IPMI) port is not included.

System requirements for the seed node

The seed node is necessary only to deploy the management cluster. When the bootstrap is complete, the seed node can be redeployed and its resources can be reused for the managed cluster workloads.

The minimum reference system requirements for a baremetal-based bootstrap seed node are as follows:

  • Basic server on Ubuntu 18.04 with the following configuration:

    • Kernel version 4.15.0-76.86 or later

    • 8 GB of RAM

    • 4 CPU

    • 10 GB of free disk space for the bootstrap cluster cache

  • No DHCP or TFTP servers on any network attached to the NICs

  • Routable access to the IPMI network of the hardware servers. For more details, see Host networking.

  • Internet access to download all required artifacts

Network fabric

The following diagram illustrates the physical and virtual L2 underlay networking schema for the final state of the Mirantis Container Cloud bare metal deployment.

_images/bm-cluster-physical-and-l2-networking.png

The network fabric reference configuration is a spine/leaf with 2 leaf ToR switches and one out-of-band (OOB) switch per rack.

Reference configuration uses the following switches for ToR and OOB:

  • Cisco WS-C3560E-24TD with 24 x 1 GbE ports, used in the OOB network segment.

  • Dell Force 10 S4810P with 48 x 1/10 GbE ports, used as ToR switches in the Common/PXE network segment.

In the reference configuration, all odd interfaces from NIC0 are connected to ToR Switch 1, and all even interfaces from NIC0 are connected to ToR Switch 2. The Baseboard Management Controller (BMC) interfaces of the servers are connected to OOB Switch 1.

Management cluster storage

The management cluster requires a minimum of three storage devices per node. Each device is used for a different type of storage.

  • The first device is always used for boot partitions and the root file system. SSD is recommended. RAID device is not supported.

  • One storage device per server is reserved for local persistent volumes. These volumes are served by the Local Storage Static Provisioner (local-volume-provisioner) and used by many services of Container Cloud.

  • At least one disk per server must be configured as a device managed by a Ceph OSD.

  • The recommended number of Ceph OSDs per management cluster node is 2, for a total of 6 OSDs. The recommended replication factor of 3 ensures that no data is lost if any single node of the management cluster fails.

You can configure host storage devices using the BareMetalHostProfile resources. For details, see Customize the default bare metal host profile.
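A heavily simplified sketch of the idea behind such a profile, mapping the three devices described above to their purposes; the field names below are assumptions for illustration only, and the actual schema is described in Customize the default bare metal host profile:

  # Illustrative only; not the exact BareMetalHostProfile schema
  spec:
    devices:
    - device:
        minSizeGiB: 120   # first device: boot partitions and the root file system (SSD recommended)
    - device:
        minSizeGiB: 120   # second device: reserved for local persistent volumes (local-volume-provisioner)
    - device:
        minSizeGiB: 120   # third device: left unpartitioned for a Ceph OSD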

Requirements for an OpenStack-based cluster

While planning the deployment of an OpenStack-based Mirantis Container Cloud cluster with Mirantis Kubernetes Engine (MKE), consider the following general requirements:

  • Kubernetes on OpenStack requires the availability of the Cinder and Octavia APIs.

  • The only supported OpenStack networking is Open vSwitch. Other networking technologies, such as Tungsten Fabric, are not supported.

For system requirements for a bootstrap node, see Requirements for a bootstrap node.

Note

Container Cloud is developed and tested on OpenStack Queens.

If you use a firewall or proxy, make sure that the bootstrap, management, and regional clusters have access to the following IP ranges and domain names:

  • IP ranges:

  • Domain names:

    • mirror.mirantis.com and repos.mirantis.com for packages

    • binary.mirantis.com for binaries and Helm charts

    • mirantis.azurecr.io for Docker images

    • mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry (port 443 if proxy is enabled)

    • mirantis.my.salesforce.com for Salesforce alerts

Note

  • Access to Salesforce is required from any Container Cloud cluster type.

  • If any additional Alertmanager notification receiver is enabled, for example, Slack, its endpoint must also be accessible from the cluster.

Requirements for an OpenStack-based Container Cloud cluster

Resource

Management or regional cluster

Managed cluster

Comments

# of nodes

3 (HA) + 1 (Bastion)

5 (6 with StackLight HA)

  • A bootstrap cluster requires access to the OpenStack API.

  • Each management or regional cluster requires 3 nodes for the manager nodes HA. Adding more than 3 nodes to a management or regional cluster is not supported.

  • A managed cluster requires 3 nodes for the manager nodes HA and 2 nodes for the Container Cloud workloads. If the multiserver mode is enabled for StackLight, 3 nodes are required for the Container Cloud workloads.

  • Each management or regional cluster requires 1 node for the Bastion instance that is created with a public IP address to allow SSH access to instances.

# of vCPUs per node

8

8

  • The Bastion node requires 1 vCPU.

  • Refer to the RAM recommendations described below to plan resources for different types of nodes.

RAM in GB per node

24

16

To prevent issues with low RAM, Mirantis recommends the following types of instances for a managed cluster with 50-200 nodes:

  • 16 vCPUs and 32 GB of RAM - manager node

  • 16 vCPUs and 128 GB of RAM - nodes where the StackLight server components run

The Bastion node requires 1 GB of RAM.

Storage in GB per node

120

120

For the Bastion node, the default amount of storage is enough.

Operating system

Ubuntu 18.04

Ubuntu 18.04

For management, regional, and managed clusters, a base Ubuntu 18.04 image must be present in Glance.

Docker version

-

-

For management, regional, and managed clusters, Mirantis Container Runtime 20.10.6 is deployed by Container Cloud as a CRI.

OpenStack version

Queens

Queens

Obligatory OpenStack components

Octavia, Cinder, OVS

Octavia, Cinder, OVS

# of Cinder volumes

7 (total 110 GB)

5 (total 60 GB)

  • Each management or regional cluster requires 2 volumes for Container Cloud (total 50 GB) and 5 volumes for StackLight (total 60 GB)

  • A managed cluster requires 5 volumes for StackLight

# of load balancers

10 (management) + 7 (regional)

6

  • LBs for a management cluster:

    • 1 for MKE

    • 1 for Container Cloud UI

    • 1 for Keycloak service

    • 1 for IAM service

    • 6 for StackLight

  • LBs for a regional cluster:

    • 1 for MKE

    • 6 for StackLight

  • LBs for a managed cluster:

    • 1 for MKE

    • 5 for StackLight with enabled logging (or 4 without logging)

# of floating IPs

11 (management) + 8 (regional)

11

  • FIPs for a management cluster:

    • 1 for MKE

    • 1 for Container Cloud UI

    • 1 for Keycloak service

    • 1 for IAM service

    • 1 for the Bastion node (or 3 without Bastion: one FIP per manager node)

    • 6 for StackLight

  • FIPs for a regional cluster:

    • 1 for MKE

    • 1 for the Bastion node (or 3 without Bastion)

    • 6 for StackLight

  • FIPs for a managed cluster:

    • 1 for MKE

    • 3 for the manager nodes

    • 2 for the worker nodes

    • 5 for StackLight with enabled logging (4 without logging)

Requirements for an AWS-based cluster

While planning the deployment of an AWS-based Mirantis Container Cloud cluster with Mirantis Kubernetes Engine, consider the requirements described below.

For system requirements for a bootstrap node, see Requirements for a bootstrap node.

Warning

Some of the AWS features required for Container Cloud may not be included in your AWS account quota. Therefore, carefully consider the AWS fees applied to your account that may increase for the Container Cloud infrastructure.

If you use a firewall or proxy, make sure that the bootstrap, management, and regional clusters have access to the following IP ranges and domain names:

  • IP ranges:

  • Domain names:

    • mirror.mirantis.com and repos.mirantis.com for packages

    • binary.mirantis.com for binaries and Helm charts

    • mirantis.azurecr.io for Docker images

    • mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry (port 443 if proxy is enabled)

    • mirantis.my.salesforce.com for Salesforce alerts

Note

  • Access to Salesforce is required from any Container Cloud cluster type.

  • If any additional Alertmanager notification receiver is enabled, for example, Slack, its endpoint must also be accessible from the cluster.

Note

If you want to deploy a managed cluster that is based on Equinix Metal on top of an AWS-based management cluster, see Requirements for an Equinix Metal based cluster.

Requirements for an AWS-based Container Cloud cluster

Resource

Management or regional cluster

Managed cluster

Comment

# of nodes

3 (HA)

5 (6 with StackLight HA)

  • A management cluster requires 3 nodes for the manager nodes HA. Adding more than 3 nodes to a management or regional cluster is not supported.

  • A managed cluster requires 3 nodes for the manager nodes HA and 2 nodes for the Container Cloud workloads. If the multiserver mode is enabled for StackLight, 3 nodes are required for the Container Cloud workloads.

# of vCPUs per node

8

8

RAM in GB per node

24

16

Storage in GB per node

120

120

Operating system

Ubuntu 18.04

Ubuntu 18.04

For a management and managed cluster, a base Ubuntu 18.04 image is required.

Docker version

-

-

For a management and managed cluster, Mirantis Container Runtime 20.10.6 is deployed by Container Cloud as a CRI.

Instance type

c5d.4xlarge

c5d.2xlarge

To prevent issues with low RAM, Mirantis recommends the following types of instances for a managed cluster with 50-200 nodes:

  • c5d.4xlarge - manager node

  • r5.4xlarge - nodes where the StackLight server components run

The /var/lib/docker Docker data is located on local NVMe SSDs by default. EBS is used for the operating system.

Bastion host instance type

t2.micro

t2.micro

The Bastion instance is created with a public Elastic IP address to allow SSH access to instances.

# of volumes

7 (total 110 GB)

5 (total 60 GB)

  • A management cluster requires 2 volumes for Container Cloud (total 50 GB) and 5 volumes for StackLight (total 60 GB)

  • A managed cluster requires 5 volumes for StackLight

# of Elastic load balancers to be used

10

6

  • Elastic LBs for a management cluster: 1 for Kubernetes, 4 for Container Cloud, 5 for StackLight

  • Elastic LBs for a managed cluster: 1 for Kubernetes and 5 for StackLight

# of Elastic IP addresses to be used

1

1

Requirements for an Azure-based cluster

While planning the deployment of an Azure-based Mirantis Container Cloud cluster with Mirantis Kubernetes Engine, consider the requirements described below.

For system requirements for a bootstrap node, see Requirements for a bootstrap node.

Warning

Some of the Azure features required for Container Cloud may not be included in your Azure account quota. Therefore, carefully consider the Azure fees applied to your account that may increase for the Container Cloud infrastructure.

If you use a firewall or proxy, make sure that the bootstrap, management, and regional clusters have access to the following IP ranges and domain names:

  • IP ranges:

  • Domain names:

    • mirror.mirantis.com and repos.mirantis.com for packages

    • binary.mirantis.com for binaries and Helm charts

    • mirantis.azurecr.io for Docker images

    • mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry (port 443 if proxy is enabled)

    • mirantis.my.salesforce.com for Salesforce alerts

Note

  • Access to Salesforce is required from any Container Cloud cluster type.

  • If any additional Alertmanager notification receiver is enabled, for example, Slack, its endpoint must also be accessible from the cluster.

Requirements for an Azure-based Container Cloud cluster

Resource

Management or regional cluster

Managed cluster

Comment

# of nodes

3 (HA)

5 (6 with StackLight HA)

  • A management cluster requires 3 nodes for the manager nodes HA. Adding more than 3 nodes to a management or regional cluster is not supported.

  • A managed cluster requires 3 nodes for the manager nodes HA and 2 nodes for the Container Cloud workloads. If the multiserver mode is enabled for StackLight, 3 nodes are required for the Container Cloud workloads.

# of vCPUs per node

8

8

RAM in GB per node

24

16

Storage in GB per node

128

128

Operating system

Ubuntu 18.04

Ubuntu 18.04

For a management, regional and managed cluster, a base Ubuntu 18.04 image is required.

Docker version

-

-

For a management, regional and managed cluster, Mirantis Container Runtime 20.10.6 is deployed by Container Cloud as a CRI.

Virtual Machine size

Standard_F16s_v2

Standard_F8s_v2

To prevent issues with low RAM, Mirantis recommends selecting Azure virtual machine sizes that meet the following minimum requirements for managed clusters:

  • 16 GB RAM (24 GB RAM for a cluster with 50-200 nodes)

  • 8 CPUs

  • Ephemeral OS drive supported

  • OS drive size is more than 128 GB

# of Azure resource groups

1

1

# of Azure networks

1

1

# of Azure subnets

1

1

# of Azure security groups

1

1

# of Azure network interfaces

3

One network interface per machine

# of Azure route tables

1

1

# of Azure load balancers to be used

2

2

1 load balancer for an API server and 1 for Kubernetes services

# of public IP addresses to be used

12/9

8

  • Management cluster: 10 public IPs for Kubernetes services and 2 public IPs as front-end IPs for load balancers

  • Regional cluster: 7 public IPs for Kubernetes services and 2 public IPs as front-end IPs for load balancers

  • Managed cluster: 6 public IPs for Kubernetes services and 2 public IPs as front-end IPs for load balancers

# of OS disks

3

1 OS disk per machine

# of data disks

0

5 (total 60 GB)

A managed cluster requires 5 volumes for StackLight

Requirements for an Equinix Metal based cluster

While planning the deployment of Mirantis Container Cloud cluster with MKE that is based on the Equinix Metal cloud provider, consider the requirements described below.

For system requirements for a bootstrap node, see Requirements for a bootstrap node.

If you want to deploy an Equinix Metal based managed cluster on top of an AWS management cluster, also refer to Requirements for an AWS-based cluster.

If you use a firewall or proxy, make sure that the bootstrap, management, and regional clusters have access to the following IP ranges and domain names:

  • IP ranges:

  • Domain names:

    • mirror.mirantis.com and repos.mirantis.com for packages

    • binary.mirantis.com for binaries and Helm charts

    • mirantis.azurecr.io for Docker images

    • mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry (port 443 if proxy is enabled)

    • mirantis.my.salesforce.com for Salesforce alerts

Note

  • Access to Salesforce is required from any Container Cloud cluster type.

  • If any additional Alertmanager notification receiver is enabled, for example, Slack, its endpoint must also be accessible from the cluster.

Requirements for an Equinix Metal based Container Cloud cluster

Resource

Management or regional cluster

Managed cluster

Comment

# of nodes

3 (HA)

5 (6 with StackLight HA)

  • A management cluster requires 3 nodes for the manager nodes HA. Adding more than 3 nodes to a management or regional cluster is not supported.

  • A managed cluster requires 3 nodes for the manager nodes HA and 2 nodes for the Container Cloud workloads. If the multiserver mode is enabled for StackLight, 3 nodes are required for the Container Cloud workloads.

# of vCPUs per node

8

8

RAM in GB per node

24

16

Operating system

Ubuntu 18.04

Ubuntu 18.04

Docker version

-

-

For a management and managed cluster, Mirantis Container Runtime 20.10.6 is deployed by Container Cloud as a CRI.

Server type

c3.small.x86

c3.small.x86

Most available Equinix Metal servers are configured with minimal requirements to deploy Container Cloud clusters. However, ensure that the selected Equinix Metal server type meets the following minimal requirements for a managed cluster:

  • 16 GB RAM

  • 8 CPUs

  • 2 storage devices with more than 120 GB each

Warning

If the Equinix Metal data center does not have enough capacity, the server provisioning request will fail. Servers of particular types can be unavailable at a given time. Therefore, before you deploy a cluster, verify that the selected server type is available as described in Verify the capacity of the Equinix Metal facility.

For more details about the Equinix Metal capacity, see official Equinix Metal Documentation.

# of Elastic IP addresses to be used

12

6

  • Elastic IPs for a management cluster: 1 for Kubernetes, 5 for Container Cloud, 6 for StackLight

  • Elastic IPs for a managed cluster: 1 for Kubernetes and 5 for StackLight

Ceph nodes

-

See comments

Recommended minimal number of Ceph node roles:

Storage | Manager and Monitor

1-2 | 1

3-500 | 3 (for HA)

> 500 | 5

If you select Manual Ceph Configuration during the cluster creation, you can manually configure Ceph roles for each machine in the cluster following the recommended minimal number of Ceph node roles. Otherwise, the Equinix Metal cloud provider configures Ceph roles automatically: all control plane machines receive the Storage, Manager, and Monitor roles, and all worker machines receive the Storage role.

Requirements for a VMware vSphere-based cluster

Note

Container Cloud is developed and tested on VMware vSphere 7.0 and 6.7.

For system requirements for a bootstrap node, see Requirements for a bootstrap node.

If you use a firewall or proxy, make sure that the bootstrap, management, and regional clusters have access to the following IP ranges and domain names:

  • IP ranges:

  • Domain names:

    • mirror.mirantis.com and repos.mirantis.com for packages

    • binary.mirantis.com for binaries and Helm charts

    • mirantis.azurecr.io for Docker images

    • mcc-metrics-prod-ns.servicebus.windows.net:9093 for Telemetry (port 443 if proxy is enabled)

    • mirantis.my.salesforce.com for Salesforce alerts

Note

  • Access to Salesforce is required from any Container Cloud cluster type.

  • If any additional Alertmanager notification receiver is enabled, for example, Slack, its endpoint must also be accessible from the cluster.

Requirements for a vSphere-based Container Cloud cluster

Resource

Management cluster

Managed cluster

Comments

# of nodes

3 (HA)

5 (6 with StackLight HA)

  • A bootstrap cluster requires access to the vSphere API.

  • A management cluster requires 3 nodes for the manager nodes HA. Adding more than 3 nodes to a management or regional cluster is not supported.

  • A managed cluster requires 3 nodes for the manager nodes HA and 2 nodes for the Container Cloud workloads. If the multiserver mode is enabled for StackLight, 3 nodes are required for the Container Cloud workloads.

# of vCPUs per node

8

8

Refer to the RAM recommendations described below to plan resources for different types of nodes.

RAM in GB per node

24

16

To prevent issues with low RAM, Mirantis recommends the following VM templates for a managed cluster with 50-200 nodes:

  • 16 vCPUs and 32 GB of RAM - manager node

  • 16 vCPUs and 128 GB of RAM - nodes where the StackLight server components run

Storage in GB per node

120

120

The listed amount of disk space must be available as a shared datastore of any type, for example, NFS or vSAN, mounted on all hosts of the vCenter cluster.

Operating system

RHEL 7.9 or 7.8 1
CentOS 7.9 2
RHEL 7.9 or 7.8 1
CentOS 7.9 2

For a management and managed cluster, a base OS VM template must be present in the VMware VM templates folder available to Container Cloud. For details about the template, see Prepare the OVF template.

RHEL license
(for RHEL deployments only)

RHEL licenses for Virtual Datacenters

RHEL licenses for Virtual Datacenters

This license type allows running unlimited guests inside one hypervisor. The number of licenses must be equal to the number of hypervisors in vCenter Server that will be used to host RHEL-based machines. Container Cloud schedules machines according to the scheduling rules applied to vCenter Server. Therefore, make sure that your Red Hat Customer Portal account has enough licenses for the allowed hypervisors.

Docker version

-

-

For a management and managed cluster, Mirantis Container Runtime 20.10.6 is deployed by Container Cloud as a CRI.

VMware vSphere version

7.0, 6.7

7.0, 6.7

cloud-init version

19.4

19.4

The minimum cloud-init package version required for the VM template. For details, see Prepare the OVF template.

VMware Tools version

11.0.5

11.0.5

The minimum open-vm-tools package version required for the VM template. For details, see Prepare the OVF template.

Obligatory vSphere capabilities

DRS,
Shared datastore
DRS,
Shared datastore

A shared datastore must be mounted on all hosts of the vCenter cluster. Combined with the Distributed Resource Scheduler (DRS), it ensures that the VMs are dynamically scheduled to the cluster hosts.

IP subnet size

/24

/24

Consider the supported VMware vSphere network objects and IPAM recommendations.

Minimal IP addresses distribution:

  • Management cluster:

    • 1 for the load balancer of Kubernetes API

    • 3 for manager nodes (one per node)

    • 6 for the Container Cloud services

    • 6 for StackLight

  • Managed cluster:

    • 1 for the load balancer of Kubernetes API

    • 3 for manager nodes

    • 2 for worker nodes

    • 6 for StackLight

1(1,2)

RHEL 7.8 deployment is possible with allowed access to the rhel-7-server-rpms repository provided by the Red Hat Enterprise Linux Server 7 x86_64. Verify that your RHEL license or activation key meets this requirement.

2(1,2)

CentOS deployments are available as Technology Preview. Use this configuration for testing and evaluation purposes only. A Container Cloud cluster based on both RHEL and CentOS operating systems is not supported.

Proxy and cache support

Proxy support

If you require all Internet access to go through a proxy server for security and audit purposes, you can bootstrap management and regional clusters using a proxy. The proxy server settings consist of three standard environment variables that are set prior to the bootstrap process:

  • HTTP_PROXY

  • HTTPS_PROXY

  • NO_PROXY
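For example, a minimal sketch of these variables exported on the bootstrap node before running the bootstrap scripts; the proxy endpoint and the NO_PROXY contents are placeholders to adjust for your environment:

  # Placeholder values; adjust for your environment
  export HTTP_PROXY=http://proxy.example.com:3128
  export HTTPS_PROXY=http://proxy.example.com:3128
  export NO_PROXY=localhost,127.0.0.1,10.0.0.0/24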

These settings are not propagated to managed clusters. However, you can enable separate proxy access on a managed cluster using the Container Cloud web UI. This proxy is intended for the end user needs and is not used for managed cluster deployment or for access to Mirantis resources.

Caution

Since Container Cloud uses the OpenID Connect (OIDC) protocol for IAM authentication, management clusters require a direct non-proxy access from regional and managed clusters.

StackLight components, which require external access, automatically use the same proxy that is configured for Container Cloud clusters.

On the managed clusters with limited Internet access, a proxy is required for StackLight components that use HTTP and HTTPS and are disabled by default but need external access if enabled, for example, for the Salesforce integration and Alertmanager notifications external rules. For more details about proxy implementation in StackLight, see StackLight proxy.

For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Hardware and system requirements.

Artifacts caching

The Container Cloud managed clusters are deployed without direct Internet access in order to consume less Internet traffic in your cloud. The Mirantis artifacts used during managed clusters deployment are downloaded through a cache running on a regional cluster. The feature is enabled by default on new managed clusters and will be automatically enabled on existing clusters during upgrade to the latest version.

Caution

IAM operations require a direct non-proxy access of a managed cluster to a management cluster.

Mirantis Kubernetes Engine API limitations

To ensure Mirantis Container Cloud stability in managing Container Cloud-based Mirantis Kubernetes Engine (MKE) clusters, the following MKE API functionality is not available for Container Cloud-based MKE clusters, as compared to attached MKE clusters that are not deployed by Container Cloud. Use the Container Cloud web UI or CLI for this functionality instead.

Public APIs limitations in a Container Cloud-based MKE cluster

API endpoint

Limitation

GET /swarm

Swarm Join Tokens are filtered out for all users, including admins.

PUT /api/ucp/config-toml

All requests are forbidden.

POST /nodes/{id}/update

Requests for the following changes are forbidden:

  • Change Role

  • Add or remove the com.docker.ucp.orchestrator.swarm and com.docker.ucp.orchestrator.kubernetes labels.

DELETE /nodes/{id}

All requests are forbidden.

Deployment Guide

Deploy a baremetal-based management cluster

This section describes how to bootstrap a baremetal-based Mirantis Container Cloud management cluster.

Workflow overview

The bare metal management system enables the Infrastructure Operator to deploy Mirantis Container Cloud on a set of bare metal servers. It also enables Container Cloud to deploy managed clusters on bare metal servers without a pre-provisioned operating system.

The Infrastructure Operator performs the following steps to install Container Cloud in a bare metal environment:

  1. Install and connect hardware servers as described in Requirements for a baremetal-based cluster.

    Caution

    The baremetal-based Container Cloud does not manage the underlay networking fabric but requires specific network configuration to operate.

  2. Install Ubuntu 18.04 on one of the bare metal machines to create a seed node and copy the bootstrap tarball to this node.

  3. Obtain the Mirantis license file that will be required during the bootstrap.

  4. Create the deployment configuration files that include the bare metal hosts metadata.

  5. Validate the deployment templates using fast preflight.

  6. Run the bootstrap script for the fully automated installation of the management cluster onto the selected bare metal hosts.

Using the bootstrap script, the Container Cloud bare metal management system prepares the seed node for the management cluster and starts the deployment of Container Cloud itself. The bootstrap script performs all operations required for the automated management cluster setup. The deployment diagram below illustrates the bootstrap workflow of a baremetal-based management cluster.

_images/bm-bootstrap-workflow.png
Bootstrap a management cluster

This section describes how to prepare and bootstrap a baremetal-based management cluster. The procedure includes:

  • A runbook that describes how to create a seed node that is a temporary server used to run the management cluster bootstrap scripts.

  • Step-by-step instructions on how to prepare metadata for the bootstrap scripts and how to run them.

Prepare the seed node

Before installing Mirantis Container Cloud on a bare metal environment, complete the following preparation steps:

  1. Verify that the hardware allocated for the installation meets the minimal requirements described in Requirements for a baremetal-based cluster.

  2. Install basic Ubuntu 18.04 server using standard installation images of the operating system on the bare metal seed node.

  3. Log in to the seed node that is running Ubuntu 18.04.

  4. Create a virtual bridge to connect to your PXE network on the seed node. Use the following netplan-based configuration file as an example:

    # cat /etc/netplan/config.yaml
    network:
      version: 2
      renderer: networkd
      ethernets:
        ens3:
            dhcp4: false
            dhcp6: false
      bridges:
          br0:
              addresses:
              # Please, adjust for your environment
              - 10.0.0.15/24
              dhcp4: false
              dhcp6: false
              # Please, adjust for your environment
              gateway4: 10.0.0.1
              interfaces:
              # Interface name may be different in your environment
              - ens3
              nameservers:
                  addresses:
                  # Please, adjust for your environment
                  - 8.8.8.8
              parameters:
                  forward-delay: 4
                  stp: false
    
  5. Apply the new network configuration using netplan:

    sudo netplan apply
    
  6. Verify the new network configuration:

    sudo brctl show
    

    Example of system response:

    bridge name     bridge id               STP enabled     interfaces
    br0             8000.fa163e72f146       no              ens3
    

    Verify that the interface connected to the PXE network belongs to the previously configured bridge.

  7. Install the current Docker version available for Ubuntu 18.04:

    sudo apt install docker.io
    
  8. Add your user to the docker group to grant it access to the Docker daemon:

    sudo usermod -aG docker $USER
    
  9. Log out and log in again to the seed node to apply the changes.

  10. Verify that Docker is configured correctly and has access to Container Cloud CDN. For example:

    docker run --rm alpine sh -c "apk add --no-cache curl; \
    curl https://binary.mirantis.com"
    

    The system output must contain a JSON response with no error messages. In case of errors, follow the steps provided in Troubleshooting.

    Note

    If you require all Internet access to go through a proxy server for security and audit purposes, configure Docker proxy settings as described in the official Docker documentation.
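    A minimal sketch of such a configuration using a systemd drop-in file for the Docker service, following the approach from the official Docker documentation; the proxy address and exclusions are placeholders:

    # /etc/systemd/system/docker.service.d/http-proxy.conf
    [Service]
    Environment="HTTP_PROXY=http://proxy.example.com:3128"
    Environment="HTTPS_PROXY=http://proxy.example.com:3128"
    Environment="NO_PROXY=localhost,127.0.0.1"

    # Apply the drop-in and restart Docker
    sudo systemctl daemon-reload
    sudo systemctl restart docker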

Proceed with Verify the seed node.

Verify the seed node

Before you proceed to bootstrapping the management cluster on bare metal, perform the following steps:

  1. Verify that the seed node has direct access to the Baseboard Management Controller (BMC) of each baremetal host. All target hardware nodes must be in the power off state.

    For example, using the IPMI tool:

    ipmitool -I lanplus -H 'IPMI IP' -U 'IPMI Login' -P 'IPMI password' \
    chassis power status
    

    Example of system response:

    Chassis Power is off
    
  2. Verify that you configured each bare metal host as follows:

    • Enable the boot NIC support for UEFI load. Usually, at least the built-in network interfaces support it.

    • Enable the UEFI-LAN-OPROM support in BIOS -> Advanced -> PCI/PCIe.

    • Enable the IPv4-PXE stack.

    • Set the following boot order:

      1. UEFI-DISK

      2. UEFI-PXE

    • If your PXE network is not configured to use the first network interface, fix the UEFI-PXE boot order to speed up node discovery by selecting only one required network interface.

    • Power off all bare metal hosts.

    Warning

    Only one Ethernet port on a host must be connected to the Common/PXE network at any given time. The physical address (MAC) of this interface must be noted and used to configure the BareMetalHost object describing the host.
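For reference, the noted MAC address and the BMC credentials typically end up in a BareMetalHost object and its credentials Secret. The following Metal3-style sketch is illustrative only; the actual manifests are generated from templates/bm/baremetalhosts.yaml.template in the next procedure:

  # Illustrative Metal3-style manifests; the product templates may differ
  apiVersion: v1
  kind: Secret
  metadata:
    name: master-0-bmc-credentials
  type: Opaque
  data:
    username: dXNlcg==        # base64-encoded IPMI user name
    password: cGFzc3dvcmQ=    # base64-encoded IPMI password
  ---
  apiVersion: metal3.io/v1alpha1
  kind: BareMetalHost
  metadata:
    name: master-0
  spec:
    online: true
    bootMACAddress: ac:1f:6b:02:84:71    # MAC of the PXE interface noted above
    bmc:
      address: ipmi://192.168.100.11     # BMC endpoint in the OOB network
      credentialsName: master-0-bmc-credentials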

Proceed with Prepare metadata and deploy the management cluster.

Prepare metadata and deploy the management cluster

Using the example procedure below, replace the addresses and credentials in the configuration YAML files with the data from your environment. Keep everything else as is, including the file names and YAML structure.

The overall network mapping scheme with all L2 parameters, for example, for a single 10.0.0.0/24 network, is described in the following table. The configuration of each parameter indicated in this table is described in the steps below.

Network mapping overview

Deployment file name

Parameters and values

cluster.yaml

  • SET_LB_HOST=10.0.0.90

  • SET_METALLB_ADDR_POOL=10.0.0.61-10.0.0.80

ipam-objects.yaml

  • SET_IPAM_CIDR=10.0.0.0/24

  • SET_PXE_NW_GW=10.0.0.1

  • SET_PXE_NW_DNS=8.8.8.8

  • SET_IPAM_POOL_RANGE=10.0.0.100-10.0.0.252

  • SET_LB_HOST=10.0.0.90

  • SET_METALLB_ADDR_POOL=10.0.0.61-10.0.0.80

bootstrap.sh

  • KAAS_BM_PXE_IP=10.0.0.20

  • KAAS_BM_PXE_MASK=24

  • KAAS_BM_PXE_BRIDGE=br0

  • KAAS_BM_BM_DHCP_RANGE=10.0.0.30,10.0.0.49


  1. Log in to the seed node that you configured as described in Prepare the seed node.

  2. Change to your preferred work directory, for example, your home directory:

    cd $HOME
    
  3. Prepare the bootstrap script:

    1. Download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      chmod 0755 get_container_cloud.sh
      ./get_container_cloud.sh
      
    2. Change the directory to the kaas-bootstrap folder created by the script.

  4. Obtain your license file that will be required during the bootstrap:

    1. Create a user account at www.mirantis.com.

    2. Log in to your account and download the mirantis.lic license file.

    3. Save the license file as mirantis.lic under the kaas-bootstrap directory on the bootstrap node.

  5. Create a copy of the current templates directory for future reference.

    mkdir templates.backup
    cp -r templates/*  templates.backup/
    
  6. Update the cluster definition template in templates/bm/cluster.yaml.template according to the environment configuration. Use the table below. Manually set all parameters that start with SET_. For example, SET_METALLB_ADDR_POOL.

    Cluster template mandatory parameters

    Parameter

    Description

    Example value

    SET_LB_HOST

    The IP address of the externally accessible API endpoint of the management cluster. This address must NOT be within the SET_METALLB_ADDR_POOL range but must be from the PXE network. External load balancers are not supported.

    10.0.0.90

    SET_METALLB_ADDR_POOL

    The IP range to be used as external load balancers for the Kubernetes services with the LoadBalancer type. This range must be within the PXE network. The minimum required range is 19 IP addresses.

    10.0.0.61-10.0.0.80

  7. Optional. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/bm/cluster.yaml.template, add the ntp:servers section with the list of required servers names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: baremetal-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: baremetal
                ...
    
  8. Inspect the default bare metal host profile definition in templates/bm/baremetalhostprofiles.yaml.template. If your hardware configuration differs from the reference, adjust the default profile to match. For details, see Customize the default bare metal host profile.

  9. Update the bare metal hosts definition template in templates/bm/baremetalhosts.yaml.template according to the environment configuration. Use the table below. Manually set all parameters that start with SET_.

    Bare metal hosts template mandatory parameters

    Parameter

    Description

    Example value

    SET_MACHINE_0_IPMI_USERNAME

    The IPMI user name in base64 encoding to access the BMC.

    dXNlcg== (base64 encoded user)

    SET_MACHINE_0_IPMI_PASSWORD

    The IPMI password in base64 encoding to access the BMC.

    cGFzc3dvcmQ= (base64 encoded password)

    SET_MACHINE_0_MAC

    The MAC address of the first management master node in the PXE network.

    ac:1f:6b:02:84:71

    SET_MACHINE_0_BMC_ADDRESS

    The IP address of the BMC endpoint for the first master node in the management cluster. Must be an address from the OOB network that is accessible through the PXE network default gateway.

    192.168.100.11

    SET_MACHINE_1_IPMI_USERNAME

    The IPMI user name in base64 encoding to access the BMC.

    dXNlcg== (base64 encoded user)

    SET_MACHINE_1_IPMI_PASSWORD

    The IPMI password in base64 encoding to access the BMC.

    cGFzc3dvcmQ= (base64 encoded password)

    SET_MACHINE_1_MAC

    The MAC address of the second management master node in the PXE network.

    ac:1f:6b:02:84:72

    SET_MACHINE_1_BMC_ADDRESS

    The IP address of the BMC endpoint for the second master node in the management cluster. Must be an address from the OOB network that is accessible through the PXE network default gateway.

    192.168.100.12

    SET_MACHINE_2_IPMI_USERNAME

    The IPMI user name in base64 encoding to access the BMC.

    dXNlcg== (base64 encoded user)

    SET_MACHINE_2_IPMI_PASSWORD

    The IPMI password in base64 encoding to access the BMC.

    cGFzc3dvcmQ= (base64 encoded password)

    SET_MACHINE_2_MAC

    The MAC address of the third management master node in the PXE network.

    ac:1f:6b:02:84:73

    SET_MACHINE_2_BMC_ADDRESS

    The IP address of the BMC endpoint for the third master node in the management cluster. Must be an address from the OOB network that is accessible through the PXE network default gateway.

    192.168.100.13

    Note

    You can obtain the base64-encoded user name and password using the following command in your Linux console:

    $ echo -n <username|password> | base64
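
    For example, a minimal sketch that encodes the credentials and substitutes them into the bare metal hosts template in one go. The user name, password, and parameter names below are taken from the table above and are illustrative; the | delimiter avoids conflicts with / characters that base64 output can contain:

    # Encode the IPMI credentials and substitute them into the template
    IPMI_USER_B64=$(echo -n user | base64)
    IPMI_PASS_B64=$(echo -n password | base64)
    sed -i \
      -e "s|SET_MACHINE_0_IPMI_USERNAME|${IPMI_USER_B64}|g" \
      -e "s|SET_MACHINE_0_IPMI_PASSWORD|${IPMI_PASS_B64}|g" \
      templates/bm/baremetalhosts.yaml.template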
    
  10. Update the IP address pools definition template in templates/bm/ipam-objects.yaml.template according to the environment configuration. Use the table below. Manually set all parameters that start with SET_. For example, SET_IPAM_POOL_RANGE.

    IP address pools template mandatory parameters

    Parameter

    Description

    Example value

    SET_IPAM_CIDR

    The address of the PXE network in CIDR notation. The network must be at least /24 in size.

    10.0.0.0/24

    SET_PXE_NW_GW

    The default gateway in the PXE network. Since this is the only network that Container Cloud will use, this gateway must provide access to:

    • The Internet to download the Mirantis artifacts

    • The OOB network of the Container Cloud cluster

    10.0.0.1

    SET_PXE_NW_DNS

    An external (non-Kubernetes) DNS server accessible from the PXE network. This server will be used by the bare metal hosts in all Container Cloud clusters.

    8.8.8.8

    SET_IPAM_POOL_RANGE

    This pool range includes addresses that will be allocated to bare metal hosts in all Container Cloud clusters. The size of this range limits the number of hosts that can be deployed by the instance of Container Cloud.

    10.0.0.100-10.0.0.252

    SET_LB_HOST

    The IP address of the externally accessible API endpoint of the management cluster. This address must NOT be within the SET_METALLB_ADDR_POOL range but must be from the PXE network. External load balancers are not supported.

    10.0.0.90

    SET_METALLB_ADDR_POOL

    The IP address range from which external addresses are assigned to Kubernetes services of the LoadBalancer type. This range must be within the PXE network. The minimum required range is 19 IP addresses.

    10.0.0.61-10.0.0.80

    Note

    For SET_LB_HOST and SET_METALLB_ADDR_POOL, use the same values that you set for these parameters in the cluster.yaml.template file (see above).
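
    After you set all mandatory parameters, you can quickly check that no SET_ placeholders remain in the bare metal templates, for example:

    # Any output from this command indicates parameters that are still unset
    grep -rn "SET_" templates/bm/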

  11. Optional. To connect the management cluster hosts to the PXE/management network using bond interfaces, proceed to Configure NIC bonding.

  12. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the management and regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    Variable

    Format

    • HTTP_PROXY

    • HTTPS_PROXY

    • http://proxy.example.com:port - for anonymous access

    • http://user:password@proxy.example.com:port - for restricted access

    • NO_PROXY

    Comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for a baremetal-based cluster.
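
    Before the bootstrap, you can verify basic connectivity through the proxy, for example, with curl. This assumes that the proxy variables above are already exported in your shell:

    # The command must return an HTTP response without connection errors
    curl -sSI -x "${HTTPS_PROXY}" https://binary.mirantis.com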

  13. Optional. Configure external identity provider for IAM.

  14. Optional. If you are going to use your own TLS certificates for Keycloak, set DISABLE_OIDC=true in bootstrap.env.

  15. Configure the Ceph cluster:

    1. Optional. Technology Preview. Configure the Ceph controller to manage Ceph node resources. In templates/bm/cluster.yaml.template, in the ceph-controller section of spec.providerSpec.value.helmReleases, specify the hyperconverge parameter with the required resource requests, limits, or tolerations:

      spec:
         providerSpec:
           value:
             helmReleases:
             - name: ceph-controller
               values:
                 hyperconverge:
                   tolerations: <ceph tolerations map>
                   resources: <ceph resource management map>
      

      For the parameters description, see Enable Ceph tolerations and resources management.

    2. In templates/bm/kaascephcluster.yaml.template:

      • Configure dedicated networks clusterNet and publicNet for Ceph components.

      • Set up the disk configuration according to your hardware node specification. Verify that the storageDevices section contains a valid list of HDD, SSD, or NVMe device names and that each device is empty, that is, has no file system on it. To enable all LCM features of the Ceph controller, set manageOsds to true.

        Caution

        The manageOsds parameter enables irreversible operations such as Ceph OSD removal. Therefore, use this feature with caution.

      • If required, configure other parameters as described in Ceph advanced configuration.

      Configuration example:

      manageOsds: true
      ...
      # This part of KaaSCephCluster should contain valid networks definition
      network:
        clusterNet: 10.10.10.0/24
        publicNet: 10.10.11.0/24
      ...
      nodes:
        master-0:
        ...
        <node_name>:
          ...
          # This part of KaaSCephCluster should contain valid device names
          storageDevices:
          - name: sdb
            config:
              deviceClass: hdd
          # The storageDevices list can contain several devices
          - name: sdc
            config:
              deviceClass: hdd
          # All devices used by Ceph must also have ``wipe`` set in
          # ``baremetalhosts.yaml.template``
          - name: sdd
            config:
              deviceClass: hdd
          # Do not include the first device (such as vda or sda) here
          # because it is allocated for the operating system
      
    3. In machines.yaml.template, verify that the metadata:name structure matches the machine names in the spec:nodes structure of kaascephcluster.yaml.template.
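
      A rough way to eyeball the names in both templates, for example (adjust the grep patterns to the actual template layout):

      # List machine names defined in machines.yaml.template
      grep -A2 'metadata:' templates/bm/machines.yaml.template | grep 'name:'
      # Print the beginning of the spec:nodes section of kaascephcluster.yaml.template
      grep -A10 'nodes:' templates/bm/kaascephcluster.yaml.template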

  16. Verify that the kaas-bootstrap directory contains the following files:

    # tree  ~/kaas-bootstrap
      ~/kaas-bootstrap/
      ....
      ├── bootstrap.sh
      ├── kaas
      ├── mirantis.lic
      ├── releases
      ...
      ├── templates
      ....
      │   ├── bm
      │   │   ├── baremetalhostprofiles.yaml.template
      │   │   ├── baremetalhosts.yaml.template
      │   │   ├── cluster.yaml.template
      │   │   ├── ipam-objects.yaml.template
      │   │   ├── kaascephcluster.yaml.template
      │   │   └── machines.yaml.template
      ....
      ├── templates.backup
          ....
    
  17. Export all required parameters using the table below.

    export KAAS_BM_ENABLED="true"
    #
    export KAAS_BM_PXE_IP="10.0.0.20"
    export KAAS_BM_PXE_MASK="24"
    export KAAS_BM_PXE_BRIDGE="br0"
    #
    export KAAS_BM_BM_DHCP_RANGE="10.0.0.30,10.0.0.49"
    export BOOTSTRAP_METALLB_ADDRESS_POOL="10.0.0.61-10.0.0.80"
    #
    unset KAAS_BM_FULL_PREFLIGHT
    
    Bare metal prerequisites data

    Parameter

    Description

    Example value

    KAAS_BM_PXE_IP

    The provisioning IP address. This address will be assigned to the interface of the seed node defined by the KAAS_BM_PXE_BRIDGE parameter (see below). The PXE service of the bootstrap cluster will use this address to network boot the bare metal hosts for the management cluster.

    10.0.0.20

    KAAS_BM_PXE_MASK

    The CIDR prefix for the PXE network. It will be used with all of the addresses below when assigning them to interfaces.

    24

    KAAS_BM_PXE_BRIDGE

    The PXE network bridge name. The name must match the name of the bridge created on the seed node during the Prepare the seed node stage.

    br0

    KAAS_BM_BM_DHCP_RANGE

    The DHCP range that dnsmasq uses to provide IP addresses to nodes during provisioning. The start_ip and end_ip addresses must be within the PXE network.

    10.0.0.30,10.0.0.49

    BOOTSTRAP_METALLB_ADDRESS_POOL

    The pool of IP addresses that will be used by services in the bootstrap cluster. Can be the same as the SET_METALLB_ADDR_POOL range for the management cluster, or a different range.

    10.0.0.61-10.0.0.80

    KEYCLOAK_FLOATING_IP Optional

    The spec.loadBalancerIP address for the Keycloak service. Use the address from the top of the SET_METALLB_ADDR_POOL range.

    10.0.0.70

    IAM_FLOATING_IP Optional

    The spec.loadBalancerIP address for the IAM service. Use the address from the top of the SET_METALLB_ADDR_POOL range.

    10.0.0.71

    PROXY_FLOATING_IP Optional

    The spec.loadBalancerIP address for the Squid service. Use the address from the top of the SET_METALLB_ADDR_POOL range.

    10.0.0.72

    Note

    The *_FLOATING_IP addresses must not conflict with each other. Use addresses from the top of the SET_METALLB_ADDR_POOL range.
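
    For example, to set the optional floating IP parameters explicitly, export them along with the other variables. The addresses below are illustrative and must be taken from the top of the SET_METALLB_ADDR_POOL range:

    export KEYCLOAK_FLOATING_IP="10.0.0.70"
    export IAM_FLOATING_IP="10.0.0.71"
    export PROXY_FLOATING_IP="10.0.0.72"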

  18. Run the verification preflight script to validate the deployment templates configuration:

    ./bootstrap.sh preflight
    

    The command outputs a human-readable report with the verification details. The report includes the list of verified bare metal nodes and their Chassis Power status. This status is based on the deployment templates configuration used during the verification.

    Caution

    If the report contains information about missing dependencies or incorrect configuration, fix the issues before proceeding to the next step.

  19. Run the bootstrap script:

    ./bootstrap.sh all
    

    In case of deployment issues, refer to Troubleshooting. If the script fails for an unknown reason:

    1. Run the cleanup script:

      ./bootstrap.sh cleanup
      
    2. Rerun the bootstrap script.

    Note

    If the bootstrap fails on the Connecting to bootstrap cluster step with the unable to initialize Tiller in bootstrap cluster: failed to establish connection with tiller error, refer to the known issue 16873 to identify the possible root cause of the issue and apply the workaround, if applicable.

    Warning

    During the bootstrap process, do not manually restart or power off any of the bare metal hosts.

  20. When the bootstrap is complete, collect and save the following management cluster details in a secure location:

    • The kubeconfig file located in the same directory as the bootstrap script. This file contains the admin credentials for the management cluster.

    • The private ssh_key for access to the management cluster nodes that is located in the same directory as the bootstrap script.

      Note

      If the initial version of your Container Cloud management cluster was earlier than 2.6.0, ssh_key is named openstack_tmp and is located at ~/.ssh/.

    • The URL for the Container Cloud web UI.

      To create users with permissions required for accessing the Container Cloud web UI, see Create initial users after a management cluster bootstrap.

    • The StackLight endpoints. For details, see Access StackLight web UIs.

    • The Keycloak URL that the system outputs when the bootstrap completes. The admin password for Keycloak is located in kaas-bootstrap/passwords.yml along with other IAM passwords.

    Note

    The Container Cloud web UI and StackLight endpoints are available through Transport Layer Security (TLS) and communicate with Keycloak to authenticate users. Keycloak is exposed using HTTPS and self-signed TLS certificates that are not trusted by web browsers.

    To use your own TLS certificates for Keycloak, refer to Configure TLS certificates for management cluster applications.

    Note

    When the bootstrap is complete, the bootstrap cluster resources are freed up.
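
    For example, a minimal check of access to the new management cluster using the saved artifacts. This assumes that kubectl is installed on the bootstrap node; substitute the placeholders with a node IP address and the SSH user name configured for your deployment:

    # Verify API access using the saved kubeconfig
    kubectl --kubeconfig ./kubeconfig get nodes
    # Verify SSH access to a management cluster node
    ssh -i ssh_key <user>@<node-ip>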

Configure NIC bonding

You can configure L2 templates for the management cluster to set up a bond network interface for the PXE/Management network.

This configuration must be applied to the bootstrap templates before you run the bootstrap script to deploy the management cluster.

Caution

  • This configuration requires each host in your management cluster to have at least two physical interfaces.

  • Connect at least two interfaces per host to an Ethernet switch that supports Link Aggregation Control Protocol (LACP) port groups and LACP fallback.

  • Configure an LACP group on the ports connected to the NICs of a host.

  • Configure the LACP fallback on the port group to ensure that the host can boot over the PXE network before the bond interface is set up on the host operating system.

  • Configure the server BIOS so that both NICs of a bond are PXE-enabled.

  • If the server does not support booting from multiple NICs, configure the port of the LACP group that is connected to the PXE-enabled NIC of the server to be the primary port. With this setting, the port becomes active in the fallback mode.

To configure a bond interface that aggregates two interfaces for the PXE/Management network:

  1. In kaas-bootstrap/templates/bm/ipam-objects.yaml.template:

    1. Configure only the following parameters for the declaration of {{nic0}}, as shown in the example below:

      • dhcp4

      • dhcp6

      • match

      • set-name

      Remove other parameters.

    2. Add the declaration of the second NIC {{nic1}} to be added to the bond interface:

      • Specify match:mac-address: {{mac1}} to match the MAC of the desired NIC.

      • Specify set-name: {{nic1}} to ensure the correct name of the NIC.

    3. Add the declaration of the bond interface bond0. It must have the interfaces parameter listing both Ethernet interfaces. Set the addresses, gateway4, and nameservers fields to fetch data from the kaas-mgmt subnet.

    4. Configure bonding options using the parameters field. The only mandatory option is mode. See the example below for details.

      Note

      You can set any mode supported by netplan and your hardware.

  2. Verify your configuration using the following example:

    kind: L2Template
    metadata:
      name: kaas-mgmt
      ...
    spec:
      ...
      npTemplate: |
        version: 2
        ethernets:
          {{nic0}}:
            dhcp4: false
            dhcp6: false
            match:
              mac-address: {{mac0}}
            set-name: {{nic0}}
          {{nic1}}:
            dhcp4: false
            dhcp6: false
            match:
              mac-address: {{mac1}}
            set-name: {{nic1}}
        bonds:
          bond0:
            interfaces:
              - {{nic0}}
              - {{nic1}}
            parameters:
              mode: 802.3ad
            dhcp4: false
            dhcp6: false
            addresses:
              - {{ip "bond0:kaas-mgmt"}}
            gateway4: {{gateway_from_subnet "kaas-mgmt"}}
            nameservers:
              addresses: {{nameservers_from_subnet "kaas-mgmt"}}
        ...
    
  3. Proceed to bootstrap your management cluster as described in Bootstrap a management cluster.

Customize the default bare metal host profile

This section describes the bare metal host profile settings and instructs how to configure this profile before deploying Mirantis Container Cloud on physical servers.

The bare metal host profile is a Kubernetes custom resource. It allows the Infrastructure Operator to define how the storage devices and the operating system are provisioned and configured.

The bootstrap templates for a bare metal deployment include the following file that defines the default BareMetalHostProfile object:

templates/bm/baremetalhostprofiles.yaml.template

Note

Using BareMetalHostProfile, you can configure LVM or mdadm-based software RAID support during a management or managed cluster creation. For details, see Configure RAID support.

This feature is available as Technology Preview. Use such configuration for testing and evaluation purposes only. For the Technology Preview feature definition, refer to Technology Preview support scope.

The customization procedure of BareMetalHostProfile is almost the same for the management and managed clusters, with the following differences:

  • For a management cluster, the customization applies to machines automatically during bootstrap. For a managed cluster, you apply the changes using kubectl before creating the cluster.

  • For a management cluster, you edit the default baremetalhostprofiles.yaml.template. For a managed cluster, you create a new BareMetalHostProfile with the necessary configuration.

For the procedure details, see Create a custom bare metal host profile. Use this procedure for both types of clusters considering the differences described above.

Deploy an OpenStack-based management cluster

This section describes how to bootstrap an OpenStack-based Mirantis Container Cloud management cluster.

Workflow overview

The Infrastructure Operator performs the following steps to install Mirantis Container Cloud on an OpenStack-based environment:

  1. Prepare an OpenStack environment that meets the Requirements for an OpenStack-based cluster.

  2. Prepare the bootstrap node using Prerequisites.

  3. Obtain the Mirantis license file that will be required during the bootstrap.

  4. Prepare the OpenStack clouds.yaml file.

  5. Create and configure the deployment configuration files that include the cluster and machines metadata.

  6. Run the bootstrap script for the fully automated installation of the management cluster.

For more details, see Bootstrap a management cluster.

Prerequisites

Before you start with bootstrapping the OpenStack-based management cluster, complete the following prerequisite steps:

  1. Verify that your planned cloud meets the reference hardware bill of material and software requirements as described in Requirements for an OpenStack-based cluster.

  2. Configure the bootstrap node:

    1. Log in to any personal computer or VM running Ubuntu 18.04 that you will be using as the bootstrap node.

    2. If you use a newly created VM, run:

      sudo apt-get update
      
    3. Install the current Docker version available for Ubuntu 18.04:

      sudo apt install docker.io
      
    4. Grant your USER access to the Docker daemon:

      sudo usermod -aG docker $USER
      
    5. Log off and log in again to the bootstrap node to apply the changes.

    6. Verify that Docker is configured correctly and has access to Container Cloud CDN. For example:

      docker run --rm alpine sh -c "apk add --no-cache curl; \
      curl https://binary.mirantis.com"
      

      The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

      Note

      If you require all Internet access to go through a proxy server for security and audit purposes, configure Docker proxy settings as described in the official Docker documentation.

  3. Proceed to Bootstrap a management cluster.

Bootstrap a management cluster

After you complete the prerequisite steps described in Prerequisites, proceed with bootstrapping your OpenStack-based Mirantis Container Cloud management cluster.

To bootstrap an OpenStack-based management cluster:

  1. Log in to the bootstrap node running Ubuntu 18.04 that is configured as described in Prerequisites.

  2. Prepare the bootstrap script:

    1. Download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      chmod 0755 get_container_cloud.sh
      ./get_container_cloud.sh
      
    2. Change the directory to the kaas-bootstrap folder created by the script.

  3. Obtain your license file that will be required during the bootstrap:

    1. Create a user account at www.mirantis.com.

    2. Log in to your account and download the mirantis.lic license file.

    3. Save the license file as mirantis.lic under the kaas-bootstrap directory on the bootstrap node.

  4. Prepare the OpenStack configuration for a new cluster:

    1. Log in to the OpenStack Horizon.

    2. In the Project section, select API Access.

    3. In the right-side drop-down menu Download OpenStack RC File, select OpenStack clouds.yaml File.

    4. Save the downloaded clouds.yaml file in the kaas-bootstrap folder created by the get_container_cloud.sh script.

    5. In clouds.yaml, add the password field with your OpenStack password under the clouds/openstack/auth section.

      Example:

      clouds:
        openstack:
          auth:
            auth_url: https://auth.openstack.example.com:5000/v3
            username: your_username
            password: your_secret_password
            project_id: your_project_id
            user_domain_name: your_user_domain_name
          region_name: RegionOne
          interface: public
          identity_api_version: 3
      
    6. Verify access to the target cloud endpoint from Docker. For example:

      docker run --rm alpine sh -c "apk add --no-cache curl; \
      curl https://auth.openstack.example.com:5000/v3"
      

      The system output must contain no error records.

    In case of issues, follow the steps provided in Troubleshooting.

  5. Configure the cluster and machines metadata:

    1. In templates/machines.yaml.template, modify the spec:providerSpec:value section for 3 control plane nodes marked with the cluster.sigs.k8s.io/control-plane label by substituting the flavor and image parameters with the corresponding values of the control plane nodes in the related OpenStack cluster. For example:

      spec: &cp_spec
        providerSpec:
          value:
            apiVersion: "openstackproviderconfig.k8s.io/v1alpha1"
            kind: "OpenstackMachineProviderSpec"
            flavor: kaas.minimal
            image: bionic-server-cloudimg-amd64-20190612
      

      Note

      The flavor parameter value provided in the example above is cloud-specific and must meet the Container Cloud requirements.

      Also, modify other parameters as required.
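
      If you have the OpenStack CLI installed and the clouds.yaml file prepared as described above, you can, for example, list the available flavors and images to pick suitable values:

      # List flavors and images available in the target OpenStack cloud
      openstack --os-cloud openstack flavor list
      openstack --os-cloud openstack image list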

    2. Modify the templates/cluster.yaml.template parameters to fit your deployment. For example, add the corresponding values for cidrBlocks in the spec:clusterNetwork:services section.

  6. Optional. Configure backups for the MariaDB database as described in Configure periodic backups of MariaDB for AWS and OpenStack providers.

  7. Optional. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/cluster.yaml.template, add the ntp:servers section with the list of required server names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: openstack-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: openstack
                ...
    
  8. Optional. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the management and regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    Variable

    Format

    • HTTP_PROXY

    • HTTPS_PROXY

    • http://proxy.example.com:port - for anonymous access

    • http://user:password@proxy.example.com:port - for restricted access

    • NO_PROXY

    Comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for an OpenStack-based cluster.

  9. Optional. Configure external identity provider for IAM.

  10. Optional. If you are going to use your own TLS certificates for Keycloak, set DISABLE_OIDC=true in bootstrap.env.

  11. Run the bootstrap script:

    ./bootstrap.sh all
    

    In case of deployment issues, refer to Troubleshooting. If the script fails for an unknown reason:

    1. Run the cleanup script:

      ./bootstrap.sh cleanup
      
    2. Rerun the bootstrap script.

    Note

    If the bootstrap fails on the Connecting to bootstrap cluster step with the unable to initialize Tiller in bootstrap cluster: failed to establish connection with tiller error, refer to the known issue 16873 to identify the possible root cause of the issue and apply the workaround, if applicable.

  12. When the bootstrap is complete, collect and save the following management cluster details in a secure location:

    • The kubeconfig file located in the same directory as the bootstrap script. This file contains the admin credentials for the management cluster.

    • The private ssh_key for access to the management cluster nodes that is located in the same directory as the bootstrap script.

      Note

      If the initial version of your Container Cloud management cluster was earlier than 2.6.0, ssh_key is named openstack_tmp and is located at ~/.ssh/.

    • The URL for the Container Cloud web UI.

      To create users with permissions required for accessing the Container Cloud web UI, see Create initial users after a management cluster bootstrap.

    • The StackLight endpoints. For details, see Access StackLight web UIs.

    • The Keycloak URL that the system outputs when the bootstrap completes. The admin password for Keycloak is located in kaas-bootstrap/passwords.yml along with other IAM passwords.

    Note

    The Container Cloud web UI and StackLight endpoints are available through Transport Layer Security (TLS) and communicate with Keycloak to authenticate users. Keycloak is exposed using HTTPS and self-signed TLS certificates that are not trusted by web browsers.

    To use your own TLS certificates for Keycloak, refer to Configure TLS certificates for management cluster applications.

    Note

    When the bootstrap is complete, the bootstrap cluster resources are freed up.

  13. Optional. Deploy an additional regional cluster as described in Deploy an additional regional cluster (optional).

Now, you can proceed with operating your management cluster using the Container Cloud web UI and deploying managed clusters as described in Create and operate an OpenStack-based managed cluster.

Deploy an AWS-based management cluster

This section describes how to bootstrap a Mirantis Container Cloud management cluster that is based on the Amazon Web Services (AWS) cloud provider.

Workflow overview

The Infrastructure Operator performs the following steps to install Mirantis Container Cloud on an AWS-based environment:

  1. Prepare an AWS environment that meets the Requirements for an AWS-based cluster.

  2. Prepare the bootstrap node as per Prerequisites.

  3. Obtain the Mirantis license file that will be required during the bootstrap.

  4. Prepare the AWS environment credentials.

  5. Create and configure the deployment configuration files that include the cluster and machines metadata.

  6. Run the bootstrap script for the fully automated installation of the management cluster.

For more details, see Bootstrap a management cluster.

Prerequisites

Before you start with bootstrapping the AWS-based management cluster, complete the following prerequisite steps:

  1. Inspect the Requirements for an AWS-based cluster to understand the potential impact of the Container Cloud deployment on your AWS cloud usage.

  2. Configure the bootstrap node:

    1. Log in to any personal computer or VM running Ubuntu 18.04 that you will be using as the bootstrap node.

    2. If you use a newly created VM, run:

      sudo apt-get update
      
    3. Install the current Docker version available for Ubuntu 18.04:

      sudo apt install docker.io
      
    4. Grant your USER access to the Docker daemon:

      sudo usermod -aG docker $USER
      
    5. Log off and log in again to the bootstrap node to apply the changes.

    6. Verify that Docker is configured correctly and has access to Container Cloud CDN. For example:

      docker run --rm alpine sh -c "apk add --no-cache curl; \
      curl https://binary.mirantis.com"
      

      The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

      Note

      If you require all Internet access to go through a proxy server for security and audit purposes, configure Docker proxy settings as described in the official Docker documentation.

  3. Proceed to Bootstrap a management cluster.

Bootstrap a management cluster

After you complete the prerequisite steps described in Prerequisites, proceed with bootstrapping your AWS-based Mirantis Container Cloud management cluster.

To bootstrap an AWS-based management cluster:

  1. Log in to the bootstrap node running Ubuntu 18.04 that is configured as described in Prerequisites.

  2. Prepare the bootstrap script:

    1. Download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      chmod 0755 get_container_cloud.sh
      ./get_container_cloud.sh
      
    2. Change the directory to the kaas-bootstrap folder created by the script.

  3. Obtain your license file that will be required during the bootstrap:

    1. Create a user account at www.mirantis.com.

    2. Log in to your account and download the mirantis.lic license file.

    3. Save the license file as mirantis.lic under the kaas-bootstrap directory on the bootstrap node.

  4. Prepare the AWS deployment templates:

    1. Verify access to the target cloud endpoint from Docker. For example:

      docker run --rm alpine sh -c "apk add --no-cache curl; \
      curl https://ec2.amazonaws.com"
      

      The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

    2. Change the directory to the kaas-bootstrap folder.

    3. In templates/aws/machines.yaml.template, modify the spec:providerSpec:value section by substituting the ami:id parameter with the corresponding value for Ubuntu 18.04 from the required AWS region. For example:

      spec:
        providerSpec:
          value:
            apiVersion: aws.kaas.mirantis.com/v1alpha1
            kind: AWSMachineProviderSpec
            instanceType: c5d.4xlarge
            ami:
              id: ami-033a0960d9d83ead0
      

      Also, modify other parameters as required.
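
      For example, one way to look up a recent Ubuntu 18.04 AMI ID for your region is the AWS CLI, assuming it is installed and configured. The owner ID 099720109477 is the Canonical account that also appears later in this procedure:

      # Find the most recent Canonical Ubuntu 18.04 AMI in the selected region
      aws ec2 describe-images --region us-east-2 \
        --owners 099720109477 \
        --filters "Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*" \
        --query 'sort_by(Images,&CreationDate)[-1].ImageId' --output text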

    4. Optional. In templates/aws/cluster.yaml.template, modify the default configuration of the AWS instance types and AMI IDs for further creation of managed clusters:

      providerSpec:
          value:
            ...
            kaas:
              ...
              regional:
              - provider: aws
                helmReleases:
                  - name: aws-credentials-controller
                    values:
                      config:
                        allowedInstanceTypes:
                          minVCPUs: 8
                          # in MiB
                          minMemory: 16384
                          # in GB
                          minStorage: 120
                          supportedArchitectures:
                          - "x86_64"
                          filters:
                          - name: instance-storage-info.disk.type
                            values:
                              - "ssd"
                        allowedAMIs:
                        -
                          - name: name
                            values:
                            - "ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200729"
                          - name: owner-id
                            values:
                            - "099720109477"
      

      Also, modify other parameters as required.

  5. Optional. Configure backups for the MariaDB database as described in Configure periodic backups of MariaDB for AWS and OpenStack providers.

  6. Optional. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/aws/cluster.yaml.template, add the ntp:servers section with the list of required server names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: aws-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: aws
                ...
    
  7. Generate the AWS Access Key ID with Secret Access Key for the user with the IAMFullAccess permissions and select the AWS default region name. For details, see AWS General Reference: Programmatic access.

  8. Export the following parameters by adding the corresponding values for the AWS IAMFullAccess user credentials created in the previous step:

    export KAAS_AWS_ENABLED=true
    export AWS_SECRET_ACCESS_KEY=XXXXXXX
    export AWS_ACCESS_KEY_ID=XXXXXXX
    export AWS_DEFAULT_REGION=us-east-2
    
  9. For Container Cloud to communicate with the AWS APIs, create the AWS CloudFormation stack that contains properly configured IAM users and policies.

    Note

    If the AWS CloudFormation stack already exists in your AWS account, skip this step.

    ./container-cloud bootstrap aws policy
    

    If you do not have access to create the CloudFormation stack, users, or policies:

    1. Log in to your AWS Management Console.

    2. On the home page, expand the upper right menu with your user name and note your Account ID.

    3. Create the CloudFormation template:

      ./container-cloud bootstrap aws policy --account-id <accountId> --dump > cf.yaml
      

      Substitute the parameter enclosed in angle brackets with the corresponding value.

    4. Send the cf.yaml template to your AWS account admin to create the CloudFormation stack from this template.
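
      For reference, a minimal sketch of how the account admin might create the stack from this template using the AWS CLI; the stack name is illustrative:

      # The IAM capabilities flag is required because the template creates IAM resources
      aws cloudformation create-stack \
        --stack-name container-cloud-iam \
        --template-body file://cf.yaml \
        --capabilities CAPABILITY_NAMED_IAM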

      The generated template includes the following lists of IAM permissions:

      • ec2:DescribeInstances

      • ec2:DescribeRegions

      • ecr:GetAuthorizationToken

      • ecr:BatchCheckLayerAvailability

      • ecr:GetDownloadUrlForLayer

      • ecr:GetRepositoryPolicy

      • ecr:DescribeRepositories

      • ecr:ListImages

      • ecr:BatchGetImage

      • ec2:AllocateAddress

      • ec2:AssociateRouteTable

      • ec2:AttachInternetGateway

      • ec2:AuthorizeSecurityGroupIngress

      • ec2:CreateInternetGateway

      • ec2:CreateNatGateway

      • ec2:CreateRoute

      • ec2:CreateRouteTable

      • ec2:CreateSecurityGroup

      • ec2:CreateSubnet

      • ec2:CreateTags

      • ec2:CreateVpc

      • ec2:ModifyVpcAttribute

      • ec2:DeleteInternetGateway

      • ec2:DeleteNatGateway

      • ec2:DeleteRouteTable

      • ec2:DeleteSecurityGroup

      • ec2:DeleteSubnet

      • ec2:DeleteTags

      • ec2:DeleteVpc

      • ec2:DescribeAccountAttributes

      • ec2:DescribeAddresses

      • ec2:DescribeAvailabilityZones

      • ec2:DescribeInstances

      • ec2:DescribeInstanceTypes

      • ec2:DescribeInternetGateways

      • ec2:DescribeImages

      • ec2:DescribeNatGateways

      • ec2:DescribeNetworkInterfaces

      • ec2:DescribeNetworkInterfaceAttribute

      • ec2:DescribeRegions

      • ec2:DescribeRouteTables

      • ec2:DescribeSecurityGroups

      • ec2:DescribeSubnets

      • ec2:DescribeVpcs

      • ec2:DescribeVpcAttribute

      • ec2:DescribeVolumes

      • ec2:DetachInternetGateway

      • ec2:DisassociateRouteTable

      • ec2:DisassociateAddress

      • ec2:ModifyInstanceAttribute

      • ec2:ModifyNetworkInterfaceAttribute

      • ec2:ModifySubnetAttribute

      • ec2:ReleaseAddress

      • ec2:RevokeSecurityGroupIngress

      • ec2:RunInstances

      • ec2:TerminateInstances

      • tag:GetResources

      • elasticloadbalancing:CreateLoadBalancer

      • elasticloadbalancing:ConfigureHealthCheck

      • elasticloadbalancing:DeleteLoadBalancer

      • elasticloadbalancing:DescribeLoadBalancers

      • elasticloadbalancing:DescribeLoadBalancerAttributes

      • elasticloadbalancing:ModifyLoadBalancerAttributes

      • elasticloadbalancing:RegisterInstancesWithLoadBalancer

      • elasticloadbalancing:DescribeTargetGroups

      • elasticloadbalancing:CreateTargetGroup

      • elasticloadbalancing:DeleteTargetGroup

      • elasticloadbalancing:DescribeListeners

      • elasticloadbalancing:CreateListener

      • elasticloadbalancing:DeleteListener

      • elasticloadbalancing:RegisterTargets

      • elasticloadbalancing:DeregisterTargets

      • autoscaling:DescribeAutoScalingGroups

      • autoscaling:DescribeLaunchConfigurations

      • autoscaling:DescribeTags

      • ec2:DescribeInstances

      • ec2:DescribeImages

      • ec2:DescribeRegions

      • ec2:DescribeRouteTables

      • ec2:DescribeSecurityGroups

      • ec2:DescribeSubnets

      • ec2:DescribeVolumes

      • ec2:CreateSecurityGroup

      • ec2:CreateTags

      • ec2:CreateVolume

      • ec2:ModifyInstanceAttribute

      • ec2:ModifyVolume

      • ec2:AttachVolume

      • ec2:AuthorizeSecurityGroupIngress

      • ec2:CreateRoute

      • ec2:DeleteRoute

      • ec2:DeleteSecurityGroup

      • ec2:DeleteVolume

      • ec2:DetachVolume

      • ec2:RevokeSecurityGroupIngress

      • ec2:DescribeVpcs

      • elasticloadbalancing:AddTags

      • elasticloadbalancing:AttachLoadBalancerToSubnets

      • elasticloadbalancing:ApplySecurityGroupsToLoadBalancer

      • elasticloadbalancing:CreateLoadBalancer

      • elasticloadbalancing:CreateLoadBalancerPolicy

      • elasticloadbalancing:CreateLoadBalancerListeners

      • elasticloadbalancing:ConfigureHealthCheck

      • elasticloadbalancing:DeleteLoadBalancer

      • elasticloadbalancing:DeleteLoadBalancerListeners

      • elasticloadbalancing:DescribeLoadBalancers

      • elasticloadbalancing:DescribeLoadBalancerAttributes

      • elasticloadbalancing:DetachLoadBalancerFromSubnets

      • elasticloadbalancing:DeregisterInstancesFromLoadBalancer

      • elasticloadbalancing:ModifyLoadBalancerAttributes

      • elasticloadbalancing:RegisterInstancesWithLoadBalancer

      • elasticloadbalancing:SetLoadBalancerPoliciesForBackendServer

      • elasticloadbalancing:AddTags

      • elasticloadbalancing:CreateListener

      • elasticloadbalancing:CreateTargetGroup

      • elasticloadbalancing:DeleteListener

      • elasticloadbalancing:DeleteTargetGroup

      • elasticloadbalancing:DescribeListeners

      • elasticloadbalancing:DescribeLoadBalancerPolicies

      • elasticloadbalancing:DescribeTargetGroups

      • elasticloadbalancing:DescribeTargetHealth

      • elasticloadbalancing:ModifyListener

      • elasticloadbalancing:ModifyTargetGroup

      • elasticloadbalancing:RegisterTargets

      • elasticloadbalancing:SetLoadBalancerPoliciesOfListener

      • iam:CreateServiceLinkedRole

      • kms:DescribeKey

  10. Configure the bootstrapper.cluster-api-provider-aws.kaas.mirantis.com user created in the previous steps:

    1. Using your AWS Management Console, generate the AWS Access Key ID with Secret Access Key for bootstrapper.cluster-api-provider-aws.kaas.mirantis.com and select the AWS default region name.

      Note

      Other authorization methods, such as usage of AWS_SESSION_TOKEN, are not supported.

    2. Export the AWS bootstrapper.cluster-api-provider-aws.kaas.mirantis.com user credentials that were created in the previous step:

      export KAAS_AWS_ENABLED=true
      export AWS_SECRET_ACCESS_KEY=XXXXXXX
      export AWS_ACCESS_KEY_ID=XXXXXXX
      export AWS_DEFAULT_REGION=us-east-2
      
  11. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the management and regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    Variable

    Format

    • HTTP_PROXY

    • HTTPS_PROXY

    • http://proxy.example.com:port - for anonymous access

    • http://user:password@proxy.example.com:port - for restricted access

    • NO_PROXY

    Comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for an AWS-based cluster.

  12. Optional. Configure external identity provider for IAM.

  13. Optional. If you are going to use your own TLS certificates for Keycloak, set DISABLE_OIDC=true in bootstrap.env.

  14. Run the bootstrap script:

    ./bootstrap.sh all
    

    In case of deployment issues, refer to Troubleshooting. If the script fails for an unknown reason:

    1. Run the cleanup script:

      ./bootstrap.sh cleanup
      
    2. Rerun the bootstrap script.

    Note

    If the bootstrap fails on the Connecting to bootstrap cluster step with the unable to initialize Tiller in bootstrap cluster: failed to establish connection with tiller error, refer to the known issue 16873 to identify the possible root cause of the issue and apply the workaround, if applicable.

  15. When the bootstrap is complete, collect and save the following management cluster details in a secure location:

    • The kubeconfig file located in the same directory as the bootstrap script. This file contains the admin credentials for the management cluster.

    • The private ssh_key for access to the management cluster nodes that is located in the same directory as the bootstrap script.

      Note

      If the initial version of your Container Cloud management cluster was earlier than 2.6.0, ssh_key is named openstack_tmp and is located at ~/.ssh/.

    • The URL for the Container Cloud web UI.

      To create users with permissions required for accessing the Container Cloud web UI, see Create initial users after a management cluster bootstrap.

    • The StackLight endpoints. For details, see Access StackLight web UIs.

    • The Keycloak URL that the system outputs when the bootstrap completes. The admin password for Keycloak is located in kaas-bootstrap/passwords.yml along with other IAM passwords.

    Note

    The Container Cloud web UI and StackLight endpoints are available through Transport Layer Security (TLS) and communicate with Keycloak to authenticate users. Keycloak is exposed using HTTPS and self-signed TLS certificates that are not trusted by web browsers.

    To use your own TLS certificates for Keycloak, refer to Configure TLS certificates for management cluster applications.

    Note

    When the bootstrap is complete, the bootstrap cluster resources are freed up.

  16. Optional. Deploy an additional regional cluster of a different provider type as described in Deploy an additional regional cluster (optional).

Now, you can proceed with operating your management cluster using the Container Cloud web UI and deploying managed clusters as described in Create and operate an AWS-based managed cluster.

Using the same management cluster, you can also deploy managed clusters that are based on the Equinix Metal cloud provider. For details, see Create and operate an Equinix Metal based managed cluster.

Deploy an Azure-based management cluster

This section describes how to bootstrap a Mirantis Container Cloud management cluster that is based on the Microsoft Azure cloud provider.

Workflow overview

The Infrastructure Operator performs the following steps to deploy Mirantis Container Cloud on Azure:

  1. Prepare an Azure environment that meets Requirements for an Azure-based cluster.

  2. Prepare the bootstrap node as described in Prerequisites.

  3. Obtain the Mirantis license file that will be required during the bootstrap.

  4. Prepare the Azure subscription credentials.

  5. Create and configure the deployment configuration files that include the cluster and machines metadata.

  6. Run the bootstrap script for a fully automated installation of the management cluster.

For more details, see Bootstrap a management cluster.

Prerequisites

Before you start with bootstrapping an Azure-based management cluster, complete the following prerequisite steps:

  1. Inspect the Requirements for an Azure-based cluster to understand the potential impact of the Container Cloud deployment on your Azure project.

  2. Configure the bootstrap node:

    1. Log in to any personal computer or VM running Ubuntu 18.04 that you will be using as the bootstrap node.

    2. If you use a newly created VM, run:

      sudo apt-get update
      
    3. Install the current Docker version available for Ubuntu 18.04:

      sudo apt install docker.io
      
    4. Grant your USER access to the Docker daemon:

      sudo usermod -aG docker $USER
      
    5. Log off and log in again to the bootstrap node to apply the changes.

    6. Verify that Docker is configured correctly and has access to Container Cloud CDN. For example:

      docker run --rm alpine sh -c "apk add --no-cache curl; \
      curl https://binary.mirantis.com"
      

      The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

      Note

      If you require all Internet access to go through a proxy server for security and audit purposes, configure Docker proxy settings as described in the official Docker documentation.

  3. Proceed to Bootstrap a management cluster.

Bootstrap a management cluster

After you complete the prerequisite steps described in Prerequisites, proceed with bootstrapping your Mirantis Container Cloud management cluster based on the Azure provider.

To bootstrap an Azure-based management cluster:

  1. Log in to the bootstrap node running Ubuntu 18.04 that is configured as described in Prerequisites.

  2. Prepare the bootstrap script:

    1. Download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      chmod 0755 get_container_cloud.sh
      ./get_container_cloud.sh
      
    2. Change the directory to the kaas-bootstrap folder created by the script.

  3. Obtain your license file that will be required during the bootstrap:

    1. Create a user account at www.mirantis.com.

    2. Log in to your account and download the mirantis.lic license file.

    3. Save the license file as mirantis.lic under the kaas-bootstrap directory on the bootstrap node.

  4. Set up your Azure environment:

    1. Create an Azure service principal. Skip this step to use an existing Azure service principal.

      1. Create a Microsoft Azure account.

      2. Install Azure CLI.

      3. Log in to the Azure CLI:

        az login
        
      4. List your Azure accounts:

        az account list -o table
        
      5. If more than one account exists, select the account dedicated for Container Cloud:

        az account set -s <subscriptionID>
        
      6. Create an Azure service principal:

        Caution

        The owner role is required for creation of role assignments.

        az ad sp create-for-rbac --role contributor
        

        Example of system response:

        {
           "appId": "0c87aM5a-e172-182b-a91a-a9b8d39ddbcd",
           "displayName": "azure-cli-2021-08-04-15-25-16",
           "name": "1359ac72-5794-494d-b787-1d7309b7f8bc",
           "password": "Q1jB2-7Uz6Cka7xos6vL-Ddb4BQx2vgMl",
           "tenant": "6d498697-7anvd-4172-a7v0-4e5b2e25f280"
        }
        
    2. Change the directory to kaas-bootstrap.

    3. Export the following parameter:

      export KAAS_AZURE_ENABLED=true
      
    4. In templates/azure/azure-config.yaml.template, modify the following parameters using credentials obtained in the previous steps or using credentials of an existing Azure service principal obtained from the subscription owner:

      • spec:subscriptionID is the subscription ID of your Azure account

      • spec:tenantID is the value of "tenant"

      • spec:clientID is the value of "appId"

      • spec:clientSecret:value is the value of "password"

      For example:

      spec:
        subscriptionID: b8bea78f-zf7s-s7vk-s8f0-642a6v7a39c1
        tenantID: 6d498697-7anvd-4172-a7v0-4e5b2e25f280
        clientID: 0c87aM5a-e172-182b-a91a-a9b8d39ddbcd
        clientSecret:
          value: Q1jB2-7Uz6Cka7xos6vL-Ddb4BQx2vgMl
      
    5. In templates/azure/cluster.yaml.template, modify the default configuration of the Azure cluster location. This is an Azure region that your subscription has quota for.

      To obtain the list of available locations, run:

      az account list-locations -o=table
      

      For example:

      providerSpec:
        value:
        ...
          location: southcentralus
      

      Also, modify other parameters as required.

  5. Optional. In templates/azure/machines.yaml.template, modify the default configuration of the Azure virtual machine size and OS disk size.

    Mirantis Container Cloud only supports Azure virtual machine sizes that meet the following minimum requirements:

    1. More than 8 CPUs

    2. More than 24 GB of RAM

    3. Support for an ephemeral OS drive

    4. More than 128 GB of temporary storage

    Set the OS disk size parameter to at least 128 GB (default value) and verify that it does not exceed the temporary storage size.

    To obtain the list of all Azure virtual machine sizes available in the selected Azure region:

    az vm list-skus -l southcentralus -o=json
    

    To filter virtual machine sizes by the Container Cloud minimum requirements:

    1. Install jq.

    2. Run the following command:

      az vm list-skus -l eastus -o=json | jq '.[] | {name: .name}+{vCPUs: .capabilities[]? | select(.name == "vCPUs" and (.value | tonumber >= 8))}+{RAM: .capabilities[]? | select(.name == "MemoryGB" and (.value | tonumber >= 16))}+{EphemeralOSDiskSupported: .capabilities[]? | select(.name == "EphemeralOSDiskSupported" and .value == "True")}+{TempStorageSize: .capabilities[]? | select(.name == "CachedDiskBytes" and (.value | tonumber >= 137438953472))}'
      

      The default VM size is Standard_F16s_v2:

      providerSpec:
        value:
        ...
          vmSize: Standard_F16s_v2
          osDisk:
            osType: Linux
            diskSizeGB: 128
      

    Also, modify other parameters as required.

  6. Optional. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/azure/cluster.yaml.template, add the ntp:servers section with the list of required server names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: azure-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: azure
                ...
    
  7. Export the following parameter:

    export KAAS_AZURE_ENABLED=true
    
  8. If you require Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the management and regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    Variable

    Format

    • HTTP_PROXY

    • HTTPS_PROXY

    • http://proxy.example.com:port - for anonymous access

    • http://user:password@proxy.example.com:port - for restricted access

    • NO_PROXY

    Comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for an Azure-based cluster.

  9. Optional. Configure external identity provider for IAM.

  10. Optional. If you are going to use your own TLS certificates for Keycloak, set DISABLE_OIDC=true in bootstrap.env.

  11. Run the bootstrap script:

    ./bootstrap.sh all
    

    In case of deployment issues, refer to Troubleshooting. If the script fails for an unknown reason:

    1. Run the cleanup script:

      ./bootstrap.sh cleanup
      
    2. Rerun the bootstrap script.

    Note

    If the bootstrap fails on the Connecting to bootstrap cluster step with the unable to initialize Tiller in bootstrap cluster: failed to establish connection with tiller error, refer to the known issue 16873 to identify the possible root cause of the issue and apply the workaround, if applicable.

  12. When the bootstrap is complete, collect and save the following management cluster details in a secure location:

    • The kubeconfig file located in the same directory as the bootstrap script. This file contains the admin credentials for the management cluster.

    • The private ssh_key for access to the management cluster nodes that is located in the same directory as the bootstrap script.

      Note

      If the initial version of your Container Cloud management cluster was earlier than 2.6.0, ssh_key is named openstack_tmp and is located at ~/.ssh/.

    • The URL for the Container Cloud web UI.

      To create users with permissions required for accessing the Container Cloud web UI, see Create initial users after a management cluster bootstrap.

    • The StackLight endpoints. For details, see Access StackLight web UIs.

    • The Keycloak URL that the system outputs when the bootstrap completes. The admin password for Keycloak is located in kaas-bootstrap/passwords.yml along with other IAM passwords.

    Note

    The Container Cloud web UI and StackLight endpoints are available through Transport Layer Security (TLS) and communicate with Keycloak to authenticate users. Keycloak is exposed using HTTPS and self-signed TLS certificates that are not trusted by web browsers.

    To use your own TLS certificates for Keycloak, refer to Configure TLS certificates for management cluster applications.

    Note

    When the bootstrap is complete, the bootstrap cluster resources are freed up.

  13. Optional. Deploy an additional regional cluster of a different provider type or configuration as described in Deploy an additional regional cluster (optional).

Now, you can proceed with operating your management cluster through the Container Cloud web UI and deploying managed clusters as described in Create and operate an Azure-based managed cluster.

Deploy an Equinix Metal based management cluster

This section describes how to bootstrap a Mirantis Container Cloud management cluster that is based on the Equinix Metal cloud provider.

Workflow overview

The Infrastructure Operator performs the following steps to deploy Mirantis Container Cloud on Equinix Metal:

  1. Prepare an Equinix Metal environment that meets the Requirements for an Equinix Metal based cluster.

  2. Prepare the bootstrap node as described in Prerequisites.

  3. Obtain the Mirantis license file that will be required during the bootstrap.

  4. Configure BGP in the Equinix Metal project.

  5. Verify the capacity of the Equinix Metal facility.

  6. Prepare the Equinix Metal environment credentials.

  7. Create and configure the deployment configuration files that include the cluster and machines metadata.

  8. Run the bootstrap script for a fully automated installation of the management cluster.

For more details, see Bootstrap a management cluster.

Prerequisites

Before you start with bootstrapping the Equinix Metal based management cluster, complete the following prerequisite steps:

  1. Inspect the Requirements for an Equinix Metal based cluster to understand the potential impact of the Container Cloud deployment on your Equinix Metal project.

  2. Configure the bootstrap node:

    1. Log in to any personal computer or VM running Ubuntu 18.04 that you will be using as the bootstrap node.

    2. If you use a newly created VM, run:

      sudo apt-get update
      
    3. Install the current Docker version available for Ubuntu 18.04:

      sudo apt install docker.io
      
    4. Grant your USER access to the Docker daemon:

      sudo usermod -aG docker $USER
      
    5. Log off and log in again to the bootstrap node to apply the changes.

    6. Verify that Docker is configured correctly and has access to Container Cloud CDN. For example:

      docker run --rm alpine sh -c "apk add --no-cache curl; \
      curl https://binary.mirantis.com"
      

      The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

      Note

      If you require all Internet access to go through a proxy server for security and audit purposes, configure Docker proxy settings as described in the official Docker documentation.

  3. Proceed to Equinix Metal project setup.

Equinix Metal project setup

Before deploying an Equinix Metal based Container Cloud cluster, ensure that local Border Gateway Protocol (BGP) is enabled and properly configured for your Equinix Metal project.

To configure BGP in the Equinix Metal project:

  1. Log in to the Equinix Metal console.

  2. In IPs & Networks, select BGP.

  3. In the window that opens:

    1. Click Activate BGP on This Project.

    2. Select local type.

    3. Click Add and wait for the request to finalize.

  4. Verify the value of the max_prefix BGP parameter:

    1. Set the token variable to your project token.

      To obtain the token in the Equinix Metal console, navigate to Project Settings > Project API Keys > Add New Key.

    2. Set the project variable to your project ID.

      To obtain the project ID in the Equinix Metal console, navigate to Project Settings > General > PROJECT ID.

    3. Run the following command:

      curl -sS -H "X-Auth-Token: ${token}" "https://api.equinix.com/metal/v1/projects/${project}/bgp-config" | jq .max_prefix
      

    In the system output, if the value is 10 (default), contact the Equinix Metal support to increase this parameter to at least 150.

    The default value allows creating only two Container Cloud clusters per Equinix Metal project. Hence, Mirantis recommends increasing the max_prefix value.

Now, you can proceed to Verify the capacity of the Equinix Metal facility.

Verify the capacity of the Equinix Metal facility

Before deploying an Equinix Metal based Container Cloud cluster, ensure that the Equinix Metal project has enough capacity to deploy the required number of machines. Otherwise, the machines will be stuck in the Provisioned state with an error message stating that no servers of the particular type are available in your facility.

To verify the capacity of the Equinix Metal facility:

  1. Install the Equinix Metal CLI.

  2. Obtain USER_API_TOKEN for the Equinix Metal authentication:

    1. Log in to the Equinix Metal console.

    2. Capture the existing API key with the Read/Write permissions or create a new one:

      1. In Personal Settings > Personal API Keys, click Add New Key.

      2. Fill in the Description and select the Read/Write permissions.

      3. Click Add Key.

  3. Configure the Equinix Metal authentication by exporting the PACKET_TOKEN environment variable. Replace USER_API_TOKEN with the token obtained in the previous step:

    export PACKET_TOKEN=USER_API_TOKEN
    

    For more details, see Equinix Metal authentication.
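
    As an optional check (a minimal sketch that assumes curl and jq are installed), verify that the exported token is valid by listing the projects available to it through the Equinix Metal API:

    curl -sS -H "X-Auth-Token: ${PACKET_TOKEN}" "https://api.equinix.com/metal/v1/projects" | jq '.projects[].name'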

  4. Verify the capacity of the Equinix Metal facility selected for the management cluster bootstrap:

    packet-cli capacity check --facility $EQUINIX_FACILITY --plan $EQUINIX_MACHINE_TYPE --quantity $MACHINES_AMOUNT
    

    In the command above, replace $EQUINIX_FACILITY, $EQUINIX_MACHINE_TYPE, and $MACHINES_AMOUNT with the corresponding values of your cluster.

    For example, to verify that Equinix Metal can create 6 machines of the c3.small.x86 type in the am6 facility, run:

    packet-cli capacity check --facility am6 --plan c3.small.x86 --quantity 6
    

    In the system response, verify that the AVAILABILITY section has the true value.

    Example of a positive system response:

    +----------+--------------+----------+--------------+
    | FACILITY |     PLAN     | QUANTITY | AVAILABILITY |
    +----------+--------------+----------+--------------+
    | am6      | c3.small.x86 | 6        | true         |
    +----------+--------------+----------+--------------+
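
    If you prefer a scripted check (a sketch that simply parses the tabular output shown above), you can fail early when capacity is unavailable:

    packet-cli capacity check --facility am6 --plan c3.small.x86 --quantity 6 \
      | grep -q true && echo "Capacity available" || echo "Capacity NOT available"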
    

Proceed to Bootstrap a management cluster.

Bootstrap a management cluster

After you complete the prerequisite steps described in Prerequisites, proceed with bootstrapping your Mirantis Container Cloud management cluster based on the Equinix Metal provider.

To bootstrap an Equinix Metal based management cluster:

  1. Log in to the bootstrap node running Ubuntu 18.04 that is configured as described in Prerequisites.

  2. Prepare the bootstrap script:

    1. Download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      chmod 0755 get_container_cloud.sh
      ./get_container_cloud.sh
      
    2. Change the directory to the kaas-bootstrap folder created by the script.

  3. Obtain your license file that will be required during the bootstrap:

    1. Create a user account at www.mirantis.com.

    2. Log in to your account and download the mirantis.lic license file.

    3. Save the license file as mirantis.lic under the kaas-bootstrap directory on the bootstrap node.

  4. Prepare the Equinix Metal configuration:

    1. Log in to the Equinix Metal console.

    2. Select the project that you want to use for the Container Cloud deployment.

    3. In Project Settings > General, capture your Project ID.

    4. In Profile Settings > Personal API Keys, capture the existing user-level API Key or create a new one:

      1. In Profile Settings > Personal API Keys, click Add New Key.

      2. Fill in the Description and select the Read/Write permissions.

      3. Click Add Key.

    5. Change the directory to kaas-bootstrap.

    6. In templates/equinix/equinix-config.yaml.template, modify spec:projectID and spec:apiToken:value using the values obtained in the previous steps. For example:

      spec:
        projectID: g98sd6f8-dc7s-8273-v8s7-d9v7395nd91
        apiToken:
          value: Bi3m9c7qjYBD3UgsnSCSsqs2bYkbK
      
    7. In templates/equinix/cluster.yaml.template, modify the default configuration of the Equinix Metal facility depending on the previously prepared capacity settings:

      providerSpec:
        value:
        ...
          facility: am6
      

      Also, modify other parameters as required.

    8. Optional. In templates/equinix/machines.yaml.template, modify the default configuration of the Equinix Metal machine type. The minimal required type is c3.small.x86.

      providerSpec:
        value:
        ...
          machineType: c3.small.x86
      

      Also, modify other parameters as required.

  5. Optional. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/equinix/cluster.yaml.template, add the ntp:servers section with the list of the required server names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: equinix-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: equinixmetal
                ...
    
  6. Export the following parameter:

    export KAAS_EQUINIX_ENABLED=true
    
  7. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the management and regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    • HTTP_PROXY and HTTPS_PROXY: http://proxy.example.com:port for anonymous access, or http://user:password@proxy.example.com:port for restricted access

    • NO_PROXY: comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for an Equinix Metal based cluster.

  8. Optional. Configure external identity provider for IAM.

  9. Optional. If you are going to use your own TLS certificates for Keycloak, set DISABLE_OIDC=true in bootstrap.env.

  10. Re-verify that the selected Equinix Metal facility for the management cluster bootstrap is still available and has enough capacity:

    packet-cli capacity check --facility $EQUINIX_FACILITY --plan $EQUINIX_MACHINE_TYPE --quantity $MACHINES_AMOUNT
    

    In the system response, if the value in the AVAILABILITY section has changed from true to false, find an available facility and update the previously configured facility field in cluster.yaml.template.

    For details about the verification procedure, see Verify the capacity of the Equinix Metal facility.

  11. Run the bootstrap script:

    ./bootstrap.sh all
    

    In case of deployment issues, refer to Troubleshooting. If the script fails for an unknown reason:

    1. Run the cleanup script:

      ./bootstrap.sh cleanup
      
    2. Rerun the bootstrap script.

    Note

    If the bootstrap fails on the Connecting to bootstrap cluster step with the unable to initialize Tiller in bootstrap cluster: failed to establish connection with tiller error, refer to the known issue 16873 to identify the possible root cause of the issue and apply the workaround, if applicable.

  12. When the bootstrap is complete, collect and save the following management cluster details in a secure location:

    • The kubeconfig file located in the same directory as the bootstrap script. This file contains the admin credentials for the management cluster.

    • The private ssh_key for access to the management cluster nodes that is located in the same directory as the bootstrap script.

      Note

      If the initial version of your Container Cloud management cluster was earlier than 2.6.0, ssh_key is named openstack_tmp and is located at ~/.ssh/.

    • The URL for the Container Cloud web UI.

      To create users with permissions required for accessing the Container Cloud web UI, see Create initial users after a management cluster bootstrap.

    • The StackLight endpoints. For details, see Access StackLight web UIs.

    • The Keycloak URL that the system outputs when the bootstrap completes. The admin password for Keycloak is located in kaas-bootstrap/passwords.yml along with other IAM passwords.

    Note

    The Container Cloud web UI and StackLight endpoints are available through Transport Layer Security (TLS) and communicate with Keycloak to authenticate users. Keycloak is exposed using HTTPS and self-signed TLS certificates that are not trusted by web browsers.

    To use your own TLS certificates for Keycloak, refer to Configure TLS certificates for management cluster applications.

    Note

    When the bootstrap is complete, the bootstrap cluster resources are freed up.

  13. Optional. Deploy an additional regional cluster of a different provider type or configuration as described in Deploy an additional regional cluster (optional).

Now, you can proceed with operating your management cluster using the Container Cloud web UI and deploying managed clusters as described in Create and operate an Equinix Metal based managed cluster.

Deploy a VMware vSphere-based management cluster

This section describes how to bootstrap a VMware vSphere-based Mirantis Container Cloud management cluster.

Note

You can deploy vSphere-based clusters on CentOS. Support of this operating system is available as Technology Preview. Use it for testing and evaluation purposes only.

Deployment of a Container Cloud cluster that is based on both RHEL and CentOS operating systems is not supported.

Workflow overview

Perform the following steps to install Mirantis Container Cloud on a VMware vSphere-based environment:

  1. Prepare a vSphere environment that meets the Requirements for a VMware vSphere-based cluster.

  2. Determine vSphere resources required for the deployment as described in Deployment resources requirements.

  3. Prepare the bootstrap node as described in Prerequisites.

  4. Obtain the Mirantis license file to use during the bootstrap.

  5. Set up the VMware accounts for deployment as described in VMware deployment users.

  6. Create and configure the deployment configuration files that include the cluster and machines metadata as described in Bootstrap a management cluster.

  7. Prepare the OVF template for the management cluster nodes using OVF template requirements.

  8. Run the bootstrap script for the fully automated installation of the management cluster.

For more details, see Bootstrap a management cluster.

Deployment resources requirements

The VMware vSphere provider of Mirantis Container Cloud requires the following resources to successfully create virtual machines for Container Cloud clusters:

  • Data center

    All resources below must be related to one data center.

  • Cluster

    All virtual machines must run on the hosts of one cluster.

  • Virtual Network or Distributed Port Group

    Network for virtual machines. For details, see VMware vSphere network objects and IPAM recommendations.

  • Datastore

    Storage for virtual machine disks and Kubernetes volumes.

  • Folder

    Placement of virtual machines.

  • Resource pool

    Pool of CPU and memory resources for virtual machines.

You must provide the data center and cluster resources by name. You can provide other resources by:

  • Name

    Resource name must be unique in the data center and cluster. Otherwise, the vSphere provider detects multiple resources with the same name and cannot determine which one to use.

  • Full path (recommended)

    Full path to a resource depends on its type. For example:

    • Network

      /<data_center>/network/<network_name>

    • Resource pool

      /<data_center>/host/<cluster>/Resources/<resource_pool_name>

    • Folder

      /<data_center>/vm/<folder1>/<folder2>/.../<folder_name> or /<data_center>/vm/<folder_name>

    • Datastore

      /<data_center>/datastore/<datastore_name>

You can determine the proper resource name using the vSphere UI.

To obtain the full path to vSphere resources:

  1. Download the latest version of the govc utility for your operating system and unpack the govc binary into a directory included in PATH on your machine.

  2. Set the environment variables to access your vSphere cluster. For example:

    export GOVC_USERNAME=user
    export GOVC_PASSWORD=password
    export GOVC_URL=https://vcenter.example.com
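
    Optionally, confirm that the exported credentials are valid before listing resources (a minimal sketch; if the vCenter Server uses a self-signed certificate, you may also need to export GOVC_INSECURE=true):

    govc about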
    
  3. List the data center root using the govc ls command. Example output:

    /<data_center>/vm
    /<data_center>/network
    /<data_center>/host
    /<data_center>/datastore
    
  4. Obtain the full path to resources by name for:

    1. Network or Distributed Port Group (Distributed Virtual Port Group):

      govc find /<data_center> -type n -name <network_name>
      
    2. Datastore:

      govc find /<data_center> -type s -name <datastore_name>
      
    3. Folder:

      govc find /<data_center> -type f -name <folder_name>
      
    4. Resource pool:

      govc find /<data_center> -type p -name <resource_pool_name>
      
  5. Verify the resource type by full path:

    govc object.collect -json -o "<full_path_to_resource>" | jq .Self.Type
    
Prerequisites

Before bootstrapping a VMware vSphere-based management cluster, complete the following prerequisite steps:

  1. Verify that your planned cloud configuration meets the reference hardware bill of material and software requirements as described in Requirements for a VMware vSphere-based cluster.

  2. Verify that your planned cloud configuration meets the deployment resources requirements.

  3. Configure Ubuntu or RHEL on the bootstrap node:

    • For Ubuntu:

      1. Log in to any personal computer or VM running Ubuntu 18.04 that you will be using as the bootstrap node.

      2. If you use a newly created VM, run:

        sudo apt-get update
        
      3. Install the current Docker version available for Ubuntu 18.04:

        sudo apt install docker.io
        
      4. Grant your USER access to the Docker daemon:

        sudo usermod -aG docker $USER
        
      5. Log off and log in again to the bootstrap node to apply the changes.

      6. Verify that Docker is configured correctly and has access to Container Cloud CDN. For example:

        docker run --rm alpine sh -c "apk add --no-cache curl; \
        curl https://binary.mirantis.com"
        

        The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

        Note

        If you require all Internet access to go through a proxy server for security and audit purposes, configure Docker proxy settings as described in the official Docker documentation.

    • For RHEL:

      1. Log in to any VM running RHEL 7.9 that you will be using as a bootstrap node.

      2. If you do not use the RedHat Satellite server locally in your infrastructure and require all Internet access to go through a proxy server, including access to the RedHat Customer Portal, configure the proxy parameters for subscription-manager using the example below:

        subscription-manager config \
            --server.proxy_scheme=$SCHEME \
            --server.proxy_hostname=$HOST \
            --server.proxy_port=$PORT \
            --server.proxy_user=$USER \
            --server.proxy_password=$PASS \
            --server.no_proxy=$NO_PROXY
        
      3. Attach the RHEL subscription using subscription-manager.

      4. Install the following packages:

        sudo yum install yum-utils wget vim -y
        
      5. Verify that the extras repository is enabled:

        sudo yum-config-manager --enable rhel-7-server-extras-rpms
        
      6. Install and configure Docker, disable SELinux:

        sudo yum install docker -y
        sudo systemctl start docker
        sudo chmod 666 /var/run/docker.sock
        sudo setenforce 0
        
      7. Verify that Docker is configured correctly and has access to Container Cloud CDN. For example:

        docker run --rm alpine sh -c "apk add --no-cache curl; \
        curl https://binary.mirantis.com"
        

        The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

        Note

        If you require all Internet access to go through a proxy server for security and audit purposes, configure Docker proxy settings as described in the official Docker documentation.

  4. Prepare the VMware deployment user setup and permissions.

Prepare the VMware deployment user setup and permissions

To deploy Mirantis Container Cloud on a VMware vSphere-based environment, prepare the following VMware accounts:

  1. Log in to the vCenter Server Web Console.

  2. Create the cluster-api user with the following privileges:

    Note

    Container Cloud uses two separate vSphere accounts for:

    • Cluster API related operations, such as creating or deleting VMs, and preparation of the OVF template using Packer

    • Storage operations, such as dynamic PVC provisioning

    You can also create one user that has all privilege sets mentioned above.

    Privilege

    Permission

    Content library

    • Download files

    • Read storage

    • Sync library item

    Datastore

    • Allocate space

    • Browse datastore

    • Low-level file operations

    • Update virtual machine metadata

    Distributed switch

    • Host operation

    • IPFIX operation

    • Modify

    • Network I/O control operation

    • Policy operation

    • Port configuration operation

    • Port setting operation

    • VSPAN operation

    Folder

    • Create folder

    • Rename folder

    Global

    Cancel task

    Host local operations

    • Create virtual machine

    • Delete virtual machine

    • Reconfigure virtual machine

    Network

    Assign network

    Resource

    Assign virtual machine to resource pool

    Scheduled task

    • Create tasks

    • Modify task

    • Remove task

    • Run task

    Sessions

    • Validate session

    • View and stop sessions

    Storage views

    View

    Tasks

    • Create task

    • Update task

    Virtual machine permissions

    Privilege

    Permission

    Change configuration

    • Acquire disk lease

    • Add existing disk

    • Add new disk

    • Add or remove device

    • Advanced configuration

    • Change CPU count

    • Change Memory

    • Change Settings

    • Change Swapfile placement

    • Change resource

    • Configure Host USB device

    • Configure Raw device

    • Configure managedBy

    • Display connection settings

    • Extend virtual disk

    • Modify device settings

    • Query Fault Tolerance compatibility

    • Query unowned files

    • Reload from path

    • Remove disk

    • Rename

    • Reset guest information

    • Set annotation

    • Toggle disk change tracking

    • Toggle fork parent

    • Upgrade virtual machine compatibility

    Interaction

    • Configure CD media

    • Configure floppy media

    • Console interaction

    • Device connection

    • Inject USB HID scan codes

    • Power off

    • Power on

    • Reset

    • Suspend

    Inventory

    • Create from existing

    • Create new

    • Move

    • Register

    • Remove

    • Unregister

    Provisioning

    • Allow disk access

    • Allow file access

    • Allow read-only disk access

    • Allow virtual machine download

    • Allow virtual machine files upload

    • Clone template

    • Clone virtual machine

    • Create template from virtual machine

    • Customize guest

    • Deploy template

    • Mark as template

    • Mark as virtual machine

    • Modify customization specification

    • Promote disks

    • Read customization specifications

    Snapshot management

    • Create snapshot

    • Remove snapshot

    • Rename snapshot

    • Revert to snapshot

    vSphere replication

    Monitor replication

  3. Create the storage user with the following privileges:

    Note

    For more details about all required privileges for the storage user, see vSphere Cloud Provider documentation.

    Privilege

    Permission

    Cloud Native Storage

    Searchable

    Content library

    View configuration settings

    Datastore

    • Allocate space

    • Browse datastore

    • Low level file operations

    • Remove file

    Folder

    • Create folder

    Host configuration

    • Storage partition configuration

    Host local operations

    • Create virtual machine

    • Delete virtual machine

    • Reconfigure virtual machine

    Host profile

    View

    Profile-driven storage

    Profile-driven storage view

    Resource

    Assign virtual machine to resource pool

    Scheduled task

    • Create tasks

    • Modify task

    • Run task

    Sessions

    • Validate session

    • View and stop sessions

    Storage views

    View

    Virtual machine permissions

    Privilege

    Permission

    Change configuration

    • Add existing disk

    • Add new disk

    • Add or remove device

    • Advanced configuration

    • Change CPU count

    • Change Memory

    • Change Settings

    • Configure managedBy

    • Extend virtual disk

    • Remove disk

    • Rename

    Inventory

    • Create from existing

    • Create new

    • Remove

  4. For RHEL deployments, if you do not have a RHEL machine with the virt-who service configured to report the vSphere environment configuration and hypervisor information to the RedHat Customer Portal or RedHat Satellite server, set up the virt-who service inside the Container Cloud machines for proper RHEL license activation.

    Create a virt-who user with at least read-only access to all objects in the vCenter Data Center.

    The virt-who service on RHEL machines will be provided with the virt-who user credentials to properly manage RHEL subscriptions.

    For details on how to create the virt-who user, refer to the official RedHat Customer Portal documentation.

Now, proceed to Bootstrap a management cluster.

Bootstrap a management cluster

After you complete the prerequisite steps described in Prerequisites, proceed with bootstrapping your VMware vSphere-based Mirantis Container Cloud management cluster.

To bootstrap a vSphere-based management cluster:

  1. Log in to the bootstrap node running Ubuntu 18.04 that is configured as described in Prerequisites.

  2. Prepare the bootstrap script:

    1. Download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      chmod 0755 get_container_cloud.sh
      ./get_container_cloud.sh
      
    2. Change the directory to the kaas-bootstrap folder created by the script.

  3. Obtain your license file that will be required during the bootstrap:

    1. Create a user account at www.mirantis.com.

    2. Log in to your account and download the mirantis.lic license file.

    3. Save the license file as mirantis.lic under the kaas-bootstrap directory on the bootstrap node.

  4. Prepare deployment templates:

    1. Modify templates/vsphere/vsphere-config.yaml.template:

      vSphere configuration data

      Parameter

      Description

      SET_VSPHERE_SERVER

      IP address or FQDN of the vCenter Server.

      SET_VSPHERE_SERVER_PORT

      Port of the vCenter Server. For example, port: "8443". Leave empty to use 443 by default.

      SET_VSPHERE_DATACENTER

      vSphere data center name.

      SET_VSPHERE_SERVER_INSECURE

      Flag that controls validation of the vSphere Server certificate. Must be true or false.

      SET_VSPHERE_CAPI_PROVIDER_USERNAME

      vSphere Cluster API provider user name that you added when preparing the deployment user setup and permissions.

      SET_VSPHERE_CAPI_PROVIDER_PASSWORD

      vSphere Cluster API provider user password.

      SET_VSPHERE_CLOUD_PROVIDER_USERNAME

      vSphere Cloud Provider deployment user name that you added when preparing the deployment user setup and permissions.

      SET_VSPHERE_CLOUD_PROVIDER_PASSWORD

      vSphere Cloud Provider deployment user password.

    2. Modify the templates/vsphere/cluster.yaml.template parameters to fit your deployment. For example, add the corresponding values for cidrBlocks in the spec::clusterNetwork::services section.

      Required parameters

      Parameter

      Description

      SET_LB_HOST

      IP address from the provided vSphere network for load balancer (Keepalived).

      SET_VSPHERE_METALLB_RANGE

      MetalLB range of IP addresses that can be assigned to load balancers for Kubernetes Services.

      SET_VSPHERE_DATASTORE

      Name of the vSphere datastore. You can use different datastores for vSphere Cluster API and vSphere Cloud Provider.

      SET_VSPHERE_MACHINES_FOLDER

      Path to a folder where the cluster machines metadata will be stored.

      SET_VSPHERE_NETWORK_PATH

      Path to a network for cluster machines.

      SET_VSPHERE_RESOURCE_POOL_PATH

      Path to a resource pool in which VMs will be created.

    3. For either DHCP or non-DHCP vSphere network:

      1. Determine the vSphere network parameters as described in VMware vSphere network objects and IPAM recommendations.

      2. Provide the following additional parameters for a proper network setup on machines using embedded IP address management (IPAM) in templates/vsphere/cluster.yaml.template:

      vSphere configuration data

      Parameter

      Description

      ipamEnabled

      Enables IPAM. Set to true for networks without DHCP.

      SET_VSPHERE_NETWORK_CIDR

      CIDR of the provided vSphere network. For example, 10.20.0.0/16.

      SET_VSPHERE_NETWORK_GATEWAY

      Gateway of the provided vSphere network.

      SET_VSPHERE_CIDR_INCLUDE_RANGES

      Optional. IP range for the cluster machines. Specify a range within the provided CIDR. For example, 10.20.0.100-10.20.0.200.

      SET_VSPHERE_CIDR_EXCLUDE_RANGES

      Optional. IP ranges to be excluded from being assigned to the cluster machines. The MetalLB range and SET_LB_HOST should not intersect with the addresses for IPAM. For example, 10.20.0.150-10.20.0.170.

      SET_VSPHERE_NETWORK_NAMESERVERS

      List of nameservers for the provided vSphere network.

    4. For RHEL deployments, fill out templates/vsphere/rhellicenses.yaml.template using one of the following sets of parameters for the RHEL machines subscription:

      • The user name and password of your RedHat Customer Portal account associated with your RHEL license for Virtual Datacenters.

        Optionally, provide the subscription allocation pools to use for the RHEL subscriptions activation. If not needed, remove the poolIDs field for subscription-manager to automatically select the licenses for machines.

        For example:

        spec:
          username: <username>
          password:
            value: <password>
          poolIDs:
          - <pool1>
          - <pool2>
        
      • The activation key and organization ID associated with your RedHat account with RHEL license for Virtual Datacenters. The activation key can be created by the organization administrator on RedHat Customer Portal.

        If you use the RedHat Satellite server for management of your RHEL infrastructure, you can provide a pre-generated activation key from that server. In this case:

        • Provide the URL to the RedHat Satellite RPM for installation of the CA certificate that belongs to that server.

        • Configure squid-proxy on the management or regional cluster to allow access to your Satellite server. For details, see Configure squid-proxy.

        For example:

        spec:
          activationKey:
            value: <activation key>
          orgID: "<organization ID>"
          rpmUrl: <rpm url>
        

      Caution

      Provide only one set of parameters. Mixing of parameters from different activation methods will cause deployment failure.

    5. For CentOS deployments, in templates/vsphere/rhellicenses.yaml.template, remove all lines under items:.

  5. In bootstrap.env, add the KAAS_VSPHERE_ENABLED=true environment variable that enables the vSphere provider deployment in Container Cloud.

  6. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/vsphere/cluster.yaml.template, add the ntp:servers section with the list of the required server names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: vsphere-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: vsphere
                ...
    
  7. Prepare the OVF template as described in Prepare the OVF template.

  8. In templates/vsphere/machines.yaml.template:

    • Define SET_VSPHERE_TEMPLATE_PATH prepared in the previous step

    • Modify other parameters as required

    spec:
      providerSpec:
        value:
          apiVersion: vsphere.cluster.k8s.io/v1alpha1
          kind: VsphereMachineProviderSpec
          rhelLicense: <rhel-license-name>
          network:
            devices:
            - dhcp4: true
              dhcp6: false
          template: <SET_VSPHERE_TEMPLATE_PATH>
    

    Note

    The <rhel-license-name> value is the RHEL license name defined in rhellicenses.yaml.template and defaults to kaas-mgmt-rhel-license. Remove or comment out this parameter for CentOS deployments.

  9. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the management and regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    • HTTP_PROXY and HTTPS_PROXY: http://proxy.example.com:port for anonymous access, or http://user:password@proxy.example.com:port for restricted access

    • NO_PROXY: comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for a VMware vSphere-based cluster.

  10. Optional. Configure external identity provider for IAM.

  11. Optional. If you are going to use your own TLS certificates for Keycloak, set DISABLE_OIDC=true in bootstrap.env.

  12. Run the bootstrap script:

    ./bootstrap.sh all
    

    In case of deployment issues, refer to Troubleshooting. If the script fails for an unknown reason:

    1. Run the cleanup script:

      ./bootstrap.sh cleanup
      
    2. Rerun the bootstrap script.

    Note

    If the bootstrap fails on the Connecting to bootstrap cluster step with the unable to initialize Tiller in bootstrap cluster: failed to establish connection with tiller error, refer to the known issue 16873 to identify the possible root cause of the issue and apply the workaround, if applicable.

  13. When the bootstrap is complete, collect and save the following management cluster details in a secure location:

    • The kubeconfig file located in the same directory as the bootstrap script. This file contains the admin credentials for the management cluster.

    • The private ssh_key for access to the management cluster nodes that is located in the same directory as the bootstrap script.

      Note

      If the initial version of your Container Cloud management cluster was earlier than 2.6.0, ssh_key is named openstack_tmp and is located at ~/.ssh/.

    • The URL for the Container Cloud web UI.

      To create users with permissions required for accessing the Container Cloud web UI, see Create initial users after a management cluster bootstrap.

    • The StackLight endpoints. For details, see Access StackLight web UIs.

    • The Keycloak URL that the system outputs when the bootstrap completes. The admin password for Keycloak is located in kaas-bootstrap/passwords.yml along with other IAM passwords.

    Note

    The Container Cloud web UI and StackLight endpoints are available through Transport Layer Security (TLS) and communicate with Keycloak to authenticate users. Keycloak is exposed using HTTPS and self-signed TLS certificates that are not trusted by web browsers.

    To use your own TLS certificates for Keycloak, refer to Configure TLS certificates for management cluster applications.

    Note

    When the bootstrap is complete, the bootstrap cluster resources are freed up.

Now, you can proceed with operating your management cluster using the Container Cloud web UI and deploying managed clusters as described in Create and operate a VMware vSphere-based managed cluster.

Prepare the OVF template

To deploy Mirantis Container Cloud on a vSphere-based environment, the OVF template for cluster machines must be prepared according to the following requirements:

  1. The VMware Tools package is installed.

  2. The cloud-init utility is installed and configured with the specific VMwareGuestInfo data source.

  3. For RHEL deployments, the virt-who service is enabled and configured to connect to the VMware vCenter Server to properly apply the RHEL subscriptions on the nodes. The virt-who service can run on a standalone machine or can be integrated into a VM template.

The following procedures describe how to meet the requirements above either using the Container Cloud script or manually.

To prepare the OVF template using the Container Cloud script:

  1. Prepare the Container Cloud bootstrap and modify templates/vsphere/vsphere-config.yaml.template and templates/vsphere/cluster.yaml.template as described in Bootstrap a management cluster, steps 1-9.

  2. Download the ISO image depending on the target OS:

  3. Export the following variables:

    • The path to the downloaded ISO file.

    • The vSphere cluster name.

    • The OS name: rhel or centos.

    • The OS version: 7.8 or 7.9 for RHEL, 7.9 for CentOS.

    • Optional. The virt-who user name and password for RHEL deployments.

    For example, for RHEL:

    export KAAS_VSPHERE_ENABLED=true
    export VSPHERE_RO_USER=virt-who-user
    export VSPHERE_RO_PASSWORD=virt-who-user-password
    export VSPHERE_PACKER_ISO_FILE=$(pwd)/iso-file.dvd.iso
    export VSPHERE_CLUSTER_NAME=vsphere-cluster-name
    export VSPHERE_PACKER_IMAGE_OS_NAME=rhel
    export VSPHERE_PACKER_IMAGE_OS_VERSION=7.9
    
    Optional variables

    Variable

    Description

    VSPHERE_VM_NETWORK_DEVICE

    Network interface name in a virtual machine. Defaults to eth0.

    VSPHERE_VM_TIMEZONE

    Time zone for virtual machines. Defaults to America/New_York.

  4. Optional. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    • HTTP_PROXY and HTTPS_PROXY: http://proxy.example.com:port for anonymous access, or http://user:password@proxy.example.com:port for restricted access

    • NO_PROXY: comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for a VMware vSphere-based cluster.

  5. Prepare the OVF template:

    ./bootstrap.sh vsphere_template
    
  6. After the template is prepared, set the SET_VSPHERE_TEMPLATE_PATH parameter in templates/vsphere/machines.yaml.template as described in Bootstrap a management cluster.

To prepare the OVF template manually:

  1. Run a virtual machine in the vSphere data center with the DVD ISO mounted to it. Specify the amount of resources that will be used in the Container Cloud setup. The minimal resources configuration must match the Requirements for a VMware vSphere-based cluster.

  2. Bootstrap the OS using vSphere Web Console. Select a minimal setup in the VM installation configuration. Create a user with root or sudo permissions to access the machine.

  3. Log in to the VM when it starts.

  4. Optional. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    • HTTP_PROXY and HTTPS_PROXY: http://proxy.example.com:port for anonymous access, or http://user:password@proxy.example.com:port for restricted access

    • NO_PROXY: comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for a VMware vSphere-based cluster.

  5. For RHEL, attach your RHEL license for Virtual Datacenters to the VM:

    subscription-manager register
    # automatic subscription selection:
    subscription-manager attach --auto
    # or specify pool id:
    subscription-manager attach --pool=<POOL_ID>
    # verify subscription status
    subscription-manager status
    
  6. Select from the following options:

    • Prepare the operating system automatically:

      1. Download the automation script:

        curl https://gerrit.mcp.mirantis.com/plugins/gitiles/kubernetes/vmware-guestinfo/+/refs/tags/v1.1.3/install.sh?format=TEXT | \
        base64 -d > install.sh
        chmod +x install.sh
        
      2. Optional. For RHEL, export the vCenter Server credentials of the read-only user. For example:

        export VC_SERVER='vcenter1.example.com'
        export VC_USER='domain\vmware_read_only_username'
        export VC_PASSWORD='password!23'
        # optional parameters:
        export VC_HYPERVISOR_ID=hostname
        export VC_FILTER_HOSTS="esx1.example.com, esx2.example.com"
        export VCENTER_CONFIG_PATH="/etc/virt-who.d/vcenter.conf"
        
      3. Run the installation script:

        ./install.sh
        
    • Prepare the operating system manually:

      1. Install the open-vm-tools package version 11.0.5 or later with dependencies and verify its version:

        yum install open-vm-tools net-tools perl -y
        vmtoolsd --version
        vmware-toolbox-cmd --version
        
      2. Install and configure cloud-init:

        1. Install the cloud-init package version 19.4 or later and verify its version:

          yum install cloud-init -y
          cloud-init --version
          
        2. Download the VMwareGuestInfo data source files:

          curl https://gerrit.mcp.mirantis.com/plugins/gitiles/kubernetes/vmware-guestinfo/+/refs/tags/v1.1.3/DataSourceVMwareGuestInfo.py?format=TEXT | \
          base64 -d > DataSourceVMwareGuestInfo.py
          curl https://gerrit.mcp.mirantis.com/plugins/gitiles/kubernetes/vmware-guestinfo/+/refs/tags/v1.1.3/99-DataSourceVMwareGuestInfo.cfg?format=TEXT | \
          base64 -d > 99-DataSourceVMwareGuestInfo.cfg
          
        3. Add 99-DataSourceVMwareGuestInfo.cfg to /etc/cloud/cloud.cfg.d/.

        4. Depending on the Python version of the VM operating system, add DataSourceVMwareGuestInfo.py to the cloud-init sources folder. To obtain the path to the cloud-init sources folder on the OS, run:

          python -c 'import os; from cloudinit import sources; print(os.path.dirname(sources.__file__));'
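
          For example, if the command above prints /usr/lib/python2.7/site-packages/cloudinit/sources (the actual path depends on the OS and Python version), copy the data source file there:

          cp DataSourceVMwareGuestInfo.py /usr/lib/python2.7/site-packages/cloudinit/sources/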
          
      3. Prepare the virt-who user configuration:

        Note

        For details about the virt-who user creation, see Prepare the VMware deployment user setup and permissions.

        1. Install virt-who:

          yum install virt-who -y
          cp /etc/virt-who.d/template.conf /etc/virt-who.d/vcenter.conf
          
        2. Set up the file content using the following example:

          [vcenter]
          type=esx
          server=vcenter1.example.com
          username=domain\vmware_read_only_username
          encrypted_password=bd257f93d@482B76e6390cc54aec1a4d
          owner=1234567
          hypervisor_id=hostname
          filter_hosts=esx1.example.com, esx2.example.com
          
          virt-who configuration parameters

          Parameter

          Description

          [vcenter]

          Name of the vCenter data center.

          type=esx

          Specifies the connection of the defined virt-who user to the vCenter Server.

          server

          The FQDN of the vCenter Server.

          username

          The virt-who user name on the vCenter Server with the read-only access.

          encrypted_password

          The virt-who password encrypted by the virt-who-password utility using the virt-who-password -p <password> command.

          owner

          The organization that the hypervisors belong to.

          hypervisor_id

          Specifies how to identify the hypervisors. Use a host name to provide meaningful host names to the Subscription Management. Alternatively, use uuid or hwuuid to avoid duplication in case of hypervisor renaming.

          filter_hosts

          List of hypervisors that never run RHEL VMs. Such hypervisors do not have to be reported by virt-who.
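
        After filling in /etc/virt-who.d/vcenter.conf, optionally validate the configuration with a one-time debug run and verify that the service is enabled (a minimal sketch, assuming systemd manages the virt-who service on the template VM):

          virt-who --one-shot --debug
          systemctl enable --now virt-who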

  7. For CentOS, verify that the yum mirrors are set to use only the *.centos.org URLs. Otherwise, access to other mirrors may be blocked by squid-proxy on managed clusters. For details, see Configure squid-proxy.

  8. For RHEL, remove the RHEL subscription from the node:

    subscription-manager remove --all
    subscription-manager unregister
    subscription-manager clean
    
  9. Shut down the VM.

  10. Create an OVF template from the VM.

Now, proceed to Bootstrap a management cluster.

Configure squid-proxy

By default, squid-proxy allows access only to the official RedHat subscription.rhsm.redhat.com and .cdn.redhat.com URLs or to the CentOS *.centos.org mirrors.

If you use the RedHat Satellite server or if you want to access specific yum repositories of RedHat or CentOS, allow those domains (or IP addresses) in the squid-proxy configuration on the management or regional cluster.

Note

You can apply the procedure below before or after the management or regional cluster deployment.

To configure squid-proxy for access to specific domains:

  1. Modify the allowed domains for squid-proxy in the regional Helm releases configuration for the vsphere provider using the example below.

    • For new deployments, modify templates/vsphere/cluster.yaml.template

    • For existing deployments, modify the management or regional cluster configuration:

      kubectl edit cluster <mgmtOrRegionalClusterName> -n <projectName>
      

    Example configuration:

    spec:
      ...
      providerSpec:
        value:
          ...
          kaas:
            ...
            regional:
              - helmReleases:
                ...
                - name: squid-proxy
                  values:
                    config:
                      domains:
                        rhel:
                        - .subscription.rhsm.redhat.com
                        - .cdn.redhat.com
                        - .centos.org
                        - .satellite.server.org
                        - .custom.centos.mirror.org
                        - 172.16.10.10
                provider: vsphere
    
  2. On a deployed cluster, verify that the configuration is applied properly by inspecting the squid-proxy ConfigMap:

    kubectl describe configmap squid-proxy -n kaas
    

    The squid.conf data should include the provided domains. For example:

    acl rhel dstdomain .subscription.rhsm.redhat.com .cdn.redhat.com .centos.org .satellite.server.org .custom.centos.mirror.org 172.16.10.10
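
    To narrow the output down to the relevant line only (an optional shortcut), you can filter the ConfigMap data:

    kubectl get configmap squid-proxy -n kaas -o yaml | grep dstdomain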
    

Deploy an additional regional cluster (optional)

After you bootstrap a management cluster of the required cloud provider type, you can optionally deploy an additional regional cluster of the same or different provider type. For details about regions, see Container Cloud regions.

Perform this procedure if you wish to operate managed clusters across clouds from a single Mirantis Container Cloud management plane.

Caution

  • A regional cluster requires access to the management cluster.

  • If you deploy a management cluster on a public cloud, such as AWS, Equinix Metal, or Microsoft Azure, you can add any type of regional cluster.

  • If you deploy a management cluster on a private cloud, such as OpenStack or vSphere, you can add only private-based regional clusters.

Multi-regional deployment enables you to create managed clusters of several provider types using one management cluster. For example, you can bootstrap an AWS-based management cluster and deploy an OpenStack-based regional cluster on this management cluster. Such a setup enables creation of OpenStack-based and AWS-based managed clusters with Kubernetes deployments.

Note

The integration of baremetal-based support for deploying additional regional clusters is in the development stage and will be announced separately in one of the upcoming Container Cloud releases.

Note

If the bootstrap node for deployment of an additional regional cluster is not the same where you bootstrapped the management cluster, first prepare the bootstrap as described in Configure the bootstrap node.

This section describes how to deploy an additional OpenStack, AWS, Equinix Metal, VMware vSphere, or Azure-based regional cluster on an existing management cluster.

Configure the bootstrap node

This section describes how to prepare a new bootstrap node for an additional regional cluster deployment on top of the management cluster. To use the same node where you bootstrapped the management cluster, skip this instruction and proceed to deploying a regional cluster of the required provider type.

To configure a new bootstrap node for a regional cluster:

  1. Install and configure Docker:

    1. Log in to any personal computer or VM running Ubuntu 18.04 that you will be using as the bootstrap node.

    2. If you use a newly created VM, run:

      sudo apt-get update
      
    3. Install the current Docker version available for Ubuntu 18.04:

      sudo apt install docker.io
      
    4. Grant your USER access to the Docker daemon:

      sudo usermod -aG docker $USER
      
    5. Log off and log in again to the bootstrap node to apply the changes.

    6. Verify that Docker is configured correctly and has access to Container Cloud CDN. For example:

      docker run --rm alpine sh -c "apk add --no-cache curl; \
      curl https://binary.mirantis.com"
      

      The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

      Note

      If you require all Internet access to go through a proxy server for security and audit purposes, configure Docker proxy settings as described in the official Docker documentation.

  2. Prepare the bootstrap script:

    1. Download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      chmod 0755 get_container_cloud.sh
      ./get_container_cloud.sh
      
    2. Change the directory to the kaas-bootstrap folder created by the script.

  3. If you deleted the mirantis.lic file used during the management cluster bootstrap:

    1. Create a user account at www.mirantis.com.

    2. Log in to your account and download the mirantis.lic license file.

    3. Save the license file as mirantis.lic under the kaas-bootstrap directory on the bootstrap node.

  4. On the new bootstrap node, save the management cluster kubeconfig that was created after the management cluster bootstrap.
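
    For example, to copy the file from the original bootstrap node (a sketch; the user name, host name, and path are placeholders that depend on your environment):

    scp <user>@<originalBootstrapNode>:<pathToKaasBootstrapDir>/kubeconfig .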

Now, proceed to deploying an additional regional cluster of the required provider type as described in Deploy an additional regional cluster.

Deploy an AWS-based regional cluster

You can deploy an additional regional AWS-based cluster to create managed clusters of several provider types or with different configurations.

To deploy an AWS-based regional cluster:

  1. Log in to the node where you bootstrapped a management cluster.

  2. Verify that the bootstrap directory is updated.

    Select from the following options:

    • For clusters deployed using Container Cloud 2.11.0 or later:

      ./container-cloud bootstrap download --management-kubeconfig <pathToMgmtKubeconfig> \
      --target-dir <pathToBootstrapDirectory>
      
    • For clusters deployed using the Container Cloud release earlier than 2.11.0 or if you deleted the kaas-bootstrap folder, download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      chmod 0755 get_container_cloud.sh
      ./get_container_cloud.sh
      
  3. Prepare the AWS configuration for the new regional cluster:

    1. Verify access to the target cloud endpoint from Docker. For example:

      docker run --rm alpine sh -c "apk add --no-cache curl; \
      curl https://ec2.amazonaws.com"
      

      The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

    2. Change the directory to the kaas-bootstrap folder.

    3. In templates/aws/machines.yaml.template, modify the spec:providerSpec:value section by substituting the ami:id parameter with the corresponding value for Ubuntu 18.04 from the required AWS region. For example:

      spec:
        providerSpec:
          value:
            apiVersion: aws.kaas.mirantis.com/v1alpha1
            kind: AWSMachineProviderSpec
            instanceType: c5d.4xlarge
            ami:
              id: ami-033a0960d9d83ead0
      

      Also, modify other parameters as required.
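
      If you need to look up the AMI ID for a different AWS region (an optional sketch that assumes the AWS CLI is installed and configured for that region), you can query the latest Canonical-owned Ubuntu 18.04 image:

      aws ec2 describe-images --owners 099720109477 \
        --filters "Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-*" \
        --query 'sort_by(Images, &CreationDate)[-1].ImageId' --output text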

    4. Optional. In templates/aws/cluster.yaml.template, modify the default configuration of the AWS instance types and AMI IDs for further creation of managed clusters:

      providerSpec:
          value:
            ...
            kaas:
              ...
              regional:
              - provider: aws
                helmReleases:
                  - name: aws-credentials-controller
                    values:
                      config:
                        allowedInstanceTypes:
                          minVCPUs: 8
                          # in MiB
                          minMemory: 16384
                          # in GB
                          minStorage: 120
                          supportedArchitectures:
                          - "x86_64"
                          filters:
                          - name: instance-storage-info.disk.type
                            values:
                              - "ssd"
                        allowedAMIs:
                        -
                          - name: name
                            values:
                            - "ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20200729"
                          - name: owner-id
                            values:
                            - "099720109477"
      

      Also, modify other parameters as required.

  4. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    • HTTP_PROXY, HTTPS_PROXY:

      • http://proxy.example.com:port - for anonymous access

      • http://user:password@proxy.example.com:port - for restricted access

    • NO_PROXY:

      Comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for an AWS-based cluster.
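
    Optionally, verify that the proxy forwards HTTPS traffic to the Mirantis resources before starting the bootstrap. The following check is a sketch that assumes the example proxy address above; adjust it to your environment:

    curl -x http://proxy.example.com:3128 -I https://binary.mirantis.com/releases/get_container_cloud.sh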

  5. Optional. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/aws/cluster.yaml.template, add the ntp:servers section with the list of the required server names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: aws-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: aws
                ...
    
  6. Configure the bootstrapper.cluster-api-provider-aws.kaas.mirantis.com user created in the previous steps:

    1. Using your AWS Management Console, generate the AWS Access Key ID with Secret Access Key for bootstrapper.cluster-api-provider-aws.kaas.mirantis.com and select the AWS default region name.

      Note

      Other authorization methods, such as using AWS_SESSION_TOKEN, are not supported.

    2. Export the AWS bootstrapper.cluster-api-provider-aws.kaas.mirantis.com user credentials that were created in the previous step:

      export KAAS_AWS_ENABLED=true
      export AWS_SECRET_ACCESS_KEY=XXXXXXX
      export AWS_ACCESS_KEY_ID=XXXXXXX
      export AWS_DEFAULT_REGION=us-east-2
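
      Optionally, if the AWS CLI is installed on the bootstrap node, verify that the exported credentials are valid. This check is a sketch, not part of the Container Cloud tooling; the returned ARN must belong to the bootstrapper.cluster-api-provider-aws.kaas.mirantis.com user:

      aws sts get-caller-identity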
      
  7. Export the following parameters:

    export KUBECONFIG=<pathToMgmtClusterKubeconfig>
    export REGIONAL_CLUSTER_NAME=<newRegionalClusterName>
    export REGION=<NewRegionName>
    

    Substitute the parameters enclosed in angle brackets with the corresponding values of your cluster.

    Caution

    The REGION and REGIONAL_CLUSTER_NAME parameter values must contain only lowercase alphanumeric characters, hyphens, or periods.

    Note

    If the bootstrap node for the regional cluster deployment is not the same where you bootstrapped the management cluster, also export SSH_KEY_NAME. It is required for the management cluster to create a publicKey Kubernetes CRD with the public part of your newly generated ssh_key for the regional cluster.

    export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
    
  8. Run the regional cluster bootstrap script:

    ./bootstrap.sh deploy_regional
    

    Note

    When the bootstrap is complete, obtain and save in a secure location the kubeconfig-<regionalClusterName> file located in the same directory as the bootstrap script. This file contains the admin credentials for the regional cluster.

    If the bootstrap node for the regional cluster deployment is not the same where you bootstrapped the management cluster, a new regional ssh_key will be generated. Make sure to save this key in a secure location as well.

    The workflow of the regional cluster bootstrap script

    1. Prepare the bootstrap cluster for the new regional cluster.

    2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.

    3. Connect to each machine of the management cluster through SSH.

    4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.

    5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.

    6. Forward the bootstrap cluster endpoint to helm-controller.

    7. Wait for all CRDs to be available and verify the objects created using these CRDs.

    8. Pivot the cluster API stack to the regional cluster.

    9. Switch the LCM agent from the bootstrap cluster to the regional one.

    10. Wait for the Container Cloud components to start on the regional cluster.

Now, you can proceed with deploying the managed clusters of supported provider types as described in Create and operate managed clusters.

Deploy an Azure-based regional cluster

You can deploy an additional regional Azure-based cluster to create managed clusters of several provider types or with different configurations.

To deploy an Azure-based regional cluster:

  1. Log in to the node where you bootstrapped a management cluster.

  2. Prepare the Azure configuration for the new regional cluster:

    1. Create an Azure service principal. Skip this step to use an existing Azure service principal.

      1. Create a Microsoft Azure account.

      2. Install Azure CLI.

      3. Log in to the Azure CLI:

        az login
        
      4. List your Azure accounts:

        az account list -o table
        
      5. If more than one account exists, select the account dedicated for Container Cloud:

        az account set -s <subscriptionID>
        
      6. Create an Azure service principal:

        Caution

        The Owner role is required to create role assignments.

        az ad sp create-for-rbac --role contributor
        

        Example of system response:

        {
           "appId": "0c87aM5a-e172-182b-a91a-a9b8d39ddbcd",
           "displayName": "azure-cli-2021-08-04-15-25-16",
           "name": "1359ac72-5794-494d-b787-1d7309b7f8bc",
           "password": "Q1jB2-7Uz6Cka7xos6vL-Ddb4BQx2vgMl",
           "tenant": "6d498697-7anvd-4172-a7v0-4e5b2e25f280"
        }
        
    2. Change the directory to kaas-bootstrap.

    3. Export the following parameter:

      export KAAS_AZURE_ENABLED=true
      
    4. In templates/azure/azure-config.yaml.template, modify the following parameters using credentials obtained in the previous steps or using credentials of an existing Azure service principal obtained from the subscription owner:

      • spec:subscriptionID is the subscription ID of your Azure account

      • spec:tenantID is the value of "tenant"

      • spec:clientID is the value of "appId"

      • spec:clientSecret:value is the value of "password"

      For example:

      spec:
        subscriptionID: b8bea78f-zf7s-s7vk-s8f0-642a6v7a39c1
        tenantID: 6d498697-7anvd-4172-a7v0-4e5b2e25f280
        clientID: 0c87aM5a-e172-182b-a91a-a9b8d39ddbcd
        clientSecret:
          value: Q1jB2-7Uz6Cka7xos6vL-Ddb4BQx2vgMl
      
    5. In templates/azure/cluster.yaml.template, modify the default configuration of the Azure cluster location. The location is an Azure region for which your subscription has quota.

      To obtain the list of available locations, run:

      az account list-locations -o=table
      

      For example:

      providerSpec:
        value:
        ...
          location: southcentralus
      

      Also, modify other parameters as required.
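
      Optionally, verify that the selected location offers the VM sizes that you plan to use. The following check is an illustrative sketch that assumes the southcentralus location from the example above:

      az vm list-skus --location southcentralus --resource-type virtualMachines --output table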

  3. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    • HTTP_PROXY, HTTPS_PROXY:

      • http://proxy.example.com:port - for anonymous access

      • http://user:password@proxy.example.com:port - for restricted access

    • NO_PROXY:

      Comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for an Azure-based cluster.

  4. Optional. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/azure/cluster.yaml.template, add the ntp:servers section with the list of the required server names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: azure-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: azure
                ...
    
  5. Export the following parameters:

    export KUBECONFIG=<pathToMgmtClusterKubeconfig>
    export REGIONAL_CLUSTER_NAME=<newRegionalClusterName>
    export REGION=<NewRegionName>
    

    Substitute the parameters enclosed in angle brackets with the corresponding values of your cluster.

    Caution

    The REGION and REGIONAL_CLUSTER_NAME parameter values must contain only lowercase alphanumeric characters, hyphens, or periods.

    Note

    If the bootstrap node for the regional cluster deployment is not the same where you bootstrapped the management cluster, also export SSH_KEY_NAME. It is required for the management cluster to create a publicKey Kubernetes CRD with the public part of your newly generated ssh_key for the regional cluster.

    export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
    
  6. Run the regional cluster bootstrap script:

    ./bootstrap.sh deploy_regional
    

    Note

    When the bootstrap is complete, obtain and save in a secure location the kubeconfig-<regionalClusterName> file located in the same directory as the bootstrap script. This file contains the admin credentials for the regional cluster.

    If the bootstrap node for the regional cluster deployment is not the same where you bootstrapped the management cluster, a new regional ssh_key will be generated. Make sure to save this key in a secure location as well.

    The workflow of the regional cluster bootstrap script

    1. Prepare the bootstrap cluster for the new regional cluster.

    2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.

    3. Connect to each machine of the management cluster through SSH.

    4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.

    5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.

    6. Forward the bootstrap cluster endpoint to helm-controller.

    7. Wait for all CRDs to be available and verify the objects created using these CRDs.

    8. Pivot the cluster API stack to the regional cluster.

    9. Switch the LCM agent from the bootstrap cluster to the regional one.

    10. Wait for the Container Cloud components to start on the regional cluster.

Now, you can proceed with deploying the managed clusters of supported provider types as described in Create and operate managed clusters.

Deploy an Equinix Metal based regional cluster

You can deploy an additional regional Equinix Metal based cluster to create managed clusters of several provider types or with different configurations.

To deploy an Equinix Metal based regional cluster:

  1. Configure BGP for your Equinix Metal project as described in Equinix Metal project setup.

  2. Log in to the node where you bootstrapped the Container Cloud management cluster.

  3. Verify that the bootstrap directory is updated.

    Select from the following options:

    • For clusters deployed using Container Cloud 2.11.0 or later:

      ./container-cloud bootstrap download --management-kubeconfig <pathToMgmtKubeconfig> \
      --target-dir <pathToBootstrapDirectory>
      
    • For clusters deployed using a Container Cloud release earlier than 2.11.0, or if you deleted the kaas-bootstrap folder, download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      
      chmod 0755 get_container_cloud.sh
      
      ./get_container_cloud.sh
      
  4. Prepare the Equinix Metal configuration for the new regional cluster:

    1. Log in to the Equinix Metal console.

    2. Select the project that you want to use for the Container Cloud deployment.

    3. In Project Settings > General, capture your Project ID.

    4. In Profile Settings > Personal API Keys, capture the existing user-level API Key or create a new one:

      1. In Profile Settings > Personal API Keys, click Add New Key.

      2. Fill in the Description and select the Read/Write permissions.

      3. Click Add Key.

    5. Change the directory to kaas-bootstrap.

    6. In templates/equinix/equinix-config.yaml.template, modify spec:projectID and spec:apiToken:value using the values obtained in the previous steps. For example:

      spec:
        projectID: g98sd6f8-dc7s-8273-v8s7-d9v7395nd91
        apiToken:
          value: Bi3m9c7qjYBD3UgsnSCSsqs2bYkbK
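
      Optionally, verify that the API token has access to the selected project. The following check is a sketch that queries the public Equinix Metal API directly and is not part of the Container Cloud tooling:

      curl -s -H "X-Auth-Token: <userApiToken>" \
        https://api.equinix.com/metal/v1/projects/<projectID>

      The response must contain the project details and no authentication errors.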
      
    7. In templates/equinix/cluster.yaml.template, modify the default configuration of the Equinix Metal facility depending on the previously prepared capacity settings:

      providerSpec:
        value:
        ...
          facility: am6
      

      Also, modify other parameters as required.

    8. Optional. In templates/equinix/machines.yaml.template, modify the default configuration of the Equinix Metal machine type. The minimal required type is c3.small.x86.

      providerSpec:
        value:
        ...
          machineType: c3.small.x86
      

      Also, modify other parameters as required.

  5. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    • HTTP_PROXY, HTTPS_PROXY:

      • http://proxy.example.com:port - for anonymous access

      • http://user:password@proxy.example.com:port - for restricted access

    • NO_PROXY:

      Comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for an Equinix Metal based cluster.

  6. Optional. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/equinix/cluster.yaml.template, add the ntp:servers section with the list of the required server names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: equinix-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: equinixmetal
                ...
    
  7. Export the following parameters:

    export KAAS_EQUINIX_ENABLED=true
    export KUBECONFIG=<pathToMgmtClusterKubeconfig>
    export REGIONAL_CLUSTER_NAME=<newRegionalClusterName>
    export REGION=<NewRegionName>
    

    Substitute the parameters enclosed in angle brackets with the corresponding values of your cluster.

    Caution

    The REGION and REGIONAL_CLUSTER_NAME parameter values must contain only lowercase alphanumeric characters, hyphens, or periods.

    Note

    If the bootstrap node for the regional cluster deployment is not the same where you bootstrapped the management cluster, also export SSH_KEY_NAME. It is required for the management cluster to create a publicKey Kubernetes CRD with the public part of your newly generated ssh_key for the regional cluster.

    export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
    
  8. Run the regional cluster bootstrap script:

    ./bootstrap.sh deploy_regional
    

    Note

    When the bootstrap is complete, obtain and save in a secure location the kubeconfig-<regionalClusterName> file located in the same directory as the bootstrap script. This file contains the admin credentials for the regional cluster.

    If the bootstrap node for the regional cluster deployment is not the same where you bootstrapped the management cluster, a new regional ssh_key will be generated. Make sure to save this key in a secure location as well.

    The workflow of the regional cluster bootstrap script

    1. Prepare the bootstrap cluster for the new regional cluster.

    2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.

    3. Connect to each machine of the management cluster through SSH.

    4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.

    5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.

    6. Forward the bootstrap cluster endpoint to helm-controller.

    7. Wait for all CRDs to be available and verify the objects created using these CRDs.

    8. Pivot the cluster API stack to the regional cluster.

    9. Switch the LCM agent from the bootstrap cluster to the regional one.

    10. Wait for the Container Cloud components to start on the regional cluster.

Now, you can proceed with deploying the managed clusters of supported provider types as described in Create and operate managed clusters.

Deploy an OpenStack-based regional cluster

You can deploy an additional regional OpenStack-based cluster to create managed clusters of several provider types or with different configurations.

To deploy an OpenStack-based regional cluster:

  1. Log in to the node where you bootstrapped a management cluster.

  2. Verify that the bootstrap directory is updated.

    Select from the following options:

    • For clusters deployed using Container Cloud 2.11.0 or later:

      ./container-cloud bootstrap download --management-kubeconfig <pathToMgmtKubeconfig> \
      --target-dir <pathToBootstrapDirectory>
      
    • For clusters deployed using a Container Cloud release earlier than 2.11.0, or if you deleted the kaas-bootstrap folder, download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      
      chmod 0755 get_container_cloud.sh
      
      ./get_container_cloud.sh
      
  3. Prepare the OpenStack configuration for a new regional cluster:

    1. Log in to the OpenStack Horizon.

    2. In the Project section, select API Access.

    3. In the right-side drop-down menu Download OpenStack RC File, select OpenStack clouds.yaml File.

    4. Save the downloaded clouds.yaml file in the kaas-bootstrap folder created by the get_container_cloud.sh script.

    5. In clouds.yaml, add the password field with your OpenStack password under the clouds/openstack/auth section.

      Example:

      clouds:
        openstack:
          auth:
            auth_url: https://auth.openstack.example.com:5000/v3
            username: your_username
            password: your_secret_password
            project_id: your_project_id
            user_domain_name: your_user_domain_name
          region_name: RegionOne
          interface: public
          identity_api_version: 3
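
      Optionally, if the python-openstackclient package is installed on the bootstrap node, verify the credentials from the kaas-bootstrap folder that contains clouds.yaml. This check is a sketch and assumes the cloud entry name openstack from the example above:

      openstack --os-cloud openstack token issue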
      
    6. Verify access to the target cloud endpoint from Docker. For example:

      docker run --rm alpine sh -c "apk add --no-cache curl; \
      curl https://auth.openstack.example.com:5000/v3"
      

      The system output must contain no error records.

    In case of issues, follow the steps provided in Troubleshooting.

  4. Configure the cluster and machines metadata:

    1. In templates/machines.yaml.template, modify the spec:providerSpec:value section for 3 control plane nodes marked with the cluster.sigs.k8s.io/control-plane label by substituting the flavor and image parameters with the corresponding values of the control plane nodes in the related OpenStack cluster. For example:

      spec: &cp_spec
        providerSpec:
          value:
            apiVersion: "openstackproviderconfig.k8s.io/v1alpha1"
            kind: "OpenstackMachineProviderSpec"
            flavor: kaas.minimal
            image: bionic-server-cloudimg-amd64-20190612
      

      Note

      The flavor parameter value provided in the example above is cloud-specific and must meet the Container Cloud requirements.

      Also, modify other parameters as required.

    2. Modify the templates/cluster.yaml.template parameters to fit your deployment. For example, add the corresponding values for cidrBlocks in the spec::clusterNetwork::services section.

  5. Optional. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/cluster.yaml.template, add the ntp:servers section with the list of the required server names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: openstack-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: openstack
                ...
    
  6. Optional. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    • HTTP_PROXY, HTTPS_PROXY:

      • http://proxy.example.com:port - for anonymous access

      • http://user:password@proxy.example.com:port - for restricted access

    • NO_PROXY:

      Comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for an OpenStack-based cluster.

  7. Clean up the environment configuration:

    1. If you are deploying the regional cluster on top of a baremetal-based management cluster, unset the following parameters:

      unset KAAS_BM_ENABLED KAAS_BM_FULL_PREFLIGHT KAAS_BM_PXE_IP \
            KAAS_BM_PXE_MASK KAAS_BM_PXE_BRIDGE KAAS_BM_BM_DHCP_RANGE \
            TEMPLATES_DIR
      
    2. If you are deploying the regional cluster on top of an AWS-based management cluster, unset the KAAS_AWS_ENABLED parameter:

      unset KAAS_AWS_ENABLED
      
  8. Export the following parameters:

    export KUBECONFIG=<pathToMgmtClusterKubeconfig>
    export REGIONAL_CLUSTER_NAME=<newRegionalClusterName>
    export REGION=<NewRegionName>
    

    Substitute the parameters enclosed in angle brackets with the corresponding values of your cluster.

    Caution

    The REGION and REGIONAL_CLUSTER_NAME parameter values must contain only lowercase alphanumeric characters, hyphens, or periods.

    Note

    If the bootstrap node for the regional cluster deployment is not the same where you bootstrapped the management cluster, also export SSH_KEY_NAME. It is required for the management cluster to create a publicKey Kubernetes CRD with the public part of your newly generated ssh_key for the regional cluster.

    export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
    
  9. Run the regional cluster bootstrap script:

    ./bootstrap.sh deploy_regional
    

    Note

    When the bootstrap is complete, obtain and save in a secure location the kubeconfig-<regionalClusterName> file located in the same directory as the bootstrap script. This file contains the admin credentials for the regional cluster.

    If the bootstrap node for the regional cluster deployment is not the same where you bootstrapped the management cluster, a new regional ssh_key will be generated. Make sure to save this key in a secure location as well.

    The workflow of the regional cluster bootstrap script

    1. Prepare the bootstrap cluster for the new regional cluster.

    2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.

    3. Connect to each machine of the management cluster through SSH.

    4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.

    5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.

    6. Forward the bootstrap cluster endpoint to helm-controller.

    7. Wait for all CRDs to be available and verify the objects created using these CRDs.

    8. Pivot the cluster API stack to the regional cluster.

    9. Switch the LCM agent from the bootstrap cluster to the regional one.

    10. Wait for the Container Cloud components to start on the regional cluster.

Now, you can proceed with deploying the managed clusters of supported provider types as described in Create and operate managed clusters.

Deploy a VMware vSphere-based regional cluster

You can deploy an additional regional VMware vSphere-based cluster to create managed clusters of several provider types or with different configurations.

To deploy a vSphere-based regional cluster:

  1. Log in to the node where you bootstrapped a management cluster.

  2. Verify that the bootstrap directory is updated.

    Select from the following options:

    • For clusters deployed using Container Cloud 2.11.0 or later:

      ./container-cloud bootstrap download --management-kubeconfig <pathToMgmtKubeconfig> \
      --target-dir <pathToBootstrapDirectory>
      
    • For clusters deployed using a Container Cloud release earlier than 2.11.0, or if you deleted the kaas-bootstrap folder, download and run the Container Cloud bootstrap script:

      wget https://binary.mirantis.com/releases/get_container_cloud.sh
      
      chmod 0755 get_container_cloud.sh
      
      ./get_container_cloud.sh
      
  3. Verify access to the target vSphere cluster from Docker. For example:

    docker run --rm alpine sh -c "apk add --no-cache curl; \
    curl https://vsphere.server.com"
    

    The system output must contain no error records. In case of issues, follow the steps provided in Troubleshooting.

  4. Prepare deployment templates:

    1. Modify templates/vsphere/vsphere-config.yaml.template:

      vSphere configuration data

      • SET_VSPHERE_SERVER - IP address or FQDN of the vCenter Server.

      • SET_VSPHERE_SERVER_PORT - Port of the vCenter Server. For example, port: "8443". Leave empty to use 443 by default.

      • SET_VSPHERE_DATACENTER - vSphere data center name.

      • SET_VSPHERE_SERVER_INSECURE - Flag that controls validation of the vSphere Server certificate. Must be true or false.

      • SET_VSPHERE_CAPI_PROVIDER_USERNAME - vSphere Cluster API provider user name that you added when preparing the deployment user setup and permissions.

      • SET_VSPHERE_CAPI_PROVIDER_PASSWORD - vSphere Cluster API provider user password.

      • SET_VSPHERE_CLOUD_PROVIDER_USERNAME - vSphere Cloud Provider deployment user name that you added when preparing the deployment user setup and permissions.

      • SET_VSPHERE_CLOUD_PROVIDER_PASSWORD - vSphere Cloud Provider deployment user password.

    2. Modify the templates/vsphere/cluster.yaml.template parameters to fit your deployment. For example, add the corresponding values for cidrBlocks in the spec::clusterNetwork::services section.

      Required parameters

      • SET_LB_HOST - IP address from the provided vSphere network for load balancer (Keepalived).

      • SET_VSPHERE_METALLB_RANGE - MetalLB range of IP addresses that can be assigned to load balancers for Kubernetes Services.

      • SET_VSPHERE_DATASTORE - Name of the vSphere datastore. You can use different datastores for vSphere Cluster API and vSphere Cloud Provider.

      • SET_VSPHERE_MACHINES_FOLDER - Path to a folder where the cluster machines metadata will be stored.

      • SET_VSPHERE_NETWORK_PATH - Path to a network for cluster machines.

      • SET_VSPHERE_RESOURCE_POOL_PATH - Path to a resource pool in which VMs will be created.

    3. For either a DHCP or non-DHCP vSphere network:

      1. Determine the vSphere network parameters as described in VMware vSphere network objects and IPAM recommendations.

      2. Provide the following additional parameters for a proper network setup on machines using embedded IP address management (IPAM) in templates/vsphere/cluster.yaml.template:

      vSphere configuration data

      • ipamEnabled - Enables IPAM. Set to true for networks without DHCP.

      • SET_VSPHERE_NETWORK_CIDR - CIDR of the provided vSphere network. For example, 10.20.0.0/16.

      • SET_VSPHERE_NETWORK_GATEWAY - Gateway of the provided vSphere network.

      • SET_VSPHERE_CIDR_INCLUDE_RANGES - Optional. IP range for the cluster machines. Specify a range within the provided CIDR. For example, 10.20.0.100-10.20.0.200.

      • SET_VSPHERE_CIDR_EXCLUDE_RANGES - Optional. IP ranges to be excluded from being assigned to the cluster machines. The MetalLB range and SET_LB_HOST must not intersect with the addresses for IPAM. For example, 10.20.0.150-10.20.0.170.

      • SET_VSPHERE_NETWORK_NAMESERVERS - List of nameservers for the provided vSphere network.

    4. For RHEL deployments, fill out templates/vsphere/rhellicenses.yaml.template using one of the following set of parameters for RHEL machines subscription:

      • The user name and password of your RedHat Customer Portal account associated with your RHEL license for Virtual Datacenters.

        Optionally, provide the subscription allocation pools to use for the RHEL subscription activation. If not needed, remove the poolIDs field so that subscription-manager automatically selects the licenses for machines.

        For example:

        spec:
          username: <username>
          password:
            value: <password>
          poolIDs:
          - <pool1>
          - <pool2>
        
      • The activation key and organization ID associated with your RedHat account with RHEL license for Virtual Datacenters. The activation key can be created by the organization administrator on RedHat Customer Portal.

        If you use the RedHat Satellite server for management of your RHEL infrastructure, you can provide a pre-generated activation key from that server. In this case:

        • Provide the URL to the RedHat Satellite RPM for installation of the CA certificate that belongs to that server.

        • Configure squid-proxy on the management or regional cluster to allow access to your Satellite server. For details, see Configure squid-proxy.

        For example:

        spec:
          activationKey:
            value: <activation key>
          orgID: "<organization ID>"
          rpmUrl: <rpm url>
        

      Caution

      Provide only one set of parameters. Mixing parameters from different activation methods will cause a deployment failure.

    5. For CentOS deployments, in templates/vsphere/rhellicenses.yaml.template, remove all lines under items:.

  5. Optional. Configure the regional NTP server parameters to be applied to all machines of regional and managed clusters in the specified region.

    In templates/vsphere/cluster.yaml.template, add the ntp:servers section with the list of the required server names:

    spec:
      ...
      providerSpec:
        value:
          kaas:
          ...
            regional:
              - helmReleases:
                - name: vsphere-provider
                  values:
                    config:
                      lcm:
                        ...
                        ntp:
                          servers:
                          - 0.pool.ntp.org
                          ...
                provider: vsphere
                ...
    
  6. Prepare the OVF template as described in Prepare the OVF template.

  7. In templates/vsphere/machines.yaml.template:

    • Define SET_VSPHERE_TEMPLATE_PATH prepared in the previous step

    • Modify other parameters as required

    spec:
      providerSpec:
        value:
          apiVersion: vsphere.cluster.k8s.io/v1alpha1
          kind: VsphereMachineProviderSpec
          rhelLicense: <rhel-license-name>
          network:
            devices:
            - dhcp4: true
              dhcp6: false
          template: <SET_VSPHERE_TEMPLATE_PATH>
    

    Note

    The <rhel-license-name> value is the RHEL license name defined in rhellicenses.yaml.template, which defaults to kaas-mgmt-rhel-license. Remove or comment out this parameter for CentOS deployments.
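
    Optionally, if you use the govc CLI, which is not part of the Container Cloud tooling, you can verify that the template path resolves before running the bootstrap. The following sketch assumes that the GOVC_URL, GOVC_USERNAME, and GOVC_PASSWORD environment variables are already exported:

    govc vm.info "<SET_VSPHERE_TEMPLATE_PATH>"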

  8. Optional. If you require all Internet access to go through a proxy server, in bootstrap.env, add the following environment variables to bootstrap the regional cluster using proxy:

    • HTTP_PROXY

    • HTTPS_PROXY

    • NO_PROXY

    Example snippet:

    export HTTP_PROXY=http://proxy.example.com:3128
    export HTTPS_PROXY=http://user:pass@proxy.example.com:3128
    export NO_PROXY=172.18.10.0,registry.internal.lan
    

    The following variable formats are accepted:

    Proxy configuration data

    • HTTP_PROXY, HTTPS_PROXY:

      • http://proxy.example.com:port - for anonymous access

      • http://user:password@proxy.example.com:port - for restricted access

    • NO_PROXY:

      Comma-separated list of IP addresses or domain names

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for a VMware vSphere-based cluster.

  9. Export the following parameters:

    export KAAS_VSPHERE_ENABLED=true
    export KUBECONFIG=<pathToMgmtClusterKubeconfig>
    export REGIONAL_CLUSTER_NAME=<newRegionalClusterName>
    export REGION=<NewRegionName>
    

    Substitute the parameters enclosed in angle brackets with the corresponding values of your cluster.

    Caution

    The REGION and REGIONAL_CLUSTER_NAME parameter values must contain only lowercase alphanumeric characters, hyphens, or periods.

    Note

    If the bootstrap node for the regional cluster deployment is not the same where you bootstrapped the management cluster, also export SSH_KEY_NAME. It is required for the management cluster to create a publicKey Kubernetes CRD with the public part of your newly generated ssh_key for the regional cluster.

    export SSH_KEY_NAME=<newRegionalClusterSshKeyName>
    
  10. Run the regional cluster bootstrap script:

    ./bootstrap.sh deploy_regional
    

    Note

    When the bootstrap is complete, obtain and save in a secure location the kubeconfig-<regionalClusterName> file located in the same directory as the bootstrap script. This file contains the admin credentials for the regional cluster.

    If the bootstrap node for the regional cluster deployment is not the same where you bootstrapped the management cluster, a new regional ssh_key will be generated. Make sure to save this key in a secure location as well.

    The workflow of the regional cluster bootstrap script

    1. Prepare the bootstrap cluster for the new regional cluster.

    2. Load the updated Container Cloud CRDs for Credentials, Cluster, and Machines with information about the new regional cluster to the management cluster.

    3. Connect to each machine of the management cluster through SSH.

    4. Wait for the Machines and Cluster objects of the new regional cluster to be ready on the management cluster.

    5. Load the following objects to the new regional cluster: Secret with the management cluster kubeconfig and ClusterRole for the Container Cloud provider.

    6. Forward the bootstrap cluster endpoint to helm-controller.

    7. Wait for all CRDs to be available and verify the objects created using these CRDs.

    8. Pivot the cluster API stack to the regional cluster.

    9. Switch the LCM agent from the bootstrap cluster to the regional one.

    10. Wait for the Container Cloud components to start on the regional cluster.

Now, you can proceed with deploying the managed clusters of supported provider types as described in Create and operate managed clusters.

Create initial users after a management cluster bootstrap

Once you bootstrap your management or regional cluster, create Keycloak users for access to the Container Cloud web UI. Use the created credentials to log in to the Container Cloud web UI. Mirantis recommends creating at least two users, reader and writer, that are required for a typical Container Cloud deployment.

To create the user for access to the Container Cloud web UI, use the following command:

./container-cloud bootstrap user add --username <userName> --roles <roleName> \
--kubeconfig <pathToMgmtKubeconfig>

Note

You will be asked for the user password interactively.

Set the following command flags as required:

  • --username - Required. Name of the user to create.

  • --roles - Required. Role to assign to the user:

    • writer - read and write access

    • reader - view access

    • operator - required for bare metal deployments only to create and manage the BaremetalHost objects

  • --kubeconfig - Required. Path to the management cluster kubeconfig generated during the management cluster bootstrap.

  • --namespace - Optional. Name of the Container Cloud project where the user will be created. If not set, a global user will be created for all Container Cloud projects with the corresponding role access to view or manage all Container Cloud public objects.

  • --password-stdin - Optional. Flag to provide the user password from a file or stdin. For example:

    echo "$PASSWORD" | ./container-cloud bootstrap user add --username <userName> --roles <roleName> --kubeconfig <pathToMgmtKubeconfig> --password-stdin
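
For example, to create the two recommended global users, where the writer-user and reader-user names are illustrative:

./container-cloud bootstrap user add --username writer-user --roles writer \
--kubeconfig <pathToMgmtKubeconfig>

./container-cloud bootstrap user add --username reader-user --roles reader \
--kubeconfig <pathToMgmtKubeconfig>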

To delete the user, run the following command:

./container-cloud bootstrap user delete --username <userName> --kubeconfig <pathToMgmtKubeconfig>

Troubleshooting

This section provides solutions to the issues that may occur while deploying a management cluster.

Collect the bootstrap logs

If the bootstrap script fails during the deployment process, collect and inspect the bootstrap and management cluster logs.

To collect the bootstrap logs:

  1. Log in to your local machine where the bootstrap script was executed.

  2. Run the following command:

    ./bootstrap.sh collect_logs
    

    The logs are collected in the directory where the bootstrap script is located.

  3. Technology Preview. For bare metal clusters, assess the Ironic pod logs:

    • Extract the content of the 'message' fields from every log message:

      kubectl -n kaas logs <ironicPodName> -c syslog | jq -rM '.message'
      
    • Extract the content of the 'message' fields from the ironic_conductor source log messages:

      kubectl -n kaas logs <ironicPodName> -c syslog | jq -rM 'select(.source == "ironic_conductor") | .message'
      

    The syslog container collects logs generated by Ansible during the node deployment and cleanup and outputs them in the JSON format.


The Container Cloud logs structure in <output_dir>/<cluster_name>/ is as follows:

  • /events.log - human-readable table that contains information about the cluster events

  • /system - system logs

  • /system/<machine_name>/ucp - Mirantis Kubernetes Engine (MKE) logs

  • /objects/cluster - logs of the non-namespaced Kubernetes objects

  • /objects/namespaced - logs of the namespaced Kubernetes objects

  • /objects/namespaced/<namespaceName>/core/pods - pods logs from a specified Kubernetes namespace

  • /objects/namespaced/<namespaceName>/core/pods/<containerName>.prev.log - logs of the pods from a specified Kubernetes namespace that were previously removed or failed

  • /objects/namespaced/<namespaceName>/core/pods/<ironicPodName>/syslog.log - Technology Preview. Ironic pod logs of the bare metal clusters

    Note

    Logs collected by the syslog container during the bootstrap phase are not transferred to the management cluster during pivoting. These logs are located in /volume/log/ironic/ansible_conductor.log inside the Ironic pod.

Depending on the type of issue found in logs, apply the corresponding fixes. For example, if you detect the LoadBalancer ERROR state errors during the bootstrap of an OpenStack-based management cluster, contact your system administrator to fix the issue. To troubleshoot other issues, refer to the corresponding section in Troubleshooting.

Troubleshoot the bootstrap node configuration

This section provides solutions to the issues that may occur while configuring the bootstrap node.

DNS settings

If you have issues related to the DNS settings, the following error message may occur:

curl: (6) Could not resolve host

The issue may occur if a VPN is used to connect to the cloud or a local DNS forwarder is set up.

The workaround is to change the default DNS settings for Docker:

  1. Log in to your local machine.

  2. Identify your internal or corporate DNS server address:

    systemd-resolve --status
    
  3. Create or edit /etc/docker/daemon.json by specifying your DNS address:

    {
      "dns": ["<YOUR_DNS_ADDRESS>"]
    }
    
  4. Restart the Docker daemon:

    sudo systemctl restart docker
    
Default network address

If you have issues related to the default network address configuration, cURL either hangs or the following error occurs:

curl: (7) Failed to connect to xxx.xxx.xxx.xxx port xxxx: Host is unreachable

The issue may occur because the default Docker network address 172.17.0.0/16 and/or the Docker network used by kind overlap with your cloud address or other addresses of the network configuration.

Workaround:

  1. Log in to your local machine.

  2. Verify routing to the IP addresses of the target cloud endpoints:

    1. Obtain the IP address of your target cloud. For example:

      nslookup auth.openstack.example.com
      

      Example of system response:

      Name:   auth.openstack.example.com
      Address: 172.17.246.119
      
    2. Verify that this IP address is not routed through docker0 but through any other interface, for example, ens3:

      ip r get 172.17.246.119
      

      Example of the system response if the routing is configured correctly:

      172.17.246.119 via 172.18.194.1 dev ens3 src 172.18.1.1 uid 1000
        cache
      

      Example of the system response if the routing is configured incorrectly:

      172.17.246.119 via 172.18.194.1 dev docker0 src 172.18.1.1 uid 1000
        cache
      
  3. If the routing is incorrect, change the IP address of the default Docker bridge:

    1. Create or edit /etc/docker/daemon.json by adding the "bip" option:

      {
        "bip": "192.168.91.1/24"
      }
      
    2. Restart the Docker daemon:

      sudo systemctl restart docker
      
  4. If required, customize addresses for your kind Docker network or any other additional Docker networks:

    1. Remove the kind network:

      docker network rm 'kind'
      
    2. Choose from the following options:

      • Configure /etc/docker/daemon.json:

        Note

        The following steps customize addresses for the kind Docker network. Use these steps as an example for any other additional Docker networks.

        1. Add the following section to /etc/docker/daemon.json:

          {
           "default-address-pools":
           [
             {"base":"192.169.0.0/16","size":24}
           ]
          }
          
        2. Restart the Docker daemon:

          sudo systemctl restart docker
          

          After Docker restart, the newly created local or global scope networks, including 'kind', will be dynamically assigned a subnet from the defined pool.

      • Recreate the 'kind' Docker network manually with a subnet that is not in use in your network. For example:

        docker network create -o com.docker.network.bridge.enable_ip_masquerade=true -d bridge --subnet 192.168.0.0/24 'kind'
        

        Caution

        Docker pruning removes the user-defined networks, including 'kind'. Therefore, after running the Docker pruning commands, re-create the 'kind' network using the command above.

Troubleshoot OpenStack-based deployments

This section provides solutions to the issues that may occur while deploying an OpenStack-based management cluster. To troubleshoot a managed cluster, see Operations Guide: Troubleshooting.

TLS handshake timeout

If you execute the bootstrap.sh script from an OpenStack VM that is running on the OpenStack environment used for bootstrapping the management cluster, the following error messages, which can be related to an MTU settings discrepancy, may occur:

curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to server:port

Failed to check if machine "<machine_name>" exists:
failed to create provider client ... TLS handshake timeout

To identify whether the issue is MTU-related:

  1. Log in to the OpenStack VM in question.

  2. Compare the MTU outputs for the docker0 and ens3 interfaces:

    ip addr
    

    Example of system response:

    3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500...
    ...
    2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450...
    

    If the MTU output values differ for docker0 and ens3, proceed with the workaround below. Otherwise, inspect the logs further to identify the root cause of the error messages.

Workaround:

  1. In your OpenStack environment used for Mirantis Container Cloud, log in to any machine with CLI access to OpenStack. For example, you can create a new Ubuntu VM (separate from the bootstrap VM) and install the python-openstackclient package on it.

  2. Change the vXLAN MTU size for the VM to the required value depending on your network infrastructure and considering your physical network configuration, such as Jumbo frames, and so on.

    openstack network set --mtu <YOUR_MTU_SIZE> <network-name>
    
  3. Stop and start the VM in Nova.

  4. Log in to the bootstrap VM dedicated for the management cluster.

  5. Re-execute the bootstrap.sh script.

Troubleshoot vSphere-based deployments

This section provides solutions to the issues that may occur while deploying a vSphere-based management cluster. To troubleshoot a managed cluster, see Operations Guide: Troubleshooting.

Virtual machine issues with obtaining an IP

Issues with virtual machines obtaining an IP may occur during the machines deployment of the vSphere-based Container Cloud management or managed cluster with IPAM enabled.

The issue symptoms are as follows:

  • On a cluster network with a DHCP server, the machine obtains a wrong IP address that is most likely provided by the DHCP server. The cluster deployment proceeds with unexpected IP addresses that are not in the IPAM range.

  • On a cluster network without a DHCP server, the machine does not obtain an IP address. The deployment freezes and fails by timeout.

To apply the issue resolution:

  1. Verify that the cloud-init package version in the VM template is 19.4 or later. Older versions are affected by a known cloud-init bug.

    cloud-init --version
    
  2. Verify that the open-vm-tools package version is 11.0.5 or later.

    vmtoolsd --version
    vmware-toolbox-cmd --version
    
  3. Verify that the /etc/cloud/cloud.cfg.d/99-DataSourceVMwareGuestInfo.cfg file is present on the cluster and it is not empty.

  4. Verify that the DataSourceVMwareGuestInfo.py file is present in the cloud-init sources folder and is not empty. To obtain the cloud-init folder:

    python -c 'import os; from cloudinit import sources; print(os.path.dirname(sources.__file__));'
    

If your deployment meets the requirements described in the verification steps above but the issue still persists, rebuild the VM template as described in Prepare the OVF template or contact Mirantis support.

Configure external identity provider for IAM

This section describes how to configure authentication for Mirantis Container Cloud depending on the external identity provider type integrated to your deployment.

Configure LDAP for IAM

If you integrate LDAP for IAM to Mirantis Container Cloud, add the required LDAP configuration to cluster.yaml.template during the bootstrap of the management cluster.

Note

The example below defines the recommended non-anonymous authentication type. If you require anonymous authentication, replace the following parameters with authType: "none":

authType: "simple"
bindCredential: ""
bindDn: ""

To configure LDAP for IAM:

  1. Select from the following options:

    • For a baremetal-based management cluster, open the templates/bm/cluster.yaml.template file for editing.

    • For an OpenStack management cluster, open the templates/cluster.yaml.template file for editing.

    • For an AWS-based management cluster, open the templates/aws/cluster.yaml.template file for editing.

  2. Configure the keycloak:userFederation:providers: and keycloak:userFederation:mappers: sections as required:

    Note

    • Verify that the userFederation section is located on the same level as the initUsers section.

    • Verify that all attributes set in the mappers section are defined for users in the specified LDAP system. Missing attributes may cause authorization issues.

    spec:
      providerSpec:
        value:
          kaas:
            management:
              helmReleases:
              - name: iam
                values:
                  keycloak:
                    userFederation:
                      providers:
                        - displayName: "<LDAP_NAME>"
                          providerName: "ldap"
                          priority: 1
                          fullSyncPeriod: -1
                          changedSyncPeriod: -1
                          config:
                            pagination: "true"
                            debug: "false"
                            searchScope: "1"
                            connectionPooling: "true"
                            usersDn: "<DN>" # "ou=People, o=<ORGANIZATION>, dc=<DOMAIN_COMPONENT>"
                            userObjectClasses: "inetOrgPerson,organizationalPerson"
                            usernameLDAPAttribute: "uid"
                            rdnLDAPAttribute: "uid"
                            vendor: "ad"
                            editMode: "READ_ONLY"
                            uuidLDAPAttribute: "uid"
                            connectionUrl: "ldap://<LDAP_DNS>"
                            syncRegistrations: "false"
                            authType: "simple"
                            bindCredential: ""
                            bindDn: ""
                      mappers:
                        - name: "username"
                          federationMapperType: "user-attribute-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.attribute: "uid"
                            user.model.attribute: "username"
                            is.mandatory.in.ldap: "true"
                            read.only: "true"
                            always.read.value.from.ldap: "false"
                        - name: "full name"
                          federationMapperType: "full-name-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.full.name.attribute: "cn"
                            read.only: "true"
                            write.only: "false"
                        - name: "last name"
                          federationMapperType: "user-attribute-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.attribute: "sn"
                            user.model.attribute: "lastName"
                            is.mandatory.in.ldap: "true"
                            read.only: "true"
                            always.read.value.from.ldap: "true"
                        - name: "email"
                          federationMapperType: "user-attribute-ldap-mapper"
                          federationProviderDisplayName: "<LDAP_NAME>"
                          config:
                            ldap.attribute: "mail"
                            user.model.attribute: "email"
                            is.mandatory.in.ldap: "false"
                            read.only: "true"
                            always.read.value.from.ldap: "true"
    

Now, return to the bootstrap instruction depending on the provider type of your management cluster.

Configure Google OAuth IdP for IAM

Caution

The instruction below applies to the DNS-based management clusters. If you bootstrap a non-DNS-based management cluster, configure Google OAuth IdP for Keycloak after bootstrap using the official Keycloak documentation.

If you integrate Google OAuth external identity provider for IAM to Mirantis Container Cloud, create the authorization credentials for IAM in your Google OAuth account and configure cluster.yaml.template during the bootstrap of the management cluster.

To configure Google OAuth IdP for IAM:

  1. Create Google OAuth credentials for IAM:

    1. Log in to https://console.developers.google.com.

    2. Navigate to Credentials.

    3. In the APIs Credentials menu, select OAuth client ID.

    4. In the window that opens:

      1. In the Application type menu, select Web application.

      2. In the Authorized redirect URIs field, type in <keycloak-url>/auth/realms/iam/broker/google/endpoint, where <keycloak-url> is the corresponding DNS address.

      3. Press Enter to add the URI.

      4. Click Create.

      A page with your client ID and client secret opens. Save these credentials for further usage.

  2. Log in to the bootstrap node.

  3. Select from the following options:

    • For a baremetal-based management cluster, open the templates/bm/cluster.yaml.template file for editing.

    • For an OpenStack management cluster, open the templates/cluster.yaml.template file for editing.

    • For an AWS-based management cluster, open the templates/aws/cluster.yaml.template file for editing.

  4. In the keycloak:externalIdP: section, add the following snippet with your credentials created in previous steps:

    keycloak:
      externalIdP:
        google:
          enabled: true
          config:
            clientId: <Google_OAuth_client_ID>
            clientSecret: <Google_OAuth_client_secret>
    

Now, return to the bootstrap instruction depending on the provider type of your management cluster.

Operations Guide

Mirantis Container Cloud CLI

The Mirantis Container Cloud APIs are implemented using the Kubernetes CustomResourceDefinitions (CRDs) that enable you to expand the Kubernetes API. For details, see API Reference.

You can operate Container Cloud using the kubectl command-line tool that is based on the Kubernetes API. For the kubectl reference, see the official Kubernetes documentation.
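
For example, you can list the Container Cloud custom resources registered in the management cluster. The following commands are a sketch; the API group filter is an assumption based on the CRD names used in this guide:

kubectl --kubeconfig <pathToMgmtClusterKubeconfig> get crds | grep kaas.mirantis.com
kubectl --kubeconfig <pathToMgmtClusterKubeconfig> get clusters --all-namespaces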

The Container Cloud Operations Guide mostly contains manuals that describe the Container Cloud web UI that is intuitive and easy to get started with. Some sections are divided into a web UI instruction and an analogous but more advanced CLI one. Certain Container Cloud operations can be performed only using CLI with the corresponding steps described in dedicated sections. For details, refer to the required component section of this guide.

Create and operate managed clusters

Note

This tutorial applies only to the Container Cloud web UI users with the writer access role assigned by the Infrastructure Operator. To add a bare metal host, the operator access role is also required.

After you deploy the Mirantis Container Cloud management cluster, you can start creating managed clusters that will be based on the same cloud provider type that you have for the management or regional cluster: OpenStack, AWS, bare metal, VMware vSphere, Microsoft Azure, or Equinix Metal.

The deployment procedure is performed using the Container Cloud web UI and comprises the following steps:

  1. For a baremetal-based managed cluster, create and configure bare metal hosts with corresponding labels for machines such as worker, manager, or storage.

  2. Create an initial cluster configuration depending on the provider type.

  3. Add the required amount of machines with the corresponding configuration to the managed cluster.

  4. For a baremetal-based or Equinix Metal-based managed cluster, add a Ceph cluster.

Note

The Container Cloud web UI communicates with Keycloak to authenticate users. Keycloak is exposed using HTTPS with self-signed TLS certificates that are not trusted by web browsers.

To use your own TLS certificates for Keycloak, refer to Configure TLS certificates for management cluster applications.

Create and operate a baremetal-based managed cluster

After bootstrapping your baremetal-based Mirantis Container Cloud management cluster as described in Deploy a baremetal-based management cluster, you can start creating the baremetal-based managed clusters using the Container Cloud web UI.

Add a bare metal host

Before creating a bare metal managed cluster, add the required number of bare metal hosts either using the Container Cloud web UI for a default configuration or using CLI for an advanced configuration.

Add a bare metal host using web UI

This section describes how to add bare metal hosts using the Container Cloud web UI during a managed cluster creation.

Before you proceed with adding a bare metal host:

  • Verify that the physical network on the server has been configured correctly. See Network fabric for details.

  • Enable the boot NIC support for UEFI load. Usually, at least the built-in network interfaces support it.

  • Enable the UEFI-LAN-OPROM support in BIOS -> Advanced -> PCI/PCIe.

  • Enable the IPv4-PXE stack.

  • Set the following boot order:

    1. UEFI-DISK

    2. UEFI-PXE

  • If your PXE network is not configured to use the first network interface, fix the UEFI-PXE boot order to speed up node discovery by selecting only one required network interface.

  • Power off all bare metal hosts.

Warning

Only one Ethernet port on a host must be connected to the Common/PXE network at any given time. The physical address (MAC) of this interface must be noted and used to configure the BareMetalHost object describing the host.

To add a bare metal host to a baremetal-based managed cluster:

  1. Optional. Create a custom bare metal host profile depending on your needs as described in Create a custom bare metal host profile.

    Note

    You can view the created profiles in the BM Host Profiles tab of the Container Cloud web UI.

  2. Log in to the Container Cloud web UI with the operator permissions.

  3. Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.

  4. In the Baremetal tab, click Add BM host.

  5. Fill out the Add new BM host form as required:

    • Baremetal host name

      Specify the name of the new bare metal host.

    • Username

      Specify the name of the user for accessing the BMC (IPMI user).

    • Password

      Specify the password of the user for accessing the BMC (IPMI password).

    • Boot MAC address

      Specify the MAC address of the PXE network interface.

    • IP Address

      Specify the IP address to access the BMC.

    • Label

      Assign a machine label to the new host. The label defines which type of machine may be deployed on this bare metal host. Only one label can be assigned to a host. The supported labels include:

      • Manager

        This label is selected and set by default. Assign this label to the bare metal hosts that can be used to deploy machines with the manager type. These hosts must match the CPU and RAM requirements described in Reference hardware configuration.

      • Worker

        The host with this label may be used to deploy the worker machine type. Assign this label to the bare metal hosts that have sufficient CPU and RAM resources, as described in Reference hardware configuration.

      • Storage

        Assign this label to the bare metal hosts that have sufficient storage devices to match Reference hardware configuration. Hosts with this label will be used to deploy machines with the storage type that run Ceph OSDs.

  6. Click Create.

    While adding the bare metal host, Container Cloud discovers and inspects the hardware of the bare metal host and adds it to BareMetalHost.status for future reference.

    During provisioning, baremetal-operator inspects the bare metal host and moves it to the Preparing state. The host becomes ready to be linked to a bare metal machine.

  7. Verify the results of the hardware inspection to avoid unexpected errors during the host usage:

    1. In the BM Hosts tab, verify that the bare metal host is registered and switched to one of the following statuses:

      • Preparing for a newly added host

      • Ready for a previously used host or for a host that is already linked to a machine

    2. Click the name of the newly added bare metal host.

    3. In the window with the host details, scroll down to the Hardware section.

    4. Review the section and make sure that the number and models of disks, network interface cards, and CPUs match the hardware specification of the server.

      • If the hardware details are consistent with the physical server specifications for all your hosts, proceed to Create a managed cluster.

      • If you find any discrepancies in the hardware inspection results, it might indicate that the server has hardware issues or is not compatible with Container Cloud.

Add a bare metal host using CLI

This section describes how to add bare metal hosts using the Container Cloud CLI during a managed cluster creation.

To add a bare metal host using CLI:

  1. Verify that you configured each bare metal host as follows:

    • Enable the boot NIC support for UEFI load. Usually, at least the built-in network interfaces support it.

    • Enable the UEFI-LAN-OPROM support in BIOS -> Advanced -> PCI/PCIe.

    • Enable the IPv4-PXE stack.

    • Set the following boot order:

      1. UEFI-DISK

      2. UEFI-PXE

    • If your PXE network is not configured to use the first network interface, fix the UEFI-PXE boot order to speed up node discovery by selecting only one required network interface.

    • Power off all bare metal hosts.

    Warning

    Only one Ethernet port on a host must be connected to the Common/PXE network at any given time. The physical address (MAC) of this interface must be noted and used to configure the BareMetalHost object describing the host.

  2. Optional. Create a custom bare metal host profile depending on your needs as described in Create a custom bare metal host profile.

  3. Log in to the host where your management cluster kubeconfig is located and where kubectl is installed.

  4. Create a secret YAML file that describes the unique credentials of the new bare metal host.

    Example of the bare metal host secret:

    apiVersion: v1
    data:
      password: <credentials-password>
      username: <credentials-user-name>
    kind: Secret
    metadata:
      labels:
        kaas.mirantis.com/credentials: "true"
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
      name: <credentials-name>
      namespace: <managed-cluster-project-name>
    type: Opaque
    

    In the data section, add the base64-encoded IPMI user name and password that are used to access the BMC. To obtain the base64-encoded credentials, you can use the following command in your Linux console:

    echo -n <username|password> | base64
    

    Caution

    Each bare metal host must have a unique Secret.
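
    As an alternative to writing the Secret manifest manually, you can generate an equivalent object with kubectl, which performs the base64 encoding for you. This is a sketch; the placeholders are the same as in the example above, and the labels must match it. If you create the Secret this way, skip the next step:

    kubectl -n <managed-cluster-project-name> create secret generic <credentials-name> \
      --from-literal=username=<username> \
      --from-literal=password=<password>
    kubectl -n <managed-cluster-project-name> label secret <credentials-name> \
      kaas.mirantis.com/credentials=true \
      kaas.mirantis.com/provider=baremetal \
      kaas.mirantis.com/region=region-one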

  5. Apply this secret YAML file to your deployment:

    kubectl create -f <bmh-cred-file-name>.yaml
    
  6. Create a YAML file that contains a description of the new bare metal host.

    Example of the bare metal host configuration file with the worker role:

    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      labels:
        kaas.mirantis.com/baremetalhost-id: <unique-bare-metal-host-hardware-node-id>
        hostlabel.bm.kaas.mirantis.com/worker: "true"
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
      name: <bare-metal-host-unique-name>
      namespace: <managed-cluster-project-name>
    spec:
      bmc:
        address: <ip_address_for-bmc-access>
        credentialsName: <credentials-name>
      bootMACAddress: <bare-metal-host-boot-mac-address>
      online: true
    

    For a detailed fields description, see BareMetalHost.

  7. Apply this configuration YAML file to your deployment:

    kubectl create -f <bare-metal-host-config-file-name>.yaml
    

    During provisioning, baremetal-operator inspects the bare metal host and moves it to the Preparing state. The host becomes ready to be linked to a bare metal machine.
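
    To track the inspection progress from the CLI, you can poll the provisioning state of the BareMetalHost object, for example (the placeholders are the same as in the previous steps):

    kubectl -n <managed-cluster-project-name> get bmh <bare-metal-host-unique-name> \
      -o custom-columns='NAME:.metadata.name,STATUS:.status.provisioning.state'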

  8. Log in to the Container Cloud web UI with the operator permissions.

  9. Verify the results of the hardware inspection to avoid unexpected errors during the host usage:

    1. In the BM Hosts tab, verify that the bare metal host is registered and switched to one of the following statuses:

      • Preparing for a newly added host

      • Ready for a previously used host or for a host that is already linked to a machine

    2. Click the name of the newly added bare metal host.

    3. In the window with the host details, scroll down to the Hardware section.

    4. Review the section and make sure that the number and models of disks, network interface cards, and CPUs match the hardware specification of the server.

      • If the hardware details are consistent with the physical server specifications for all your hosts, proceed to Create a managed cluster.

      • If you find any discrepancies in the hardware inspection results, it might indicate that the server has hardware issues or is not compatible with Container Cloud.

Create a custom bare metal host profile

The bare metal host profile is a Kubernetes custom resource. It allows the operator to define how the storage devices and the operating system are provisioned and configured.

This section describes the bare metal host profile default settings and configuration of custom profiles for managed clusters using Mirantis Container Cloud API. This procedure also applies to a management cluster with a few differences described in Customize the default bare metal host profile.

Note

You can view the created profiles in the BM Host Profiles tab of the Container Cloud web UI.

Note

Using BareMetalHostProfile, you can configure LVM or mdadm-based software RAID support during a management or managed cluster creation. For details, see Configure RAID support.

This feature is available as Technology Preview. Use such configuration for testing and evaluation purposes only. For the Technology Preview feature definition, refer to Technology Preview support scope.

Default configuration of the host system storage

The default host profile requires three storage devices in the following strict order:

  1. Boot device and operating system storage

    This device contains boot data and operating system data. It is partitioned using the GUID Partition Table (GPT) labels. The root file system is an ext4 file system created on top of an LVM logical volume. For a detailed layout, refer to the table below.

  2. Local volumes device

    This device contains an ext4 file system with directories mounted as persistent volumes to Kubernetes. These volumes are used by the Mirantis Container Cloud services to store their data, including monitoring and identity databases.

  3. Ceph storage device

    This device is used as a Ceph datastore or Ceph OSD.

The following table summarizes the default configuration of the host system storage set up by the Container Cloud bare metal management.

Default configuration of the bare metal host storage

Device/partition

Name/Mount point

Recommended size

Description

/dev/sda1

bios_grub

4 MiB

The mandatory GRUB boot partition required for non-UEFI systems.

/dev/sda2

UEFI -> /boot/efi

0.2 GiB

The boot partition required for the UEFI boot mode.

/dev/sda3

config-2

64 MiB

The mandatory partition for the cloud-init configuration. Used during the first host boot for initial configuration.

/dev/sda4

lvm_root_part

100% of the remaining free space in the LVM volume group

The main LVM physical volume that is used to create the root file system.

/dev/sdb

lvm_lvp_part -> /mnt/local-volumes

100% of the remaining free space in the LVM volume group

The LVM physical volume that is used to create the file system for LocalVolumeProvisioner.

/dev/sdc

-

100% of the remaining free space in the LVM volume group

Clean raw disk that will be used for the Ceph storage back end.

If required, you can customize the default host storage configuration. For details, see Create a custom host profile.

Create a custom host profile

In addition to the default BareMetalHostProfile object installed with Mirantis Container Cloud, you can create custom profiles for managed clusters using Container Cloud API.

Note

The procedure below also applies to the Container Cloud management clusters.

To create a custom bare metal host profile:

  1. Select from the following options:

    • For a management cluster, log in to the bare metal seed node that will be used to bootstrap the management cluster.

    • For a managed cluster, log in to the local machine where your management cluster kubeconfig is located and where kubectl is installed.

      Note

      The management cluster kubeconfig is created automatically during the last stage of the management cluster bootstrap.

  2. Select from the following options:

    • For a management cluster, open templates/bm/baremetalhostprofiles.yaml.template for editing.

    • For a managed cluster, create a new bare metal host profile under the templates/bm/ directory.

  3. Edit the host profile using the example template below to meet your hardware configuration requirements:

    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHostProfile
    metadata:
      name: <PROFILE_NAME>
      namespace: <PROJECT_NAME>
    spec:
      devices:
      # From the HW node, obtain the first device whose size is at least 120 GiB
      - device:
          minSizeGiB: 120
          wipe: true
        partitions:
        - name: bios_grub
          partflags:
          - bios_grub
          sizeGiB: 0.00390625
          wipe: true
        - name: uefi
          partflags:
          - esp
          sizeGiB: 0.2
          wipe: true
        - name: config-2
          sizeGiB: 0.0625
          wipe: true
        - name: lvm_root_part
          sizeGiB: 0
          wipe: true
      # From the HW node, obtain the second device whose size is at least 120 GiB
      # If a device exists but does not fit the size,
      # the BareMetalHostProfile will not be applied to the node
      - device:
          minSizeGiB: 120
          wipe: true
      # From the HW node, obtain the disk device with the exact name
      - device:
          byName: /dev/nvme0n1
          minSizeGiB: 120
          wipe: true
        partitions:
        - name: lvm_lvp_part
          sizeGiB: 0
          wipe: true
      # Example of wiping a device without partitioning it.
      # Mandatory for the case when a disk is supposed to be used for Ceph back end
      # later
      - device:
          byName: /dev/sde
          wipe: true
      fileSystems:
      - fileSystem: vfat
        partition: config-2
      - fileSystem: vfat
        mountPoint: /boot/efi
        partition: uefi
      - fileSystem: ext4
        logicalVolume: root
        mountPoint: /
      - fileSystem: ext4
        logicalVolume: lvp
        mountPoint: /mnt/local-volumes/
      logicalVolumes:
      - name: root
        sizeGiB: 0
        vg: lvm_root
      - name: lvp
        sizeGiB: 0
        vg: lvm_lvp
      postDeployScript: |
        #!/bin/bash -ex
        echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
      preDeployScript: |
        #!/bin/bash -ex
        echo $(date) 'pre_deploy_script done' >> /root/pre_deploy_done
      volumeGroups:
      - devices:
        - partition: lvm_root_part
        name: lvm_root
      - devices:
        - partition: lvm_lvp_part
        name: lvm_lvp
      grubConfig:
        defaultGrubOptions:
        - GRUB_DISABLE_RECOVERY="true"
        - GRUB_PRELOAD_MODULES=lvm
        - GRUB_TIMEOUT=20
      kernelParameters:
        sysctl:
          kernel.panic: "900"
          kernel.dmesg_restrict: "1"
          kernel.core_uses_pid: "1"
          fs.file-max: "9223372036854775807"
          fs.aio-max-nr: "1048576"
          fs.inotify.max_user_instances: "4096"
          vm.max_map_count: "262144"
    
  4. To use multiple devices for an LVM volume, use the example template extract below for reference.

    Caution

    The following template extract contains only sections relevant to LVM configuration with multiple PVs. Expand the main template described in the previous step with the configuration below if required.

    spec:
      devices:
        ...
        - device:
          ...
          partitions:
            - name: lvm_lvp_part1
              sizeGiB: 0
              wipe: true
        - device:
          ...
          partitions:
            - name: lvm_lvp_part2
              sizeGiB: 0
              wipe: true
    volumeGroups:
      ...
      - devices:
        - partition: lvm_lvp_part1
        - partition: lvm_lvp_part2
        name: lvm_lvp
    logicalVolumes:
      ...
      - name: root
        sizeGiB: 0
        vg: lvm_lvp
    fileSystems:
      ...
      - fileSystem: ext4
        logicalVolume: root
        mountPoint: /
    
  5. Optional. Technology Preview. To configure support of the Redundant Array of Independent Disks (RAID) that allows, for example, installing a cluster operating system on a RAID device, refer to Configure RAID support.

  6. Add or edit the mandatory parameters in the new BareMetalHostProfile object. For the parameters description, see API: BareMetalHostProfile spec.

  7. Select from the following options:

    • For a management cluster, proceed with the cluster bootstrap procedure as described in Bootstrap a management cluster.

    • For a managed cluster:

      1. Add the bare metal host profile to your management cluster:

        kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> apply -f <pathToBareMetalHostProfileFile>
        
      2. If required, further modify the host profile:

        kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> edit baremetalhostprofile <hostProfileName>
        
      3. Proceed with Add a bare metal host either using web UI or CLI.

Enable huge pages in a host profile

The BareMetalHostProfile API allows configuring a host to use the huge pages feature of the Linux kernel on managed clusters.

Note

Huge pages is a mode of operation of the Linux kernel. With huge pages enabled, the kernel allocates the RAM in bigger chunks, or pages. This allows a KVM (kernel-based virtual machine) and VMs running on it to use the host RAM more efficiently and improves the performance of VMs.

To enable huge pages in a custom bare metal host profile for a managed cluster:

  1. Log in to the local machine where your management cluster kubeconfig is located and where kubectl is installed.

    Note

    The management cluster kubeconfig is created automatically during the last stage of the management cluster bootstrap.

  2. Open for editing or create a new bare metal host profile under the templates/bm/ directory.

  3. Edit the grubConfig section of the host profile spec using the example below to configure the kernel boot parameters and enable huge pages:

    spec:
      grubConfig:
        defaultGrubOptions:
        - GRUB_DISABLE_RECOVERY="true"
        - GRUB_PRELOAD_MODULES=lvm
        - GRUB_TIMEOUT=20
        - GRUB_CMDLINE_LINUX_DEFAULT="hugepagesz=1G hugepages=N"
    

    The example configuration above will allocate N huge pages of 1 GB each on the server boot. The last hugepagesz parameter value is used as the default unless default_hugepagesz is defined. For details about possible values, see the official Linux kernel documentation.
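
    After the host is provisioned and booted with the new kernel parameters, you can verify the huge pages allocation directly on the node, for example:

    grep -i huge /proc/meminfo
    # Output similar to the following indicates that N pages of 1 GiB were allocated:
    # HugePages_Total:       N
    # Hugepagesize:    1048576 kB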

  4. Add the bare metal host profile to your management cluster:

    kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> apply -f <pathToBareMetalHostProfileFile>
    
  5. If required, further modify the host profile:

    kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> edit baremetalhostprofile <hostProfileName>
    
  6. Proceed with Add a bare metal host.

Configure RAID support

Caution

This feature is available as Technology Preview. Use such configuration for testing and evaluation purposes only. For the Technology Preview feature definition, refer to Technology Preview support scope.

You can configure support of the software-based Redundant Array of Independent Disks (RAID) using BareMetalHostProfile to set up an LVM or mdadm-based RAID level 1 (raid1). If required, you can further configure RAID in the same profile, for example, to install a cluster operating system onto a RAID device.

Caution

  • RAID configuration on already provisioned bare metal machines or on an existing cluster is not supported.

    To start using any kind of RAID, reprovision the machines with a new BareMetalHostProfile.

  • Mirantis supports only the raid1 type of RAID devices for both LVM and mdadm. The raid0 type is also available for the mdadm RAID to be on par with the LVM linear type.

  • Mirantis recommends having at least two physical disks for RAID to prevent unnecessary complexity.

Create an LVM software RAID level 1 (raid1)

Caution

This feature is available as Technology Preview. Use such configuration for testing and evaluation purposes only. For the Technology Preview feature definition, refer to Technology Preview support scope.

During configuration of your custom bare metal host profile, you can create an LVM-based software RAID device raid1 by adding type: raid1 to the logicalVolume spec in BaremetalHostProfile.

Caution

The logicalVolume spec of the raid1 type requires at least two devices (partitions) in volumeGroup where you build a logical volume. For an LVM of the linear type, one device is enough.

Note

The LVM raid1 requires additional space to store the raid1 metadata on a volume group, roughly 4 MB for each partition. Therefore, you cannot create a logical volume of exactly the same size as the partitions it works on.

For example, if you have two partitions of 10 GiB, the corresponding raid1 logical volume size will be less than 10 GiB. For that reason, you can either set sizeGiB: 0 to use all available space on the volume group, or set a smaller size than the partition size. For example, use sizeGiB: 9.9 instead of sizeGiB: 10 for the logical volume.

The following example illustrates an extract of BaremetalHostProfile with / on the LVM raid1.

...
devices:
  - device:
      byName: /dev/sda
      minSizeGiB: 200
      type: hdd
      wipe: true
    partitions:
      - name: root_part1
        sizeGiB: 120
      - name: rest_sda
        sizeGiB: 0
  - device:
      byName: /dev/sdb
      minSizeGiB: 200
      type: hdd
      wipe: true
    partitions:
      - name: root_part2
        sizeGiB: 120
      - name: rest_sdb
        sizeGiB: 0
volumeGroups:
  - name: vg-root
    devices:
      - partition: root_part1
      - partition: root_part2
  - name: vg-data
    devices:
      - partition: rest_sda
      - partition: rest_sdb
logicalVolumes:
  - name: root
    type: raid1  ## <-- LVM raid1
    vg: vg-root
    sizeGiB: 119.9
  - name: data
    type: linear
    vg: vg-data
    sizeGiB: 0
fileSystems:
  - fileSystem: ext4
    logicalVolume: root
    mountPoint: /
    mountOpts: "noatime,nodiratime"
  - fileSystem: ext4
    logicalVolume: data
    mountPoint: /mnt/data
Create an mdadm software RAID level 1 (raid1)

Caution

This feature is available as Technology Preview. Use such configuration for testing and evaluation purposes only. For the Technology Preview feature definition, refer to Technology Preview support scope.

During configuration of your custom bare metal host profile as described in Create a custom bare metal host profile, you can create an mdadm-based software RAID device raid1 by describing the mdadm devices under the softRaidDevices field in BaremetalHostProfile. For example:

...
softRaidDevices:
- name: /dev/md0
   devices:
   - partition: sda1
   - partition: sdb1
- name: raid-name
   devices:
   - partition: sda2
   - partition: sdb2
...

The only two required fields to describe RAID devices are name and devices. The devices field must describe at least two partitions to build an mdadm RAID on it. For the mdadm RAID parameters, see API: BareMetalHostProfile spec.

Caution

The mdadm RAID devices cannot be created on top of LVM devices, and LVM devices cannot be created on top of mdadm devices.

The following example illustrates an extract of BaremetalHostProfile with / on the mdadm raid1 and some data storage on raid0:

...
devices:
  - device:
      byName: /dev/sda
      wipe: true
    partitions:
      - name: root_part1
        sizeGiB: 120
      - name: rest_sda
        sizeGiB: 0
  - device:
      byName: /dev/sdb
      wipe: true
    partitions:
      - name: root_part2
        sizeGiB: 120
      - name: rest_sdb
        sizeGiB: 0
softRaidDevices:
  - name: root
    level: raid1  ## <-- mdadm raid1
    devices:
      - partition: root_part1
      - partition: root_part2
  - name: data
    level: raid0  ## <-- mdadm raid0
    devices:
      - partition: rest_sda
      - partition: rest_sdb
fileSystems:
  - fileSystem: ext4
    softRaidDevice: root
    mountPoint: /
    mountOpts: "noatime,nodiratime"
  - fileSystem: ext4
    softRaidDevice: data
    mountPoint: /mnt/data
...
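
After a machine is provisioned with such a profile, you can verify the resulting software RAID layout directly on the node using standard Linux tools, for example:

cat /proc/mdstat
lsblk -o NAME,SIZE,TYPE,MOUNTPOINT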
Create a managed cluster

This section instructs you on how to configure and deploy a managed cluster that is based on the baremetal-based management cluster through the Mirantis Container Cloud web UI.

To create a managed cluster on bare metal:

  1. Log in to the Container Cloud web UI with the writer permissions.

  2. Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.

  3. In the SSH keys tab, click Add SSH Key to upload the public SSH key(s) that will be used for the SSH access to VMs.

  4. Optional. In the Proxies tab, enable proxy access to the managed cluster:

    1. Click Add Proxy.

    2. In the Add New Proxy wizard, fill out the form with the following parameters:

      Proxy configuration

      Parameter

      Description

      Proxy Name

      Name of the proxy server to use during a managed cluster creation.

      Region

      From the drop-down list, select the required region.

      HTTP Proxy

      Add the HTTP proxy server domain name in the following format:

      • http://proxy.example.com:port - for anonymous access

      • http://user:password@proxy.example.com:port - for restricted access

      HTTPS Proxy

      Add the HTTPS proxy server domain name in the same format as for HTTP Proxy.

      No Proxy

      Comma-separated list of IP addresses or domain names.

    For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Requirements for a baremetal-based cluster.

  5. In the Clusters tab, click Create Cluster.

  6. Configure the new cluster in the Create New Cluster wizard that opens:

    1. Define general and Kubernetes parameters:

      Create new cluster: General, Provider, and Kubernetes

      Section

      Parameter name

      Description

      General settings

      Cluster name

      The cluster name.

      Provider

      Select Baremetal.

      Region

      From the drop-down list, select Baremetal.

      Release version

      The Container Cloud version.

      Proxy

      Optional. From the drop-down list, select the proxy server name that you have previously created.

      SSH keys

      From the drop-down list, select the SSH key name(s) that you have previously added for SSH access to the bare metal hosts.

      Provider

      LB host IP

      The IP address of the load balancer endpoint that will be used to access the Kubernetes API of the new cluster. This IP address must be on the Common/PXE network.

      LB address range

      The range of IP addresses that can be assigned to load balancers for Kubernetes Services by MetalLB.

      Kubernetes

      Services CIDR blocks

      The Kubernetes Services CIDR blocks. For example, 10.233.0.0/18.

      Pods CIDR blocks

      The Kubernetes pods CIDR blocks. For example, 10.233.64.0/18.

    2. Configure StackLight:

      StackLight configuration

      Section

      Parameter name

      Description

      StackLight

      Enable Monitoring

      Selected by default. Deselect to skip StackLight deployment.

      Note

      You can also enable, disable, or configure StackLight parameters after deploying a managed cluster. For details, see Change a cluster configuration or Configure StackLight.

      Enable Logging

      Select to deploy the StackLight logging stack. For details about the logging components, see Deployment architecture.

      Note

      The logging mechanism performance depends on the cluster log load. In case of a high load, you may need to increase the default resource requests and limits for fluentdElasticsearch. For details, see StackLight configuration parameters: Resource limits.

      HA Mode

      Select to enable StackLight monitoring in the HA mode. For the differences between HA and non-HA modes, see Deployment architecture.

      StackLight Default Logs Severity Level

      Log severity (verbosity) level for all StackLight components. The default value for this parameter is Default component log level that respects original defaults of each StackLight component. For details about severity levels, see Log verbosity.

      StackLight Component Logs Severity Level

      The severity level of logs for a specific StackLight component that overrides the value of the StackLight Default Logs Severity Level parameter. For details about severity levels, see Log verbosity.

      Expand the drop-down menu for a specific component to display its list of available log levels.

      Elasticsearch

      Retention Time

      Available if you select Enable Logging. The Elasticsearch logs retention period.

      Persistent Volume Claim Size

      Available if you select Enable Logging. The Elasticsearch persistent volume claim size.

      Collected Logs Severity Level

      Available if you select Enable Logging. The minimum severity of all Container Cloud components logs collected in Elasticsearch. For details about severity levels, see Logging.

      Prometheus

      Retention Time

      The Prometheus database retention period.

      Retention Size

      The Prometheus database retention size.

      Persistent Volume Claim Size

      The Prometheus persistent volume claim size.

      Enable Watchdog Alert

      Select to enable the Watchdog alert that fires as long as the entire alerting pipeline is functional.

      Custom Alerts

      Specify alerting rules for new custom alerts or upload a YAML file in the following exemplary format:

      - alert: HighErrorRate
        expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
        for: 10m
        labels:
          severity: page
        annotations:
          summary: High request latency
      

      For details, see Official Prometheus documentation: Alerting rules. For the list of the predefined StackLight alerts, see Operations Guide: Available StackLight alerts.

      StackLight Email Alerts

      Enable Email Alerts

      Select to enable the StackLight email alerts.

      Send Resolved

      Select to enable notifications about resolved StackLight alerts.

      Require TLS

      Select to enable transmitting emails through TLS.

      Email alerts configuration for StackLight

      Fill out the following email alerts parameters as required:

      • To - the email address to send notifications to.

      • From - the sender address.

      • SmartHost - the SMTP host through which the emails are sent.

      • Authentication username - the SMTP user name.

      • Authentication password - the SMTP password.

      • Authentication identity - the SMTP identity.

      • Authentication secret - the SMTP secret.

      StackLight Slack Alerts

      Enable Slack alerts

      Select to enable the StackLight Slack alerts.

      Send Resolved

      Select to enable notifications about resolved StackLight alerts.

      Slack alerts configuration for StackLight

      Fill out the following Slack alerts parameters as required:

      • API URL - The Slack webhook URL.

      • Channel - The channel to send notifications to, for example, #channel-for-alerts.

  7. Click Create.

    To monitor the cluster readiness, hover over the status icon of a specific cluster in the Status column of the Clusters page.

    Once the orange blinking status icon is green and Ready, the cluster deployment or update is complete.

    You can monitor live deployment status of the following cluster components:

    Component

    Description

    Bastion

    For the OpenStack and AWS-based clusters, the Bastion node IP address status that confirms the Bastion node creation

    Helm

    Installation or upgrade status of all Helm releases

    Kubelet

    Readiness of the node in a Kubernetes cluster, as reported by kubelet

    Kubernetes

    Readiness of all requested Kubernetes objects

    Nodes

    Equality of the requested nodes number in the cluster to the number of nodes having the Ready LCM status

    OIDC

    Readiness of the cluster OIDC configuration

    StackLight

    Health of all StackLight-related objects in a Kubernetes cluster

    Swarm

    Readiness of all nodes in a Docker Swarm cluster

    LoadBalancer

    Readiness of the Kubernetes API load balancer

    ProviderInstance

    Readiness of all machines in the underlying infrastructure (virtual or bare metal, depending on the provider type)

  8. Recommended. Configure an L2 template for a new cluster as described in Advanced networking configuration. You may skip this step if you do not require L2 separation for network traffic.

    Note

    This step is mandatory for Mirantis OpenStack for Kubernetes (MOS) clusters.

Now, proceed to Add a machine.

Advanced networking configuration

By default, Mirantis Container Cloud configures a single interface on the cluster nodes, leaving all other physical interfaces intact.

With L2 networking templates, you can create advanced host networking configurations for your clusters. For example, you can create bond interfaces on top of physical interfaces on the host or use multiple subnets to separate different types of network traffic.

You can use several host-specific L2 templates per one cluster to support different hardware configurations. For example, you can create L2 templates with a different number and layout of NICs to be applied to specific machines of one cluster.

When you create a baremetal-based project, the exemplary templates with the ipam/PreInstalledL2Template label are copied to this project. These templates are preinstalled during the management cluster bootstrap.

Using the L2 Templates section of the Clusters tab in the Container Cloud web UI, you can view a list of preinstalled templates and the ones that you manually create before a cluster deployment.

To facilitate multi-rack and other types of distributed bare metal datacenter topologies, the dnsmasq DHCP server used for host provisioning in Container Cloud supports working with multiple L2 segments through DHCP relay capable network routers.

Follow the procedures below to create and configure network objects for your managed clusters.

Workflow of network interface naming

To simplify operations with L2 templates, before you start creating them, inspect the general workflow of network interface name gathering and processing.

Network interface naming workflow:

  1. The Operator creates a baremetalHost object.

  2. The baremetalHost object executes the introspection stage and becomes ready.

  3. The Operator collects information about NIC count, naming, and so on for further changes in the mapping logic.

    At this stage, the order of NICs in the object may randomly change during each introspection, but the NIC names always remain the same. For more details, see Predictable Network Interface Names.

    For example:

    # Example commands:
    # kubectl -n managed-ns get bmh baremetalhost1 -o custom-columns='NAME:.metadata.name,STATUS:.status.provisioning.state'
    # NAME            STATUS
    # baremetalhost1  ready
    
    # kubectl -n managed-ns get bmh baremetalhost1 -o yaml
    # Example output:
    
    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    ...
    status:
    ...
        nics:
        - ip: fe80::ec4:7aff:fe6a:fb1f%eno2
          mac: 0c:c4:7a:6a:fb:1f
          model: 0x8086 0x1521
          name: eno2
          pxe: false
        - ip: fe80::ec4:7aff:fe1e:a2fc%ens1f0
          mac: 0c:c4:7a:1e:a2:fc
          model: 0x8086 0x10fb
          name: ens1f0
          pxe: false
        - ip: fe80::ec4:7aff:fe1e:a2fd%ens1f1
          mac: 0c:c4:7a:1e:a2:fd
          model: 0x8086 0x10fb
          name: ens1f1
          pxe: false
        - ip: 192.168.1.151 # Temporary PXE network address
          mac: 0c:c4:7a:6a:fb:1e
          model: 0x8086 0x1521
          name: eno1
          pxe: true
     ...
    
  4. The Operator selects from the following options:

  5. The Operator creates a Machine or Subnet object.

  6. The baremetal-provider service links the Machine object to the baremetalHost object.

  7. The kaas-ipam and baremetal-provider services collect hardware information from the baremetalHost object and use it to configure host networking and services.

  8. The kaas-ipam service:

    1. Spawns the IpamHost object.

    2. Renders the l2template object.

    3. Spawns the ipaddr object.

    4. Updates the IpamHost object status with all rendered and linked information.

  9. The baremetal-provider service collects the rendered networking information from the IpamHost object.

  10. The baremetal-provider service proceeds with the IpamHost object provisioning.
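
To inspect the objects created during this workflow, you can query them through the management cluster API. The commands below are a sketch; the resource names follow the object kinds listed above, so verify the exact plural names with kubectl api-resources:

kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> get ipamhosts
kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> get l2templates
kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> get ipaddrs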

Create subnets

Before creating an L2 template, ensure that you have the required subnets that can be used in the L2 template to allocate IP addresses for the managed cluster nodes. Where required, create a number of subnets for a particular project using the Subnet CR. A subnet has three logical scopes:

  • global - CR uses the default namespace. A subnet can be used for any cluster located in any project.

  • namespaced - CR uses the namespace that corresponds to a particular project where managed clusters are located. A subnet can be used for any cluster located in the same project.

  • cluster - CR uses the namespace where the referenced cluster is located. A subnet is only accessible to the cluster that L2Template.spec.clusterRef refers to. The Subnet objects with the cluster scope will be created for every new cluster.

You can have subnets with the same name in different projects. In this case, the subnet that has the same project as the cluster will be used. One L2 template may reference several subnets; in this case, those subnets may have different scopes.

The IP address objects (IPaddr CR) that are allocated from subnets always have the same project as their corresponding IpamHost objects, regardless of the subnet scope.

You can create subnets using either the Container Cloud web UI or CLI.

Create subnets for a managed cluster using web UI
  1. Log in to the Container Cloud web UI with the operator permissions.

  2. Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.

  3. In the Clusters tab, click the required cluster and scroll down to the Subnets section.

  4. Click Add Subnet.

  5. Fill out the Add new subnet form as required:

    • Subnet Name

      Subnet name.

    • CIDR

      A valid IPv4 CIDR, for example, 10.11.0.0/24.

    • Include Ranges Optional

      A list of IP address ranges within the given CIDR that should be used in the allocation of IPs for nodes. The gateway, network, broadcast, and DNS addresses will be excluded (protected) automatically if they intersect with one of the ranges. The IPs outside the given ranges will not be used in the allocation. Each element of the list can be either an interval 10.11.0.5-10.11.0.70 or a single address 10.11.0.77. The includeRanges parameter is mutually exclusive with excludeRanges.

    • Exclude Ranges Optional

      A list of IP address ranges within the given CIDR that should not be used in the allocation of IPs for nodes. The IPs within the given CIDR but outside the given ranges will be used in the allocation. The gateway, network, broadcast, and DNS addresses will be excluded (protected) automatically if they are included in the CIDR. Each element of the list can be either an interval 10.11.0.5-10.11.0.70 or a single address 10.11.0.77. The excludeRanges parameter is mutually exclusive with includeRanges.

    • Gateway Optional

      A valid gateway address, for example, 10.11.0.9.

  6. Click Create.

Create subnets for a managed cluster using CLI
  1. Log in to a local machine where your management cluster kubeconfig is located and where kubectl is installed.

    Note

    The management cluster kubeconfig is created during the last stage of the management cluster bootstrap.

  2. Create the subnet.yaml file with a number of global or namespaced subnets depending on the configuration of your cluster and apply it to your deployment:

    kubectl --kubeconfig <pathToManagementClusterKubeconfig> apply -f <SubnetFileName.yaml>
    

    Note

    In the command above and in the steps below, substitute the parameters enclosed in angle brackets with the corresponding values.

    Example of a subnet.yaml file:

    apiVersion: ipam.mirantis.com/v1alpha1
    kind: Subnet
    metadata:
      name: demo
      namespace: demo-namespace
      labels:
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
    spec:
      cidr: 10.11.0.0/24
      gateway: 10.11.0.9
      includeRanges:
      - 10.11.0.5-10.11.0.70
      nameservers:
      - 172.18.176.6
    
    Specification fields of the Subnet object

    Parameter

    Description

    cidr (singular)

    A valid IPv4 CIDR, for example, 10.11.0.0/24.

    includeRanges (list)

    A list of IP address ranges within the given CIDR that should be used in the allocation of IPs for nodes. The gateway, network, broadcast, and DNS addresses will be excluded (protected) automatically if they intersect with one of the ranges. The IPs outside the given ranges will not be used in the allocation. Each element of the list can be either an interval 10.11.0.5-10.11.0.70 or a single address 10.11.0.77. The includeRanges parameter is mutually exclusive with excludeRanges.

    excludeRanges (list)

    A list of IP address ranges within the given CIDR that should not be used in the allocation of IPs for nodes. The IPs within the given CIDR but outside the given ranges will be used in the allocation. The gateway, network, broadcast, and DNS addresses will be excluded (protected) automatically if they are included in the CIDR. Each element of the list can be either an interval 10.11.0.5-10.11.0.70 or a single address 10.11.0.77. The excludeRanges parameter is mutually exclusive with includeRanges.

    useWholeCidr (boolean)

    If set to true, the subnet address (10.11.0.0 in the example above) and the broadcast address (10.11.0.255 in the example above) are included in the address allocation for nodes. Otherwise (false by default), the subnet address and broadcast address will be excluded from the address allocation.

    gateway (singular)

    A valid gateway address, for example, 10.11.0.9.

    nameservers (list)

    A list of the IP addresses of name servers. Each element of the list is a single address, for example, 172.18.176.6.

    Caution

    The subnet for the PXE network is automatically created during deployment and must contain the ipam/DefaultSubnet: "1" label. Each bare metal region must have only one subnet with this label.

    Caution

    You may use different subnets to allocate IP addresses to different Container Cloud components in your cluster. For details, see the optional steps below.

    Add a label with the ipam/SVC- prefix to each subnet that is used to configure a Container Cloud service. Make sure that each subnet has only one such label.

  3. Optional. Add a subnet for the MetalLB service.

    • To designate a subnet as a MetalLB address pool, use the ipam/SVC-MetalLB label key. Set the value of the label to "1".

    • Set the cluster.sigs.k8s.io/cluster-name label to the name of the cluster where this subnet is used.

    • You may create multiple subnets with the ipam/SVC-MetalLB label to define multiple IP address ranges for MetalLB in the cluster.

    Caution

    The IP addresses of the MetalLB address pool are not assigned to the interfaces on hosts. This subnet is virtual. Make sure that it is not included in the L2 template definitions for your cluster.

    Note

    • When MetalLB address ranges are defined in both cluster specification and specific Subnet objects, the resulting MetalLB address pools configuration will contain address ranges from both cluster specification and Subnet objects.

    • All address ranges for L2 address pools that are defined in both cluster specification and Subnet objects are aggregated into a single L2 address pool and sorted as strings.
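
    For example, a dedicated MetalLB address pool subnet may look as follows. This is a minimal sketch based on the labels described above; the name, CIDR, and range values are placeholders to adjust for your environment:

    apiVersion: ipam.mirantis.com/v1alpha1
    kind: Subnet
    metadata:
      name: <metallb-subnet-name>
      namespace: <managed-cluster-project-name>
      labels:
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
        ipam/SVC-MetalLB: "1"
        cluster.sigs.k8s.io/cluster-name: <clusterName>
    spec:
      cidr: 10.11.1.0/24
      includeRanges:
      - 10.11.1.100-10.11.1.150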

  4. Optional. Add a Ceph public subnet.

    • Set the ipam/SVC-ceph-public label with the value "1" to create a subnet that will be used to configure the Ceph public network.

    • Use this subnet in the L2 template for storage nodes.

    • Assign this subnet to the interface connected to your storage access network.

    • Set the cluster.sigs.k8s.io/cluster-name label to the name of the target cluster during this subnet creation.

    • Ceph will automatically use this subnet for its external connections.

    • A Ceph OSD will look for and bind to an address from this subnet when it is started on a machine.

  5. Optional. Add a Ceph replication subnet.

    • Set the ipam/SVC-ceph-cluster label with the value "1" to create a subnet that will be used to configure the Ceph replication network.

    • Set the cluster.sigs.k8s.io/cluster-name label to the name of the target cluster during this subnet creation.

    • Use this subnet in the L2 template for storage nodes.

    • Ceph will automatically use this subnet for its internal replication traffic.

  6. Optional. Add a subnet for Kubernetes pods traffic.

    Caution

    Use of a dedicated network for Kubernetes pods traffic, for external connection to the Kubernetes services exposed by the cluster, and for the Ceph cluster access and replication traffic is available as Technology Preview. Use such configurations for testing and evaluation purposes only. For the Technology Preview feature definition, refer to Technology Preview support scope.

    The following feature is still under development and will be announced in one of the following Container Cloud releases:

    • Switching Kubernetes API to listen to the specified IP address on the node

  7. Optional. Add subnets for configuring multiple DHCP ranges. For details, see Configure multiple DHCP ranges using Subnet resources.

  8. Verify that the subnet is successfully created:

    kubectl get subnet kaas-mgmt -oyaml
    

    In the system output, verify the status fields of the Subnet object using the table below.

    Status fields of the Subnet object

    Parameter

    Description

    statusMessage

    Contains a short state description and a more detailed one if applicable. The short status values are as follows:

    • OK - operational.

    • ERR - non-operational. This status has a detailed description, for example, ERR: Wrong includeRange for CIDR….

    cidr

    Reflects the actual CIDR, has the same meaning as spec.cidr.

    gateway

    Reflects the actual gateway, has the same meaning as spec.gateway.

    nameservers

    Reflects the actual name servers, has the same meaning as spec.nameservers.

    ranges

    Specifies the address ranges that are calculated using the fields from spec: cidr, includeRanges, excludeRanges, gateway, useWholeCidr. These ranges are directly used for the IP allocation for nodes.

    lastUpdate

    Includes the date and time of the latest update of the Subnet CR.

    allocatable

    Includes the number of currently available IP addresses that can be allocated for nodes from the subnet.

    allocatedIPs

    Specifies the list of IPv4 addresses with the corresponding IPaddr object IDs that were already allocated from the subnet.

    capacity

    Contains the total number of IP addresses held by ranges, which equals the sum of the allocatable and allocatedIPs parameters values.

    versionIpam

    Contains the version of the kaas-ipam component that made the latest changes to the Subnet CR.

    Example of a successfully created subnet:

    apiVersion: ipam.mirantis.com/v1alpha1
    kind: Subnet
    metadata:
      labels:
        ipam/UID: 6039758f-23ee-40ba-8c0f-61c01b0ac863
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
      name: kaas-mgmt
      namespace: default
    spec:
      cidr: 10.0.0.0/24
      excludeRanges:
      - 10.0.0.100
      - 10.0.0.101-10.0.0.120
      gateway: 10.0.0.1
      includeRanges:
      - 10.0.0.50-10.0.0.90
      nameservers:
      - 172.18.176.6
    status:
      allocatable: 38
      allocatedIPs:
      - 10.0.0.50:0b50774f-ffed-11ea-84c7-0242c0a85b02
      - 10.0.0.51:1422e651-ffed-11ea-84c7-0242c0a85b02
      - 10.0.0.52:1d19912c-ffed-11ea-84c7-0242c0a85b02
      capacity: 41
      cidr: 10.0.0.0/24
      gateway: 10.0.0.1
      lastUpdate: "2020-09-26T11:40:44Z"
      nameservers:
      - 172.18.176.6
      ranges:
      - 10.0.0.50-10.0.0.90
      statusMessage: OK
      versionIpam: v3.0.999-20200807-130909-44151f8
    
  9. Proceed to creating an L2 template for one or multiple managed clusters as described in Create L2 templates.

Automate multiple subnet creation using SubnetPool

Before creating an L2 template, ensure that you have the required subnets that can be used in the L2 template to allocate IP addresses for the managed cluster nodes. You can also create multiple subnets using the SubnetPool object to separate different types of network traffic. SubnetPool allows for automatic creation of Subnet objects that will consume blocks from the parent SubnetPool CIDR IP address range. The SubnetPool blockSize setting defines the IP address block size to allocate to each child Subnet. SubnetPool has a global scope, so any SubnetPool can be used to create the Subnet objects for any namespace and for any cluster.

To automate multiple subnet creation using SubnetPool:

  1. Log in to a local machine where your management cluster kubeconfig is located and where kubectl is installed.

    Note

    The management cluster kubeconfig is created during the last stage of the management cluster bootstrap.

  2. Create the subnetpool.yaml file with a number of subnet pools and apply it to your deployment:

    Note

    Depending on the use case, you can define subnets, subnet pools, or both. A single L2 template can also use either or both of them.

    kubectl --kubeconfig <pathToManagementClusterKubeconfig> apply -f <SubnetFileName.yaml>
    

    Note

    In the command above and in the steps below, substitute the parameters enclosed in angle brackets with the corresponding values.

    Example of a subnetpool.yaml file:

    apiVersion: ipam.mirantis.com/v1alpha1
    kind: SubnetPool
    metadata:
      name: kaas-mgmt
      namespace: default
      labels:
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
    spec:
      cidr: 10.10.0.0/16
      blockSize: /25
      nameservers:
      - 172.18.176.6
      gatewayPolicy: first
    

    For the specification fields description of the SubnetPool object, see SubnetPool spec.

  3. Verify that the subnet pool is successfully created:

    kubectl get subnetpool kaas-mgmt -oyaml
    

    In the system output, verify the status fields of the SubnetPool object. For the status fields description, see SubnetPool status.

  4. Proceed to creating an L2 template for one or multiple managed clusters as described in Create L2 templates. In this procedure, select the exemplary L2 template for multiple subnets that contains the l3Layout section.

    Caution

    Using the l3Layout section, define all subnets of a cluster. Otherwise, do not use the l3Layout section. Defining only part of subnets is not allowed.

Create L2 templates

Caution

Since Container Cloud 2.9.0, L2 templates have a new format. In the new L2 templates format, l2template:status:npTemplate is used directly during provisioning. Therefore, a hardware node obtains and applies a complete network configuration during the first system boot.

Update any L2 template created before Container Cloud 2.9.0 as described in Release Notes: Switch L2 templates to the new format.

After you create subnets for one or more managed clusters or projects as described in Create subnets or Automate multiple subnet creation using SubnetPool, follow the procedure below to create L2 templates for a managed cluster. This procedure contains exemplary L2 templates for the following use cases:

L2 template example with bonds and bridges

This section contains an exemplary L2 template that demonstrates how to set up bonds and bridges on hosts for your managed clusters as described in Create L2 templates.

Caution

Use of a dedicated network for Kubernetes pods traffic, for external connection to the Kubernetes services exposed by the cluster, and for the Ceph cluster access and replication traffic is available as Technology Preview. Use such configurations for testing and evaluation purposes only. For the Technology Preview feature definition, refer to Technology Preview support scope.

The following feature is still under development and will be announced in one of the following Container Cloud releases:

  • Switching Kubernetes API to listen to the specified IP address on the node

Dedicated network for the Kubernetes pods traffic

If you want to use a dedicated network for Kubernetes pods traffic, configure each node with an IPv4 and/or IPv6 address that will be used to route the pods traffic between nodes. To accomplish that, use the npTemplate.bridges.k8s-pods bridge in the L2 template, as demonstrated in the example below. As defined in Host networking, this bridge name is reserved for the Kubernetes pods network. When the k8s-pods bridge is defined in an L2 template, Calico CNI uses that network for routing the pods traffic between nodes.

Dedicated network for the Kubernetes services traffic (MetalLB)

You can use a dedicated network for external connection to the Kubernetes services exposed by the cluster. If enabled, MetalLB will listen and respond on the dedicated virtual bridge. To accomplish that, configure each node where metallb-speaker is deployed with an IPv4 or IPv6 address. Both the MetalLB IP address ranges and the IP addresses configured on those nodes must fit in the same CIDR.

Use the npTemplate.bridges.k8s-ext bridge in the L2 template, as demonstrated in the example below. This bridge name is reserved for the Kubernetes external network. The Subnet object that corresponds to the k8s-ext bridge must explicitly exclude the IP address ranges that are in use by MetalLB.
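
For example, a Subnet object for such an external network might exclude the MetalLB address pool as follows. This is a minimal sketch: the CIDR, gateway, and address range are placeholders, and the object name demo-ext mirrors the exemplary L2 template below.

apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: demo-ext
  namespace: managed-ns
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  cidr: 10.13.0.0/24
  gateway: 10.13.0.1
  excludeRanges:
  # Placeholder MetalLB address pool, excluded from IPAM allocation
  - 10.13.0.100-10.13.0.200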

Dedicated network for the Ceph distributed storage traffic

You can configure dedicated networks for the Ceph cluster access and replication traffic. Set labels on the Subnet CRs for the corresponding networks, as described in Create subnets. Container Cloud automatically configures Ceph to use the addresses from these subnets. Ensure that the addresses are assigned to the storage nodes.

Use the npTemplate.bridges.ceph-cluster and npTemplate.bridges.ceph-public bridges in the L2 template, as demonstrated in the example below. These names are reserved for the Ceph cluster access (public) and replication (cluster) networks.

The Subnet objects used to assign IP addresses to these bridges must have corresponding labels ipam/SVC-ceph-public for the ceph-public bridge and ipam/SVC-ceph-cluster for the ceph-cluster bridge.
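
For example, a Subnet object for the Ceph public network might carry the label as follows; the Ceph cluster network is labeled with ipam/SVC-ceph-cluster in the same way. This is a minimal sketch: the CIDR is a placeholder, and the object name mirrors the demo-ceph-public subnet used in the exemplary L2 template below.

apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
  name: demo-ceph-public
  namespace: managed-ns
  labels:
    # Marks this subnet as the Ceph public (access) network
    ipam/SVC-ceph-public: '1'
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
spec:
  cidr: 10.12.1.0/24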

Example of an L2 template with interfaces bonding
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  name: test-managed
  namespace: managed-ns
spec:
  clusterRef: managed-cluster
  autoIfMappingPrio:
    - provision
    - eno
    - ens
    - enp
  l3Layout:
    - subnetName: pxe-subnet
      scope:      global
    - subnetName: demo-pods
      scope:      namespace
    - subnetName: demo-ext
      scope:      namespace
    - subnetName: demo-ceph-cluster
      scope:      namespace
    - subnetName: demo-ceph-public
      scope:      namespace
  npTemplate: |
    version: 2
    ethernets:
      ten10gbe0s0:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 2}}
        set-name: {{nic 2}}
      ten10gbe0s1:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 3}}
        set-name: {{nic 3}}
    bonds:
      bond0:
        interfaces:
          - ten10gbe0s0
          - ten10gbe0s1
    vlans:
      k8s-ext-vlan:
        id: 1001
        link: bond0
      k8s-pods-vlan:
        id: 1002
        link: bond0
      stor-frontend:
        id: 1003
        link: bond0
      stor-backend:
        id: 1004
        link: bond0
    bridges:
      k8s-ext:
        interfaces: [k8s-ext-vlan]
        addresses:
          - {{ip "k8s-ext:demo-ext"}}
      k8s-pods:
        interfaces: [k8s-pods-vlan]
        addresses:
          - {{ip "k8s-pods:demo-pods"}}
      ceph-cluster:
        interfaces: [stor-backend]
        addresses:
          - {{ip "ceph-cluster:demo-ceph-cluster"}}
      ceph-public:
        interfaces: [stor-frontend]
        addresses:
          - {{ip "ceph-public:demo-ceph-public"}}
L2 template example for automatic multiple subnet creation

This section contains an exemplary L2 template for automatic multiple subnet creation as described in Automate multiple subnet creation using SubnetPool. This template also contains the l3Layout section that allows defining the Subnet scopes and enables auto-creation of the Subnet objects from the SubnetPool objects. For details about auto-creation of the Subnet objects, see Automate multiple subnet creation using SubnetPool.

For details on how to create L2 templates, see Create L2 templates.

Caution

Do not explicitly assign an IP address to the PXE NIC ({{nic 0}} in the example below) to prevent IP duplication during updates. The IP address is assigned automatically by the bootstrapping engine.

Example of an L2 template for multiple subnets:

apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  name: test-managed
  namespace: managed-ns
spec:
  clusterRef: managed-cluster
  autoIfMappingPrio:
    - provision
    - eno
    - ens
    - enp
  l3Layout:
    - subnetName: pxe-subnet
      scope:      global
    - subnetName: subnet-1
      subnetPool: kaas-mgmt
      scope:      namespace
    - subnetName: subnet-2
      subnetPool: kaas-mgmt
      scope:      cluster
  npTemplate: |
    version: 2
    ethernets:
      onboard1gbe0:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 0}}
        set-name: {{nic 0}}
        # IMPORTANT: do not assign an IP address here explicitly
        # to prevent IP duplication issues. The IP will be assigned
        # automatically by the bootstrapping engine.
        # addresses: []
      onboard1gbe1:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 1}}
        set-name: {{nic 1}}
      ten10gbe0s0:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 2}}
        set-name: {{nic 2}}
        addresses:
          - {{ip "2:subnet-1"}}
      ten10gbe0s1:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 3}}
        set-name: {{nic 3}}
        addresses:
          - {{ip "3:subnet-2"}}

In the template above, the following networks are defined in the l3Layout section:

  • pxe-subnet - the global PXE network that already exists. The subnet name must refer to the PXE subnet created for the region.

  • subnet-1 - unless already created, this subnet will be created from the kaas-mgmt subnet pool. The subnet name must be unique within the project. This subnet is shared between the project clusters.

  • subnet-2 - will be created from the kaas-mgmt subnet pool. This subnet has the cluster scope. Therefore, the real name of the Subnet CR object consists of the subnet name defined in l3Layout and the cluster UID. But the npTemplate section of the L2 template must contain only the subnet name defined in l3Layout. The subnets of the cluster scope are not shared between clusters.
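
To check which Subnet objects were auto-created from the subnet pool, you can list them in the project namespace. This is an assumption-level example: it relies on the Subnet custom resource being queryable as subnet and reuses the placeholder names from the commands later in this section.

kubectl --kubeconfig <pathToManagementClusterKubeconfig> \
get subnet -n <ProjectNameForNewManagedCluster>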

Caution

If you use the l3Layout section, define all subnets of the cluster in it. Defining only a subset of the subnets is not allowed. If you cannot define all subnets, do not use the l3Layout section at all.


To create an L2 template for a new managed cluster:

  1. Log in to a local machine where your management cluster kubeconfig is located and where kubectl is installed.

    Note

    The management cluster kubeconfig is created during the last stage of the management cluster bootstrap.

  2. Inspect the existing L2 templates to select the one that fits your deployment:

    kubectl --kubeconfig <pathToManagementClusterKubeconfig> \
    get l2template -n <ProjectNameForNewManagedCluster>
    
  3. Create an L2 YAML template specific to your deployment using one of the exemplary templates:

    Note

    You can create several L2 templates with different configurations to be applied to different nodes of the same cluster. See Assign L2 templates to machines for details.

  4. Add or edit the mandatory parameters in the new L2 template. The following tables provide the description of the mandatory and the l3Layout section parameters in the example templates mentioned in the previous step.

    L2 template mandatory parameters

    Parameter

    Description

    clusterRef

    References the Cluster object that this template is applied to. The default value is used to apply the given template to all clusters within a particular project, unless an L2 template that references a specific cluster name exists.

    Caution

    • An L2 template must have the same namespace as the referenced cluster.

    • A cluster can be associated with many L2 templates. Only one of them can have the ipam/DefaultForCluster label. Every L2 template that does not have the ipam/DefaultForCluster label can be later assigned to a particular machine using l2TemplateSelector.

    • A project (Kubernetes namespace) can have only one default L2 template (L2Template with Spec.clusterRef: default).

    ifMapping or autoIfMappingPrio

    • ifMapping is a list of interface names for the template. The interface mapping is defined globally for all bare metal hosts in the cluster but can be overridden at the host level, if required, by editing the IpamHost object for a particular host. The ifMapping parameter is mutually exclusive with autoIfMappingPrio.

    • autoIfMappingPrio is a list of interface name prefixes, such as eno, ens, and so on, that are used to match the interfaces and automatically create the interface list for the template. If you are not aware of any specific ordering of interfaces on the nodes, use the default ordering from the Predictable Network Interface Names specification for systemd. You can also override the default NIC list per host using the IfMappingOverride parameter of the corresponding IpamHost. The provision value corresponds to the network interface that was used to provision a node. Usually, it is the first NIC found on a particular node. It is defined explicitly to ensure that this interface will not be reconfigured accidentally.

      The autoIfMappingPrio parameter is mutually exclusive with ifMapping.

    l3Layout

    Subnets to be used in the npTemplate section. The l3Layout section is mandatory for each L2Template custom resource (CR). For more details about L2Template, see L2Template API.

    npTemplate

    A netplan-compatible configuration with special lookup functions that defines the networking settings for the cluster hosts, where physical NIC names and details are parameterized. This configuration will be processed using Go templates. Instead of specifying IP and MAC addresses, interface names, and other network details specific to a particular host, the template supports use of special lookup functions. These lookup functions, such as nic, mac, ip, and so on, return host-specific network information when the template is rendered for a particular host. For details about netplan, see the official netplan documentation.

    Caution

    All rules and restrictions of the netplan configuration also apply to L2 templates. For details, see the official netplan documentation.

    For more details about the L2Template custom resource (CR), see the L2Template API section.

    l3Layout section parameters

    Parameter

    Description

    subnetName

    Name of the Subnet object that will be used in the npTemplate section to allocate IP addresses from. All Subnet names must be unique across a single L2 template.

    subnetPool

    Optional. Default: none. Name of the parent SubnetPool object that will be used to create a Subnet object with a given subnetName and scope. If a corresponding Subnet object already exists, nothing will be created and the existing object will be used. If no SubnetPool is provided, no new Subnet object will be created.

    scope

    Logical scope of the Subnet object with a corresponding subnetName. Possible values:

    • global - the Subnet object is accessible globally, for any Container Cloud project and cluster in the region, for example, the PXE subnet.

    • namespace - the Subnet object is accessible within the same project and region where the L2 template is defined.

    • cluster - the Subnet object is only accessible to the cluster that L2Template.spec.clusterRef refers to. The Subnet objects with the cluster scope will be created for every new cluster.

    The following table describes the main lookup functions for an L2 template.

    Lookup function

    Description

    {{nic N}}

    Name of a NIC number N. NIC numbers correspond to the interface mapping list.

    {{mac N}}

    MAC address of a NIC number N registered during a host hardware inspection.

    {{ip "N:subnet-a"}}

    IP address and mask for a NIC number N. The address will be auto-allocated from the given subnet if the address does not exist yet.

    {{ip "br0:subnet-x"}}

    IP address and mask for a virtual interface, "br0" in this example. The address will be auto-allocated from the given subnet if the address does not exist yet.

    {{gateway_from_subnet "subnet-a"}}

    IPv4 default gateway address from the given subnet.

    {{nameservers_from_subnet "subnet-a"}}

    List of the IP addresses of name servers from the given subnet.

    Note

    Every subnet referenced in an L2 template can have either a global or namespaced scope. In the latter case, the subnet must exist in the same project where the corresponding cluster and L2 template are located.
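
    For example, the lookup functions can be combined in an npTemplate fragment as follows. This is a minimal sketch that reuses the demo-ext subnet and bridge names from the exemplary template above; the nameservers_from_subnet usage is an assumption, so verify the rendered output for your release.

    bridges:
      k8s-ext:
        interfaces: [k8s-ext-vlan]
        addresses:
          - {{ip "k8s-ext:demo-ext"}}
        gateway4: {{gateway_from_subnet "demo-ext"}}
        nameservers:
          addresses: {{nameservers_from_subnet "demo-ext"}}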

  5. Add the L2 template to your management cluster:

    kubectl --kubeconfig <pathToManagementClusterKubeconfig> apply -f <pathToL2TemplateYamlFile>
    
  6. Optional. Further modify the template:

    kubectl --kubeconfig <pathToManagementClusterKubeconfig> \
    -n <ProjectNameForNewManagedCluster> edit l2template <L2templateName>
    
  7. Proceed with Add a machine. The resulting L2 template will be used to render the netplan configuration for the managed cluster machines.


The workflow of the netplan configuration using an L2 template is as follows:

  1. The kaas-ipam service uses the data from BareMetalHost, the L2 template, and subnets to generate the netplan configuration for every cluster machine.

  2. The generated netplan configuration is saved in the status.netconfigV2 section of the IpamHost resource. If the status.l2RenderResult field of the IpamHost resource is OK, the configuration was rendered in the IpamHost resource successfully. Otherwise, the status contains an error message.

  3. The baremetal-provider service copies data from the status.netconfigV2 of IpamHost to the Spec.StateItemsOverwrites['deploy']['bm_ipam_netconfigv2'] parameter of LCMMachine.

  4. The lcm-agent service on every host synchronizes the LCMMachine data to its host. The lcm-agent service runs a playbook to update the netplan configuration on the host during the pre-download and deploy phases.
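
For example, to check whether the netplan configuration was rendered successfully for a particular host, you can query the status fields mentioned above. This is a hedged example: it assumes that the IpamHost custom resource is queryable as ipamhost and uses placeholder names.

kubectl --kubeconfig <pathToManagementClusterKubeconfig> \
-n <ProjectNameForNewManagedCluster> get ipamhost <IpamHostName> \
-o jsonpath='{.status.l2RenderResult}'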

Assign L2 templates to machines

You can create multiple L2 templates with different configurations and apply them to different machines in the same cluster.

To assign a specific L2 template to machines in a cluster:

  1. Create the default L2 template for the cluster. It will be used for machines that do not have an L2 template explicitly assigned.

    To designate an L2 template as default, assign the ipam/DefaultForCluster label to it. Only one L2 template in a cluster can have this label.

  2. Create other required L2 templates for the cluster. Use the clusterRef parameter in the L2 template spec to assign the templates to the cluster.

  3. Add the l2template-<NAME> label to every L2 template. Replace the <NAME> parameter with the unique name of the L2 template.

  4. Assign an L2 template to a machine. Set the l2TemplateSelector field in the machine spec to the name of the label added in the previous step. The IPAM controller uses this field to select the specific L2 template for the corresponding machine.

    Alternatively, you may set the l2TemplateSelector field to the name of the L2 template. This ties the corresponding machine to exactly that L2 template.

Consider the following examples of an L2 template assignment to a machine.

Example of an L2Template resource:

apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
  name: ExampleNetConfig
  namespace: MyProject
  labels:
    kaas.mirantis.com/provider: baremetal
    kaas.mirantis.com/region: region-one
...
spec:
  clusterRef: MyCluster
...

Example of a Machine resource with the label-based L2 template selector:

apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: Machine1
  namespace: MyProject
...
spec:
  providerSpec:
    value:
      l2TemplateSelector:
        label: l2template-ExampleNetConfig
...

Example of a Machine resource with the name-based L2 template selector:

apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
  name: Machine1
  namespace: MyProject
...
spec:
  providerSpec:
    value:
      l2TemplateSelector:
        name: ExampleNetConfig
...
Example of a complete L2 templates configuration for cluster creation

The following example contains all required objects of an advanced network and host configuration for a baremetal-based managed cluster.

The procedure below contains:

  • Various .yaml objects to be applied using the management cluster kubeconfig

  • Useful comments inside the .yaml example files

  • Example hardware and configuration data, such as network, disk, and authentication settings, that you must update to fit your cluster configuration

  • Example templates, such as l2template and baremetalhostprofile, that illustrate how to implement a specific configuration

Caution

The exemplary configuration described below is not production ready and is provided for illustration purposes only.

For illustration purposes, all files provided in this exemplary procedure are named after their Kubernetes object types:

managed-ns_BareMetalHost_cz7700-managed-cluster-control-noefi.yaml
managed-ns_BareMetalHost_cz7741-managed-cluster-control-noefi.yaml
managed-ns_BareMetalHost_cz7743-managed-cluster-control-noefi.yaml
managed-ns_BareMetalHost_cz812-managed-cluster-storage-worker-noefi.yaml
managed-ns_BareMetalHost_cz813-managed-cluster-storage-worker-noefi.yaml
managed-ns_BareMetalHost_cz814-managed-cluster-storage-worker-noefi.yaml
managed-ns_BareMetalHost_cz815-managed-cluster-worker-noefi.yaml
managed-ns_BareMetalHostProfile_bmhp-cluster-default.yaml
managed-ns_BareMetalHostProfile_worker-storage1.yaml
managed-ns_Cluster_managed-cluster.yaml
managed-ns_KaaSCephCluster_ceph-cluster-managed-cluster.yaml
managed-ns_L2Template_bm-1490-template-controls-netplan-cz7700-pxebond.yaml
managed-ns_L2Template_bm-1490-template-controls-netplan.yaml
managed-ns_L2Template_bm-1490-template-workers-netplan.yaml
managed-ns_Machine_cz7700-managed-cluster-control-noefi-.yaml
managed-ns_Machine_cz7741-managed-cluster-control-noefi-.yaml
managed-ns_Machine_cz7743-managed-cluster-control-noefi-.yaml
managed-ns_Machine_cz812-managed-cluster-storage-worker-noefi-.yaml
managed-ns_Machine_cz813-managed-cluster-storage-worker-noefi-.yaml
managed-ns_Machine_cz814-managed-cluster-storage-worker-noefi-.yaml
managed-ns_Machine_cz815-managed-cluster-worker-noefi-.yaml
managed-ns_PublicKey_managed-cluster-key.yaml
managed-ns_Secret_cz7700-cred.yaml
managed-ns_Secret_cz7741-cred.yaml
managed-ns_Secret_cz7743-cred.yaml
managed-ns_Secret_cz812-cred.yaml
managed-ns_Secret_cz813-cred.yaml
managed-ns_Secret_cz814-cred.yaml
managed-ns_Secret_cz815-cred.yaml
managed-ns_Subnet_lcm-nw.yaml
managed-ns_Subnet_metallb-public-for-managed.yaml
managed-ns_Subnet_metallb-public-for-extiface.yaml
managed-ns_Subnet_storage-backend.yaml
managed-ns_Subnet_storage-frontend.yaml
default_Namespace_managed-ns.yaml

Caution

The procedure below assumes that you apply each new .yaml file using kubectl create -f <file_name.yaml>.
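
For example, the Namespace object file from the list above can be applied as follows; the KUBECONFIG=kubeconfig convention matches the verification commands used later in this procedure.

KUBECONFIG=kubeconfig kubectl create -f default_Namespace_managed-ns.yaml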

To create an example configuration for a managed cluster creation:

  1. Verify that you have configured the following items:

    1. All bmh nodes for PXE boot as described in Add a bare metal host using CLI

    2. All physical NICs of the bmh nodes

    3. All required physical subnets and routing

  2. Create a .yaml file with the Namespace object:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: managed-ns
    
  3. Create the required number of .yaml files with the Secret objects, one per bmh node, each with a unique name and authentication data. The following example contains one secret:

    apiVersion: v1
    data:
      password: YWRtaW4=
      username: ZW5naW5lZXI=
    kind: Secret
    metadata:
      labels:
        kaas.mirantis.com/credentials: 'true'
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
      name: cz815-cred
      namespace: managed-ns
    
  4. Create a set of .yaml files with the bmh node configurations:

    • managed-ns_BareMetalHost_cz7700-managed-cluster-control-noefi.yaml

      apiVersion: metal3.io/v1alpha1
      kind: BareMetalHost
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
          # This label is used to link the Machine object to this exact bmh node
          kaas.mirantis.com/baremetalhost-id: cz7700
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: cz7700-managed-cluster-control-noefi
        namespace: managed-ns
      spec:
        bmc:
          address: 192.168.1.12
          credentialsName: cz7700-cred
        bootMACAddress: 0c:c4:7a:34:52:04
        bootMode: legacy
        online: true
      
    • managed-ns_BareMetalHost_cz7741-managed-cluster-control-noefi.yaml

      apiVersion: metal3.io/v1alpha1
      kind: BareMetalHost
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
          kaas.mirantis.com/baremetalhost-id: cz7741
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: cz7741-managed-cluster-control-noefi
        namespace: managed-ns
      spec:
        bmc:
          address: 192.168.1.76
          credentialsName: cz7741-cred
        bootMACAddress: 0c:c4:7a:34:92:f4
        bootMode: legacy
        online: true
      
    • managed-ns_BareMetalHost_cz7743-managed-cluster-control-noefi.yaml

      apiVersion: metal3.io/v1alpha1
      kind: BareMetalHost
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
          kaas.mirantis.com/baremetalhost-id: cz7743
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: cz7743-managed-cluster-control-noefi
        namespace: managed-ns
      spec:
        bmc:
          address: 192.168.1.78
          credentialsName: cz7743-cred
        bootMACAddress: 0c:c4:7a:34:66:fc
        bootMode: legacy
        online: true
      
    • managed-ns_BareMetalHost_cz812-managed-cluster-storage-worker-noefi.yaml

      apiVersion: metal3.io/v1alpha1
      kind: BareMetalHost
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/storage: storage
          hostlabel.bm.kaas.mirantis.com/worker: worker
          kaas.mirantis.com/baremetalhost-id: cz812
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: cz812-managed-cluster-storage-worker-noefi
        namespace: managed-ns
      spec:
        bmc:
          address: 192.168.1.182
          credentialsName: cz812-cred
        bootMACAddress: 0c:c4:7a:bc:ff:2e
        bootMode: legacy
        online: true
      
    • managed-ns_BareMetalHost_cz813-managed-cluster-storage-worker-noefi.yaml

      apiVersion: metal3.io/v1alpha1
      kind: BareMetalHost
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/storage: storage
          hostlabel.bm.kaas.mirantis.com/worker: worker
          kaas.mirantis.com/baremetalhost-id: cz813
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: cz813-managed-cluster-storage-worker-noefi
        namespace: managed-ns
      spec:
        bmc:
          address: 192.168.1.183
          credentialsName: cz813-cred
        bootMACAddress: 0c:c4:7a:bc:fe:36
        bootMode: legacy
        online: true
      
    • managed-ns_BareMetalHost_cz814-managed-cluster-storage-worker-noefi.yaml

      apiVersion: metal3.io/v1alpha1
      kind: BareMetalHost
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/storage: storage
          hostlabel.bm.kaas.mirantis.com/worker: worker
          kaas.mirantis.com/baremetalhost-id: cz814
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: cz814-managed-cluster-storage-worker-noefi
        namespace: managed-ns
      spec:
        bmc:
          address: 192.168.1.184
          credentialsName: cz814-cred
        bootMACAddress: 0c:c4:7a:bc:fb:20
        bootMode: legacy
        online: true
      
    • managed-ns_BareMetalHost_cz815-managed-cluster-worker-noefi.yaml

      apiVersion: metal3.io/v1alpha1
      kind: BareMetalHost
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/worker: worker
          kaas.mirantis.com/baremetalhost-id: cz815
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: cz815-managed-cluster-worker-noefi
        namespace: managed-ns
      spec:
        bmc:
          address: 192.168.1.185
          credentialsName: cz815-cred
        bootMACAddress: 0c:c4:7a:bc:fc:3e
        bootMode: legacy
        online: true
      
  5. Verify that the inspecting phase has started:

    KUBECONFIG=kubeconfig kubectl -n managed-ns get bmh -o wide
    

    Example of system response:

    NAME                                       STATUS STATE CONSUMER BMC           BOOTMODE ONLINE ERROR REGION
    cz7700-managed-cluster-control-noefi       OK     inspecting     192.168.1.12  legacy   true         region-one
    cz7741-managed-cluster-control-noefi       OK     inspecting     192.168.1.76  legacy   true         region-one
    cz7743-managed-cluster-control-noefi       OK     inspecting     192.168.1.78  legacy   true         region-one
    cz812-managed-cluster-storage-worker-noefi OK     inspecting     192.168.1.182 legacy   true         region-one
    

    Wait for inspection to complete. Usually, it takes up to 15 minutes.

  6. Collect the bmh hardware information required to create the l2template and baremetalhostprofile objects:

    KUBECONFIG=kubeconfig kubectl -n managed-ns get bmh -o wide
    

    Example of system response:

    NAME                                       STATUS STATE CONSUMER BMC           BOOTMODE ONLINE ERROR REGION
    cz7700-managed-cluster-control-noefi       OK     ready          192.168.1.12  legacy   true         region-one
    cz7741-managed-cluster-control-noefi       OK     ready          192.168.1.76  legacy   true         region-one
    cz7743-managed-cluster-control-noefi       OK     ready          192.168.1.78  legacy   true         region-one
    cz812-managed-cluster-storage-worker-noefi OK     ready          192.168.1.182 legacy   true         region-one
    
    KUBECONFIG=kubeconfig kubectl -n managed-ns get bmh cz7700-managed-cluster-control-noefi -o yaml | less
    

    Example of system response:

     ..
     nics:
     - ip: ""
       mac: 0c:c4:7a:1d:f4:a6
       model: 0x8086 0x10fb
       # discovered interfaces
       name: ens4f0
       pxe: false
       # temporary PXE address discovered from baremetal-mgmt
     - ip: 172.16.170.30
       mac: 0c:c4:7a:34:52:04
       model: 0x8086 0x1521
       name: enp9s0f0
       pxe: true
       # duplicates temporary PXE address discovered from baremetal-mgmt
       # since we have fallback-bond configured on host
     - ip: 172.16.170.33
       mac: 0c:c4:7a:34:52:05
       model: 0x8086 0x1521
       # discovered interfaces
       name: enp9s0f1
       pxe: false
    ....
     storage:
     - by_path: /dev/disk/by-path/pci-0000:00:1f.2-ata-1
       model: Samsung SSD 850
       name: /dev/sda
       rotational: false
       sizeBytes: 500107862016
     - by_path: /dev/disk/by-path/pci-0000:00:1f.2-ata-2
       model: Samsung SSD 850
       name: /dev/sdb
       rotational: false
       sizeBytes: 500107862016
    ....
    
  7. Create bare metal host profiles:

    • managed-ns_BareMetalHostProfile_bmhp-cluster-default.yaml

      apiVersion: metal3.io/v1alpha1
      kind: BareMetalHostProfile
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          # This label indicates that this profile is the default one in the
          # namespace, so machines without an explicitly selected profile
          # will use this template
          kaas.mirantis.com/defaultBMHProfile: 'true'
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: bmhp-cluster-default
        namespace: managed-ns
      spec:
        devices:
        - device:
            byName: /dev/sda
            minSizeGiB: 120
            wipe: true
          partitions:
          - name: bios_grub
            partflags:
            - bios_grub
            sizeGiB: 0.00390625
            wipe: true
          - name: uefi
            partflags:
            - esp
            sizeGiB: 0.2
            wipe: true
          - name: config-2
            sizeGiB: 0.0625
            wipe: true
          - name: lvm_dummy_part
            sizeGiB: 1
            wipe: true
          - name: lvm_root_part
            sizeGiB: 0
            wipe: true
        - device:
            byName: /dev/sdb
            minSizeGiB: 30
            wipe: true
        - device:
            byName: /dev/sdc
            minSizeGiB: 30
            wipe: true
          partitions:
          - name: lvm_lvp_part
            sizeGiB: 0
            wipe: true
        - device:
            byName: /dev/sdd
            wipe: true
        fileSystems:
        - fileSystem: vfat
          partition: config-2
        - fileSystem: vfat
          mountPoint: /boot/efi
          partition: uefi
        - fileSystem: ext4
          logicalVolume: root
          mountPoint: /
        - fileSystem: ext4
          logicalVolume: lvp
          mountPoint: /mnt/local-volumes/
        grubConfig:
          defaultGrubOptions:
          - GRUB_DISABLE_RECOVERY="true"
          - GRUB_PRELOAD_MODULES=lvm
          - GRUB_TIMEOUT=30
        kernelParameters:
          modules:
          - content: 'options kvm_intel nested=1'
            filename: kvm_intel.conf
          sysctl:
            fs.aio-max-nr: '1048576'
            fs.file-max: '9223372036854775807'
            fs.inotify.max_user_instances: '4096'
            kernel.core_uses_pid: '1'
            kernel.dmesg_restrict: '1'
            kernel.panic: '900'
            net.ipv4.conf.all.rp_filter: '0'
            net.ipv4.conf.default.rp_filter: '0'
            net.ipv4.conf.k8s-ext.rp_filter: '0'
            net.ipv4.conf.kalive-ext.rp_filter: '0'
            net.ipv4.conf.m-pub.rp_filter: '0'
            vm.max_map_count: '262144'
        logicalVolumes:
        - name: root
          sizeGiB: 0
          vg: lvm_root
        - name: lvp
          sizeGiB: 0
          vg: lvm_lvp
        postDeployScript: |
          #!/bin/bash -ex
          # used for test-debug only!
          echo "root:r00tme" | sudo chpasswd
          echo 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"' > /etc/udev/rules.d/60-ssd-scheduler.rules
          echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
      
        preDeployScript: |
          #!/bin/bash -ex
          echo "$(date) pre_deploy_script done" >> /root/pre_deploy_done
        volumeGroups:
        - devices:
          - partition: lvm_root_part
          name: lvm_root
        - devices:
          - partition: lvm_lvp_part
          name: lvm_lvp
        - devices:
          - partition: lvm_dummy_part
          # An LVM volume group can be created here without mounting or formatting it
          name: lvm_forawesomeapp
      
    • managed-ns_BareMetalHostProfile_worker-storage1.yaml

      apiVersion: metal3.io/v1alpha1
      kind: BareMetalHostProfile
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: worker-storage1
        namespace: managed-ns
      spec:
        devices:
        - device:
            minSizeGiB: 120
            wipe: true
          partitions:
          - name: bios_grub
            partflags:
            - bios_grub
            sizeGiB: 0.00390625
            wipe: true
          - name: uefi
            partflags:
            - esp
            sizeGiB: 0.2
            wipe: true
          - name: config-2
            sizeGiB: 0.0625
            wipe: true
          # Create a dummy partition without mounting it
          - name: lvm_dummy_part
            sizeGiB: 1
            wipe: true
          - name: lvm_root_part
            sizeGiB: 0
            wipe: true
        - device:
            # This device will be used for Ceph, so it must be wiped
            byName: /dev/sdb
            minSizeGiB: 30
            wipe: true
        - device:
            byName: /dev/nvme0n1
            minSizeGiB: 30
            wipe: true
          partitions:
          - name: lvm_lvp_part
            sizeGiB: 0
            wipe: true
        - device:
            byName: /dev/sde
            wipe: true
        - device:
            byName: /dev/sdf
            minSizeGiB: 30
            wipe: true
          partitions:
            - name: lvm_lvp_part_sdf
              wipe: true
              sizeGiB: 0
        fileSystems:
        - fileSystem: vfat
          partition: config-2
        - fileSystem: vfat
          mountPoint: /boot/efi
          partition: uefi
        - fileSystem: ext4
          logicalVolume: root
          mountPoint: /
        - fileSystem: ext4
          logicalVolume: lvp
          mountPoint: /mnt/local-volumes/
        grubConfig:
          defaultGrubOptions:
          - GRUB_DISABLE_RECOVERY="true"
          - GRUB_PRELOAD_MODULES=lvm
          - GRUB_TIMEOUT=30
        kernelParameters:
          modules:
          - content: 'options kvm_intel nested=1'
            filename: kvm_intel.conf
          sysctl:
            fs.aio-max-nr: '1048576'
            fs.file-max: '9223372036854775807'
            fs.inotify.max_user_instances: '4096'
            kernel.core_uses_pid: '1'
            kernel.dmesg_restrict: '1'
            kernel.panic: '900'
            net.ipv4.conf.all.rp_filter: '0'
            net.ipv4.conf.default.rp_filter: '0'
            net.ipv4.conf.k8s-ext.rp_filter: '0'
            net.ipv4.conf.kalive-ext.rp_filter: '0'
            net.ipv4.conf.m-pub.rp_filter: '0'
            vm.max_map_count: '262144'
        logicalVolumes:
        - name: root
          sizeGiB: 0
          vg: lvm_root
        - name: lvp
          sizeGiB: 0
          vg: lvm_lvp
        postDeployScript: |
      
          #!/bin/bash -ex
      
          # Used for testing and debugging only! Allows the operator to log in via TTY.
          echo "root:r00tme" | sudo chpasswd
          # Example of enforcing the "deadline" I/O scheduler for SSD disks.
          echo 'ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"' > /etc/udev/rules.d/60-ssd-scheduler.rules
          echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
      
        preDeployScript: |
          #!/bin/bash -ex
          echo "$(date) pre_deploy_script done" >> /root/pre_deploy_done
      
        volumeGroups:
        - devices:
          - partition: lvm_root_part
          name: lvm_root
        - devices:
          - partition: lvm_lvp_part
          - partition: lvm_lvp_part_sdf
          name: lvm_lvp
        - devices:
          - partition: lvm_dummy_part
          name: lvm_forawesomeapp
      
  8. Create the L2Template objects:

    • managed-ns_L2Template_bm-1490-template-controls-netplan.yaml

      apiVersion: ipam.mirantis.com/v1alpha1
      kind: L2Template
      metadata:
        labels:
          bm-1490-template-controls-netplan: anymagicstring
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: bm-1490-template-controls-netplan
        namespace: managed-ns
      spec:
        ifMapping:
        - enp9s0f0
        - enp9s0f1
        - eno1
        - ens3f1
        l3Layout:
        - scope: namespace
          subnetName: lcm-nw
        - scope: namespace
          subnetName: storage-frontend
        - scope: namespace
          subnetName: storage-backend
        - scope: namespace
          subnetName: metallb-public-for-extiface
      npTemplate: |-
        version: 2
        ethernets:
          {{nic 0}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 0}}
            set-name: {{nic 0}}
            mtu: 1500
          {{nic 1}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 1}}
            set-name: {{nic 1}}
            mtu: 1500
          {{nic 2}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 2}}
            set-name: {{nic 2}}
            mtu: 1500
          {{nic 3}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 3}}
            set-name: {{nic 3}}
            mtu: 1500
        bonds:
          bond0:
            parameters:
              mode: 802.3ad
              #transmit-hash-policy: layer3+4
              #mii-monitor-interval: 100
            interfaces:
              - {{ nic 0 }}
              - {{ nic 1 }}
          bond1:
            parameters:
              mode: 802.3ad
              #transmit-hash-policy: layer3+4
              #mii-monitor-interval: 100
            interfaces:
              - {{ nic 2 }}
              - {{ nic 3 }}
        vlans:
          stor-f:
            id: 1494
            link: bond1
            addresses:
              - {{ip "stor-frontend:storage-frontend"}}
          stor-b:
            id: 1489
            link: bond1
            addresses:
              - {{ip "stor-backend:storage-backend"}}
          m-pub:
            id: 1491
            link: bond0
        bridges:
          # The keepalived (loadBalancerHost) address is allocated from the MetalLB network.
          # To detect the keepalived interface on the master nodes,
          # addresses must be assigned to this bridge.
          kalive-ext:
            interfaces: [m-pub]
            addresses:
              - {{ ip "kalive-ext:metallb-public-for-extiface" }}
          #``k8s-lcm`` name is mandatory here.
          k8s-lcm:
            dhcp4: false
            dhcp6: false
            gateway4: {{ gateway_from_subnet "lcm-nw" }}
            addresses:
              - {{ ip "0:lcm-nw" }}
            nameservers:
              addresses: [ 172.18.176.6 ]
            interfaces:
                - bond0
      
    • managed-ns_L2Template_bm-1490-template-workers-netplan.yaml

      apiVersion: ipam.mirantis.com/v1alpha1
      kind: L2Template
      metadata:
        labels:
          bm-1490-template-workers-netplan: anymagicstring
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: bm-1490-template-workers-netplan
        namespace: managed-ns
      spec:
        ifMapping:
        - eno1
        - eno2
        - ens7f0
        - ens7f1
        l3Layout:
        - scope: namespace
          subnetName: lcm-nw
        - scope: namespace
          subnetName: storage-frontend
        - scope: namespace
          subnetName: storage-backend
        - scope: namespace
          subnetName: metallb-public-for-extiface
        npTemplate: |-
            version: 2
            ethernets:
              {{nic 0}}:
                nameservers:
                  addresses: [ 172.18.176.6 ]
                match:
                  macaddress: {{mac 0}}
                #``k8s-lcm`` name is mandatory here.
                set-name: "k8s-lcm"
                mtu: 1500
                gateway4: {{gateway_from_subnet "lcm-nw"}}
                addresses:
                  - {{ ip "0:lcm-nw" }}
              {{nic 1}}:
                dhcp4: false
                dhcp6: false
                match:
                  macaddress: {{mac 1}}
                set-name: {{nic 1}}
                mtu: 1500
              {{nic 2}}:
                dhcp4: false
                dhcp6: false
                match:
                  macaddress: {{mac 2}}
                set-name: {{nic 2}}
                mtu: 1500
              {{nic 3}}:
                dhcp4: false
                dhcp6: false
                match:
                  macaddress: {{mac 3}}
                set-name: {{nic 3}}
                mtu: 1500
            bonds:
              bond0:
                interfaces:
                  - {{ nic 1 }}
              bond1:
                parameters:
                  mode: 802.3ad
                  #transmit-hash-policy: layer3+4
                  #mii-monitor-interval: 100
                interfaces:
                  - {{ nic 2 }}
                  - {{ nic 3 }}
            vlans:
              stor-f:
                id: 1494
                link: bond1
                addresses:
                  - {{ip "stor-frontend:storage-frontend"}}
              stor-b:
                id: 1489
                link: bond1
                addresses:
                  - {{ip "stor-backend:storage-backend"}}
              m-pub:
                id: 1491
                link: {{ nic 1 }}
            bridges:
              k8s-ext:
                interfaces: [m-pub]
      
    • managed-ns_L2Template_bm-1490-template-controls-netplan-cz7700-pxebond.yaml

      apiVersion: ipam.mirantis.com/v1alpha1
      kind: L2Template
      metadata:
        labels:
          bm-1490-template-controls-netplan-cz7700-pxebond: anymagicstring
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: bm-1490-template-controls-netplan-cz7700-pxebond
        namespace: managed-ns
      spec:
        ifMapping:
        - enp9s0f0
        - enp9s0f1
        - eno1
        - ens3f1
        l3Layout:
        - scope: namespace
          subnetName: lcm-nw
        - scope: namespace
          subnetName: storage-frontend
        - scope: namespace
          subnetName: storage-backend
        - scope: namespace
          subnetName: metallb-public-for-extiface
      npTemplate: |-
        version: 2
        ethernets:
          {{nic 0}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 0}}
            set-name: {{nic 0}}
            mtu: 1500
          {{nic 1}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 1}}
            set-name: {{nic 1}}
            mtu: 1500
          {{nic 2}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 2}}
            set-name: {{nic 2}}
            mtu: 1500
          {{nic 3}}:
            dhcp4: false
            dhcp6: false
            match:
              macaddress: {{mac 3}}
            set-name: {{nic 3}}
            mtu: 1500
        bonds:
          bond0:
            parameters:
              mode: 802.3ad
              #transmit-hash-policy: layer3+4
              #mii-monitor-interval: 100
            interfaces:
              - {{ nic 0 }}
              - {{ nic 1 }}
          bond1:
            parameters:
              mode: 802.3ad
              #transmit-hash-policy: layer3+4
              #mii-monitor-interval: 100
            interfaces:
              - {{ nic 2 }}
              - {{ nic 3 }}
        vlans:
          stor-f:
            id: 1494
            link: bond1
            addresses:
              - {{ip "stor-frontend:storage-frontend"}}
          stor-b:
            id: 1489
            link: bond1
            addresses:
              - {{ip "stor-backend:storage-backend"}}
          m-pub:
            id: 1491
            link: bond0
        bridges:
          # The keepalived (loadBalancerHost) address is allocated from the MetalLB network.
          # To detect the keepalived interface on the master nodes,
          # addresses must be assigned to this bridge.
          kalive-ext:
            interfaces: [m-pub]
            addresses:
              - {{ ip "kalive-ext:metallb-public-for-extiface" }}
          #``k8s-lcm`` name is mandatory here.
          k8s-lcm:
            dhcp4: false
            dhcp6: false
            gateway4: {{ gateway_from_subnet "lcm-nw" }}
            addresses:
              - {{ ip "0:lcm-nw" }}
            nameservers:
              addresses: [ 172.18.176.6 ]
            interfaces:
                - bond0
      
  9. Create the Subnet objects:

    • managed-ns_Subnet_lcm-nw.yaml

      apiVersion: ipam.mirantis.com/v1alpha1
      kind: Subnet
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          kaas.mirantis.com/region: region-one
        name: lcm-nw
        namespace: managed-ns
      spec:
        cidr: 172.16.170.0/24
        excludeRanges:
        - 172.16.168.3
        - 172.16.170.150
        gateway: 172.16.170.1
        includeRanges:
        - 172.16.170.150-172.16.170.250
      
    • managed-ns_Subnet_metallb-public-for-managed.yaml

      apiVersion: ipam.mirantis.com/v1alpha1
      kind: Subnet
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          ipam/SVC-MetalLB: '1'
          kaas.mirantis.com/region: region-one
        name: metallb-public-for-managed
        namespace: managed-ns
      spec:
        cidr: 172.16.168.0/24
        excludeRanges:
        - 172.16.168.3
        - 172.16.168.1-172.16.168.2
        - 172.16.168.10-172.16.168.30
        gateway: 172.16.168.1
      
    • managed-ns_Subnet_metallb-public-for-extiface.yaml

      apiVersion: ipam.mirantis.com/v1alpha1
      kind: Subnet
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          kaas.mirantis.com/region: region-one
        name: metallb-public-for-extiface
        namespace: managed-ns
      spec:
        cidr: 172.16.168.0/24
        gateway: 172.16.168.1
        includeRanges:
        - 172.16.168.10-172.16.168.30
      
    • managed-ns_Subnet_storage-backend.yaml

      apiVersion: ipam.mirantis.com/v1alpha1
      kind: Subnet
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          ipam/SVC-ceph-cluster: '1'
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: storage-backend
        namespace: managed-ns
      spec:
        cidr: 10.12.0.0/24
      
    • managed-ns_Subnet_storage-frontend.yaml

      apiVersion: ipam.mirantis.com/v1alpha1
      kind: Subnet
      metadata:
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          ipam/SVC-ceph-public: '1'
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        name: storage-frontend
        namespace: managed-ns
      spec:
        cidr: 10.12.1.0/24
      
  10. Create the PublicKey object for a managed cluster connection. For details, see Public key resources.

    managed-ns_PublicKey_managed-cluster-key.yaml

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: PublicKey
    metadata:
      name: managed-cluster-key
      namespace: managed-ns
    spec:
      publicKey: ssh-rsa AAEXAMPLEXXX
    
  11. Create the Cluster object. For details, see Cluster resources.

    managed-ns_Cluster_managed-cluster.yaml

    apiVersion: cluster.k8s.io/v1alpha1
    kind: Cluster
    metadata:
      annotations:
        kaas.mirantis.com/lcm: 'true'
      labels:
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
      name: managed-cluster
      namespace: managed-ns
    spec:
      clusterNetwork:
        pods:
          cidrBlocks:
          - 192.168.0.0/16
        serviceDomain: ''
        services:
          cidrBlocks:
          - 10.232.0.0/18
      providerSpec:
        value:
          apiVersion: baremetal.k8s.io/v1alpha1
          dedicatedControlPlane: false
          dnsNameservers:
          - 172.18.176.6
          - 172.19.80.70
          helmReleases:
          - name: ceph-controller
          - enabled: true
            name: stacklight
            values:
              alertmanagerSimpleConfig:
                email:
                  enabled: false
                slack:
                  enabled: false
              elasticsearch:
                logstashRetentionTime: '30'
                persistentVolumeClaimSize: 30Gi
              highAvailabilityEnabled: false
              logging:
                enabled: false
              prometheusServer:
                customAlerts: []
                persistentVolumeClaimSize: 16Gi
                retentionSize: 15GB
                retentionTime: 15d
                watchDogAlertEnabled: false
          - name: metallb
            # Since the MetalLB subnet is defined, no extra
            # configuration is required in the Cluster object
            values: {}
          kind: BaremetalClusterProviderSpec
          loadBalancerHost: 172.16.168.3
          publicKeys:
          - name: managed-cluster-key
          region: region-one
          release: mke-5-16-0-3-3-6
    
  12. Create the Machine objects linked to each bmh node. For details, see Machine resources.

    • managed-ns_Machine_cz7700-managed-cluster-control-noefi-.yaml

      apiVersion: cluster.k8s.io/v1alpha1
      kind: Machine
      metadata:
        generateName: cz7700-managed-cluster-control-noefi-
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          cluster.sigs.k8s.io/control-plane: controlplane
          hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        namespace: managed-ns
      spec:
        providerSpec:
          value:
            apiVersion: baremetal.k8s.io/v1alpha1
            hostSelector:
              matchLabels:
                kaas.mirantis.com/baremetalhost-id: cz7700
            kind: BareMetalMachineProviderSpec
            l2TemplateSelector:
              label: bm-1490-template-controls-netplan-cz7700-pxebond
            publicKeys:
            - name: managed-cluster-key
      
    • managed-ns_Machine_cz7741-managed-cluster-control-noefi-.yaml

      apiVersion: cluster.k8s.io/v1alpha1
      kind: Machine
      metadata:
        generateName: cz7741-managed-cluster-control-noefi-
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          cluster.sigs.k8s.io/control-plane: controlplane
          hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        namespace: managed-ns
      spec:
        providerSpec:
          value:
            apiVersion: baremetal.k8s.io/v1alpha1
            bareMetalHostProfile:
              name: bmhp-cluster-default
              namespace: managed-ns
            hostSelector:
              matchLabels:
                kaas.mirantis.com/baremetalhost-id: cz7741
            kind: BareMetalMachineProviderSpec
            l2TemplateSelector:
              label: bm-1490-template-controls-netplan
            publicKeys:
            - name: managed-cluster-key
      
    • managed-ns_Machine_cz7743-managed-cluster-control-noefi-.yaml

      apiVersion: cluster.k8s.io/v1alpha1
      kind: Machine
      metadata:
        generateName: cz7743-managed-cluster-control-noefi-
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          cluster.sigs.k8s.io/control-plane: controlplane
          hostlabel.bm.kaas.mirantis.com/controlplane: controlplane
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        namespace: managed-ns
      spec:
        providerSpec:
          value:
            apiVersion: baremetal.k8s.io/v1alpha1
            bareMetalHostProfile:
              name: bmhp-cluster-default
              namespace: managed-ns
            hostSelector:
              matchLabels:
                kaas.mirantis.com/baremetalhost-id: cz7743
            kind: BareMetalMachineProviderSpec
            l2TemplateSelector:
              label: bm-1490-template-controls-netplan
            publicKeys:
            - name: managed-cluster-key
      
    • managed-ns_Machine_cz812-managed-cluster-storage-worker-noefi-.yaml

      apiVersion: cluster.k8s.io/v1alpha1
      kind: Machine
      metadata:
        generateName: cz812-managed-cluster-storage-worker-noefi-
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/storage: storage
          hostlabel.bm.kaas.mirantis.com/worker: worker
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        namespace: managed-ns
      spec:
        providerSpec:
          value:
            apiVersion: baremetal.k8s.io/v1alpha1
            bareMetalHostProfile:
              name: worker-storage1
              namespace: managed-ns
            hostSelector:
              matchLabels:
                kaas.mirantis.com/baremetalhost-id: cz812
            kind: BareMetalMachineProviderSpec
            l2TemplateSelector:
              label: bm-1490-template-workers-netplan
            publicKeys:
            - name: managed-cluster-key
      
    • managed-ns_Machine_cz813-managed-cluster-storage-worker-noefi-.yaml

      apiVersion: cluster.k8s.io/v1alpha1
      kind: Machine
      metadata:
        generateName: cz813-managed-cluster-storage-worker-noefi-
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/storage: storage
          hostlabel.bm.kaas.mirantis.com/worker: worker
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        namespace: managed-ns
      spec:
        providerSpec:
          value:
            apiVersion: baremetal.k8s.io/v1alpha1
            bareMetalHostProfile:
              name: worker-storage1
              namespace: managed-ns
            hostSelector:
              matchLabels:
                kaas.mirantis.com/baremetalhost-id: cz813
            kind: BareMetalMachineProviderSpec
            l2TemplateSelector:
              label: bm-1490-template-workers-netplan
            publicKeys:
            - name: managed-cluster-key
      
    • managed-ns_Machine_cz814-managed-cluster-storage-worker-noefi-.yaml

      apiVersion: cluster.k8s.io/v1alpha1
      kind: Machine
      metadata:
        generateName: cz814-managed-cluster-storage-worker-noefi-
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/storage: storage
          hostlabel.bm.kaas.mirantis.com/worker: worker
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
        namespace: managed-ns
      spec:
        providerSpec:
          value:
            apiVersion: baremetal.k8s.io/v1alpha1
            bareMetalHostProfile:
              name: worker-storage1
              namespace: managed-ns
            hostSelector:
              matchLabels:
                kaas.mirantis.com/baremetalhost-id: cz814
            kind: BareMetalMachineProviderSpec
            l2TemplateSelector:
              label: bm-1490-template-workers-netplan
            publicKeys:
            - name: managed-cluster-key
      
    • managed-ns_Machine_cz815-managed-cluster-worker-noefi-.yaml

      apiVersion: cluster.k8s.io/v1alpha1
      kind: Machine
      metadata:
        generateName: cz815-managed-cluster-worker-noefi-
        labels:
          cluster.sigs.k8s.io/cluster-name: managed-cluster
          hostlabel.bm.kaas.mirantis.com/worker: worker
          kaas.mirantis.com/provider: baremetal
          kaas.mirantis.com/region: region-one
          si-role/node-for-delete: 'true'
        namespace: managed-ns
      spec:
        providerSpec:
          value:
            apiVersion: baremetal.k8s.io/v1alpha1
            bareMetalHostProfile:
              name: worker-storage1
              namespace: managed-ns
            hostSelector:
              matchLabels:
                kaas.mirantis.com/baremetalhost-id: cz815
            kind: BareMetalMachineProviderSpec
            l2TemplateSelector:
              label: bm-1490-template-workers-netplan
            publicKeys:
            - name: managed-cluster-key
      
  13. Verify that the bmh nodes are in the provisioning state:

    KUBECONFIG=kubeconfig kubectl -n managed-ns get bmh -o wide
    

    Example of system response:

    NAME                                  STATUS STATE          CONSUMER                                    BMC          BOOTMODE   ONLINE  ERROR REGION
    cz7700-managed-cluster-control-noefi  OK     provisioning   cz7700-managed-cluster-control-noefi-8bkqw  192.168.1.12  legacy     true          region-one
    cz7741-managed-cluster-control-noefi  OK     provisioning   cz7741-managed-cluster-control-noefi-42tp2  192.168.1.76  legacy     true          region-one
    cz7743-managed-cluster-control-noefi  OK     provisioning   cz7743-managed-cluster-control-noefi-8cwpw  192.168.1.78  legacy     true          region-one
    ...
    

    Wait until all bmh nodes are in the provisioned state.
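
    For example, you can watch the state transitions without rerunning the command by adding the standard kubectl --watch flag:

    KUBECONFIG=kubeconfig kubectl -n managed-ns get bmh -o wide --watch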

  14. Verify that the lcmmachine phase has started:

    KUBECONFIG=kubeconfig kubectl -n managed-ns get lcmmachines -o wide
    

    Example of system response:

    NAME                                       CLUSTERNAME       TYPE      STATE   INTERNALIP     HOSTNAME                                         AGENTVERSION
    cz7700-managed-cluster-control-noefi-8bkqw managed-cluster   control   Deploy  172.16.170.153 kaas-node-803721b4-227c-4675-acc5-15ff9d3cfde2   v0.2.0-349-g4870b7f5
    cz7741-managed-cluster-control-noefi-42tp2 managed-cluster   control   Prepare 172.16.170.152 kaas-node-6b8f0d51-4c5e-43c5-ac53-a95988b1a526   v0.2.0-349-g4870b7f5
    cz7743-managed-cluster-control-noefi-8cwpw managed-cluster   control   Prepare 172.16.170.151 kaas-node-e9b7447d-5010-439b-8c95-3598518f8e0a   v0.2.0-349-g4870b7f5
    ...
    
  15. Verify that the lcmmachine phase is complete and the Kubernetes cluster is created:

    KUBECONFIG=kubeconfig kubectl -n managed-ns get lcmmachines -o wide
    

    Example of system response:

    NAME                                       CLUSTERNAME       TYPE     STATE  INTERNALIP      HOSTNAME                                        AGENTVERSION
    cz7700-managed-cluster-control-noefi-8bkqw  managed-cluster  control  Ready  172.16.170.153  kaas-node-803721b4-227c-4675-acc5-15ff9d3cfde2  v0.2.0-349-g4870b7f5
    cz7741-managed-cluster-control-noefi-42tp2  managed-cluster  control  Ready  172.16.170.152  kaas-node-6b8f0d51-4c5e-43c5-ac53-a95988b1a526  v0.2.0-349-g4870b7f5
    cz7743-managed-cluster-control-noefi-8cwpw  managed-cluster  control  Ready  172.16.170.151  kaas-node-e9b7447d-5010-439b-8c95-3598518f8e0a  v0.2.0-349-g4870b7f5
    ...
    
  16. Create the KaaSCephCluster object:

    managed-ns_KaaSCephCluster_ceph-cluster-managed-cluster.yaml

    apiVersion: kaas.mirantis.com/v1alpha1
    kind: KaaSCephCluster
    metadata:
      name: ceph-cluster-managed-cluster
      namespace: managed-ns
    spec:
      cephClusterSpec:
        failureDomain: host
        nodes:
          # Add the exact ``nodes`` names.
          # Obtain the names from the ``CONSUMER`` field of the "get bmh -o wide" output.
          cz812-managed-cluster-storage-worker-noefi-58spl:
            roles:
            - mgr
            - mon
            - osd
            # All disk configuration must be reflected in ``baremetalhostprofile``
            storageDevices:
            - config:
                deviceClass: ssd
              name: sdb
          cz813-managed-cluster-storage-worker-noefi-lr4k4:
            roles:
            - mgr
            - mon
            - osd
            storageDevices:
            - config:
                deviceClass: ssd
              name: sdb
          cz814-managed-cluster-storage-worker-noefi-z2m67:
            roles:
            - mgr
            - mon
            - osd
            storageDevices:
            - config:
                deviceClass: ssd
              name: sdb
        pools:
        - default: true
          deviceClass: ssd
          name: kubernetes
          replicated:
            size: 2
          role: kubernetes
      k8sCluster:
        name: managed-cluster
        namespace: managed-ns
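
    The node names under nodes correspond to the CONSUMER column of the earlier get bmh -o wide output. To create the object, you can, for example, apply the manifest with kubectl on the management cluster, assuming it is saved under the file name shown above:

    KUBECONFIG=kubeconfig kubectl apply -f managed-ns_KaaSCephCluster_ceph-cluster-managed-cluster.yaml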
    
  17. Obtain kubeconfig of the newly created managed cluster:

    KUBECONFIG=kubeconfig kubectl -n managed-ns get secrets managed-cluster-kubeconfig -o jsonpath='{.data.admin\.conf}' | base64 -d | tee managed.kubeconfig
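
    To verify that the obtained kubeconfig provides access to the managed cluster, you can, for example, list its nodes:

    KUBECONFIG=managed.kubeconfig kubectl get nodes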
    
  18. Verify the status of the Ceph cluster in your managed cluster:

    KUBECONFIG=managed.kubeconfig kubectl -n rook-ceph exec -it $(KUBECONFIG=managed.kubeconfig kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') -- ceph -s
    

    Example of system response:

    cluster:
      id:     e75c6abd-c5d5-4ae8-af17-4711354ff8ef
      health: HEALTH_OK
    services:
      mon: 3 daemons, quorum a,b,c (age 55m)
      mgr: a(active, since 55m)
      osd: 3 osds: 3 up (since 54m), 3 in (since 54m)
    data:
      pools:   1 pools, 32 pgs
      objects: 273 objects, 555 MiB
      usage:   4.0 GiB used, 1.6 TiB / 1.6 TiB avail
      pgs:     32 active+clean
    io:
      client:   51 KiB/s wr, 0 op/s rd, 4 op/s wr
    
Configure multiple DHCP ranges using Subnet resources

Caution

This feature is available starting from the Container Cloud release 2.13.0.

To facilitate multi-rack and other types of distributed bare metal datacenter topologies, the dnsmasq DHCP server used for host provisioning in Container Cloud supports working with multiple L2 segments through network routers that support DHCP relay.

To configure DHCP ranges for dnsmasq, create the Subnet objects tagged with the ipam/SVC-dhcp-range label while setting up subnets for a managed cluster using the CLI.

For every dhcp-range record, Container Cloud also configures the dhcp-option record to pass the default route through the default gateway from the corresponding subnet to all hosts that obtain addresses from that DHCP range. You can also specify DNS server addresses for servers that boot over PXE. They will be configured by Container Cloud using another dhcp-option record.

Note

  • The Subnet objects for DHCP ranges should not reference any specific cluster, as DHCP server configuration is only applicable to the management or regional cluster. The kaas.mirantis.com/region label that specifies the region will be used to determine where to apply the DHCP ranges from the given Subnet object. The Cluster reference will be ignored.

  • The baremetal-operator chart allows using multiple DHCP ranges in the dnsmasq.conf file. The chart iterates over a list of the dhcp-range parameters from its values and adds all items from the list to the dnsmasq configuration.

  • The baremetal-operator chart allows using a single DHCP range for backward compatibility. By default, the KAAS_BM_BM_DHCP_RANGE environment variable is still used to define the DHCP range for the management or regional cluster nodes during provisioning.

To configure DHCP ranges for dnsmasq:

  1. Create the Subnet objects tagged with the ipam/SVC-dhcp-range label.

    To create the Subnet objects, refer to Create subnets.

    Use the following Subnet object example to specify DHCP ranges and DHCP options to pass the default route and DNS server addresses:

    apiVersion: "ipam.mirantis.com/v1alpha1"
    kind: Subnet
    metadata:
      name: mgmt-dhcp-range
      namespace: default
      labels:
        ipam/SVC-dhcp-range: ""
        kaas.mirantis.com/provider: baremetal
        kaas.mirantis.com/region: region-one
    spec:
      cidr: 10.0.0.0/24
      gateway: 10.0.0.1
      includeRanges:
        - 10.0.0.121-10.0.0.125
        - 10.0.0.191-10.0.0.199
      nameservers:
      - 172.118.24.6
      - 8.8.8.8
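
    For example, assuming the manifest above is saved as mgmt-dhcp-range.yaml (a hypothetical file name), you can create the object on the management cluster as follows:

    # mgmt-dhcp-range.yaml is a hypothetical file name for the Subnet manifest above
    kubectl --kubeconfig <pathToMgmtOrRegionalClusterKubeconfig> apply -f mgmt-dhcp-range.yaml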
    

    After creating the above Subnet object, the following dnsmasq parameters will be set using the baremetal-operator Helm chart:

    dhcp-range=set:mgmt-dhcp-range-0,10.0.0.121,10.0.0.125,255.255.255.0
    dhcp-range=set:mgmt-dhcp-range-1,10.0.0.191,10.0.0.199,255.255.255.0
    dhcp-option=tag:mgmt-dhcp-range-0,option:router,10.0.0.1
    dhcp-option=tag:mgmt-dhcp-range-1,option:router,10.0.0.1
    dhcp-option=tag:mgmt-dhcp-range-0,option:dns-server,172.118.24.6,8.8.8.8
    dhcp-option=tag:mgmt-dhcp-range-1,option:dns-server,172.118.24.6,8.8.8.8
    
    The dnsmasq parameters composed from the Subnet object

    Parameter

    Description

    dhcp-range=set:mgmt-dhcp-range-0,10.0.0.121,10.0.0.125,255.255.255.0

    DHCP range is set according to the cidr and includeRanges parameters of the Subnet object. The mgmt-dhcp-range-0 tag is formed from the Subnet object name and address range index within the Subnet object.

    dhcp-option=tag:mgmt-dhcp-range-0,option:router,10.0.0.1

    The default router option is set according to the gateway parameter of the Subnet object. The tag is the same as in the dhcp-range parameter.

    dhcp-option=tag:mgmt-dhcp-range-0,option:dns-server,172.118.24.6,8.8.8.8

    Optional, available when the nameservers parameter is set in the Subnet object. The DNS server option is set according to the nameservers parameter of the Subnet object. The tag is the same as in the dhcp-range parameter.

  2. Verify that the changes are applied to dnsmasq.conf:

    kubectl --kubeconfig <pathToMgmtOrRegionalClusterKubeconfig> \
    -n kaas get cm dnsmasq-config -o json | jq -r '.data."dnsmasq.conf"'
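
    The output should contain the dhcp-range and dhcp-option entries composed from your Subnet objects. For example, to filter only the DHCP ranges:

    kubectl --kubeconfig <pathToMgmtOrRegionalClusterKubeconfig> \
    -n kaas get cm dnsmasq-config -o json | jq -r '.data."dnsmasq.conf"' | grep dhcp-range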
    
Add a machine

This section describes how to add a machine to a newly created managed cluster using either the Mirantis Container Cloud web UI or CLI for an advanced configuration.

Create a machine using web UI

After you add bare metal hosts and create a managed cluster as described in Create a managed cluster, proceed with associating Kubernetes machines of your cluster with the previously added bare metal hosts using the Mirantis Container Cloud web UI.

To add a Kubernetes machine to a baremetal-based managed cluster:

  1. Log in to the Mirantis Container Cloud web UI with the operator or writer permissions.

  2. Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.

  3. In the Clusters tab, click the required cluster name. The cluster page with the Machines list opens.

  4. Click the Create Machine button.

  5. Fill out the Create New Machine form as required:

    • Count

      Specify the number of machines to add.

    • Manager

      Select Manager or Worker to create a Kubernetes manager or worker node. The required minimum number of machines is three for manager node HA and two worker machines for the Container Cloud workloads.

    • BareMetal Host Label

      Assign the role to the new machine(s) to link the machine to a previously created bare metal host with the corresponding label. You can assign one role type per machine. The supported labels include:

      • Worker

        The default role for any node in a managed cluster. Only the kubelet service is running on the machines of this type.

      • Manager

        This node hosts the manager services of a managed cluster. For reliability reasons, Container Cloud does not permit running end-user workloads on the manager nodes or using them as storage nodes.

      • Storage

        This node is a worker node that also hosts Ceph OSDs and provides its disk resources to Ceph. Container Cloud permits end users to run workloads on storage nodes by default.

    • Node Labels

      Select the required node labels for the worker machine to run certain components on a specific node. For example, for the StackLight nodes that run Elasticsearch and require more resources than a standard node, select the StackLight label. The list of available node labels is obtained from your current Cluster release.

      Caution

      If you deploy StackLight in the HA mode (recommended):

      • Add the StackLight label to a minimum of three worker nodes. Otherwise, StackLight will not be deployed until the required number of worker nodes is configured with the StackLight label.

      • Removing the StackLight label from worker nodes, as well as removing worker nodes that have the StackLight label, can make the StackLight components inaccessible. It is important to correctly maintain the worker nodes where the StackLight local volumes were provisioned. For details, see Delete a machine.

        To obtain the list of nodes where StackLight is deployed, refer to Upgrade managed clusters with StackLight deployed in HA mode.

      Note

      You can add node labels after deploying a worker machine. On the Machines page, click the More action icon in the last column of the required machine field and select Configure machine.

    • L2 Template

      From the drop-down list, select the previously created L2 template, if any. For details, see Create L2 templates. Otherwise, leave the default selection to use a preinstalled L2 template.

    • BM Host Profile

      From the drop-down list, select the previously created custom bare metal host profile, if any. For details, see Create a custom bare metal host profile. Otherwise, leave the default selection.

  6. Click Create.

    At this point, Container Cloud adds the new machine object to the specified managed cluster, and the Bare Metal Operator controller creates the relation to a BareMetalHost object with the labels matching the roles.

    Provisioning of the newly created machine starts when the machine object is created and includes the following stages:

    1. Creation of partitions on the local disks as required by the operating system and the Container Cloud architecture.

    2. Configuration of the network interfaces on the host as required by the operating system and the Container Cloud architecture.

    3. Installation and configuration of the Container Cloud LCM agent.

  7. Repeat the steps above for the remaining machines.

    Monitor the deploy or update live status of the machine:

    • Quick status

      On the Clusters page, in the Managers or Workers columns. The green status icon indicates that the machine is Ready, and the orange status icon indicates that the machine is Updating.

    • Detailed status

      In the Machines section of a particular cluster page, in the Status column. Hover over a particular machine status icon to verify the deploy or update status of a specific machine component.

    You can monitor the status of the following machine components:

    Component

    Description

    Kubelet

    Readiness of a node in a Kubernetes cluster, as reported by kubelet

    Swarm

    Health and readiness of a node in a Docker Swarm cluster

    LCM

    LCM readiness status of a node

    ProviderInstance

    Readiness of a node in the underlying infrastructure (virtual or bare metal, depending on the provider type)

    The machine creation starts with the Provision status. During provisioning, the machine is not expected to be accessible since its infrastructure (VM, network, and so on) is being created.

    Other machine statuses are the same as the LCMMachine object states described in LCM controller.

    Once the status changes to Ready, the deployment of the managed cluster components on this machine is complete.
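
    As a sketch, you can also track the corresponding LCMMachine states from the CLI, assuming you substitute your management or regional cluster kubeconfig path and the project namespace of your managed cluster for the placeholders:

    kubectl --kubeconfig <pathToMgmtOrRegionalClusterKubeconfig> -n <projectName> get lcmmachines -o wide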

Now, proceed to