This documentation provides information on how to deploy and operate Mirantis Container Cloud.
The documentation is intended to help operators understand the core concepts of the product.
The information provided in this documentation set is constantly improved and amended based on feedback and requests from our software consumers. This documentation set describes the features that are supported within the two latest Container Cloud minor releases, with a corresponding Available since release note.
The following table lists the guides included in the documentation set you are reading:
Guide |
Purpose |
---|---|
Reference Architecture |
Learn the fundamentals of Container Cloud reference architecture to plan your deployment. |
Deployment Guide |
Deploy Container Cloud in a preferred configuration using supported deployment profiles tailored to the demands of specific business cases. |
Operations Guide |
Deploy and operate the Container Cloud managed clusters. |
Release Compatibility Matrix |
Deployment compatibility of the Container Cloud components versions for each product release. |
Release Notes |
Learn about new features and bug fixes in the current Container Cloud version as well as in the Container Cloud minor releases. |
For your convenience, we provide all guides from this documentation set in HTML (default), single-page HTML, PDF, and ePUB formats. To use the preferred format of a guide, select the required option from the Formats menu next to the guide title on the Container Cloud documentation home page.
This documentation assumes that the reader is familiar with network and cloud concepts and is intended for the following users:
Infrastructure Operator
Is a member of the IT operations team
Has working knowledge of Linux, virtualization, Kubernetes API and CLI, and OpenStack to support the application development team
Accesses Mirantis Container Cloud and Kubernetes through a local machine or web UI
Provides verified artifacts through a central repository to the Tenant DevOps engineers
Tenant DevOps engineer
Is a member of the application development team and reports to the line-of-business (LOB)
Has working knowledge of Linux, virtualization, Kubernetes API and CLI to support application owners
Accesses Container Cloud and Kubernetes through a local machine or web UI
Consumes artifacts from a central repository approved by the Infrastructure Operator
This documentation set uses the following conventions in the HTML format:
Convention |
Description |
---|---|
boldface font |
Inline CLI tools and commands, titles of the procedures and system response examples, table titles. |
|
File names and paths, Helm chart parameters and their values, names of packages, node names and labels, and so on. |
italic font |
Information that distinguishes some concept or term. |
External links and cross-references, footnotes. |
|
Main menu > menu item |
GUI elements that include any part of interactive user interface and menu navigation. |
Superscript |
Some extra, brief information. For example, if a feature is available from a specific release or if a feature is in the Technology Preview development stage. |
Note The Note block |
Messages of a generic meaning that may be useful to the user. |
Caution The Caution block |
Information that helps a user avoid mistakes and undesirable consequences when following the procedures. |
Warning The Warning block |
Messages that include details that can be easily missed, but should not be ignored by the user and are valuable before proceeding. |
See also The See also block |
List of references that may be helpful for understanding of some related tools, concepts, and so on. |
Learn more The Learn more block |
Used in the Release Notes to wrap a list of internal references to the reference architecture, deployment and operation procedures specific to a newly implemented product feature. |
This documentation set includes descriptions of the Technology Preview features. A Technology Preview feature provides early access to upcoming product innovations, allowing customers to experience the functionality and provide feedback during the development process. Technology Preview features may be privately or publicly available but are not intended for production use. While Mirantis will provide support for such features through official channels, normal Service Level Agreements do not apply. Customers may be supported by Mirantis Customer Support or Mirantis Field Support.
As Mirantis considers making future iterations of Technology Preview features generally available, we will attempt to resolve any issues that customers experience when using these features.
During the development of a Technology Preview feature, additional components may become available to the public for testing. Because Technology Preview features are still under development, Mirantis cannot guarantee the stability of such features. As a result, if you are using Technology Preview features, you may not be able to seamlessly upgrade to subsequent releases of that feature. Mirantis makes no guarantees that Technology Preview features will be graduated to a generally available product release.
The Mirantis Customer Success Organization may create bug reports on behalf of support cases filed by customers. These bug reports will then be forwarded to the Mirantis Product team for possible inclusion in a future release.
The documentation set refers to Mirantis Container Cloud GA as the latest released GA version of the product. For details about the Container Cloud GA minor release dates, refer to Container Cloud releases.
The Mirantis Container Cloud APIs are implemented using the Kubernetes CustomResourceDefinitions (CRDs) that enable you to expand the Kubernetes API. For details, see Mirantis Container Cloud API.
You can operate Container Cloud using the kubectl command-line tool that is based on the Kubernetes API. For the kubectl reference, see the official Kubernetes documentation.
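For example, to quickly inspect the clusters available in a project from the command line, you can run commands similar to the following. This is a minimal illustration only; the kubeconfig path, project, and cluster names are placeholders that you substitute with your own values:

# List the clusters created in a specific project (namespace)
kubectl --kubeconfig <pathToManagementClusterKubeconfig> get clusters -n <projectName>

# Inspect a particular cluster object in detail
kubectl --kubeconfig <pathToManagementClusterKubeconfig> get cluster <clusterName> -n <projectName> -o yaml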
The Container Cloud Operations Guide mostly contains manuals that describe the Container Cloud web UI, which is intuitive and easy to get started with. Some sections are divided into a web UI instruction and an analogous but more advanced CLI one. Certain Container Cloud operations can be performed only using the CLI, with the corresponding steps described in dedicated sections. For details, refer to the required component section of this guide.
Note
This tutorial applies only to the Container Cloud web UI users with the writer access role assigned by the Infrastructure Operator. To add a bare metal host, the operator access role is also required.
After you deploy the Mirantis Container Cloud management cluster, you can start creating managed clusters that will be based on the same cloud provider type that you have for the management cluster: OpenStack, AWS, bare metal, or VMware vSphere.
The deployment procedure is performed using the Container Cloud web UI and comprises the following steps:
Create an initial cluster configuration depending on the provider type.
For a baremetal-based managed cluster, create and configure bare metal hosts with corresponding labels for machines such as worker, manager, or storage.
Add the required amount of machines with the corresponding configuration to the managed cluster.
For a baremetal-based managed cluster, add a Ceph cluster.
After bootstrapping your baremetal-based Mirantis Container Cloud management cluster as described in Deployment Guide: Deploy a baremetal-based management cluster, you start creating the baremetal-based managed clusters using the Container Cloud web UI.
This section instructs you on how to configure and deploy a managed cluster that is based on the baremetal-based management cluster through the Mirantis Container Cloud web UI.
To create a managed cluster on bare metal:
Recommended. Verify that you have successfully configured an L2 template for a new cluster as described in Advanced networking configuration. You may skip this step if you do not require L2 separation for network traffic.
Optional. Create a custom bare metal host profile depending on your needs as described in Create a custom bare metal host profile.
Log in to the Container Cloud web UI with the writer permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the SSH keys tab, click Add SSH Key to upload the public SSH key that will be used for the SSH access to VMs.
In the Clusters tab, click Create Cluster.
Configure the new cluster in the Create New Cluster wizard that opens:
Define general and Kubernetes parameters:
Section |
Parameter name |
Description |
---|---|---|
General settings |
Cluster name |
The cluster name. |
Provider |
Select Baremetal. |
|
Region |
From the drop-down list, select Baremetal. |
|
Release version |
The Container Cloud version. |
|
SSH keys |
From the drop-down list, select the SSH key name that you have previously added for SSH access to the bare metal hosts. |
|
Provider |
LB host IP |
The IP address of the load balancer endpoint that will be used to access the Kubernetes API of the new cluster. This IP address must be on the Combined/PXE network. |
LB address range |
The range of IP addresses that can be assigned to load balancers for Kubernetes Services by MetalLB. |
|
Kubernetes |
Node CIDR |
Not applicable to bare metal. Set to an example value
|
Services CIDR blocks |
The Kubernetes Services CIDR blocks.
For example, |
|
Pods CIDR blocks |
The Kubernetes pods CIDR blocks.
For example, |
Configure StackLight:
Section |
Parameter name |
Description |
---|---|---|
StackLight |
Enable Monitoring |
Selected by default. Deselect to skip StackLight deployment. Note You can also enable, disable, or configure StackLight parameters after deploying a managed cluster. For details, see Change a cluster configuration or Configure StackLight. |
Enable Logging |
Select to deploy the StackLight logging stack. For details about the logging components, see Reference Architecture: StackLight deployment architecture. |
|
HA Mode |
Select to enable StackLight monitoring in the HA mode. For the differences between HA and non-HA modes, see Reference Architecture: StackLight deployment architecture. |
|
Elasticsearch |
Retention Time |
The Elasticsearch logs retention period in Logstash. |
Persistent Volume Claim Size |
The Elasticsearch persistent volume claim size. |
|
Prometheus |
Retention Time |
The Prometheus database retention period. |
Retention Size |
The Prometheus database retention size. |
|
Persistent Volume Claim Size |
The Prometheus persistent volume claim size. |
|
Enable Watchdog Alert |
Select to enable the Watchdog alert that fires as long as the entire alerting pipeline is functional. |
|
Custom Alerts |
Specify alerting rules for new custom alerts or upload a YAML file in the following exemplary format: - alert: HighErrorRate
expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
for: 10m
labels:
severity: page
annotations:
summary: High request latency
For details, see Official Prometheus documentation: Alerting rules. For the list of the predefined StackLight alerts, see Operations Guide: Available StackLight alerts. |
|
StackLight Email Alerts |
Enable Email Alerts |
Select to enable the StackLight email alerts. |
Send Resolved |
Select to enable notifications about resolved StackLight alerts. |
|
Require TLS |
Select to enable transmitting emails through TLS. |
|
Email alerts configuration for StackLight |
Fill out the following email alerts parameters as required:
|
|
StackLight Slack Alerts |
Enable Slack alerts |
Select to enable the StackLight Slack alerts. |
Send Resolved |
Select to enable notifications about resolved StackLight alerts. |
|
Slack alerts configuration for StackLight |
Fill out the following Slack alerts parameters as required:
|
Click Create.
To view the deployment status, verify the cluster status on the Clusters page. Once the orange blinking dot near the cluster name disappears, the deployment is complete.
Now, proceed to Add a bare metal host.
This section describes how to add a bare metal host to a newly created managed cluster using either the Container Cloud web UI or CLI for an advanced configuration.
After you create a managed cluster as described in Create a managed cluster, proceed with adding a bare metal host through the Mirantis Container Cloud web UI using the instruction below.
Before you proceed with adding a bare metal host:
Verify that the physical network on the server has been configured correctly. See Reference Architecture: Network fabric for details.
Enable the boot NIC support for UEFI load. Usually, at least the built-in network interfaces support it.
Enable the UEFI-LAN-OPROM support in BIOS -> Advanced -> PCIPCIe.
Enable the IPv4-PXE stack.
Set the following boot order:
UEFI-DISK
UEFI-PXE
If your PXE network is not configured to use the first network interface, fix the UEFI-PXE boot order to speed up node discovery by selecting only one required network interface.
Power off all bare metal hosts.
Warning
Only one Ethernet port on a host must be connected to the Common/PXE network at any given time. The physical address (MAC) of this interface must be noted and used to configure the BareMetalHost object describing the host.
To add a bare metal host to a baremetal-based managed cluster:
Log in to the Container Cloud web UI with the operator permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Baremetal tab, click Add BM host.
Fill out the Add new BM host form as required:
Specify the name of the new bare metal host.
Specify the name of the user for accessing the BMC (IPMI user).
Specify the password of the user for accessing the BMC (IPMI password).
Specify the MAC address of the PXE network interface.
Specify the IP address to access the BMC.
Assign the machine label to the new host that defines which type of machine may be deployed on this bare metal host. Only one label can be assigned to a host. The supported labels include:
This label is selected and set by default. Assign this label to the bare metal hosts that can be used to deploy machines with the manager type. These hosts must match the CPU and RAM requirements described in Reference Architecture: Reference hardware configuration.

The host with this label may be used to deploy the worker machine type. Assign this label to the bare metal hosts that have sufficient CPU and RAM resources, as described in Reference Architecture: Reference hardware configuration.
Assign this label to the bare metal hosts that have sufficient storage devices to match Reference Architecture: Reference hardware configuration. Hosts with this label will be used to deploy machines with the storage type that run Ceph OSDs.
Click Create.
While adding the bare metal host, Container Cloud discovers and inspects the hardware of the bare metal host and adds it to BareMetalHost.status for future reference.
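If you also want to review the discovered hardware details from the CLI, you can inspect the status section of the corresponding BareMetalHost object. The command below is an illustrative sketch; the object and project names are placeholders:

kubectl --kubeconfig <pathToManagementClusterKubeconfig> get baremetalhost <bare-metal-host-name> -n <managed-cluster-project-name> -o yaml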
Now, you can proceed to Create a machine using web UI.
After you create a managed cluster as described in Create a managed cluster, proceed with adding bare metal hosts through the Mirantis Container Cloud CLI as described below.
To add a bare metal host using API:
Verify that you configured each bare metal host as follows:
Enable the boot NIC support for UEFI load. Usually, at least the built-in network interfaces support it.
Enable the UEFI-LAN-OPROM support in BIOS -> Advanced -> PCIPCIe.
Enable the IPv4-PXE stack.
Set the following boot order:
UEFI-DISK
UEFI-PXE
If your PXE network is not configured to use the first network interface, fix the UEFI-PXE boot order to speed up node discovery by selecting only one required network interface.
Power off all bare metal hosts.
Warning
Only one Ethernet port on a host must be connected to the Common/PXE network at any given time. The physical address (MAC) of this interface must be noted and used to configure the BareMetalHost object describing the host.
Log in to the host where your management cluster kubeconfig is located and where kubectl is installed.
Create a secret YAML file that describes the unique credentials of the new bare metal host.
Example of the bare metal host secret:
apiVersion: v1
data:
password: <credentials-password>
username: <credentials-user-name>
kind: Secret
metadata:
labels:
kaas.mirantis.com/credentials: "true"
kaas.mirantis.com/provider: baremetal
kaas.mirantis.com/region: region-one
name: <credentials-name>
namespace: <managed-cluster-project-name>
type: Opaque
In the data section, add the IPMI user name and password in the base64 encoding to access the BMC. To obtain the base64-encoded credentials, you can use the following command in your Linux console:
echo -n <username|password> | base64
Caution
Each bare metal host must have a unique Secret.
Apply this secret YAML file to your deployment:
kubectl apply -f <bmh-cred-file-name>.yaml
Create a YAML file that contains a description of the new bare metal host.
Example of the bare metal host configuration file with the worker role:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
labels:
kaas.mirantis.com/baremetalhost-id: <unique-bare-metal-host-hardware-node-id>
hostlabel.bm.kaas.mirantis.com/worker: "true"
kaas.mirantis.com/provider: baremetal
kaas.mirantis.com/region: region-one
name: <bare-metal-host-unique-name>
namespace: <managed-cluster-project-name>
spec:
bmc:
address: <ip_address_for-bmc-access>
credentialsName: <credentials-name>
bootMACAddress: <bare-metal-host-boot-mac-address>
online: true
For a detailed fields description, see BareMetalHost.
Apply this configuration YAML file to your deployment:
kubectl apply -f <bare-metal-host-config-file-name>.yaml
Now, proceed with Deploy a machine to a specific bare metal host.
This section describes how to add a machine to a newly created managed cluster using either the Mirantis Container Cloud web UI or CLI for an advanced configuration.
After you add a bare metal host to the managed cluster as described in Add a bare metal host using web UI, you can create a Kubernetes machine in your cluster using the Mirantis Container Cloud web UI.
To add a Kubernetes machine to a baremetal-based managed cluster:
Log in to the Mirantis Container Cloud web UI with the operator or writer permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name. The cluster page with the Machines list opens.
Click the Create Machine button.
Fill out the Create New Machine form as required:
Specify the number of machines to add.
Select Manager or Worker to create a Kubernetes manager or worker node. The required minimum number of machines is three for the manager nodes HA and two for the Container Cloud workloads.
Assign the role to the new machine(s) to link the machine to a previously created bare metal host with the corresponding label. You can assign one role type per machine. The supported labels include:
The default role for any node in a managed cluster. Only the kubelet service is running on the machines of this type.
This node hosts the manager services of a managed cluster. For the reliability reasons, Container Cloud does not permit running end user workloads on the manager nodes or use them as storage nodes.
This node is a worker node that also hosts Ceph OSDs and provides its disk resources to Ceph. Container Cloud permits end users to run workloads on storage nodes by default.
Select the required node labels for the machine to run certain components on a specific node. For example, for the StackLight nodes that run Elasticsearch and require more resources than a standard node, select the StackLight label. The list of available node labels is obtained from your current Cluster release.
Caution
If you deploy StackLight in the HA mode (recommended), add the StackLight label to minimum three nodes.
Note
You can configure node labels after deploying a machine. On the Machines page, click the More action icon in the last column of the required machine field and select Configure machine.
Click Create.
At this point, Container Cloud adds the new machine object to the specified managed cluster, and the Bare Metal Operator controller creates the relation to BareMetalHost with the labels matching the roles.
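If you prefer to track the provisioning progress from the CLI as well, you can list the machine objects in the cluster project. The commands below are an illustrative sketch with placeholder names:

# List the machines of the managed cluster project
kubectl --kubeconfig <pathToManagementClusterKubeconfig> get machines -n <managed-cluster-project-name>

# Inspect a specific machine, including its provider specification
kubectl --kubeconfig <pathToManagementClusterKubeconfig> get machine <machine-name> -n <managed-cluster-project-name> -o yaml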
Provisioning of the newly created machine starts when the machine object is created and includes the following stages:
Creation of partitions on the local disks as required by the operating system and the Container Cloud architecture.
Configuration of the network interfaces on the host as required by the operating system and the Container Cloud architecture.
Installation and configuration of the Container Cloud LCM agent.
Now, proceed to Add a Ceph cluster.
This section describes a bare metal host and machine configuration using Mirantis Container Cloud CLI.
A Kubernetes machine requires a dedicated bare metal host for deployment.
The bare metal hosts are represented by the BareMetalHost objects in the Kubernetes API. All BareMetalHost objects are labeled by the Operator when created. A label reflects the hardware capabilities of a host. As a result of labeling, all bare metal hosts are divided into three types: Control Plane, Worker, and Storage.
In some cases, you may need to deploy a machine to a specific bare metal host. This is especially useful when some of your bare metal hosts have different hardware configuration than the rest.
To deploy a machine to a specific bare metal host:
Log in to the host where your management cluster kubeconfig is located and where kubectl is installed.
Identify the bare metal host that you want to associate with the specific machine. For example, host host-1.
kubectl get baremetalhost host-1 -o yaml
Add a label that will uniquely identify this host, for example, by the name of the host and machine that you want to deploy on it.
Caution
Do not remove any existing labels from the BareMetalHost resource.
For more details about labels, see BareMetalHost.
kubectl edit baremetalhost host-1
Configuration example:
kind: BareMetalHost
metadata:
name: host-1
namespace: myProjectName
labels:
kaas.mirantis.com/baremetalhost-id: host-1-worker-HW11-cad5
...
Create a new text file with the YAML definition of the Machine object, as defined in Machine.
Add a label selector that matches the label you have added to the BareMetalHost object in the previous step.
Example:
kind: Machine
metadata:
name: worker-HW11-cad5
namespace: myProjectName
spec:
...
providerSpec:
value:
apiVersion: baremetal.k8s.io/v1alpha1
kind: BareMetalMachineProviderSpec
...
hostSelector:
matchLabels:
kaas.mirantis.com/baremetalhost-id: host-1-worker-HW11-cad5
...
Specify the details of the machine configuration in the object created in the previous step. For example:
Add a reference to a custom BareMetalHostProfile object, as defined in Machine.
Specify an override for the ordering and naming of the NICs for the machine. For details, see Override network interfaces naming and order.
If you use a specific L2 template for the machine, set the unique name or label of the corresponding L2 template in the L2templateSelector section of the Machine object.
Add the configured machine to the cluster:
kubectl create -f worker-HW11-cad5.yaml
Once done, this machine will be associated with the specified bare metal host.
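To verify the association from the CLI, you can compare the host selector of the machine with the labels of the bare metal host. The commands below are a sketch that reuses the names from this example:

kubectl get machine worker-HW11-cad5 -n myProjectName -o yaml | grep -A 2 matchLabels
kubectl get baremetalhost host-1 -n myProjectName --show-labels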
An L2 template contains the ifMapping field that allows you to identify Ethernet interfaces for the template. The Machine object API enables the Operator to override the mapping from the L2 template by enforcing a specific order of names of the interfaces when applied to the template.

The l2TemplateIfMappingOverride field in the spec of the Machine object contains a list of interface names. The order of the interface names in the list is important because the L2Template object will be rendered with NICs ordered as per this list.
Note

Changes in the l2TemplateIfMappingOverride field will apply only once when the Machine and corresponding IpamHost objects are created. Further changes to l2TemplateIfMappingOverride will not reset the interfaces assignment and configuration.

Caution

The l2TemplateIfMappingOverride field must contain the names of all interfaces of the bare metal host.
The following example illustrates how to include the override field in the Machine object. In this example, we configure the interface eno1, which is the second on-board interface of the server, to precede the first on-board interface eno0.
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
finalizers:
- foregroundDeletion
- machine.cluster.sigs.k8s.io
labels:
cluster.sigs.k8s.io/cluster-name: kaas-mgmt
cluster.sigs.k8s.io/control-plane: "true"
kaas.mirantis.com/provider: baremetal
kaas.mirantis.com/region: region-one
spec:
providerSpec:
value:
apiVersion: baremetal.k8s.io/v1alpha1
hostSelector:
matchLabels:
baremetal: hw-master-0
image: {}
kind: BareMetalMachineProviderSpec
l2TemplateIfMappingOverride:
- eno1
- eno0
- enp0s1
- enp0s2
As a result of the configuration above, when used with the example L2 template for bonds and bridges described in Create L2 templates, the enp0s1 and enp0s2 interfaces will be bonded, and that bond will be used to create subinterfaces for the Kubernetes pods network (k8s-pods) and for the Kubernetes external network (k8s-ext).
See also
After you add machines to your new bare metal managed cluster as described in Add a machine, you can create a Ceph cluster on top of this managed cluster using the Mirantis Container Cloud web UI.
For an advanced configuration through the KaaSCephCluster CR, see Ceph advanced configuration.
The procedure below enables you to create a Ceph cluster with minimum three Ceph nodes that provides persistent volumes to the Kubernetes workloads in the managed cluster.
To create a Ceph cluster in the managed cluster:
Log in to the Container Cloud web UI with the writer permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name. The Cluster page with the Machines and Ceph clusters lists opens.
In the Ceph Clusters block, click Create Cluster.
Configure the Ceph cluster in the Create New Ceph Cluster wizard that opens:
Section |
Parameter name |
Description |
---|---|---|
General settings |
Name |
The Ceph cluster name. |
Cluster Network |
Replication network for Ceph OSDs. Must contain the CIDR definition
and match the corresponding values of the cluster |
|
Public Network |
Public network for Ceph data. Must contain the CIDR definition and
match the corresponding values of the cluster |
|
Enable OSDs LCM |
Select to enable LCM for Ceph OSDs. |
|
Machines / Machine #1-3 |
Select machine |
Select the name of the Kubernetes machine that will host the corresponding Ceph node in the Ceph cluster. |
Manager, Monitor |
Select the required Ceph services to install on the Ceph node. |
|
Devices |
Select the disk that Ceph will use. Warning Do not select the device for system services,
for example, |
|
Enable Object Storage Available since 2.5.0 |
Select to enable the single-instance RGW Object Storage. |
To add more Ceph nodes to the new Ceph cluster, click + next to any Ceph Machine title in the Machines tab. Configure a Ceph node as required.
Warning
Do not add more than 3 Manager and/or Monitor services to the Ceph cluster.
After you add and configure all nodes in your Ceph cluster, click Create.
Once done, verify your Ceph cluster as described in Verify Ceph.
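In addition to the web UI, you can check the Ceph cluster resource from the CLI. The command below is an illustrative sketch; it assumes that the KaaSCephCluster object resides in the managed cluster project of the management cluster and that the resource name follows the usual lowercase form of the CR kind:

kubectl --kubeconfig <pathToManagementClusterKubeconfig> get kaascephcluster -n <managed-cluster-project-name> -o yaml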
Due to a development limitation in the baremetal operator, deletion of a managed cluster requires preliminary deletion of the worker machines running on the cluster.
Using the Container Cloud web UI, first delete worker machines one by one until you hit the minimum of 2 workers for an operational cluster. After that, you can delete the cluster with the remaining workers and managers.
To delete a baremetal-based managed cluster:
Log in to the Mirantis Container Cloud web UI with the writer permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name to open the list of machines running on it.
Click the More action icon in the last column of the worker machine you want to delete and select Delete. Confirm the deletion.
Repeat the step above until you have 2 workers left.
In the Clusters tab, click the More action icon in the last column of the required cluster and select Delete.
Verify the list of machines to be removed. Confirm the deletion.
Optional. If you do not plan to reuse the credentials of the deleted cluster, delete them:
In the Credentials tab, click the Delete credential action icon next to the name of the credentials to be deleted.
Confirm the deletion.
Warning
You can delete credentials only after deleting the managed cluster they relate to.
Deleting a cluster automatically frees up the resources allocated for this cluster, for example, instances, load balancers, networks, floating IPs, and so on.
By default, Mirantis Container Cloud configures a single interface on the cluster nodes, leaving all other physical interfaces intact.
With L2 networking templates, you can create advanced host networking configurations for your clusters. For example, you can create bond interfaces on top of physical interfaces on the host or use multiple subnets to separate different types of network traffic.
You can use several host-specific L2 templates per one cluster to support different hardware configurations. For example, you can create L2 templates with different number and layout of NICs to be applied to the specific machines of one cluster.
When you create a baremetal-based project, the exemplary templates with the ipam/PreInstalledL2Template label are copied to this project. These templates are preinstalled during the management cluster bootstrap.
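To see which exemplary templates were copied to your project, you can list the L2 templates by this label. The command below is an illustrative sketch; the exact label value may differ in your environment:

kubectl --kubeconfig <pathToManagementClusterKubeconfig> get l2template -n <projectName> -l ipam/PreInstalledL2Template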
Follow the procedures below to create L2 templates for your managed clusters.
Before creating an L2 template, ensure that you have the required subnets that can be used in the L2 template to allocate IP addresses for the managed cluster nodes.
Where required, create a number of subnets for a particular project using the Subnet CR. A subnet has three logical scopes:

global - CR uses the default namespace. A subnet can be used for any cluster located in any project.

namespaced - CR uses the namespace that corresponds to a particular project where managed clusters are located. A subnet can be used for any cluster located in the same project.

cluster - CR uses the namespace where the referenced cluster is located. A subnet is only accessible to the cluster that L2Template.spec.clusterRef refers to. The Subnet objects with the cluster scope will be created for every new cluster.
You can have subnets with the same name in different projects. In this case, the subnet that has the same project as the cluster will be used. One L2 template may reference several subnets, and those subnets may have different scopes.
The IP address objects (IPaddr CR) that are allocated from subnets always have the same project as their corresponding IpamHost objects, regardless of the subnet scope.
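To review which subnets are already available to a cluster, you can list the Subnet objects in the default namespace (global scope) and in the project namespace. A minimal sketch with placeholder values:

# Global subnets
kubectl --kubeconfig <pathToManagementClusterKubeconfig> get subnet -n default

# Subnets of a particular project
kubectl --kubeconfig <pathToManagementClusterKubeconfig> get subnet -n <projectName>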
To create subnets:
Log in to a local machine where your management cluster kubeconfig is located and where kubectl is installed.
Note
The management cluster kubeconfig is created during the last stage of the management cluster bootstrap.
Create the subnet.yaml file with a number of global or namespaced subnets:
kubectl --kubeconfig <pathToManagementClusterKubeconfig> apply -f <SubnetFileName.yaml>
Note
In the command above and in the steps below, substitute the parameters enclosed in angle brackets with the corresponding values.
Example of a subnet.yaml file:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
name: demo
namespace: demo-namespace
spec:
cidr: 10.11.0.0/24
gateway: 10.11.0.9
includeRanges:
- 10.11.0.5-10.11.0.70
nameservers:
- 172.18.176.6
Parameter |
Description |
---|---|
|
A valid IPv4 CIDR, for example, 10.11.0.0/24. |
|
A list of IP address ranges within the given CIDR that should be used
in the allocation of IPs for nodes (excluding the gateway address).
The IPs outside the given ranges will not be used in the allocation.
Each element of the list can be either an interval 10.11.0.5-10.11.0.70
or a single address 10.11.0.77. In the example above, the addresses
|
|
A list of IP address ranges within the given CIDR that should not
be used in the allocation of IPs for nodes. The IPs within the given CIDR
but outside the given ranges will be used in the allocation
(excluding gateway address). Each element of the list can be either
an interval 10.11.0.5-10.11.0.70 or a single address 10.11.0.77.
The |
|
If set to |
|
A valid gateway address, for example, 10.11.0.9. |
|
A list of the IP addresses of name servers. Each element of the list is a single address, for example, 172.18.176.6. |
Caution
The subnet for the PXE network is automatically created during deployment and must contain the ipam/DefaultSubnet: "1" label. Each bare metal region must have only one subnet with this label.
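To identify the automatically created PXE subnet, you can filter the Subnet objects by this label. The command below is an illustrative sketch:

kubectl --kubeconfig <pathToManagementClusterKubeconfig> get subnet --all-namespaces -l ipam/DefaultSubnet=1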
The following labels in metadata describe or change the subnet functioning:
Parameter |
Description |
---|---|
|
UID of the cluster that the subnet belongs to. In most cases, this label
is automatically set by the |
|
When set to Caution Using of a dedicated network for Kubernetes pods traffic and using of a dedicated network for external connection to the Kubernetes services exposed by the cluster described above are available as Technology Preview. Use such configurations for testing and evaluation purposes only. For details about the Mirantis Technology Preview support scope, see the Preface section of this guide. The following feature is still under development and will be announced in one of the following Container Cloud releases:
|
Verify that the subnet is successfully created:
kubectl get subnet kaas-mgmt -oyaml
In the system output, verify the status fields of the subnet.yaml file using the table below.
Parameter |
Description |
---|---|
|
Contains a short state description and a more detailed one if applicable. The short status values are as follows:
|
|
Reflects the actual CIDR, has the same meaning as |
|
Reflects the actual gateway, has the same meaning as |
|
Reflects the actual name servers, has same meaning as |
|
Specifies the address ranges that are calculated using the fields from
|
|
Includes the date and time of the latest update of the |
|
Includes the number of currently available IP addresses that can be allocated for nodes from the subnet. |
|
Specifies the list of IPv4 addresses with the corresponding |
|
Contains the total number of IP addresses being held by ranges that equals to a sum
of the |
|
Contains the version of the |
Example of a successfully created subnet:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
labels:
ipam/UID: 6039758f-23ee-40ba-8c0f-61c01b0ac863
kaas.mirantis.com/provider: baremetal
kaas.mirantis.com/region: region-one
name: kaas-mgmt
namespace: default
spec:
cidr: 10.0.0.0/24
excludeRanges:
- 10.0.0.100
- 10.0.0.101-10.0.0.120
gateway: 10.0.0.1
includeRanges:
- 10.0.0.50-10.0.0.90
nameservers:
- 172.18.176.6
status:
allocatable: 38
allocatedIPs:
- 10.0.0.50:0b50774f-ffed-11ea-84c7-0242c0a85b02
- 10.0.0.51:1422e651-ffed-11ea-84c7-0242c0a85b02
- 10.0.0.52:1d19912c-ffed-11ea-84c7-0242c0a85b02
capacity: 41
cidr: 10.0.0.0/24
gateway: 10.0.0.1
lastUpdate: "2020-09-26T11:40:44Z"
nameservers:
- 172.18.176.6
ranges:
- 10.0.0.50-10.0.0.90
statusMessage: OK
versionIpam: v3.0.999-20200807-130909-44151f8
Proceed to creating an L2 template for one or multiple managed clusters as described in Create L2 templates.
Before creating an L2 template, ensure that you have the required subnets that can be used in the L2 template to allocate IP addresses for the managed cluster nodes. You can also create multiple subnets using the SubnetPool object to separate different types of network traffic.
SubnetPool allows for automatic creation of Subnet objects that will consume blocks from the parent SubnetPool CIDR IP address range. The SubnetPool blockSize setting defines the IP address block size to allocate to each child Subnet. SubnetPool has a global scope, so any SubnetPool can be used to create the Subnet objects for any namespace and for any cluster.
To automate multiple subnet creation using SubnetPool:
Log in to a local machine where your management cluster kubeconfig is located and where kubectl is installed.
Note
The management cluster kubeconfig is created during the last stage of the management cluster bootstrap.
Create the subnetpool.yaml file with a number of subnet pools:
Note
You can define either or both subnets and subnet pools, depending on the use case. A single L2 template can use either or both subnets and subnet pools.
kubectl --kubeconfig <pathToManagementClusterKubeconfig> apply -f <SubnetFileName.yaml>
Note
In the command above and in the steps below, substitute the parameters enclosed in angle brackets with the corresponding values.
Example of a subnetpool.yaml file:
apiVersion: ipam.mirantis.com/v1alpha1
kind: SubnetPool
metadata:
name: kaas-mgmt
namespace: default
labels:
kaas.mirantis.com/provider: baremetal
kaas.mirantis.com/region: region-one
spec:
cidr: 10.10.0.0/16
blockSize: /25
nameservers:
- 172.18.176.6
gatewayPolicy: first
For the specification fields description of the SubnetPool object, see SubnetPool spec.
Verify that the subnet pool is successfully created:
kubectl get subnetpool kaas-mgmt -oyaml
In the system output, verify the status fields of the subnetpool.yaml file. For the status fields description of the SubnetPool object, see SubnetPool status.
Proceed to creating an L2 template for one or multiple managed clusters as described in Create L2 templates. In this procedure, select the exemplary L2 template for multiple subnets that contains the l3Layout section.
Caution
When using the l3Layout section, define all subnets of the cluster in it; defining only a part of the subnets is not allowed. Otherwise, do not use the l3Layout section.
After you create subnets for one or more managed clusters or projects as described in Create subnets or Automate multiple subnet creation using SubnetPool, follow the procedure below to create L2 templates for a managed cluster. This procedure contains exemplary L2 templates for the following use cases:
This section contains an exemplary L2 template that demonstrates how to set up bonds and bridges on hosts for your managed clusters as described in Create L2 templates.
Starting from Container Cloud 2.4.0, if you want to use a dedicated network for Kubernetes pods traffic, configure each node with an IPv4 and/or IPv6 address that will be used to route the pods traffic between nodes. To accomplish that, use the npTemplate.bridges.k8s-pods bridge in the L2 template, as demonstrated in the example below. This bridge name is reserved for the Kubernetes pods network. When the k8s-pods bridge is defined in an L2 template, Calico CNI uses that network for routing the pods traffic between nodes.
Starting from Container Cloud 2.5.0, you can use a dedicated network for external connection to the Kubernetes services exposed by the cluster. If enabled, MetalLB will listen and respond on the dedicated virtual bridge. To accomplish that, configure each node where metallb-speaker is deployed with an IPv4 or IPv6 address. Both the MetalLB IP address ranges and the IP addresses configured on those nodes must fit in the same CIDR. Use the npTemplate.bridges.k8s-ext bridge in the L2 template, as demonstrated in the example below. This bridge name is reserved for the Kubernetes external network. The Subnet object that corresponds to the k8s-ext bridge must have explicitly excluded IP address ranges that are in use by MetalLB.
Caution
Using of a dedicated network for Kubernetes pods traffic and using of a dedicated network for external connection to the Kubernetes services exposed by the cluster described above are available as Technology Preview. Use such configurations for testing and evaluation purposes only. For details about the Mirantis Technology Preview support scope, see the Preface section of this guide.
The following feature is still under development and will be announced in one of the following Container Cloud releases:
Switching Kubernetes API to listen to the specified IP address on the node
Example of an L2 template with interfaces bonding:
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
name: test-managed
namespace: managed-ns
spec:
clusterRef: managed-cluster
autoIfMappingPrio:
- provision
- eno
- ens
- enp
npTemplate: |
version: 2
ethernets:
ten10gbe0s0:
dhcp4: false
dhcp6: false
match:
macaddress: {{mac 2}}
set-name: {{nic 2}}
ten10gbe0s1:
dhcp4: false
dhcp6: false
match:
macaddress: {{mac 3}}
set-name: {{nic 3}}
bonds:
bond0:
interfaces:
- ten10gbe0s0
- ten10gbe0s1
bridges:
k8s-ext:
interfaces: [bond0]
addresses:
- {{ip "k8s-ext:demo-ext"}}
k8s-pods:
interfaces: [bond0]
addresses:
- {{ip "k8s-pods:demo-pods"}}
This section contains an exemplary L2 template for automatic multiple subnet creation as described in Automate multiple subnet creation using SubnetPool. This template also contains the l3Layout section that allows defining the Subnet scopes and enables optional auto-creation of the Subnet objects from the SubnetPool objects.
For details on how to create L2 templates, see Create L2 templates.
Caution
Do not assign an IP address to the PXE nic 0 NIC explicitly to prevent the IP duplication during updates. The IP address is automatically assigned by the bootstrapping engine.
Example of an L2 template for multiple subnets:
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
name: test-managed
namespace: managed-ns
spec:
clusterRef: managed-cluster
autoIfMappingPrio:
- provision
- eno
- ens
- enp
l3Layout:
- subnetName: pxe-subnet
scope: global
- subnetName: subnet-1
subnetPool: kaas-mgmt
scope: namespace
- subnetName: subnet-2
subnetPool: kaas-mgmt
scope: cluster
npTemplate: |
version: 2
ethernets:
onboard1gbe0:
dhcp4: false
dhcp6: false
match:
macaddress: {{mac 0}}
set-name: {{nic 0}}
# IMPORTANT: do not assign an IP address here explicitly
# to prevent IP duplication issues. The IP will be assigned
# automatically by the bootstrapping engine.
# addresses: []
onboard1gbe1:
dhcp4: false
dhcp6: false
match:
macaddress: {{mac 1}}
set-name: {{nic 1}}
ten10gbe0s0:
dhcp4: false
dhcp6: false
match:
macaddress: {{mac 2}}
set-name: {{nic 2}}
addresses:
- {{ip "2:subnet-1"}}
ten10gbe0s1:
dhcp4: false
dhcp6: false
match:
macaddress: {{mac 3}}
set-name: {{nic 3}}
addresses:
- {{ip "3:subnet-2"}}
In the template above, the following networks are defined in the l3Layout section:

pxe-subnet - the global PXE network that already exists. The subnet name must refer to the PXE subnet created for the region.

subnet-1 - unless already created, this subnet will be created from the kaas-mgmt subnet pool. The subnet name must be unique within the project. This subnet is shared between the project clusters.

subnet-2 - will be created from the kaas-mgmt subnet pool. This subnet has the cluster scope. Therefore, the real name of the Subnet CR object consists of the subnet name defined in l3Layout and the cluster UID. However, the npTemplate section of the L2 template must contain only the subnet name defined in l3Layout. The subnets of the cluster scope are not shared between clusters.

Caution

When using the l3Layout section, define all subnets of the cluster in it; defining only a part of the subnets is not allowed. Otherwise, do not use the l3Layout section.
To create an L2 template for a new managed cluster:
Log in to a local machine where your management cluster kubeconfig is located and where kubectl is installed.
Note
The management cluster kubeconfig is created during the last stage of the management cluster bootstrap.
Inspect the existing L2 templates to select the one that fits your deployment:
kubectl --kubeconfig <pathToManagementClusterKubeconfig> \
get l2template -n <ProjectNameForNewManagedCluster>
Create an L2 YAML template specific to your deployment using one of the exemplary templates:
Note

You can create several L2 templates with different configurations to be applied to different nodes of the same cluster. In this case:

First create the default L2 template for a cluster. It will be used for machines that do not have L2templateSelector. Verify that the unique ipam/DefaultForCluster label is added to the first L2 template of the cluster.

Set a unique name and add a unique label to the metadata section of each L2 template of the cluster.

To select a particular L2 template for a machine, use either the L2 template name or label in the L2templateSelector section of the corresponding machine configuration file, as shown in the sketch after this note. If you use an L2 template for only one machine, set name. For a group of machines, set label. For details about configuration of machines, see Deploy a machine to a specific bare metal host.
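The following snippet illustrates where the selector fits in the Machine object. It is a sketch only: the field is shown as l2TemplateSelector, set either name or label (not both), and verify the exact field names against the Machine API reference before use:

spec:
  providerSpec:
    value:
      apiVersion: baremetal.k8s.io/v1alpha1
      kind: BareMetalMachineProviderSpec
      l2TemplateSelector:
        name: <l2-template-name>
        # or, for a group of machines, select by label instead:
        # label: <l2-template-label>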
Add or edit the mandatory parameters in the new L2 template. The following tables provide the description of the mandatory and the l3Layout section parameters in the example templates mentioned in the previous step.
Parameter |
Description |
---|---|
|
References the Cluster object that this template is applied to.
The Caution
|
|
|
|
A netplan-compatible configuration with special lookup functions
that defines the networking settings for the cluster hosts,
where physical NIC names and details are parameterized.
This configuration will be processed using Go templates.
Instead of specifying IP and MAC addresses, interface names,
and other network details specific to a particular host,
the template supports use of special lookup functions.
These lookup functions, such as Caution All rules and restrictions of the netplan configuration also apply to L2 templates. For details, see the official netplan documentation. |
For more details about the L2Template custom resource (CR), see the L2Template API section.
Parameter |
Description |
---|---|
|
Name of the |
|
Optional. Default: none. Name of the parent |
|
Logical scope of the
|
The following table describes the main lookup functions for an L2 template.
Lookup function |
Description |
---|---|
|
Name of a NIC number N. NIC numbers correspond to the interface mapping list. |
|
MAC address of a NIC number N registered during a host hardware inspection. |
|
IP address and mask for a NIC number N. The address will be auto-allocated from the given subnet if the address does not exist yet. |
|
IP address and mask for a virtual interface, |
|
IPv4 default gateway address from the given subnet. |
|
List of the IP addresses of name servers from the given subnet. |
Note
Every subnet referenced in an L2 template can have either a global or namespaced scope. In the latter case, the subnet must exist in the same project where the corresponding cluster and L2 template are located.
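The snippet below shows how some of these lookup functions are typically combined in an npTemplate, reusing the functions that already appear in the examples above. It is illustrative only; the NIC number and subnet name are placeholders:

ten10gbe0s0:
  dhcp4: false
  dhcp6: false
  match:
    macaddress: {{mac 2}}        # MAC of NIC 2 registered during hardware inspection
  set-name: {{nic 2}}            # name of NIC 2 as per the interface mapping list
  addresses:
    - {{ip "2:<subnet-name>"}}   # address auto-allocated from <subnet-name>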
Add the L2 template to your management cluster:
kubectl --kubeconfig <pathToManagementClusterKubeconfig> apply -f <pathToL2TemplateYamlFile>
Optional. Further modify the template:
kubectl --kubeconfig <pathToManagementClusterKubeconfig> \
-n <ProjectNameForNewManagedCluster> edit l2template <L2templateName>
Proceed with creating a managed cluster as described in Create a managed cluster. The resulting L2 template will be used to render the netplan configuration for the managed cluster machines.
The workflow of the netplan configuration using an L2 template is as follows:
The kaas-ipam service uses the data from BareMetalHost, the L2 template, and subnets to generate the netplan configuration for every cluster machine.

The generated netplan configuration is saved in the status.netconfigV2 section of the IpamHost resource. If the status.l2RenderResult field of the IpamHost resource is OK, the configuration was rendered in the IpamHost resource successfully. Otherwise, the status contains an error message.

The baremetal-provider service copies data from the status.netconfigV2 of IpamHost to the Spec.StateItemsOverwrites['deploy']['bm_ipam_netconfigv2'] parameter of LCMMachine.

The lcm-agent service on every host synchronizes the LCMMachine data to its host. The lcm-agent service runs a playbook to update the netplan configuration on the host during the pre-download and deploy phases.
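To check the rendering result for a particular host, you can inspect the corresponding IpamHost object. The commands below are an illustrative sketch with placeholder names; they assume the resource name follows the lowercase form of the CR kind:

# Verify the rendering status
kubectl --kubeconfig <pathToManagementClusterKubeconfig> get ipamhost <hostName> -n <projectName> -o jsonpath='{.status.l2RenderResult}'

# Review the generated netplan configuration
kubectl --kubeconfig <pathToManagementClusterKubeconfig> get ipamhost <hostName> -n <projectName> -o jsonpath='{.status.netconfigV2}'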
The bare metal host profile is a Kubernetes custom resource. It allows the operator to define how the storage devices and the operating system are provisioned and configured.
This section describes the bare metal host profile default settings and configuration of custom profiles for managed clusters using Mirantis Container Cloud API. This procedure also applies to a management cluster with a few differences described in Deployment Guide: Customize the default bare metal host profile.
The default host profile requires three storage devices in the following strict order:
This device contains boot data and operating system data. It is partitioned using the GUID Partition Table (GPT) labels. The root file system is an ext4 file system created on top of an LVM logical volume. For a detailed layout, refer to the table below.

This device contains an ext4 file system with directories mounted as persistent volumes to Kubernetes. These volumes are used by the Mirantis Container Cloud services to store its data, including monitoring and identity databases.

This device is used as a Ceph datastore or Ceph OSD.
The following table summarizes the default configuration of the host system storage set up by the Container Cloud bare metal management.
Device/partition |
Name/Mount point |
Recommended size, GB |
Description |
---|---|---|---|
|
|
4 MiB |
The mandatory GRUB boot partition required for non-UEFI systems. |
|
|
0.2 GiB |
The boot partition required for the UEFI boot mode. |
|
|
64 MiB |
The mandatory partition for the |
|
|
100% of the remaining free space in the LVM volume group |
The main LVM physical volume that is used to create the root file system. |
|
|
100% of the remaining free space in the LVM volume group |
The LVM physical volume that is used to create the file system
for |
|
|
100% of the remaining free space in the LVM volume group |
Clean raw disk that will be used for the Ceph storage back end. |
If required, you can customize the default host storage configuration. For details, see Create a custom host profile.
In addition to the default BareMetalHostProfile object installed with Mirantis Container Cloud, you can create custom profiles for managed clusters using Container Cloud API.
Note
The procedure below also applies to the Container Cloud management clusters.
To create a custom bare metal host profile:
Select from the following options:
For a management cluster, log in to the bare metal seed node that will be used to bootstrap the management cluster.
For a managed cluster, log in to the local machine where your management cluster kubeconfig is located and where kubectl is installed.
Note
The management cluster kubeconfig is created automatically during the last stage of the management cluster bootstrap.
Select from the following options:
For a management cluster, open templates/bm/baremetalhostprofiles.yaml.template for editing.
For a managed cluster, create a new bare metal host profile under the templates/bm/ directory.
Edit the host profile using the example template below to meet your hardware configuration requirements:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHostProfile
metadata:
name: <PROFILE_NAME>
namespace: <PROJECT_NAME>
spec:
devices:
# From the HW node, obtain the first device, which size is at least 60Gib
- device:
minSizeGiB: 60
wipe: true
partitions:
- name: bios_grub
partflags:
- bios_grub
sizeGiB: 0.00390625
wipe: true
- name: uefi
partflags:
- esp
sizeGiB: 0.2
wipe: true
- name: config-2
sizeGiB: 0.0625
wipe: true
- name: lvm_root_part
sizeGiB: 0
wipe: true
# From the HW node, obtain the second device, which size is at least 60Gib
# If a device exists but does not fit the size,
# the BareMetalHostProfile will not be applied to the node
- device:
minSizeGiB: 30
wipe: true
# From the HW node, obtain the disk device with the exact name
- device:
byName: /dev/nvme0n1
minSizeGiB: 30
wipe: true
partitions:
- name: lvm_lvp_part
sizeGiB: 0
wipe: true
# Example of wiping a device w\o partitioning it.
# Mandatory for the case when a disk is supposed to be used for Ceph back end
# later
- device:
byName: /dev/sde
wipe: true
fileSystems:
- fileSystem: vfat
partition: config-2
- fileSystem: vfat
mountPoint: /boot/efi
partition: uefi
- fileSystem: ext4
logicalVolume: root
mountPoint: /
- fileSystem: ext4
logicalVolume: lvp
mountPoint: /mnt/local-volumes/
logicalVolumes:
- name: root
sizeGiB: 0
vg: lvm_root
- name: lvp
sizeGiB: 0
vg: lvm_lvp
postDeployScript: |
#!/bin/bash -ex
echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
preDeployScript: |
#!/bin/bash -ex
echo $(date) 'pre_deploy_script done' >> /root/pre_deploy_done
volumeGroups:
- devices:
- partition: lvm_root_part
name: lvm_root
- devices:
- partition: lvm_lvp_part
name: lvm_lvp
grubConfig:
defaultGrubOptions:
- GRUB_DISABLE_RECOVERY="true"
- GRUB_PRELOAD_MODULES=lvm
- GRUB_TIMEOUT=20
kernelParameters:
sysctl:
kernel.panic: "900"
kernel.dmesg_restrict: "1"
kernel.core_uses_pid: "1"
fs.file-max: "9223372036854775807"
fs.aio-max-nr: "1048576"
fs.inotify.max_user_instances: "4096"
vm.max_map_count: "262144"
Add or edit the mandatory parameters in the new BareMetalHostProfile object. For the parameters description, see API: BareMetalHostProfile spec.
Select from the following options:
For a management cluster, proceed with the cluster bootstrap procedure as described in Deployment Guide: Bootstrap a management cluster.
For a managed cluster:
Add the bare metal host profile to your management cluster:
kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> apply -f <pathToBareMetalHostProfileFile>
If required, further modify the host profile:
kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> edit baremetalhostprofile <hostProfileName>
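To confirm that the profile is available in the project before you reference it from machines, you can list the host profiles. A minimal sketch:

kubectl --kubeconfig <pathToManagementClusterKubeconfig> get baremetalhostprofile -n <projectName>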
Proceed with creating a managed cluster as described in Create a managed cluster.
The BareMetalHostProfile API allows configuring a host to use the huge pages feature of the Linux kernel on managed clusters.
Note
Huge pages is a mode of operation of the Linux kernel. With huge pages enabled, the kernel allocates the RAM in bigger chunks, or pages. This allows a KVM (kernel-based virtual machine) and VMs running on it to use the host RAM more efficiently and improves the performance of VMs.
To enable huge pages in a custom bare metal host profile for a managed cluster:
Log in to the local machine where your management cluster kubeconfig is located and where kubectl is installed.
Note
The management cluster kubeconfig is created automatically during the last stage of the management cluster bootstrap.
Open for editing or create a new bare metal host profile under the templates/bm/ directory.
Edit the grubConfig section of the host profile spec using the example below to configure the kernel boot parameters and enable huge pages:
spec:
grubConfig:
defaultGrubOptions:
- GRUB_DISABLE_RECOVERY="true"
- GRUB_PRELOAD_MODULES=lvm
- GRUB_TIMEOUT=20
- GRUB_CMDLINE_LINUX_DEFAULT="hugepagesz=1G hugepages=N"
The example configuration above will allocate N huge pages of 1 GB each on the server boot. The last hugepagesz parameter value is the default unless default_hugepagesz is defined. For details about possible values, see the official Linux kernel documentation.
Add the bare metal host profile to your management cluster:
kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> apply -f <pathToBareMetalHostProfileFile>
If required, further modify the host profile:
kubectl --kubeconfig <pathToManagementClusterKubeconfig> -n <projectName> edit baremetalhostprofile <hostProfileName>
Proceed with creating a managed cluster as described in Create a managed cluster.
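Once a machine deployed with this profile is up, you can verify on the node that the kernel allocated the requested huge pages. This is a generic Linux check rather than a Container Cloud command:

grep -i hugepages /proc/meminfo
cat /proc/cmdline   # must contain the configured hugepagesz and hugepages parameters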
After bootstrapping your OpenStack-based Mirantis Container Cloud management cluster as described in Deployment Guide: Deploy an OpenStack-based management cluster, you can create the OpenStack-based managed clusters using the Container Cloud web UI.
This section describes how to create an OpenStack-based managed cluster using the Mirantis Container Cloud web UI of the OpenStack-based management cluster.
To create an OpenStack-based managed cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the SSH Keys tab, click Add SSH Key to upload the public SSH key that will be used for the OpenStack VMs creation.
In the Credentials tab:
Click Add Credential to add your OpenStack credentials.
You can either upload your OpenStack clouds.yaml
configuration
file or fill in the fields manually.
Verify that the new credentials status is Ready. If the status is Error, hover over the status to determine the reason for the issue.
Available since 2.5.0 Optional. In the Proxies tab, enable proxy access to the managed cluster:
Click Add Proxy.
In the Add New Proxy wizard, fill out the form with the following parameters:
Parameter |
Description |
---|---|
Proxy Name |
Name of the proxy server to use during a managed cluster creation. |
Region |
From the drop-down list, select the required region. |
HTTP Proxy |
Add the HTTP proxy server domain name in the following format:
|
HTTPS Proxy |
Add the HTTPS proxy server domain name in the same format as for HTTP Proxy. |
No Proxy |
Comma-separated list of IP addresses or domain names. |
For the list of Mirantis resources and IP addresses to be accessible from the Container Cloud clusters, see Reference Architecture: Hardware and system requirements.
In the Clusters tab, click Create Cluster and fill out the form with the following parameters as required:
Configure general settings and the Kubernetes parameters:
Section |
Parameter |
Description |
---|---|---|
General Settings |
Name |
Cluster name |
Provider |
Select OpenStack |
|
Provider Credential |
From the drop-down list, select the OpenStack credentials name that you have previously created. |
|
Release Version |
The Container Cloud version. |
|
Proxy |
Available since 2.5.0 Optional. From the drop-down list, select the proxy server name that you have previously created. |
|
SSH Keys |
From the drop-down list, select the SSH key name that you have previously added for SSH access to VMs. |
|
Provider |
External Network |
Type of the external network in the OpenStack cloud provider. |
DNS Name Servers |
Comma-separated list of the DNS hosts IPs for the OpenStack VMs configuration. |
|
Kubernetes |
Node CIDR |
The Kubernetes nodes CIDR block. For example, |
Services CIDR Blocks |
The Kubernetes Services CIDR block. For example, |
|
Pods CIDR Blocks |
The Kubernetes Pods CIDR block. For example, |
Configure StackLight:
Section |
Parameter name |
Description |
---|---|---|
StackLight |
Enable Monitoring |
Selected by default. Deselect to skip StackLight deployment. Note You can also enable, disable, or configure StackLight parameters after deploying a managed cluster. For details, see Change a cluster configuration or Configure StackLight. |
Enable Logging |
Select to deploy the StackLight logging stack. For details about the logging components, see Reference Architecture: StackLight deployment architecture. |
|
HA Mode |
Select to enable StackLight monitoring in the HA mode. For the differences between HA and non-HA modes, see Reference Architecture: StackLight deployment architecture. |
|
Elasticsearch |
Retention Time |
The Elasticsearch logs retention period in Logstash. |
Persistent Volume Claim Size |
The Elasticsearch persistent volume claim size. |
|
Prometheus |
Retention Time |
The Prometheus database retention period. |
Retention Size |
The Prometheus database retention size. |
|
Persistent Volume Claim Size |
The Prometheus persistent volume claim size. |
|
Enable Watchdog Alert |
Select to enable the Watchdog alert that fires as long as the entire alerting pipeline is functional. |
|
Custom Alerts |
Specify alerting rules for new custom alerts or upload a YAML file in the following exemplary format: - alert: HighErrorRate
expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
for: 10m
labels:
severity: page
annotations:
summary: High request latency
For details, see Official Prometheus documentation: Alerting rules. For the list of the predefined StackLight alerts, see Operations Guide: Available StackLight alerts. |
|
StackLight Email Alerts |
Enable Email Alerts |
Select to enable the StackLight email alerts. |
Send Resolved |
Select to enable notifications about resolved StackLight alerts. |
|
Require TLS |
Select to enable transmitting emails through TLS. |
|
Email alerts configuration for StackLight |
Fill out the following email alerts parameters as required:
|
|
StackLight Slack Alerts |
Enable Slack alerts |
Select to enable the StackLight Slack alerts. |
Send Resolved |
Select to enable notifications about resolved StackLight alerts. |
|
Slack alerts configuration for StackLight |
Fill out the following Slack alerts parameters as required:
|
Click Create.
To view the deployment status, verify the cluster status on the Clusters page. Once the orange blinking dot near the cluster name disappears, the deployment is complete.
Proceed with Add a machine.
See also
After you create a new OpenStack-based Mirantis Container Cloud managed cluster as described in Create a managed cluster, proceed with adding machines to this cluster using the Container Cloud web UI.
You can also use the instruction below to scale up an existing managed cluster.
To add a machine to an OpenStack-based managed cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name. The cluster page with Machines list opens.
On the cluster page, click Create Machine.
Fill out the form with the following parameters as required:
Parameter |
Description |
---|---|
Count |
Specify the number of machines to create. The required minimum is three manager machines for HA and two worker machines for the Container Cloud workloads. Select Manager or Worker to create a Kubernetes manager or worker node. |
Flavor |
From the drop-down list, select the required hardware configuration for the machine. The list of available flavors corresponds to the one in your OpenStack environment. For the hardware requirements, see: Reference Architecture: Requirements for an OpenStack-based cluster. |
Image |
From the drop-down list, select the cloud image with Ubuntu 18.04. If you do not have this image in the list, add it to your OpenStack environment using the Horizon web UI by downloading the image from the Ubuntu official website. |
Availability zone |
From the drop-down list, select the availability zone from which the new machine will be launched. |
Node Labels |
Select the required node labels for the machine to run
certain components on a specific node. For example, for the StackLight nodes
that run Elasticsearch and require more resources than a standard node,
select the StackLight label.
The list of available node labels is obtained
from your current Caution If you deploy StackLight in the HA mode (recommended), add the StackLight label to a minimum of three nodes. Note You can configure node labels after deploying a machine. On the Machines page, click the More action icon in the last column of the required machine field and select Configure machine. |
Click Create.
Repeat the steps above for the remaining machines.
You can monitor the machine status in the Managers or Workers columns on the Clusters page. Once the status changes to Ready, the deployment of the managed cluster components on this machine is complete.
The machine creation starts with the Provision status. During provisioning, the machine is not expected to be accessible since its infrastructure (VM, network, and so on) is being created.
Other machine statuses are the same as the LCMMachine object states described in Reference Architecture: LCM controller.
Verify the status of the cluster nodes as described in Connect to a Mirantis Container Cloud cluster.
Warning
An operational managed cluster deployment must contain a minimum of 3 Kubernetes manager nodes and 2 Kubernetes worker nodes. The deployment of the cluster does not start until the minimum number of nodes is created.
To meet the etcd quorum and to prevent the deployment failure, deletion of the manager nodes is prohibited.
A machine with the manager node role is automatically deleted during the managed cluster deletion.
See also
Deleting a managed cluster does not require a preliminary deletion of VMs that run on this cluster.
To delete an OpenStack-based managed cluster:
Log in to the Mirantis Container Cloud web UI
with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the More action icon in the last column of the required cluster and select Delete.
Verify the list of machines to be removed. Confirm the deletion.
Deleting a cluster automatically frees up the resources allocated for this cluster, for example, instances, load balancers, networks, floating IPs.
If the cluster deletion hangs and the The cluster is being deleted message does not disappear for a while:
Expand the menu of the tab with your username.
Click Download kubeconfig
to download kubeconfig
of your management cluster.
Log in to any local machine with kubectl
installed.
Copy the downloaded kubeconfig
to this machine.
Run the following command:
kubectl --kubeconfig <KUBECONFIG_PATH> edit -n <PROJECT_NAME> cluster <MANAGED_CLUSTER_NAME>
In the Cluster object that opens for editing, remove the following lines:
finalizers:
- cluster.cluster.k8s.io
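Alternatively, if you prefer a non-interactive approach, you can clear the finalizers with a patch command similar to the following (a sketch; verify the project and cluster names before running it):
# Removes all metadata finalizers from the Cluster object
kubectl --kubeconfig <KUBECONFIG_PATH> -n <PROJECT_NAME> patch cluster <MANAGED_CLUSTER_NAME> --type merge -p '{"metadata":{"finalizers":null}}'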
If you are going to remove the associated regional cluster or if you do not plan to reuse the credentials of the deleted cluster, delete them:
In the Credentials tab, verify that the required credentials are not in the In Use status.
Click the Delete credential action icon next to the name of the credentials to be deleted.
Confirm the deletion.
Warning
You can delete credentials only after deleting the managed cluster they relate to.
After bootstrapping your AWS-based Mirantis Container Cloud management cluster as described in Deployment Guide: Deploy an AWS-based management cluster, you can create the AWS-based managed clusters using the Container Cloud web UI.
This section describes how to create an AWS-based managed cluster using the Mirantis Container Cloud web UI of the AWS-based management cluster.
To create an AWS-based managed cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the SSH Keys tab, click Add SSH Key to upload the public SSH key that will be configured on each AWS instance to provide user access.
In the Credentials tab:
Click Add Credential and fill in the required fields to add your AWS credentials.
Verify that the new credentials status is Ready. If the status is Error, hover over the status to determine the reason for the issue.
In the Clusters tab, click Create Cluster and fill out the form with the following parameters as required:
Configure general settings and the Kubernetes parameters:
Section |
Parameter |
Description |
---|---|---|
General settings |
Name |
Cluster name |
Provider |
Select AWS |
|
Provider credential |
From the drop-down list, select the previously created AWS credentials name. |
|
Release version |
The Container Cloud version. |
|
SSH keys |
From the drop-down list, select the SSH key name that you have previously added for SSH access to VMs. |
|
Provider |
AWS region |
From the drop-down list, select the AWS Region for the managed
cluster. For example, |
Kubernetes |
Services CIDR blocks |
The Kubernetes Services CIDR block. For example, |
Pods CIDR blocks |
The Kubernetes Pods CIDR block. For example, |
Configure StackLight:
Section |
Parameter name |
Description |
---|---|---|
StackLight |
Enable Monitoring |
Selected by default. Deselect to skip StackLight deployment. Note You can also enable, disable, or configure StackLight parameters after deploying a managed cluster. For details, see Change a cluster configuration or Configure StackLight. |
Enable Logging |
Select to deploy the StackLight logging stack. For details about the logging components, see Reference Architecture: StackLight deployment architecture. |
|
HA Mode |
Select to enable StackLight monitoring in the HA mode. For the differences between HA and non-HA modes, see Reference Architecture: StackLight deployment architecture. |
|
Elasticsearch |
Retention Time |
The Elasticsearch logs retention period in Logstash. |
Persistent Volume Claim Size |
The Elasticsearch persistent volume claim size. |
|
Prometheus |
Retention Time |
The Prometheus database retention period. |
Retention Size |
The Prometheus database retention size. |
|
Persistent Volume Claim Size |
The Prometheus persistent volume claim size. |
|
Enable Watchdog Alert |
Select to enable the Watchdog alert that fires as long as the entire alerting pipeline is functional. |
|
Custom Alerts |
Specify alerting rules for new custom alerts or upload a YAML file in the following exemplary format: - alert: HighErrorRate
expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
for: 10m
labels:
severity: page
annotations:
summary: High request latency
For details, see Official Prometheus documentation: Alerting rules. For the list of the predefined StackLight alerts, see Operations Guide: Available StackLight alerts. |
|
StackLight Email Alerts |
Enable Email Alerts |
Select to enable the StackLight email alerts. |
Send Resolved |
Select to enable notifications about resolved StackLight alerts. |
|
Require TLS |
Select to enable transmitting emails through TLS. |
|
Email alerts configuration for StackLight |
Fill out the following email alerts parameters as required:
|
|
StackLight Slack Alerts |
Enable Slack alerts |
Select to enable the StackLight Slack alerts. |
Send Resolved |
Select to enable notifications about resolved StackLight alerts. |
|
Slack alerts configuration for StackLight |
Fill out the following Slack alerts parameters as required:
|
Click Create.
To view the deployment status, verify the cluster status on the Clusters page. Once the orange blinking dot near the cluster name disappears, the deployment is complete.
Proceed with Add a machine.
See also
After you create a new AWS-based managed cluster as described in Create a managed cluster, proceed with adding machines to this cluster using the Mirantis Container Cloud web UI.
You can also use the instruction below to scale up an existing managed cluster.
To add a machine to an AWS-based managed cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name. The cluster page with the Machines list opens.
Click Create Machine.
Fill out the form with the following parameters as required:
Parameter |
Description |
---|---|
Count |
Specify the number of machines to create. The required minimum is three manager machines for HA and two worker machines for the Container Cloud workloads. Select Manager or Worker to create a Kubernetes manager or worker node. |
Instance type |
From the drop-down list, select the required AWS instance type. For production deployments, Mirantis recommends:
For more details about requirements, see Reference architecture: AWS system requirements. |
AMI ID |
From the drop-down list, select the required AMI ID of Ubuntu 18.04.
For example, |
Root device size |
Select the required root device size, |
Node Labels |
Select the required node labels for the machine to run
certain components on a specific node. For example, for the StackLight nodes
that run Elasticsearch and require more resources than a standard node,
select the StackLight label.
The list of available node labels is obtained
from your current Caution If you deploy StackLight in the HA mode (recommended), add the StackLight label to a minimum of three nodes. Note You can configure node labels after deploying a machine. On the Machines page, click the More action icon in the last column of the required machine field and select Configure machine. |
Click Create.
Repeat the steps above for the remaining machines.
You can monitor the machine status in the Managers or Workers columns on the Clusters page. Once the status changes to Ready, the deployment of the managed cluster components on this machine is complete.
The machine creation starts with the Provision status. During provisioning, the machine is not expected to be accessible since its infrastructure (VM, network, and so on) is being created.
Other machine statuses are the same as the LCMMachine object states described in Reference Architecture: LCM controller.
Verify the status of the cluster nodes as described in Connect to a Mirantis Container Cloud cluster.
Warning
An operational managed cluster deployment must contain a minimum of 3 Kubernetes manager nodes and 2 Kubernetes worker nodes. The deployment of the cluster does not start until the minimum number of nodes is created.
To meet the etcd quorum and to prevent the deployment failure, deletion of the manager nodes is prohibited.
A machine with the manager node role is automatically deleted during the managed cluster deletion.
See also
Deleting a managed cluster does not require a preliminary deletion of VMs that run on this cluster.
To delete an AWS-based managed cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the More action icon in the last column of the required cluster and select Delete.
Verify the list of machines to be removed. Confirm the deletion.
Deleting a cluster automatically removes the Amazon Virtual Private Cloud (VPC) connected with this cluster and frees up the resources allocated for this cluster, for example, instances, load balancers, networks, floating IPs.
If you are going to remove the associated regional cluster or if you do not plan to reuse the credentials of the deleted cluster, delete them:
In the Credentials tab, verify that the required credentials are not in the In Use status.
Click the Delete credential action icon next to the name of the credentials to be deleted.
Confirm the deletion.
Warning
You can delete credentials only after deleting the managed cluster they relate to.
Caution
This feature is available as Technology Preview. Use such configuration for testing and evaluation purposes only. For details about the Mirantis Technology Preview support scope, see the Preface section of this guide.
After bootstrapping your VMware vSphere-based Mirantis Container Cloud management cluster as described in Deployment Guide: Deploy a VMware vSphere-based management cluster, you can create vSphere-based managed clusters using the Container Cloud web UI.
This section describes how to create a VMware vSphere-based managed cluster using the Mirantis Container Cloud web UI of the vSphere-based management cluster.
Caution
The proxy support for the vSphere-based managed clusters is only partially integrated in Container Cloud 2.5.0. Therefore, until the feature is announced as generally available, disregard the Proxies tab of the Container Cloud web UI for cluster creation to prevent deployment failures.
To create a vSphere-based managed cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the SSH Keys tab, click Add SSH Key to upload the public SSH key that will be used for the vSphere VMs creation.
In the Credentials tab:
Click Add Credential to add your vSphere credentials.
You can either upload your vSphere vsphere.yaml
configuration
file or fill in the fields manually.
Verify that the new credentials status is Ready. If the status is Error, hover over the status to determine the reason for the issue.
In the RHEL Licenses tab, click Add RHEL License and fill out the form with the following parameters:
Parameter |
Description |
---|---|
RHEL License Name |
RHEL license name |
Username |
User name to access the RHEL license |
Password |
Password to access the RHEL license |
Pool IDs |
Optional. Specify the pool IDs for RHEL licenses for Virtual Datacenters. Otherwise, Subscription Manager selects a subscription from the list of those available and appropriate for the machines. |
In the Clusters tab, click Create Cluster and fill out the form with the following parameters as required:
Configure general settings and Kubernetes parameters:
Section |
Parameter |
Description |
---|---|---|
General Settings |
Name |
Cluster name |
Provider |
Select vSphere |
|
Provider Credential |
From the drop-down list, select the vSphere credentials name that you have previously added. |
|
Release Version |
The Container Cloud version. |
|
Proxy |
Available since 2.5.0, Technology Preview Optional. Disregard this field since the feature is not fully integrated yet. |
|
SSH Keys |
From the drop-down list, select the SSH key name that you have previously added for SSH access to VMs. |
|
Provider |
LB Host IP |
The IP address of the load balancer endpoint that will be used to access the Kubernetes API of the new cluster. |
LB Address Range |
The range of IP addresses that can be assigned to load balancers for Kubernetes Services. |
|
Kubernetes |
Node CIDR |
The Kubernetes nodes CIDR block. For example, |
Services CIDR Blocks |
The Kubernetes Services CIDR block. For example, |
|
Pods CIDR Blocks |
The Kubernetes Pods CIDR block. For example, |
Configure StackLight:
Section |
Parameter name |
Description |
---|---|---|
StackLight |
Enable Monitoring |
Selected by default. Deselect to skip StackLight deployment. Note You can also enable, disable, or configure StackLight parameters after deploying a managed cluster. For details, see Change a cluster configuration or Configure StackLight. |
Enable Logging |
Select to deploy the StackLight logging stack. For details about the logging components, see Reference Architecture: StackLight deployment architecture. |
|
HA Mode |
Select to enable StackLight monitoring in the HA mode. For the differences between HA and non-HA modes, see Reference Architecture: StackLight deployment architecture. |
|
Elasticsearch |
Retention Time |
The Elasticsearch logs retention period in Logstash. |
Persistent Volume Claim Size |
The Elasticsearch persistent volume claim size. |
|
Prometheus |
Retention Time |
The Prometheus database retention period. |
Retention Size |
The Prometheus database retention size. |
|
Persistent Volume Claim Size |
The Prometheus persistent volume claim size. |
|
Enable Watchdog Alert |
Select to enable the Watchdog alert that fires as long as the entire alerting pipeline is functional. |
|
Custom Alerts |
Specify alerting rules for new custom alerts or upload a YAML file in the following exemplary format: - alert: HighErrorRate
expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
for: 10m
labels:
severity: page
annotations:
summary: High request latency
For details, see Official Prometheus documentation: Alerting rules. For the list of the predefined StackLight alerts, see Operations Guide: Available StackLight alerts. |
|
StackLight Email Alerts |
Enable Email Alerts |
Select to enable the StackLight email alerts. |
Send Resolved |
Select to enable notifications about resolved StackLight alerts. |
|
Require TLS |
Select to enable transmitting emails through TLS. |
|
Email alerts configuration for StackLight |
Fill out the following email alerts parameters as required:
|
|
StackLight Slack Alerts |
Enable Slack alerts |
Select to enable the StackLight Slack alerts. |
Send Resolved |
Select to enable notifications about resolved StackLight alerts. |
|
Slack alerts configuration for StackLight |
Fill out the following Slack alerts parameters as required:
|
Click Create.
To view the deployment status, verify the cluster status on the Clusters page. Once the orange blinking dot near the cluster name disappears, the deployment is complete.
Proceed with Add a machine.
See also
After you create a new VMware vSphere-based Mirantis Container Cloud managed cluster as described in Create a managed cluster, proceed with adding machines to this cluster using the Container Cloud web UI.
You can also use the instruction below to scale up an existing managed cluster.
To add a machine to a vSphere-based managed cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name. The cluster page with Machines list opens.
On the cluster page, click Create Machine.
Fill out the form with the following parameters as required:
Parameter |
Description |
---|---|
Count |
Number of machines to create. The required minimum is three manager machines for HA and two worker machines for the Container Cloud workloads. Select Manager or Worker to create a Kubernetes manager or worker node. |
Template Path |
Path to the prepared OVF template. |
SSH Username |
SSH user name to access the node. Defaults to |
RHEL License |
From the drop-down list, select the RHEL license that you previously added for the cluster being deployed. |
Node Labels |
Select the required node labels for the machine to run
certain components on a specific node. For example, for the StackLight nodes
that run Elasticsearch and require more resources than a standard node,
select the StackLight label.
The list of available node labels is obtained
from your current Caution If you deploy StackLight in the HA mode (recommended), add the StackLight label to a minimum of three nodes. Note You can configure node labels after deploying a machine. On the Machines page, click the More action icon in the last column of the required machine field and select Configure machine. |
Click Create.
Repeat the steps above for the remaining machines.
You can monitor the machine status in the Managers or Workers columns on the Clusters page. Once the status changes to Ready, the deployment of the managed cluster components on this machine is complete.
The machine creation starts with the Provision status. During provisioning, the machine is not expected to be accessible since its infrastructure (VM, network, and so on) is being created.
Other machine statuses are the same as the LCMMachine object states described in Reference Architecture: LCM controller.
Verify the status of the cluster nodes as described in Connect to a Mirantis Container Cloud cluster.
Warning
An operational managed cluster deployment must contain a minimum of 3 Kubernetes manager nodes and 2 Kubernetes worker nodes. The deployment of the cluster does not start until the minimum number of nodes is created.
To meet the etcd quorum and to prevent the deployment failure, deletion of the manager nodes is prohibited.
A machine with the manager node role is automatically deleted during the managed cluster deletion.
See also
Deleting a managed cluster does not require a preliminary deletion of VMs that run on this cluster.
To delete a VMware vSphere-based managed cluster:
Log in to the Mirantis Container Cloud web UI
with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the More action icon in the last column of the required cluster and select Delete.
Verify the list of machines to be removed. Confirm the deletion.
Deleting a cluster automatically turns off the machines. Therefore, clean up the hosts manually in the vSphere web UI. The machines are automatically released from the RHEL subscription.
If you are going to remove the associated regional cluster or if you do not plan to reuse the credentials of the deleted cluster, delete them:
In the Credentials tab, verify that the required credentials are not in the In Use status.
Click the Delete credential action icon next to the name of the credentials to be deleted.
Confirm the deletion.
Warning
You can delete credentials only after deleting the managed cluster they relate to.
After deploying a managed cluster, you can enable or disable StackLight
and configure its parameters if enabled. Alternatively, you can configure
StackLight through kubeconfig
as described in Configure StackLight.
To change a cluster configuration:
Log in to the Mirantis Container Cloud web UI
with the writer
permissions.
Select the required project.
On the Clusters page, click the More action icon in the last column of the required cluster and select Configure cluster.
In the Configure cluster window, select or deselect StackLight and configure its parameters if enabled.
Click Update to apply the changes.
A Mirantis Container Cloud management cluster automatically upgrades to a new available Container Cloud release version that supports new Cluster releases. Once done, a newer version of a Cluster release becomes available for managed clusters that you update using the Container Cloud web UI.
Caution
Make sure to update the Cluster release version of your managed cluster before the current Cluster release version becomes unsupported by a new Container Cloud release version. Otherwise, Container Cloud stops auto-upgrade and eventually Container Cloud itself becomes unsupported.
This section describes how to update a managed cluster of any provider type using the Container Cloud web UI.
To update a managed cluster:
For bare metal clusters, add the maintenance
label for Ceph:
Open the KaasCephCluster
CR for editing:
kubectl edit kaascephcluster
Add the maintenance: "true"
label:
metadata:
labels:
maintenance: "true"
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the More action icon in the last column for each cluster and select Update cluster where available.
In the Release Update window, select the required Cluster release to update your managed cluster to.
The Description section contains the list of components versions to be installed with a new Cluster release. The release notes for each Container Cloud and Cluster release are available at Release Notes: Container Cloud releases and Release Notes: Cluster releases.
Click Update.
Before the cluster update starts, Container Cloud performs a backup of MKE and Docker Swarm. The backup directory is located under:
/srv/backup/swarm
on every Container Cloud node for Docker Swarm
/srv/backup/ucp
on one of the controller nodes for MKE
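If required, you can verify that the backups exist by listing these directories on the corresponding nodes (an illustrative check):
# On any Container Cloud node of the cluster
ls -lah /srv/backup/swarm
# On the controller node that stores the MKE backup
ls -lah /srv/backup/ucp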
To view the update status, verify the cluster status on the Clusters page. Once the orange blinking dot near the cluster name disappears, the update is complete.
For bare metal clusters, once the update is complete and all nodes are in the Ready status, remove the maintenance: "true" label for Ceph from the KaasCephCluster CR.
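For example, you can remove the label by editing the object again or with a command similar to the following (a sketch; substitute the actual KaasCephCluster object name and project namespace of your cluster):
# The trailing dash removes the maintenance label from the object
kubectl -n <projectName> label kaascephcluster <kaasCephClusterName> maintenance-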
Caution
Due to the development limitations, the MCR upgrade to version 19.03.13 or 19.03.14 on existing Container Cloud clusters is not supported.
Note
In rare cases, after a managed cluster upgrade, Grafana may stop working due to issues with helm-controller. The development team is working on the issue, which will be addressed in an upcoming release.
Note
MKE and Kubernetes API may return short-term 50x errors during the upgrade process. Ignore these errors.
This section instructs you on how to scale down an existing managed cluster through the Mirantis Container Cloud web UI.
Warning
An operational managed cluster deployment must contain a minimum of 3 Kubernetes manager nodes and 2 Kubernetes worker nodes. The deployment of the cluster does not start until the minimum number of nodes is created.
To meet the etcd quorum and to prevent the deployment failure, deletion of the manager nodes is prohibited.
A machine with the manager node role is automatically deleted during the managed cluster deletion.
To delete a machine from a managed cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click on the required cluster name to open the list of machines running on it.
Click the More action icon in the last column of the machine you want to delete and select Delete. Confirm the deletion.
Deleting a machine automatically frees up the resources allocated to this machine.
Starting from Mirantis Kubernetes Engine (MKE) 3.3.3, you can attach an existing MKE cluster that is not deployed by Mirantis Container Cloud to a management cluster. This feature allows you to view the details of all your MKE clusters in one place, including cluster health, capacity, and usage.
For supported configurations of existing MKE clusters that are not deployed by Container Cloud, see MKE, MSR, and MCR Compatibility Matrix.
Note
Using the free Mirantis license, you can create up to three Container Cloud managed clusters with three worker nodes on each cluster. Within the same quota, you can also attach existing MKE clusters that are not deployed by Container Cloud. If you need to increase this quota, contact Mirantis support for further details.
Using the instruction below, you can also install StackLight to your existing MKE cluster during the attach procedure. For the StackLight system requirements, refer to the Reference Architecture: Requirements of the corresponding cloud provider.
You can also update all your MKE clusters to the latest version once your management cluster automatically updates to a newer version where a new MKE Cluster release with the latest MKE version is available. For details, see Update a managed cluster.
Caution
An MKE cluster can be attached to only one management cluster. Attachment of a Container Cloud-based MKE cluster to another management cluster is not supported.
Deletion of a Container Cloud project with attached MKE clusters is not supported.
To attach an existing MKE cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, expand the Create Cluster menu and click Attach Existing MKE Cluster.
In the wizard that opens, fill out the form with the following parameters as required:
Configure general settings:
Section |
Parameter |
Description |
---|---|---|
General Settings |
Cluster Name |
Specify the cluster name. |
Region |
Select the required cloud provider: OpenStack, AWS, or bare metal. |
Upload the MKE client bundle or fill in the fields manually. To download the MKE client bundle, refer to MKE user access: Download client certificates.
Configure StackLight:
Section |
Parameter name |
Description |
---|---|---|
StackLight |
Enable Monitoring |
Selected by default. Deselect to skip StackLight deployment. Note You can also enable, disable, or configure StackLight parameters after deploying a managed cluster. For details, see Change a cluster configuration or Configure StackLight. |
Enable Logging |
Select to deploy the StackLight logging stack. For details about the logging components, see Reference Architecture: StackLight deployment architecture. |
|
HA Mode |
Select to enable StackLight monitoring in the HA mode. For the differences between HA and non-HA modes, see Reference Architecture: StackLight deployment architecture. |
|
Elasticsearch |
Retention Time |
The Elasticsearch logs retention period in Logstash. |
Persistent Volume Claim Size |
The Elasticsearch persistent volume claim size. |
|
Prometheus |
Retention Time |
The Prometheus database retention period. |
Retention Size |
The Prometheus database retention size. |
|
Persistent Volume Claim Size |
The Prometheus persistent volume claim size. |
|
Enable Watchdog Alert |
Select to enable the Watchdog alert that fires as long as the entire alerting pipeline is functional. |
|
Custom Alerts |
Specify alerting rules for new custom alerts or upload a YAML file in the following exemplary format: - alert: HighErrorRate
expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
for: 10m
labels:
severity: page
annotations:
summary: High request latency
For details, see Official Prometheus documentation: Alerting rules. For the list of the predefined StackLight alerts, see Operations Guide: Available StackLight alerts. |
|
StackLight Email Alerts |
Enable Email Alerts |
Select to enable the StackLight email alerts. |
Send Resolved |
Select to enable notifications about resolved StackLight alerts. |
|
Require TLS |
Select to enable transmitting emails through TLS. |
|
Email alerts configuration for StackLight |
Fill out the following email alerts parameters as required:
|
|
StackLight Slack Alerts |
Enable Slack alerts |
Select to enable the StackLight Slack alerts. |
Send Resolved |
Select to enable notifications about resolved StackLight alerts. |
|
Slack alerts configuration for StackLight |
Fill out the following Slack alerts parameters as required:
|
Click Create.
To view the deployment status, verify the cluster status on the Clusters page. Once the orange blinking dot near the cluster name disappears, the deployment is complete.
After you deploy a new or attach an existing Mirantis Kubernetes Engine (MKE) cluster to a management cluster, start managing your cluster using the MKE web UI.
To connect to the MKE web UI:
Log in to the Mirantis Container Cloud web UI
with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the More action icon in the last column of the required MKE cluster and select Cluster info.
In the dialog box with the cluster information, copy the MKE UI endpoint.
Paste the copied IP into a web browser and log in using the same credentials that you use to access the Container Cloud web UI.
Warning
To ensure the Container Cloud stability in managing the Container Cloud-based MKE clusters, a number of MKE API functions are not available for the Container Cloud-based MKE clusters as compared to the attached MKE clusters that are not deployed by Container Cloud. Use the Container Cloud web UI or CLI for this functionality instead.
See Reference Architecture: MKE API limitations for details.
Caution
The MKE web UI contains help links that lead to the MKE, MSR, and MCR documentation suite. Besides MKE and Mirantis Container Runtime (MCR), which are integrated with Container Cloud, that documentation suite covers other MKE, MSR, and MCR components and cannot be fully applied to the Container Cloud-based MKE clusters. Therefore, to avoid misconceptions, before you proceed with the MKE web UI documentation, read Reference Architecture: MKE API limitations and make sure that you are using the documentation of the supported MKE version as per the Release Compatibility Matrix.
After you deploy a Mirantis Container Cloud management or managed cluster, connect to the cluster to verify the availability and status of the nodes as described below.
This section also describes how to SSH to a cluster node when a Bastion host is used for SSH access, for example, on an OpenStack-based management cluster or on AWS-based management and managed clusters.
To connect to a managed cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the required cluster name. The cluster page with the Machines list opens.
Verify the status of the manager nodes. Once the first manager node is deployed and has the Ready status, the Download Kubeconfig option for the cluster being deployed becomes active.
Open the Clusters tab.
Click the More action icon in the last column of the required cluster and select Download Kubeconfig:
Enter your user password.
Not recommended. Select Offline Token to generate an offline IAM token. Otherwise, for security reasons, the kubeconfig token expires after 30 minutes of Container Cloud API idle time, and you have to download kubeconfig again with a newly generated token.
Click Download.
Verify the availability of the managed cluster machines:
Export the kubeconfig
parameters to your local machine with
access to kubectl. For example:
export KUBECONFIG=~/Downloads/kubeconfig-test-cluster.yml
Obtain the list of available Container Cloud machines:
kubectl get nodes -o wide
The system response must contain the details of the nodes
in the READY
status.
To connect to a management cluster:
Log in to a local machine where your management cluster kubeconfig
is located and where kubectl
is installed.
Note
The management cluster kubeconfig
is created
during the last stage of the management cluster bootstrap.
Obtain the list of available management cluster machines:
kubectl get nodes -o wide
The system response must contain the details of the nodes
in the READY
status.
To SSH to a Container Cloud cluster node if Bastion is used:
Obtain kubeconfig
of the management or managed cluster as described
in the procedures above.
Obtain the internal IP address of a node you require access to:
kubectl get nodes -o wide
Obtain the Bastion public IP:
kubectl get cluster -o jsonpath='{.status.providerStatus.bastion.publicIp}' \
-n <project_name> <cluster_name>
Run the following command:
ssh -i <private_key> ubuntu@<node_internal_ip> -o "proxycommand ssh -W %h:%p \
-i <private_key> ubuntu@<bastion_public_ip>"
Substitute the parameters enclosed in angle brackets
with the corresponding values of your cluster obtained in previous steps.
The <private_key>
for a management cluster is located at
~/.ssh/openstack_tmp
. For a managed cluster, this is
the SSH Key that you added in the Container Cloud web UI
before the managed cluster creation.
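If the private key is loaded into ssh-agent, you can also use the ProxyJump option of the OpenSSH client as a shorter equivalent (an alternative sketch, not part of the original procedure):
# Load the key into the agent so that both the Bastion and the node connections can use it
ssh-add <private_key>
ssh -J ubuntu@<bastion_public_ip> ubuntu@<node_internal_ip>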
The Mirantis Container Cloud web UI enables you to perform the following operations with the Container Cloud management and regional clusters:
View the cluster details (such as cluster ID, creation date, nodes count, and so on) as well as obtain a list of the cluster endpoints including the StackLight components, depending on your deployment configuration.
To view generic cluster details, in the Clusters tab, click the More action icon in the last column of the required cluster and select Cluster info.
Note
Adding more than 3 nodes or deleting nodes from a management or regional cluster is not supported.
Removing a management or regional cluster using the Container Cloud web UI is not supported. Use the dedicated cleanup script instead. For details, see Remove a management cluster and Remove a regional cluster.
Before removing a regional cluster, delete the credentials of the deleted managed clusters associated with the region.
Verify the current release version of the cluster including the list of installed components with their versions and the cluster release change log.
To view a cluster release version details, in the Clusters tab, click the version in the Release column next to the name of the required cluster.
This section outlines the operations that can be performed with a management or regional cluster.
A management cluster upgrade to a newer version is performed automatically once a new Container Cloud version is released. Regional clusters also upgrade automatically along with the management cluster. For more details about the Container Cloud release upgrade mechanism, see: Reference Architecture: Container Cloud release controller.
Container Cloud remains operational during the management and regional clusters upgrade. Managed clusters are not affected during this upgrade. For the list of components that are updated during the Container Cloud upgrade, see the Components versions section of the corresponding Container Cloud release in Release Notes.
When Mirantis announces support of the newest versions of Mirantis Container Runtime (MCR) and Mirantis Kubernetes Engine (MKE), Container Cloud automatically upgrades these components as well. For the maintenance window best practices before upgrade of these components, see official MKE, MSR, and MCR Documentation.
Caution
Due to the development limitations, the MCR upgrade to version 19.03.13 or 19.03.14 on existing Container Cloud clusters is not supported.
Note
MKE and Kubernetes API may return short-term 50x errors during the upgrade process. Ignore these errors.
Caution
This feature is available starting from the Container Cloud release 2.5.0.
If you did not add the NTP server parameters during the management cluster bootstrap, configure them on the existing regional cluster as required. These parameters are applied to all machines of regional and managed clusters in the specified region.
Warning
The procedure below triggers an upgrade of all clusters in a specific region, which may lead to workload disruption during node cordoning and draining.
To configure an NTP server for a regional cluster:
Download your management cluster kubeconfig
:
Log in to the Mirantis Container Cloud web UI
with the writer
permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
Expand the menu of the tab with your user name.
Click Download kubeconfig to download kubeconfig
of your management cluster.
Log in to any local machine with kubectl
installed.
Copy the downloaded kubeconfig
to this machine.
Use the downloaded kubeconfig
to edit the management cluster:
kubectl --kubeconfig <kubeconfigPath> edit -n <projectName> cluster <managementClusterName>
In the command above and the step below, replace the parameters enclosed in angle brackets with the corresponding values of your cluster.
In the regional
section, add the ntp:servers
section with the
list of required servers names:
spec:
...
providerSpec:
value:
kaas:
...
regional:
- helmReleases:
- name: <providerName>
values:
config:
lcm:
...
ntp:
servers:
- 0.pool.ntp.org
...
This section describes how to remove a management cluster.
To remove a management cluster:
Verify that you have successfully removed all managed clusters that run on top of the management cluster to be removed. For details, see the corresponding Delete a managed cluster section depending on your cloud provider in Create and operate managed clusters.
Log in to a local machine where your management cluster kubeconfig
is located and where kubectl
is installed.
Note
The management cluster kubeconfig
is created
during the last stage of the management cluster bootstrap.
Run the following script:
bootstrap.sh cleanup
Note
Removing a management or regional cluster using the Container Cloud web UI is not supported.
This section describes how to remove a regional cluster.
To remove a regional cluster:
Log in to the Container Cloud web UI with the writer
permissions.
Switch to the project with the managed clusters of the regional cluster to remove using the Switch Project action icon located on top of the main left-side navigation panel.
Verify that you have successfully deleted all managed clusters that run on top of the regional cluster to be removed. For details, see the corresponding Delete a managed cluster section depending on your cloud provider in Create and operate managed clusters.
Delete the credentials associated with the region:
In the Credentials tab, click the first credentials name.
In the window that opens, capture the Region Name field.
Repeat the two previous steps for the remaining credentials in the list.
Delete all credentials with the name of the region that you are going to remove.
Log in to a local machine where your management and regional clusters
kubeconfig
files are located and where kubectl
is installed.
Note
The management or regional cluster kubeconfig
files
are created during the last stage of the management or regional
cluster bootstrap.
Run the following script with the corresponding values of your cluster:
REGIONAL_CLUSTER_NAME=<regionalClusterName> REGIONAL_KUBECONFIG=<pathToRegionalClusterKubeconfig> KUBECONFIG=<mgmtClusterKubeconfig> ./bootstrap.sh destroy_regional
Note
Removing a management or regional cluster using the Container Cloud web UI is not supported.
IAM CLI is a user-facing command-line tool for managing scopes, roles,
and grants. Using your personal credentials, you can perform different
IAM operations through the iamctl
tool. For example, you can
verify the current status of the IAM service, request or revoke service tokens,
verify your own grants within Mirantis Container Cloud
as well as your token details.
The iamctl
command-line interface uses the iamctl.yaml
configuration file to interact with IAM.
To create the IAM CLI configuration file:
Log in to the management cluster.
Change the directory to one of the following:
$HOME/.iamctl
$HOME
$HOME/etc
/etc/iamctl
Create iamctl.yaml
with the following exemplary parameters and values
that correspond to your deployment:
server: <IAM_API_ADDRESS>
timeout: 60
verbose: 99 # Verbosity level, from 0 to 99
tls:
enabled: true
ca: <PATH_TO_CA_BUNDLE>
auth:
issuer: <IAM_REALM_IN_KEYCLOAK>
ca: <PATH_TO_CA_BUNDLE>
client_id: iam
client_secret:
The <IAM_REALM_IN_KEYCLOAK>
value has the
<keycloak-url>/auth/realms/<realm-name>
format, where <realm-name>
defaults to iam
.
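For example, with a hypothetical Keycloak endpoint and the default realm name, the issuer value may look as follows (illustrative only, use the actual Keycloak URL of your deployment):
auth:
  issuer: https://keycloak.example.com/auth/realms/iam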
Using iamctl, you can perform different role-based access control operations in your managed cluster. For example:
Grant or revoke access to a managed cluster and a specific user for troubleshooting
Grant or revoke access to a Mirantis Container Cloud project that contains several managed clusters
Create or delete tokens for the Container Cloud services with a specific set of grants as well as identify when a service token was used the last time
The iamctl command-line interface contains the following set of commands:
The following tables describe these commands and their usage.
Usage |
Description |
---|---|
iamctl --help, iamctl help |
Output the list of available commands. |
iamctl help <command> |
Output the description of a specific command. |
Usage |
Description |
---|---|
iamctl account info |
Output detailed account information such as user email, user name, the details of their active and offline sessions, tokens statuses and expiration dates. |
iamctl account login |
Log in the current user. The system prompts you to enter your authentication
credentials. After a successful login, your user token is added to the
|
iamctl account logout |
Log out the current user.
Once done, the user information is removed from |
Usage |
Description |
---|---|
iamctl scope list |
List the IAM scopes available for the current environment. Example output: +---------------+--------------------------+
| NAME | DESCRIPTION |
+---------------+--------------------------+
| m:iam | IAM scope |
| m:kaas | Container Cloud scope |
| m:k8s:managed | |
| m:k8s | Kubernetes scope |
| m:cloud | Cloud scope |
+---------------+--------------------------+
|
iamctl scope list [prefix] |
Output the specified scope list. For example: iamctl scope list m:k8s. |
Usage |
Description |
---|---|
iamctl role list <scope> |
List the roles for the specified scope in IAM. |
iamctl role show <scope> <role> |
Output the details of the specified scope role including the role name
( |
Usage |
Description |
---|---|
iamctl grant give [username] [scope] [role] |
Provide a user with a role in a scope. For example, the
iamctl grant give jdoe m:iam admin command provides the user jdoe with the admin role within the IAM scope.
For the list of supported IAM scopes and roles, see: Role list. Note To lock or disable a user, use LDAP or Google OAuth depending on the external provider integrated into your deployment. |
iamctl grant list <username> |
List the grants provided to the specified user. For example: iamctl grant list jdoe. Example output: +--------+--------+---------------+
| SCOPE | ROLE | GRANT FQN |
+--------+--------+---------------+
| m:iam | admin | m:iam@admin |
| m:sl | viewer | m:sl@viewer |
| m:kaas | writer | m:kaas@writer |
+--------+--------+---------------+
|
iamctl grant revoke [username] [scope] [role] |
Revoke the grants provided to the user. |
Usage |
Description |
---|---|
iamctl servicetoken list [--all] |
List the details of all service tokens created by the current user. The output includes the following service token details:
|
iamctl servicetoken show [ID] |
Output the details of a service token with the specified ID. |
iamctl servicetoken create [alias] [service] [grant1 grants2...] |
Create a token for a specific service with the specified set of grants. For example, iamctl servicetoken create new-token iam m:iam@viewer. |
iamctl servicetoken delete [ID1 ID2...] |
Delete a service token with the specified ID. |
Usage |
Description |
---|---|
iamctl user list |
List user names and emails of all current users. |
iamctl user show <username> |
Output the details of the specified user. |
Mirantis Container Cloud creates the IAM roles in scopes.
For each application type, such as iam
, k8s
, or kaas
,
Container Cloud creates a scope in Keycloak.
Every scope contains a set of roles, such as admin, user, and viewer. The default IAM roles can be changed during a managed cluster deployment. You can grant or revoke role access using the IAM CLI.
For details, see: IAM CLI.
Example of the structure of a cluster-admin
role in a managed cluster:
m:k8s:kaas-tenant-name:k8s-cluster-name@cluster-admin
m
- prefix for all IAM roles in Container Cloud
k8s
- application type, Kubernetes
kaas-tenant-name:k8s-cluster-name
- a managed cluster identifier
in Container Cloud (CLUSTER_ID
)
@
- delimiter between a scope and role
cluster-admin
- name of the role within the Kubernetes scope
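As an illustration, granting this role to a hypothetical user jdoe with the grant command syntax described in IAM CLI looks as follows:
iamctl grant give jdoe m:k8s:kaas-tenant-name:k8s-cluster-name cluster-admin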
The following tables include the scopes and their roles descriptions by Container Cloud components:
Scope identifier |
Role name |
Grant example |
Role description |
---|---|---|---|
|
|
|
List the managed clusters within the Container Cloud scope. |
|
|
Create or delete the managed clusters within the Container Cloud scope. |
|
|
|
Add or delete a bare metal host and machine within the Container Cloud scope, create a project. |
|
|
|
|
List the managed clusters within the specified Container Cloud cluster ID. |
|
|
Create or delete the managed clusters within the specified Container Cloud cluster ID. |
Grant is available by default. Other grants can be added during a management and managed cluster deployment.
Scope identifier |
Role name |
Grant example |
Role description |
---|---|---|---|
|
|
|
Allow the super-user access to perform any action on any resource
on the cluster level.
When used in |
Scope identifier |
Role name |
Grant example |
Role description |
---|---|---|---|
|
|
|
Access the specified web UI(s) within the scope. The |
Using StackLight, you can monitor the components deployed in Mirantis Container Cloud and be quickly notified of critical conditions that may occur in the system to prevent service downtimes.
By default, StackLight provides five web UIs including Prometheus, Alertmanager, Alerta, Kibana, and Grafana. This section describes how to access any of these web UIs. To use an optional Cerebro web UI, which is disabled by default, to debug the Elasticsearch clusters, see Access Elasticsearch clusters using Cerebro.
To access a StackLight web UI:
Log in to the Mirantis Container Cloud web UI.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
In the Clusters tab, click the More action icon in the last column of the required cluster and select Cluster info.
In the dialog box with the cluster information, copy the required endpoint IP from the StackLight Endpoints section.
Paste the copied IP into a web browser and use the default credentials to log in to the web UI. Once done, you are automatically authenticated to all StackLight web UIs.
Note
The Alertmanager web UI displays alerts received by all configured receivers, which can be mistaken for duplicates. To only display the alerts received by a particular receiver, use the Receivers filter.
Using the Grafana web UI, you can view graphical representations of metrics based on the time series databases.
Note
Starting from the Container Cloud release 2.4.0, most Grafana dashboards include a View logs in Kibana link to immediately view relevant logs in the Kibana web UI.
To view the Grafana dashboards:
Log in to the Grafana web UI as described in Access StackLight web UIs.
From the drop-down list, select the required dashboard to inspect the status and statistics of the corresponding service in your management or managed cluster:
Component |
Dashboard |
Description |
---|---|---|
Ceph cluster |
Ceph Cluster |
Provides the overall health status of the Ceph cluster, capacity, latency, and recovery metrics. |
Ceph Nodes |
Provides an overview of the host-related metrics, such as the number of Ceph Monitors, Ceph OSD hosts, average usage of resources across the cluster, network and hosts load. |
|
Ceph OSD |
Provides metrics for Ceph OSDs, including the Ceph OSD read and write latencies, distribution of PGs per Ceph OSD, Ceph OSDs and physical device performance. |
|
Ceph Pools |
Provides metrics for Ceph pools, including the client IOPS and throughput by pool and pools capacity usage. |
|
Ironic bare metal |
Ironic BM |
Provides graphs on Ironic health, HTTP API availability, provisioned
nodes by state and installed |
Container Cloud clusters |
Clusters Overview |
Represents the main cluster capacity statistics for all clusters of a Mirantis Container Cloud deployment where StackLight is installed. |
Kubernetes resources |
Kubernetes Calico |
Provides metrics of the entire Calico cluster usage, including the cluster status, host status, and Felix resources. |
Kubernetes Cluster |
Provides metrics for the entire Kubernetes cluster, including the cluster status, host status, and resources consumption. |
|
Kubernetes Deployments |
Provides information on the desired and current state of all service replicas deployed on a Container Cloud cluster. |
|
Kubernetes Namespaces |
Provides the pods state summary and the CPU, MEM, network, and IOPS resources consumption per namespace. |
|
Kubernetes Nodes |
Provides charts showing resources consumption per Container Cloud cluster node. |
|
Kubernetes Pods |
Provides charts showing resources consumption per deployed pod. |
|
NGINX |
NGINX |
Provides the overall status of the NGINX cluster and information about NGINX requests and connections. |
StackLight |
Alertmanager |
Provides performance metrics on the overall health status of the Prometheus Alertmanager service, the number of firing and resolved alerts received for various periods, the rate of successful and failed notifications, and the resources consumption. |
Elasticsearch |
Provides information about the overall health status of the Elasticsearch cluster, including the resources consumption and the state of the shards. |
|
Grafana |
Provides performance metrics for the Grafana service, including the total number of Grafana entities, CPU and memory consumption. |
|
PostgreSQL |
Provides PostgreSQL statistics, including read (DQL) and write (DML) row operations, transaction and lock, replication lag and conflict, and checkpoint statistics, as well as PostgreSQL performance metrics. |
|
Prometheus |
Provides the availability and performance behavior of the Prometheus servers, the sample ingestion rate, and system usage statistics per server. Also, provides statistics about the overall status and uptime of the Prometheus service, the chunks number of the local storage memory, target scrapes, and queries duration. |
|
Pushgateway |
Provides performance metrics and the overall health status of the service, the rate of samples received for various periods, and the resources consumption. |
|
Prometheus Relay |
Provides service status and resources consumption metrics. |
|
Telemeter Server |
Provides statistics and the overall health status of the Telemeter service. |
|
System |
System |
Provides a detailed resource consumption and operating system information per Container Cloud cluster node. |
Mirantis Kubernetes Engine (MKE) |
MKE Cluster |
Provides a global overview of an MKE cluster: statistics about the number of worker and manager nodes, containers, images, and Swarm services. |
MKE Containers |
Provides per container resources consumption metrics for the MKE containers such as CPU, RAM, network. |
Using the Kibana web UI, you can view the visual representation of logs and Kubernetes events of your deployment.
To view the Kibana dashboards:
Log in to the Kibana web UI as described in Access StackLight web UIs.
Click the required dashboard to inspect the visualizations or perform a search:
Dashboard |
Description |
---|---|
Logs |
Provides visualizations on the number of log messages per severity, source, and top log-producing host, namespaces, containers, and applications. Includes search. |
Kubernetes events |
Provides visualizations on the number of Kubernetes events per type, and top event-producing resources and namespaces by reason and event type. Includes search. |
This section provides an overview of the available predefined StackLight alerts. To view the alerts, use the Prometheus web UI. To view the firing alerts, use the Alertmanager or Alerta web UI.
Caution
This feature is available starting from the Container Cloud release 2.4.0.
Using alert inhibition rules, Alertmanager decreases alert noise by suppressing dependent alerts notifications to provide a clearer view on the cloud status and simplify troubleshooting. Alert inhibition rules are enabled by default.
The following table describes the dependency between alerts. Once an alert from the Alert column fires, the alert from the Silences column is suppressed with the Inhibited status in the Alertmanager web UI. A generic example of an inhibition rule is provided after the table.
Alert |
Silences |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
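For reference, an Alertmanager inhibition rule generally has the following form in the Alertmanager configuration. The syntax below is the generic upstream Prometheus Alertmanager format, and the alert names and labels are placeholders for illustration only, not the exact rules shipped with StackLight:
inhibit_rules:
- source_match:
    alertname: SourceAlertExample
  target_match:
    alertname: DependentAlertExample
  equal:
  - cluster
  - node
With such a rule, while SourceAlertExample is firing, notifications for DependentAlertExample that carry the same cluster and node labels are suppressed.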
This section describes the alerts for the Alertmanager service.
Severity |
Warning |
---|---|
Summary |
Failure to reload the Alertmanager configuration. |
Description |
Reloading the Alertmanager configuration failed for the
|
Severity |
Major |
---|---|
Summary |
Alertmanager cluster members are not found. |
Description |
Alertmanager has not found all other members of the cluster. |
Severity |
Warning |
---|---|
Summary |
Alertmanager has failed notifications. |
Description |
An average of |
Severity |
Warning |
---|---|
Summary |
Alertmanager has invalid alerts. |
Description |
An average of |
This section describes the alerts for Calico.
Severity |
Warning |
---|---|
Summary |
High number of data plane failures within Felix. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Felix address message batch size is higher than 5. |
Description |
The size of the data plane address message batch on the
|
Severity |
Warning |
---|---|
Summary |
Felix interface message batch size is higher than 5. |
Description |
The size of the data plane interface message batch on the
|
Severity |
Warning |
---|---|
Summary |
More than 5 IPset errors occur in Felix per hour. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
More than 5 iptable save errors occur in Felix per hour. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
More than 5 iptable restore errors occur in Felix per hour. |
Description |
The |
This section describes the alerts for the Ceph cluster.
Severity |
Minor |
---|---|
Summary |
Ceph cluster health is |
Description |
The Ceph cluster is in the |
Severity |
Critical |
---|---|
Summary |
Ceph cluster health is |
Description |
The Ceph cluster is in the |
Severity |
Major |
---|---|
Summary |
Ceph cluster quorum is at risk. |
Description |
The Ceph cluster quorum is low. |
Severity |
Minor |
---|---|
Summary |
Ceph OSDs are down. |
Description |
|
Severity |
Critical |
---|---|
Summary |
Disk is not responding. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Ceph cluster is nearly full. |
Description |
The Ceph cluster utilization has crossed 85%; expansion is required. |
Severity |
Critical |
---|---|
Summary |
Ceph cluster is full. |
Description |
The Ceph cluster utilization has crossed 95%; immediate expansion is required. |
Severity |
Warning |
---|---|
Summary |
Some Ceph OSDs have more than 200 PGs. |
Description |
Some Ceph OSDs contain more than 200 Placement Groups. This may have a
negative impact on the cluster performance. For details, run
|
Severity |
Critical |
---|---|
Summary |
Some Ceph OSDs have more than 300 PGs. |
Description |
Some Ceph OSDs contain more than 300 Placement Groups. This may have a
negative impact on the cluster performance. For details, run
|
Severity |
Warning |
---|---|
Summary |
Too many leader changes occur in the Ceph cluster. |
Description |
|
Severity |
Critical |
---|---|
Summary |
Ceph node |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Multiple versions of Ceph OSDs are running. |
Description |
|
Severity |
Warning |
---|---|
Summary |
Multiple versions of Ceph Monitors are running. |
Description |
|
Severity |
Minor |
---|---|
Summary |
Too many inconsistent Ceph PGs. |
Description |
The Ceph cluster detects inconsistencies in one or more replicas of an
object in |
Severity |
Minor |
---|---|
Summary |
Too many undersized Ceph PGs. |
Description |
The Ceph cluster reports |
This section describes the alerts for the Docker Swarm service.
Severity |
Major |
---|---|
Summary |
Docker Swarm Manager leadership election loop. |
Description |
More than 2 Docker Swarm leader elections occurred within the last 10 minutes. |
Severity |
Warning |
---|---|
Summary |
Docker Swarm network is unhealthy. |
Description |
The Note For the |
Severity |
Major |
---|---|
Summary |
Docker Swarm node is flapping. |
Description |
The |
Severity |
Major |
---|---|
Summary |
Docker Swarm replica is down. |
Description |
The |
Severity |
Major |
---|---|
Summary |
Docker Swarm service replica is flapping. |
Description |
The |
Severity |
Critical |
---|---|
Summary |
Docker Swarm service outage. |
Description |
All |
This section describes the alerts for the Elasticsearch service.
Severity |
Critical |
---|---|
Summary |
Elasticsearch heap usage is too high (>90%). |
Description |
Elasticsearch heap usage is over 90% for 5 minutes. |
Severity |
Warning |
---|---|
Summary |
Elasticsearch heap usage is high (>80%). |
Description |
Elasticsearch heap usage is over 80% for 5 minutes. |
Severity |
Critical |
---|---|
Summary |
Elasticsearch critical status. |
Description |
The Elasticsearch cluster status has changed to |
Severity |
Warning |
---|---|
Summary |
Elasticsearch warning status. |
Description |
The Elasticsearch cluster status has changed to |
Severity |
Warning |
---|---|
Summary |
Shards relocation takes more than 20 minutes. |
Description |
Elasticsearch has |
Severity |
Warning |
---|---|
Summary |
Shards initialization takes more than 10 minutes. |
Description |
Elasticsearch has |
Severity |
Major |
---|---|
Summary |
Shards have unassigned status for 5 minutes. |
Description |
Elasticsearch has |
Severity |
Warning |
---|---|
Summary |
Tasks have pending state for 10 minutes. |
Description |
Elasticsearch has |
Severity |
Major |
---|---|
Summary |
Elasticsearch cluster has no new data for 30 minutes. |
Description |
No new data has arrived to the Elasticsearch cluster for 30 minutes. |
Severity |
Warning |
---|---|
Summary |
Elasticsearch node has no new data for 30 minutes. |
Description |
No new data has arrived to the |
This section describes the alerts for the etcd service.
Severity |
Critical |
---|---|
Summary |
The etcd cluster has insufficient members. |
Description |
The |
Severity |
Critical |
---|---|
Summary |
The etcd cluster has no leader. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
More than 3 leader changes occurred in the etcd cluster within the last hour. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The etcd cluster has slow gRPC requests. |
Description |
The gRPC requests to |
Severity |
Warning |
---|---|
Summary |
The etcd cluster has slow member communication. |
Description |
The member communication with |
Severity |
Warning |
---|---|
Summary |
The etcd cluster has more than 5 proposal failures. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The etcd cluster has high fsync duration. |
Description |
The duration of 99% of all fsync operations on the
|
Severity |
Warning |
---|---|
Summary |
The etcd cluster has high commit duration. |
Description |
The duration of 99% of all commit operations on the
|
This section describes the alerts for external endpoints.
Severity |
Critical |
---|---|
Summary |
External endpoint is down. |
Description |
The |
Severity |
Critical |
---|---|
Summary |
Failure to establish a TCP or TLS connection. |
Description |
The system cannot establish a TCP or TLS connection to
|
This section lists the general alerts.
Severity |
Critical |
---|---|
Summary |
The |
Description |
The |
Severity |
Critical |
---|---|
Summary |
The |
Description |
The |
Severity |
Critical |
---|---|
Summary |
The |
Description |
The |
Severity |
None |
---|---|
Summary |
Watchdog alert that is always firing. |
Description |
This alert ensures that the entire alerting pipeline is functional.
This alert should always be firing in Alertmanager against a receiver.
Some integrations with various notification mechanisms can send a
notification when this alert is not firing. For example, the
|
This section lists the general alerts for Kubernetes nodes.
Severity |
Critical |
---|---|
Summary |
Node uses 95% of file descriptors. |
Description |
The |
Severity |
Major |
---|---|
Summary |
Node uses 90% of file descriptors. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Node uses 80% of file descriptors. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
High CPU consumption. |
Description |
The average CPU consumption on the |
Severity |
Warning |
---|---|
Summary |
System load is more than 1 per CPU. |
Description |
The system load per CPU on the |
Severity |
Critical |
---|---|
Summary |
System load is more than 2 per CPU. |
Description |
The system load per CPU on the |
Severity |
Warning |
---|---|
Summary |
Disk partition |
Description |
The |
Severity |
Major |
---|---|
Summary |
Disk partition |
Description |
The |
Severity |
Warning |
---|---|
Summary |
More than 90% of memory is used or less than 8 GB is available. |
Description |
The |
Severity |
Major |
---|---|
Summary |
More than 95% of memory is used or less than 4 GB of memory is available. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
This section describes the alerts for Ironic bare metal. The alerted events include the Ironic API availability and the Ironic processes availability.
Severity |
Major |
---|---|
Summary |
Ironic metrics missing. |
Description |
Metrics retrieved from the Ironic API are not available for 2 minutes. |
Severity |
Critical |
---|---|
Summary |
Ironic API outage. |
Description |
The Ironic API is not accessible. |
This section lists the alerts for Kubernetes applications.
Severity |
Critical |
---|---|
Summary |
The |
Description |
The |
Severity |
Critical |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
The |
Severity |
Critical |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
Only |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Minor |
---|---|
Summary |
The |
Description |
The |
Severity |
Minor |
---|---|
Summary |
The |
Description |
The |
This section lists the alerts for Kubernetes resources.
Severity |
Warning |
---|---|
Summary |
Kubernetes has overcommitted CPU requests. |
Description |
The Kubernetes cluster has overcommitted CPU resource requests for Pods and cannot tolerate node failure. |
Severity |
Warning |
---|---|
Summary |
Kubernetes has overcommitted memory requests. |
Description |
The Kubernetes cluster has overcommitted memory resource requests for Pods and cannot tolerate node failure. |
Severity |
Warning |
---|---|
Summary |
Kubernetes has overcommitted CPU requests for namespaces. |
Description |
The Kubernetes cluster has overcommitted CPU resource requests for namespaces. |
Severity |
Warning |
---|---|
Summary |
Kubernetes has overcommitted memory requests for namespaces. |
Description |
The Kubernetes cluster has overcommitted memory resource requests for namespaces. |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
This section lists the alerts for Kubernetes storage.
Caution
Due to the upstream bug in Kubernetes, metrics for the KubePersistentVolumeUsageCritical and KubePersistentVolumeFullInFourDays alerts that are collected for persistent volumes provisioned by cinder-csi-plugin are not available.
Severity |
Critical |
---|---|
Summary |
The |
Description |
The PersistentVolume claimed by |
Severity |
Warning |
---|---|
Summary |
The |
Description |
Based on the recent sampling, the |
Severity |
Critical |
---|---|
Summary |
The status of the |
Description |
The status of the |
This section lists the alerts for the Kubernetes system.
Severity |
Warning |
---|---|
Summary |
The |
Description |
The Kubernetes |
Severity |
Warning |
---|---|
Summary |
Kubernetes components have mismatching versions. |
Description |
Kubernetes has components with |
Severity |
Warning |
---|---|
Summary |
Kubernetes API client has more than 1% of error requests. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
kubelet reached 90% of Pods limit. |
Description |
The |
Severity |
Critical |
---|---|
Summary |
Kubernetes API endpoint is down. |
Description |
The Kubernetes API endpoint |
Severity |
Critical |
---|---|
Summary |
Kubernetes API is down. |
Description |
The Kubernetes API is not accessible for the last 30 seconds. |
Severity |
Warning |
---|---|
Summary |
The API server has a 99th percentile latency of more than 1 second. |
Description |
The API server has a 99th percentile latency of |
Severity |
Major |
---|---|
Summary |
The API server has a 99th percentile latency of more than 4 seconds. |
Description |
The API server has a 99th percentile latency of |
Severity |
Major |
---|---|
Summary |
API server returns errors for more than 3% of requests. |
Description |
The API server returns errors for |
Severity |
Warning |
---|---|
Summary |
API server returns errors for more than 1% of requests. |
Description |
The API server returns errors for |
Severity |
Major |
---|---|
Summary |
API server returns errors for 10% of requests. |
Description |
The API server returns errors for |
Severity |
Warning |
---|---|
Summary |
API server returns errors for 5% of requests. |
Description |
The API server returns errors for |
Severity |
Warning |
---|---|
Summary |
A client certificate expires in 7 days. |
Description |
A client certificate used to authenticate to the API server expires in less than 7 days. |
Severity |
Critical |
---|---|
Summary |
A client certificate expires in 24 hours. |
Description |
A client certificate used to authenticate to the API server expires in less than 24 hours. |
Severity |
Warning |
---|---|
Summary |
Failure to get Kubernetes container metrics. |
Description |
Prometheus was not able to scrape metrics from the container on the
|
This section lists the alerts for the Netchecker service.
Severity |
Warning |
---|---|
Summary |
Netchecker has a high number of errors. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The number of agent reports is lower than expected. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The TCP connection to the Netchecker server takes too much time. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The DNS lookup time is too high. |
Description |
The DNS lookup time on the |
This section lists the alerts for the NGINX service.
Severity |
Critical |
---|---|
Summary |
The NGINX service is down. |
Description |
The NGINX service on the |
Severity |
Minor |
---|---|
Summary |
NGINX drops incoming connections. |
Description |
The NGINX service on the |
This section lists the alerts for a Kubernetes node network.
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Warning |
---|---|
Summary |
60 or more received packets were dropped. |
Description |
|
Severity |
Warning |
---|---|
Summary |
100 transmitted packets were dropped. |
Description |
|
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
This section lists the alerts for a Kubernetes node time.
Severity |
Warning |
---|---|
Summary |
The NTP offset reached the limit of 0.03 seconds. |
Description |
Clock skew was detected on the
|
This section lists the alerts for the PostgreSQL and Patroni services.
Severity |
Major |
---|---|
Summary |
Patroni cluster member is experiencing data page corruption. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
PostgreSQL transactions deadlocks. |
Description |
The transactions submitted to the Patroni |
Severity |
Warning |
---|---|
Summary |
Insufficient memory for PostgreSQL queries. |
Description |
The query data does not fit into working memory on the
|
Severity |
Critical |
---|---|
Summary |
Patroni cluster split-brain detected. |
Description |
The |
Severity |
Major |
---|---|
Summary |
Patroni cluster primary node is missing. |
Description |
The primary node of the |
Severity |
Critical |
---|---|
Summary |
PostgreSQL is down on the cluster primary node. |
Description |
The |
Severity |
Minor |
---|---|
Summary |
Patroni cluster has replicas with inoperable PostgreSQL. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Patroni cluster has non-streaming replicas. |
Description |
The |
Severity |
Major |
---|---|
Summary |
Replication has stopped. |
Description |
Replication has stopped on the
|
Severity |
Warning |
---|---|
Summary |
WAL segment application is slow. |
Description |
Slow replication while applying WAL segments on the
|
Severity |
Warning |
---|---|
Summary |
Streaming replication is slow. |
Description |
Slow replication while downloading WAL segments for the
|
Severity |
Major |
---|---|
Summary |
Patroni cluster WAL segment writes are failing. |
Description |
The |
This section describes the alerts for the Prometheus service.
Severity |
Warning |
---|---|
Summary |
Failure to reload the Prometheus configuration. |
Description |
Reloading of the Prometheus configuration has failed for the
|
Severity |
Warning |
---|---|
Summary |
Prometheus alert notification queue is running full. |
Description |
The Prometheus alert notification queue is running full for the
|
Severity |
Warning |
---|---|
Summary |
Errors occur while sending alerts from Prometheus. |
Description |
Errors occur while sending alerts from the
|
Severity |
Major |
---|---|
Summary |
Errors occur while sending alerts from Prometheus. |
Description |
Errors occur while sending alerts from the
|
Severity |
Minor |
---|---|
Summary |
Prometheus is not connected to Alertmanager. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Prometheus has issues reloading data blocks from disk. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Prometheus has issues compacting sample blocks. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Prometheus encountered WAL corruptions. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Prometheus does not ingest samples. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Prometheus has many rejected samples. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
Prometheus failed to evaluate recording rules. |
Description |
The |
This section lists the alerts for the Salesforce notifier service.
Severity |
Critical |
---|---|
Summary |
Failure to authenticate to Salesforce. |
Description |
The |
This section describes the alerts for SMART disks.
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Warning |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
The |
Severity |
Major |
---|---|
Summary |
The |
Description |
The |
This section lists the alerts for SSL certificates.
Severity |
Warning |
---|---|
Summary |
SSL certificate expires in 30 days. |
Description |
The SSL certificate for |
Severity |
Major |
---|---|
Summary |
SSL certificate expires in 10 days. |
Description |
The SSL certificate for |
Severity |
Critical |
---|---|
Summary |
SSL certificate probes are failing. |
Description |
The SSL certificate probes for the |
Severity |
Major |
---|---|
Summary |
SSL certificate for a Container Cloud service expires in 10 days. |
Description |
The SSL certificate for the Container Cloud |
Severity |
Warning |
---|---|
Summary |
SSL certificate for a Container Cloud service expires in 30 days. |
Description |
The SSL certificate for the Container Cloud
|
Severity |
Critical |
---|---|
Summary |
SSL certificate probes for a Container Cloud service are failing. |
Description |
The SSL certificate probes for the Container Cloud
|
Caution
This feature is available starting from the Container Cloud release 2.4.0.
This section lists the alerts for the Telegraf service.
Severity |
Major |
---|---|
Summary |
Telegraf failed to gather metrics. |
Description |
Telegraf has gathering errors for the last 10 minutes. |
This section describes the alerts for the Telemeter service.
Severity |
Warning |
---|---|
Summary |
Telemeter client failed to send data to the server. |
Description |
Telemeter client has failed to send data to the Telemeter server twice
for the last 30 minutes. Verify the |
This section describes the alerts for the Mirantis Kubernetes Engine (MKE) cluster.
Severity |
Critical |
---|---|
Summary |
MKE API endpoint is down. |
Description |
The MKE API endpoint |
Severity |
Critical |
---|---|
Summary |
MKE API is down. |
Description |
The MKE API (port 443) is not accessible for the last minute. |
Severity |
Major |
---|---|
Summary |
MKE container is in the |
Description |
The |
Severity |
Critical |
---|---|
Summary |
MKE node disk is 95% full. |
Description |
The |
Severity |
Warning |
---|---|
Summary |
MKE node disk is 85% full. |
Description |
The |
Severity |
Critical |
---|---|
Summary |
MKE node is down. |
Description |
The |
This section describes the initial steps required for StackLight configuration. For a detailed description of StackLight configuration options, see StackLight configuration parameters.
Download your management cluster kubeconfig:
Log in to the Mirantis Container Cloud web UI with the writer permissions.
Switch to the required project using the Switch Project action icon located on top of the main left-side navigation panel.
Expand the menu of the tab with your user name.
Click Download kubeconfig to download kubeconfig of your management cluster.
Log in to any local machine with kubectl installed.
Copy the downloaded kubeconfig to this machine.
Run one of the following commands:
For a management cluster:
kubectl --kubeconfig <KUBECONFIG_PATH> edit -n <PROJECT_NAME> cluster <MANAGEMENT_CLUSTER_NAME>
For a managed cluster:
kubectl --kubeconfig <KUBECONFIG_PATH> edit -n <PROJECT_NAME> cluster <MANAGED_CLUSTER_NAME>
In the following section of the opened manifest, configure the required StackLight parameters as described in StackLight configuration parameters.
spec:
providerSpec:
value:
helmReleases:
- name: stacklight
values:
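For example, a values section populated with the metricFilter parameters described later in this section might look as follows. This is only an illustration of how the configuration keys nest under values:
spec:
  providerSpec:
    value:
      helmReleases:
      - name: stacklight
        values:
          metricFilter:
            enabled: true
            action: keep
            namespaces:
            - kaas
            - kube-system
            - stacklight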
This section describes the StackLight configuration keys that you can specify in the values section to change StackLight settings as required. Prior to making any changes to the StackLight configuration, perform the steps described in Configure StackLight. After changing the StackLight configuration, verify the changes as described in Verify StackLight after configuration.
Key |
Description |
Example values |
---|---|---|
|
Enables or disables Alerta. Set to |
|
Key |
Description |
Example values |
---|---|---|
|
Defines the Elasticsearch |
|
Key |
Description |
Example values |
---|---|---|
|
Disables Grafana Image Renderer. For example, for resource-limited environments. Enabled by default. |
|
|
Defines the home dashboard. Set to |
|
Key |
Description |
Example values |
---|---|---|
|
Enables or disables the StackLight logging stack. For details about the
logging components, see Reference Architecture:
StackLight deployment architecture. Set to |
|
|
Enables or disables Cerebro, a web UI for interacting with the Elasticsearch cluster that stores logs. To access the Cerebro web UI, see Access Elasticsearch clusters using Cerebro. Note Prior to enabling Cerebro, verify that your Container Cloud cluster has minimum 0.5-1 GB of free RAM and 1 vCPU available. |
|
Key |
Description |
Example values |
---|---|---|
|
Enables or disables StackLight multiserver mode. For details, see
StackLight database modes in Reference Architecture:
StackLight deployment architecture. Set to |
|
Key |
Description |
Example values |
---|---|---|
|
Disables or enables the metric collector. Modify this parameter
for the management cluster only. Set to |
|
Key |
Description |
Example values |
---|---|---|
|
Defines the Prometheus database retention period. Passed to the
|
|
|
Defines the Prometheus database retention size. Passed to the
|
|
|
Defines the minimum amount of time for Prometheus to wait before
resending an alert to Alertmanager. Passed to the
|
|
Key |
Description |
Example values |
---|---|---|
|
Specifies the approximate expected cluster size. Set to
|
|
Key |
Description |
Example values |
---|---|---|
|
Provides the capability to override the default resource requests or limits for any StackLight component for the predefined cluster sizes. For a list of StackLight components, see Components versions in Release Notes: Cluster releases. |
resourcesPerClusterSize:
elasticsearch:
small:
limits:
cpu: "1000m"
memory: "4Gi"
medium:
limits:
cpu: "2000m"
memory: "8Gi"
requests:
cpu: "1000m"
memory: "4Gi"
large:
limits:
cpu: "4000m"
memory: "16Gi"
|
|
Provides the capability to override the containers resource requests or limits for any StackLight component. For a list of StackLight components, see Components versions in Release Notes: Cluster releases. |
resources:
alerta:
requests:
cpu: "50m"
memory: "200Mi"
limits:
memory: "500Mi"
Using the example above, each pod in the |
Key |
Description |
Example values |
---|---|---|
|
Kubernetes tolerations to add to all StackLight components. |
default:
- key: "com.docker.ucp.manager"
operator: "Exists"
effect: "NoSchedule"
|
|
Defines Kubernetes tolerations (overrides the default ones) for any StackLight component. |
component:
elasticsearch:
- key: "com.docker.ucp.manager"
operator: "Exists"
effect: "NoSchedule"
postgresql:
- key: "node-role.kubernetes.io/master"
operator: "Exists"
effect: "NoSchedule"
|
Key |
Description |
Example values |
---|---|---|
|
Defines the |
|
|
Defines (overrides the |
componentStorageClasses:
elasticsearch: ""
fluentd: ""
postgresql: ""
prometheusAlertManager: ""
prometheusPushGateway: ""
prometheusServer: ""
|
Key |
Description |
Example values |
---|---|---|
|
Defines the |
default:
role: stacklight
|
|
Defines the |
component:
alerta:
role: stacklight
component: alerta
kibana:
role: stacklight
component: kibana
|
On managed clusters with limited Internet access, a proxy is required for the StackLight components that use HTTP or HTTPS, are disabled by default, and need external access when enabled, for example, for the Salesforce integration and the Alertmanager notifications external rules.
Key |
Description |
Example values |
---|---|---|
|
Unique cluster identifier
Note
|
|
|
Enables or disables reporting of Prometheus metrics to Salesforce. For details, see StackLight deployment architecture. Disabled by default. |
|
|
Salesforce parameters and credentials for the metrics reporting integration. |
Note Modify this parameter if salesForceAuth:
url: "<SF instance URL>"
username: "<SF account email address>"
password: "<SF password>"
environment_id: "<Cloud identifier>"
organization_id: "<Organization identifier>"
sandbox_enabled: "<Set to true or false>"
|
|
Defines the Kubernetes cron job for sending metrics to Salesforce. By default, reports are sent at midnight server time. |
cronjob:
schedule: "0 0 * * *"
concurrencyPolicy: "Allow"
failedJobsHistoryLimit: ""
successfulJobsHistoryLimit: ""
startingDeadlineSeconds: 200
|
Key |
Description |
Example values |
---|---|---|
|
Enables or disables Ceph monitoring. Set to |
|
Key |
Description |
Example values |
---|---|---|
|
Enables or disables HTTP endpoints monitoring. If enabled, the
monitoring tool performs the probes against the defined endpoints every
15 seconds. Set to |
|
|
Defines the directory path with external endpoints certificates on host. |
|
|
Defines the list of HTTP endpoints to monitor. |
domains:
- https://prometheus.io_health
- http://example.com:8080_status
- http://example.net:8080_pulse
|
Key |
Description |
Example values |
---|---|---|
|
Enables or disables monitoring of bare metal Ironic. To enable, specify the Ironic API URL. |
|
|
Defines whether to skip the chain and host verification. Set to
|
|
Key |
Description |
Example values |
---|---|---|
|
Enables or disables StackLight to monitor and alert on the expiration
date of the TLS certificate of an HTTPS endpoint. If enabled, the
monitoring tool performs the probes against the defined endpoints every
hour. Set to |
|
|
Defines the list of HTTPS endpoints to monitor the certificates from. |
domains:
- https://prometheus.io
- https://example.com:8080
|
Key |
Description |
Example values |
---|---|---|
|
On the clusters that run large-scale workloads, workload monitoring generates a big amount of resource-consuming metrics. To prevent generation of excessive metrics, you can disable workload monitoring in the StackLight metrics and monitor only the infrastructure. The |
metricFilter:
enabled: true
action: keep
namespaces:
- kaas
- kube-system
- stacklight
|
Key |
Description |
Example values |
---|---|---|
|
Enables or disables Mirantis Kubernetes Engine (MKE) monitoring.
Set to |
|
|
Defines the dockerd data root directory of persistent Docker state. For details, see Docker documentation: Daemon CLI (dockerd). |
|
Key |
Description |
Example values |
---|---|---|
|
Defines custom alerts. Also, modifies or disables existing alert configurations. For the list of predefined alerts, see Available StackLight alerts. While adding or modifying alerts, follow the Alerting rules. |
customAlerts:
# To add a new alert:
- alert: ExampleAlert
annotations:
description: Alert description
summary: Alert summary
expr: example_metric > 0
for: 5m
labels:
severity: warning
# To modify an existing alert expression:
- alert: AlertmanagerFailedReload:
expr: alertmanager_config_last_reload_successful == 5
# To disable an existing alert:
- alert: TargetDown
enabled: false
An optional field |
Key |
Description |
Example values |
---|---|---|
|
Enables or disables the |
|
On managed clusters with limited Internet access, a proxy is required for the StackLight components that use HTTP or HTTPS, are disabled by default, and need external access when enabled, for example, for the Salesforce integration and the Alertmanager notifications external rules.
Key |
Description |
Example values |
---|---|---|
|
Provides a generic template for notifications receiver configurations. For a list of supported receivers, see Prometheus Alertmanager documentation: Receiver. |
For example, to enable notifications to OpsGenie: alertmanagerSimpleConfig:
genericReceivers:
- name: HTTP-opsgenie
enabled: true # optional
opsgenie_configs:
- api_url: "https://example.app.eu.opsgenie.com/"
api_key: "secret-key"
send_resolved: true
|
|
Provides a template for notifications route configuration. For details, see Prometheus Alertmanager documentation: Route. |
genericRoutes:
- receiver: HTTP-opsgenie
enabled: true # optional
match_re:
severity: major|critical
continue: true
|
|
Disables or enables alert inhibition rules. If enabled, Alertmanager decreases alert noise by suppressing dependent alerts notifications to provide a clearer view on the cloud status and simplify troubleshooting. Enabled by default. For details, see Alert dependencies. For details on inhibition rules, see Prometheus documentation. |
|
Key |
Description |
Example values |
---|---|---|
|
Enables or disables Alertmanager integration with email. Set to
|
|
|
Defines the notification parameters for Alertmanager integration with email. For details, see Prometheus Alertmanager documentation: Email configuration. |
email:
enabled: false
send_resolved: true
to: "to@test.com"
from: "from@test.com"
smarthost: smtp.gmail.com:587
auth_username: "from@test.com"
auth_password: password
auth_identity: "from@test.com"
require_tls: true
|
|
Defines the route for Alertmanager integration with email. For details, see Prometheus Alertmanager documentation: Route. |
route:
match: {}
match_re: {}
routes: []
|
On managed clusters with limited Internet access, a proxy is required for the StackLight components that use HTTP or HTTPS, are disabled by default, and need external access when enabled, for example, for the Salesforce integration and the Alertmanager notifications external rules.
Key |
Description |
Example values |
---|---|---|
|
Unique cluster identifier
Note
|
|
|
Enables or disables Alertmanager integration with Salesforce using the
|
|
|
Defines the Salesforce parameters and credentials for integration with Alertmanager. |
auth:
url: "<SF instance URL>"
username: "<SF account email address>"
password: "<SF password>"
environment_id: "<Cloud identifier>"
organization_id: "<Organization identifier>"
sandbox_enabled: "<Set to true or false>"
|
|
Defines the notifications route for Alertmanager integration with Salesforce. For details, see Prometheus Alertmanager documentation: Route. |
route:
match: {}
match_re: {}
routes: []
|
On managed clusters with limited Internet access, a proxy is required for the StackLight components that use HTTP or HTTPS, are disabled by default, and need external access when enabled, for example, for the Salesforce integration and the Alertmanager notifications external rules.
Key |
Description |
Example values |
---|---|---|
|
Enables or disables Alertmanager integration with Slack. For
details, see Prometheus Alertmanager documentation: Slack configuration.
Set to |
|
|
Defines the Slack webhook URL. |
|
|
Defines the Slack channel or user to send notifications to. |
|
|
Defines the notifications route for Alertmanager integration with Slack. For details, see Prometheus Alertmanager documentation: Route. |
route:
match: {}
match_re: {}
routes: []
|
This section describes how to verify StackLight after configuring its parameters as described in Configure StackLight and StackLight configuration parameters. Perform the verification procedure described for a particular modified StackLight key.
To verify StackLight after configuration:
Key |
Verification procedure |
---|---|
|
Verify that Alerta is present in the list of StackLight resources. An empty output indicates that Alerta is disabled. kubectl get all -n stacklight -l app=alerta
|
|
Verify that the kubectl get cm elasticsearch-curator-config -n \
stacklight -o=jsonpath='{.data.action_file\.yml}'
|
|
Verify the Grafana Image Renderer. If set to kubectl logs -f -n stacklight -l app=grafana --container grafana-renderer
|
|
In the Grafana web UI, verify that the desired dashboard is set as a home dashboard. |
|
Verify that Elasticsearch, Fluentd, and Kibana are present in the list of StackLight resources. An empty output indicates that the StackLight logging stack is disabled. kubectl get all -n stacklight -l 'app in
(elasticsearch-master,kibana,fluentd-elasticsearch)'
|
|
Run kubectl get sts -n stacklight. The output includes the number of services replicas for the HA or non-HA StackLight modes. For details, see StackLight deployment architecture. |
|
Verify that metric collector is present in the list of StackLight resources. An empty output indicates that metric collector is disabled. kubectl get all -n stacklight -l app=metric-collector
|
|
|
|
|
|
Verify that the appropriate components pods are located on the intended nodes: kubectl get pod -o=custom-columns=NAME:.metadata.name,\
STATUS:.status.phase,NODE:.spec.nodeName -n stacklight
|
|
Verify that the appropriate components PVCs have been created according
to the configured kubectl get pvc -n stacklight
|
|
|
|
|
|
|
|
In the Grafana web UI, verify that the Ironic BM dashboard displays valuable data (no false-positive or empty panels). |
|
|
|
|
|
|
|
In the Prometheus web UI, navigate to Alerts and verify that
the |
|
In the Prometheus web UI, navigate to Alerts and verify that the list of alerts has changed according to your customization. |
|
In the Prometheus web UI, navigate to Alerts and verify that
the list of alerts contains the |
|
In the Alertmanager web UI, navigate to Status and verify that the Config section contains the intended receiver(s). |
|
In the Alertmanager web UI, navigate to Status and verify that the Config section contains the intended route(s). |
|
Run the following command. An empty output indicates either a failure or that the feature is disabled. kubectl get cm -n stacklight prometheus-alertmanager -o yaml | grep -A 6 inhibit_rules
|
|
In the Alertmanager web UI, navigate to Status and verify
that the Config section contains the |
|
|
|
In the Alertmanager web UI, navigate to Status and verify
that the Config section contains the |
Caution
This feature is available starting from the Container Cloud release 2.5.0.
Note
Prior to enabling Cerebro, verify that your Container Cloud cluster has minimum 0.5-1 GB of free RAM and 1 vCPU available.
Cerebro is a web UI for Elasticsearch, allowing visual inspection of and interaction with an Elasticsearch cluster, useful for evaluating its health and convenient debugging. Cerebro is disabled by default. Mirantis recommends that you enable Cerebro if needed, for example, if your Elasticsearch cluster encounters an issue, and disable it afterward.
To enable or disable Cerebro, set the logging.cerebro parameter to true or false as described in Configure StackLight and StackLight configuration parameters.
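For example, a minimal sketch of the corresponding stacklight Helm values, assuming only the logging.cerebro key path stated above:
values:
  logging:
    cerebro: true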
To access the Cerebro web UI:
Obtain the host IP address:
Log in to the Container Cloud web UI with the writer or operator permissions and switch to the required project.
From the Clusters page, click the required cluster.
Click the ⋮ action icon in the last column of any machine of the manager type.
Click Machine info and copy the Host IP.
Log in to a local machine where your management cluster kubeconfig is located and where kubectl is installed.
Obtain the cluster network CIDR:
kubectl get cluster kaas-mgmt -o jsonpath='{.spec.clusterNetwork.services.cidrBlocks}'
Create an SSH tunnel to the host, for example, using sshuttle:
Note
This step requires SSH access to Container Cloud hosts.
sshuttle -r ubuntu@<HOST_IP> <CIDR>
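For example, with a hypothetical host IP of 172.16.10.10 and a services CIDR of 10.233.0.0/18 obtained in the previous steps:
sshuttle -r ubuntu@172.16.10.10 10.233.0.0/18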
Obtain the Cerebro IP address:
kubectl get svc -n stacklight cerebro -o jsonpath='{.spec.clusterIP}'
Paste the Cerebro IP address in a web browser.
StackLight can scrape metrics from any service that exposes Prometheus metrics and is running on the Kubernetes cluster. Such metrics appear in Prometheus under the {job="stacklight-generic",service="<service_name>",namespace="<service_namespace>"} set of labels. If the Kubernetes service is backed by Kubernetes pods, the set of labels also includes {pod="<pod_name>"}.
To enable the functionality, define at least one of the following annotations in the service metadata:
"generic.stacklight.mirantis.com/scrape-port" - the HTTP endpoint port. By default, the port number found through the Kubernetes service discovery, usually __meta_kubernetes_pod_container_port_number. If none is discovered, the default port for the chosen scheme is used.
"generic.stacklight.mirantis.com/scrape-path" - the HTTP endpoint path, related to the Prometheus scrape_config.metrics_path option. By default, /metrics.
"generic.stacklight.mirantis.com/scrape-scheme" - the HTTP endpoint scheme, either http or https, related to the Prometheus scrape_config.scheme option. By default, http.
For example:
metadata:
annotations:
"generic.stacklight.mirantis.com/scrape-path": "/metrics"
metadata:
annotations:
"generic.stacklight.mirantis.com/scrape-port": "8080"
This section outlines Ceph LCM operations such as adding Ceph Monitor, Ceph nodes, and RADOS Gateway nodes to an existing Ceph cluster or removing them, as well as removing or replacing Ceph OSDs or updating your Ceph cluster.
Ceph controller can automatically redeploy Ceph OSDs in case of significant configuration changes such as changing the block.db device or replacing Ceph OSDs. Ceph controller can also clean disks and configuration during a Ceph OSD removal.
To remove a single Ceph OSD or the entire Ceph node, manually remove its definition from the kaasCephCluster CR.
To enable automated management of Ceph OSDs:
Log in to a local machine running Ubuntu 18.04 where kubectl is installed.
Obtain and export kubeconfig of the management cluster as described in Connect to a Mirantis Container Cloud cluster.
Open the KaasCephCluster CR for editing. Choose from the following options:
For a management cluster:
kubectl edit kaascephcluster
For a managed cluster:
kubectl edit kaascephcluster -n <managedClusterProjectName>
Substitute <managedClusterProjectName> with the corresponding value.
Set the manageOsds parameter to true:
spec:
cephClusterSpec:
manageOsds: true
Once done, all Ceph OSDs with a modified configuration will be redeployed. Mirantis recommends modifying only one Ceph node at a time. For details about supported configuration parameters, see OSD Configuration Settings.
Mirantis Ceph controller simplifies Ceph cluster management by automating LCM operations. To modify Ceph components, only an update of the MiraCeph custom resource (CR) is required. Once you update the MiraCeph CR, the Ceph controller automatically adds, removes, or reconfigures Ceph nodes as required.
Note
When adding a Ceph node with the Ceph Monitor role, if any issues occur with the Ceph Monitor, rook-ceph removes it and adds a new Ceph Monitor instead, named using the next alphabetic character in order. Therefore, the Ceph Monitor names may not follow the alphabetical order. For example, a, b, d instead of a, b, c.
To add, remove, or reconfigure Ceph nodes on a management or managed cluster:
To modify Ceph OSDs, verify that the manageOsds parameter is set to true in the KaasCephCluster CR as described in Enable automated Ceph LCM.
Log in to a local machine running Ubuntu 18.04 where kubectl is installed.
Obtain and export kubeconfig of the management cluster as described in Connect to a Mirantis Container Cloud cluster.
Open the KaasCephCluster CR for editing. Choose from the following options:
For a management cluster:
kubectl edit kaascephcluster
For a managed cluster:
kubectl edit kaascephcluster -n <managedClusterProjectName>
Substitute <managedClusterProjectName> with the corresponding value.
In the nodes section, specify or remove the parameters for a Ceph OSD as required. For the parameters description, see OSD Configuration Settings.
For example:
nodes:
kaas-mgmt-node-5bgk6:
roles:
- mon
- mgr
storageDevices:
- config:
storeType: bluestore
name: sdb
Note
To use a new Ceph node for a Ceph Monitor or Ceph Manager deployment, also specify the roles parameter.
If you are making changes for your managed cluster, obtain and export kubeconfig of the managed cluster as described in Connect to a Mirantis Container Cloud cluster. Otherwise, skip this step.
Monitor the status of your Ceph cluster deployment. For example:
kubectl -n rook-ceph get pods
kubectl -n ceph-lcm-mirantis logs ceph-controller-78c95fb75c-dtbxk
kubectl -n rook-ceph logs rook-ceph-operator-56d6b49967-5swxr
Connect to the terminal of the ceph-tools pod:
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod \
-l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
Verify that the Ceph node has been successfully added, removed, or reconfigured:
Verify that the Ceph cluster status is healthy:
ceph status
Example of a positive system response:
cluster:
id: 0868d89f-0e3a-456b-afc4-59f06ed9fbf7
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 20h)
mgr: a(active, since 20h)
osd: 9 osds: 9 up (since 20h), 9 in (since 2d)
data:
pools: 1 pools, 32 pgs
objects: 0 objects, 0 B
usage: 9.1 GiB used, 231 GiB / 240 GiB avail
pgs: 32 active+clean
Verify that the status of the Ceph OSDs is up:
ceph osd tree
Example of a positive system response:
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.23424 root default
-3 0.07808 host osd1
1 hdd 0.02930 osd.1 up 1.00000 1.00000
3 hdd 0.01949 osd.3 up 1.00000 1.00000
6 hdd 0.02930 osd.6 up 1.00000 1.00000
-15 0.07808 host osd2
2 hdd 0.02930 osd.2 up 1.00000 1.00000
5 hdd 0.01949 osd.5 up 1.00000 1.00000
8 hdd 0.02930 osd.8 up 1.00000 1.00000
-9 0.07808 host osd3
0 hdd 0.02930 osd.0 up 1.00000 1.00000
4 hdd 0.01949 osd.4 up 1.00000 1.00000
7 hdd 0.02930 osd.7 up 1.00000 1.00000
After a physical disk replacement, you can use Rook to redeploy a failed Ceph OSD by restarting rook-operator, which triggers the reconfiguration of the management or managed cluster.
To redeploy a failed Ceph OSD:
Log in to a local machine running Ubuntu 18.04 where kubectl is installed.
Obtain and export kubeconfig of the required management or managed cluster as described in Connect to a Mirantis Container Cloud cluster.
Identify the failed Ceph OSD ID:
ceph osd tree
Remove the Ceph OSD deployment from the management or managed cluster:
kubectl delete deployment -n rook-ceph rook-ceph-osd-<ID>
Connect to the terminal of the ceph-tools pod:
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod \
-l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
Remove the failed Ceph OSD from the Ceph cluster:
ceph osd purge osd.<ID>
Replace the failed disk.
Restart the Rook operator:
kubectl delete pod $(kubectl -n rook-ceph get pod -l "app=rook-ceph-operator" \
-o jsonpath='{.items[0].metadata.name}') -n rook-ceph
You can update Ceph cluster to the latest minor version of Ceph Nautilus by triggering the existing Ceph cluster update.
To update Ceph cluster:
Verify that your management cluster is automatically upgraded to the latest Mirantis Container Cloud release:
Log in to the Container Cloud web UI with the writer permissions.
On the bottom of the page, verify the Container Cloud version number.
Verify that your managed clusters are updated to the latest Cluster release. For details, see Update a managed cluster.
Log in to a local machine running Ubuntu 18.04 where kubectl is installed.
Obtain and export kubeconfig of the management cluster as described in Connect to a Mirantis Container Cloud cluster.
Open the KaasCephCluster CR for editing:
kubectl edit kaascephcluster
Update the version parameter. For example:
version: 14.2.9
Obtain and export kubeconfig of the managed clusters as described in Connect to a Mirantis Container Cloud cluster.
Repeat steps 5-7 to update Ceph on every managed cluster.
Ceph controller enables you to deploy RADOS Gateway (RGW) Object Storage instances and automatically manages its resources such as users and buckets. Ceph Object Storage has an integration with OpenStack Object Storage (Swift) in Mirantis OpenStack for Kubernetes (MOS).
To enable the RGW Object Storage:
Select from the following options:
If you do not have a management cluster yet, open kaascephcluster.yaml.template for editing.
If the management cluster is already deployed, open the KaasCephCluster CR for editing. Select from the following options:
If the Ceph cluster is placed in the management cluster:
kubectl edit kaascephcluster
If the Ceph cluster is placed in a managed cluster:
kubectl edit kaascephcluster -n <managedClusterProjectName>
Substitute <managedClusterProjectName> with a corresponding value.
Using the following table, update the rgw section specification as required:
Parameter |
Description |
---|---|
|
Ceph Object Storage instance name. |
|
Object storage data pool spec that should only contain rgw:
dataPool:
replicated:
size: 3
metadataPool:
replicated:
size: 3
where rgw:
dataPool:
erasureCoded:
codingChunks: 1
dataChunks: 2
|
|
Object storage metadata pool spec that should only contain
|
|
The gateway settings corresponding to the
For example: gateway:
allNodes: false
instances: 1
port: 80
securePort: 8443
|
|
Defines whether to delete the data and metadata pools in the |
|
Optional. To create new Ceph RGW resources, such as buckets or users, specify the following keys. Ceph controller will automatically create the specified object storage users and buckets in the Ceph cluster.
|
For example:
rgw:
name: rgw-store
dataPool:
erasureCoded:
codingChunks: 1
dataChunks: 2
failureDomain: host
metadataPool:
failureDomain: host
replicated:
size: 3
gateway:
allNodes: false
instances: 1
port: 80
securePort: 8443
preservePoolsOnDelete: false
This section describes how to verify the components of a Ceph cluster after deployment. For troubleshooting, verify Ceph controller and Rook logs as described in Verify Ceph controller and Rook.
To confirm that all Ceph components including mon, mgr, osd, and rgw have joined your cluster properly, analyze the logs for each pod and verify the Ceph status:
kubectl exec -it rook-ceph-tools-5748bc69c6-cpzf8 -n rook-ceph bash
ceph -s
Example of a positive system response:
cluster:
id: 4336ab3b-2025-4c7b-b9a9-3999944853c8
health: HEALTH_OK
services:
mon: 3 daemons, quorum a,b,c (age 20m)
mgr: a(active, since 19m)
osd: 6 osds: 6 up (since 16m), 6 in (since 16m)
rgw: 1 daemon active (miraobjstore.a)
data:
pools: 12 pools, 216 pgs
objects: 201 objects, 3.9 KiB
usage: 6.1 GiB used, 174 GiB / 180 GiB avail
pgs: 216 active+clean
To ensure that rook-discover is running properly, verify that the local-device configmap has been created for each Ceph node specified in the cluster configuration:
Obtain the list of local devices:
kubectl get configmap -n rook-ceph | grep local-device
Example of a system response:
local-device-01 1 30m
local-device-02 1 29m
local-device-03 1 30m
Verify that each device from the list contains information about available devices for the Ceph node deployment:
kubectl describe configmap local-device-01 -n rook-ceph
Example of a positive system response:
Name: local-device-01
Namespace: rook-ceph
Labels: app=rook-discover
rook.io/node=01
Annotations: <none>
Data
====
devices:
----
[{"name":"vdd","parent":"","hasChildren":false,"devLinks":"/dev/disk/by-id/virtio-41d72dac-c0ff-4f24-b /dev/disk/by-path/virtio-pci-0000:00:09.0","size":32212254720,"uuid":"27e9cf64-85f4-48e7-8862-faa7270202ed","serial":"41d72dac-c0ff-4f24-b","type":"disk","rotational":true,"readOnly":false,"Partitions":null,"filesystem":"","vendor":"","model":"","wwn":"","wwnVendorExtension":"","empty":true,"cephVolumeData":"{\"path\":\"/dev/vdd\",\"available\":true,\"rejected_reasons\":[],\"sys_api\":{\"size\":32212254720.0,\"scheduler_mode\":\"none\",\"rotational\":\"1\",\"vendor\":\"0x1af4\",\"human_readable_size\":\"30.00 GB\",\"sectors\":0,\"sas_device_handle\":\"\",\"rev\":\"\",\"sas_address\":\"\",\"locked\":0,\"sectorsize\":\"512\",\"removable\":\"0\",\"path\":\"/dev/vdd\",\"support_discard\":\"0\",\"model\":\"\",\"ro\":\"0\",\"nr_requests\":\"128\",\"partitions\":{}},\"lvs\":[]}","label":""},{"name":"vdb","parent":"","hasChildren":false,"devLinks":"/dev/disk/by-path/virtio-pci-0000:00:07.0","size":67108864,"uuid":"988692e5-94ac-4c9a-bc48-7b057dd94fa4","serial":"","type":"disk","rotational":true,"readOnly":false,"Partitions":null,"filesystem":"","vendor":"","model":"","wwn":"","wwnVendorExtension":"","empty":true,"cephVolumeData":"{\"path\":\"/dev/vdb\",\"available\":false,\"rejected_reasons\":[\"Insufficient space (\\u003c5GB)\"],\"sys_api\":{\"size\":67108864.0,\"scheduler_mode\":\"none\",\"rotational\":\"1\",\"vendor\":\"0x1af4\",\"human_readable_size\":\"64.00 MB\",\"sectors\":0,\"sas_device_handle\":\"\",\"rev\":\"\",\"sas_address\":\"\",\"locked\":0,\"sectorsize\":\"512\",\"removable\":\"0\",\"path\":\"/dev/vdb\",\"support_discard\":\"0\",\"model\":\"\",\"ro\":\"0\",\"nr_requests\":\"128\",\"partitions\":{}},\"lvs\":[]}","label":""},{"name":"vdc","parent":"","hasChildren":false,"devLinks":"/dev/disk/by-id/virtio-e8fdba13-e24b-41f0-9 /dev/disk/by-path/virtio-pci-0000:00:08.0","size":32212254720,"uuid":"190a50e7-bc79-43a9-a6e6-81b173cd2e0c","serial":"e8fdba13-e24b-41f0-9","type":"disk","rotational":true,"readOnly":false,"Partitions":null,"filesystem":"","vendor":"","model":"","wwn":"","wwnVendorExtension":"","empty":true,"cephVolumeData":"{\"path\":\"/dev/vdc\",\"available\":true,\"rejected_reasons\":[],\"sys_api\":{\"size\":32212254720.0,\"scheduler_mode\":\"none\",\"rotational\":\"1\",\"vendor\":\"0x1af4\",\"human_readable_size\":\"30.00 GB\",\"sectors\":0,\"sas_device_handle\":\"\",\"rev\":\"\",\"sas_address\":\"\",\"locked\":0,\"sectorsize\":\"512\",\"removable\":\"0\",\"path\":\"/dev/vdc\",\"support_discard\":\"0\",\"model\":\"\",\"ro\":\"0\",\"nr_requests\":\"128\",\"partitions\":{}},\"lvs\":[]}","label":""}]
To verify the state of a Ceph cluster, Ceph controller provides a Kubernetes
API that includes a custom MiraCephLog
resource. The resource contains
information about the state of different components of your Ceph cluster.
To verify the Ceph cluster state:
Obtain kubeconfig
of the management or managed cluster and provide it as
an environment variable:
export KUBECONFIG=<path-to-kubeconfig>
Obtain MiraCephLog:
kubectl get miracephlog rook-ceph -n ceph-lcm-mirantis -o yaml
Verify the state of the required component using the MiraCephLog fields described below. In particular, the resource reports:
- The tail of the Ceph cluster log.
- The string result of the current state of Ceph OSDs. If all OSDs operate properly, the value indicates a healthy state.
- The list of Ceph block pools. Use this list to verify whether all defined pools have been created properly.
The starting point for Ceph troubleshooting is the ceph-controller
and
rook-operator
logs. Once you locate the component that causes issues,
verify the logs of the related pod. This section describes how to verify the
Ceph controller and Rook objects of a Ceph cluster.
To verify Ceph controller and Rook:
Verify data access. Ceph volumes can be consumed directly by Kubernetes workloads and internally, for example, by OpenStack services. To verify the Kubernetes storage:
Verify the available storage classes. The storage classes that are
automatically managed by Ceph controller use the
rook-ceph.rbd.csi.ceph.com
provisioner.
kubectl get storageclass
Example of system response:
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
iam-kaas-iam-data kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 64m
kubernetes-ssd (default) rook-ceph.rbd.csi.ceph.com Delete Immediate false 55m
stacklight-alertmanager-data kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 55m
stacklight-elasticsearch-data kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 55m
stacklight-postgresql-db kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 55m
stacklight-prometheus-data kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 55m
Verify that volumes are properly connected to the pod:
Obtain the list of volumes:
kubectl get persistentvolumeclaims -n kaas
Example of system response:
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ironic-aio-pvc Bound pvc-9132beb2-6a17-4877-af40-06031d52da47 5Gi RWO kubernetes-ssd 62m
ironic-inspector-pvc Bound pvc-e84e9a9e-51b8-4c57-b116-0e1e6a9e7e94 1Gi RWO kubernetes-ssd 62m
mariadb-pvc Bound pvc-fb0dbf01-ee4b-4c88-8b08-901080ee8e14 2Gi RWO kubernetes-ssd 62m
mysql-data-mariadb-server-0 Bound local-pv-d1ecc89d 457Gi RWO iam-kaas-iam-data 62m
mysql-data-mariadb-server-1 Bound local-pv-1f385d17 457Gi RWO iam-kaas-iam-data 62m
mysql-data-mariadb-server-2 Bound local-pv-79a820d7 457Gi RWO iam-kaas-iam-data 62m
For each volume, verify the connection. For example:
kubectl describe pvc ironic-aio-pvc -n kaas
Example of a positive system response:
Name: ironic-aio-pvc
Namespace: kaas
StorageClass: kubernetes-ssd
Status: Bound
Volume: pvc-9132beb2-6a17-4877-af40-06031d52da47
Labels: <none>
Annotations: pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
volume.beta.kubernetes.io/storage-provisioner: rook-ceph.rbd.csi.ceph.com
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 5Gi
Access Modes: RWO
VolumeMode: Filesystem
Events: <none>
Mounted By: dnsmasq-dbd84d496-6fcz4
httpd-0
ironic-555bff5dd8-kb8p2
In case of connection issues, inspect the pod description for the volume information:
kubectl describe pod <crashloopbackoff-pod-name>
Example of system response:
...
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1h 1h 3 default-scheduler Warning FailedScheduling PersistentVolumeClaim is not bound: "mysql-pv-claim" (repeated 2 times)
1h 35s 36 kubelet, 172.17.8.101 Warning FailedMount Unable to mount volumes for pod "wordpress-mysql-918363043-50pjr_default(08d14e75-bd99-11e7-bc4c-001c428b9fc8)": timeout expired waiting for volumes to attach/mount for pod "default"/"wordpress-mysql-918363043-50pjr". list of unattached/unmounted volumes=[mysql-persistent-storage]
1h 35s 36 kubelet, 172.17.8.101 Warning FailedSync Error syncing pod
Verify that the CSI provisioner plugins were started properly and have
the Running
status:
Obtain the list of CSI provisioner plugins:
kubectl -n rook-ceph get pod -l app=csi-rbdplugin-provisioner
Verify the logs of the required CSI provisioner:
kubectl logs -n rook-ceph <csi-provisioner-plugin-name> csi-provisioner
Verify the Ceph cluster status:
Verify that the status of each pod in the ceph-lcm-mirantis and rook-ceph namespaces is Running:
For ceph-lcm-mirantis:
kubectl get pod -n ceph-lcm-mirantis
For rook-ceph:
kubectl get pod -n rook-ceph
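As a quick convenience check, not part of the official procedure, you can list only the pods that are not in the Running phase by using a standard kubectl field selector. Note that this filter does not catch pods that are Running but not Ready:
kubectl get pod -n ceph-lcm-mirantis --field-selector=status.phase!=Running
kubectl get pod -n rook-ceph --field-selector=status.phase!=Running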
Verify Ceph controller. Ceph controller prepares the configuration that Rook uses to deploy the Ceph cluster, managed using the KaaSCephCluster resource. If Rook cannot finish the deployment, verify the Rook operator logs as described in step 4.
List the pods:
kubectl -n ceph-lcm-mirantis get pods
Verify the logs of the required pod:
kubectl -n ceph-lcm-mirantis logs <ceph-controller-pod-name>
Verify the configuration:
kubectl get kaascephcluster -n <managedClusterProjectName> -o yaml
On the managed cluster, verify the MiraCeph
subresource:
kubectl get miraceph -n ceph-lcm-mirantis -o yaml
Verify the Rook operator logs. Rook deploys a Ceph cluster based on custom
resources created by the MiraCeph
controller, such as pools, clients,
cephcluster
, and so on. Rook logs contain details about components
orchestration. For details about the Ceph cluster status and to get access to CLI tools, connect to the ceph-tools pod as described in step 5.
Verify the Rook operators logs:
kubectl -n rook-ceph logs -l app=rook-ceph-operator
Verify the CephCluster configuration:
Note
The MiraCeph controller manages the CephCluster CR. Open the CephCluster CR only for verification and do not modify it manually.
kubectl get cephcluster -n rook-ceph -o yaml
Verify the ceph-tools
pod:
Exec into the ceph-tools pod:
kubectl --kubeconfig <pathToManagedClusterKubeconfig> -n rook-ceph exec -it $(kubectl --kubeconfig <pathToManagedClusterKubeconfig> -n rook-ceph get pod -l app=rook-ceph-tools -o jsonpath='{.items[0].metadata.name}') bash
Verify that CLI commands can run on the ceph-tools
pod:
ceph -s
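Besides ceph -s, other standard Ceph CLI commands are available from the same pod and are often useful for a quick overview, for example:
ceph health detail
ceph df
ceph osd df tree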
Verify hardware:
Through the ceph-tools
pod, obtain the required device in your
cluster:
ceph osd tree
Enter all Ceph OSD pods in the rook-ceph
namespace one by one:
kubectl exec -it -n rook-ceph <osd-pod-name> bash
Verify that the ceph-volume
tool is available on all pods running on
the target node:
ceph-volume lvm list
This section describes how to configure a Ceph cluster through the
KaaSCephCluster
(kaascephclusters.kaas.mirantis.com
) CR during or
after the deployment of a management or managed cluster.
The KaaSCephCluster
CR spec has two sections, cephClusterSpec and k8sCluster, and specifies the nodes to deploy as Ceph components. Based on
the roles definitions in the KaaSCephCluster
CR, Ceph Controller
automatically labels nodes for Ceph Monitors and Managers. Ceph OSDs are
deployed based on the storageDevices
parameter defined for each Ceph node.
For a default KaaSCephCluster
CR, see
templates/bm/kaascephcluster.yaml.template.
For details on how to configure the default template for a baremetal-based
cluster bootstrap, see Deployment Guide: Bootstrap a management
cluster.
To configure a Ceph cluster:
Select from the following options:
If you do not have a management cluster yet, open
kaascephcluster.yaml.template
for editing.
If the management cluster is already deployed, open the KaaSCephCluster CR for editing:
If the Ceph cluster is placed in the management cluster:
kubectl edit kaascephcluster
If the Ceph cluster is placed in a managed cluster:
kubectl edit kaascephcluster -n <managedClusterProjectName>
Substitute <managedClusterProjectName>
with a corresponding value.
Using the tables below, configure the Ceph cluster as required.
Parameter | Description
---|---
cephClusterSpec | Describes a Ceph cluster in the management cluster. For details, see the cephClusterSpec parameters tables below.
k8sCluster | Defines the cluster on which the KaaSCephCluster resource depends. For example:
spec:
  k8sCluster:
    name: kaas-mgmt
    namespace: default
Parameter | Description
---|---
manageOsds | Recommended. Enables automated management of Ceph OSDs. For details, see Enable automated Ceph LCM.
network.clusterNet | Specifies the CIDR for the Ceph OSD replication network.
network.publicNet | Specifies the CIDR for communication between the service and operator.
nodes | Specifies the list of Ceph nodes. For details, see Node parameters. The nodes parameter is a mapping of node names to node specifications, for example:
nodes:
  master-0:
    <node spec>
  master-1:
    <node spec>
  ...
  worker-0:
    <node spec>
pools | Specifies the list of Ceph pools. For details, see Pool parameters.
rgw | Specifies RADOS Gateway, the Ceph Object Storage. For details, see RADOS Gateway parameters.
Example configuration:
spec:
cephClusterSpec:
manageOsds: true
network:
clusterNet: 10.10.10.0/24
publicNet: 10.10.11.0/24
nodes:
master-0:
<node spec>
...
pools:
- <pool spec>
...
Node parameters:
Parameter | Description
---|---
roles | Specifies the Ceph daemon roles, such as Ceph Monitor and Manager, to deploy on the node.
storageDevices | Specifies the list of devices to use for the Ceph OSD deployment on the node. For an illustrative node specification, see the sketch below.
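For illustration only, a node specification combining the parameters above might look as follows. The device names, roles, and configuration layout are hypothetical and depend on your hardware and product version; use templates/bm/kaascephcluster.yaml.template as the authoritative reference:
nodes:
  master-0:
    roles:
    - mon
    - mgr
    storageDevices:
    - name: vdc
      config:
        deviceClass: hdd
    - name: vdd
      config:
        deviceClass: hdd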
Pool parameters:
Parameter | Description
---|---
name | Specifies the pool name that is used as a prefix for each Ceph block pool.
role | Specifies the pool role and is used mostly for Mirantis OpenStack for Kubernetes (MOS) pools.
default | Defines whether the pool and the dependent StorageClass should be set as default. Must be enabled only for one pool.
deviceClass | Specifies the device class for the defined pool. Possible values are HDD, SSD, and NVMe.
replicated.size | The number of pool replicas. Mutually exclusive with erasureCoded.
erasureCoded | Enables the erasure-coded pool. Mutually exclusive with replicated. For details, see Rook documentation: Erasure coded and Ceph documentation: Erasure coded pool.
failureDomain | The failure domain across which the replicas or chunks of data will be spread.
Example configuration:
pools:
- name: kubernetes
role: kubernetes
deviceClass: hdd
replicated:
size: 3
default: true
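As an illustrative alternative, not a verbatim excerpt from the default template, an erasure-coded pool could be defined using the erasureCoded parameters described above; verify the values against your cluster design before use:
pools:
- name: kubernetes-ec
  role: kubernetes
  deviceClass: hdd
  erasureCoded:
    codingChunks: 1
    dataChunks: 2
  failureDomain: host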
To configure additional required pools for MOS, see MOS Deployment Guide: Deploy a Ceph cluster.
RADOS Gateway parameters:
Parameter | Description
---|---
name | Ceph Object Storage instance name.
dataPool | Object storage data pool spec that should contain only a replicated or erasureCoded specification, for example:
rgw:
  dataPool:
    replicated:
      size: 3
  metadataPool:
    replicated:
      size: 3
or, for an erasure-coded data pool:
rgw:
  dataPool:
    erasureCoded:
      codingChunks: 1
      dataChunks: 2
metadataPool | Object storage metadata pool spec that should contain only a replicated specification.
gateway | The gateway settings for the rgw daemon. For example:
gateway:
  allNodes: false
  instances: 1
  port: 80
  securePort: 8443
preservePoolsOnDelete | Defines whether to delete the data and metadata pools in the rgw section if the object storage is deleted.
objectUsers, buckets | Optional. To create new Ceph RGW resources, such as buckets or users, specify the corresponding keys. Ceph controller will automatically create the specified object storage users and buckets in the Ceph cluster.
For configuration example, see Enable Ceph RGW Object Storage.
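For a quick orientation only, a minimal rgw section combining the parameters above might look like the following sketch; the values are illustrative and the verified configuration is provided in Enable Ceph RGW Object Storage:
rgw:
  name: rgw-store
  dataPool:
    erasureCoded:
      codingChunks: 1
      dataChunks: 2
  metadataPool:
    replicated:
      size: 3
  gateway:
    allNodes: false
    instances: 1
    port: 80
    securePort: 8443
  preservePoolsOnDelete: false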
Select from the following options:
If you are bootstrapping a management cluster, save the updated
KaaSCephCluster
template to the
templates/bm/kaascephcluster.yaml.template
file and proceed with the
bootstrap.
If you are creating a managed cluster, save the updated
KaaSCephCluster
template to the corresponding file and proceed with
the managed cluster creation.
If you are configuring KaaSCephCluster
of an existing management
cluster, run the following command:
kubectl apply -f <pathToKaaSCephClusterFile>
If you are configuring KaaSCephCluster
of an existing managed cluster,
run the following command:
kubectl apply -f <pathToKaaSCephClusterFile> -n <managedClusterProjectName>
Substitute <managedClusterProjectName>
with the corresponding value.
This section provides solutions to the issues that may occur while operating a Mirantis Container Cloud management, regional, or managed cluster.
While operating your management, regional, or managed cluster, you may require collecting and inspecting the cluster logs to analyze cluster events or troubleshoot issues. For the logs structure, see Deployment Guide: Collect the bootstrap logs.
To collect cluster logs:
Choose from the following options:
If you did not delete the kaas-bootstrap
folder from the bootstrap
node, log in to the bootstrap node.
If you deleted the kaas-bootstrap
folder:
Log in to a local machine running Ubuntu 18.04
where kubectl
is installed.
Download and run the Container Cloud bootstrap script:
wget https://binary.mirantis.com/releases/get_container_cloud.sh
chmod 0755 get_container_cloud.sh
./get_container_cloud.sh
Obtain kubeconfig
of the required cluster. The management or regional
cluster kubeconfig
files are created during the last stage
of the management or regional cluster bootstrap. To obtain a managed cluster
kubeconfig
, see Connect to a Mirantis Container Cloud cluster.
Obtain the private SSH key of the required cluster. For a management
or regional cluster, this key is created during bootstrap of a management
cluster in ~/.ssh/openstack_tmp.
For a managed cluster, this is an SSH key added in the Container Cloud
web UI before the managed cluster creation.
Depending on the cluster type that you require logs from, run the corresponding command:
For a management cluster:
kaas collect logs --management-kubeconfig <pathToMgmtClusterKubeconfig> \
--key-file <pathToMgmtClusterPrivateSshKey> \
--cluster-name <clusterName> --cluster-namespace <clusterProject>
For a regional cluster:
kaas collect logs --management-kubeconfig <pathToMgmtClusterKubeconfig> \
--key-file <pathToRegionalClusterSshKey> --kubeconfig <pathToRegionalClusterKubeconfig> \
--cluster-name <clusterName> --cluster-namespace <clusterProject>
For a managed cluster:
kaas collect logs --management-kubeconfig <pathToMgmtClusterKubeconfig> \
--key-file <pathToManagedClusterSshKey> --kubeconfig <pathToManagedClusterKubeconfig> \
--cluster-name <clusterName> --cluster-namespace <clusterProject>
Substitute the parameters enclosed in angle brackets with the corresponding values of your cluster.
Optionally, add --output-dir, which specifies a directory path for the collected logs. The default value is logs/. For example, logs/<clusterName>/events.log.
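For example, a management cluster invocation with a custom output directory might look as follows; all paths and names below are placeholders, substitute them with the values of your environment:
kaas collect logs --management-kubeconfig <pathToMgmtClusterKubeconfig> \
  --key-file <pathToMgmtClusterPrivateSshKey> \
  --cluster-name kaas-mgmt --cluster-namespace default \
  --output-dir /tmp/kaas-logs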
Warning
This section is intended only for advanced Infrastructure Operators who are familiar with Kubernetes Cluster API.
Mirantis currently supports only those Mirantis Container Cloud API features that are implemented in the Container Cloud web UI. Use other Container Cloud API features for testing and evaluation purposes only.
The Container Cloud APIs are implemented using the Kubernetes
CustomResourceDefinitions
(CRDs) that enable you to expand
the Kubernetes API. Different types of resources are grouped in the dedicated
files, such as cluster.yaml
or machines.yaml.
This section contains descriptions and examples of the Container Cloud API resources for the bare metal cloud provider.
Note
The API documentation for the OpenStack, AWS, and VMware vSphere resources will be added in the upcoming Container Cloud releases.
This section describes the PublicKey
resource used in Mirantis
Container Cloud API for all supported providers: OpenStack, AWS, and
bare metal. This resource is used to provide SSH access
to every machine of a Container Cloud cluster.
The Container Cloud PublicKey
CR contains the following fields:
apiVersion
API version of the object that is kaas.mirantis.com/v1alpha1
kind
Object type that is PublicKey
metadata
The metadata
object field of the PublicKey
resource contains
the following fields:
name
Name of the public key
namespace
Project where the public key is created
spec
The spec
object field of the PublicKey
resource contains the
publicKey
field that is an SSH public key value.
The PublicKey
resource example:
apiVersion: kaas.mirantis.com/v1alpha1
kind: PublicKey
metadata:
name: demokey
namespace: test
spec:
publicKey: |
ssh-rsa AAAAB3NzaC1yc2EAAAA…
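The created key is then referenced by name from the publicKeys list in the providerSpec section of the Cluster object, as described later in this guide. An illustrative fragment:
spec:
  ...
  providerSpec:
    value:
      publicKeys:
      - name: demokey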
This section contains descriptions and examples of the baremetal-based Kubernetes resources for Mirantis Container Cloud.
This section describes the Cluster
resource used in the Mirantis
Container Cloud API that describes the cluster-level parameters.
For demonstration purposes, the Container Cloud Cluster
custom resource (CR) is split into the following major sections:
Warning
The fields of the Cluster
resource that are located
under the status
section including providerStatus
are available for viewing only.
They are automatically generated by the bare metal cloud provider
and must not be modified using Container Cloud API.
The Container Cloud Cluster
CR contains the following fields:
apiVersion
API version of the object that is cluster.k8s.io/v1alpha1, as shown in the configuration example below.
kind
Object type that is Cluster.
The metadata
object field of the Cluster
resource
contains the following fields:
name
Name of a cluster. A managed cluster name is specified under the
Cluster Name
field in the Create Cluster wizard of the
Container Cloud web UI. The management and regional cluster names are configurable in the bootstrap script.
namespace
Project in which the cluster object was created. The management
and regional clusters are created in the default
project.
The managed cluster project matches the selected project name.
labels
Key-value pairs attached to the object:
kaas.mirantis.com/provider
Provider type that is baremetal
for the baremetal-based clusters.
kaas.mirantis.com/region
Region name. The default region name for the management
cluster is region-one
. For the regional cluster, it is configurable
using the REGION
parameter in the bootstrap script.
Configuration example:
apiVersion: cluster.k8s.io/v1alpha1
kind: Cluster
metadata:
name: demo
namespace: test
labels:
kaas.mirantis.com/provider: baremetal
kaas.mirantis.com/region: region-one
The spec
object field of the Cluster
object
represents the BaremetalClusterProviderSpec
subresource that
contains a complete description of the desired bare metal cluster
state and all details to create the cluster-level
resources. It also contains the fields required for LCM deployment
and integration of the Container Cloud components.
The providerSpec
object field is custom for each cloud provider and
contains the following generic fields for the bare metal provider:
apiVersion
API version of the object that is baremetal.k8s.io/v1alpha1
kind
Object type that is BaremetalClusterProviderSpec
Configuration example:
spec:
...
providerSpec:
value:
apiVersion: baremetal.k8s.io/v1alpha1
kind: BaremetalClusterProviderSpec
The providerSpec
object field of the Cluster
resource
contains the following common fields for all Container Cloud
providers:
publicKeys
List of the SSH public key references
release
Name of the ClusterRelease
object to install on a cluster
helmReleases
List of the enabled Helm releases from the Release
object that run
on a Container Cloud cluster
Configuration example:
spec:
...
providerSpec:
value:
publicKeys:
- name: bootstrap-key
release: ucp-5-7-0-3-3-3-tp11
helmReleases:
- name: metallb
values:
configInline:
address-pools:
- addresses:
- 10.0.0.101-10.0.0.120
name: default
protocol: layer2
...
- name: stacklight
This section represents the Container Cloud components that are enabled on a cluster. It contains the following fields:
management
Configuration for the management cluster components:
enabled
Management cluster enabled (true
) or disabled (false
).
helmReleases
List of the management cluster Helm releases that will be installed
on the cluster. A Helm release includes the name
and values
fields. The specified values will be merged with relevant Helm release
values of the management cluster in the Release
object.
regional
List of regional clusters components on the Container Cloud cluster for each configured provider available for a specific region:
provider
Provider type that is baremetal
.
helmReleases
List of the regional Helm releases that will be installed
on the cluster. A Helm release includes the name
and values
fields. The specified values will be merged with relevant
regional Helm release values in the Release
object.
release
Name of the Container Cloud Release
object.
Configuration example:
spec:
...
providerSpec:
value:
kaas:
management:
enabled: true
helmReleases:
- name: kaas-ui
values:
serviceConfig:
server: https://10.0.0.117
regional:
- helmReleases:
- name: baremetal-provider
values: {}
provider: baremetal
- helmReleases:
- name: byo-provider
values: {}
provider: byo
release: kaas-2-0-0
Must not be modified using API
The common providerStatus
object field of the Cluster
resource
contains the following fields:
apiVersion
API version of the object that is baremetal.k8s.io/v1alpha1
kind
Object type that is BaremetalClusterProviderStatus
loadBalancerHost
Load balancer IP or host name of the Container Cloud cluster
apiServerCertificate
Server certificate of Kubernetes API
ucpDashboard
URL of the Mirantis Kubernetes Engine (MKE) Dashboard
Configuration example:
status:
providerStatus:
apiVersion: baremetal.k8s.io/v1alpha1
kind: BaremetalClusterProviderStatus
loadBalancerHost: 10.0.0.100
apiServerCertificate: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS…
ucpDashboard: https://10.0.0.100:6443
Must not be modified using API
The providerStatus
object field of the Cluster
resource that reflects
the cluster readiness contains the following fields:
persistentVolumesProviderProvisioned
Status of the persistent volumes provisioning.
Prevents the Helm releases that require persistent volumes from being
installed until some default StorageClass
is added to the Cluster
object.
helm
Details about the deployed Helm releases:
ready
Status of the deployed Helm releases. The true
value indicates that
all Helm releases are deployed successfully.
releases
List of the enabled Helm releases that run on the Container Cloud cluster:
releaseStatuses
List of the deployed Helm releases. The success: true
field
indicates that the release is deployed successfully.
stacklight
Status of the StackLight deployment. Contains URLs of all StackLight
components. The success: true
field indicates that StackLight
is deployed successfully.
nodes
Details about the cluster nodes:
ready
Number of nodes that completed the deployment or update.
requested
Total number of nodes. If the number of ready
nodes does not match
the number of requested
nodes, it means that a cluster is being
currently deployed or updated.
notReadyObjects
The list of the services
, deployments
, and statefulsets
Kubernetes objects that are not in the Ready
state yet.
A service
is not ready if its external address has not been provisioned
yet. A deployment
or statefulset
is not ready if the number of
ready replicas is not equal to the number of desired replicas. Both objects
contain the name and namespace of the object and the number of ready and
desired replicas (for controllers). If all objects are ready, the
notReadyObjects
list is empty.
Configuration example:
status:
providerStatus:
persistentVolumesProviderProvisioned: true
helm:
ready: true
releases:
releaseStatuses:
iam:
success: true
...
stacklight:
alerta:
url: http://10.0.0.106
alertmanager:
url: http://10.0.0.107
grafana:
url: http://10.0.0.108
kibana:
url: http://10.0.0.109
prometheus:
url: http://10.0.0.110
success: true
nodes:
ready: 3
requested: 3
notReadyObjects:
services:
- name: testservice
namespace: default
deployments:
- name: baremetal-provider
namespace: kaas
replicas: 3
readyReplicas: 2
statefulsets: {}
Must not be modified using API
The oidc
section of the providerStatus
object field
in the Cluster
resource reflects the Open ID Connect configuration details.
It contains the required details to obtain a token for
a Container Cloud cluster and consists of the following fields:
certificate
Base64-encoded OIDC certificate.
clientId
Client ID for OIDC requests.
groupsClaim
Name of an OIDC groups claim.
issuerUrl
Issuer URL to obtain the representation of the realm.
ready
OIDC status relevance. If true
, the status corresponds to the
LCMCluster
OIDC configuration.
Configuration example:
status:
providerStatus:
oidc:
certificate: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUREekNDQWZ...
clientId: kaas
groupsClaim: iam_roles
issuerUrl: https://10.0.0.117/auth/realms/iam
ready: true
Must not be modified using API
The releaseRefs
section of the providerStatus
object field
in the Cluster
resource provides the current Cluster release version
as well as the one available for upgrade. It contains the following fields:
current
Details of the currently installed Cluster release:
lcmType
Type of the Cluster release (ucp
).
name
Name of the Cluster release resource.
version
Version of the Cluster release.
unsupportedSinceKaaSVersion
Indicates that a Container Cloud release newer than the current one exists and that it does not support the current Cluster release.
available
List of the releases available for upgrade. Contains the name
and
version
fields.
Configuration example:
status:
providerStatus:
releaseRefs:
available:
- name: ucp-5-5-0-3-4-0-dev
version: 5.5.0+3.4.0-dev
current:
lcmType: ucp
name: ucp-5-4-0-3-3-0-beta1
version: 5.4.0+3.3.0-beta1
This section describes the Machine
resource used in Mirantis
Container Cloud API for bare metal provider.
The Machine
resource describes the machine-level parameters.
For demonstration purposes, the Container Cloud Machine
custom resource (CR) is split into the following major sections:
The Container Cloud Machine
CR contains the following fields:
apiVersion
API version of the object that is cluster.k8s.io/v1alpha1
.
kind
Object type that is Machine
.
The metadata
object field of the Machine
resource contains
the following fields:
name
Name of the Machine
object.
namespace
Project in which the Machine
object is created.
annotations
Key-value pair to attach arbitrary metadata to the object:
metal3.io/BareMetalHost
Annotation attached to the Machine
object to reference
the corresponding BareMetalHost
object in the
<BareMetalHostProjectName/BareMetalHostName>
format.
labels
Key-value pairs that are attached to the object:
kaas.mirantis.com/provider
Provider type that matches the provider type in the Cluster
object
and must be baremetal
.
kaas.mirantis.com/region
Region name that matches the region name in the Cluster
object.
cluster.sigs.k8s.io/cluster-name
Cluster name that the Machine
object is linked to.
cluster.sigs.k8s.io/control-plane
For the control plane role of a machine, this label contains any value,
for example, "true"
.
For the worker role, this label is absent or does not contain any value.
Configuration example:
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
metadata:
name: example-control-plane
namespace: example-ns
annotations:
metal3.io/BareMetalHost: default/master-0
labels:
kaas.mirantis.com/provider: baremetal
kaas.mirantis.com/region: region-one
cluster.sigs.k8s.io/cluster-name: example-cluster
cluster.sigs.k8s.io/control-plane: "true" # remove for worker
The spec
object field of the Machine
object represents
the BareMetalMachineProviderSpec
subresource with all required
details to create a bare metal instance. It contains the following fields:
apiVersion
API version of the object that is baremetal.k8s.io/v1alpha1
.
kind
Object type that is BareMetalMachineProviderSpec
.
bareMetalHostProfile
Configuration profile of a bare metal host:
name
Name of a bare metal host profile
namespace
Project in which the bare metal host profile is created.
l2TemplateIfMappingOverride
If specified, overrides the interface mapping value for the corresponding
L2Template
object.
l2TemplateSelector
If specified, contains the name
(first priority) or label
of the L2 template that will be applied during a machine creation.
The l2TemplateSelector
field is copied from the Machine
providerSpec
object to the IpamHost
object only once,
during a machine creation. To modify l2TemplateSelector
after creation
of a Machine
CR, edit the IpamHost
object.
hostSelector
Specifies the matching criteria for labels on the bare metal hosts.
Limits the set of the BareMetalHost
objects considered for
claiming for the Machine
object. The following selector labels
can be added when creating a machine using the Container Cloud web UI:
hostlabel.bm.kaas.mirantis.com/controlplane
hostlabel.bm.kaas.mirantis.com/worker
hostlabel.bm.kaas.mirantis.com/storage
Any custom label that is assigned to one or more bare metal hosts using API
can be used as a host selector. If the BareMetalHost
objects
with the specified label are missing, the Machine
object will not
be deployed until at least one bare metal host with the specified label
is available.
nodeLabels
List of node labels to be attached to the corresponding node. Enables
running of certain components on separate cluster nodes.
The list of allowed node labels is defined in the
providerStatus.releaseRef.current.allowedNodeLabels
cluster status.
Addition of any unsupported node label not from this list is restricted.
Configuration example:
spec:
...
providerSpec:
value:
apiVersion: baremetal.k8s.io/v1alpha1
kind: BareMetalMachineProviderSpec
bareMetalHostProfile:
name: default
namespace: default
l2TemplateIfMappingOverride:
- eno1
- enp0s0
l2TemplateSelector:
label: l2-template1-label-1
hostSelector:
matchLabels:
baremetal: hw-master-0
nodeLabels:
- key: stacklight
value: enabled
The status
object field of the Machine
object represents the
BareMetalMachineProviderStatus
subresource that describes the current
bare metal instance state and contains the following fields:
apiVersion
API version of the object that is cluster.k8s.io/v1alpha1
.
kind
Object type that is BareMetalMachineProviderStatus
.
hardware
Provides a machine hardware information:
cpu
Number of CPUs.
ram
RAM capacity in GB.
storage
List of hard drives mounted on the machine. Contains the disk name and size in GB.
status
Represents the current status of a machine:
Provision
Machine is yet to obtain a status.
Uninitialized
Machine is yet to obtain a node IP address and hostname.
Pending
Machine is yet to receive the deployment instructions. It is either not booted yet or waits for the LCM controller to be deployed.
Prepare
Machine is running the Prepare
phase when mostly Docker images
and packages are being predownloaded.
Deploy
Machine is processing the LCM controller instructions.
Reconfigure
Some configurations are being updated on a machine.
Ready
Machine is deployed and the supported Mirantis Kubernetes Engine (MKE) version is set.
Configuration example:
status:
providerStatus:
apiVersion: baremetal.k8s.io/v1alpha1
kind: BareMetalMachineProviderStatus
hardware:
cpu: 11
ram: 16
storage:
- name: /dev/vda
size: 61
- name: /dev/vdb
size: 32
- name: /dev/vdc
size: 32
status: Ready
This section describes the BareMetalHostProfile
resource used
in Mirantis Container Cloud API
to define how the storage devices and operating system
are provisioned and configured.
For demonstration purposes, the Container Cloud BareMetalHostProfile
custom resource (CR) is split into the following major sections:
The Container Cloud BareMetalHostProfile
CR contains
the following fields:
apiVersion
API version of the object that is metal3.io/v1alpha1
.
kind
Object type that is BareMetalHostProfile
.
metadata
The metadata
field contains the following subfields:
name
Name of the bare metal host profile.
namespace
Project in which the bare metal host profile was created.
Configuration example:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHostProfile
metadata:
name: default
namespace: default
The spec
field of BareMetalHostProfile
object contains
the fields to customize your hardware configuration:
devices
List of definitions of the physical storage devices. To configure more
than three storage devices per host, add additional devices to this list.
Each device
in the list may have one or more
partitions
defined by the list in the partitions
field.
fileSystems
List of file systems. Each file system can be created on top of either device, partition, or logical volume. If more file systems are required for additional devices, define them in this field.
logicalVolumes
List of LVM logical volumes. Every logical volume belongs to a volume
group from the volumeGroups
list and has the sizeGiB
attribute
for size in gigabytes.
volumeGroups
List of definitions of LVM volume groups. Each volume group contains one
or more devices or partitions from the devices
list.
preDeployScript
Shell script that is executed on a host before provisioning the target
operating system inside the ramfs
system.
postDeployScript
Shell script that is executed on a host after deploying the operating
system inside the ramfs
system that is chrooted to the target
operating system.
grubConfig
List of options passed to the Linux GRUB bootloader. Each string in the list defines one parameter.
kernelParameters:sysctl
List of options passed to /etc/sysctl.d/999-baremetal.conf
during bmh
provisioning.
Configuration example:
spec:
devices:
- device:
wipe: true
partitions:
- dev: ""
name: bios_grub
partflags:
- bios_grub
sizeGiB: 0.00390625
...
- device:
wipe: true
partitions:
- dev: ""
name: lvm_lvp_part
fileSystems:
- fileSystem: vfat
partition: config-2
- fileSystem: vfat
mountPoint: /boot/efi
partition: uefi
...
- fileSystem: ext4
logicalVolume: lvp
mountPoint: /mnt/local-volumes/
logicalVolumes:
- name: root
sizeGiB: 0
vg: lvm_root
- name: lvp
sizeGiB: 0
vg: lvm_lvp
postDeployScript: |
#!/bin/bash -ex
echo $(date) 'post_deploy_script done' >> /root/post_deploy_done
preDeployScript: |
#!/bin/bash -ex
echo $(date) 'pre_deploy_script done' >> /root/pre_deploy_done
volumeGroups:
- devices:
- partition: lvm_root_part
name: lvm_root
- devices:
- partition: lvm_lvp_part
name: lvm_lvp
grubConfig:
defaultGrubOptions:
- GRUB_DISABLE_RECOVERY="true"
- GRUB_PRELOAD_MODULES=lvm
- GRUB_TIMEOUT=20
kernelParameters:
sysctl:
kernel.panic: "900"
kernel.dmesg_restrict: "1"
kernel.core_uses_pid: "1"
fs.file-max: "9223372036854775807"
fs.aio-max-nr: "1048576"
fs.inotify.max_user_instances: "4096"
vm.max_map_count: "262144"
This section describes the BareMetalHost
resource used in the
Mirantis Container Cloud API. The BareMetalHost object is created for each Machine and contains all information about the machine hardware configuration. It is used to select which machine to deploy. When a machine is created, the provider assigns a BareMetalHost to that machine based on labels and the BareMetalHostProfile configuration.
For demonstration purposes, the Container Cloud BareMetalHost
custom resource (CR) can be split into the following major sections:
The Container Cloud BareMetalHost
CR contains the following fields:
apiVersion
API version of the object that is metal3.io/v1alpha1
.
kind
Object type that is BareMetalHost
.
metadata
The metadata field contains the following subfields:
name
Name of the BareMetalHost
object.
namespace
Project in which the BareMetalHost
object was created.
labels
Labels used by the bare metal provider to find a matching
BareMetalHost
object to deploy a machine:
hostlabel.bm.kaas.mirantis.com/controlplane
hostlabel.bm.kaas.mirantis.com/worker
hostlabel.bm.kaas.mirantis.com/storage
Each BareMetalHost
object added using the Container Cloud web UI
will be assigned one of these labels. If the BareMetalHost
and
Machine
objects are created using API, any label may be used
to match these objects for a bare metal host to deploy a machine.
Configuration example:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: master-0
namespace: default
labels:
baremetal: hw-master-0
The spec
section for the BareMetalHost
object defines the desired state
of BareMetalHost
. It contains the following fields:
bmc
Details for communication with the Baseboard Management Controller (bmc
)
module on a host:
address
URL for accessing bmc
in the network.
credentialsName
Name of the secret containing the bmc credentials. The secret requires the username and password keys in the Base64 encoding. For an illustrative Secret, see the sketch after the configuration example below.
bootMACAddress
MAC address for booting.
bootUEFI
UEFI boot mode enabled (true
) or disabled (false
).
online
Defines whether the server must be online after inspection.
Configuration example:
spec:
bmc:
address: 5.43.227.106:623
credentialsName: master-0-bmc-secret
bootMACAddress: 0c:c4:7a:a8:d3:44
bootUEFI: true
consumerRef:
apiVersion: cluster.k8s.io/v1alpha1
kind: Machine
name: master-0
namespace: default
online: true
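The bmc secret referenced by credentialsName is a standard Kubernetes Secret with Base64-encoded username and password keys. A minimal sketch with illustrative values only:
apiVersion: v1
kind: Secret
metadata:
  name: master-0-bmc-secret
  namespace: default
type: Opaque
data:
  username: YWRtaW4=     # Base64 of "admin", illustrative
  password: cGFzc3dvcmQ= # Base64 of "password", illustrative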
The status
field of the BareMetalHost
object defines the current
state of BareMetalHost
. It contains the following fields:
errorMessage
Last error message reported by the provisioning subsystem.
goodCredentials
Last credentials that were validated.
hardware
Hardware discovered on the host. Contains information about the storage, CPU, host name, firmware, and so on.
operationalStatus
Status of the host:
OK
Host is configured correctly and is manageable.
discovered
Host is only partially configured. For example, the bmc
address
is discovered but not the login credentials.
error
Host has any sort of error.
poweredOn
Host availability status: powered on (true
) or powered off (false
).
provisioning
State information tracked by the provisioner:
state
Current action being done with the host by the provisioner.
id
UUID of a machine.
triedCredentials
Details of the last credentials sent to the provisioning back end.
Configuration example:
status:
errorMessage: ""
goodCredentials:
credentials:
name: master-0-bmc-secret
namespace: default
credentialsVersion: "13404"
hardware:
cpu:
arch: x86_64
clockMegahertz: 3000
count: 32
flags:
- 3dnowprefetch
- abm
...
model: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
firmware:
bios:
date: ""
vendor: ""
version: ""
hostname: ipa-fcab7472-892f-473c-85a4-35d64e96c78f
nics:
- ip: ""
mac: 0c:c4:7a:a8:d3:45
model: 0x8086 0x1521
name: enp8s0f1
pxe: false
speedGbps: 0
vlanId: 0
...
ramMebibytes: 262144
storage:
- by_path: /dev/disk/by-path/pci-0000:00:1f.2-ata-1
hctl: "4:0:0:0"
model: Micron_5200_MTFD
name: /dev/sda
rotational: false
serialNumber: 18381E8DC148
sizeBytes: 1920383410176
vendor: ATA
wwn: "0x500a07511e8dc148"
wwnWithExtension: "0x500a07511e8dc148"
...
systemVendor:
manufacturer: Supermicro
productName: SYS-6018R-TDW (To be filled by O.E.M.)
serialNumber: E16865116300188
operationalStatus: OK
poweredOn: true
provisioning:
state: provisioned
triedCredentials:
credentials:
name: master-0-bmc-secret
namespace: default
credentialsVersion: "13404"
This section describes the IpamHost
resource used in Mirantis
Container Cloud API. The kaas-ipam
controller monitors
the current state of the bare metal Machine
, verifies that the BareMetalHost object is successfully created and that inspection is completed.
Then the kaas-ipam
controller fetches the information about the network
card, creates the IpamHost
object, and requests the IP address.
The IpamHost
object is created for each Machine
and contains
all configuration of the host network interfaces and IP address.
It also contains the information about associated BareMetalHost
,
Machine
, and MAC addresses.
For demonstration purposes, the Container Cloud IpamHost
custom resource (CR) is split into the following major sections:
The Container Cloud IpamHost
CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1
kind
Object type that is IpamHost
metadata
The metadata
field contains the following subfields:
name
Name of the IpamHost
object
namespace
Project in which the IpamHost
object has been created
labels
Key-value pairs that are attached to the object:
cluster.sigs.k8s.io/cluster-name
References the Cluster
object name that IpamHost
is
assigned to
ipam/BMHostID
Unique ID of the associated BareMetalHost
object
ipam/MAC-XX-XX-XX-XX-XX-XX: "1"
Number of NICs of the host that the corresponding MAC address is assigned to
ipam/MachineID
Unique ID of the associated Machine
object
ipam/UID
Unique ID of the IpamHost
object
Configuration example:
apiVersion: ipam.mirantis.com/v1alpha1
kind: IpamHost
metadata:
name: master-0
namespace: default
labels:
cluster.sigs.k8s.io/cluster-name: kaas-mgmt
ipam/BMHostID: 57250885-f803-11ea-88c8-0242c0a85b02
ipam/MAC-0C-C4-7A-1E-A9-5C: "1"
ipam/MAC-0C-C4-7A-1E-A9-5D: "1"
ipam/MachineID: 573386ab-f803-11ea-88c8-0242c0a85b02
ipam/UID: 834a2fc0-f804-11ea-88c8-0242c0a85b02
The spec
field of the IpamHost
resource describes the desired
state of the object. It contains the following fields:
nicMACmap
Represents an unordered list of all NICs of the host.
Each NIC entry contains such fields as name
, mac
, ip
,
and so on. The primary
field defines that the current NIC is primary.
Only one NIC can be primary.
l2TemplateSelector
If specified, contains the name
(first priority) or label
of the L2 template that will be applied during a machine creation.
The l2TemplateSelector
field is copied from the Machine
providerSpec
object to the IpamHost
object only once,
during a machine creation. To modify l2TemplateSelector
after creation
of a Machine
CR, edit the IpamHost
object.
Configuration example:
spec:
nicMACmap:
- mac: 0c:c4:7a:1e:a9:5c
name: ens11f0
- ip: 172.16.48.157
mac: 0c:c4:7a:1e:a9:5d
name: ens11f1
primary: true
l2TemplateSelector:
label: xxx
The status
field of the IpamHost
resource describes the observed
state of the object. It contains the following fields:
ipAllocationResult
Status of IP allocation for the primary NIC (PXE boot). Possible values
are OK
or ERR
if no IP address was allocated.
l2RenderResult
Result of the L2 template rendering, if applicable. Possible values are
OK
or an error message.
lastUpdated
Date and time of the last IpamHost
status update.
nicMACmap
Unordered list of all NICs of the host with a detailed description. Each
nicMACmap
entry contains additional fields such as ipRef
,
nameservers
, online
, and so on.
osMetadataNetwork
Configuration of the host OS metadata network. This configuration is used
in the cloud-init
tool and is applicable to the primary NIC only.
It is added when the IP address is allocated and
the ipAllocationResult
status is OK
.
versionIpam
IPAM version used during the last update of the object.
Configuration example:
status:
ipAllocationResult: OK
l2RenderResult: There are no available L2Templates
lastUpdated: "2020-09-16T11:02:39Z"
nicMACmap:
- mac: 0C:C4:7A:1E:A9:5C
name: ens11f0
- gateway: 172.16.48.1
ip: 172.16.48.200/24
ipRef: default/auto-0c-c4-7a-a8-d3-44
mac: 0C:C4:7A:1E:A9:5D
name: ens11f1
nameservers:
- 172.18.176.6
online: true
primary: true
osMetadataNetwork:
links:
- ethernet_mac_address: 0C:C4:7A:A8:D3:44
id: enp8s0f0
type: phy
networks:
- ip_address: 172.16.48.200
link: enp8s0f0
netmask: 255.255.255.0
routes:
- gateway: 172.16.48.1
netmask: 0.0.0.0
network: 0.0.0.0
type: ipv4
services:
- address: 172.18.176.6
type: dns
versionIpam: v3.0.999-20200807-130909-44151f8
This section describes the Subnet
resource used in Mirantis
Container Cloud API to allocate IP addresses for the cluster nodes.
For demonstration purposes, the Container Cloud Subnet
custom resource (CR) can be split into the following major sections:
The Container Cloud Subnet
CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1
.
kind
Object type that is Subnet
metadata
This field contains the following subfields:
name
Name of the Subnet
object.
namespace
Project in which the Subnet
object was created.
labels
Key-value pairs that are attached to the object:
ipam/DefaultSubnet: "1"
Indicates that the subnet was automatically created for the PXE network. The subnet with this label is unique for a specific region and global for all clusters and projects in the region.
ipam/UID
Unique ID of a subnet.
kaas.mirantis.com/provider
Provider type.
kaas.mirantis.com/region
Region type.
Configuration example:
apiVersion: ipam.mirantis.com/v1alpha1
kind: Subnet
metadata:
name: kaas-mgmt
namespace: default
labels:
ipam/DefaultSubnet: "1"
ipam/UID: 1bae269c-c507-4404-b534-2c135edaebf5
kaas.mirantis.com/provider: baremetal
kaas.mirantis.com/region: region-one
The spec
field of the Subnet
resource describes the desired state of
a subnet. It contains the following fields:
cidr
A valid IPv4 CIDR, for example, 10.11.0.0/24
.
gateway
A valid gateway address, for example, 10.11.0.9
.
includeRanges
A list of IP address ranges within the given CIDR that should be used in
the allocation of IPs for nodes. The gateway, network, broadcast, and DNS
addresses will be excluded (protected) automatically if they intersect with
one of the ranges. The IPs outside the given ranges will not be used in
the allocation. Each element of the list can be either an interval
10.11.0.5-10.11.0.70
or a single address 10.11.0.77
. The
includeRanges
parameter is mutually exclusive with excludeRanges
.
excludeRanges
A list of IP address ranges within the given CIDR that should not be
used in the allocation of IPs for nodes. The IPs within the given CIDR
but outside the given ranges will be used in the allocation.
The gateway, network, broadcast, and DNS addresses will be excluded
(protected) automatically if they are included in the CIDR.
Each element of the list can be either an interval 10.11.0.5-10.11.0.70
or a single address 10.11.0.77
. The excludeRanges
parameter
is mutually exclusive with includeRanges
.
useWholeCidr
If set to false
(by default), the subnet address and broadcast
address will be excluded from the address allocation.
If set to true
, the subnet address and the broadcast address
are included into the address allocation for nodes.
nameservers
The list of IP addresses of name servers. Each element of the list
is a single address, for example, 172.18.176.6
.
Configuration example:
spec:
cidr: 172.16.48.0/24
excludeRanges:
- 172.16.48.99
- 172.16.48.101-172.16.48.145
gateway: 172.16.48.1
nameservers:
- 172.18.176.6
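For comparison, an equivalent sketch that uses includeRanges instead of excludeRanges (the two parameters are mutually exclusive; the values below are illustrative):
spec:
  cidr: 172.16.48.0/24
  includeRanges:
  - 172.16.48.200-172.16.48.253
  gateway: 172.16.48.1
  nameservers:
  - 172.18.176.6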
The status
field of the Subnet
resource describes the actual state of
a subnet. It contains the following fields:
allocatable
The number of IP addresses that are available for allocation.
allocatedIPs
The list of allocated IP addresses in the IP:<IPAddr object UID>
format.
capacity
The total number of IP addresses to be allocated, including the sum of allocatable and already allocated IP addresses.
cidr
The IPv4 CIDR for a subnet.
gateway
The gateway address for a subnet.
nameservers
The list of IP addresses of name servers.
ranges
The list of IP address ranges within the given CIDR that are used in the allocation of IPs for nodes.
Configuration example:
status:
allocatable: 51
allocatedIPs:
- 172.16.48.200:24e94698-f726-11ea-a717-0242c0a85b02
- 172.16.48.201:2bb62373-f726-11ea-a717-0242c0a85b02
- 172.16.48.202:37806659-f726-11ea-a717-0242c0a85b02
capacity: 54
cidr: 172.16.48.0/24
gateway: 172.16.48.1
lastUpdate: "2020-09-15T12:27:58Z"
nameservers:
- 172.18.176.6
ranges:
- 172.16.48.200-172.16.48.253
statusMessage: OK
This section describes the SubnetPool
resource used in
Mirantis Container Cloud API to manage a pool of addresses
from which subnets can be allocated.
For demonstration purposes, the Container Cloud SubnetPool
custom resource (CR) is split into the following major sections:
The Container Cloud SubnetPool
CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1
.
kind
Object type that is SubnetPool
.
metadata
The metadata
field contains the following subfields:
name
Name of the SubnetPool
object.
namespace
Project in which the SubnetPool
object was created.
labels
Key-value pairs that are attached to the object:
kaas.mirantis.com/provider
Provider type that is baremetal
.
kaas.mirantis.com/region
Region name.
Configuration example:
apiVersion: ipam.mirantis.com/v1alpha1
kind: SubnetPool
metadata:
name: kaas-mgmt
namespace: default
labels:
kaas.mirantis.com/provider: baremetal
kaas.mirantis.com/region: region-one
The spec
field of the SubnetPool
resource describes the desired state
of a subnet pool. It contains the following fields:
cidr
Valid IPv4 CIDR. For example, 10.10.0.0/16.
blockSize
IP address block size to use when assigning an IP address block
to every new child Subnet
object. For example, if you set /25
,
every new child Subnet
will have 128 IPs to allocate.
Possible values are from /29
to the cidr
size. Immutable.
nameservers
Optional. List of IP addresses of name servers to use for every new child
Subnet
object. Each element of the list is a single address,
for example, 172.18.176.6. Default: empty.
gatewayPolicy
Optional. Method of assigning a gateway address to new child Subnet
objects. Default: none
. Possible values are:
first
- first IP of the IP address block assigned to a child
Subnet
, for example, 10.11.10.1.
last
- last IP of the IP address block assigned to a child Subnet
,
for example, 10.11.10.254.
none
- no gateway address.
Configuration example:
spec:
cidr: 10.10.0.0/16
blockSize: /25
nameservers:
- 172.18.176.6
gatewayPolicy: first
The status
field of the SubnetPool
resource describes the actual state
of a subnet pool. It contains the following fields:
statusMessage
Message that reflects the current status of the SubnetPool
resource.
Possible values are:
OK
- a subnet pool is active.
ERR: <error message>
- a subnet pool is in the Failure
state.
TERM
- a subnet pool is terminating.
allocatedSubnets
List of allocated subnets. Each subnet has the <CIDR>:<SUBNET_UID>
format.
blockSize
Block size to use for IP address assignments from the defined pool.
capacity
Total number of IP addresses to be allocated. Includes the number of allocatable and already allocated IP addresses.
allocatable
Number of subnets with the blockSize
size that are available for
allocation.
lastUpdate
Date and time of the last SubnetPool
status update.
versionIpam
IPAM version used during the last object update.
Example:
status:
allocatedSubnets:
- 10.10.0.0/24:0272bfa9-19de-11eb-b591-0242ac110002
blockSize: /24
capacity: 54
allocatable: 51
lastUpdate: "2020-09-15T08:30:08Z"
versionIpam: v3.0.999-20200807-130909-44151f8
statusMessage: OK
This section describes the IPaddr
resource used in Mirantis
Container Cloud API. The IPAddr
object describes an IP address
and contains all information about the associated MAC address.
For demonstration purposes, the Container Cloud IPaddr
custom resource (CR) is split into the following major sections:
The Container Cloud IPaddr
CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1
kind
Object type that is IPaddr
metadata
The metadata
field contains the following subfields:
name
Name of the IPaddr
object in the auto-XX-XX-XX-XX-XX-XX
format
where XX-XX-XX-XX-XX-XX is the associated MAC address
namespace
Project in which the IPaddr
object was created
labels
Key-value pairs that are attached to the object:
ipam/IP
IPv4 address
ipam/IpamHostID
Unique ID of the associated IpamHost
object
ipam/MAC
MAC address
ipam/SubnetID
Unique ID of the Subnet
object
ipam/UID
Unique ID of the IPAddr
object
Configuration example:
apiVersion: ipam.mirantis.com/v1alpha1
kind: IPaddr
metadata:
name: auto-0c-c4-7a-a8-b8-18
namespace: default
labels:
ipam/IP: 172.16.48.201
ipam/IpamHostID: 848b59cf-f804-11ea-88c8-0242c0a85b02
ipam/MAC: 0C-C4-7A-A8-B8-18
ipam/SubnetID: 572b38de-f803-11ea-88c8-0242c0a85b02
ipam/UID: 84925cac-f804-11ea-88c8-0242c0a85b02
The spec
object field of the IPAddr
resource contains the associated
MAC address and the reference to the Subnet
object:
mac
MAC address in the XX:XX:XX:XX:XX:XX
format
subnetRef
Reference to the Subnet
resource in the
<subnetProjectName>/<subnetName>
format
Configuration example:
spec:
mac: 0C:C4:7A:A8:B8:18
subnetRef: default/kaas-mgmt
The status
object field of the IPAddr
resource reflects the actual
state of the IPAddr
object. It contains the following fields:
address
IP address.
cidr
IPv4 CIDR for the Subnet
.
gateway
Gateway address for the Subnet
.
lastUpdate
Date and time of the last IPAddr
status update.
mac
MAC address in the XX:XX:XX:XX:XX:XX
format.
nameservers
List of the IP addresses of name servers of the Subnet
.
Each element of the list is a single address, for example, 172.18.176.6.
phase
Current phase of the IP address. Possible values: Active
, Failed
,
or Terminating
.
versionIpam
IPAM version used during the last update of the object.
Configuration example:
status:
address: 172.16.48.201
cidr: 172.16.48.201/24
gateway: 172.16.48.1
lastUpdate: "2020-09-16T10:08:07Z"
mac: 0C:C4:7A:A8:B8:18
nameservers:
- 172.18.176.6
phase: Active
versionIpam: v3.0.999-20200807-130909-44151f8
This section describes the L2Template
resource used in Mirantis
Container Cloud API.
By default, Container Cloud configures a single interface on cluster nodes,
leaving all other physical interfaces intact.
With L2Template
, you can create advanced host networking configurations
for your clusters. For example, you can create bond interfaces on top of
physical interfaces on the host.
For demonstration purposes, the Container Cloud L2Template
custom resource (CR) is split into the following major sections:
The Container Cloud L2Template
CR contains the following fields:
apiVersion
API version of the object that is ipam.mirantis.com/v1alpha1
.
kind
Object type that is L2Template
.
metadata
The metadata
field contains the following subfields:
name
Name of the L2Template
object.
namespace
Project in which the L2Template
object was created.
labels
Key-value pairs that are attached to the object:
Caution
All ipam/*
labels, except ipam/DefaultForCluster
,
are set automatically and must not be configured manually.
ipam/Cluster
References the Cluster
object name that this template is
applied to. The process of selecting the L2Template
object for
a specific cluster is as follows:
The kaas-ipam
controller monitors the L2Template
objects
with the ipam/Cluster:<clusterName>
label.
The L2Template
object with the ipam/Cluster: <clusterName>
label is assigned to a cluster with Name: <clusterName>
,
if available. Otherwise, the default L2Template
object
with the ipam/Cluster: default
label is assigned to a cluster.
ipam/PreInstalledL2Template: "1"
Is automatically added during a management or regional cluster
deployment.
Indicates that the current L2Template
object was preinstalled.
Represents L2 templates that are automatically copied to a project
once it is created. Once the L2 templates are copied,
the ipam/PreInstalledL2Template
label is removed.
ipam/DefaultForCluster
This label is unique per cluster. When you use several L2 templates
per cluster, only the first template is automatically labeled
as the default one. All consequent templates must be referenced
in the machines configuration files using L2templateSelector
.
You can manually configure this label if required.
ipam/UID
Unique ID of an object.
kaas.mirantis.com/provider
Provider type.
kaas.mirantis.com/region
Region type.
Configuration example:
apiVersion: ipam.mirantis.com/v1alpha1
kind: L2Template
metadata:
name: l2template-test
namespace: default
labels:
ipam/Cluster: test
ipam/DefaultForCluster: "1"
kaas.mirantis.com/provider: baremetal
kaas.mirantis.com/region: region-one
The spec
field of the L2Template
resource describes the desired
state of the object. It contains the following fields:
clusterRef
The Cluster
object that this template is applied to.
The default
value is used to apply the given template to all clusters
within a particular project, unless an L2 template that references
a specific cluster name exists.
Caution
A cluster can be associated with only one template.
An L2 template must have the same namespace as the referenced cluster.
A project can have only one default L2 template.
ifMapping
The list of interface names for the template. The interface mapping is
defined globally for all bare metal hosts in the cluster but can be
overridden at the host level, if required, by editing the IpamHost
object for a particular host. The ifMapping
parameter
is mutually exclusive with autoIfMappingPrio
.
autoIfMappingPrio
The list of prefixes, such as eno
, ens
, and so on, to match the
interfaces to automatically create a list for the template. The result
of generation may be overridden at the host level using
ifMappingOverride
in the corresponding IpamHost spec.
The autoIfMappingPrio
parameter is mutually exclusive
with ifMapping
.
npTemplate
A netplan-compatible configuration with special lookup functions that
defines the networking settings for the cluster hosts, where physical
NIC names and details are parameterized. This configuration will be
processed using Go templates. Instead of specifying IP and MAC addresses,
interface names, and other network details specific to a particular host,
the template supports use of special lookup functions. These lookup
functions, such as nic
, mac
, ip
, and so on, return
host-specific network information when the template is rendered for
a particular host.
Caution
All rules and restrictions of the netplan configuration also apply to L2 templates. For details, see the official netplan documentation.
Configuration example:
spec:
autoIfMappingPrio:
- provision
- eno
- ens
- enp
l3Layout: null
npTemplate: |
version: 2
ethernets:
{{nic 0}}:
dhcp4: false
dhcp6: false
addresses:
- {{ip "0:kaas-mgmt"}}
gateway4: {{gateway_from_subnet "kaas-mgmt"}}
nameservers:
addresses: {{nameservers_from_subnet "kaas-mgmt"}}
match:
macaddress: {{mac 0}}
set-name: {{nic 0}}
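As an illustration of the advanced configurations mentioned above, a bond on top of two physical interfaces could be sketched as follows using the same lookup functions. Treat it as a starting point under stated assumptions, not a verified template; in particular, the subnet key used in the ip, gateway_from_subnet, and nameservers_from_subnet lookups is reused from the example above and must be adjusted to your environment:
spec:
  autoIfMappingPrio:
  - provision
  - eno
  - ens
  - enp
  npTemplate: |
    version: 2
    ethernets:
      {{nic 0}}:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 0}}
        set-name: {{nic 0}}
      {{nic 1}}:
        dhcp4: false
        dhcp6: false
        match:
          macaddress: {{mac 1}}
        set-name: {{nic 1}}
    bonds:
      bond0:
        interfaces:
        - {{nic 0}}
        - {{nic 1}}
        parameters:
          mode: active-backup
        addresses:
        - {{ip "0:kaas-mgmt"}}
        gateway4: {{gateway_from_subnet "kaas-mgmt"}}
        nameservers:
          addresses: {{nameservers_from_subnet "kaas-mgmt"}}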
The status
field of the L2Template
resource reflects the actual state
of the L2Template
object and contains the following fields:
phase
Current phase of the L2Template
object.
Possible values: Ready
, Failed
, or Terminating
.
reason
Detailed error message in case L2Template
has the Failed
status.
lastUpdate
Date and time of the last L2Template
status update.
versionIpam
IPAM version used during the last update of the object.
Configuration example:
status:
lastUpdate: "2020-09-15T08:30:08Z"
phase: Failed
reason: The kaas-mgmt subnet in the terminating state.
versionIpam: v3.0.999-20200807-130909-44151f8